| Name | Type | Description |
|---|---|---|
| api_key | Optional[str] | Default: None. Azure OpenAI API key. If not provided, defaults to the `AZURE_OPENAI_API_KEY` environment variable. |
| azure_endpoint | Optional[str] | Default: None. Azure OpenAI service endpoint. Defaults to the `AZURE_OPENAI_ENDPOINT` environment variable. |
| api_version | Optional[str] | Default: None. API version to use; defaults to the `OPENAI_API_VERSION` environment variable. |
| azure_ad_token_provider | Union[Callable[[], str], None] | Default: None. Callable that returns an Azure Active Directory token for authentication (if applicable). |
| language | Optional[str] | Default: None. Language in which the request should be processed. |
| prompt | Optional[str] | Default: None. Custom instructions or prompt for the Whisper model. |
| response_format | Union[Literal['json', 'text', 'srt', 'verbose_json', 'vtt'], None] | Default: None. The desired output format. |
| temperature | Optional[float] | Default: None. Controls the randomness of the model's output. |
| deployment_name* | str | Required. The deployment name of the Whisper model. |
| max_retries | int | Default: 3. Maximum number of retries for failed API requests. |
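The `azure_ad_token_provider` parameter accepts any zero-argument callable that returns a bearer-token string. A minimal sketch is below; the `azure-identity` route shown in the comments assumes that optional package is installed, and the static provider is a hypothetical placeholder for a real credential flow:

```python
# With the optional azure-identity package installed, the idiomatic provider is:
#
#   from azure.identity import DefaultAzureCredential, get_bearer_token_provider
#   token_provider = get_bearer_token_provider(
#       DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
#   )
#
# Any callable with the same shape also satisfies the parameter. The function
# below is a placeholder that simply returns a fixed string:
def static_token_provider() -> str:
    """Return a fixed token (stand-in for a real credential flow)."""
    return "your-azure-ad-token"

# Hypothetical usage (deployment name is a placeholder):
# parser = AzureOpenAIWhisperParser(
#     deployment_name="your-whisper-deployment",
#     azure_ad_token_provider=static_token_provider,
# )
```

When a token provider is supplied, `api_key` can be omitted.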
Transcribe and parse audio files using Azure OpenAI Whisper.
This parser integrates with the Azure OpenAI Whisper model to transcribe audio files. It differs from the standard OpenAI Whisper parser in requiring an Azure endpoint and Azure credentials. The parser is limited to files under 25 MB.
Note: This parser uses the Azure OpenAI API, providing integration with the Azure ecosystem and making it suitable for workflows involving other Azure services.
For files larger than 25 MB, consider using Azure AI Speech batch transcription: https://learn.microsoft.com/azure/ai-services/speech-service/batch-transcription-create?pivots=rest-api#use-a-whisper-model
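Because the service rejects files over the 25 MB limit mentioned above, a quick size check before parsing avoids a failed upload. A minimal sketch (the helper name is an illustration, not part of the library):

```python
import os

MAX_WHISPER_BYTES = 25 * 1024 * 1024  # 25 MB Azure OpenAI Whisper limit


def fits_whisper_limit(path: str) -> bool:
    """Return True if the audio file is small enough for the Whisper endpoint."""
    return os.path.getsize(path) <= MAX_WHISPER_BYTES
```

Files that fail this check can be routed to Azure AI Speech batch transcription instead.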
Setup:
    Install ``langchain-community`` and set the following environment variables:

    .. code-block:: bash

        pip install -U langchain langchain-community
        export AZURE_OPENAI_API_KEY="your-api-key"
        export AZURE_OPENAI_ENDPOINT="https://your-endpoint.openai.azure.com/"
        export OPENAI_API_VERSION="your-api-version"
Example Usage:
    .. code-block:: python

        from langchain_community.document_loaders.blob_loaders import Blob
        from langchain_community.document_loaders.parsers.audio import (
            AzureOpenAIWhisperParser,
        )

        whisper_parser = AzureOpenAIWhisperParser(
            deployment_name="your-whisper-deployment",
            api_version="2024-06-01",
            api_key="your-api-key",
            # other params...
        )

        audio_blob = Blob(path="your-audio-file-path")
        response = whisper_parser.lazy_parse(audio_blob)

        for document in response:
            print(document.page_content)
Integration with Other Loaders:
The AzureOpenAIWhisperParser can be used with video/audio loaders and
GenericLoader to automate retrieval and parsing.
YoutubeAudioLoader Example:
.. code-block:: python

    from langchain_community.document_loaders.blob_loaders import (
        YoutubeAudioLoader,
    )
    from langchain_community.document_loaders.generic import GenericLoader

    youtube_url = ["https://your-youtube-url"]
    save_dir = "directory-to-download-videos"

    loader = GenericLoader(
        YoutubeAudioLoader(youtube_url, save_dir),
        AzureOpenAIWhisperParser(deployment_name="your-deployment-name"),
    )

    docs = loader.load()