| Name | Type | Description |
|---|---|---|
| api_key | Optional[str] | Default: None. Azure OpenAI API key. If not provided, defaults to the `AZURE_OPENAI_API_KEY` environment variable. |
| azure_endpoint | Optional[str] | Default: None. Azure OpenAI service endpoint. Defaults to the `AZURE_OPENAI_ENDPOINT` environment variable. |
| api_version | Optional[str] | Default: None. API version to use; defaults to the `OPENAI_API_VERSION` environment variable. |
| azure_ad_token_provider | Union[Callable[[], str], None] | Default: None. Callable that returns an Azure Active Directory token for authentication (if applicable). |
| language | Optional[str] | Default: None. Language in which the request should be processed. |
| prompt | Optional[str] | Default: None. Custom instructions or prompt for the Whisper model. |
| response_format | Union[Literal['json', 'text', 'srt', 'verbose_json', 'vtt'], None] | Default: None. The desired output format. |
| temperature | Optional[float] | Default: None. Controls the randomness of the model's output. |
| deployment_name* | str | Required. The deployment name of the Whisper model. |
| max_retries | int | Default: 3. Maximum number of retries for failed API requests. |
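The `azure_ad_token_provider` parameter accepts any zero-argument callable that returns a bearer-token string. A minimal sketch is below; the `azure-identity` route shown in the comments assumes that optional package is installed, and the static provider is a hypothetical placeholder for a real credential flow:

```python
# With the optional azure-identity package installed, the idiomatic provider is:
#
#   from azure.identity import DefaultAzureCredential, get_bearer_token_provider
#   token_provider = get_bearer_token_provider(
#       DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
#   )
#
# Any callable with the same shape also satisfies the parameter. The function
# below is a placeholder that simply returns a fixed string:
def static_token_provider() -> str:
    """Return a fixed token (stand-in for a real credential flow)."""
    return "your-azure-ad-token"

# Hypothetical usage (deployment name is a placeholder):
# parser = AzureOpenAIWhisperParser(
#     deployment_name="your-whisper-deployment",
#     azure_ad_token_provider=static_token_provider,
# )
```

When a token provider is supplied, `api_key` can be omitted.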
Transcribe and parse audio files using Azure OpenAI Whisper.
This parser integrates with the Azure OpenAI Whisper model to transcribe audio files. It differs from the standard OpenAI Whisper parser in requiring an Azure endpoint and Azure credentials. The parser is limited to files under 25 MB.
Note: This parser uses the Azure OpenAI API, providing integration with the Azure ecosystem and making it suitable for workflows involving other Azure services.
For files larger than 25 MB, consider using Azure AI Speech batch transcription: https://learn.microsoft.com/azure/ai-services/speech-service/batch-transcription-create?pivots=rest-api#use-a-whisper-model
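Because the service rejects files over the 25 MB limit mentioned above, a quick size check before parsing avoids a failed upload. A minimal sketch (the helper name is an illustration, not part of the library):

```python
import os

MAX_WHISPER_BYTES = 25 * 1024 * 1024  # 25 MB Azure OpenAI Whisper limit


def fits_whisper_limit(path: str) -> bool:
    """Return True if the audio file is small enough for the Whisper endpoint."""
    return os.path.getsize(path) <= MAX_WHISPER_BYTES
```

Files that fail this check can be routed to Azure AI Speech batch transcription instead.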
Setup:
    Install ``langchain-community`` and set the following environment variables:

    .. code-block:: bash

        pip install -U langchain langchain-community
        export AZURE_OPENAI_API_KEY="your-api-key"
        export AZURE_OPENAI_ENDPOINT="https://your-endpoint.openai.azure.com/"
        export OPENAI_API_VERSION="your-api-version"
Example Usage:
    .. code-block:: python

        from langchain_community.document_loaders.blob_loaders import Blob
        from langchain_community.document_loaders.parsers.audio import (
            AzureOpenAIWhisperParser,
        )

        whisper_parser = AzureOpenAIWhisperParser(
            deployment_name="your-whisper-deployment",
            api_version="2024-06-01",
            api_key="your-api-key",
            # other params...
        )

        audio_blob = Blob(path="your-audio-file-path")
        response = whisper_parser.lazy_parse(audio_blob)

        for document in response:
            print(document.page_content)
Integration with Other Loaders:
The AzureOpenAIWhisperParser can be used with video/audio loaders and
GenericLoader to automate retrieval and parsing.
YoutubeAudioLoader Example:
.. code-block:: python

    from langchain_community.document_loaders.blob_loaders import (
        YoutubeAudioLoader,
    )
    from langchain_community.document_loaders.generic import GenericLoader

    youtube_url = ["https://your-youtube-url"]
    save_dir = "directory-to-download-videos"

    loader = GenericLoader(
        YoutubeAudioLoader(youtube_url, save_dir),
        AzureOpenAIWhisperParser(deployment_name="your-deployment-name"),
    )

    docs = loader.load()