Transcribe and parse audio files using Azure OpenAI Whisper.
This parser integrates with the Azure OpenAI Whisper model to transcribe audio files. Unlike the standard OpenAI Whisper parser, it requires an Azure endpoint and credentials. It accepts only files under 25 MB.
Note: This parser uses the Azure OpenAI API, so it integrates with the Azure ecosystem and suits workflows that involve other Azure services.
For files larger than 25 MB, consider using Azure AI Speech batch transcription: https://learn.microsoft.com/azure/ai-services/speech-service/batch-transcription-create?pivots=rest-api#use-a-whisper-model
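The size limit can be checked before a file is handed to the parser. The following is a minimal sketch, not part of the parser itself: the helper names and route labels are hypothetical, and the byte value used for "25 MB" (25 * 1024 * 1024) is an assumption, since the service may count megabytes differently.

```python
import os

# Assumed byte value of the 25 MB limit; the service may define it differently.
MAX_WHISPER_BYTES = 25 * 1024 * 1024


def fits_whisper_limit(size_bytes: int, limit: int = MAX_WHISPER_BYTES) -> bool:
    """Return True when a file of size_bytes is strictly under the limit."""
    return size_bytes < limit


def choose_transcription_route(path: str) -> str:
    """Hypothetical helper: pick a route label from the size of the file on disk."""
    if fits_whisper_limit(os.path.getsize(path)):
        return "azure-openai-whisper"
    # Files at or above the limit are candidates for batch transcription.
    return "azure-ai-speech-batch"
```

A caller could use the returned label to decide whether to invoke this parser or to submit the file to a batch transcription job instead.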
Transcribe and parse audio files.
Audio is transcribed with the OpenAI Whisper model.
Transcribe and parse audio files with the OpenAI Whisper model.
Audio is transcribed locally with the OpenAI Whisper model loaded through transformers.
Transcribe and parse audio files. Audio is transcribed with the OpenAI Whisper model.
Transcribe and parse audio files with faster-whisper.
faster-whisper is a reimplementation of OpenAI's Whisper model using CTranslate2, which is up to 4 times faster than openai/whisper for the same accuracy while using less memory. The efficiency can be further improved with 8-bit quantization on both CPU and GPU.
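As an illustration of the 8-bit quantization mentioned above, here is a minimal sketch that calls the faster-whisper library directly rather than through this parser. The helper names, the default model size, and the device-to-compute-type mapping are assumptions chosen for the example; `WhisperModel`, its `compute_type` argument, and `transcribe` come from faster-whisper.

```python
def pick_compute_type(device: str) -> str:
    """Pick an 8-bit compute type for the given device (hypothetical helper)."""
    # "int8" on CPU; "int8_float16" on GPU keeps int8 weights and uses
    # float16 for the remaining computation.
    return "int8_float16" if device == "cuda" else "int8"


def transcribe(path: str, model_size: str = "base", device: str = "cpu") -> str:
    """Transcribe one audio file and return the joined segment text."""
    # Imported lazily so pick_compute_type works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        model_size, device=device, compute_type=pick_compute_type(device)
    )
    segments, info = model.transcribe(path)
    # segments is a generator; the transcription runs as it is consumed.
    return " ".join(segment.text.strip() for segment in segments)
```

Calling `transcribe("audio.mp3")` would download the model on first use, so the sketch is best tried on a short file first.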
It can automatically detect the following 14 languages and transcribe the audio in the detected language: en, zh, fr, de, ja, ko, ru, es, th, it, pt, vi, ar, tr.
The GitHub repository for faster-whisper is: https://github.com/SYSTRAN/faster-whisper
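A caller may want to confirm that a language code falls within the 14 languages listed above, for example before trusting a detected language. The sketch below assumes only that list; the constant and function names are hypothetical and are not part of the parser or of faster-whisper.

```python
# Language codes copied from the list in the description above.
LISTED_LANGUAGES = frozenset(
    {"en", "zh", "fr", "de", "ja", "ko", "ru", "es", "th", "it", "pt", "vi", "ar", "tr"}
)


def is_listed_language(code: str) -> bool:
    """Return True when code is one of the 14 listed language codes."""
    # Normalize whitespace and case so "EN " and "en" compare equal.
    return code.strip().lower() in LISTED_LANGUAGES
```

With faster-whisper used directly, the code to check would typically be the `language` attribute of the info object returned by `transcribe`.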