Version: 2.31-unstable

LocalWhisperTranscriber

Use LocalWhisperTranscriber to transcribe audio files with OpenAI's Whisper model, running on your local installation of Whisper.

Most common position in a pipeline: As the first component in an indexing pipeline
Mandatory run variables: sources, a list of paths or binary streams that you want to transcribe
Output variables: documents, a list of documents
API reference: Whisper
GitHub link: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/whisper
Package name: whisper-haystack

Overview

The component needs to know which Whisper model to use. Specify it with the model parameter when initializing the component. All transcription runs on the executing machine, and the audio is never sent to a third-party provider.

See other optional parameters you can specify in our API documentation.

See the Whisper API documentation and the official Whisper GitHub repo for the supported audio formats and languages.

The LocalWhisperTranscriber is part of the whisper-haystack integration package. To work with it, install the package along with Whisper (which also pulls in torch) using the following commands:

shell
pip install whisper-haystack
pip install -U openai-whisper
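
After installing, it can help to confirm that the pieces are visible to your Python environment. The following is a minimal sketch, not part of the integration: the helper names is_importable and check_environment are hypothetical, and the ffmpeg check assumes your Whisper installation relies on the ffmpeg command-line tool to decode audio, as the openai-whisper README describes.

```python
import importlib.util
import shutil


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError (an ImportError) when a parent
        # package of a dotted name is missing.
        return False


def check_environment() -> dict:
    """Report which prerequisites appear to be available (hypothetical helper)."""
    return {
        # Installed by `pip install -U openai-whisper`
        "whisper": is_importable("whisper"),
        # Import path taken from the usage examples on this page
        "haystack_integrations.components.audio.whisper": is_importable(
            "haystack_integrations.components.audio.whisper"
        ),
        # Assumption: ffmpeg must be on PATH for Whisper to decode audio files
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, available in check_environment().items():
        print(f"{name}: {'found' if available else 'MISSING'}")
```

Any entry reported as missing points to the install step to revisit before running the examples in the Usage section.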

Usage

On its own

Here’s an example of how to use LocalWhisperTranscriber on its own:

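python
import requests
from haystack_integrations.components.audio.whisper import LocalWhisperTranscriber

# Download a sample audio file and save it locally
response = requests.get(
    "https://ia903102.us.archive.org/19/items/100-Best--Speeches/EK_19690725_64kb.mp3",
)
response.raise_for_status()
with open("kennedy_speech.mp3", "wb") as file:
    file.write(response.content)

# "tiny" is the smallest Whisper model; larger models trade speed for accuracy
transcriber = LocalWhisperTranscriber(model="tiny")

# run() takes a list of sources and returns a dict with a "documents" list
transcription = transcriber.run(sources=["./kennedy_speech.mp3"])
print(transcription["documents"][0].content)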
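
Because sources accepts a list, several local files can be transcribed in one call. As a rough sketch of how such a list might be assembled, the hypothetical helper below gathers audio files from a directory using only the standard library. The extension set is an illustrative assumption, not the list of formats Whisper supports; check the Whisper documentation for that.

```python
from pathlib import Path

# Assumption: an illustrative subset of extensions, not Whisper's official list
AUDIO_EXTENSIONS = frozenset({".mp3", ".wav", ".m4a", ".flac"})


def collect_audio_sources(directory, extensions=AUDIO_EXTENSIONS):
    """Return a sorted list of audio file paths (as strings) found directly in `directory`."""
    root = Path(directory)
    return sorted(
        str(path)
        for path in root.iterdir()
        # Compare suffixes case-insensitively so "TALK.MP3" is also picked up
        if path.is_file() and path.suffix.lower() in extensions
    )
```

The returned list could then be passed along as transcriber.run(sources=collect_audio_sources("./recordings")), where ./recordings is a placeholder directory name.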

In a pipeline

The pipeline below fetches an audio file from a specified URL and transcribes it. It first retrieves the audio file using LinkContentFetcher, then transcribes the audio into text with LocalWhisperTranscriber, and finally outputs the transcription text.

python
from haystack_integrations.components.audio.whisper import LocalWhisperTranscriber
from haystack.components.fetchers import LinkContentFetcher
from haystack import Pipeline

pipe = Pipeline()
pipe.add_component("fetcher", LinkContentFetcher())
pipe.add_component("transcriber", LocalWhisperTranscriber(model="tiny"))

# Feed the fetched audio stream into the transcriber
pipe.connect("fetcher", "transcriber")
result = pipe.run(
    data={
        "fetcher": {
            "urls": [
                "https://ia903102.us.archive.org/19/items/100-Best--Speeches/EK_19690725_64kb.mp3",
            ],
        },
    },
)
print(result["transcriber"]["documents"][0].content)
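
The examples above print only the first document. If you want to keep every transcription, one option is to write each document's content to a text file. The sketch below is an illustration rather than part of the integration: save_transcripts and its file naming scheme are hypothetical, and it relies only on each document exposing a content attribute, as shown in the examples on this page.

```python
from pathlib import Path


def save_transcripts(documents, out_dir, prefix="transcript"):
    """Write each document's `content` to its own UTF-8 text file.

    Returns the list of paths written, in the same order as `documents`.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, doc in enumerate(documents):
        # Hypothetical naming scheme: transcript_000.txt, transcript_001.txt, ...
        path = out / f"{prefix}_{index:03d}.txt"
        # Fall back to an empty string if a document has no text content
        path.write_text(doc.content or "", encoding="utf-8")
        written.append(path)
    return written
```

With the pipeline result above, this might be called as save_transcripts(result["transcriber"]["documents"], "transcripts"), where "transcripts" is a placeholder output directory.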

Additional References

🧑‍🍳 Cookbook: Multilingual RAG from a podcast with Whisper, Qdrant and Mistral