langchain_community.document_loaders.parsers.audio.FasterWhisperParser¶

class langchain_community.document_loaders.parsers.audio.FasterWhisperParser(*, device: Optional[str] = 'cuda', model_size: Optional[str] = None)[source]¶

Transcribe and parse audio files with faster-whisper.

faster-whisper is a reimplementation of OpenAI’s Whisper model using CTranslate2, which is up to 4 times faster than openai/whisper for the same accuracy while using less memory. The efficiency can be further improved with 8-bit quantization on both CPU and GPU.

It can automatically detect the following 14 languages and transcribe the speech into the corresponding language: en, zh, fr, de, ja, ko, ru, es, th, it, pt, vi, ar, tr.

The GitHub repository for faster-whisper is: https://github.com/SYSTRAN/faster-whisper

Example: Load a YouTube video and transcribe the video speech into a document.
from langchain.document_loaders.generic import GenericLoader
from langchain_community.document_loaders.parsers.audio import FasterWhisperParser
from langchain.document_loaders.blob_loaders.youtube_audio import YoutubeAudioLoader


url = "https://www.youtube.com/watch?v=your_video"
save_dir = "your_dir/"

# Download the audio with YoutubeAudioLoader and transcribe it with
# FasterWhisperParser.
loader = GenericLoader(
    YoutubeAudioLoader([url], save_dir),
    FasterWhisperParser(),
)
docs = loader.load()
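
The same parser can also be pointed at local audio files. A minimal sketch, assuming a directory of MP3 files at ./audio/ (the path and glob pattern are placeholders):

from langchain_community.document_loaders.generic import GenericLoader
from langchain_community.document_loaders.blob_loaders import FileSystemBlobLoader
from langchain_community.document_loaders.parsers.audio import FasterWhisperParser

# Transcribe every MP3 file found under ./audio/ (hypothetical directory).
loader = GenericLoader(
    FileSystemBlobLoader("./audio/", glob="*.mp3"),
    FasterWhisperParser(),
)
docs = loader.load()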

Initialize the parser.

Parameters
  • device (Optional[str]) – The device to run the model on; either “cuda” or “cpu”, depending on the available hardware.

  • model_size (Optional[str]) – The model size to use; one of “base”, “small”, “medium”, or “large-v3”, chosen according to the available GPU memory (see the sketch below).
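
A minimal instantiation sketch, assuming a machine without a CUDA-capable GPU (the chosen values are illustrative, not defaults):

from langchain_community.document_loaders.parsers.audio import FasterWhisperParser

# Run on CPU with the "small" model; both values are illustrative choices.
parser = FasterWhisperParser(device="cpu", model_size="small")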

Methods

__init__(*[, device, model_size])

Initialize the parser.

lazy_parse(blob)

Lazily parse the blob.

parse(blob)

Eagerly parse the blob into a document or documents.

__init__(*, device: Optional[str] = 'cuda', model_size: Optional[str] = None)[source]¶

Initialize the parser.

Parameters
  • device (Optional[str]) – The device to run the model on; either “cuda” or “cpu”, depending on the available hardware.

  • model_size (Optional[str]) – The model size to use; one of “base”, “small”, “medium”, or “large-v3”, chosen according to the available GPU memory.

lazy_parse(blob: Blob) → Iterator[Document][source]¶

Lazily parse the blob.

Parameters

blob (Blob) –

Return type

Iterator[Document]
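
A minimal usage sketch for lazy_parse, assuming a local audio file at ./speech.mp3 (a placeholder path); the exact metadata keys may vary by version:

from langchain_community.document_loaders.blob_loaders import Blob
from langchain_community.document_loaders.parsers.audio import FasterWhisperParser

parser = FasterWhisperParser(device="cpu")

# Stream one Document per transcribed segment instead of building a full list.
blob = Blob.from_path("./speech.mp3")  # placeholder path
for doc in parser.lazy_parse(blob):
    print(doc.page_content, doc.metadata)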

parse(blob: Blob) → List[Document]¶

Eagerly parse the blob into a document or documents.

This is a convenience method for interactive development environments.

Production applications should favor the lazy_parse method instead.

Subclasses should generally not override this parse method.

Parameters

blob (Blob) – Blob instance

Returns

List of documents

Return type

List[Document]
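
For comparison, a hedged sketch of eager parsing, again assuming a placeholder local file; production code should prefer the lazy variant shown above:

from langchain_community.document_loaders.blob_loaders import Blob
from langchain_community.document_loaders.parsers.audio import FasterWhisperParser

parser = FasterWhisperParser(device="cpu")

# parse() collects all segment Documents into a list up front.
docs = parser.parse(Blob.from_path("./speech.mp3"))  # placeholder path
transcript = " ".join(doc.page_content for doc in docs)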