langchain.document_loaders.assemblyai.AssemblyAIAudioTranscriptLoader

class langchain.document_loaders.assemblyai.AssemblyAIAudioTranscriptLoader(file_path: str, *, transcript_format: TranscriptFormat = TranscriptFormat.TEXT, config: Optional[assemblyai.TranscriptionConfig] = None, api_key: Optional[str] = None)[source]

Loader for AssemblyAI audio transcripts.

It uses the AssemblyAI API to transcribe audio files and loads the transcribed text into one or more Documents, depending on the specified format.

To use it, install the assemblyai Python package and set the ASSEMBLYAI_API_KEY environment variable to your API key. Alternatively, pass the API key as the api_key argument.

Audio files can be specified via a URL or a local file path.

Initializes the AssemblyAI AudioTranscriptLoader.

Parameters
  • file_path – A URL or a local file path.

  • transcript_format – Transcript format to use. See class TranscriptFormat for more info.

  • config – Transcription options and features. If None is given, the Transcriber’s default configuration will be used.

  • api_key – AssemblyAI API key.
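
The parameters above can be combined as in the following minimal sketch. The helper name build_loader and the choice of speaker_labels=True are illustrative assumptions, not part of this class; speaker_labels is one of the options accepted by assemblyai.TranscriptionConfig. Imports are deferred into the function so that a missing package is reported only when the helper is called.

```python
from typing import Optional


def build_loader(file_path: str, api_key: Optional[str] = None):
    """Hypothetical helper: construct a loader for a URL or local audio file."""
    # Deferred imports: both packages must be installed before calling this.
    import assemblyai
    from langchain.document_loaders.assemblyai import (
        AssemblyAIAudioTranscriptLoader,
        TranscriptFormat,
    )

    # Optional transcription settings (illustrative choice of option).
    config = assemblyai.TranscriptionConfig(speaker_labels=True)

    return AssemblyAIAudioTranscriptLoader(
        file_path,
        transcript_format=TranscriptFormat.TEXT,  # the documented default
        config=config,
        # When None, the ASSEMBLYAI_API_KEY environment variable is expected.
        api_key=api_key,
    )
```

A call such as build_loader("./interview.mp3") would return a configured loader; the path here is a placeholder.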

Methods

__init__(file_path, *[, transcript_format, ...])

Initializes the AssemblyAI AudioTranscriptLoader.

lazy_load()

A lazy loader for Documents.

load()

Transcribes the audio file and loads the transcript into documents.

load_and_split([text_splitter])

Load Documents and split into chunks.

__init__(file_path: str, *, transcript_format: TranscriptFormat = TranscriptFormat.TEXT, config: Optional[assemblyai.TranscriptionConfig] = None, api_key: Optional[str] = None)[source]

Initializes the AssemblyAI AudioTranscriptLoader.

Parameters
  • file_path – A URL or a local file path.

  • transcript_format – Transcript format to use. See class TranscriptFormat for more info.

  • config – Transcription options and features. If None is given, the Transcriber’s default configuration will be used.

  • api_key – AssemblyAI API key.

lazy_load() → Iterator[Document]

A lazy loader for Documents.

load() → List[Document][source]

Transcribes the audio file and loads the transcript into documents.

It uses the AssemblyAI API to transcribe the audio file and blocks until the transcription is finished.
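
Because load() blocks until transcription finishes, a caller typically invokes it once and then works with the returned Documents. The sketch below assumes hypothetical helper names transcribe and preview; it relies only on the page_content attribute that every Document carries.

```python
from typing import Iterable, List


def transcribe(file_path: str):
    """Hypothetical helper: returns only after the transcription is finished."""
    # Deferred import: requires langchain and assemblyai to be installed.
    from langchain.document_loaders.assemblyai import AssemblyAIAudioTranscriptLoader

    loader = AssemblyAIAudioTranscriptLoader(file_path)
    return loader.load()  # blocking call


def preview(docs: Iterable, width: int = 80) -> List[str]:
    """Return the first `width` characters of each Document's page_content."""
    return [doc.page_content[:width] for doc in docs]
```

For example, preview(transcribe("./interview.mp3")) would show the start of each returned Document, where the path is a placeholder. How many Documents come back depends on the transcript_format chosen at construction time.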

load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]

Load Documents and split into chunks. Chunks are returned as Documents.

Parameters

text_splitter – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.

Returns

List of Documents.
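
To control chunk sizes rather than rely on the default splitter, an explicit TextSplitter can be passed. The following is a sketch under assumptions: the helper name transcribe_in_chunks and the chunk_size and chunk_overlap values are illustrative, not recommendations from this page.

```python
def transcribe_in_chunks(file_path: str, chunk_size: int = 1000, chunk_overlap: int = 100):
    """Hypothetical helper: transcribe an audio file and return chunked Documents."""
    # Deferred imports: requires langchain and assemblyai to be installed.
    from langchain.document_loaders.assemblyai import AssemblyAIAudioTranscriptLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter

    # Same splitter class as the documented default, with explicit sizes.
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
    )
    loader = AssemblyAIAudioTranscriptLoader(file_path)
    return loader.load_and_split(text_splitter=splitter)
```

Each returned chunk is itself a Document, so it can be passed on to an embedding or indexing step unchanged.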
