langchain_text_splitters.markdown.MarkdownHeaderTextSplitter

class langchain_text_splitters.markdown.MarkdownHeaderTextSplitter(headers_to_split_on: List[Tuple[str, str]], return_each_line: bool = False, strip_headers: bool = True)

Splits Markdown text into chunks based on the specified headers.

Create a new MarkdownHeaderTextSplitter.

Parameters

headers_to_split_on (List[Tuple[str, str]]) – Headers to track, given as (header marker, metadata key) pairs, e.g. ("#", "Header 1")
return_each_line (bool) – Return each line with its associated headers instead of aggregated chunks
strip_headers (bool) – Strip the split headers from the content of each chunk

Methods

`__init__`(headers_to_split_on[, ...])	Create a new MarkdownHeaderTextSplitter.
`aggregate_lines_to_chunks`(lines)	Combine lines with common metadata into chunks.
`split_text`(text)	Split a Markdown file.

__init__(headers_to_split_on: List[Tuple[str, str]], return_each_line: bool = False, strip_headers: bool = True)

Create a new MarkdownHeaderTextSplitter.

Parameters

headers_to_split_on (List[Tuple[str, str]]) – Headers to track, given as (header marker, metadata key) pairs, e.g. ("#", "Header 1")
return_each_line (bool) – Return each line with its associated headers instead of aggregated chunks
strip_headers (bool) – Strip the split headers from the content of each chunk

aggregate_lines_to_chunks(lines: List[LineType]) → List[Document]

Combine lines with common metadata into chunks.

Parameters: lines (List[LineType]) – Lines of text with their associated header metadata
Return type: List[Document]
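The aggregation step described above can be illustrated with a small standard-library sketch. This is not the library's implementation: the `Line` type is a hypothetical stand-in for `LineType`, and `aggregate_lines` is an illustrative helper that only shows the idea of merging consecutive lines that share the same header metadata.

```python
from typing import Dict, List, TypedDict


class Line(TypedDict):
    """Hypothetical stand-in for the library's LineType (an assumption)."""

    content: str
    metadata: Dict[str, str]


def aggregate_lines(lines: List[Line]) -> List[Line]:
    """Merge consecutive lines whose header metadata is identical."""
    chunks: List[Line] = []
    for line in lines:
        if chunks and chunks[-1]["metadata"] == line["metadata"]:
            # Same header context as the previous chunk: append the text.
            chunks[-1]["content"] += "\n" + line["content"]
        else:
            # New header context: start a new chunk (copy to avoid aliasing).
            chunks.append(
                {"content": line["content"], "metadata": dict(line["metadata"])}
            )
    return chunks


lines: List[Line] = [
    {"content": "Intro text.", "metadata": {"Header 1": "Guide"}},
    {"content": "More intro.", "metadata": {"Header 1": "Guide"}},
    {
        "content": "Install steps.",
        "metadata": {"Header 1": "Guide", "Header 2": "Setup"},
    },
]
chunks = aggregate_lines(lines)
```

Here the first two lines share the same metadata and end up in one chunk, while the third line starts a new chunk because its header context differs.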

split_text(text: str) → List[Document]

Split a Markdown file.

Parameters: text (str) – The Markdown text to split
Return type: List[Document]
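To make the splitting behaviour more concrete, the following is a simplified, standard-library-only sketch of header-based splitting. It is an illustration under stated assumptions, not the library's code: `split_by_headers` is a hypothetical function, it returns plain (content, metadata) tuples rather than `Document` objects, and it ignores details such as fenced code blocks. The `headers_to_split_on` and `strip_headers` arguments mirror the constructor parameters documented above.

```python
from typing import Dict, List, Tuple


def split_by_headers(
    text: str,
    headers_to_split_on: List[Tuple[str, str]],
    strip_headers: bool = True,
) -> List[Tuple[str, Dict[str, str]]]:
    """Return (content, metadata) pairs, one per run of lines under the same headers."""
    active: Dict[int, Tuple[str, str]] = {}  # header level -> (metadata key, title)
    chunks: List[Tuple[str, Dict[str, str]]] = []
    buffer: List[str] = []

    def flush() -> None:
        # Emit the buffered lines as one chunk tagged with the active headers.
        content = "\n".join(buffer).strip()
        if content:
            metadata = {key: title for _, (key, title) in sorted(active.items())}
            chunks.append((content, metadata))
        buffer.clear()

    for line in text.splitlines():
        stripped = line.strip()
        for marker, key in headers_to_split_on:
            # Require a space after the marker so "## x" is not matched by "#".
            if stripped.startswith(marker + " "):
                flush()  # close the chunk that belonged to the previous headers
                level = len(marker)
                # A new header replaces any header at the same or a deeper level.
                for lvl in [l for l in active if l >= level]:
                    del active[lvl]
                active[level] = (key, stripped[len(marker):].strip())
                if not strip_headers:
                    buffer.append(stripped)  # keep the header line in the content
                break
        else:
            buffer.append(line)  # ordinary content line
    flush()
    return chunks


markdown = """# Guide
Intro text.
## Setup
Install steps.
## Usage
Run it."""

headers = [("#", "Header 1"), ("##", "Header 2")]
chunks = split_by_headers(markdown, headers)
for content, metadata in chunks:
    print(metadata, "->", content)
```

Each chunk carries the titles of the headers it sits under, keyed by the names given in `headers_to_split_on`. Passing `strip_headers=False` to this sketch keeps the header line itself at the start of the chunk content.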

Examples using MarkdownHeaderTextSplitter

MD splits