langchain.text_splitter.MarkdownHeaderTextSplitter

class langchain.text_splitter.MarkdownHeaderTextSplitter(headers_to_split_on: List[Tuple[str, str]], return_each_line: bool = False)[source]

Splits Markdown files into chunks based on specified headers.

Create a new MarkdownHeaderTextSplitter.

Parameters
  • headers_to_split_on – Headers to track, given as (header marker, metadata key) pairs such as ("#", "Header 1")

  • return_each_line – If True, return each line separately with its associated header metadata

Methods

__init__(headers_to_split_on[, return_each_line])

Create a new MarkdownHeaderTextSplitter.

aggregate_lines_to_chunks(lines)

Combine lines with common metadata into chunks.

split_text(text)

Split a Markdown file.

__init__(headers_to_split_on: List[Tuple[str, str]], return_each_line: bool = False)[source]

Create a new MarkdownHeaderTextSplitter.

Parameters
  • headers_to_split_on – Headers to track, given as (header marker, metadata key) pairs such as ("#", "Header 1")

  • return_each_line – If True, return each line separately with its associated header metadata
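
To make the shape of headers_to_split_on concrete, here is a minimal sketch in plain Python. The helper match_header is hypothetical and is not part of the library; it only illustrates how a (marker, metadata key) pair could be matched against a line, under the assumption that longer markers are checked first.

```python
from typing import List, Optional, Tuple

# Each pair is (markdown header marker, metadata key to record the header under).
headers_to_split_on: List[Tuple[str, str]] = [
    ("#", "Header 1"),
    ("##", "Header 2"),
    ("###", "Header 3"),
]


def match_header(
    line: str, headers: List[Tuple[str, str]]
) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: return (metadata key, header text) for a tracked header."""
    stripped = line.strip()
    # Try longer markers first so that "##" is not mistaken for "#".
    for marker, key in sorted(headers, key=lambda pair: len(pair[0]), reverse=True):
        if stripped == marker or stripped.startswith(marker + " "):
            return key, stripped[len(marker):].strip()
    # Not a tracked header (plain text, or a deeper level than we track).
    return None


print(match_header("## Setup", headers_to_split_on))
print(match_header("plain text", headers_to_split_on))
```

Header levels that are not listed (for example "####" here) are treated as ordinary text in this sketch.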

aggregate_lines_to_chunks(lines: List[LineType]) → List[Document][source]

Combine lines with common metadata into chunks.

Parameters
  • lines – Lines of text with their associated header metadata
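
The idea of combining lines that share metadata can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the Line type is a hypothetical stand-in for LineType, the function returns plain dicts rather than Document objects, and joining with a single newline is an arbitrary choice made for the sketch.

```python
from typing import Dict, List, TypedDict


class Line(TypedDict):
    """Hypothetical stand-in for LineType: text plus its header metadata."""

    content: str
    metadata: Dict[str, str]


def aggregate_lines(lines: List[Line]) -> List[Line]:
    """Merge consecutive lines whose metadata is identical into one chunk."""
    chunks: List[Line] = []
    for line in lines:
        if chunks and chunks[-1]["metadata"] == line["metadata"]:
            # Same headers as the previous chunk: append to it.
            chunks[-1]["content"] += "\n" + line["content"]
        else:
            # Headers changed: start a new chunk (copy to avoid aliasing the input).
            chunks.append(
                {"content": line["content"], "metadata": dict(line["metadata"])}
            )
    return chunks


sample: List[Line] = [
    {"content": "a", "metadata": {"Header 1": "Intro"}},
    {"content": "b", "metadata": {"Header 1": "Intro"}},
    {"content": "c", "metadata": {"Header 1": "Usage"}},
]
for chunk in aggregate_lines(sample):
    print(chunk["metadata"], repr(chunk["content"]))
```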

split_text(text: str) → List[Document][source]

Split a Markdown file.

Parameters
  • text – Markdown file contents

Examples using MarkdownHeaderTextSplitter