langchain_community.document_transformers.html2text.Html2TextTransformer

class langchain_community.document_transformers.html2text.Html2TextTransformer(ignore_links: bool = True, ignore_images: bool = True)

Convert the HTML in each document's page_content to Markdown-style plain text using the html2text package (install it with pip install html2text).

Parameters
  • ignore_links (bool) – If True, link URLs are omitted from the output and only the link text is kept; defaults to True.

  • ignore_images (bool) – If True, images are omitted from the output; defaults to True.

Example

Methods

  • __init__([ignore_links, ignore_images])

  • atransform_documents(documents, **kwargs) – Asynchronously transform a list of documents.

  • transform_documents(documents, **kwargs) – Transform a list of documents.

__init__(ignore_links: bool = True, ignore_images: bool = True) → None
Parameters
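  • ignore_links (bool) – Whether links should be ignored; defaults to True.

  • ignore_images (bool) – Whether images should be ignored; defaults to True.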

Return type

None

async atransform_documents(documents: Sequence[Document], **kwargs: Any) → Sequence[Document]

Asynchronously transform a list of documents.

Parameters
  • documents (Sequence[Document]) – A sequence of Documents to be transformed.

  • kwargs (Any) – Additional keyword arguments.

Returns

A list of transformed Documents.

Return type

Sequence[Document]

transform_documents(documents: Sequence[Document], **kwargs: Any) → Sequence[Document]

Transform a list of documents by converting the HTML in each document's page_content to text.

Parameters
  • documents (Sequence[Document]) – A sequence of Documents to be transformed.

  • kwargs (Any) – Additional keyword arguments.

Returns

A list of transformed Documents.

Return type

Sequence[Document]

Examples using Html2TextTransformer