langchain.document_transformers.html2text.Html2TextTransformer

class langchain.document_transformers.html2text.Html2TextTransformer(ignore_links: bool = True, ignore_images: bool = True)[source]

Convert the HTML content of documents to plain text, using the html2text package.

Parameters
  • ignore_links – Whether links should be ignored; defaults to True.

  • ignore_images – Whether images should be ignored; defaults to True.

Example
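
A minimal usage sketch, assuming the `langchain` and `html2text` packages are installed and that `Document` can be imported from `langchain.schema` (an assumption about the installed version). The helper name `html_to_text` is illustrative and not part of the library.

```python
from typing import List


def html_to_text(pages: List[str]) -> List[str]:
    """Convert raw HTML strings to text with Html2TextTransformer."""
    # Imported lazily so this module still loads where the optional
    # dependencies are missing. The import paths are assumptions about
    # the installed langchain version.
    from langchain.document_transformers.html2text import Html2TextTransformer
    from langchain.schema import Document

    # Wrap each HTML string in a Document, as transform_documents expects.
    docs = [Document(page_content=page) for page in pages]

    # Both flags are shown explicitly; True is already the default.
    transformer = Html2TextTransformer(ignore_links=True, ignore_images=True)
    transformed = transformer.transform_documents(docs)
    return [doc.page_content for doc in transformed]
```

Calling `html_to_text(["<h1>Title</h1><p>Body</p>"])` should return a one-element list whose string contains the heading and body text without HTML tags. The exact whitespace and Markdown markers depend on the html2text version.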

Methods

  • __init__([ignore_links, ignore_images])
  • atransform_documents(documents, **kwargs) – Asynchronously transform a list of documents.
  • transform_documents(documents, **kwargs) – Transform a list of documents.

__init__(ignore_links: bool = True, ignore_images: bool = True) → None[source]

async atransform_documents(documents: Sequence[Document], **kwargs: Any) → Sequence[Document][source]

Asynchronously transform a list of documents.

Parameters

documents – A sequence of Documents to be transformed.

Returns

A list of transformed Documents.

transform_documents(documents: Sequence[Document], **kwargs: Any) → Sequence[Document][source]

Transform a list of documents.

Parameters

documents – A sequence of Documents to be transformed.

Returns

A list of transformed Documents.
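
To make the contract above concrete without installing anything, here is a rough standard-library stand-in. It is not the library's implementation, which delegates to the html2text package. The names `Doc`, `_TextExtractor`, and `SketchHtmlToText` are hypothetical, and the Markdown-style link and image output only approximates what html2text produces.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import Any, Dict, List, Sequence


@dataclass
class Doc:
    """Hypothetical stand-in for langchain's Document."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


class _TextExtractor(HTMLParser):
    """Collects visible text; optionally keeps links and images as Markdown."""

    def __init__(self, ignore_links: bool, ignore_images: bool) -> None:
        super().__init__()
        self.ignore_links = ignore_links
        self.ignore_images = ignore_images
        self.parts: List[str] = []
        self._hrefs: List[str] = []  # hrefs of currently open <a> tags

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a":
            self._hrefs.append(attr.get("href") or "")
            if not self.ignore_links:
                self.parts.append("[")
        elif tag == "img" and not self.ignore_images:
            self.parts.append(f"![{attr.get('alt') or ''}]({attr.get('src') or ''})")

    def handle_endtag(self, tag):
        if tag == "a" and self._hrefs:
            href = self._hrefs.pop()
            if not self.ignore_links:
                self.parts.append(f"]({href})")

    def handle_data(self, data):
        self.parts.append(data)


class SketchHtmlToText:
    """Illustrative transformer with the same constructor flags (assumed semantics)."""

    def __init__(self, ignore_links: bool = True, ignore_images: bool = True) -> None:
        self.ignore_links = ignore_links
        self.ignore_images = ignore_images

    def transform_documents(self, documents: Sequence[Doc], **kwargs: Any) -> Sequence[Doc]:
        result = []
        for doc in documents:
            parser = _TextExtractor(self.ignore_links, self.ignore_images)
            parser.feed(doc.page_content)
            parser.close()
            # Collapse runs of whitespace left behind by removed tags.
            text = " ".join("".join(parser.parts).split())
            result.append(Doc(page_content=text, metadata=dict(doc.metadata)))
        return result


html = '<p>See <a href="https://example.com">the docs</a> <img src="logo.png" alt="logo"></p>'
docs = [Doc(html, {"source": "a.html"})]

plain = SketchHtmlToText().transform_documents(docs)
print(plain[0].page_content)  # See the docs

rich = SketchHtmlToText(ignore_links=False, ignore_images=False).transform_documents(docs)
print(rich[0].page_content)  # See [the docs](https://example.com) ![logo](logo.png)
```

The sketch returns new objects and copies the metadata. Whether the real transformer returns new Documents or modifies the inputs in place is not stated here, so avoid relying on either behavior.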

Examples using Html2TextTransformer