langchain.document_transformers.openai_functions.create_metadata_tagger

langchain.document_transformers.openai_functions.create_metadata_tagger(metadata_schema: Union[Dict[str, Any], Type[BaseModel]], llm: BaseLanguageModel, prompt: Optional[ChatPromptTemplate] = None, *, tagging_chain_kwargs: Optional[Dict] = None) → OpenAIMetadataTagger
Create a DocumentTransformer that uses an OpenAI function chain to automatically tag documents with metadata based on their content and an input schema.

Args:
metadata_schema: Either a dictionary or a pydantic.BaseModel class. If a dictionary is passed in, it is assumed to already be a valid JSON Schema. For best results, pydantic.BaseModel classes should have docstrings describing what the schema represents and descriptions for the parameters.
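
As an illustration of the pydantic option, the following is a minimal sketch of a schema class with a docstring and per-field descriptions. The class name MovieReview and the description strings are hypothetical; the field names mirror the dictionary schema in the example further down. Which pydantic version is expected depends on the installed LangChain release, so treat the import as an assumption.

```python
from typing import Literal, Optional

from pydantic import BaseModel, Field


class MovieReview(BaseModel):
    """Metadata describing a single movie review."""

    # Required fields: no default value is given.
    movie_title: str = Field(description="Title of the movie being reviewed")
    critic: str = Field(description="Name of the person who wrote the review")
    # Literal restricts the value to a fixed set, like "enum" in JSON Schema.
    tone: Literal["positive", "negative"] = Field(
        description="Overall tone of the review"
    )
    # Optional field: omitted values fall back to None.
    rating: Optional[int] = Field(
        default=None,
        description="The number of stars the critic rated the movie",
    )
```

A class like this would be passed in place of the dictionary, for example create_metadata_tagger(MovieReview, llm).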

llm: Language model to use, assumed to support the OpenAI function-calling API. The signature declares no default, so a model must be supplied explicitly, for example “gpt-3.5-turbo-0613”.

prompt: Optional ChatPromptTemplate to pass to the model. Defaults to None.

tagging_chain_kwargs: Optional dictionary of additional keyword arguments (keyword-only). Judging by the name, these are forwarded to the underlying tagging chain. Defaults to None.

Returns:

An OpenAIMetadataTagger, a DocumentTransformer whose transform_documents method tags each document using the OpenAI function chain.

Example:
from langchain.chat_models import ChatOpenAI
from langchain.document_transformers import create_metadata_tagger
from langchain_core.documents import Document

schema = {
    "properties": {
        "movie_title": { "type": "string" },
        "critic": { "type": "string" },
        "tone": {
            "type": "string",
            "enum": ["positive", "negative"]
        },
        "rating": {
            "type": "integer",
            "description": "The number of stars the critic rated the movie"
        }
    },
    "required": ["movie_title", "critic", "tone"]
}

# Must be an OpenAI model that supports functions
llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")

document_transformer = create_metadata_tagger(schema, llm)
original_documents = [
    Document(page_content="Review of The Bee Movie\nBy Roger Ebert\n\nThis is the greatest movie ever made. 4 out of 5 stars."),
    Document(page_content="Review of The Godfather\nBy Anonymous\n\nThis movie was super boring. 1 out of 5 stars.", metadata={"reliable": False}),
]

enhanced_documents = document_transformer.transform_documents(original_documents)
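
Because the metadata is produced by a language model, it can be worth checking it against the schema before relying on it. The following is a minimal sketch under stated assumptions: check_metadata is a hypothetical helper (not part of LangChain), it only checks the "required" and "enum" keywords, and the sample dictionary is hand-written for illustration rather than real model output.

```python
from typing import Any, Dict, List

# Same dictionary schema as in the example above.
schema: Dict[str, Any] = {
    "properties": {
        "movie_title": {"type": "string"},
        "critic": {"type": "string"},
        "tone": {"type": "string", "enum": ["positive", "negative"]},
        "rating": {
            "type": "integer",
            "description": "The number of stars the critic rated the movie",
        },
    },
    "required": ["movie_title", "critic", "tone"],
}


def check_metadata(metadata: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means no problem was found.

    Hypothetical helper: only "required" and "enum" are checked.
    """
    problems = []
    # Every field listed under "required" must be present.
    for key in schema.get("required", []):
        if key not in metadata:
            problems.append(f"missing required field: {key}")
    # Fields with an "enum" must take one of the listed values.
    for key, spec in schema.get("properties", {}).items():
        if key in metadata and "enum" in spec and metadata[key] not in spec["enum"]:
            problems.append(f"{key} must be one of {spec['enum']}")
    return problems


# Hand-written sample, not actual model output.
sample = {"movie_title": "The Godfather", "critic": "Anonymous", "tone": "negative"}
problems = check_metadata(sample, schema)
```

In practice the helper would be applied to the metadata attribute of each returned Document. For full JSON Schema validation, a dedicated validator library would be a better fit than this sketch.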

Examples using create_metadata_tagger