create_metadata_tagger
- langchain_community.document_transformers.openai_functions.create_metadata_tagger(metadata_schema: Dict[str, Any] | Type[BaseModel], llm: BaseLanguageModel, prompt: ChatPromptTemplate | None = None, *, tagging_chain_kwargs: Dict | None = None) → OpenAIMetadataTagger
- Create a DocumentTransformer that uses an OpenAI function chain to automatically
tag documents with metadata based on their content and an input schema.
- Args:
- metadata_schema: Either a dictionary or a pydantic.BaseModel class. If a dictionary
is passed in, it's assumed to already be a valid JsonSchema. For best results, pydantic.BaseModels should have docstrings describing what the schema represents and descriptions for the parameters.
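- llm: Language model to use, assumed to support the OpenAI function-calling API.
The docstring names "gpt-3.5-turbo-0613" as the default, but the signature declares no default, so llm must be passed explicitly.
prompt: Optional ChatPromptTemplate to pass to the model.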
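Because a dictionary schema is assumed to be valid already, it can help to run a quick sanity check before handing it to create_metadata_tagger. The following is a minimal sketch using only the standard library. The helper name find_schema_problems is hypothetical and not part of LangChain, and it checks only a few basics rather than implementing full JSON Schema validation.

```python
from typing import Any, Dict, List


def find_schema_problems(schema: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means none were found.

    Hypothetical helper, not part of LangChain. It checks a few basics only
    and is not a full JSON Schema validator.
    """
    problems: List[str] = []
    properties = schema.get("properties")
    if not isinstance(properties, dict) or not properties:
        return ["'properties' must be a non-empty dict"]

    for name, spec in properties.items():
        if not isinstance(spec, dict):
            problems.append(f"property '{name}' must map to a dict")
            continue
        if "type" not in spec:
            problems.append(f"property '{name}' has no 'type'")
        if "enum" in spec and not isinstance(spec["enum"], list):
            problems.append(f"property '{name}' has a non-list 'enum'")

    # Every required field should be declared under "properties".
    for name in schema.get("required", []):
        if name not in properties:
            problems.append(f"required field '{name}' is not in 'properties'")
    return problems


schema = {
    "properties": {
        "movie_title": {"type": "string"},
        "critic": {"type": "string"},
        "tone": {"type": "string", "enum": ["positive", "negative"]},
        "rating": {
            "type": "integer",
            "description": "The number of stars the critic rated the movie",
        },
    },
    "required": ["movie_title", "critic", "tone"],
}

problems = find_schema_problems(schema)
print(problems or "no problems found")
```

A check like this catches typos in "required" early, before any model call is made.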
- Returns:
An OpenAIMetadataTagger document transformer whose underlying chain passes the given function to the model.
- Example:
    from langchain_community.chat_models import ChatOpenAI
    from langchain_community.document_transformers import create_metadata_tagger
    from langchain_core.documents import Document

    schema = {
        "properties": {
            "movie_title": {"type": "string"},
            "critic": {"type": "string"},
            "tone": {"type": "string", "enum": ["positive", "negative"]},
            "rating": {
                "type": "integer",
                "description": "The number of stars the critic rated the movie",
            },
        },
        "required": ["movie_title", "critic", "tone"],
    }

    # Must be an OpenAI model that supports functions
    llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo-0613")

    document_transformer = create_metadata_tagger(schema, llm)
    original_documents = [
        Document(
            page_content="Review of The Bee Movie\nBy Roger Ebert\n\nThis is the greatest movie ever made. 4 out of 5 stars."
        ),
        Document(
            page_content="Review of The Godfather\nBy Anonymous\n\nThis movie was super boring. 1 out of 5 stars.",
            metadata={"reliable": False},
        ),
    ]

    enhanced_documents = document_transformer.transform_documents(original_documents)
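To illustrate the shape of the result without calling a model, here is a minimal offline sketch. It assumes that the transformer merges the extracted fields into each document's existing metadata and that existing keys win on a collision. That merge order is an assumption about the implementation, not something stated on this page. Doc, tag_documents, and fake_extract are hypothetical stand-ins and are not LangChain APIs.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def fake_extract(text: str) -> Dict[str, Any]:
    """Stub standing in for the LLM function call; returns simple parsed values."""
    lines = text.splitlines()
    return {
        "movie_title": lines[0].replace("Review of ", "", 1),
        "critic": lines[1].replace("By ", "", 1),
        "tone": "positive" if "greatest" in text else "negative",
    }


def tag_documents(
    docs: List[Doc], extract: Callable[[str], Dict[str, Any]]
) -> List[Doc]:
    """Return new documents whose metadata combines extracted and existing keys.

    Assumption: existing metadata takes precedence over extracted values.
    """
    return [
        Doc(d.page_content, {**extract(d.page_content), **d.metadata}) for d in docs
    ]


original = [
    Doc("Review of The Bee Movie\nBy Roger Ebert\n\nThis is the greatest movie ever made. 4 out of 5 stars."),
    Doc(
        "Review of The Godfather\nBy Anonymous\n\nThis movie was super boring. 1 out of 5 stars.",
        {"reliable": False},
    ),
]

enhanced = tag_documents(original, fake_extract)
for doc in enhanced:
    print(doc.metadata)
```

Under these assumptions, the second document keeps its original "reliable" key alongside the newly tagged fields, and the input documents are left unmodified.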
- Parameters:
metadata_schema (Dict[str, Any] | Type[BaseModel])
llm (BaseLanguageModel)
prompt (ChatPromptTemplate | None)
tagging_chain_kwargs (Dict | None)
- Return type:
OpenAIMetadataTagger
Examples using create_metadata_tagger