langchain_community.document_loaders.glue_catalog.GlueCatalogLoader

class langchain_community.document_loaders.glue_catalog.GlueCatalogLoader(database: str, *, session: Optional[Session] = None, profile_name: Optional[str] = None, table_filter: Optional[List[str]] = None)

Load table schemas from AWS Glue.

This loader fetches the schema of each table within a specified AWS Glue database. The schema details include column names and their data types, similar to pandas dtype representation.

AWS credentials are resolved automatically by boto3, following its standard credential lookup order: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html

If a specific AWS profile is required, it can be specified and will be used to establish the session.

Initialize Glue database loader.

Parameters
  • database (str) – The name of the Glue database from which to load table schemas.

  • session (Optional[Session]) – A boto3 Session object. If not provided, a new session is created.

  • profile_name (Optional[str]) – The name of the AWS profile to use for credentials.

  • table_filter (Optional[List[str]]) – Names of the tables whose schemas should be fetched. If None, schemas are fetched for all tables in the database.
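The parameters above can be combined as in the following minimal sketch. The helper name `make_loader` and the `region_name` argument are illustrative assumptions, not part of the documented API; only the `GlueCatalogLoader` signature shown above and `boto3.Session(profile_name=..., region_name=...)` are relied on. Imports are deferred into the function so that defining the helper does not require boto3 or langchain_community to be installed.

```python
from typing import List, Optional


def make_loader(
    database: str,
    profile_name: Optional[str] = None,
    region_name: Optional[str] = None,
    table_filter: Optional[List[str]] = None,
):
    """Build a GlueCatalogLoader from an explicitly created boto3 session.

    Hypothetical convenience helper, shown for illustration only.
    """
    # Deferred imports: these run only when the helper is actually called.
    import boto3
    from langchain_community.document_loaders.glue_catalog import GlueCatalogLoader

    # Passing None for profile_name/region_name leaves boto3's default
    # resolution (environment variables, shared config files, ...) in charge.
    session = boto3.Session(profile_name=profile_name, region_name=region_name)

    # table_filter=None asks the loader for every table in the database.
    return GlueCatalogLoader(database, session=session, table_filter=table_filter)
```

Calling `make_loader("sales_db", profile_name="analytics", table_filter=["orders"]).load()` would then need valid AWS credentials; the database, profile, and table names here are placeholders. When no custom session settings are needed, passing `profile_name` directly to `GlueCatalogLoader` should be enough.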

Methods

__init__(database, *[, session, ...])

Initialize Glue database loader.

alazy_load()

A lazy loader for Documents.

aload()

Load data into Document objects.

lazy_load()

Lazily load table schemas as Document objects.

load()

Load data into Document objects.

load_and_split([text_splitter])

Load Documents and split into chunks.

__init__(database: str, *, session: Optional[Session] = None, profile_name: Optional[str] = None, table_filter: Optional[List[str]] = None)

Initialize Glue database loader.

Parameters
  • database (str) – The name of the Glue database from which to load table schemas.

  • session (Optional[Session]) – A boto3 Session object. If not provided, a new session is created.

  • profile_name (Optional[str]) – The name of the AWS profile to use for credentials.

  • table_filter (Optional[List[str]]) – Names of the tables whose schemas should be fetched. If None, schemas are fetched for all tables in the database.

async alazy_load() → AsyncIterator[Document]

A lazy loader for Documents.

Return type

AsyncIterator[Document]
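Because alazy_load() returns an async iterator, it is consumed with `async for`. The sketch below uses a hypothetical `StubLoader` that yields plain strings, so it runs without AWS access; with a real GlueCatalogLoader the items would be Document objects instead.

```python
import asyncio
from typing import AsyncIterator, List


class StubLoader:
    """Hypothetical stand-in exposing the same alazy_load() shape as the loader."""

    async def alazy_load(self) -> AsyncIterator[str]:
        # A real loader would yield Document objects; strings keep the sketch small.
        for content in ("sales_db.orders", "sales_db.customers"):
            yield content


async def collect_documents(loader) -> List:
    # Works with any object whose alazy_load() returns an async iterator.
    return [doc async for doc in loader.alazy_load()]


documents = asyncio.run(collect_documents(StubLoader()))
```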

async aload() → List[Document]

Load data into Document objects.

Return type

List[Document]

lazy_load() → Iterator[Document]

Lazily load table schemas as Document objects.

Yields

Document objects, each representing the schema of a table.

Return type

Iterator[Document]
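To illustrate the kind of work a lazy schema loader performs, here is a self-contained sketch, not the library's actual implementation. `FakeGlueClient` and `iter_table_schemas` are hypothetical names; the fake client only mimics the response shape of the Glue `get_tables` call (a `TableList` whose entries keep columns under `StorageDescriptor.Columns`) and ignores pagination.

```python
from typing import Dict, Iterator, List, Optional, Tuple


class FakeGlueClient:
    """Hypothetical in-memory stand-in for a boto3 Glue client (no AWS calls)."""

    def __init__(self, tables: Dict[str, Dict[str, str]]) -> None:
        self._tables = tables

    def get_tables(self, DatabaseName: str) -> dict:
        # Mirrors the shape of a Glue GetTables response, single page only.
        return {
            "TableList": [
                {
                    "Name": name,
                    "StorageDescriptor": {
                        "Columns": [{"Name": c, "Type": t} for c, t in columns.items()]
                    },
                }
                for name, columns in self._tables.items()
            ]
        }


def iter_table_schemas(
    client, database: str, table_filter: Optional[List[str]] = None
) -> Iterator[Tuple[str, Dict[str, str]]]:
    """Yield (table_name, {column: type}) pairs one table at a time."""
    for table in client.get_tables(DatabaseName=database)["TableList"]:
        if table_filter is not None and table["Name"] not in table_filter:
            continue  # table_filter=None keeps every table
        columns = table["StorageDescriptor"]["Columns"]
        yield table["Name"], {col["Name"]: col["Type"] for col in columns}


client = FakeGlueClient(
    {
        "orders": {"order_id": "bigint", "total": "double"},
        "customers": {"customer_id": "bigint", "email": "string"},
    }
)
all_schemas = list(iter_table_schemas(client, "sales_db"))
filtered = list(iter_table_schemas(client, "sales_db", table_filter=["orders"]))
```

Because `iter_table_schemas` is a generator, each schema is built only when the caller asks for the next item, which is the behavior "lazily" refers to above.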

load() → List[Document]

Load data into Document objects.

Return type

List[Document]

load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]

Load Documents and split into chunks. Chunks are returned as Documents.

Do not override this method; it should be considered deprecated.

Parameters

text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.

Returns

List of Documents.

Return type

List[Document]
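Since this method should be treated as deprecated, one alternative is to load first and then call the splitter's `split_documents` method explicitly. The sketch below assumes only that the loader has `load()` and the splitter has `split_documents()`; `load_then_split`, `StubSchemaLoader`, and `StubSplitter` are hypothetical names, and the stubs use plain strings so the example runs without AWS or LangChain installed.

```python
from typing import List


def load_then_split(loader, text_splitter) -> List:
    """Load all documents, then hand them to the splitter explicitly."""
    documents = loader.load()
    return text_splitter.split_documents(documents)


class StubSchemaLoader:
    """Hypothetical loader stand-in returning strings instead of Documents."""

    def load(self) -> List[str]:
        return ["sales_db.orders\n\n{'order_id': 'bigint'}", "sales_db.customers"]


class StubSplitter:
    """Hypothetical splitter stand-in that breaks text on blank lines."""

    def split_documents(self, documents: List[str]) -> List[str]:
        return [part for doc in documents for part in doc.split("\n\n")]


chunks = load_then_split(StubSchemaLoader(), StubSplitter())
```

With the real classes, the same helper could be called with a GlueCatalogLoader and a TextSplitter such as RecursiveCharacterTextSplitter.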
