langchain_community.document_loaders.sharepoint.SharePointLoader

class langchain_community.document_loaders.sharepoint.SharePointLoader

Bases: O365BaseLoader, BaseLoader

Load from SharePoint.

Create a new model by parsing and validating input data from keyword arguments.

Raises ValidationError if the input data cannot be parsed to form a valid model.

param auth_with_token: bool = False

Whether to authenticate with a token or not. Defaults to False.

param chunk_size: Union[int, str] = 5242880

Number of bytes to retrieve from each API call to the server. Either an int or ‘auto’.
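The default of 5242880 bytes is 5 MiB (5 × 1024 × 1024). As a rough sketch, assuming each call retrieves at most chunk_size bytes, the number of calls needed for one file can be estimated as shown here. The helper `estimate_calls` is hypothetical and not part of the loader, and the ‘auto’ setting is not covered.

```python
DEFAULT_CHUNK_SIZE = 5 * 1024 * 1024  # 5242880 bytes, the documented default


def estimate_calls(file_size: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Estimate how many chunked requests a file of file_size bytes needs.

    Assumption: every request returns at most chunk_size bytes.
    """
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (file_size + chunk_size - 1) // chunk_size


calls_for_12_mib = estimate_calls(12 * 1024 * 1024)
```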

param document_library_id: str [Required]

The ID of the SharePoint document library to load data from.

param file_id: Optional[str] = None

The ID of the file for which to retrieve authorization identities.

param folder_id: Optional[str] = None

The ID of the folder to load data from.

param folder_path: Optional[str] = None

The path to the folder to load data from.

param load_auth: Optional[bool] = False

Whether to load authorization identities.

param load_extended_metadata: Optional[bool] = False

Whether to load extended metadata: size, owner, and full_path.

param object_ids: Optional[List[str]] = None

The IDs of the objects to load data from.

param recursive: bool = False

Should the loader recursively load subfolders?

param settings: _O365Settings [Optional]

Settings for the Office365 API client.

param site_id: Optional[str] = None

The ID of the user's SharePoint site where the file is located.

param token_path: Path = PosixPath('/home/runner/.credentials/o365_token.txt')

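The path to the token file used to make API calls. The default resolves to .credentials/o365_token.txt under the current user's home directory; the /home/runner prefix shown in the signature comes from the environment in which this documentation was built.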
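The parameters above are passed to the constructor as keyword arguments. The following is a minimal sketch rather than an official example: the helpers `build_loader_kwargs` and `make_loader` and the placeholder library ID are hypothetical. It assumes the `langchain-community` and `O365` packages are installed and that the `O365_CLIENT_ID` and `O365_CLIENT_SECRET` environment variables are set so that `settings` can be populated.

```python
from typing import Any, Dict, Optional


def build_loader_kwargs(
    document_library_id: str,
    folder_path: Optional[str] = None,
    recursive: bool = False,
    auth_with_token: bool = False,
) -> Dict[str, Any]:
    """Collect constructor arguments, omitting optional values left unset."""
    if not document_library_id:
        raise ValueError("document_library_id is required")
    kwargs: Dict[str, Any] = {
        "document_library_id": document_library_id,
        "recursive": recursive,
        "auth_with_token": auth_with_token,
    }
    if folder_path is not None:
        kwargs["folder_path"] = folder_path
    return kwargs


def make_loader(**kwargs: Any):
    """Instantiate the loader; imported lazily because it needs extra packages."""
    from langchain_community.document_loaders.sharepoint import SharePointLoader

    return SharePointLoader(**kwargs)


# "YOUR-LIBRARY-ID" and "/reports" are placeholders, not real identifiers.
kwargs = build_loader_kwargs("YOUR-LIBRARY-ID", folder_path="/reports", recursive=True)
# loader = make_loader(**kwargs)  # needs credentials and network access
# documents = loader.load()
```

Keeping the import inside `make_loader` lets the argument handling be exercised without the optional dependencies installed.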

async alazy_load() → AsyncIterator[Document]

A lazy loader for Documents.

Return type

AsyncIterator[Document]

async aload() → List[Document]

Load data into Document objects.

Return type

List[Document]
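The two asynchronous methods differ in how results arrive: aload() returns the full list, while alazy_load() yields documents one at a time. The sketch below shows one way to stop early when consuming alazy_load(). `take_async` and `FakeLoader` are hypothetical names, and `FakeLoader` is only a stand-in with the same method shape so that the snippet runs without SharePoint access.

```python
import asyncio
from typing import Any, List


async def take_async(loader: Any, limit: int) -> List[Any]:
    """Collect at most `limit` documents from loader.alazy_load()."""
    docs: List[Any] = []
    if limit <= 0:
        return docs
    async for doc in loader.alazy_load():
        docs.append(doc)
        if len(docs) >= limit:
            break  # stop pulling further documents
    return docs


class FakeLoader:
    """Stand-in exposing alazy_load(); not part of LangChain."""

    async def alazy_load(self):
        for i in range(5):
            yield f"doc-{i}"


first_two = asyncio.run(take_async(FakeLoader(), 2))
```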

authorized_identities(file_id: str) → List

Retrieve the access identities (user/group emails) for a given file.

Parameters

file_id (str) – The ID of the file.

Returns

A list of group names (email addresses) that have access to the file.

Return type

List
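One use for these identities is filtering loaded documents by who may see them. The sketch below assumes that, when load_auth is enabled, each document carries the identities in its metadata under an `authorized_identities` key; that key name is an assumption to verify against the installed version, and `visible_to` is a hypothetical helper. The sample documents are stand-ins built with `SimpleNamespace`.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def visible_to(
    docs: Iterable[Any],
    identity: str,
    key: str = "authorized_identities",  # assumed metadata key
) -> List[Any]:
    """Keep documents whose metadata lists `identity` (case-insensitive)."""
    wanted = identity.lower()
    kept: List[Any] = []
    for doc in docs:
        identities = doc.metadata.get(key) or []
        if any(str(item).lower() == wanted for item in identities):
            kept.append(doc)
    return kept


# Illustrative stand-ins for Document objects.
sample_docs = [
    SimpleNamespace(page_content="a", metadata={"authorized_identities": ["Team@example.com"]}),
    SimpleNamespace(page_content="b", metadata={"authorized_identities": ["other@example.com"]}),
    SimpleNamespace(page_content="c", metadata={}),
]
allowed = visible_to(sample_docs, "team@example.com")
```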

get_extended_metadata(file_id: str) → dict

Retrieve extended metadata for a file in SharePoint. The following fields are currently supported in the extended metadata:

- size: size of the source file.
- owner: display name of the owner of the source file.
- full_path: human-readable path of the source file.

Parameters

file_id (str) – The ID of the file.

Returns

A dictionary containing the extended metadata of the file, including size, owner, and full path.

Return type

dict
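As an illustration of working with the returned dictionary, the sketch below formats the three documented fields into one line. It assumes `size` is a number of bytes, which the text above does not state, and `describe_file` is a hypothetical helper. The sample values are invented for the example.

```python
from typing import Any, Dict


def describe_file(meta: Dict[str, Any]) -> str:
    """Format the size, owner, and full_path fields into one line."""
    path = meta.get("full_path", "<unknown path>")
    owner = meta.get("owner", "<unknown owner>")
    size = meta.get("size")
    if isinstance(size, (int, float)):
        # Assumption: size is expressed in bytes.
        size_text = f"{size / (1024 * 1024):.1f} MiB"
    else:
        size_text = "unknown size"
    return f"{path} ({owner}, {size_text})"


# Illustrative values; a real call would be loader.get_extended_metadata(file_id).
sample = {"size": 1572864, "owner": "Jane Doe", "full_path": "Reports/2024/summary.docx"}
summary = describe_file(sample)
```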

lazy_load() → Iterator[Document]

Load documents lazily. Use this when working at a large scale.

Yields

Document – A document object representing the parsed blob.

Return type

Iterator[Document]
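Because lazy_load() returns an iterator, documents can be processed in fixed-size groups without holding the whole library in memory. The sketch below shows one way to do that with `itertools.islice`; `in_batches` is a hypothetical helper that works on any iterable, demonstrated here on plain integers.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List


def in_batches(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield successive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# With a real loader: for batch in in_batches(loader.lazy_load(), 50): ...
demo_batches = list(in_batches(range(5), 2))
```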

load() → List[Document]

Load data into Document objects.

Return type

List[Document]

load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]

Load Documents and split into chunks. Chunks are returned as Documents.

Do not override this method. It should be considered to be deprecated!

Parameters

text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.

Returns

List of Documents.

Return type

List[Document]
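Since this method should be treated as deprecated, one alternative is to load the documents and call the splitter explicitly. The sketch below assumes a splitter that exposes `split_documents`, as LangChain text splitters such as `RecursiveCharacterTextSplitter` do. `load_then_split` and the two demo classes are hypothetical stand-ins so the snippet runs on its own.

```python
from typing import Any, List


def load_then_split(loader: Any, splitter: Any) -> List[Any]:
    """Load all documents, then split them with an explicitly chosen splitter."""
    documents = list(loader.lazy_load())
    return splitter.split_documents(documents)


class DemoLoader:
    """Stand-in loader that yields plain strings instead of Documents."""

    def lazy_load(self):
        yield "alpha beta"
        yield "gamma"


class DemoSplitter:
    """Stand-in splitter that breaks each text on whitespace."""

    def split_documents(self, documents):
        return [word for text in documents for word in text.split()]


chunks = load_then_split(DemoLoader(), DemoSplitter())
```

Passing the splitter explicitly keeps the chunking configuration visible at the call site instead of relying on a default.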
