langchain_community.document_loaders.sharepoint.SharePointLoader
- class langchain_community.document_loaders.sharepoint.SharePointLoader
Bases: O365BaseLoader, BaseLoader
Load from SharePoint.
Create a new model by parsing and validating input data from keyword arguments.
Raises ValidationError if the input data cannot be parsed to form a valid model.
- param auth_with_token: bool = False
Whether to authenticate with a token or not. Defaults to False.
- param chunk_size: Union[int, str] = 5242880
Number of bytes to retrieve from each API call to the server. Either an int or ‘auto’.
- param document_library_id: str [Required]
The ID of the SharePoint document library to load data from.
- param folder_id: Optional[str] = None
The ID of the folder to load data from.
- param folder_path: Optional[str] = None
The path to the folder to load data from.
- param load_auth: Optional[bool] = False
Whether to load authorization identities.
- param load_extended_metadata: Optional[bool] = False
Whether to load extended metadata: size, owner, and full_path.
- param object_ids: Optional[List[str]] = None
The IDs of the objects to load data from.
- param recursive: bool = False
Whether to recursively load subfolders.
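- param settings: _O365Settings [Optional]
Settings for the Office365 API client.
- param token_path: Path = PosixPath('/home/runner/.credentials/o365_token.txt')
The path to the token file used to make API calls. The default shown is resolved from the home directory of the machine that built this documentation, so it will differ on other systems.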
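A minimal construction sketch, assuming the `langchain-community` package and its Office 365 dependency are installed and that credentials for the Office 365 client are already configured. The helper names `loader_kwargs` and `build_loader` and the placeholder IDs are hypothetical; only the parameter names come from the list above.

```python
from typing import Any, Dict, Optional


def loader_kwargs(
    document_library_id: str,
    folder_path: Optional[str] = None,
    recursive: bool = False,
) -> Dict[str, Any]:
    """Collect keyword arguments for SharePointLoader (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "document_library_id": document_library_id,  # the only required parameter
        "recursive": recursive,
    }
    # Only pass folder_path when the caller actually supplied one.
    if folder_path is not None:
        kwargs["folder_path"] = folder_path
    return kwargs


def build_loader(**kwargs: Any):
    """Instantiate the loader.

    The import is deferred so this module can be imported even where
    langchain-community is not installed.
    """
    from langchain_community.document_loaders.sharepoint import SharePointLoader

    return SharePointLoader(**kwargs)
```

With valid credentials, a call such as `build_loader(**loader_kwargs("YOUR_DOCUMENT_LIBRARY_ID", folder_path="/path/to/folder", recursive=True))` would be expected to return a configured loader.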
- async alazy_load() → AsyncIterator[Document]
A lazy loader for Documents.
- Return type
AsyncIterator[Document]
- authorized_identities(file_id: str) → List
Retrieve the access identities (user/group emails) for a given file.
- Parameters
file_id (str) – The ID of the file.
- Returns
A list of group names (email addresses) that have access to the file.
- Return type
List
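One possible use of the returned identities is filtering loaded records by who may see them. The sketch below works on plain metadata dictionaries. The `authorized_identities` metadata key and the `visible_to` helper are assumptions made for illustration and are not confirmed by this page; the case-insensitive comparison is a choice of this sketch.

```python
from typing import Any, Dict, Iterable, List


def visible_to(
    records: Iterable[Dict[str, Any]],
    identity: str,
    key: str = "authorized_identities",  # assumed metadata key, not confirmed above
) -> List[Dict[str, Any]]:
    """Return the records whose identity list contains `identity`."""
    wanted = identity.lower()
    result = []
    for record in records:
        # Treat a missing or empty list as "no access".
        identities = {str(item).lower() for item in (record.get(key) or [])}
        if wanted in identities:
            result.append(record)
    return result
```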
- get_extended_metadata(file_id: str) → dict
Retrieve extended metadata for a file in SharePoint. The following fields are currently supported:
- size: size of the source file.
- owner: display name of the owner of the source file.
- full_path: human-readable path of the source file.
- Parameters
file_id (str) – The ID of the file.
- Returns
A dictionary containing the extended metadata of the file, including size, owner, and full path.
- Return type
dict
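The shape of the returned dictionary can be sketched as a `TypedDict`. The field names come from the description above; the value types, the `ExtendedMetadata` class, and the `describe` helper are assumptions for illustration only.

```python
from typing import TypedDict


class ExtendedMetadata(TypedDict, total=False):
    size: int       # assumed to be numeric; the unit is not stated above
    owner: str      # display name of the file owner
    full_path: str  # human-readable path of the file


def describe(meta: ExtendedMetadata) -> str:
    """Format extended metadata as one line, tolerating missing fields."""
    path = meta.get("full_path", "<unknown path>")
    owner = meta.get("owner", "<unknown owner>")
    size = meta.get("size", "<unknown size>")
    return f"{path} (owner: {owner}, size: {size})"
```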
- lazy_load() → Iterator[Document]
Load documents lazily. Use this when working at a large scale.
- Yields
Document – A document object representing the parsed blob.
- Return type
Iterator[Document]
- load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]
Load Documents and split into chunks. Chunks are returned as Documents.
Do not override this method; it should be considered deprecated.
- Parameters
text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.
- Returns
List of Documents.
- Return type
List[Document]
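Because this method should be considered deprecated, the same result can be sketched by loading and splitting in two explicit steps. The helper below is duck-typed: it assumes only that the loader exposes `lazy_load()` and that the splitter exposes `split_documents()`, as LangChain text splitters do. `load_then_split` and the two stub classes are hypothetical and operate on plain strings so the example runs on its own.

```python
from typing import Any, List


def load_then_split(loader: Any, splitter: Any) -> List[Any]:
    """Load every document, then hand the list to the splitter."""
    return splitter.split_documents(list(loader.lazy_load()))


class _FixedWidthSplitter:
    """Stub splitter that cuts strings into fixed-width pieces."""

    def __init__(self, width: int) -> None:
        self.width = width

    def split_documents(self, documents):
        chunks = []
        for text in documents:
            chunks.extend(
                text[i : i + self.width] for i in range(0, len(text), self.width)
            )
        return chunks


class _TwoDocLoader:
    """Stub loader yielding two short 'documents'."""

    def lazy_load(self):
        yield "abcdef"
        yield "gh"


chunks = load_then_split(_TwoDocLoader(), _FixedWidthSplitter(4))
```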