langchain.document_loaders.obs_directory.OBSDirectoryLoader

class langchain.document_loaders.obs_directory.OBSDirectoryLoader(bucket: str, endpoint: str, config: Optional[dict] = None, prefix: str = '')

Load documents from a Huawei OBS (Object Storage Service) directory.

Initialize the OBSDirectoryLoader with the specified settings.

Parameters
  • bucket (str) – The name of the OBS bucket to be used.

  • endpoint (str) – The endpoint URL of your OBS bucket.

  • config (dict, optional) – The parameters for connecting to OBS, provided as a dictionary. Defaults to None. The dictionary can have the following keys:
      - "ak" (str, optional): Your OBS access key (required if get_token_from_ecs is False and the bucket policy is not public read).
      - "sk" (str, optional): Your OBS secret key (required if get_token_from_ecs is False and the bucket policy is not public read).
      - "token" (str, optional): Your security token (required if using temporary credentials).
      - "get_token_from_ecs" (bool, optional): Whether to retrieve the security token from ECS. Defaults to False if not provided. If set to True, ak, sk, and token are ignored.

  • prefix (str, optional) – The OBS key prefix that identifies the directory to load. Defaults to "".

Note

Before using this class, make sure you have registered with OBS and have the necessary credentials. The endpoint is always required. The ak and sk values are required unless get_token_from_ecs is True or the bucket policy is public read. token is required when using temporary credentials.
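The credential rules in this note can be sketched as a small helper that builds the config dictionary. This is only an illustration: make_obs_config and its public_read flag are hypothetical names, not part of LangChain or the OBS SDK, and the checks simply mirror the rules stated above.

```python
from typing import Optional


def make_obs_config(
    ak: Optional[str] = None,
    sk: Optional[str] = None,
    token: Optional[str] = None,
    get_token_from_ecs: bool = False,
    public_read: bool = False,  # hypothetical flag: caller states the bucket policy
) -> dict:
    """Hypothetical helper: build a `config` dict that follows the note's rules."""
    if get_token_from_ecs:
        # ak, sk, and token are ignored in this mode, so leave them out.
        return {"get_token_from_ecs": True}
    if public_read and not (ak or sk):
        # A public-read bucket needs no credentials.
        return {}
    if not (ak and sk):
        raise ValueError(
            "ak and sk are required unless get_token_from_ecs is True "
            "or the bucket policy is public read"
        )
    config = {"ak": ak, "sk": sk}
    if token:
        # Temporary credentials carry a security token alongside ak and sk.
        config["token"] = token
    return config
```

The returned dictionary would then be passed as the config argument, for example OBSDirectoryLoader("your-bucket-name", "your-endpoint", make_obs_config("your-access-key", "your-secret-key")).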

Example

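To create a new OBSDirectoryLoader:

```
from langchain.document_loaders.obs_directory import OBSDirectoryLoader

# Placeholder credentials: substitute your own access key and secret key.
config = {"ak": "your-access-key", "sk": "your-secret-key"}
directory_loader = OBSDirectoryLoader("your-bucket-name", "your-endpoint", config, "your-prefix")
```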

Methods

__init__(bucket, endpoint[, config, prefix])

Initialize the OBSDirectoryLoader with the specified settings.

lazy_load()

A lazy loader for Documents.

load()

Load documents.

load_and_split([text_splitter])

Load Documents and split into chunks.

__init__(bucket: str, endpoint: str, config: Optional[dict] = None, prefix: str = '')

Initialize the OBSDirectoryLoader with the specified settings.


lazy_load() → Iterator[Document]

A lazy loader for Documents.

load() → List[Document]

Load documents.
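load() returns the full list at once, while lazy_load() returns an iterator. This page does not say whether OBSDirectoryLoader overrides lazy_load(); in some LangChain releases the base-class method raises NotImplementedError when a loader does not override it. Under that assumption, a minimal sketch of a fallback could look like the following. iter_documents is a hypothetical helper, not part of LangChain.

```python
from typing import Any, Iterator


def iter_documents(loader: Any) -> Iterator[Any]:
    """Yield documents lazily when possible, otherwise fall back to load()."""
    try:
        iterator = iter(loader.lazy_load())
        # Pull the first item here so an error raised on first use is caught too.
        first = next(iterator)
    except NotImplementedError:
        # Assumption: this loader does not provide its own lazy_load().
        yield from loader.load()
        return
    except StopIteration:
        # lazy_load() worked but produced no documents.
        return
    yield first
    yield from iterator
```

Calling code can then write `for doc in iter_documents(directory_loader): ...` without caring which of the two methods the loader supports.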

load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]

Load Documents and split into chunks. Chunks are returned as Documents.

Parameters

text_splitter – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.

Returns

List of Documents.
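To control chunk size instead of relying on the default splitter, a splitter instance can be passed explicitly. The sketch below assumes RecursiveCharacterTextSplitter is importable from langchain.text_splitter and accepts chunk_size and chunk_overlap; load_chunks and its splitter_factory parameter are hypothetical names introduced only for illustration.

```python
from typing import Any, Callable, List, Optional


def load_chunks(
    loader: Any,
    chunk_size: int = 1000,
    chunk_overlap: int = 100,
    splitter_factory: Optional[Callable[..., Any]] = None,  # hypothetical hook
) -> List[Any]:
    """Hypothetical helper: call load_and_split() with an explicit splitter."""
    if splitter_factory is None:
        # Imported lazily so the helper can be defined without LangChain installed.
        from langchain.text_splitter import RecursiveCharacterTextSplitter

        splitter_factory = RecursiveCharacterTextSplitter
    splitter = splitter_factory(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    return loader.load_and_split(text_splitter=splitter)
```

For example, `load_chunks(directory_loader, chunk_size=500, chunk_overlap=50)` would return the chunks as a list of Documents.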
