langchain_community.utilities.pebblo.PebbloLoaderAPIWrapper
- class langchain_community.utilities.pebblo.PebbloLoaderAPIWrapper
Bases: BaseModel
Wrapper for Pebblo Loader API.
Validates that the API key is set in the environment.
- param api_key: Optional[str] = None
API key for Pebblo Cloud.
- param classifier_location: str = 'local'
Location of the classifier, either 'local' or 'cloud'. Defaults to 'local'.
- param classifier_url: Optional[str] = None
URL of the Pebblo Classifier.
- param cloud_url: Optional[str] = None
URL of the Pebblo Cloud.
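The validator described above implies a lookup order: an explicitly passed value first, then the environment. The following is a minimal standard-library sketch of that order, not the wrapper's actual code; the helper `resolve_setting` and the environment variable name `PEBBLO_API_KEY` are assumptions made for illustration.

```python
import os
from typing import Optional


def resolve_setting(
    values: dict, key: str, env_key: str, default: Optional[str] = None
) -> Optional[str]:
    """Return values[key] if set, else the environment variable, else default."""
    explicit = values.get(key)
    if explicit:  # an explicit keyword argument takes precedence
        return explicit
    return os.environ.get(env_key, default)


# Hypothetical variable name, used only for this illustration.
os.environ["PEBBLO_API_KEY"] = "key-from-env"

from_env = resolve_setting({}, "api_key", "PEBBLO_API_KEY")
from_kwarg = resolve_setting(
    {"api_key": "key-from-kwarg"}, "api_key", "PEBBLO_API_KEY"
)
missing = resolve_setting({}, "cloud_url", "UNSET_VARIABLE_FOR_DEMO")
```

Because every field except `classifier_location` defaults to None, a setting found in neither place resolves to None here as well.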
- build_classification_payload(app: App, docs: List[dict], loader_details: dict, source_owner: str, source_aggregate_size: int, loading_end: bool) -> dict
Build the payload for document classification.
- Parameters
app (App) – App instance.
docs (List[dict]) – List of documents to be classified.
loader_details (dict) – Loader details.
source_owner (str) – Owner of the source.
source_aggregate_size (int) – Aggregate size of the source.
loading_end (bool) – Whether the loader has finished loading data.
- Returns
Payload for document classification.
- Return type
dict
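As a rough illustration of how the arguments listed above could be combined into one dict, here is a sketch with hypothetical key names. The real payload schema is not documented on this page, so the keys below simply mirror the parameter names, and `app_name` stands in for the `App` instance.

```python
import json
from typing import Any, Dict, List


def build_payload_sketch(
    app_name: str,
    docs: List[dict],
    loader_details: dict,
    source_owner: str,
    source_aggregate_size: int,
    loading_end: bool,
) -> Dict[str, Any]:
    """Assemble one JSON-serializable dict from the documented arguments."""
    # Key names are hypothetical; they mirror the parameter names only.
    loader = dict(loader_details)  # copy so the caller's dict is not mutated
    loader["source_aggregate_size"] = source_aggregate_size
    return {
        "app_name": app_name,
        "docs": docs,
        "loader_details": loader,
        "source_owner": source_owner,
        "loading_end": loading_end,
    }


details = {"loader": "TextLoader"}
payload = build_payload_sketch(
    app_name="demo-app",
    docs=[{"doc": "hello", "source_path": "/tmp/a.txt"}],
    loader_details=details,
    source_owner="alice",
    source_aggregate_size=5,
    loading_end=True,
)
encoded = json.dumps(payload)  # the payload must survive JSON encoding
```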
- classify_documents(docs_with_id: List[IndexedDocument], app: App, loader_details: dict, loading_end: bool = False) -> dict
Send documents to the Pebblo server for classification, then send the classified documents to Daxa cloud (if api_key is present).
- Parameters
docs_with_id (List[IndexedDocument]) – List of documents to be classified.
app (App) – App instance.
loader_details (dict) – Loader details.
loading_end (bool) – Whether the loader has finished loading data.
- Return type
dict
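The two-step flow described for `classify_documents` (classify first, forward to the cloud only when an API key is present) can be sketched as follows. This is an illustration under stated assumptions, not the method's implementation: `classify_flow_sketch`, the injected `post` callable, the `classified` key, and both URLs are hypothetical.

```python
from typing import Callable, List, Optional


def classify_flow_sketch(
    payload: dict,
    post: Callable[[str, dict], Optional[dict]],
    classifier_url: str,
    cloud_url: str,
    api_key: Optional[str] = None,
) -> dict:
    """Classify first; forward to the cloud only when an API key is set."""
    classified = post(classifier_url, payload) or {}
    if api_key:
        post(cloud_url, {**payload, "classified": classified})
    return classified


calls: List[str] = []


def fake_post(url: str, body: dict) -> Optional[dict]:
    calls.append(url)  # record the call instead of touching the network
    return {"docs": body.get("docs", [])}


result = classify_flow_sketch(
    {"docs": [{"doc": "hello"}]},
    fake_post,
    classifier_url="http://classifier.invalid/classify",
    cloud_url="http://cloud.invalid/upload",
    api_key=None,
)
```

Injecting `post` keeps the sketch testable without a running server; with `api_key=None` only the classifier URL is contacted.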
- static make_request(method: str, url: str, headers: dict, payload: Optional[dict] = None, timeout: int = 20) -> Optional[Response]
Make a request to the Pebblo API.
- Parameters
method (str) – HTTP method (GET, POST, PUT, DELETE, etc.).
url (str) – URL for the request.
headers (dict) – Headers for the request.
payload (Optional[dict]) – Payload for the request (for POST, PUT, etc.).
timeout (int) – Timeout for the request in seconds.
- Returns
Response object if the request is successful.
- Return type
Optional[Response]
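The `Optional[Response]` return type suggests a helper that hands back a response when one is obtained and None otherwise. Below is a minimal sketch of that shape using the standard library's `urllib` as a stand-in for whatever HTTP client the wrapper uses; `make_request_sketch` and its dict return value are assumptions for illustration.

```python
import json
import urllib.error
import urllib.request
from typing import Optional


def make_request_sketch(
    method: str,
    url: str,
    headers: dict,
    payload: Optional[dict] = None,
    timeout: int = 20,
) -> Optional[dict]:
    """Send a JSON request; return status and body, or None on failure."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    try:
        req = urllib.request.Request(url, data=data, headers=headers, method=method)
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return {"status": resp.status, "body": resp.read().decode("utf-8")}
    except (urllib.error.URLError, ValueError, OSError):
        # Unreachable host, HTTP error status, or malformed URL: report None
        # instead of raising, so callers can check the result.
        return None


# Neither call touches the network: both URLs are rejected locally.
unknown_scheme = make_request_sketch("GET", "nope://example", {})
malformed = make_request_sketch("POST", "not a url", {}, payload={"a": 1})
```

Returning None rather than raising lets the caller decide whether a failed request should stop the load.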
- static prepare_docs_for_classification(docs_with_id: List[IndexedDocument], source_path: str, loader_details: dict) -> Tuple[List[dict], int]
Prepare documents for classification.
- Parameters
docs_with_id (List[IndexedDocument]) – List of documents to be classified.
source_path (str) – Source path of the documents.
loader_details (dict) – Contains loader info.
- Returns
Documents and the aggregate size of the source.
- Return type
Tuple[List[dict], int]
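To make the `Tuple[List[dict], int]` return value concrete, the sketch below converts documents to dicts and sums their sizes. It assumes size means the UTF-8 byte length of the content; `IndexedDocSketch`, its fields, and the dict keys are hypothetical stand-ins, and `loader_details` is left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class IndexedDocSketch:
    """Hypothetical stand-in for IndexedDocument."""

    pb_id: str
    page_content: str


def prepare_docs_sketch(
    docs_with_id: List[IndexedDocSketch], source_path: str
) -> Tuple[List[dict], int]:
    """Return per-document dicts plus the total content size in bytes."""
    docs: List[dict] = []
    aggregate_size = 0
    for doc in docs_with_id:
        size = len(doc.page_content.encode("utf-8"))  # bytes, not characters
        aggregate_size += size
        docs.append(
            {
                "pb_id": doc.pb_id,
                "doc": doc.page_content,
                "source_path": source_path,
                "size": size,
            }
        )
    return docs, aggregate_size


docs, total = prepare_docs_sketch(
    [IndexedDocSketch("0", "abc"), IndexedDocSketch("1", "\u00e9")],
    "/tmp/demo.txt",
)
```

The second document is a single character that occupies two bytes in UTF-8, which is why byte length and character count can differ.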
- send_docs_to_pebblo_cloud(payload: dict) -> None
Send documents to Pebblo cloud.
- Parameters
payload (dict) – The payload containing documents to be sent.
- Return type
None