langchain.document_loaders.blob_loaders.file_system.FileSystemBlobLoader
- class langchain.document_loaders.blob_loaders.file_system.FileSystemBlobLoader(path: Union[str, Path], *, glob: str = '**/[!.]*', exclude: Sequence[str] = (), suffixes: Optional[Sequence[str]] = None, show_progress: bool = False)
Load blobs in the local file system.
Example:
    from langchain.document_loaders.blob_loaders import FileSystemBlobLoader

    loader = FileSystemBlobLoader("/path/to/directory")
    for blob in loader.yield_blobs():
        print(blob)
Initialize with a path to directory and how to glob over it.
- Parameters
path – Path to directory to load from
glob – Glob pattern relative to the specified path. By default, it picks up all non-hidden files.
exclude – Patterns to exclude from the results, written in glob syntax.
suffixes – Provide to keep only files with these suffixes. Useful when you want to keep files with several different suffixes. Suffixes must include the dot, e.g. “.txt”.
show_progress – If true, will show a progress bar as the files are loaded. This forces an iteration through all matching files to count them prior to loading them.
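As a rough illustration of how the glob, exclude, and suffixes parameters could interact, the following standard-library sketch filters paths in three steps. The function name iter_matching_paths is hypothetical, and the order of the checks is an assumption rather than the library's confirmed implementation.

```python
from pathlib import Path
from typing import Iterator, Optional, Sequence


def iter_matching_paths(
    root: Path,
    glob: str = "**/[!.]*",
    exclude: Sequence[str] = (),
    suffixes: Optional[Sequence[str]] = None,
) -> Iterator[Path]:
    """Hypothetical helper: yield files under root that pass all three filters."""
    # Sorting is only here to make the output order deterministic.
    for path in sorted(root.glob(glob)):
        # Drop anything that matches one of the exclude patterns.
        if any(path.match(pattern) for pattern in exclude):
            continue
        # A glob can match directories as well; keep regular files only.
        if not path.is_file():
            continue
        # Suffixes include the leading dot, e.g. ".txt".
        if suffixes is not None and path.suffix not in suffixes:
            continue
        yield path
```

Calling iter_matching_paths(Path("/path/to/directory"), suffixes=[".txt", ".md"]) would keep only text and Markdown files, which is the multi-suffix case that a single glob pattern cannot express easily.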
Examples
    # Recursively load all text files in a directory.
    loader = FileSystemBlobLoader("/path/to/directory", glob="**/*.txt")

    # Recursively load all non-hidden files in a directory.
    loader = FileSystemBlobLoader("/path/to/directory", glob="**/[!.]*")

    # Load all files in a directory without recursion.
    loader = FileSystemBlobLoader("/path/to/directory", glob="*")

    # Recursively load all files in a directory, except for py or pyc files.
    # The glob is "**/*" here so that it matches the comment; with "**/*.txt"
    # the exclude patterns would have nothing to filter out.
    loader = FileSystemBlobLoader(
        "/path/to/directory",
        glob="**/*",
        exclude=["**/*.py", "**/*.pyc"],
    )
Methods
__init__(path, *[, glob, exclude, suffixes, ...]) – Initialize with a path to directory and how to glob over it.
count_matching_files() – Count files that match the pattern without loading them.
yield_blobs() – Yield blobs that match the requested pattern.
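Counting and loading can be kept separate: counting only walks the matching paths, while blob iteration reads files lazily. The sketch below mimics that split with a hypothetical MiniBlobLoader class built on the standard library only. It is an assumption about the general pattern, not the library's code, and it yields raw bytes where the real loader yields blobs.

```python
from pathlib import Path
from typing import Iterator


class MiniBlobLoader:
    """Hypothetical stand-in used only to illustrate count-then-load."""

    def __init__(self, path: str, glob: str = "**/[!.]*") -> None:
        self.path = Path(path)
        self.glob = glob

    def _yield_paths(self) -> Iterator[Path]:
        # Directories can match the glob too, so keep regular files only.
        return (p for p in sorted(self.path.glob(self.glob)) if p.is_file())

    def count_matching_files(self) -> int:
        # Consumes the path iterator but never opens a file.
        return sum(1 for _ in self._yield_paths())

    def yield_blobs(self) -> Iterator[bytes]:
        # Reads each file only when the caller asks for the next item.
        for p in self._yield_paths():
            yield p.read_bytes()
```

A progress bar needs the total before the first item is processed, which is why show_progress costs an extra pass over the matching files, much like calling count_matching_files() before yield_blobs().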
- __init__(path: Union[str, Path], *, glob: str = '**/[!.]*', exclude: Sequence[str] = (), suffixes: Optional[Sequence[str]] = None, show_progress: bool = False) -> None
Initialize with a path to directory and how to glob over it.