eformer.serialization.fsspec_utils
- async eformer.serialization.fsspec_utils.async_remove(url, *, recursive=False, **kwargs)
Asynchronously remove a file or directory.
Uses async operations when the filesystem supports it (e.g., gcsfs, s3fs), otherwise falls back to synchronous removal. Useful for non-blocking I/O in async contexts.
- Parameters
url – URL or path of the file/directory to remove. Supports local paths and cloud storage URLs (gs://, s3://, etc.).
recursive – If True, remove directories and their contents recursively. Required for non-empty directories. Defaults to False.
**kwargs – Additional arguments passed to fsspec.core.url_to_fs (e.g., credentials for cloud storage).
- Returns
None. The async filesystem operation result is awaited internally.
Note
For AsyncFileSystem backends (GCS, S3), uses native async _rm method.
For synchronous backends (local filesystem), blocks during removal.
Prefer this over remove() when running in async contexts for better performance with cloud storage.
Example
>>> await async_remove("gs://my-bucket/old-checkpoint", recursive=True)
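The dispatch described in the note can be sketched without any cloud dependency. This is a minimal illustration, not eformer's implementation: `FakeAsyncFS`, `FakeSyncFS`, and `remove_with_fallback` are hypothetical names standing in for an async backend, a synchronous backend, and the fallback logic.

```python
import asyncio
import inspect


class FakeAsyncFS:
    """Hypothetical stand-in for an async backend exposing a coroutine `_rm`."""

    def __init__(self):
        self.removed = []

    async def _rm(self, path, recursive=False):
        await asyncio.sleep(0)  # yield to the event loop, as network I/O would
        self.removed.append((path, recursive))


class FakeSyncFS:
    """Hypothetical stand-in for a backend exposing only a blocking `rm`."""

    def __init__(self):
        self.removed = []

    def rm(self, path, recursive=False):
        self.removed.append((path, recursive))


async def remove_with_fallback(fs, path, *, recursive=False):
    """Await the native coroutine if the backend has one, else call `rm`."""
    native = getattr(fs, "_rm", None)
    if native is not None and inspect.iscoroutinefunction(native):
        await native(path, recursive=recursive)
    else:
        # Synchronous path: this call blocks the event loop until it returns.
        fs.rm(path, recursive=recursive)


async def main():
    async_fs, sync_fs = FakeAsyncFS(), FakeSyncFS()
    await remove_with_fallback(async_fs, "bucket/old-checkpoint", recursive=True)
    await remove_with_fallback(sync_fs, "/tmp/old-checkpoint", recursive=True)
    return async_fs.removed, sync_fs.removed


async_calls, sync_calls = asyncio.run(main())
```

Both calls use the same `await` syntax at the call site. Only the async backend actually yields control while the removal is in flight.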
- eformer.serialization.fsspec_utils.exists(url, **kwargs) -> bool
Check if a file or directory exists at the given URL.
Uses fsspec to support multiple storage backends including local filesystem, Google Cloud Storage (gs://), Amazon S3 (s3://), and more.
- Parameters
url – URL or path to check. Supports various protocols (file://, gs://, s3://, etc.).
**kwargs – Additional arguments passed to fsspec.core.url_to_fs (e.g., credentials).
- Returns
True if the path exists, False otherwise.
Example
>>> exists("/local/path/file.txt")
True
>>> exists("gs://my-bucket/checkpoint/model.safetensors")
False
- eformer.serialization.fsspec_utils.expand_glob(url)
Expand glob patterns and brace expressions in URLs.
Supports both brace expansion (e.g., file_{1..3}.txt) and glob patterns (e.g., *.txt). Works with various filesystem protocols.
- Parameters
url – URL or path with potential glob patterns and/or brace expressions.
- Yields
Expanded URLs/paths matching the pattern.
Example
>>> list(expand_glob("data/*.json"))
['data/file1.json', 'data/file2.json']
>>> list(expand_glob("file_{1..3}.txt"))
['file_1.txt', 'file_2.txt', 'file_3.txt']
- eformer.serialization.fsspec_utils.join_path(lhs, rhs)
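To make the two expansion stages concrete, here is a standard-library sketch for local paths only. It assumes braces are expanded first and each result is then globbed. `expand_braces` and `expand_local` are hypothetical helpers, not part of eformer, and nested braces are not handled.

```python
import glob
import re

_BRACE = re.compile(r"\{([^{}]+)\}")
_RANGE = re.compile(r"(-?\d+)\.\.(-?\d+)")


def expand_braces(pattern):
    """Yield every string produced by expanding {a,b} and {m..n} groups."""
    match = _BRACE.search(pattern)
    if match is None:
        yield pattern
        return
    head, body, tail = pattern[:match.start()], match.group(1), pattern[match.end():]
    range_match = _RANGE.fullmatch(body)
    if range_match:
        start, end = int(range_match.group(1)), int(range_match.group(2))
        step = 1 if end >= start else -1
        choices = [str(i) for i in range(start, end + step, step)]
    elif "," in body:
        choices = body.split(",")
    else:
        choices = ["{" + body + "}"]  # not a brace expression: keep it literal
    for choice in choices:
        for rest in expand_braces(tail):
            yield head + choice + rest


def expand_local(pattern):
    """Expand braces, then glob any result that still contains wildcards."""
    for candidate in expand_braces(pattern):
        if any(ch in candidate for ch in "*?["):
            yield from sorted(glob.glob(candidate))
        else:
            yield candidate


numbered = list(expand_braces("file_{1..3}.txt"))
combined = list(expand_braces("{a,b}_{1..2}"))
```

Brace expansion is purely textual, so it yields names whether or not the files exist. Glob patterns only yield paths that are actually present.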
Join two path components intelligently, handling protocols.
Similar to os.path.join but handles URLs with protocols (gs://, s3://, etc.). If rhs has a protocol, it’s returned as-is (absolute path behavior).
- Parameters
lhs – Left-hand side path or URL.
rhs – Right-hand side path or URL.
- Returns
Joined path, preserving protocols appropriately.
- Raises
ValueError – If lhs and rhs both specify a protocol and the two protocols differ.
Example
>>> join_path("gs://bucket/dir", "file.txt")
'gs://bucket/dir/file.txt'
>>> join_path("/local/dir", "gs://bucket/file.txt")
'gs://bucket/file.txt'
- eformer.serialization.fsspec_utils.join_path is followed by:
- eformer.serialization.fsspec_utils.mkdirs(path)
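The joining rules can be sketched with the standard library alone. This is one reading of the documented behavior, not eformer's code. It assumes the protocol-mismatch check runs before the "rhs wins" rule, and `split_protocol` and `join_path_sketch` are hypothetical names.

```python
import posixpath
import re

_PROTOCOL = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*)://")


def split_protocol(url):
    """Return (protocol, rest); protocol is None for plain paths."""
    match = _PROTOCOL.match(url)
    if match:
        return match.group(1), url[match.end():]
    return None, url


def join_path_sketch(lhs, rhs):
    lhs_protocol, _ = split_protocol(lhs)
    rhs_protocol, _ = split_protocol(rhs)
    # Assumed precedence: conflicting protocols are an error.
    if lhs_protocol and rhs_protocol and lhs_protocol != rhs_protocol:
        raise ValueError(f"protocol mismatch: {lhs_protocol} vs {rhs_protocol}")
    # An rhs that carries a protocol behaves like an absolute path.
    if rhs_protocol is not None:
        return rhs
    # Otherwise join with forward slashes, which URLs always use.
    return posixpath.join(lhs, rhs)


joined = join_path_sketch("gs://bucket/dir", "file.txt")
absolute = join_path_sketch("/local/dir", "gs://bucket/file.txt")
```

Using `posixpath.join` rather than `os.path.join` keeps the separator a forward slash on every platform, which matters for cloud URLs.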
Create a directory and all necessary parent directories.
Uses fsspec to support multiple storage backends. Creates the directory and any missing parent directories recursively.
- Parameters
path – Path or URL of the directory to create. Supports local paths and cloud storage URLs (gs://, s3://, etc.).
Note
Uses exist_ok=True, so no error is raised if the directory already exists. This makes it safe to call multiple times on the same path.
Example
>>> mkdirs("/local/path/to/checkpoint/dir")
>>> mkdirs("gs://my-bucket/checkpoints/run-1000")
- eformer.serialization.fsspec_utils.remove(url, *, recursive=False, **kwargs)
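For local paths, the idempotent behavior described in the note resembles the standard library's `os.makedirs(..., exist_ok=True)`. The sketch below uses that as an analogy only. It does not call eformer or fsspec, and the directory names are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "checkpoints", "run-1000")
    os.makedirs(target, exist_ok=True)  # creates both missing levels
    os.makedirs(target, exist_ok=True)  # second call is a no-op, no error
    created = os.path.isdir(target)
```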
Remove a file or directory.
Uses fsspec for cross-platform and cloud-compatible file removal.
- Parameters
url – URL or path of the file/directory to remove. Supports local paths and cloud storage URLs (gs://, s3://, etc.).
recursive – If True, remove directories and their contents recursively. Required for non-empty directories. Defaults to False.
**kwargs – Additional arguments passed to fsspec.core.url_to_fs (e.g., credentials for cloud storage).
- Raises
FileNotFoundError – If the path does not exist.
OSError – If trying to remove a non-empty directory without recursive=True.
Example
>>> remove("/local/path/file.txt")
>>> remove("gs://my-bucket/old-checkpoint", recursive=True)
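Callers that clean up paths which may already be gone usually need to handle the documented exceptions. The sketch below shows one way to do that. `remove_if_present` is a hypothetical wrapper that accepts any callable with the signature of `remove`, and `local_remove` is a standard-library stand-in so the example runs without eformer.

```python
import os
import shutil
import tempfile


def local_remove(path, *, recursive=False):
    """Local stand-in mirroring the documented error behavior."""
    if os.path.isdir(path):
        if recursive:
            shutil.rmtree(path)
        else:
            os.rmdir(path)  # raises OSError if the directory is not empty
    else:
        os.remove(path)  # raises FileNotFoundError if the path is missing


def remove_if_present(remove_fn, url, *, recursive=False):
    """Return True if something was removed, False if nothing was there."""
    try:
        remove_fn(url, recursive=recursive)
    except FileNotFoundError:
        return False
    return True


with tempfile.TemporaryDirectory() as root:
    ckpt = os.path.join(root, "old-checkpoint")
    os.makedirs(ckpt)
    open(os.path.join(ckpt, "model.bin"), "w").close()
    try:
        local_remove(ckpt)  # non-empty directory without recursive=True
        non_recursive_failed = False
    except OSError:
        non_recursive_failed = True
    first = remove_if_present(local_remove, ckpt, recursive=True)
    second = remove_if_present(local_remove, ckpt, recursive=True)
```

In real use, the library's `remove` would be passed in place of `local_remove`. Which exception a given backend raises for a non-empty directory may vary, so it is worth checking against the filesystem in use.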