eformer.serialization.fsspec_utils

async eformer.serialization.fsspec_utils.async_remove(url, *, recursive=False, **kwargs)

Asynchronously remove a file or directory.

Uses async operations when the filesystem supports them (e.g., gcsfs, s3fs); otherwise falls back to synchronous removal. Useful for non-blocking I/O in async contexts.

Parameters
  • url – URL or path of the file/directory to remove. Supports local paths and cloud storage URLs (gs://, s3://, etc.).

  • recursive – If True, remove directories and their contents recursively. Required for non-empty directories. Defaults to False.

  • **kwargs – Additional arguments passed to fsspec.core.url_to_fs (e.g., credentials for cloud storage).

Returns

None. The async filesystem operation result is awaited internally.

Note

  • For AsyncFileSystem backends (GCS, S3), uses the native async _rm method.

  • For synchronous backends (e.g., the local filesystem), the call blocks until removal completes.

  • Prefer this over remove() when running in async contexts for better performance with cloud storage.

Example

>>> await async_remove("gs://my-bucket/old-checkpoint", recursive=True)
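
The dispatch described in the note can be sketched with standard-library code only. This is a minimal illustration, not the library's implementation: `LocalSyncFS`, `FakeAsyncFS`, and `async_remove_sketch` are hypothetical names, and the filesystem object is passed in directly instead of being resolved from a URL.

```python
import asyncio
import inspect
import os
import shutil
import tempfile


class LocalSyncFS:
    """Hypothetical stand-in for a synchronous filesystem backend."""

    def rm(self, path, recursive=False):
        if os.path.isdir(path):
            if recursive:
                shutil.rmtree(path)
            else:
                os.rmdir(path)  # raises OSError if the directory is not empty
        else:
            os.remove(path)


class FakeAsyncFS:
    """Hypothetical stand-in for an async backend exposing a coroutine `_rm`."""

    def __init__(self):
        self.removed = []

    async def _rm(self, path, recursive=False):
        await asyncio.sleep(0)  # yield to the event loop, as real network I/O would
        self.removed.append((path, recursive))


async def async_remove_sketch(fs, path, *, recursive=False):
    # Prefer a native coroutine `_rm` when the backend provides one.
    rm_async = getattr(fs, "_rm", None)
    if rm_async is not None and inspect.iscoroutinefunction(rm_async):
        await rm_async(path, recursive=recursive)
    else:
        # Synchronous fallback: this call blocks the event loop until it returns.
        fs.rm(path, recursive=recursive)
```

With the async stand-in, the removal is awaited and other tasks can run in the meantime; with the synchronous stand-in, the coroutine still completes but does not yield while the removal is in progress.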

eformer.serialization.fsspec_utils.exists(url, **kwargs) -> bool

Check if a file or directory exists at the given URL.

Uses fsspec to support multiple storage backends including local filesystem, Google Cloud Storage (gs://), Amazon S3 (s3://), and more.

Parameters
  • url – URL or path to check. Supports various protocols (file://, gs://, s3://, etc.).

  • **kwargs – Additional arguments passed to fsspec.core.url_to_fs (e.g., credentials).

Returns

True if the path exists, False otherwise.

Example

>>> exists("/local/path/file.txt")
True
>>> exists("gs://my-bucket/checkpoint/model.safetensors")
False

eformer.serialization.fsspec_utils.expand_glob(url)

Expand glob patterns and brace expressions in URLs.

Supports both brace expansion (e.g., file_{1..3}.txt) and glob patterns (e.g., *.txt). Works with various filesystem protocols.

Parameters

url – URL or path with potential glob patterns and/or brace expressions.

Yields

Expanded URLs/paths matching the pattern.

Example

>>> list(expand_glob("data/*.json"))
['data/file1.json', 'data/file2.json']
>>> list(expand_glob("file_{1..3}.txt"))
['file_1.txt', 'file_2.txt', 'file_3.txt']
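
To make the brace-expansion step concrete, here is a minimal standard-library sketch. It is an assumption about the general technique, not the code used by expand_glob: `expand_braces_sketch` is a hypothetical helper that handles only non-nested braces containing either a numeric range (`{1..3}`) or comma-separated alternatives (`{a,b}`), and it does not perform the glob matching itself.

```python
import re
from itertools import product

# Matches one non-nested brace group and captures its body.
_BRACE = re.compile(r"\{([^{}]*)\}")


def _alternatives(body):
    # "1..3" -> ["1", "2", "3"]; "a,b" -> ["a", "b"]
    m = re.fullmatch(r"(-?\d+)\.\.(-?\d+)", body)
    if m:
        start, stop = int(m.group(1)), int(m.group(2))
        step = 1 if stop >= start else -1
        return [str(i) for i in range(start, stop + step, step)]
    return body.split(",")


def expand_braces_sketch(pattern):
    # re.split with a capturing group alternates literals and brace bodies.
    parts = _BRACE.split(pattern)
    literals, bodies = parts[0::2], parts[1::2]
    for combo in product(*(_alternatives(b) for b in bodies)):
        out = literals[0]
        for choice, literal in zip(combo, literals[1:]):
            out += choice + literal
        yield out
```

Each expanded string could then be handed to a glob call on the target filesystem to resolve wildcards such as `*.txt`.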

eformer.serialization.fsspec_utils.join_path(lhs, rhs)

Join two path components, taking URL protocols into account.

Similar to os.path.join but handles URLs with protocols (gs://, s3://, etc.). If rhs has a protocol, it’s returned as-is (absolute path behavior).

Parameters
  • lhs – Left-hand side path or URL.

  • rhs – Right-hand side path or URL.

Returns

Joined path, preserving protocols appropriately.

Raises

ValueError – If both paths specify a protocol and the two protocols differ.

Example

>>> join_path("gs://bucket/dir", "file.txt")
'gs://bucket/dir/file.txt'
>>> join_path("/local/dir", "gs://bucket/file.txt")
'gs://bucket/file.txt'
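
The rules above can be approximated in a few lines of standard-library Python. This is a hedged sketch, not the library's source: `join_path_sketch` and `_protocol` are hypothetical names, protocol detection uses a simple regular expression, and the ValueError is assumed to apply only when both sides carry a protocol (which is what the second example above implies).

```python
import posixpath
import re

# A protocol is a URL scheme followed by "://", e.g. "gs://" or "s3://".
_PROTOCOL = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*)://")


def _protocol(path):
    m = _PROTOCOL.match(path)
    return m.group(1) if m else None


def join_path_sketch(lhs, rhs):
    lhs_proto, rhs_proto = _protocol(lhs), _protocol(rhs)
    # Assumption: mismatched protocols are an error only when both are present.
    if lhs_proto is not None and rhs_proto is not None and lhs_proto != rhs_proto:
        raise ValueError(f"Cannot join paths with different protocols: {lhs} and {rhs}")
    # A right-hand side with a protocol behaves like an absolute path.
    if rhs_proto is not None:
        return rhs
    # posixpath keeps forward slashes on every platform, which URLs require.
    return posixpath.join(lhs, rhs)
```

Using `posixpath` instead of `os.path` is a deliberate choice in this sketch so that URL separators are not turned into backslashes on Windows.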

eformer.serialization.fsspec_utils.mkdirs(path)

Create a directory and all necessary parent directories.

Uses fsspec to support multiple storage backends. Creates the directory and any missing parent directories recursively.

Parameters

path – Path or URL of the directory to create. Supports local paths and cloud storage URLs (gs://, s3://, etc.).

Note

Uses exist_ok=True, so no error is raised if the directory already exists. This makes it safe to call multiple times on the same path.

Example

>>> mkdirs("/local/path/to/checkpoint/dir")
>>> mkdirs("gs://my-bucket/checkpoints/run-1000")
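
For local paths, the idempotent behavior described in the note matches what the standard library's `os.makedirs` does with `exist_ok=True`. The snippet below is a local-only analogue under that assumption; it does not call mkdirs itself and uses a temporary directory as a hypothetical target.

```python
import os
import tempfile

base = tempfile.mkdtemp()
target = os.path.join(base, "to", "checkpoint", "dir")

os.makedirs(target, exist_ok=True)  # creates "to", "checkpoint", and "dir"
os.makedirs(target, exist_ok=True)  # second call is a no-op, no FileExistsError
```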

eformer.serialization.fsspec_utils.remove(url, *, recursive=False, **kwargs)

Remove a file or directory.

Uses fsspec for cross-platform and cloud-compatible file removal.

Parameters
  • url – URL or path of the file/directory to remove. Supports local paths and cloud storage URLs (gs://, s3://, etc.).

  • recursive – If True, remove directories and their contents recursively. Required for non-empty directories. Defaults to False.

  • **kwargs – Additional arguments passed to fsspec.core.url_to_fs (e.g., credentials for cloud storage).

Raises
  • FileNotFoundError – If the path does not exist.

  • OSError – If trying to remove a non-empty directory without recursive=True.

Example

>>> remove("/local/path/file.txt")
>>> remove("gs://my-bucket/old-checkpoint", recursive=True)

eformer.serialization.fsspec_utils.should_write_shared_checkpoint_files(path) -> bool

Whether the current process should write shared checkpoint metadata.

Local files keep the historical behavior where every process may perform the shared setup/writes. Remote/object-store paths are restricted to process 0 to avoid cross-host contention on shared metadata files.
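
The decision rule can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `should_write_sketch` is a hypothetical name, paths with no protocol or with `file://` are assumed to count as local, and the process index is passed in explicitly, whereas the real function presumably obtains it from the distributed runtime.

```python
import re

# A protocol is a URL scheme followed by "://", e.g. "gs://" or "s3://".
_PROTOCOL = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*)://")


def should_write_sketch(path, process_index):
    m = _PROTOCOL.match(path)
    protocol = m.group(1) if m else None
    # Assumption: no protocol or "file" means a local path.
    is_local = protocol in (None, "file")
    # Local paths: every process may write. Remote paths: only process 0.
    return is_local or process_index == 0
```

Under these assumptions, `should_write_sketch("gs://my-bucket/run", 3)` is False while `should_write_sketch("/local/run", 3)` is True.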