eformer.executor.docker_tpu

Docker TPU execution utilities for managing Docker images and containers on Google Cloud Platform.

This module provides functionality for:
  • Building and pushing Docker images to GCP Artifact Registry and GitHub Container Registry

  • Configuring GCP Docker repositories with cleanup policies

  • Managing Docker containers for TPU workloads

  • Handling file operations and command execution with proper TTY support

eformer.executor.docker_tpu.build_docker_image(docker_file: str | pathlib.Path, image_name: str, tag: str, build_args: dict[str, str] | None = None) → str

Build a Docker image for Linux/AMD64 platform.

Parameters
  • docker_file – Path to the Dockerfile.

  • image_name – Name for the Docker image.

  • tag – Tag for the Docker image.

  • build_args – Optional dictionary of build arguments.

Returns

Full image identifier (image_name:tag).

Return type

str

Raises

subprocess.CalledProcessError – If Docker build fails.
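The following is a minimal sketch of how such a build command could be assembled, not the module's actual implementation. The helper name `sketch_build_argv`, the use of plain `docker build`, and the choice of the current directory as build context are all assumptions made for illustration; only the `linux/amd64` platform and the `image_name:tag` return value come from the description above.

```python
from __future__ import annotations

from pathlib import Path


def sketch_build_argv(
    docker_file: str | Path,
    image_name: str,
    tag: str,
    build_args: dict[str, str] | None = None,
) -> tuple[list[str], str]:
    """Hypothetical helper: return (docker argv, full image identifier)."""
    image_id = f"{image_name}:{tag}"
    argv = [
        "docker", "build",
        "--platform", "linux/amd64",  # target platform named in the docs
        "-t", image_id,
        "-f", str(docker_file),
    ]
    # Each build argument becomes one "--build-arg KEY=VALUE" pair.
    for key, value in (build_args or {}).items():
        argv += ["--build-arg", f"{key}={value}"]
    argv.append(".")  # assumption: the build context is the current directory
    return argv, image_id


argv, image_id = sketch_build_argv(
    "Dockerfile", "my-image", "v1", {"PY_VERSION": "3.11"}
)
print(image_id)
print(" ".join(argv))
```

Passing the resulting list to `subprocess.run(argv, check=True)` would raise `subprocess.CalledProcessError` on a failed build, which matches the documented error behavior.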

eformer.executor.docker_tpu.configure_gcp_artifact_registry(project_id: str, region: str, repository: str) → None

Set up GCP Artifact Registry repository with appropriate permissions for TPU access.

Creates a new Docker repository in Artifact Registry if it doesn’t exist, configures cleanup policies, and sets up public read access.

Parameters
  • project_id – GCP project ID.

  • region – GCP region for the repository.

  • repository – Name of the Artifact Registry repository.

Raises

subprocess.CalledProcessError – If GCP commands fail (except for expected errors).
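One way to read "except for expected errors" is that creating a repository that already exists is tolerated. The sketch below illustrates that pattern only. The helper name, the injectable `run` parameter, and the `ALREADY_EXISTS` marker checked in stderr are assumptions, and the cleanup-policy and public-read steps are deliberately left out.

```python
import subprocess


def sketch_create_repository(project_id, region, repository, run=subprocess.run):
    """Hypothetical helper: True if created, False if it already existed."""
    argv = [
        "gcloud", "artifacts", "repositories", "create", repository,
        "--repository-format=docker",
        f"--location={region}",
        f"--project={project_id}",
    ]
    try:
        run(argv, check=True, capture_output=True)
    except subprocess.CalledProcessError as exc:
        stderr = (exc.stderr or b"").decode(errors="replace")
        # Assumption: an existing repository is reported with ALREADY_EXISTS.
        if "ALREADY_EXISTS" in stderr:
            return False
        raise  # any other failure is propagated to the caller
    return True


def _fake_run_already_exists(argv, **kwargs):
    """Stand-in for subprocess.run so the example works without gcloud."""
    raise subprocess.CalledProcessError(
        1, argv, stderr=b"ALREADY_EXISTS: the repository already exists"
    )


created = sketch_create_repository(
    "my-project", "us-central2", "my-repo", run=_fake_run_already_exists
)
print(created)
```

Injecting the runner keeps the example executable on machines without `gcloud`; in real use the default `subprocess.run` would be called.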

eformer.executor.docker_tpu.copy_path(src: Path, dst: Path) → None

Copy a file or directory from source to destination.

Removes the destination if it exists before copying.

Parameters
  • src – Source path to copy from.

  • dst – Destination path to copy to.

Raises

RuntimeError – If the source path is neither a file nor a directory.
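The remove-then-copy behavior can be sketched with the standard library as follows. This is an illustration under assumptions, not the module's code: the `sketch_` function names are hypothetical, and the choice of `shutil.copy2` and `shutil.copytree` is a guess at a reasonable implementation.

```python
import shutil
import tempfile
from pathlib import Path


def sketch_remove_path(path: Path) -> None:
    """Remove a file or directory; do nothing if the path does not exist."""
    if path.is_dir() and not path.is_symlink():
        shutil.rmtree(path)
    elif path.is_file() or path.is_symlink():
        path.unlink()
    elif path.exists():
        raise RuntimeError(f"{path} is neither a file nor a directory")


def sketch_copy_path(src: Path, dst: Path) -> None:
    """Replace dst with a copy of src."""
    sketch_remove_path(dst)  # the destination is cleared before copying
    if src.is_dir():
        shutil.copytree(src, dst)
    elif src.is_file():
        shutil.copy2(src, dst)
    else:
        raise RuntimeError(f"{src} is neither a file nor a directory")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src, dst = root / "src", root / "dst"
    src.mkdir()
    (src / "new.txt").write_text("new")
    dst.mkdir()
    (dst / "stale.txt").write_text("stale")

    sketch_copy_path(src, dst)
    copied_names = sorted(p.name for p in dst.iterdir())

print(copied_names)
```

Because the destination is removed first, files that existed only in the old destination (here `stale.txt`) do not survive the copy.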

eformer.executor.docker_tpu.create_docker_run_command(image_id: str, command: list[str], *, foreground: bool, env: dict[str, Any], name: str = 'eformer') → list[str]

Construct a Docker run command with TPU-specific configuration.

Creates a privileged container with host networking, shared memory, and environment variables for MegaScale TPU coordination.

Parameters
  • image_id – Docker image identifier to run.

  • command – Command and arguments to execute in the container.

  • foreground – If True, run interactively; if False, run detached.

  • env – Additional environment variables to set.

  • name – Container name (default: ‘eformer’).

Returns

Complete Docker run command as a list of arguments.
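As a rough illustration of the shape of such a command, the sketch below builds an argument list with the properties described above. The helper name, the `--shm-size` value, the `-it` flag for foreground mode, and the example environment variable are assumptions; the real function may choose different flags and adds its own MegaScale variables, which are not reproduced here.

```python
from typing import Any


def sketch_docker_run_argv(
    image_id: str,
    command: list[str],
    *,
    foreground: bool,
    env: dict[str, Any],
    name: str = "eformer",
) -> list[str]:
    """Hypothetical helper mirroring the documented signature."""
    argv = ["docker", "run"]
    # Assumption: interactive with a TTY in the foreground, detached otherwise.
    argv.append("-it" if foreground else "-d")
    argv += [
        "--name", name,
        "--privileged",      # privileged container, as described in the docs
        "--network=host",    # host networking
        "--shm-size=16g",    # shared memory; the size is an assumed value
    ]
    # Caller-supplied environment variables become "-e KEY=VALUE" pairs.
    for key, value in env.items():
        argv += ["-e", f"{key}={value}"]
    argv.append(image_id)
    argv += command
    return argv


run_argv = sketch_docker_run_argv(
    "my-image:v1",
    ["python", "train.py"],
    foreground=False,
    env={"EXAMPLE_VAR": 1},  # hypothetical variable for illustration
)
print(" ".join(run_argv))
```

Returning a list rather than a single string lets the caller pass it to `subprocess` without shell quoting.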

eformer.executor.docker_tpu.manage_extra_context(extra_ctx: pathlib.Path | None)

Context manager for handling temporary extra context directory.

Copies extra context to a temporary mount point and cleans up after use.

Parameters

extra_ctx – Optional path to extra context directory to copy.

Yields

The extra context path if provided, None otherwise.
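A context manager with this copy-then-clean-up behavior might look like the sketch below. The name `sketch_extra_context` and the explicit `mount_point` parameter are hypothetical, and the sketch yields the temporary copy; the real function may use a fixed mount location and may yield a different path.

```python
from __future__ import annotations

import contextlib
import shutil
import tempfile
from collections.abc import Iterator
from pathlib import Path


@contextlib.contextmanager
def sketch_extra_context(
    extra_ctx: Path | None, mount_point: Path
) -> Iterator[Path | None]:
    """Copy extra_ctx to mount_point for the duration of the with-block."""
    if extra_ctx is None:
        yield None  # nothing to copy, nothing to clean up
        return
    if mount_point.exists():
        shutil.rmtree(mount_point)
    shutil.copytree(extra_ctx, mount_point)
    try:
        yield mount_point
    finally:
        # Clean-up runs even if the body of the with-block raises.
        shutil.rmtree(mount_point, ignore_errors=True)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    ctx = root / "ctx"
    ctx.mkdir()
    (ctx / "data.txt").write_text("x")

    with sketch_extra_context(ctx, root / "mount") as mounted:
        seen_inside = (mounted / "data.txt").exists()
    cleaned_up = not mounted.exists()

    with sketch_extra_context(None, root / "mount") as nothing:
        yielded_none = nothing is None

print(seen_inside, cleaned_up, yielded_none)
```

The `try`/`finally` around the `yield` is what guarantees the temporary copy is removed after use.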

eformer.executor.docker_tpu.parse_image_name_and_tag(docker_base_image: str) → tuple[str, str]

Parse a Docker image string into image name and tag components.

Parameters

docker_base_image – Full Docker image identifier (e.g., ‘ubuntu:20.04’).

Returns

Tuple of (image_name, tag). Defaults to ‘latest’ if no tag specified.
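The documented behavior can be approximated in a few lines. The function below is a stand-in written for this example, not the library's code; in particular, its handling of a registry port such as `localhost:5000/ubuntu` is an assumption about how an edge case could be treated.

```python
def sketch_parse_image(docker_base_image: str) -> tuple[str, str]:
    """Split 'name:tag' into (name, tag), defaulting the tag to 'latest'."""
    name, sep, tag = docker_base_image.rpartition(":")
    # No colon at all, or the last colon belongs to a registry port
    # (the text after it still contains a path separator).
    if not sep or "/" in tag:
        return docker_base_image, "latest"
    return name, tag


with_tag = sketch_parse_image("ubuntu:20.04")
without_tag = sketch_parse_image("ubuntu")
with_port = sketch_parse_image("localhost:5000/ubuntu")
print(with_tag, without_tag, with_port)
```

Splitting on the last colon rather than the first keeps a registry host and port inside the image name.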

eformer.executor.docker_tpu.push_image_to_gcp(local_id: str, project_id: str, region: str, repository: str) → str

Push a Docker image to Google Cloud Artifact Registry.

Parameters
  • local_id – Local Docker image identifier.

  • project_id – GCP project ID.

  • region – GCP region for the repository.

  • repository – Name of the Artifact Registry repository.

Returns

Full remote image identifier in Artifact Registry.

Raises

subprocess.CalledProcessError – If push operation fails.
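Artifact Registry Docker images are addressed as `REGION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG`. The sketch below shows how the remote identifier and the accompanying commands could be derived from the arguments. The helper name and the exact sequence of commands are assumptions; the real function may authenticate or tag differently.

```python
def sketch_gcp_push_plan(
    local_id: str, project_id: str, region: str, repository: str
) -> tuple[str, list[list[str]]]:
    """Hypothetical helper: return (remote identifier, commands to run)."""
    registry_host = f"{region}-docker.pkg.dev"
    remote_id = f"{registry_host}/{project_id}/{repository}/{local_id}"
    commands = [
        # Register gcloud as a Docker credential helper for this registry host.
        ["gcloud", "auth", "configure-docker", registry_host, "--quiet"],
        ["docker", "tag", local_id, remote_id],
        ["docker", "push", remote_id],
    ]
    return remote_id, commands


gcp_remote, gcp_commands = sketch_gcp_push_plan(
    "my-image:v1", "my-project", "us-central2", "my-repo"
)
print(gcp_remote)
for cmd in gcp_commands:
    print(" ".join(cmd))
```

Running each command with `subprocess.run(cmd, check=True)` would surface a failed push as `subprocess.CalledProcessError`.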

eformer.executor.docker_tpu.push_image_to_github(local_id: str, github_user: str, github_token: str | None = None) → str

Push a Docker image to GitHub Container Registry.

Parameters
  • local_id – Local Docker image identifier.

  • github_user – GitHub username.

  • github_token – Optional GitHub personal access token for authentication.

Returns

Full remote image identifier on ghcr.io.

Raises

subprocess.CalledProcessError – If push operation fails.
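For GitHub Container Registry, the remote name takes the form `ghcr.io/OWNER/IMAGE:TAG`. The following sketch, with a hypothetical helper name and an assumed command sequence, shows how the optional token could influence the plan: a login step is added only when a token is supplied, and the token itself would be fed to `docker login` on standard input rather than placed on the command line.

```python
from __future__ import annotations


def sketch_ghcr_push_plan(
    local_id: str, github_user: str, github_token: str | None = None
) -> tuple[str, list[list[str]]]:
    """Hypothetical helper: return (remote identifier, commands to run)."""
    # Docker repository names must be lowercase, so the owner is lowercased.
    remote_id = f"ghcr.io/{github_user.lower()}/{local_id}"
    commands: list[list[str]] = []
    if github_token is not None:
        # The token would be passed via stdin, e.g. subprocess.run(..., input=...).
        commands.append(
            ["docker", "login", "ghcr.io", "-u", github_user, "--password-stdin"]
        )
    commands.append(["docker", "tag", local_id, remote_id])
    commands.append(["docker", "push", remote_id])
    return remote_id, commands


ghcr_remote, ghcr_commands = sketch_ghcr_push_plan("my-image:v1", "Octocat")
print(ghcr_remote)
```

Without a token, the sketch assumes Docker is already logged in to `ghcr.io`.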

eformer.executor.docker_tpu.remove_path(path: Path) → None

Remove a file or directory at the specified path.

Parameters

path – Path to the file or directory to remove.

Raises

RuntimeError – If the path exists but is neither a file nor a directory.

eformer.executor.docker_tpu.run_command(argv: list[str]) → bytes

Execute a command with proper TTY handling.

Runs commands with pseudo-terminal allocation when stdout is a TTY, otherwise runs normally. Captures and returns output.

Parameters

argv – Command and arguments to execute.

Returns

Command output as bytes.

Raises

subprocess.CalledProcessError – If the command returns non-zero exit code.
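The TTY-dependent behavior can be sketched with the standard `pty` and `subprocess` modules. This is an illustrative approximation for POSIX systems on Python 3.9 or later, not the module's implementation; the function name and the decision to merge stderr into the captured output are assumptions.

```python
import os
import subprocess
import sys


def sketch_run_command(argv: list[str]) -> bytes:
    """Run argv, using a pseudo-terminal only when stdout is a TTY."""
    if sys.stdout.isatty():
        import pty  # POSIX only, so imported lazily

        chunks: list[bytes] = []

        def read(fd: int) -> bytes:
            # pty.spawn echoes whatever this returns to our stdout;
            # a copy is kept so the output can also be returned.
            data = os.read(fd, 1024)
            chunks.append(data)
            return data

        status = pty.spawn(argv, read)
        code = os.waitstatus_to_exitcode(status)
        output = b"".join(chunks)
        if code != 0:
            raise subprocess.CalledProcessError(code, argv, output)
        return output
    # Not a TTY: plain capture; check_output raises on non-zero exit codes.
    return subprocess.check_output(argv, stderr=subprocess.STDOUT)


out = sketch_run_command([sys.executable, "-c", "print('ok')"])
print(out.strip())
```

Allocating a pseudo-terminal lets tools such as `docker` keep their interactive progress output when a person is watching, while non-interactive callers get ordinary captured bytes.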