eformer.executor.docker_tpu
Docker TPU execution utilities for managing Docker images and containers on Google Cloud Platform.
This module provides functionality for:
- Building and pushing Docker images to GCP Artifact Registry and GitHub Container Registry
- Configuring GCP Docker repositories with cleanup policies
- Managing Docker containers for TPU workloads
- Handling file operations and command execution with proper TTY support
- eformer.executor.docker_tpu.build_docker_image(docker_file: str | pathlib.Path, image_name: str, tag: str, build_args: dict[str, str] | None = None) → str
Build a Docker image for the linux/amd64 platform.
- Parameters
docker_file – Path to the Dockerfile.
image_name – Name for the Docker image.
tag – Tag for the Docker image.
build_args – Optional dictionary of build arguments.
- Returns
Full image identifier (image_name:tag).
- Return type
str
- Raises
subprocess.CalledProcessError – If Docker build fails.
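To illustrate how the parameters above could map onto a Docker invocation, here is a minimal sketch that only assembles an argument list. The helper name `sketch_build_argv`, the use of `docker buildx build`, and the choice of `.` as build context are assumptions for illustration; the actual command run by `build_docker_image` may differ.

```python
from pathlib import Path


def sketch_build_argv(docker_file, image_name, tag, build_args=None):
    """Assemble a `docker buildx build` argv for linux/amd64 (illustrative only)."""
    image_id = f"{image_name}:{tag}"
    argv = [
        "docker", "buildx", "build",
        "--platform", "linux/amd64",
        "-t", image_id,
        "-f", str(Path(docker_file)),
    ]
    # Each build argument becomes one `--build-arg KEY=VALUE` pair.
    for key, value in (build_args or {}).items():
        argv += ["--build-arg", f"{key}={value}"]
    argv.append(".")  # build context: current directory (an assumption)
    return image_id, argv


image_id, argv = sketch_build_argv("Dockerfile", "trainer", "v1", {"PY_VERSION": "3.11"})
print(image_id)  # trainer:v1
```

The returned identifier has the same image_name:tag shape that the Returns field describes.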
- eformer.executor.docker_tpu.configure_gcp_artifact_registry(project_id: str, region: str, repository: str) → None
Set up GCP Artifact Registry repository with appropriate permissions for TPU access.
Creates a new Docker repository in Artifact Registry if it doesn’t exist, configures cleanup policies, and sets up public read access.
- Parameters
project_id – GCP project ID.
region – GCP region for the repository.
repository – Name of the Artifact Registry repository.
- Raises
subprocess.CalledProcessError – If GCP commands fail (except for expected errors).
- eformer.executor.docker_tpu.copy_path(src: Path, dst: Path) → None
Copy a file or directory from source to destination.
Removes the destination if it exists before copying.
- Parameters
src – Source path to copy from.
dst – Destination path to copy to.
- Raises
RuntimeError – If the source path is neither a file nor a directory.
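The remove-then-copy behaviour described above can be sketched with the standard library. This is not the module's source; `sketch_copy_path` is a hypothetical stand-in, and details such as symlink handling are assumptions.

```python
import shutil
from pathlib import Path


def sketch_copy_path(src: Path, dst: Path) -> None:
    """Replace dst with a copy of src (file or directory)."""
    # Remove any existing destination first, so stale files do not survive.
    if dst.is_dir() and not dst.is_symlink():
        shutil.rmtree(dst)
    elif dst.exists() or dst.is_symlink():
        dst.unlink()

    if src.is_dir():
        shutil.copytree(src, dst)
    elif src.is_file():
        shutil.copy2(src, dst)
    else:
        # Covers missing paths and special files such as sockets.
        raise RuntimeError(f"{src} is neither a file nor a directory")
```

Because the destination is removed before copying, files that exist only in the old destination are gone afterwards.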
- eformer.executor.docker_tpu.create_docker_run_command(image_id: str, command: list[str], *, foreground: bool, env: dict[str, Any], name: str = 'eformer') → list[str]
Construct a Docker run command with TPU-specific configuration.
Creates a privileged container with host networking, shared memory, and environment variables for MegaScale TPU coordination.
- Parameters
image_id – Docker image identifier to run.
command – Command and arguments to execute in the container.
foreground – If True, run interactively; if False, run detached.
env – Additional environment variables to set.
name – Container name (default: ‘eformer’).
- Returns
Complete Docker run command as a list of arguments.
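As a rough picture of the kind of argument list this could produce, the sketch below combines the options named in the description (privileged mode, host networking, shared memory, environment variables). The helper name, the flag ordering, and the `16g` shared-memory size are assumptions; the MegaScale-specific variables the real function sets are not reproduced here.

```python
def sketch_docker_run_argv(image_id, command, *, foreground, env, name="eformer"):
    """Assemble a `docker run` argv resembling the documented behaviour."""
    argv = ["docker", "run", "--name", name, "--privileged", "--net=host"]
    argv.append("--shm-size=16g")  # placeholder size, an assumption
    # Interactive with a TTY in the foreground, detached otherwise.
    argv.append("-it" if foreground else "-d")
    for key, value in env.items():
        argv += ["-e", f"{key}={value}"]
    argv.append(image_id)
    argv += command  # everything after the image is the container command
    return argv


argv = sketch_docker_run_argv(
    "trainer:v1", ["python", "train.py"], foreground=False, env={"RUN_ID": "demo"}
)
print(" ".join(argv))
```

Returning a list rather than a single string means the result can be passed to a subprocess call without shell quoting.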
- eformer.executor.docker_tpu.manage_extra_context(extra_ctx: pathlib.Path | None)
Context manager for handling temporary extra context directory.
Copies extra context to a temporary mount point and cleans up after use.
- Parameters
extra_ctx – Optional path to extra context directory to copy.
- Yields
The extra context path if provided, None otherwise.
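The copy, yield, and cleanup pattern described here can be sketched with `contextlib`. In this sketch the mount point is passed in explicitly as a hypothetical `mount` parameter; where the real function places its temporary mount point is not stated in this page.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def sketch_extra_context(extra_ctx, mount: Path):
    """Copy extra_ctx to mount for the duration of the block, then remove it."""
    if extra_ctx is None:
        # Nothing to stage: yield None and skip cleanup.
        yield None
        return
    shutil.copytree(extra_ctx, mount, dirs_exist_ok=True)
    try:
        yield extra_ctx
    finally:
        # Runs even if the body raises, so the staged copy never lingers.
        shutil.rmtree(mount, ignore_errors=True)
```

Putting the cleanup in a `finally` clause is what guarantees the staged copy is removed when the body fails, for example when a Docker build raises midway.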
- eformer.executor.docker_tpu.parse_image_name_and_tag(docker_base_image: str) → tuple[str, str]
Parse a Docker image string into image name and tag components.
- Parameters
docker_base_image – Full Docker image identifier (e.g., ‘ubuntu:20.04’).
- Returns
Tuple of (image_name, tag). The tag defaults to ‘latest’ if none is specified.
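A minimal sketch of this kind of parsing follows. The function `sketch_parse_image` is a hypothetical stand-in, and its handling of registry ports (a colon followed by a path) is an assumption that the real function may not share.

```python
def sketch_parse_image(docker_base_image: str) -> tuple[str, str]:
    """Split 'name:tag' into (name, tag), defaulting the tag to 'latest'."""
    name, sep, tag = docker_base_image.rpartition(":")
    # No colon at all, or a "/" after the last colon (the colon belonged to
    # a registry port such as localhost:5000): treat the input as untagged.
    if not sep or "/" in tag:
        return docker_base_image, "latest"
    return name, tag


print(sketch_parse_image("ubuntu:20.04"))        # ('ubuntu', '20.04')
print(sketch_parse_image("ubuntu"))              # ('ubuntu', 'latest')
print(sketch_parse_image("localhost:5000/img"))  # ('localhost:5000/img', 'latest')
```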
- eformer.executor.docker_tpu.push_image_to_gcp(local_id: str, project_id: str, region: str, repository: str) → str
Push a Docker image to Google Cloud Artifact Registry.
- Parameters
local_id – Local Docker image identifier.
project_id – GCP project ID.
region – GCP region for the repository.
repository – Name of the Artifact Registry repository.
- Returns
Full remote image identifier in Artifact Registry.
- Raises
subprocess.CalledProcessError – If push operation fails.
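Artifact Registry addresses Docker images as `LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG`. The sketch below shows how the four arguments could combine into such an identifier and which `docker tag` and `docker push` commands would follow. Both helper names are hypothetical, and the real function may compose the name or authenticate differently.

```python
def sketch_gcp_remote_id(local_id, project_id, region, repository):
    """Compose an Artifact Registry image name from its parts."""
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{local_id}"


def sketch_push_commands(local_id, remote_id):
    """Commands a push would typically run: retag locally, then upload."""
    return [
        ["docker", "tag", local_id, remote_id],
        ["docker", "push", remote_id],
    ]


remote = sketch_gcp_remote_id("trainer:v1", "my-project", "us-central1", "eformer")
print(remote)  # us-central1-docker.pkg.dev/my-project/eformer/trainer:v1
```

The project, region, and repository values in the usage lines are placeholders.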
- eformer.executor.docker_tpu.push_image_to_github(local_id: str, github_user: str, github_token: str | None = None) → str
Push a Docker image to GitHub Container Registry.
- Parameters
local_id – Local Docker image identifier.
github_user – GitHub username.
github_token – Optional GitHub personal access token for authentication.
- Returns
Full remote image identifier on ghcr.io.
- Raises
subprocess.CalledProcessError – If push operation fails.
- eformer.executor.docker_tpu.remove_path(path: Path) → None
Remove a file or directory at the specified path.
- Parameters
path – Path to the file or directory to remove.
- Raises
RuntimeError – If the path exists but is neither a file nor a directory.
- eformer.executor.docker_tpu.run_command(argv: list[str]) → bytes
Execute a command with proper TTY handling.
Runs the command under a pseudo-terminal when stdout is a TTY, and as a plain subprocess otherwise. In both cases the output is captured and returned.
- Parameters
argv – Command and arguments to execute.
- Returns
Command output as bytes.
- Raises
subprocess.CalledProcessError – If the command returns a non-zero exit code.
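The non-TTY path of this contract (bytes out, `CalledProcessError` on failure) can be sketched with `subprocess.check_output`. The pseudo-terminal branch is deliberately left out, and `sketch_run_command` is a hypothetical name; merging stderr into the captured output is also an assumption.

```python
import subprocess
import sys


def sketch_run_command(argv: list[str]) -> bytes:
    """Run argv, return its captured output, raise on a non-zero exit code."""
    # check_output raises subprocess.CalledProcessError for non-zero exits.
    return subprocess.check_output(argv, stderr=subprocess.STDOUT)


out = sketch_run_command([sys.executable, "-c", "print('ok')"])
print(out.strip())  # b'ok'

try:
    sketch_run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
except subprocess.CalledProcessError as err:
    print("exit code:", err.returncode)  # exit code: 3
```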