Jig is a lightweight CLI for building Docker images from a `pyproject.toml`, pushing them to Together’s private container registry, and managing deployments. It’s included with the Together Python library.
The Deploy Workflow
Jig combines several steps into a single `deploy` command:
- Init — `together beta jig init` scaffolds a `pyproject.toml` with sensible defaults
- Build — Generates a Dockerfile from your config and builds the image locally
- Push — Pushes the image to Together’s registry at `registry.together.ai`
- Deploy — Creates or updates the deployment on Together’s infrastructure
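End to end, the workflow might look like the following (the `deploy` subcommand spelling is an assumption based on the step names; check `together beta jig --help` for the exact commands):

```shell
# Scaffold a pyproject.toml with sensible defaults
together beta jig init

# Build, push, and deploy in one step (assumed single-command form)
together beta jig deploy
```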
Cache Warmup
The `--warmup` option lets you pre-generate inference-engine compile caches — such as those created by `torch.compile` or TensorRT — at build time, rather than waiting for the first request in production. This can significantly reduce cold-start latency.
How It Works
- Build phase: Jig builds the base image normally
- Warmup phase: Jig runs the container with GPU access, mounting your local workspace to `/app`
- Cache capture: The container runs your Sprocket’s `warmup_inputs`, generating compile caches
- Final image: Jig builds a new image layer with the cache baked in
Cache capture is controlled by two settings: `WARMUP_ENV_NAME` (default: `TORCHINDUCTOR_CACHE_DIR`) and `WARMUP_DEST` (default: `torch_cache`). During warmup, Jig sets the environment variable named by `WARMUP_ENV_NAME` to point at the cache directory, then copies that directory’s contents into the final image.
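The capture mechanics can be sketched in plain Python (a simplified illustration of the behavior described above, not Jig’s actual implementation; the cache write is simulated with a dummy file):

```python
import os
import shutil
import tempfile

# Simplified sketch of Jig's warmup cache capture (illustration only).
# Names mirror the documented WARMUP_ENV_NAME / WARMUP_DEST defaults.
cache_env = os.environ.get("WARMUP_ENV_NAME", "TORCHINDUCTOR_CACHE_DIR")
dest_name = os.environ.get("WARMUP_DEST", "torch_cache")

workspace = tempfile.mkdtemp()
cache_dir = os.path.join(workspace, "warmup_cache")
os.makedirs(cache_dir)

# 1) Point the inference engine's cache at a known directory.
os.environ[cache_env] = cache_dir

# 2) Run the warmup inputs. A real engine (torch.compile, TensorRT)
#    would write compile artifacts into cache_dir; simulated here.
with open(os.path.join(cache_dir, "compiled_graph.bin"), "wb") as f:
    f.write(b"\x00" * 16)

# 3) Copy the captured cache to the location baked into the final image.
final_dest = os.path.join(workspace, dest_name)
shutil.copytree(cache_dir, final_dest)
```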
Sprocket Integration
Define `warmup_inputs` on your Sprocket class to specify which inputs to run during warmup.
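For example (a hypothetical sketch: the Sprocket base class and its import path aren’t shown here, so the plain class below stands in for the real one):

```python
# Hypothetical sketch: in real code, subclass the Together Sprocket base
# class; the import path and base-class name are assumptions.

class MySprocket:
    # Each dict below is passed to predict() once during a --warmup build.
    # Choose inputs that exercise every compile path (shapes, modes, etc.).
    warmup_inputs = [
        {"prompt": "short warmup prompt", "max_tokens": 8},
        {"prompt": "a much longer prompt to hit the long-sequence path",
         "max_tokens": 256},
    ]

    def predict(self, inputs: dict) -> dict:
        # A real implementation runs the model; the first call on each new
        # input shape triggers compilation and populates the cache.
        return {"completion": f"echo: {inputs.get('prompt', '')}"}
```

During a `--warmup` build, `predict` is then called once per entry.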
The `predict(...)` function is invoked once for each input in `warmup_inputs`. If `warmup_inputs` is empty or not defined, the warmup step falls back to invoking `predict({})` once. Choose warmup inputs that exercise every compile path your production traffic will hit.
In a normal build (without `--warmup`), an empty `warmup_inputs` means no warmup runs at all.
Since the local workspace is mounted to `/app`, model weights and example inputs can live in your project directory and be referenced directly.
Requirements
- A GPU on your build machine — warmup runs your model locally to generate caches. If you don’t have a local GPU, Together Instant Clusters provide on-demand H100s with fast connectivity to Together’s container registry.
- `warmup_inputs` defined on your Sprocket with representative inputs
- Weights and example inputs accessible in your local workspace
Secrets
Secrets are encrypted environment variables injected into your container at runtime. Use them for API keys, tokens, and other sensitive values that shouldn’t be baked into the image. Reference secrets in your `pyproject.toml` as environment variables, and they’ll be available to your container at runtime. See the Jig CLI reference for all secrets commands.
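At runtime, your code reads an injected secret like any other environment variable; a minimal sketch (the secret name `MY_API_KEY` is hypothetical):

```python
import os

# Secrets configured through Jig arrive as ordinary environment variables.
# "MY_API_KEY" is a hypothetical secret name for illustration.
def get_api_key() -> str:
    key = os.environ.get("MY_API_KEY")
    if key is None:
        # Fail fast with a clear message instead of a later auth error.
        raise RuntimeError("MY_API_KEY is not set; configure it as a Jig secret")
    return key
```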
Volumes
Volumes let you mount read-only data — like model weights — into your container without baking them into the image. This keeps images small and lets you update weights independently of code. Create a volume, upload files to it, and then reference it in your `pyproject.toml`.