Dedicated Containers let you run your own Dockerized inference workloads on Together's managed GPU infrastructure. You bring the container — Together handles compute provisioning, autoscaling, networking, and observability. You build and push a Docker image using the Jig CLI. Inside your container, the Sprocket SDK connects your inference code to Together's managed job queue. Once deployed, your workers can receive requests.

Documentation Index
Fetch the complete documentation index at: https://docs.together.ai/llms.txt
Use this file to discover all available pages before exploring further.
- Wrap and deploy your model in 20 minutes
- Boost conversion and margins with fair priority queueing
- Bottomless capacity just before you need it
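The worker model described above — a container whose inference code exposes setup() and predict() and receives jobs from the managed queue — can be sketched roughly as follows. This is a minimal illustration of the pattern only: the class name, base class omission, and method signatures here are assumptions drawn from this index, not the Sprocket SDK's actual API.

```python
# Hypothetical sketch of a Sprocket-style inference worker.
# Names and signatures are assumptions based on this index
# (setup() and predict()), not the real SDK.

class ImageWorker:
    def setup(self):
        # One-time initialization when the container starts:
        # a real worker would load model weights onto the GPU here.
        self.model = lambda prompt: f"generated image for: {prompt}"

    def predict(self, request: dict) -> dict:
        # Called once per queued job with the request payload;
        # the return value becomes the job's result.
        return {"output": self.model(request["prompt"])}

worker = ImageWorker()
worker.setup()
result = worker.predict({"prompt": "a red bicycle"})
```

In the real SDK the platform, not your code, constructs the worker and drives the setup/predict lifecycle as jobs arrive from the queue; see the Sprocket SDK pages below for the actual base classes.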
Quickstart
Deploy Your First Container
Deploy your first container from the command line
Concepts
Platform Overview
Architecture, deployment lifecycle, autoscaling, and troubleshooting
Jig CLI
Build, deploy, secrets, and volumes
Sprocket SDK
Inference workers with setup() and predict()
Queue API
Async jobs with priority and progress
Guides
Image Generation
Single-GPU Flux2 model
Video Generation
Multi-GPU Wan 2.1 with torchrun
Reference
Jig CLI
CLI commands and pyproject.toml configuration
Sprocket SDK
Base classes, file handling, and error reference
REST API
Deployments, secrets, storage, and queue
Get Access
Contact your account representative or support@together.ai to enable Dedicated Containers for your organization.