POST /compute/clusters
Python
from together import Together

client = Together()

response = client.beta.clusters.create(
  cluster_name="my-gpu-cluster",
  region="us-central-8",
  gpu_type="H100_SXM",
  num_gpus=8,
  driver_version="CUDA_12_6_560",
  billing_type="ON_DEMAND",
)

print(response.cluster_id)
200 response example:

{
  "cluster_id": "<string>",
  "cluster_type": "KUBERNETES",
  "region": "<string>",
  "gpu_type": "H100_SXM",
  "cluster_name": "<string>",
  "duration_hours": 123,
  "driver_version": "CUDA_12_5_555",
  "volumes": [
    {
      "volume_id": "<string>",
      "volume_name": "<string>",
      "size_tib": 123,
      "status": "<string>"
    }
  ],
  "status": "WaitingForControlPlaneNodes",
  "control_plane_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "memory_gib": 123,
      "network": "<string>"
    }
  ],
  "gpu_worker_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "num_gpus": 123,
      "memory_gib": 123,
      "networks": [
        "<string>"
      ]
    }
  ],
  "kube_config": "<string>",
  "num_gpus": 123
}
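The SDK call above can also be sketched as a raw HTTP request. The base URL and environment-variable name below are assumptions drawn from the Authorization notes in this reference, not verified values, so treat this as an illustration:

```python
import os

# Assumed base URL for the API; confirm against your account settings.
BASE_URL = "https://api.together.xyz/v1"

# JSON body matching the request parameters documented below.
payload = {
    "cluster_name": "my-gpu-cluster",
    "region": "us-central-8",
    "gpu_type": "H100_SXM",
    "num_gpus": 8,
    "driver_version": "CUDA_12_6_560",
    "billing_type": "ON_DEMAND",
}

# Bearer authentication header, as described in the Authorizations section.
headers = {
    "Authorization": f"Bearer {os.environ.get('TOGETHER_API_KEY', '')}",
    "Content-Type": "application/json",
}

# With the requests library you would then send it like this (not executed here):
# response = requests.post(f"{BASE_URL}/compute/clusters", json=payload, headers=headers)
# print(response.json()["cluster_id"])
```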

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

GPU Cluster create request

region
enum<string>
required

Region to create the GPU cluster in. Valid values are us-central-8 and us-central-4.

Available options:
us-central-8,
us-central-4
gpu_type
enum<string>
required

Type of GPU to use in the cluster.

Available options:
H100_SXM,
H200_SXM,
RTX_6000_PCI,
L40_PCIE,
B200_SXM,
H100_SXM_INF
num_gpus
integer
required

Number of GPUs to allocate in the cluster. This must be a multiple of 8, for example 8, 16, or 24.
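The multiple-of-8 constraint can be checked client-side before calling the API. The helper below is a hypothetical sketch, not part of the SDK:

```python
def validate_num_gpus(num_gpus: int) -> int:
    """Hypothetical client-side check mirroring the API's num_gpus rule:
    the value must be a positive multiple of 8 (e.g. 8, 16, 24)."""
    if num_gpus <= 0 or num_gpus % 8 != 0:
        raise ValueError(
            f"num_gpus must be a positive multiple of 8, got {num_gpus}"
        )
    return num_gpus
```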

cluster_name
string
required

Name of the GPU cluster.

driver_version
enum<string>
required

NVIDIA driver version to use in the cluster.

Available options:
CUDA_12_5_555,
CUDA_12_6_560,
CUDA_12_6_565,
CUDA_12_8_570
billing_type
enum<string>
required

Billing model for the cluster.

Available options:
RESERVED,
ON_DEMAND
cluster_type
enum<string>

Orchestration type for the cluster.

Available options:
KUBERNETES,
SLURM
duration_days
integer

Duration in days to keep the cluster running.

shared_volume
object
volume_id
string

Response

200 - application/json

OK

cluster_id
string
required
cluster_type
enum<string>
required
Available options:
KUBERNETES,
SLURM
region
string
required
gpu_type
enum<string>
required
Available options:
H100_SXM,
H200_SXM,
RTX_6000_PCI,
L40_PCIE,
B200_SXM,
H100_SXM_INF
cluster_name
string
required
duration_hours
integer
required
driver_version
enum<string>
required
Available options:
CUDA_12_5_555,
CUDA_12_6_560,
CUDA_12_6_565,
CUDA_12_8_570
volumes
object[]
required
status
enum<string>
required

Current status of the GPU cluster.

Available options:
WaitingForControlPlaneNodes,
WaitingForDataPlaneNodes,
WaitingForSubnet,
WaitingForSharedVolume,
InstallingDrivers,
RunningAcceptanceTests,
Paused,
OnDemandComputePaused,
Ready,
Degraded,
Deleting
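A newly created cluster typically moves through the waiting and installation states before reaching Ready, so callers usually poll. A minimal sketch, assuming you have some way to re-fetch the cluster's status (the `fetch_status` callable is a stand-in, not a documented SDK method):

```python
import time

def wait_until_ready(fetch_status, timeout_s=3600, poll_s=30):
    """Poll a zero-argument callable that returns one of the status strings
    above until it reports Ready. Raises if the cluster enters Degraded or
    Deleting, or if the timeout elapses first."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status == "Ready":
            return status
        if status in ("Degraded", "Deleting"):
            raise RuntimeError(f"cluster entered terminal state: {status}")
        time.sleep(poll_s)
    raise TimeoutError("cluster did not become Ready before the timeout")
```

In practice `fetch_status` would wrap whatever cluster-lookup call your client exposes, returning only the `status` field.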
control_plane_nodes
object[]
required
gpu_worker_nodes
object[]
required
kube_config
string
required
num_gpus
integer
required
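For KUBERNETES clusters, the `kube_config` string in the response can be written to disk and used with kubectl via the `KUBECONFIG` environment variable. A minimal sketch (the file path is an arbitrary choice, not a required location):

```python
import stat
from pathlib import Path

def save_kube_config(kube_config: str,
                     path: str = "~/.kube/together-cluster.yaml") -> Path:
    """Write the kube_config string from the create-cluster response to disk
    with owner-only permissions, so kubectl can read it via KUBECONFIG."""
    target = Path(path).expanduser()
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(kube_config)
    target.chmod(stat.S_IRUSR | stat.S_IWUSR)  # restrict to 0o600
    return target

# Usage (with `response` from the create call above):
# cfg = save_kube_config(response.kube_config)
# then: KUBECONFIG=$cfg kubectl get nodes
```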