Create a Cluster

Python

from together import Together

client = Together()

response = client.beta.clusters.create(
  cluster_name="my-gpu-cluster",
  region="us-central-8",
  gpu_type="H100_SXM",
  num_gpus=8,
  driver_version="CUDA_12_6_560",
  billing_type="ON_DEMAND",
)

print(response.cluster_id)

{
  "cluster_id": "<string>",
  "cluster_type": "KUBERNETES",
  "region": "<string>",
  "gpu_type": "H100_SXM",
  "cluster_name": "<string>",
  "duration_hours": 123,
  "driver_version": "CUDA_12_5_555",
  "volumes": [
    {
      "volume_id": "<string>",
      "volume_name": "<string>",
      "size_tib": 123,
      "status": "<string>"
    }
  ],
  "status": "WaitingForControlPlaneNodes",
  "control_plane_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "memory_gib": 123,
      "network": "<string>"
    }
  ],
  "gpu_worker_nodes": [
    {
      "node_id": "<string>",
      "node_name": "<string>",
      "status": "<string>",
      "host_name": "<string>",
      "num_cpu_cores": 123,
      "num_gpus": 123,
      "memory_gib": 123,
      "networks": [
        "<string>"
      ]
    }
  ],
  "kube_config": "<string>",
  "num_gpus": 123
}

POST /compute/clusters

Authorizations

Authorization

string

header

default:default

required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
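For readers calling the endpoint without the SDK, the header format above can be sketched with the standard library. This is a minimal sketch, not the official client: the base URL `https://api.together.xyz/v1` and the `TOGETHER_API_KEY` environment variable are assumptions not stated in this section, and the helper name is hypothetical. The request is only constructed here, not sent.

```python
import json
import os
import urllib.request

# Assumption: the API base URL is not given in this section.
BASE_URL = "https://api.together.xyz/v1"


def build_create_cluster_request(token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /compute/clusters request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/compute/clusters",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Header of the form "Bearer <token>", as described above.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Assumption: the token is read from an environment variable named TOGETHER_API_KEY.
request = build_create_cluster_request(
    token=os.environ.get("TOGETHER_API_KEY", "<token>"),
    body={
        "cluster_name": "my-gpu-cluster",
        "region": "us-central-8",
        "gpu_type": "H100_SXM",
        "num_gpus": 8,
        "driver_version": "CUDA_12_6_560",
        "billing_type": "ON_DEMAND",
    },
)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is omitted because it creates a billable resource.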

Body

application/json

GPU Cluster create request

region

enum<string>

required

Region to create the GPU cluster in. Valid values are us-central-8 and us-central-4.

Available options: us-central-8, us-central-4

gpu_type

enum<string>

required

Type of GPU to use in the cluster

Available options: H100_SXM, H200_SXM, RTX_6000_PCI, L40_PCIE, B200_SXM, H100_SXM_INF

num_gpus

integer

required

Number of GPUs to allocate in the cluster. This must be a multiple of 8, for example 8, 16, or 24.
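The multiple-of-8 rule can be checked on the client before sending the request. The sketch below uses hypothetical helper names; rejecting zero and negative values is an assumption, since the section only states the multiple-of-8 constraint.

```python
def validate_num_gpus(num_gpus: int) -> int:
    """Return num_gpus unchanged if it is a positive multiple of 8."""
    if isinstance(num_gpus, bool) or not isinstance(num_gpus, int):
        raise TypeError("num_gpus must be an integer")
    # Assumption: zero and negative counts are treated as invalid.
    if num_gpus <= 0 or num_gpus % 8 != 0:
        raise ValueError(f"num_gpus must be a positive multiple of 8, got {num_gpus}")
    return num_gpus


def round_up_to_multiple_of_8(wanted: int) -> int:
    """Smallest multiple of 8 that is >= wanted (ceiling division)."""
    return -(-wanted // 8) * 8


requested = round_up_to_multiple_of_8(12)  # 12 GPUs wanted, rounded up to 16
validate_num_gpus(requested)
```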

cluster_name

string

required

Name of the GPU cluster.

driver_version

enum<string>

required

NVIDIA driver version to use in the cluster.

Available options: CUDA_12_5_555, CUDA_12_6_560, CUDA_12_6_565, CUDA_12_8_570

billing_type

enum<string>

required

Available options: RESERVED, ON_DEMAND

cluster_type

enum<string>

Available options: KUBERNETES, SLURM

duration_days

integer

Duration in days to keep the cluster running.

shared_volume

object

volume_id

string
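The body fields listed above can be assembled and checked against their documented enum values before a request is made. This is a sketch under stated assumptions: the function and constant names are hypothetical, only enum membership is checked, and `shared_volume` is left out because its child attributes are not listed in this section.

```python
from typing import Any, Optional

# Enum values copied from the Body section above.
REGIONS = {"us-central-8", "us-central-4"}
GPU_TYPES = {"H100_SXM", "H200_SXM", "RTX_6000_PCI", "L40_PCIE", "B200_SXM", "H100_SXM_INF"}
DRIVER_VERSIONS = {"CUDA_12_5_555", "CUDA_12_6_560", "CUDA_12_6_565", "CUDA_12_8_570"}
BILLING_TYPES = {"RESERVED", "ON_DEMAND"}
CLUSTER_TYPES = {"KUBERNETES", "SLURM"}


def _check_enum(field: str, value: str, allowed: set) -> None:
    if value not in allowed:
        raise ValueError(f"{field}={value!r} is not one of {sorted(allowed)}")


def build_create_cluster_body(
    *,
    region: str,
    gpu_type: str,
    num_gpus: int,
    cluster_name: str,
    driver_version: str,
    billing_type: str,
    cluster_type: Optional[str] = None,
    duration_days: Optional[int] = None,
    volume_id: Optional[str] = None,
) -> dict[str, Any]:
    """Return a JSON-serializable body; optional fields are omitted when None."""
    _check_enum("region", region, REGIONS)
    _check_enum("gpu_type", gpu_type, GPU_TYPES)
    _check_enum("driver_version", driver_version, DRIVER_VERSIONS)
    _check_enum("billing_type", billing_type, BILLING_TYPES)
    if cluster_type is not None:
        _check_enum("cluster_type", cluster_type, CLUSTER_TYPES)

    body: dict[str, Any] = {
        "region": region,
        "gpu_type": gpu_type,
        "num_gpus": num_gpus,
        "cluster_name": cluster_name,
        "driver_version": driver_version,
        "billing_type": billing_type,
    }
    optional = {
        "cluster_type": cluster_type,
        "duration_days": duration_days,
        "volume_id": volume_id,
    }
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


body = build_create_cluster_body(
    region="us-central-8",
    gpu_type="H100_SXM",
    num_gpus=8,
    cluster_name="my-gpu-cluster",
    driver_version="CUDA_12_6_560",
    billing_type="ON_DEMAND",
    cluster_type="KUBERNETES",
)
```

Keyword-only arguments make a misspelled field name fail immediately with a `TypeError` instead of being silently sent to the API.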

Response

200 - application/json

cluster_id

string

required

cluster_type

enum<string>

required

Available options: KUBERNETES, SLURM

region

string

required

gpu_type

enum<string>

required

Available options: H100_SXM, H200_SXM, RTX_6000_PCI, L40_PCIE, B200_SXM, H100_SXM_INF

cluster_name

string

required

duration_hours

integer

required

driver_version

enum<string>

required

Available options: CUDA_12_5_555, CUDA_12_6_560, CUDA_12_6_565, CUDA_12_8_570

volumes

object[]

required

status

enum<string>

required

Current status of the GPU cluster.

Available options: WaitingForControlPlaneNodes, WaitingForDataPlaneNodes, WaitingForSubnet, WaitingForSharedVolume, InstallingDrivers, RunningAcceptanceTests, Paused, OnDemandComputePaused, Ready, Degraded, Deleting
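Because a newly created cluster starts in a provisioning status, a caller will usually poll until it reports `Ready`. The sketch below is one way to do that under loud assumptions: `fetch_status` is a hypothetical callable (the endpoint for reading a cluster's status is not covered in this section), and grouping the `WaitingFor...`, `InstallingDrivers`, and `RunningAcceptanceTests` values as "still provisioning" is inferred from their names, not from documented semantics.

```python
import time
from typing import Callable

# Assumption: these statuses mean provisioning is still in progress.
IN_PROGRESS = {
    "WaitingForControlPlaneNodes",
    "WaitingForDataPlaneNodes",
    "WaitingForSubnet",
    "WaitingForSharedVolume",
    "InstallingDrivers",
    "RunningAcceptanceTests",
}


def wait_until_ready(
    fetch_status: Callable[[], str],
    timeout_s: float = 1800.0,
    poll_interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the status leaves the in-progress set, or raise TimeoutError.

    Returns the first status that is not in IN_PROGRESS. That may be "Ready",
    or another value such as "Paused" or "Degraded" that the caller must handle.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status not in IN_PROGRESS:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster still in status {status!r} after {timeout_s}s")
        sleep(poll_interval_s)


# Usage with a stand-in for the real status lookup (hypothetical sequence).
_fake_statuses = iter(["WaitingForControlPlaneNodes", "InstallingDrivers", "Ready"])
final_status = wait_until_ready(lambda: next(_fake_statuses), sleep=lambda _s: None)
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.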

control_plane_nodes

object[]

required

gpu_worker_nodes

object[]

required

kube_config

string

required

num_gpus

integer

required
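Once a response arrives, the nested node and volume lists can be reduced to a short summary and the `kube_config` string written to disk. This sketch works on the raw JSON decoded into a `dict` (the SDK's response object may expose attributes instead). The helper names and all sample values are hypothetical, and treating `kube_config` as kubeconfig file content is an assumption based on the field name.

```python
import tempfile
from pathlib import Path


def summarize_cluster(response: dict) -> dict:
    """Collapse the response's nested lists into a few headline numbers."""
    workers = response["gpu_worker_nodes"]
    return {
        "cluster_id": response["cluster_id"],
        "status": response["status"],
        "control_plane_node_count": len(response["control_plane_nodes"]),
        "worker_node_count": len(workers),
        "worker_gpu_total": sum(node["num_gpus"] for node in workers),
        "volume_tib_total": sum(volume["size_tib"] for volume in response["volumes"]),
    }


def save_kube_config(response: dict, path: Path) -> Path:
    """Write the kube_config string to path, readable by the owner only."""
    path.write_text(response["kube_config"], encoding="utf-8")
    path.chmod(0o600)
    return path


# Hypothetical response in the shape documented above.
example_response = {
    "cluster_id": "cluster-123",
    "status": "Ready",
    "volumes": [{"volume_id": "vol-1", "volume_name": "data", "size_tib": 2, "status": "ok"}],
    "control_plane_nodes": [{"node_id": "cp-1", "num_cpu_cores": 8, "memory_gib": 32}],
    "gpu_worker_nodes": [
        {"node_id": "w-1", "num_gpus": 8, "num_cpu_cores": 96, "memory_gib": 1024},
        {"node_id": "w-2", "num_gpus": 8, "num_cpu_cores": 96, "memory_gib": 1024},
    ],
    "kube_config": "apiVersion: v1\nkind: Config\n",
    "num_gpus": 16,
}

summary = summarize_cluster(example_response)
with tempfile.TemporaryDirectory() as tmp_dir:
    saved_path = save_kube_config(example_response, Path(tmp_dir) / "kubeconfig")
    saved_text = saved_path.read_text(encoding="utf-8")
```

Comparing `summary["worker_gpu_total"]` with the top-level `num_gpus` is a cheap consistency check while worker nodes are still joining.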
