API & Integrations

Overview

Every cluster management operation is available through four interfaces for programmatic control and automation:

tcloud CLI – Command-line tool for cluster operations
REST API – Full HTTP API for custom integrations
Terraform Provider – Infrastructure-as-code for reproducible deployments
SkyPilot – Orchestrate AI workloads across clusters

tcloud CLI

The tcloud CLI provides a command-line interface for managing clusters, storage, and scaling.

Installation

Download the CLI for your platform:

Authentication

Authenticate via Google SSO:

tcloud sso login

Common Commands

Create a cluster:

tcloud cluster create my-cluster \
  --num-gpus 8 \
  --reservation-duration 1 \
  --instance-type H100-SXM \
  --region us-central-8 \
  --shared-volume-name my-volume \
  --size-tib 1

Specify billing type (reserved vs on-demand):

# Reserved capacity
tcloud cluster create my-cluster \
  --num-gpus 8 \
  --billing-type prepaid \
  --reservation-duration 30 \
  --instance-type H100-SXM \
  --region us-central-8 \
  --shared-volume-name my-volume \
  --size-tib 1

# On-demand capacity
tcloud cluster create my-cluster \
  --num-gpus 8 \
  --billing-type on_demand \
  --instance-type H100-SXM \
  --region us-central-8 \
  --shared-volume-name my-volume \
  --size-tib 1

Delete a cluster:

tcloud cluster delete <CLUSTER_UUID>

List clusters:

tcloud cluster list

Scale a cluster:

tcloud cluster scale <CLUSTER_UUID> --num-gpus 16
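
When these commands are driven from a script, assembling the argument list in one place helps keep flags consistent. The following is a minimal sketch, not part of the tcloud tooling: `build_create_command` is a hypothetical helper, and it uses only the flags shown in the examples above.

```python
import shlex


def build_create_command(name, num_gpus, instance_type, region,
                         billing_type=None, reservation_duration=None,
                         volume_name=None, size_tib=None):
    """Return the argv list for `tcloud cluster create` (hypothetical helper)."""
    cmd = ["tcloud", "cluster", "create", name,
           "--num-gpus", str(num_gpus),
           "--instance-type", instance_type,
           "--region", region]
    if billing_type is not None:
        cmd += ["--billing-type", billing_type]
    if reservation_duration is not None:
        cmd += ["--reservation-duration", str(reservation_duration)]
    # Add the volume flags only when both the name and the size are given.
    if volume_name is not None and size_tib is not None:
        cmd += ["--shared-volume-name", volume_name, "--size-tib", str(size_tib)]
    return cmd


cmd = build_create_command("my-cluster", 8, "H100-SXM", "us-central-8",
                           billing_type="on_demand",
                           volume_name="my-volume", size_tib=1)
# Print the command for review; run it with subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Passing the list form to `subprocess.run` avoids shell quoting problems with cluster or volume names.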

REST API

All cluster management actions are available via REST API endpoints.

API Reference

Complete API documentation is available in the GPU Cluster API Reference.

Example: Create Cluster

curl -X POST "https://manager.cloud.together.ai/api/v1/gpu_cluster" \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-cluster",
    "num_gpus": 8,
    "instance_type": "H100-SXM",
    "region": "us-central-8",
    "billing_type": "prepaid",
    "reservation_duration": 30,
    "shared_volume": {
      "name": "my-volume",
      "size_tib": 1
    }
  }'
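
The same request can be assembled from Python. This sketch uses only the standard library and mirrors the URL, headers, and JSON fields of the curl example; `build_create_request` is a hypothetical helper name, and the request is built but not sent.

```python
import json
import os
import urllib.request

API_BASE = "https://manager.cloud.together.ai/api/v1"


def build_create_request(api_key, name, num_gpus, instance_type, region,
                         billing_type, reservation_duration, volume_name, size_tib):
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "name": name,
        "num_gpus": num_gpus,
        "instance_type": instance_type,
        "region": region,
        "billing_type": billing_type,
        "reservation_duration": reservation_duration,
        "shared_volume": {"name": volume_name, "size_tib": size_tib},
    }
    return urllib.request.Request(
        f"{API_BASE}/gpu_cluster",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_create_request(
    os.environ.get("TOGETHER_API_KEY", "<your-api-key>"),
    "my-cluster", 8, "H100-SXM", "us-central-8", "prepaid", 30, "my-volume", 1,
)
# To send it: urllib.request.urlopen(request, timeout=30)
print(request.get_method(), request.full_url)
```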

Example: List Clusters

curl -X GET "https://manager.cloud.together.ai/api/v1/gpu_clusters" \
  -H "Authorization: Bearer $TOGETHER_API_KEY"

Example: Delete Cluster

curl -X DELETE "https://manager.cloud.together.ai/api/v1/gpu_cluster/{cluster_id}" \
  -H "Authorization: Bearer $TOGETHER_API_KEY"
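
When the cluster ID comes from user input or another system, it is worth percent-encoding it before placing it in the URL path. A minimal sketch, assuming the delete endpoint shown above; `build_delete_request` is a hypothetical helper and the request is built but not sent.

```python
import urllib.parse
import urllib.request

API_BASE = "https://manager.cloud.together.ai/api/v1"


def build_delete_request(api_key, cluster_id):
    """Build (but do not send) the DELETE request for one cluster."""
    # Percent-encode the ID so unexpected characters cannot alter the URL path.
    safe_id = urllib.parse.quote(cluster_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/gpu_cluster/{safe_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="DELETE",
    )


request = build_delete_request("<your-api-key>", "<CLUSTER_UUID>")
# To send it: urllib.request.urlopen(request, timeout=30)
print(request.get_method(), request.full_url)
```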

Terraform Provider

Use the Together Terraform Provider to define clusters, storage, and scaling policies as code.

Setup

terraform {
  required_providers {
    together = {
      source  = "together-ai/together"
      version = "~> 1.0"
    }
  }
}

provider "together" {
  api_key = var.together_api_key
}

Example: Define a Cluster

resource "together_gpu_cluster" "training_cluster" {
  name              = "training-cluster"
  num_gpus          = 8
  instance_type     = "H100-SXM"
  region            = "us-central-8"
  billing_type      = "prepaid"
  reservation_days  = 30

  shared_volume {
    name     = "training-data"
    size_tib = 5
  }
}

Benefits

Version control – Track infrastructure changes in Git
Reproducibility – Deploy identical clusters across environments
Automation – Integrate with CI/CD pipelines
State management – Terraform tracks cluster state automatically
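
One way to approach the reproducibility point is to generate the cluster definition from a single function. Terraform also reads configuration written in its JSON syntax (files ending in `.tf.json`), so the sketch below emits the resource from the example above as JSON. `cluster_tf_json` is a hypothetical helper, and the attribute names are copied from the HCL example rather than checked against the provider schema.

```python
import json


def cluster_tf_json(resource_name, name, num_gpus, volume_name, size_tib):
    """Return a Terraform JSON-syntax document for one together_gpu_cluster resource."""
    return {
        "resource": {
            "together_gpu_cluster": {
                resource_name: {
                    "name": name,
                    "num_gpus": num_gpus,
                    "instance_type": "H100-SXM",
                    "region": "us-central-8",
                    "billing_type": "prepaid",
                    "reservation_days": 30,
                    # Nested blocks are written as a list of objects in JSON syntax.
                    "shared_volume": [{"name": volume_name, "size_tib": size_tib}],
                }
            }
        }
    }


doc = cluster_tf_json("training_cluster", "training-cluster", 8, "training-data", 5)
# Save the printed text as a .tf.json file next to the provider configuration.
print(json.dumps(doc, indent=2))
```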

SkyPilot Integration

Orchestrate AI workloads on GPU Clusters using SkyPilot for simplified cluster management and job scheduling.

Installation

# Quote the extra so shells such as zsh do not treat the brackets as a glob pattern
uv pip install "skypilot[kubernetes]"

Setup

Launch a Kubernetes cluster via Together Cloud, then configure kubeconfig for it.

Download the kubeconfig from the cluster UI, then either replace your existing config or merge it:

# Option 1: Replace existing config
cp together-kubeconfig ~/.kube/config

# Option 2: Merge with existing config
KUBECONFIG=./together-kubeconfig:~/.kube/config \
  kubectl config view --flatten > /tmp/merged_kubeconfig && \
  mv /tmp/merged_kubeconfig ~/.kube/config

Verify SkyPilot access:

sky check k8s

Expected output:

Checking credentials to enable infra for SkyPilot.
  Kubernetes: enabled [compute]
    Allowed contexts:
    └── t-51326e6b-25ec-42dd-8077-6f3c9b9a34c6-admin: enabled.

🎉 Enabled infra 🎉
  Kubernetes [compute]

Check available GPUs:

sky show-gpus --infra k8s

Example: Launch a Workload

Create a SkyPilot task file (task.yaml):

resources:
  accelerators: H100:8
  cloud: kubernetes

setup: |
  pip install torch transformers

run: |
  python train.py

Launch the task:

sky launch -c my-job task.yaml
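
If several jobs differ only in GPU count or commands, the task file can be generated rather than edited by hand. This is a minimal sketch using plain string formatting; `render_task_yaml` is a hypothetical helper, and it writes the same keys as the task.yaml above.

```python
def render_task_yaml(gpu_type, gpu_count, setup_commands, run_commands):
    """Render a SkyPilot task file with the same keys as task.yaml above."""
    def block(commands):
        # Indent each command so it sits inside the YAML literal block.
        return "\n".join(f"  {line}" for line in commands)

    return (
        "resources:\n"
        f"  accelerators: {gpu_type}:{gpu_count}\n"
        "  cloud: kubernetes\n"
        "\n"
        "setup: |\n"
        f"{block(setup_commands)}\n"
        "\n"
        "run: |\n"
        f"{block(run_commands)}\n"
    )


text = render_task_yaml("H100", 8, ["pip install torch transformers"], ["python train.py"])
# Write `text` to task.yaml, then launch it with `sky launch` as shown above.
print(text)
```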

Example: Fine-tune GPT OSS

Download the gpt-oss-20b.yaml configuration. Launch fine-tuning:

sky launch -c gpt-together gpt-oss-20b.yaml

Benefits

Simplified orchestration – Abstract away Kubernetes complexity
Multi-cloud support – Same workflow across different clouds
Cost optimization – Auto-select cheapest available resources
Job management – Easy monitoring and cancellation

Automation Patterns

CI/CD Integration

GitHub Actions example:

name: Train Model

on: push

jobs:
  train:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      
      - name: Create GPU Cluster
        run: |
          tcloud cluster create training-${{ github.sha }} \
            --num-gpus 8 \
            --billing-type on_demand \
            --instance-type H100-SXM \
            --region us-central-8
      
      - name: Run Training
        run: |
          # Submit training job to cluster
          kubectl apply -f training-job.yaml
      
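      - name: Cleanup
        if: always()
        run: |
          # Common Commands documents `tcloud cluster delete` with a cluster UUID;
          # if names are not accepted, pass the UUID of the cluster created above.
          tcloud cluster delete training-${{ github.sha }}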
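
The same always-clean-up guarantee applies when the pipeline is driven from a Python script instead of workflow steps. The sketch below shows the pattern with a context manager; `temporary_cluster` is a hypothetical helper, and the create and delete callables are stand-ins that only record calls, to be replaced with real tcloud or API calls.

```python
from contextlib import contextmanager


@contextmanager
def temporary_cluster(create, delete):
    """Create a cluster, yield its handle, and always delete it afterwards."""
    handle = create()
    try:
        yield handle
    finally:
        # Runs on success and on failure, like `if: always()` in the workflow.
        delete(handle)


# Stand-in callables that only record what would happen.
events = []


def fake_create():
    events.append("create")
    return "cluster-123"


def fake_delete(handle):
    events.append(f"delete {handle}")


try:
    with temporary_cluster(fake_create, fake_delete) as cluster:
        events.append(f"train on {cluster}")
        raise RuntimeError("training failed")
except RuntimeError:
    pass

print(events)  # ['create', 'train on cluster-123', 'delete cluster-123']
```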

Scheduled Jobs

Cron-based cluster creation:

# Create cluster daily at 6 AM for batch processing
# (a crontab entry must fit on one line; cron does not support backslash continuation)
0 6 * * * tcloud cluster create daily-batch --num-gpus 16 --billing-type on_demand --instance-type H100-SXM

Auto-scaling Scripts

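import os

import requests

# API key read from the environment, matching the curl examples above
API_KEY = os.environ["TOGETHER_API_KEY"]


def scale_cluster(cluster_id, target_gpus):
    """Request a new GPU count for the cluster and return the parsed response."""
    response = requests.put(
        "https://manager.cloud.together.ai/api/v1/gpu_cluster",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"cluster_id": cluster_id, "num_gpus": target_gpus},
        timeout=30,
    )
    response.raise_for_status()  # raise on HTTP errors instead of parsing an error body
    return response.json()


# Placeholder: replace with the real queue depth from your job scheduler
job_queue_length = 0

# Scale based on job queue length
if job_queue_length > 100:
    scale_cluster("cluster-123", 16)
else:
    scale_cluster("cluster-123", 8)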
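
A single threshold can make the cluster flip between sizes when the queue hovers around 100 jobs. One common refinement is a dead band between a scale-up and a scale-down threshold. The sketch below keeps the scale-up threshold of 100 from the script above; the scale-down threshold of 50 and the helper name `choose_gpu_count` are assumptions for illustration.

```python
def choose_gpu_count(queue_length, current_gpus,
                     scale_up_at=100, scale_down_at=50,
                     small=8, large=16):
    """Pick a target GPU count with a dead band between the two thresholds."""
    if queue_length > scale_up_at:
        return large
    if queue_length < scale_down_at:
        return small
    # Inside the dead band: keep the current size to avoid flapping.
    return current_gpus


current = 8
for queue_length in (120, 90, 60, 40):
    target = choose_gpu_count(queue_length, current)
    if target != current:
        # This is where a scaling call would be issued.
        print(f"queue={queue_length}: scale {current} -> {target}")
        current = target
```

With the sample queue lengths, only the first and last readings trigger a change; the readings of 90 and 60 fall inside the dead band and leave the cluster at 16 GPUs.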
