Endpoints

Create and manage dedicated endpoints for model inference.

Endpoint ID

Many commands require an ENDPOINT_ID to identify which endpoint to operate on. The endpoint ID is a unique identifier assigned when an endpoint is created, in the format:

endpoint-<uuid>

For example: endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462

The endpoint ID is different from the model name (e.g., meta-llama/Llama-3.3-70B-Instruct-Turbo) or the display name you set with --display-name.
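Because the endpoint ID is easy to confuse with a model name or display name, a script can check the format before calling a command. The following is a minimal sketch: `is_endpoint_id` is a hypothetical helper, not part of the CLI, and the 8-4-4-4-12 hexadecimal layout is assumed from the example ID above.

```shell
# Return 0 if the argument looks like endpoint-<uuid>, non-zero otherwise.
# Assumption: the UUID uses the common 8-4-4-4-12 hexadecimal layout.
is_endpoint_id() {
  printf '%s\n' "$1" |
    grep -Eq '^endpoint-[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

is_endpoint_id "endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462" && echo "valid"
is_endpoint_id "meta-llama/Llama-3.3-70B-Instruct-Turbo" || echo "not an endpoint ID"
```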

Find your endpoint ID

To find your endpoint ID, you can:

Run the tg endpoints create command to create an endpoint. The endpoint ID is returned in the output.
Run the tg endpoints list command to list all your endpoints. The endpoint ID is displayed for each endpoint.
View the endpoint details page in the Together AI console.
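To pull IDs out of command output in a script, one option is to match the `endpoint-<uuid>` pattern directly. This sketch assumes the IDs appear verbatim in the text printed by `tg endpoints list`; the exact output layout is not documented here, and `extract_endpoint_ids` is a hypothetical helper name.

```shell
# Print every endpoint-<uuid> token found on stdin, one per line, de-duplicated.
# Assumption: IDs appear verbatim in the text and use the 8-4-4-4-12 hex layout.
extract_endpoint_ids() {
  grep -Eo 'endpoint-[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}' | sort -u
}

# Usage (requires the tg CLI and valid credentials):
#   tg endpoints list | extract_endpoint_ids
```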

Create

Create a new dedicated endpoint.

Shell

tg endpoints create \
  --model meta-llama/Llama-3.3-70B-Instruct-Turbo \
  --hardware 4x_nvidia_h100_80gb_sxm \
  --display-name "My Endpoint" \
  --wait

Parameters

Flag	Description
`--model [string]`	(required) The model to deploy
`--hardware [string]`	(required) GPU type to use for inference. Use `tg endpoints hardware` to discover available GPU identifiers.
`--min-replicas [number]`	Minimum number of replicas to deploy
`--max-replicas [number]`	Maximum number of replicas to deploy
`--display-name [string]`	A human-readable name for the endpoint
`--no-auto-start`	Create the endpoint in STOPPED state instead of auto-starting it
`--no-speculative-decoding`	Disable speculative decoding for this endpoint
`--inactive-timeout [number]`	Number of minutes of inactivity after which the endpoint will be automatically stopped. Set to 0 to disable.
`--availability-zone [string]`	Start the endpoint in the specified availability zone. Use `tg endpoints availability-zones` to discover valid options.
`--wait`	Wait for the endpoint to be ready after creation
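The optional flags can be combined in a single call. The sketch below wraps `tg endpoints create` in a hypothetical helper, `create_autoscaled_endpoint`, that sets a replica range and an inactivity timeout. The values 1, 3, and 30 are illustrative only, not recommended defaults.

```shell
# Create an endpoint that scales between 1 and 3 replicas and stops
# automatically after 30 minutes of inactivity. Values are illustrative.
#   $1 = model, $2 = hardware identifier, $3 = display name
create_autoscaled_endpoint() {
  tg endpoints create \
    --model "$1" \
    --hardware "$2" \
    --min-replicas 1 \
    --max-replicas 3 \
    --inactive-timeout 30 \
    --display-name "$3" \
    --wait
}

# Usage (requires the tg CLI and valid credentials):
#   create_autoscaled_endpoint meta-llama/Llama-3.3-70B-Instruct-Turbo \
#     4x_nvidia_h100_80gb_sxm "My Endpoint"
```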

Hardware

List all hardware options (optionally filtered by model and availability).

Shell

tg endpoints hardware

Parameters

Flag	Description
`--model [string]`	Show only hardware that is compatible with the given model.
`--available`	Show only hardware that is currently available.
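Both filters can be passed together, which is a convenient way to pick a value for `--hardware` before creating an endpoint. A minimal sketch, where `available_hardware_for` is a hypothetical helper name:

```shell
# List hardware that is compatible with a model and currently available.
#   $1 = model name
available_hardware_for() {
  tg endpoints hardware --model "$1" --available
}

# Usage (requires the tg CLI and valid credentials):
#   available_hardware_for meta-llama/Llama-3.3-70B-Instruct-Turbo
```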

Retrieve

Print details for a specific endpoint.

Shell

tg endpoints retrieve endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462

Update

Update the configuration of an existing endpoint.

Shell

tg endpoints update endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462 \
  --min-replicas 2 \
  --max-replicas 4

Parameters

Flag	Description
`--display-name [string]`	New human-readable name for the endpoint.
`--min-replicas [number]`	New minimum number of replicas to maintain.
`--max-replicas [number]`	New maximum number of replicas to scale up to.
`--inactive-timeout [number]`	Number of minutes of inactivity after which the endpoint will be automatically stopped. Set to 0 to disable.

Note: --min-replicas and --max-replicas must be specified together.
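Since the two replica flags travel together, a small wrapper can make it hard to pass only one of them. In this sketch `scale_endpoint` is a hypothetical helper, and the check that the minimum does not exceed the maximum is a local sanity check added here, not documented CLI behavior.

```shell
# Update the replica range of an endpoint, always passing both bounds.
#   $1 = endpoint ID, $2 = min replicas, $3 = max replicas
scale_endpoint() {
  if [ "$#" -ne 3 ]; then
    echo "usage: scale_endpoint ENDPOINT_ID MIN MAX" >&2
    return 2
  fi
  # Local sanity check (an assumption, not taken from the CLI docs).
  if [ "$2" -gt "$3" ]; then
    echo "error: min replicas ($2) exceeds max replicas ($3)" >&2
    return 1
  fi
  tg endpoints update "$1" --min-replicas "$2" --max-replicas "$3"
}

# Usage (requires the tg CLI and valid credentials):
#   scale_endpoint endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462 2 4
```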

Start

Start a dedicated endpoint.

Shell

tg endpoints start endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462

Parameters

Flag	Description
`--wait`	Wait for the endpoint to start

Stop

Stop a dedicated endpoint.

Shell

tg endpoints stop endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462

Parameters

Flag	Description
`--wait`	Wait for the endpoint to stop
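The `--wait` flag on both commands makes it possible to chain a stop and a start into a restart. A minimal sketch, with `restart_endpoint` as a hypothetical helper name; the start only runs if the stop command succeeds.

```shell
# Stop an endpoint, wait for it to stop, then start it and wait again.
#   $1 = endpoint ID
restart_endpoint() {
  tg endpoints stop "$1" --wait && tg endpoints start "$1" --wait
}

# Usage (requires the tg CLI and valid credentials):
#   restart_endpoint endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462
```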

Delete

Delete a dedicated endpoint.

Shell

tg endpoints delete endpoint-c2a48674-9ec7-45b3-ac30-0f25f2ad9462

List

List your dedicated endpoints.

Shell

tg endpoints list

Options

Flag	Description
`--usage-type [on-demand | reserved]`	Filter endpoints by usage type.
`--after [string]`	The cursor to start listing from.
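As a sketch of the usage-type filter, the hypothetical helper below rejects any value other than the two listed above before calling the CLI. The argument check is done locally and is not a statement about how the CLI itself reports invalid values.

```shell
# List endpoints of one usage type, accepting only the documented values.
#   $1 = on-demand or reserved
list_endpoints_by_usage() {
  case "$1" in
    on-demand|reserved)
      tg endpoints list --usage-type "$1"
      ;;
    *)
      echo "usage type must be 'on-demand' or 'reserved'" >&2
      return 2
      ;;
  esac
}

# Usage (requires the tg CLI and valid credentials):
#   list_endpoints_by_usage reserved
```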
