Documentation Index

Fetch the complete documentation index at: https://docs.together.ai/llms.txt

Use this file to discover all available pages before exploring further.

Together AI Skills are instruction files that give AI coding agents domain-specific knowledge about the Together AI platform. When your agent detects a relevant task, it automatically loads the right skill and uses it to write correct code with proper model IDs, SDK patterns, and best practices. Together AI offers 12 skills covering the full platform. They work with Claude Code, Cursor, Codex, Gemini CLI, and any other coding agent you use.

Installation

npx skills add togethercomputer/skills

Verify installation

ls your-project/.claude/skills/together-*/SKILL.md

You should see one SKILL.md per installed skill.

Available skills

Once installed, skills activate automatically when the agent detects a relevant task. You can also call a skill explicitly with /<skill-name>, but most of the time the agent picks the right one on its own.
together-chat-completions: Serverless chat inference, streaming, multi-turn conversations, function calling (6 patterns), structured JSON outputs, and reasoning models.
together-images: Text-to-image generation, image editing with Kontext, FLUX model selection, LoRA-based styling, and reference-image guidance.
together-video: Text-to-video and image-to-video generation, keyframe control, model and dimension selection, and async job polling.
together-audio: Text-to-speech (REST, streaming, realtime WebSocket) and speech-to-text (transcription, translation, diarization, timestamps).
together-embeddings: Dense vector generation, semantic search, RAG pipelines, and reranking with dedicated endpoints.
together-fine-tuning: LoRA, full, DPO preference, VLM, function-calling, and reasoning fine-tuning, plus BYOM uploads.
together-batch-inference: Async batch jobs with JSONL input, polling, result downloads, and up to 50% cost savings.
together-evaluations: LLM-as-a-judge workflows: classify, score, and compare evaluations with external provider support.
together-sandboxes: Remote sandboxed Python execution with session reuse, file uploads, and chart outputs.
together-dedicated-endpoints: Single-tenant GPU endpoints with hardware sizing, autoscaling, and fine-tuned model deployment.
together-dedicated-containers: Custom Dockerized inference workers using the Jig CLI, Sprocket SDK, and queue API.
together-gpu-clusters: On-demand and reserved GPU clusters (H100, H200, B200) with Kubernetes, Slurm, and shared storage.

How skills are structured

Each skill is a self-contained directory:
skills/together-<product>/
├── SKILL.md           # Core instructions (loaded when the skill triggers)
├── references/        # Detailed docs: model lists, API parameters, CLI commands
└── scripts/           # Runnable Python and TypeScript examples
When a skill triggers, the agent first loads SKILL.md for high-level routing and rules. If it needs deeper detail (model tables, full API specs, or data format docs), it pulls from references/. For complete working code, it uses the scripts/ directory.

Use a single skill

Each skill works on its own for focused tasks. Describe what you want and the right skill activates, or invoke a specific skill with /<skill-name>. For example, /together-fine-tuning.

Chat with streaming and tool use

Build a multi-turn chatbot using Together AI with Kimi-K2.5
that can call a weather API and return structured JSON.
The agent uses together-chat-completions to generate correct v2 SDK code with the right model ID, streaming setup, tool definitions, and the complete tool-call loop.
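The tool-call loop the skill generates can be sketched as follows. This is a hand-written illustration, not the skill's output: it assumes the v2 SDK keeps the OpenAI-compatible `client.chat.completions.create` interface (as the v1 SDK does), and `get_weather` is a hypothetical stand-in for a real weather API. The skill supplies the exact model ID.

```python
import json

# Hypothetical local stand-in for a real weather API.
def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21, "conditions": "partly cloudy"}

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch_tool_call(name: str, arguments: str) -> str:
    """Route a model tool call to the matching local function."""
    if name == "get_weather":
        return json.dumps(get_weather(**json.loads(arguments)))
    raise ValueError(f"unknown tool: {name}")

def run_chat(client, model: str, user_msg: str) -> str:
    """Sketch of the complete tool-call loop.

    `client` is assumed to be a together.Together() instance whose
    chat.completions.create mirrors the OpenAI-style signature.
    """
    messages = [{"role": "user", "content": user_msg}]
    while True:
        resp = client.chat.completions.create(
            model=model, messages=messages, tools=TOOLS
        )
        msg = resp.choices[0].message
        if not getattr(msg, "tool_calls", None):
            return msg.content  # model answered directly; loop ends
        messages.append(msg)
        for call in msg.tool_calls:
            # Execute the tool and feed the result back to the model.
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": dispatch_tool_call(
                    call.function.name, call.function.arguments
                ),
            })
```

The loop ends only when the model returns a message with no tool calls, which is what lets it chain several tool invocations in one conversation.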

Generate and edit images

Generate a product hero image with FLUX.2, then use Kontext
to change the background to a rainy cyberpunk alley.
The agent uses together-images for both the initial generation and the Kontext editing call, handling base64 decoding and file saving.
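The base64-to-file step mentioned above is plain Python and looks roughly like this, assuming the image endpoint returns a base64-encoded payload (as the v1-style SDK's `b64_json` response field does):

```python
import base64
from pathlib import Path

def save_b64_image(b64_data: str, path: str) -> int:
    """Decode a base64 image payload and write it to disk.

    Returns the number of bytes written.
    """
    raw = base64.b64decode(b64_data)
    Path(path).write_bytes(raw)
    return len(raw)
```

The same helper works for both the initial FLUX.2 generation and the Kontext edit, since both return image bytes.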

Fine-tune a model

Fine-tune Llama 3.3 70B on my support conversations using LoRA,
then deploy the result to a dedicated endpoint.
The agent uses together-fine-tuning for data preparation, upload, training configuration, and monitoring, then hands off to together-dedicated-endpoints for deployment.
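The data-preparation step can be sketched as below: converting raw support conversations into the chat-format JSONL commonly used for conversational fine-tuning. The system prompt and helper names are illustrative; the skill's references/ directory documents the exact format Together expects.

```python
import json

def to_training_row(question: str, answer: str,
                    system: str = "You are a helpful support agent.") -> str:
    """Serialize one support exchange as a chat-format JSONL line."""
    return json.dumps({
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    })

def write_jsonl(pairs, path: str) -> None:
    """Write (question, answer) pairs as one JSON object per line."""
    with open(path, "w") as f:
        for q, a in pairs:
            f.write(to_training_row(q, a) + "\n")
```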

Multi-step workflow examples

Skills define hand-off boundaries between products so the agent can chain them together for multi-step workflows. Here are four examples that span multiple skills.

Build a RAG pipeline with evaluation

Embed my document corpus with Together AI, build a retrieval pipeline
with reranking, then evaluate the answer quality with an LLM judge.
The agent chains three skills:
  1. together-embeddings: generates dense vectors for your documents and builds a cosine-similarity retriever with reranking.
  2. together-chat-completions: generates answers from the retrieved context using a chat model.
  3. together-evaluations: sets up a score evaluation to grade answer quality with an LLM judge, polls for results, and downloads the per-row scores.
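The retrieval core of step 1 is simple enough to show inline. This is a minimal sketch of a cosine-similarity retriever over vectors returned by an embeddings endpoint; in practice the reranking pass would reorder the top-k candidates before they reach the chat model:

```python
import math

def cosine(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k: int = 3):
    """Return indices of the k documents most similar to the query."""
    scored = sorted(
        enumerate(doc_vecs),
        key=lambda iv: cosine(query_vec, iv[1]),
        reverse=True,
    )
    return [i for i, _ in scored[:k]]
```

A production pipeline would use a vector index instead of a linear scan, but the ranking logic is the same.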

Fine-tune, deploy, and benchmark

Fine-tune Qwen on my preference data with DPO, deploy the result,
then compare it against the base model using Together AI evaluations.
The agent chains three skills:
  1. together-fine-tuning: prepares preference pairs, runs SFT first, then DPO training, and monitors the job.
  2. together-dedicated-endpoints: deploys the fine-tuned checkpoint to a dedicated endpoint with hardware sizing and autoscaling.
  3. together-evaluations: runs a compare evaluation between the base model and your fine-tuned model, then downloads the results.
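Step 1's preference pairs are JSONL rows pairing a preferred and a non-preferred completion for the same prompt. The field names below follow the common OpenAI-style DPO schema and are illustrative only; the together-fine-tuning skill's references/ carry the exact schema Together's API expects.

```python
import json

def preference_row(prompt: str, chosen: str, rejected: str) -> str:
    """One DPO preference pair as a JSONL line.

    Field names are illustrative; check the skill's references/
    for the exact schema.
    """
    return json.dumps({
        "input": {"messages": [{"role": "user", "content": prompt}]},
        "preferred_output": [{"role": "assistant", "content": chosen}],
        "non_preferred_output": [{"role": "assistant", "content": rejected}],
    })
```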

Generate product media from a single prompt

Generate a product photo with FLUX.2, edit it with Kontext to add
studio lighting, then animate the final image into a 5-second video.
The agent chains two skills:
  1. together-images: generates the initial image, then edits it with Kontext for studio lighting.
  2. together-video: takes the edited image as a first-frame keyframe, submits an image-to-video job, polls until completion, and downloads the MP4.
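The polling step in image-to-video generation follows the standard async-job pattern. Here is a generic sketch with the status fetcher injected as a callable, so it works regardless of the exact SDK method the skill generates (the `"completed"`/`"failed"` state names are illustrative):

```python
import time

def poll_job(get_status, interval_s: float = 5.0, timeout_s: float = 600.0):
    """Poll get_status() until the job reaches a terminal state.

    get_status is any zero-argument callable returning a status dict,
    e.g. a closure over the SDK's job-retrieval call.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status()
        if status.get("state") in ("completed", "failed"):
            return status
        time.sleep(interval_s)
    raise TimeoutError("video job did not finish in time")
```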

Batch-process and analyze results

Classify 50,000 support tickets overnight with the batch API,
then run the results through Together Sandboxes to generate
a breakdown chart by category.
The agent chains two skills:
  1. together-batch-inference: prepares the JSONL input, uploads it, creates the batch job, and polls until the results are ready.
  2. together-sandboxes: uploads the results file to a sandboxed Python session, runs pandas analysis, and generates a matplotlib chart.
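The JSONL preparation in step 1 pairs each ticket with a classification request. The request-line shape below is sketched from the OpenAI-style batch format (`custom_id` plus a request `body`); confirm the exact fields in the together-batch-inference references/.

```python
import json

def batch_line(ticket_id: str, text: str, model: str) -> str:
    """One batch-inference request as a JSONL line (fields illustrative)."""
    return json.dumps({
        "custom_id": ticket_id,
        "body": {
            "model": model,
            "messages": [
                {"role": "system",
                 "content": "Classify this support ticket into one category."},
                {"role": "user", "content": text},
            ],
        },
    })

def write_batch_input(tickets, path: str, model: str) -> None:
    """tickets is an iterable of (ticket_id, text) pairs."""
    with open(path, "w") as f:
        for ticket_id, text in tickets:
            f.write(batch_line(ticket_id, text, model) + "\n")
```

The `custom_id` is what lets you join each result row back to its source ticket after the batch completes.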

SDK compatibility

All code generated by these skills targets the Together Python v2 SDK (together>=2.0.0) and the Together TypeScript SDK (together-ai). If you are upgrading from v1, see the migration guide for breaking changes in method names, argument styles, and response shapes.

Resources