Together AI Skills are instruction files that give AI coding agents domain-specific knowledge about the Together AI platform. When your agent detects a relevant task, it automatically loads the right skill and uses it to write correct code with proper model IDs, SDK patterns, and best practices. Together AI offers 12 skills covering the full platform. They work with Claude Code, Cursor, Codex, Gemini CLI, and any other coding agent you use.

## Documentation Index
Fetch the complete documentation index at: https://docs.together.ai/llms.txt
Use this file to discover all available pages before exploring further.
## Installation

### Verify installation

You should see one SKILL.md per installed skill.
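As a quick sketch, you can list installed skills with a few lines of Python. The path below is an illustrative assumption, not the actual location — adjust it to wherever your agent loads skills from.

```python
from pathlib import Path

def installed_skills(skills_dir: Path) -> list[str]:
    """Return the name of every skill directory that contains a SKILL.md."""
    return sorted(p.parent.name for p in skills_dir.glob("*/SKILL.md"))

# Hypothetical location -- adjust to wherever your agent stores skills.
print(installed_skills(Path.home() / ".claude" / "skills"))
```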
## Available skills

Once installed, skills activate automatically when the agent detects a relevant task. You can also call a skill explicitly with /<skill-name>, but most of the time the agent picks the right one on its own.
| Skill | What it covers |
|---|---|
| together-chat-completions | Serverless chat inference, streaming, multi-turn conversations, function calling (6 patterns), structured JSON outputs, and reasoning models. |
| together-images | Text-to-image generation, image editing with Kontext, FLUX model selection, LoRA-based styling, and reference-image guidance. |
| together-video | Text-to-video and image-to-video generation, keyframe control, model and dimension selection, and async job polling. |
| together-audio | Text-to-speech (REST, streaming, realtime WebSocket) and speech-to-text (transcription, translation, diarization, timestamps). |
| together-embeddings | Dense vector generation, semantic search, RAG pipelines, and reranking with dedicated endpoints. |
| together-fine-tuning | LoRA, full, DPO preference, VLM, function-calling, and reasoning fine-tuning, plus BYOM uploads. |
| together-batch-inference | Async batch jobs with JSONL input, polling, result downloads, and up to 50% cost savings. |
| together-evaluations | LLM-as-a-judge workflows: classify, score, and compare evaluations with external provider support. |
| together-sandboxes | Remote sandboxed Python execution with session reuse, file uploads, and chart outputs. |
| together-dedicated-endpoints | Single-tenant GPU endpoints with hardware sizing, autoscaling, and fine-tuned model deployment. |
| together-dedicated-containers | Custom Dockerized inference workers using the Jig CLI, Sprocket SDK, and queue API. |
| together-gpu-clusters | On-demand and reserved GPU clusters (H100, H200, B200) with Kubernetes, Slurm, and shared storage. |
## How skills are structured

Each skill is a self-contained directory. SKILL.md holds the high-level routing and rules. When a skill needs deeper detail (model tables, full API specs, or data format docs), it pulls from references/. For complete working code, it uses the scripts/ directory.
## Use a single skill

Each skill works on its own for focused tasks. Describe what you want and the right skill activates, or invoke a specific skill with /<skill-name>. For example, /together-fine-tuning.
### Chat with streaming and tool use

The agent uses together-chat-completions to generate correct v2 SDK code with the right model ID, streaming setup, tool definitions, and the complete tool-call loop.
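To illustrate the shape of that loop, here is a minimal sketch. The get_weather tool, its stubbed result, and the response attributes accessed (tool_calls, function.name, function.arguments) are illustrative assumptions in the OpenAI-compatible function-calling style — the skill generates the precise v2 SDK calls for you.

```python
import json

# Hypothetical tool schema (OpenAI-compatible function-calling format).
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch_tool_call(name: str, arguments: str) -> str:
    """Run the local function a tool call names; return a JSON string result."""
    args = json.loads(arguments)
    if name == "get_weather":
        return json.dumps({"city": args["city"], "forecast": "sunny"})  # stubbed
    raise ValueError(f"unknown tool: {name}")

def run_tool_loop(client, model: str, messages: list) -> str:
    """Sketch of the complete tool-call loop: keep calling the model, executing
    any tool calls it requests, until it answers with plain content."""
    while True:
        resp = client.chat.completions.create(model=model, messages=messages, tools=tools)
        msg = resp.choices[0].message
        if not getattr(msg, "tool_calls", None):
            return msg.content
        messages.append(msg)  # echo the assistant turn back to the model
        for call in msg.tool_calls:
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": dispatch_tool_call(call.function.name, call.function.arguments),
            })
```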
### Generate and edit images

The agent uses together-images for both the initial generation and the Kontext editing call, handling base64 decoding and file saving.
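The decode-and-save step is plain standard-library work. A minimal sketch — the helper name and the assumption that the response arrives as a base64 string are illustrative:

```python
import base64
from pathlib import Path

def save_base64_image(b64_data: str, path: str) -> Path:
    """Decode a base64-encoded image payload and write the bytes to disk."""
    out = Path(path)
    out.write_bytes(base64.b64decode(b64_data))
    return out
```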
### Fine-tune a model

The agent uses together-fine-tuning for data preparation, upload, training configuration, and monitoring, then hands off to together-dedicated-endpoints for deployment.
## Multi-step workflow examples

Skills define hand-off boundaries between products so the agent can chain them together for multi-step workflows. Here are four examples that span multiple skills.

### Build a RAG pipeline with evaluation
1. together-embeddings: generates dense vectors for your documents and builds a cosine-similarity retriever with reranking.
2. together-chat-completions: generates answers from the retrieved context using a chat model.
3. together-evaluations: sets up a score evaluation to grade answer quality with an LLM judge, polls for results, and downloads the per-row scores.
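The retrieval step in that pipeline reduces to cosine similarity over embedding vectors. A dependency-free sketch of the ranking (the vectors themselves would come from together-embeddings):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k(query_vec: list[float], doc_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Indices of the k documents most similar to the query, best first."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]
```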
### Fine-tune, deploy, and benchmark

1. together-fine-tuning: prepares preference pairs, runs SFT first, then DPO training, and monitors the job.
2. together-dedicated-endpoints: deploys the fine-tuned checkpoint to a dedicated endpoint with hardware sizing and autoscaling.
3. together-evaluations: runs a compare evaluation between the base model and your fine-tuned model, then downloads the results.
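Preference pairs for DPO are typically written one per JSONL line. This sketch uses illustrative field names (prompt, chosen, rejected) — the skill's reference docs define the exact schema the fine-tuning API expects.

```python
import json

def write_preference_pairs(examples: list[dict], path: str) -> int:
    """Write one preference pair per JSONL line; return the number written.
    Field names here are illustrative, not the authoritative schema."""
    with open(path, "w") as f:
        for ex in examples:
            f.write(json.dumps({
                "prompt": ex["prompt"],
                "chosen": ex["chosen"],
                "rejected": ex["rejected"],
            }) + "\n")
    return len(examples)
```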
### Generate product media from a single prompt

1. together-images: generates the initial image, then edits it with Kontext for studio lighting.
2. together-video: takes the edited image as a first-frame keyframe, submits an image-to-video job, polls until completion, and downloads the MP4.
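The submit-then-poll pattern is the same for any async job. A generic sketch — the terminal state names here are assumptions, and the skill fills in the actual job-status fields from the video API:

```python
import time

def poll_until_done(fetch_status, interval: float = 5.0, timeout: float = 600.0) -> str:
    """Call fetch_status() until it returns a terminal state or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in ("completed", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError("job did not reach a terminal state in time")
```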
### Batch-process and analyze results

1. together-batch-inference: prepares the JSONL input, uploads it, creates the batch job, and polls until the results are ready.
2. together-sandboxes: uploads the results file to a sandboxed Python session, runs pandas analysis, and generates a matplotlib chart.
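Preparing the JSONL input is a one-request-per-line exercise. A sketch assuming an OpenAI-style request shape (a custom_id plus a body) — check the skill's reference docs for the exact fields the batch API expects:

```python
import json

def write_batch_input(prompts: list[str], model: str, path: str) -> int:
    """Write one chat request per JSONL line; return the number of requests."""
    with open(path, "w") as f:
        for i, prompt in enumerate(prompts):
            f.write(json.dumps({
                "custom_id": f"request-{i}",
                "body": {
                    "model": model,
                    "messages": [{"role": "user", "content": prompt}],
                },
            }) + "\n")
    return len(prompts)
```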
## SDK compatibility
All code generated by these skills targets the Together Python v2 SDK (together>=2.0.0) and the Together TypeScript SDK (together-ai).
If you are upgrading from v1, see the migration guide for breaking changes in method names, argument styles, and response shapes.
## Resources
- Skills repository on GitHub: Source code, full reference docs, and runnable scripts for all 12 skills.
- Together AI MCP server: Connect your coding agent to the Together AI documentation via MCP.
- Together AI cookbook: End-to-end examples and tutorials.
- Agent Skills specification: The open standard these skills follow.