We host 100+ open-source models on our serverless inference platform and even more on dedicated endpoints. This guide helps you choose the right model for your specific use case. For a complete list of all available models with detailed specifications, visit our Serverless and Dedicated Models pages.
| Use Case | Recommended Model | Model String | Alternatives | Learn More |
| --- | --- | --- | --- | --- |
| Chat | Kimi K2.5 (instant mode) | `moonshotai/Kimi-K2.5` | `deepseek-ai/DeepSeek-V3.1`, `openai/gpt-oss-120b` | Chat |
| Reasoning | Kimi K2.5 (reasoning mode) | `moonshotai/Kimi-K2.5` | `deepseek-ai/DeepSeek-R1`, `Qwen/Qwen3-235B-A22B-Thinking-2507` | Reasoning Guide, DeepSeek R1 |
| Coding Agents | Kimi K2.5 (reasoning mode) | `moonshotai/Kimi-K2.5` | `Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8`, `deepseek-ai/DeepSeek-V3.1` | Building Agents |
| Small & Fast | GPT-OSS 20B | `openai/gpt-oss-20b` | `Qwen/Qwen2.5-7B-Instruct-Turbo`, `meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo` | - |
| Medium General Purpose | GPT-OSS 120B | `openai/gpt-oss-120b` | `zai-org/GLM-4.5-Air-FP8`, `Qwen/Qwen3-Next-80B-A3B-Instruct` | - |
| Function Calling | GLM 4.7 | `zai-org/GLM-4.7` | `moonshotai/Kimi-K2.5`, `moonshotai/Kimi-K2-Instruct-0905` | Function Calling |
| Vision | Kimi K2.5 | `moonshotai/Kimi-K2.5` | `meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8`, `Qwen/Qwen3-VL-32B-Instruct` | Vision, OCR |
| Image Generation | Flash Image 2.5 (Nano Banana) | `google/flash-image-2.5` | `black-forest-labs/FLUX.2-pro`, `ByteDance-Seed/Seedream-4.0` | Images |
| Image-to-Image | Flash Image 2.5 (Nano Banana) | `google/flash-image-2.5` | `black-forest-labs/FLUX.1-kontext-max`, `google/gemini-3-pro-image` | Flux Kontext |
| Text-to-Video | Sora 2 | `openai/sora-2-pro` | `google/veo-3.0`, `ByteDance/Seedance-1.0-pro` | Video Generation |
| Image-to-Video | Veo 3.0 | `google/veo-3.0` | `ByteDance/Seedance-1.0-pro`, `kwaivgI/kling-2.1-master` | Video Generation |
| Text-to-Speech | Cartesia Sonic 3 | `cartesia/sonic-3` | `canopylabs/orpheus-3b-0.1-ft`, `hexgrad/Kokoro-82M` | Text-to-Speech |
| Speech-to-Text | Whisper Large v3 | `openai/whisper-large-v3` | `mistralai/Voxtral-Mini-3B-2507` | Speech-to-Text |
| Embeddings | GTE-Modernbert-base | `Alibaba-NLP/gte-modernbert-base` | `intfloat/multilingual-e5-large-instruct` | Embeddings |
| Rerank | MixedBread Rerank Large | `mixedbread-ai/Mxbai-Rerank-Large-V2` | - | Rerank, Guide |
| Moderation | Virtue Guard | `VirtueAI/VirtueGuard-Text-Lite` | `meta-llama/Llama-Guard-4-12B` | - |

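Once you have picked a row, the value in the **Model String** column is what you pass as the `model` parameter in your API requests. Below is a minimal sketch of a chat request, assuming the platform exposes an OpenAI-compatible chat completions endpoint; the `base_url` and the `INFERENCE_API_KEY` environment variable are placeholders you would replace with the values from your own account settings.

```python
# Minimal sketch: calling a model from the table via an OpenAI-compatible endpoint.
# The base_url and API-key environment variable below are placeholders, not the
# platform's actual values -- substitute your own.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",      # placeholder: your platform's base URL
    api_key=os.environ["INFERENCE_API_KEY"],    # placeholder: your API key env var
)

# Use the "Model String" column value as the model parameter,
# e.g. the recommended chat model from the first row of the table.
response = client.chat.completions.create(
    model="moonshotai/Kimi-K2.5",
    messages=[{"role": "user", "content": "Summarize this guide in one sentence."}],
)

print(response.choices[0].message.content)
```

The same pattern applies to the other rows: swap in the model string for your use case (for example, `Alibaba-NLP/gte-modernbert-base` for embeddings) and call the corresponding endpoint documented in the Learn More guides.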
**Need help choosing?** For high-volume production workloads, consider Dedicated Inference for guaranteed capacity and predictable performance.