What OpenRouter is
- Proxy that routes one OpenAI-compatible API call to 300+ models from 60+ providers.
- Handles key management, format differences, and some fallbacks so you swap models without changing much code.
- Free tier gives 25+ models at 50 requests per day. Paid adds the full catalog plus a 5.5% platform fee on credit purchases.[1]
Quick start
Use the OpenAI SDK with a custom base URL. Add the two optional headers for OpenRouter rankings.
Python (requests - works everywhere)
import requests
import json
import os
API_KEY = os.getenv("OPENROUTER_API_KEY")
headers = {
"Authorization": f"Bearer {API_KEY}",
"HTTP-Referer": "https://yourapp.com",
"X-OpenRouter-Title": "Your App Name",
}
data = {
"model": "anthropic/claude-sonnet-4.6",
"messages": [{"role": "user", "content": "learn more PID control in 50 words."}]
}
response = requests.post(
"https://openrouter.ai/api/v1/chat/completions",
headers=headers,
data=json.dumps(data)
)
print(response.json()["choices"][0]["message"]["content"])
TypeScript with official SDK (beta)
import { OpenRouter } from '@openrouter/sdk';
const or = new OpenRouter({
apiKey: process.env.OPENROUTER_API_KEY!,
defaultHeaders: {
'HTTP-Referer': 'https://yourapp.com',
'X-OpenRouter-Title': 'Your App Name',
},
});
const completion = await or.chat.send({
model: 'anthropic/claude-sonnet-4.6',
messages: [{ role: 'user', content: 'Explain PID control in 50 words.' }],
stream: false,
});
console.log(completion.choices[0].message.content);
Streaming example (add to either)
Add "stream": true and handle the SSE stream. The response shape matches OpenAI.
Tool calling
Pass "tools": [...] exactly as in the OpenAI spec. Most frontier models on the platform support it. The router forwards the calls.
Web search / plugins
Some models expose this via tools or specific provider parameters. Check the model card, and no universal plugin flag exists.
Best models on OpenRouter (April 2026)
Pricing shown as input/output per million tokens. Add 5.5% platform fee. Context windows vary.[2]
| Model | Input | Output | Context | Best use case |
|---|---|---|---|---|
| MiMo-V2-Flash | $0.09 | ~$0.40 | ~130K | High-volume cheap tasks, fast text |
| DeepSeek V3.2 | $0.25 | $0.38 | 128K+ | Coding, logic, strong value |
| MiniMax M2.5 | $0.27 | $1.10 | 200K+ | Agentic coding, real SWE-Bench performance |
| Gemini 3 Flash | $0.50 | $3.00 | 1M | Multimodal, speed, audio |
| Kimi K2.5 / GLM-5 | $0.45 - 0.80 | $2 - 4 | 200K - 260K | Long context reasoning, general knowledge |
| Grok 4.1 Fast | $0.20 | ~$0.80 | 2M | Massive context, extraction, routing |
| Claude Sonnet 4.6 | $3.00 | $15 | 1M | Workhorse coding, agent planning |
| Claude Opus 4.6 | $5.00 | $25 | 1M | Hard reasoning, complex software engineering |
| GPT-5.4 | $2.50 | $15 | 1M | Top-tier reasoning, computer use features |
Honest review
Strengths
- One key and one endpoint for everything. Swap from Claude to DeepSeek in one config change.
- Huge catalog. You get access to new Chinese models and experimental releases faster than chasing individual providers.
- Free models let you prototype without a card.
- Unified error handling and some routing options (cost, speed, fallback).
Limitations / gotchas
- The 5.5% fee is real on every credit top-up. It adds up on high volume.
- Extra network hop adds latency. Direct provider calls stay faster.
- Rate limits and availability still depend on the underlying provider. A popular model can queue or return 429s.
- No provider-specific features like Anthropic's batch API or OpenAI's advanced caching. You get the common denominator.
- Model quantization or slight performance differences across hosts exist. The card doesn't always disclose them.
When to use OpenRouter vs direct API
- Use OpenRouter when you experiment with many models, need quick fallback, or run small-to-medium scale projects.
- Go direct when you ship high-volume production code, chase every last millisecond, or negotiate enterprise deals with one provider.
Direct vs OpenRouter
- Single model, high volume: Direct wins on price (no 5.5%) and latency.
- Multiple models or frequent swaps: OpenRouter saves engineering time.
- Prototyping or agents: OpenRouter. The catalog and unified format matter more than the fee.
- Strict SLAs or dedicated capacity: Enterprise plan or direct.
If your monthly spend exceeds a few thousand dollars, run the numbers. The convenience tax becomes real. For most side projects and mid-size apps the platform still wins on velocity.