
OpenRouter Quick Start & Review

Reviewed by Josh Ausmus · Updated April 2026

Live reference · updated continuously

What OpenRouter is

  • Proxy that routes one OpenAI-compatible API call to 300+ models from 60+ providers.
  • Handles key management, format differences, and some fallbacks so you swap models without changing much code.
  • Free tier gives 25+ models at 50 requests per day. Paid adds the full catalog plus a 5.5% platform fee on credit purchases.[1]

Quick start

Use the OpenAI SDK with a custom base URL. Add the two optional headers for OpenRouter rankings.

Python (requests - works everywhere)

import os

import requests

API_KEY = os.getenv("OPENROUTER_API_KEY")
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "HTTP-Referer": "https://yourapp.com",  # optional, for rankings
    "X-Title": "Your App Name",             # optional, for rankings
}

data = {
    "model": "anthropic/claude-sonnet-4.6",
    "messages": [{"role": "user", "content": "Explain PID control in 50 words."}],
}

response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers=headers,
    json=data,  # sets Content-Type: application/json automatically
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])

TypeScript with official SDK (beta)

import { OpenRouter } from '@openrouter/sdk';

const or = new OpenRouter({
  apiKey: process.env.OPENROUTER_API_KEY!,
  defaultHeaders: {
    'HTTP-Referer': 'https://yourapp.com',
    'X-Title': 'Your App Name',
  },
});

const completion = await or.chat.send({
  model: 'anthropic/claude-sonnet-4.6',
  messages: [{ role: 'user', content: 'Explain PID control in 50 words.' }],
  stream: false,
});

console.log(completion.choices[0].message.content);

Streaming example (add to either)

Add "stream": true and handle the SSE stream. The response shape matches OpenAI.

Tool calling

Pass "tools": [...] exactly as in the OpenAI spec. Most frontier models on the platform support it. The router forwards the calls.
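As a sketch, a request body with an OpenAI-style tool definition might look like this (the get_weather tool and its schema are purely illustrative):

```python
# Tool definition following the OpenAI function-calling schema,
# which the router forwards to the model unchanged.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

data = {
    "model": "anthropic/claude-sonnet-4.6",
    "messages": [{"role": "user", "content": "Weather in Oslo?"}],
    "tools": tools,
}
```

If the model decides to call the tool, the response carries a tool_calls entry in the assistant message, exactly as in the OpenAI spec.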

Web search / plugins

Some models expose web search via tools or provider-specific parameters. Check the model card; there is no universal plugin flag.

Best models on OpenRouter (April 2026)

Pricing shown as input/output per million tokens. Add 5.5% platform fee. Context windows vary.[2]

Model               Input        Output   Context     Best use case
MiMo-V2-Flash       $0.09        ~$0.40   ~130K       High-volume cheap tasks, fast text
DeepSeek V3.2       $0.25        $0.38    128K+       Coding, logic, strong value
MiniMax M2.5        $0.27        $1.10    200K+       Agentic coding, real SWE-Bench performance
Gemini 3 Flash      $0.50        $3.00    1M          Multimodal, speed, audio
Kimi K2.5 / GLM-5   $0.45-$0.80  $2-$4    200K-260K   Long-context reasoning, general knowledge
Grok 4.1 Fast       $0.20        ~$0.80   2M          Massive context, extraction, routing
Claude Sonnet 4.6   $3.00        $15      1M          Workhorse coding, agent planning
Claude Opus 4.6     $5.00        $25      1M          Hard reasoning, complex software engineering
GPT-5.4             $2.50        $15      1M          Top-tier reasoning, computer-use features
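To see what the 5.5% fee does to a single call, a quick back-of-envelope helper (prices per million tokens, as in the table; token counts are illustrative):

```python
PLATFORM_FEE = 0.055  # 5.5% fee on credit purchases

def effective_cost(input_toks, output_toks, in_price, out_price, fee=PLATFORM_FEE):
    """USD cost of one call, with prices quoted per million tokens, plus the fee."""
    base = (input_toks * in_price + output_toks * out_price) / 1_000_000
    return base * (1 + fee)

# Example: 10K input / 2K output on Claude Sonnet 4.6 ($3 in / $15 out per M)
cost = effective_cost(10_000, 2_000, 3.00, 15.00)
print(f"${cost:.4f}")  # $0.0633 vs $0.0600 direct
```

At this scale the fee is fractions of a cent per call; it only starts to matter at high volume, as the review below notes.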

Honest review

Strengths

  • One key and one endpoint for everything. Swap from Claude to DeepSeek in one config change.
  • Huge catalog. You get access to new Chinese models and experimental releases faster than chasing individual providers.
  • Free models let you prototype without a card.
  • Unified error handling and some routing options (cost, speed, fallback).
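A sketch of the fallback routing mentioned above, assuming the models list OpenRouter documents for ordered fallbacks (model slugs here are illustrative; verify the field against current docs before relying on it):

```python
# Entries in `models` are tried in order until one succeeds.
data = {
    "models": [
        "deepseek/deepseek-v3.2",        # primary
        "anthropic/claude-sonnet-4.6",   # fallback if the primary errors
    ],
    "messages": [{"role": "user", "content": "Explain PID control in 50 words."}],
}
primary = data["models"][0]
```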

Limitations / gotchas

  • The 5.5% fee is real on every credit top-up. It adds up on high volume.
  • Extra network hop adds latency. Direct provider calls stay faster.
  • Rate limits and availability still depend on the underlying provider. A popular model can queue or return 429s.
  • No provider-specific features like Anthropic's batch API or OpenAI's advanced caching. You get the common denominator.
  • Hosts may serve quantized or slightly different builds of the same model, and the model card doesn't always disclose this.

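A simple way to absorb those 429s is exponential backoff around the request call; a sketch, with a fake endpoint standing in for the API:

```python
import time

def post_with_backoff(send, max_retries=4, base_delay=1.0):
    """Retry `send()` on HTTP 429 with exponential backoff.
    `send` is any callable returning an object with a `.status_code`."""
    for attempt in range(max_retries + 1):
        resp = send()
        if resp.status_code != 429:
            return resp
        if attempt < max_retries:
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return resp  # still rate-limited after all retries

# Demo: a fake endpoint that rate-limits twice, then succeeds.
class Fake:
    def __init__(self):
        self.calls = 0
    def __call__(self):
        self.calls += 1
        code = 429 if self.calls <= 2 else 200
        return type("R", (), {"status_code": code})()

fake = Fake()
resp = post_with_backoff(fake, base_delay=0.01)
print(resp.status_code, fake.calls)  # 200 3
```

In production you would pass a lambda wrapping the actual requests.post call in place of the fake.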
When to use OpenRouter vs direct API

  • Use OpenRouter when you experiment with many models, need quick fallback, or run small-to-medium scale projects.
  • Go direct when you ship high-volume production code, chase every last millisecond, or negotiate enterprise deals with one provider.

Direct vs OpenRouter

  • Single model, high volume: Direct wins on price (no 5.5%) and latency.
  • Multiple models or frequent swaps: OpenRouter saves engineering time.
  • Prototyping or agents: OpenRouter. The catalog and unified format matter more than the fee.
  • Strict SLAs or dedicated capacity: Enterprise plan or direct.

If your monthly spend exceeds a few thousand dollars, run the numbers. The convenience tax becomes real. For most side projects and mid-size apps the platform still wins on velocity.
