
OpenRouter Quick Start & Review

Reviewed by Josh Ausmus · Updated April 2026


What OpenRouter is

  • Unified OpenAI-compatible API that routes your requests to 300+ models from 60+ providers.
  • Handles provider keys, fallbacks, uptime optimization and billing in one place.
  • Free tier gives 25+ models and 50 requests per day. Paid adds everything else with a 5.5% platform fee on credit purchases.

Quick start

Get your API key from openrouter.ai/keys. Add these headers for attribution (helps their rankings):

HTTP-Referer: your site URL
X-Title: your app name

Python basic completion

import os

import requests

API_KEY = os.getenv("OPENROUTER_API_KEY")
url = "https://openrouter.ai/api/v1/chat/completions"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "HTTP-Referer": "https://yourapp.com",  # optional attribution
    "X-Title": "MyApp",
}

payload = {
    "model": "openai/gpt-5.2",
    "messages": [{"role": "user", "content": "Explain PID control loops in 50 words."}],
}

# json= serializes the payload and sets the Content-Type header for us
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])

Python streaming

Add "stream": True to the payload and parse the server-sent events (SSE) stream yourself, or point the official openai library at base_url="https://openrouter.ai/api/v1" and let it handle streaming for you.
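A minimal sketch of the hand-rolled SSE route. The chunk shape follows the OpenAI-style streaming format; the helper and function names are this article's own, not part of any SDK:

```python
import json
import os

import requests


def parse_sse_line(line: str):
    """Extract the delta text from one 'data: {...}' SSE line, or None."""
    if not line.startswith("data: "):
        return None  # comments and keep-alive lines are skipped
    data = line[len("data: "):]
    if data == "[DONE]":
        return None  # end-of-stream sentinel
    chunk = json.loads(data)
    return chunk["choices"][0].get("delta", {}).get("content")


def stream_completion(prompt: str) -> None:
    """Print tokens as they arrive from the chat completions endpoint."""
    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.getenv('OPENROUTER_API_KEY')}"},
        json={
            "model": "openai/gpt-5.2",
            "messages": [{"role": "user", "content": prompt}],
            "stream": True,
        },
        stream=True,  # tell requests not to buffer the whole body
    )
    resp.raise_for_status()
    for raw in resp.iter_lines(decode_unicode=True):
        text = parse_sse_line(raw or "")
        if text:
            print(text, end="", flush=True)
```

The openai-SDK route is usually less code, but parsing the stream yourself avoids a dependency and makes the wire format visible.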

TypeScript with OpenAI SDK (recommended)

import OpenAI from 'openai';

const openai = new OpenAI({
 baseURL: 'https://openrouter.ai/api/v1',
 apiKey: process.env.OPENROUTER_API_KEY,
 defaultHeaders: {
 'HTTP-Referer': 'https://yourapp.com',
 'X-Title': 'MyApp',
 },
});

const completion = await openai.chat.completions.create({
 model: 'anthropic/claude-sonnet-4.6',
 messages: [{ role: 'user', content: 'Your prompt here' }],
 stream: true, // set false for non-streaming
});

for await (const chunk of completion) {
 process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Tool calling example (add to payload)

"tools": [{
 "type": "function",
 "function": {
 "name": "get_weather",
 "parameters": { "type": "object", "properties": { "location": { "type": "string" } } }
 }
}]
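When the model elects to call the tool, its response message carries a tool_calls array; you run the function locally and send the result back as a role="tool" message. A sketch of the dispatch step, where get_weather is a hypothetical local stand-in and the message shape follows the OpenAI-compatible format:

```python
import json


# Hypothetical local function backing the get_weather tool above.
def get_weather(location: str) -> str:
    return f"Sunny in {location}"


TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(tool_call: dict) -> dict:
    """Run the named local function and shape the reply message
    the API expects for role='tool'."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])  # arguments arrive as a JSON string
    result = TOOLS[name](**args)
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "name": name,
        "content": result,
    }
```

Append the returned message to the conversation and call the completions endpoint again so the model can use the result.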

Web search and other plugins work through models that support them (e.g. certain Gemini or Claude variants); pick a model that exposes the feature you need.

Best models on OpenRouter (April 2026)

Model | Approx input/output ($/M tokens) | Context | Best use case
MiMo-V2-Flash | 0.09 / ~0.3 | 128k+ | Fast, cheap general tasks
Step 3.5 Flash (free variant) | 0 / 0 on free tier | 256k | Experimentation, prototyping
DeepSeek V3.2 / R1 | 0.25 / 0.38-2.50 | 128k-200k | Strong coding and reasoning
Grok 4.1 Fast | ~0.20 / ~0.8 | 2M | Long-context extraction
Qwen3.6 Plus / variants | 0.45-0.60 / ~1-2 | 1M | General knowledge, math
Gemini 3 Flash / 3.1 Pro | 0.50 / 1.5-2.0 | 1M+ | Multimodal (vision, audio)
Claude Sonnet 4.6 | ~3.0 / ~15 | 1M | Reliable coding, agent work
GPT-5.4 / GPT-5.2 | 2.50 / 15+ | 200k-1M | Top-tier reasoning, complex tasks
Claude Opus 4.6 | ~5.0 / 25+ | 1M | Hardest software engineering
Llama 3.3 70B (free) | 0 / 0 on free tier | 128k | Open-model testing

Prices reflect weighted provider averages. Check the model page for exact routing options. Free variants have :free in the ID and stricter rate limits.
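For exact current pricing, OpenRouter exposes a public model listing at /api/v1/models; a sketch, assuming each record carries a pricing.prompt field (quoted per token, not per million, in the listing):

```python
import requests


def cheapest_models(models: list[dict], n: int = 5) -> list[str]:
    """Return the n model IDs with the lowest prompt price.
    Free (:free) variants report a price of 0 and sort first."""
    priced = [(float(m["pricing"]["prompt"]), m["id"]) for m in models]
    return [model_id for _, model_id in sorted(priced)[:n]]


def fetch_models() -> list[dict]:
    # Public endpoint; no API key needed to list models and prices.
    resp = requests.get("https://openrouter.ai/api/v1/models", timeout=10)
    resp.raise_for_status()
    return resp.json()["data"]
```

Something like cheapest_models(fetch_models()) gives a quick, always-current counterpart to the table above.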

Honest review

Strengths

  • A single key and one SDK give you real choice between providers without managing 60 billing accounts.
  • Routing and fallbacks actually improve uptime compared to raw provider APIs.
  • Free tier works for testing. No credit card needed for the basic models.
  • Model pages show real benchmark tags and per-provider pricing transparency.
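The fallback behavior in the second bullet can also be requested explicitly: the API accepts a models list in the payload and tries each in order. A sketch under that assumption (the DeepSeek ID shown is a hypothetical example; check the routing docs for exact field semantics):

```python
import os

import requests


def build_fallback_payload(prompt: str, model_ids: list[str]) -> dict:
    """Payload asking the router to try each model in preference order."""
    return {
        "models": list(model_ids),  # ordered fallback list
        "messages": [{"role": "user", "content": prompt}],
    }


def complete_with_fallback(prompt: str) -> str:
    payload = build_fallback_payload(
        prompt, ["anthropic/claude-sonnet-4.6", "deepseek/deepseek-chat"]
    )
    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.getenv('OPENROUTER_API_KEY')}"},
        json=payload,
        timeout=60,
    )
    resp.raise_for_status()
    body = resp.json()
    # The response's "model" field reports which model actually answered.
    return body["choices"][0]["message"]["content"]
```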

Limitations and gotchas

  • 5.5% fee adds up on heavy usage. Minimum top-up charge exists.
  • Extra network hop adds 50-200ms latency versus direct provider calls.
  • Some models are quantized or routed to cheaper providers with slightly lower quality.
  • Free tier is 50 requests per day total. Heavy testing burns through it fast.
  • You still need to watch provider-specific quirks. The router doesn't fix broken system prompts.

When to use OpenRouter vs direct API access

Use OpenRouter when

  • You want to A/B test multiple models without code changes.
  • Your app needs automatic fallbacks or uptime routing.
  • You run at low-to-medium volume and value developer time over 5% savings.
  • You need one billing dashboard for everything.

Use direct provider APIs when

  • You have high volume and the 5.5% matters.
  • Latency is critical (real-time apps).
  • You already have volume discounts or credits with one provider.
  • You need provider-specific features that the router doesn't expose cleanly.
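To put the 5.5% fee in numbers, a quick sketch (the break-even framing is this article's judgment call, not OpenRouter guidance):

```python
def monthly_platform_fee(credit_spend_usd: float, fee_rate: float = 0.055) -> float:
    """Fee paid on credit purchases at the 5.5% rate quoted above."""
    return credit_spend_usd * fee_rate


# At $200/month in credits the fee is $11; at $20,000 it is $1,100,
# which is when direct provider contracts start to look attractive.
```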

The router shines for prototypes and mid-size products. At large scale the fee and extra hop usually push teams to direct connections or self-hosted gateways.

If your workload fits in the free tier or you swap models often, it saves setup time. Otherwise the convenience tax is real.

Related Guides
  • What are AI reasoning tokens? Hidden chain-of-thought computations in OpenAI o3 and DeepSeek R1 multiply costs 5-20x during test-time compute.
  • FPGA vs Microcontroller: Which Runs Your Smart Home Hub. MCUs are preferred for lower cost, simpler updates, and better power in smart home hubs.
  • Zigbee vs Z-Wave: The Protocols Running Your Smart Home. Key tradeoffs in mesh behavior, RF reliability, MCU overhead for smart home scaling.