We never charge more than the provider would charge you direct. Usually less — Atlas routes to a cheaper variant whenever quality holds, and you keep the savings. Don’t like a call? Thumbs-down it and we credit it back automatically, no questions asked. Start on free models with no card at all.
Set OPENAI_BASE_URL=https://api.newmen.ai/v1 (or ANTHROPIC_BASE_URL=https://api.newmen.ai) and supply a Newmen API key. That’s the migration. Card statement descriptor: NEWMEN.AI*CREDITS.
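As a rough sketch of that migration in application code, the helper below resolves the base URL and key from an environment-like object. The function name and the `NEWMEN_API_KEY` variable are assumptions made for illustration; the page only says that a Newmen API key is required, not which variable holds it.

```typescript
// Hypothetical helper: `newmenClientConfig` and NEWMEN_API_KEY are illustrative
// names, not part of any documented Newmen API.
interface ClientConfig {
  baseURL: string;
  apiKey: string;
}

// Base URL quoted on this page for OpenAI-compatible SDKs.
const NEWMEN_OPENAI_BASE_URL = "https://api.newmen.ai/v1";

function newmenClientConfig(env: { [name: string]: string | undefined }): ClientConfig {
  const apiKey = env["NEWMEN_API_KEY"]; // assumed variable name
  if (!apiKey) {
    throw new Error("NEWMEN_API_KEY is not set");
  }
  // Honour an explicit OPENAI_BASE_URL override, otherwise use the documented URL.
  const baseURL = env["OPENAI_BASE_URL"] || NEWMEN_OPENAI_BASE_URL;
  return { baseURL: baseURL, apiKey: apiKey };
}
```

The returned object has the `baseURL` / `apiKey` shape that the OpenAI Node SDK constructor accepts, so it could likely be passed straight to `new OpenAI(...)`.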
Atlas-1 is a router, not a single model. It looks at the operation you tagged, the eval-gate history for that operation, and the provider variants available for the underlying model — then picks the cheapest path that has historically stayed green.
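One way to picture "the cheapest path that has historically stayed green" is the sketch below. It is not Atlas's actual routing logic: the `ProviderVariant` shape, the boolean gate history, and the "every recorded gate passed" rule are all assumptions chosen to keep the idea small.

```typescript
// Illustrative only: types and selection rule are assumptions, not Atlas internals.
type Quantization = "full" | "q8" | "q4";

interface ProviderVariant {
  provider: string;
  quantization: Quantization;
  pricePerMTok: number;   // USD per 1M input tokens
  gateHistory: boolean[]; // eval-gate results for this operation, true = green
}

// A variant counts as "green" here if it has history and no failed gate.
function stayedGreen(v: ProviderVariant): boolean {
  return v.gateHistory.length > 0 && v.gateHistory.filter(ok => !ok).length === 0;
}

// Cheapest green variant, or undefined when no path passes
// (the caller would then fall back to a full-precision path).
function pickVariant(variants: ProviderVariant[]): ProviderVariant | undefined {
  const green = variants.filter(stayedGreen);
  green.sort((a, b) => a.pricePerMTok - b.pricePerMTok);
  return green[0];
}

// Hypothetical candidates for one operation.
const exampleChoice = pickVariant([
  { provider: "provider-a", quantization: "q4", pricePerMTok: 0.1, gateHistory: [true, false] },
  { provider: "provider-b", quantization: "q8", pricePerMTok: 0.4, gateHistory: [true, true] },
  { provider: "provider-c", quantization: "full", pricePerMTok: 2.0, gateHistory: [true] },
]);
```

In this example the q4 variant is cheapest but has a failed gate, so the q8 variant is chosen.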
- model: "chatgpt-5.5",
+ model: "atlas-1", // smart routing across providers
+ tier: "standard", // optional; default is "standard"
metadata: { operation_id: "summarize_ticket" },
What you keep
OpenAI-compatible API, your existing SDK, every supported provider model. Pin a specific model anytime.
What changes
atlas-1 picks the upstream + provider variant + quantization per call. The response carries a `delivery` block telling you exactly what was used.
What protects quality
If `metadata.operation_id` resolves to an operation with a bound evaluator and a min_score, calls scoring below threshold aren't billed.
Tier is a per-call hint. Atlas-1 honours it; pinned models honour it within the variants the upstream provides.
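A minimal sketch of a request body that carries these fields is shown below. The `model`, `tier`, and `metadata.operation_id` fields come from the snippet above; the `"realtime"` and `"batch"` tier strings are inferred from the tier names on this page, and the `messages` array follows the usual OpenAI-compatible chat shape. The type and function names are made up for illustration.

```typescript
// Tier strings other than "standard" are inferred from the tier names on this page.
type Tier = "realtime" | "standard" | "batch";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface AtlasChatRequest {
  model: string;
  tier?: Tier; // optional; the page says the default is "standard"
  metadata?: { operation_id: string };
  messages: ChatMessage[];
}

// Hypothetical builder: tags the call with an operation so eval gates can apply.
function buildAtlasRequest(
  operationId: string,
  messages: ChatMessage[],
  tier?: Tier
): AtlasChatRequest {
  const req: AtlasChatRequest = {
    model: "atlas-1",
    metadata: { operation_id: operationId },
    messages: messages,
  };
  if (tier !== undefined) {
    req.tier = tier; // only send the hint when the caller chose one
  }
  return req;
}

const exampleRequest = buildAtlasRequest(
  "summarize_ticket",
  [{ role: "user", content: "Summarize this ticket: ..." }],
  "standard"
);
```

Leaving `tier` out keeps the body identical to a plain OpenAI-style request apart from the model name and metadata.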
realtime
p95 < 300ms TTFT
Cost + 10%
Direct passthrough to the upstream provider, full precision. The only tier with a hard delivery guarantee on closed-weight models. Use when latency is non-negotiable.
standard
p95 < 8s TTFT
30–55% off
Atlas routes to the cheapest provider variant whose quantization has stayed green for your operation's eval gates. Default tier — your bill drops without changing business logic. Best-effort: may upgrade to Realtime when no path passes; bill follows actual delivery.
batch
~24h SLA
50–70% off
Async. Routes to provider batch APIs (OpenAI / Anthropic) where supported, and to queued spot capacity for open-weight models. Largest discounts. Streaming not supported.
Discounts are computed against the realtime cost-leader for the same logical model. Per-task numbers are published openly on the benchmarks page; per-call savings vary with prompt mix and time of day.
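Read literally, a discount "against the realtime cost-leader" is the served price compared with the cheapest realtime price for the same logical model. The sketch below assumes exactly that reading; the function names and the sample prices are hypothetical.

```typescript
// Cheapest realtime price among providers of one logical model (assumed meaning
// of "realtime cost-leader").
function realtimeCostLeader(realtimePrices: number[]): number {
  if (realtimePrices.length === 0) {
    throw new Error("need at least one realtime price");
  }
  return realtimePrices.reduce((min, p) => (p < min ? p : min), realtimePrices[0]);
}

// Discount as a fraction of the cost-leader price: 0.5 means 50% off.
function discountVsRealtime(realtimePrices: number[], servedPrice: number): number {
  const leader = realtimeCostLeader(realtimePrices);
  if (leader <= 0) {
    throw new Error("cost-leader price must be positive");
  }
  return (leader - servedPrice) / leader;
}

// Hypothetical prices in USD per 1M tokens: leader is 2.0, served at 1.0.
const exampleDiscount = discountVsRealtime([2.5, 2.0, 3.0], 1.0);
```

With these sample numbers `exampleDiscount` is 0.5, which would sit inside the Standard tier's quoted range.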
A customer running 30M tokens / month of support-ticket summarization on chatgpt-5.5 (input $2.75 / output $11 per 1M) pays roughly $210 / month.
They flip model: "atlas-1" and add a regex evaluator with min_score: 0.95 to the summarize_ticket operation. After two weeks of green eval gates on Standard tier, Atlas routes ~80% of calls to meta-llama/llama-3.1-70b-instruct Q8 on Hyperbolic ($0.40 / $0.40 per 1M) and keeps ~20% on chatgpt-5.5 for the calls the loop says still need it.
New monthly cost: ~$84. That’s a 60% reduction. Calls scoring below the regex threshold aren’t metered, so the customer absorbs zero quality risk.
Numbers are illustrative. Run the comparison runner on your own prompts for a real per-workload estimate.
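For readers who want to redo this kind of estimate, the sketch below shows the two pieces of arithmetic involved. The example does not state the input/output token split or everything the new monthly figure includes, so the even 15M / 15M split is an assumption, and the code only reproduces the baseline and the quoted percentage reduction.

```typescript
interface TokenRates {
  inputPerM: number;  // USD per 1M input tokens
  outputPerM: number; // USD per 1M output tokens
}

// Monthly cost for a given volume, with volumes expressed in millions of tokens.
function monthlyCost(inputMTok: number, outputMTok: number, rates: TokenRates): number {
  return inputMTok * rates.inputPerM + outputMTok * rates.outputPerM;
}

// Fractional reduction between two monthly bills: 0.6 means 60% lower.
function reduction(before: number, after: number): number {
  return (before - after) / before;
}

// Assumed even split of the 30M tokens: 15M input, 15M output.
const baselineUsd = monthlyCost(15, 15, { inputPerM: 2.75, outputPerM: 11 });

// Reduction implied by the two rounded figures quoted in the example.
const quotedReduction = reduction(210, 84);
```

Under the assumed split, `baselineUsd` comes to 206.25, in line with the "roughly $210" quoted above, and `quotedReduction` evaluates to 0.6.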
The eval-refund is the only thing standing between Atlas's cost claim and the long history of inference brokers promising the quantized version is fine. Bind a numeric threshold; calls below it never show up on your invoice. On every plan.
01 · Bind
Bind an evaluator with a min_score to your operation.
Regex, LLM-judge, structural — any evaluator that returns a numeric score. Set once per operation in /console/evaluators.
02 · Score
Atlas runs the evaluator on the served output, on the call path.
Same after() hook that meters usage. Score appears on the call record within about a second.
03 · Refund
Score below min_score → call's net metered quantity is zero.
A compensating Stripe meter event fires automatically. The call still appears in /console/calls so your team can correct it via feedback — but it never appears on your invoice.
Works on PAYG. Works on Reliability Loop. Works on Strategic. The mechanic is platform-wide — the only requirement is a bound evaluator with a numeric threshold and metadata.operation_id on the call.
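The three steps can be sketched as follows. This is a simplified model rather than the platform's implementation: the `Evaluator` and `MeterEvent` shapes are hypothetical, the regex evaluator scores 1 on a match and 0 otherwise, and the compensating event is modelled as a negative quantity instead of a real Stripe API call.

```typescript
// Hypothetical shapes for illustration only.
interface Evaluator {
  minScore: number;
  score(output: string): number;
}

interface MeterEvent {
  callId: string;
  quantity: number; // metered tokens; negative for a compensating event
}

// 01 Bind: a regex evaluator that returns 1 on match, 0 otherwise.
function regexEvaluator(pattern: RegExp, minScore: number): Evaluator {
  return {
    minScore: minScore,
    score: (output: string): number => {
      pattern.lastIndex = 0; // keep global/sticky regexes stateless between calls
      return pattern.test(output) ? 1 : 0;
    },
  };
}

// 02 Score + 03 Refund: meter the call, then offset it if the score is too low.
function meterCall(callId: string, tokens: number, output: string, ev: Evaluator): MeterEvent[] {
  const events: MeterEvent[] = [{ callId: callId, quantity: tokens }];
  if (ev.score(output) < ev.minScore) {
    events.push({ callId: callId, quantity: -tokens }); // compensating event
  }
  return events;
}

// Net metered quantity for a call: zero when it was refunded.
function netQuantity(events: MeterEvent[]): number {
  return events.reduce((sum, e) => sum + e.quantity, 0);
}

const summaryGate = regexEvaluator(/^Summary: .+/, 0.95);
const passedNet = netQuantity(meterCall("call-1", 1200, "Summary: printer offline", summaryGate));
const failedNet = netQuantity(meterCall("call-2", 900, "I cannot help with that.", summaryGate));
```

The failed call still produces events, which mirrors the point that it stays visible in the call log while netting to zero on the invoice.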
Usage rates are the same across all plans — the per-call tier (Realtime / Standard / Batch) decides the bill. Plan choice gates the reliability loop, the quality refund, and Atlas Network defaults.
Pay as you go
$0
$5 free credits · free models, no card
Match-or-better rate on every provider. We never charge more than direct would have, and usually less because Atlas routes to a cheaper variant when quality holds. Start free on free models with no card; add one for the full catalogue. Thumbs-down any call you don't like and we credit it back automatically.
Reliability Loop
$1,500 / mo
platform fee + 10% on usage
Everything in Pay as you go plus the productized reliability workflow — operation-aware routing, eval-gated auto-refund, evaluators as a first-class concept, ship gates as a deploy concept, dataset promotion, and per-tenant LoRA training. The 10% markup pays for the loop; the savings on the routing side usually cover it.
Strategic
Custom
annual commitment
For teams running significant volume. Negotiated rates per quantization tier, dedicated capacity, and a solutions engineer who knows your evals.
On Pay as you go, thumbs-down any call and we credit it back automatically (fair-use limits in the terms). On the Reliability Loop plan this stacks with the eval-gated auto-refund: bind an evaluator with a numeric min_score to any operation, and calls below threshold are refunded synchronously without anyone hitting thumbs-down. Strategic customers negotiate per-tier rates and dedicated capacity.
Pass any model id in the model field. The most-used models across the network this week are pinned to the top; the full provider-grouped catalogue follows. Pay as you go is at or below the upstream provider's list price, with no markup. Reliability Loop carries the standard 10% markup, itemized separately.
321 models supported
| Model ID | Realtime | Standard |
|---|---|---|
| Most popular this week · top 12 | | |
| DeepSeek: DeepSeek V4 Flash | $0.1 / 1M | $0.1 / 1M · q4 · DeepInfra |
| Tencent: Hy3 preview | $0.066 / 1M | — |
| Anthropic: Claude Opus 4.7 | $5.00 / 1M | — |
| Anthropic: Claude Sonnet 4.6 | $3.00 / 1M | — |
| DeepSeek: DeepSeek V4 Pro | $0.435 / 1M | $1.30 / 1M · q4 · DeepInfra |
| Google: Gemini 3 Flash Preview | $0.5 / 1M | — |
| DeepSeek: DeepSeek V3.2 | $0.252 / 1M | $0.252 / 1M · q8 · Baidu |
| NVIDIA: Nemotron 3 Super | $0.09 / 1M | $0.09 / 1M · q8 · DekaLLM |
| Google: Gemini 2.5 Flash Lite | $0.1 / 1M | — |
| MoonshotAI: Kimi K2.6 | $0.73 / 1M | $0.73 / 1M · q4 · Io Net |
| Google: Gemini 2.5 Flash | $0.3 / 1M | — |
| MiniMax: MiniMax M2.7 | $0.279 / 1M | $0.3 / 1M · q8 · Minimax |
| OpenAI · 63 models | | |
| OpenAI: GPT Audio | $2.50 / 1M | — |
| OpenAI: GPT Audio Mini | $0.6 / 1M | — |
| OpenAI: GPT Chat Latest | $5.00 / 1M | — |
| OpenAI: GPT-3.5 Turbo | $0.5 / 1M | — |
| OpenAI: GPT-3.5 Turbo (older v0613) | $1.00 / 1M | — |
| OpenAI: GPT-3.5 Turbo 16k | $3.00 / 1M | — |
| OpenAI: GPT-3.5 Turbo Instruct | $1.50 / 1M | — |
| OpenAI: GPT-4 | $30.00 / 1M | — |
| OpenAI: GPT-4 (older v0314) | $30.00 / 1M | — |
| OpenAI: GPT-4 Turbo | $10.00 / 1M | — |
| OpenAI: GPT-4 Turbo (older v1106) | $10.00 / 1M | — |
| OpenAI: GPT-4 Turbo Preview | $10.00 / 1M | — |
| OpenAI: GPT-4.1 | $2.00 / 1M | — |
| OpenAI: GPT-4.1 Mini | $0.4 / 1M | — |
| OpenAI: GPT-4.1 Nano | $0.1 / 1M | — |
| OpenAI: GPT-4o | $2.50 / 1M | — |
| OpenAI: GPT-4o (2024-05-13) | $5.00 / 1M | — |
| OpenAI: GPT-4o (2024-08-06) | $2.50 / 1M | — |
| OpenAI: GPT-4o (2024-11-20) | $2.50 / 1M | — |
| OpenAI: GPT-4o Audio | $2.50 / 1M | — |
| OpenAI: GPT-4o Search Preview | $2.50 / 1M | — |
| OpenAI: GPT-4o-mini | $0.15 / 1M | — |
| OpenAI: GPT-4o-mini (2024-07-18) | $0.15 / 1M | — |
| OpenAI: GPT-4o-mini Search Preview | $0.15 / 1M | — |
| OpenAI: GPT-5 | $1.25 / 1M | — |
| OpenAI: GPT-5 Chat | $1.25 / 1M | — |
| OpenAI: GPT-5 Codex | $1.25 / 1M | — |
| OpenAI: GPT-5 Image | $10.00 / 1M | — |
| OpenAI: GPT-5 Image Mini | $2.50 / 1M | — |
| OpenAI: GPT-5 Mini | $0.25 / 1M | — |
| OpenAI: GPT-5 Nano | $0.05 / 1M | — |
| OpenAI: GPT-5 Pro | $15.00 / 1M | — |
| OpenAI: GPT-5.1 | $1.25 / 1M | — |
| OpenAI: GPT-5.1 Chat | $1.25 / 1M | — |
| OpenAI: GPT-5.1-Codex | $1.25 / 1M | — |
| OpenAI: GPT-5.1-Codex-Max | $1.25 / 1M | — |
| OpenAI: GPT-5.1-Codex-Mini | $0.25 / 1M | — |
| OpenAI: GPT-5.2 | $1.75 / 1M | — |
| OpenAI: GPT-5.2 Chat | $1.75 / 1M | — |
| OpenAI: GPT-5.2 Pro | $21.00 / 1M | — |
| OpenAI: GPT-5.2-Codex | $1.75 / 1M | — |
| OpenAI: GPT-5.3 Chat | $1.75 / 1M | — |
| OpenAI: GPT-5.3-Codex | $1.75 / 1M | — |
| OpenAI: GPT-5.4 | $2.50 / 1M | — |
| OpenAI: GPT-5.4 Image 2 | $8.00 / 1M | — |
| OpenAI: GPT-5.4 Mini | $0.75 / 1M | — |
| OpenAI: GPT-5.4 Nano | $0.2 / 1M | — |
| OpenAI: GPT-5.4 Pro | $30.00 / 1M | — |
| OpenAI: GPT-5.5 | $5.00 / 1M | — |
| OpenAI: GPT-5.5 Pro | $30.00 / 1M | — |
| OpenAI: gpt-oss-120b | $0.039 / 1M | $0.05 / 1M · q4 · Novita |
| OpenAI: gpt-oss-20b | $0.03 / 1M | $0.04 / 1M · q4 · Novita |
| OpenAI: gpt-oss-safeguard-20b | $0.075 / 1M | — |
| OpenAI: o1 | $15.00 / 1M | — |
| OpenAI: o1-pro | $150.00 / 1M | — |
| OpenAI: o3 | $2.00 / 1M | — |
| OpenAI: o3 Deep Research | $10.00 / 1M | — |
| OpenAI: o3 Mini | $1.10 / 1M | — |
| OpenAI: o3 Mini High | $1.10 / 1M | — |
| OpenAI: o3 Pro | $20.00 / 1M | — |
| OpenAI: o4 Mini | $1.10 / 1M | — |
| OpenAI: o4 Mini Deep Research | $2.00 / 1M | — |
| OpenAI: o4 Mini High | $1.10 / 1M | — |
| Anthropic · 13 models | | |
| Anthropic: Claude 3 Haiku | $0.25 / 1M | — |
| Anthropic: Claude 3.5 Haiku | $0.8 / 1M | — |
| Anthropic: Claude Haiku 4.5 | $1.00 / 1M | — |
| Anthropic: Claude Opus 4 | $15.00 / 1M | — |
| Anthropic: Claude Opus 4.1 | $15.00 / 1M | — |
| Anthropic: Claude Opus 4.5 | $5.00 / 1M | — |
| Anthropic: Claude Opus 4.6 | $5.00 / 1M | — |
| Anthropic: Claude Opus 4.6 (Fast) | $30.00 / 1M | — |
| Anthropic: Claude Opus 4.7 | $5.00 / 1M | — |
| Anthropic: Claude Opus 4.7 (Fast) | $30.00 / 1M | — |
| Anthropic: Claude Sonnet 4 | $3.00 / 1M | — |
| Anthropic: Claude Sonnet 4.5 | $3.00 / 1M | — |
| Anthropic: Claude Sonnet 4.6 | $3.00 / 1M | — |
| Google · 24 models | | |
| Google: Gemini 2.0 Flash | $0.1 / 1M | — |
| Google: Gemini 2.0 Flash Lite | $0.075 / 1M | — |
| Google: Gemini 2.5 Flash | $0.3 / 1M | — |
| Google: Gemini 2.5 Flash Lite | $0.1 / 1M | — |
| Google: Gemini 2.5 Flash Lite Preview 09-2025 | $0.1 / 1M | — |
| Google: Gemini 2.5 Pro | $1.25 / 1M | — |
| Google: Gemini 2.5 Pro Preview 05-06 | $1.25 / 1M | — |
| Google: Gemini 2.5 Pro Preview 06-05 | $1.25 / 1M | — |
| Google: Gemini 3 Flash Preview | $0.5 / 1M | — |
| Google: Gemini 3.1 Flash Lite | $0.25 / 1M | — |
| Google: Gemini 3.1 Flash Lite Preview | $0.25 / 1M | — |
| Google: Gemini 3.1 Pro Preview | $2.00 / 1M | — |
| Google: Gemini 3.1 Pro Preview Custom Tools | $2.00 / 1M | — |
| Google: Gemini 3.5 Flash | $1.50 / 1M | — |
| Google: Gemma 2 27B | $0.65 / 1M | $0.65 / 1M · q4 · NextBit |
| Google: Gemma 3 12B | $0.04 / 1M | — |
| Google: Gemma 3 27B | $0.08 / 1M | $0.08 / 1M · q8 · DeepInfra |
| Google: Gemma 3 4B | $0.04 / 1M | — |
| Google: Gemma 3n 4B | $0.06 / 1M | — |
| Google: Gemma 4 26B A4B | $0.06 / 1M | $0.07 / 1M · q8 · DeepInfra |
| Google: Gemma 4 31B | $0.12 / 1M | $0.12 / 1M · q4 · DeepInfra |
| Google: Nano Banana (Gemini 2.5 Flash Image) | $0.3 / 1M | — |
| Google: Nano Banana 2 (Gemini 3.1 Flash Image Preview) | $0.5 / 1M | — |
| Google: Nano Banana Pro (Gemini 3 Pro Image Preview) | $2.00 / 1M | — |
| Meta · 12 models | | |
| Llama Guard 3 8B | $0.484 / 1M | — |
| Meta: Llama 3 70B Instruct | $0.51 / 1M | $0.51 / 1M · q8 · Novita |
| Meta: Llama 3 8B Instruct | $0.04 / 1M | $0.1 / 1M · q4 · Together |
| Meta: Llama 3.1 70B Instruct | $0.4 / 1M | $0.4 / 1M · q8 · DeepInfra |
| Meta: Llama 3.1 8B Instruct | $0.02 / 1M | $0.02 / 1M · q8 · Novita |
| Meta: Llama 3.2 11B Vision Instruct | $0.245 / 1M | $0.245 / 1M · q8 · DeepInfra |
| Meta: Llama 3.2 1B Instruct | $0.027 / 1M | — |
| Meta: Llama 3.2 3B Instruct | $0.051 / 1M | — |
| Meta: Llama 3.3 70B Instruct | $0.1 / 1M | $0.1 / 1M · q8 · DeepInfra |
| Meta: Llama 4 Maverick | $0.15 / 1M | $0.15 / 1M · q8 · DeepInfra |
| Meta: Llama 4 Scout | $0.08 / 1M | $0.08 / 1M · q8 · DeepInfra |
| Meta: Llama Guard 4 12B | $0.18 / 1M | — |
| Mistral · 24 models | | |
| Mistral Large | $2.00 / 1M | — |
| Mistral Large 2407 | $2.00 / 1M | — |
| Mistral Large 2411 | $2.00 / 1M | — |
| Mistral: Codestral 2508 | $0.3 / 1M | — |
| Mistral: Devstral 2 2512 | $0.4 / 1M | — |
| Mistral: Devstral Medium | $0.4 / 1M | — |
| Mistral: Devstral Small 1.1 | $0.1 / 1M | — |
| Mistral: Ministral 3 14B 2512 | $0.2 / 1M | $0.35 / 1M · q8 · NextBit |
| Mistral: Ministral 3 3B 2512 | $0.1 / 1M | $0.15 / 1M · q8 · NextBit |
| Mistral: Ministral 3 8B 2512 | $0.15 / 1M | $0.3 / 1M · q8 · NextBit |
| Mistral: Mistral 7B Instruct v0.1 | $0.11 / 1M | — |
| Mistral: Mistral Large 3 2512 | $0.5 / 1M | — |
| Mistral: Mistral Medium 3 | $0.4 / 1M | — |
| Mistral: Mistral Medium 3.1 | $0.4 / 1M | — |
| Mistral: Mistral Medium 3.5 | $1.50 / 1M | — |
| Mistral: Mistral Nemo | $0.02 / 1M | $0.02 / 1M · q8 · DeepInfra |
| Mistral: Mistral Small 3 | $0.05 / 1M | $0.05 / 1M · q8 · DeepInfra |
| Mistral: Mistral Small 3.1 24B | $0.351 / 1M | — |
| Mistral: Mistral Small 3.2 24B | $0.075 / 1M | $0.075 / 1M · q8 · DeepInfra |
| Mistral: Mistral Small 4 | $0.15 / 1M | $0.188 / 1M · q8 · Venice |
| Mistral: Mixtral 8x22B Instruct | $2.00 / 1M | — |
| Mistral: Pixtral Large 2411 | $2.00 / 1M | — |
| Mistral: Saba | $0.2 / 1M | — |
| Mistral: Voxtral Small 24B 2507 | $0.1 / 1M | — |
| xAI · 4 models | | |
| xAI: Grok 4.20 | $1.25 / 1M | — |
| xAI: Grok 4.20 Multi-Agent | $2.00 / 1M | — |
| xAI: Grok 4.3 | $1.25 / 1M | — |
| xAI: Grok Build 0.1 | $1.00 / 1M | — |
| DeepSeek · 13 models | | |
| DeepSeek: DeepSeek V3 | $0.229 / 1M | $0.32 / 1M · q4 · DeepInfra |
| DeepSeek: DeepSeek V3 0324 | $0.2 / 1M | $0.2 / 1M · q4 · DeepInfra |
| DeepSeek: DeepSeek V3.1 | $0.21 / 1M | $0.21 / 1M · q4 · DeepInfra |
| DeepSeek: DeepSeek V3.1 Terminus | $0.27 / 1M | $0.27 / 1M · q4 · DeepInfra |
| DeepSeek: DeepSeek V3.2 | $0.252 / 1M | $0.252 / 1M · q8 · Baidu |
| DeepSeek: DeepSeek V3.2 Exp | $0.27 / 1M | $0.27 / 1M · q8 · AtlasCloud |
| DeepSeek: DeepSeek V3.2 Speciale | $0.287 / 1M | $0.287 / 1M · q8 · AtlasCloud |
| DeepSeek: DeepSeek V4 Flash | $0.1 / 1M | $0.1 / 1M · q4 · DeepInfra |
| DeepSeek: DeepSeek V4 Pro | $0.435 / 1M | $1.30 / 1M · q4 · DeepInfra |
| DeepSeek: R1 | $0.7 / 1M | $0.7 / 1M · q8 · Novita |
| DeepSeek: R1 0528 | $0.5 / 1M | $0.5 / 1M · q4 · DeepInfra |
| DeepSeek: R1 Distill Llama 70B | $0.7 / 1M | $0.7 / 1M · q8 · DeepInfra |
| DeepSeek: R1 Distill Qwen 32B | $0.29 / 1M | $0.29 / 1M · q8 · NextBit |
| Alibaba (Qwen) · 46 models | | |
| Qwen: Qwen Plus 0728 | $0.26 / 1M | — |
| Qwen: Qwen Plus 0728 (thinking) | $0.26 / 1M | — |
| Qwen: Qwen-Plus | $0.26 / 1M | — |
| Qwen: Qwen2.5 7B Instruct | $0.04 / 1M | $0.04 / 1M · q8 · AtlasCloud |
| Qwen: Qwen2.5 VL 72B Instruct | $0.25 / 1M | $0.25 / 1M · q8 · Nebius |
| Qwen: Qwen3 14B | $0.1 / 1M | $0.1 / 1M · q4 · NextBit |
| Qwen: Qwen3 235B A22B | $0.455 / 1M | — |
| Qwen: Qwen3 235B A22B Instruct 2507 | $0.071 / 1M | $0.071 / 1M · q8 · DeepInfra |
| Qwen: Qwen3 235B A22B Thinking 2507 | $0.149 / 1M | $0.23 / 1M · q8 · DeepInfra |
| Qwen: Qwen3 30B A3B | $0.09 / 1M | $0.09 / 1M · q8 · DeepInfra |
| Qwen: Qwen3 30B A3B Instruct 2507 | $0.09 / 1M | $0.09 / 1M · q8 · SiliconFlow |
| Qwen: Qwen3 30B A3B Thinking 2507 | $0.08 / 1M | $0.08 / 1M · q8 · AtlasCloud |
| Qwen: Qwen3 32B | $0.08 / 1M | $0.08 / 1M · q8 · DeepInfra |
| Qwen: Qwen3 8B | $0.05 / 1M | $0.05 / 1M · q8 · AtlasCloud |
| Qwen: Qwen3 Coder 30B A3B Instruct | $0.07 / 1M | $0.07 / 1M · q8 · Novita |
| Qwen: Qwen3 Coder 480B A35B | $0.22 / 1M | $0.3 / 1M · q4 · DeepInfra |
| Qwen: Qwen3 Coder Flash | $0.195 / 1M | — |
| Qwen: Qwen3 Coder Next | $0.11 / 1M | $0.11 / 1M · q8 · Ionstream |
| Qwen: Qwen3 Coder Plus | $0.65 / 1M | — |
| Qwen: Qwen3 Max | $0.78 / 1M | — |
| Qwen: Qwen3 Max Thinking | $0.78 / 1M | — |
| Qwen: Qwen3 Next 80B A3B Instruct | $0.09 / 1M | $0.09 / 1M · q8 · DeepInfra |
| Qwen: Qwen3 Next 80B A3B Thinking | $0.098 / 1M | $0.15 / 1M · q8 · AtlasCloud |
| Qwen: Qwen3 VL 235B A22B Instruct | $0.2 / 1M | $0.2 / 1M · q8 · DeepInfra |
| Qwen: Qwen3 VL 235B A22B Thinking | $0.26 / 1M | — |
| Qwen: Qwen3 VL 30B A3B Instruct | $0.13 / 1M | $0.15 / 1M · q8 · AtlasCloud |
| Qwen: Qwen3 VL 30B A3B Thinking | $0.13 / 1M | $0.29 / 1M · q8 · SiliconFlow |
| Qwen: Qwen3 VL 32B Instruct | $0.104 / 1M | — |
| Qwen: Qwen3 VL 8B Instruct | $0.08 / 1M | $0.08 / 1M · q8 · AtlasCloud |
| Qwen: Qwen3 VL 8B Thinking | $0.117 / 1M | — |
| Qwen: Qwen3.5 397B A17B | $0.39 / 1M | $0.45 / 1M · q8 · Chutes |
| Qwen: Qwen3.5 Plus 2026-02-15 | $0.26 / 1M | — |
| Qwen: Qwen3.5 Plus 2026-04-20 | $0.3 / 1M | — |
| Qwen: Qwen3.5-122B-A10B | $0.26 / 1M | $0.26 / 1M · q8 · SiliconFlow |
| Qwen: Qwen3.5-27B | $0.195 / 1M | $0.25 / 1M · q8 · SiliconFlow |
| Qwen: Qwen3.5-35B-A3B | $0.139 / 1M | $0.139 / 1M · q8 · DekaLLM |
| Qwen: Qwen3.5-9B | $0.04 / 1M | $0.1 / 1M · q8 · SiliconFlow |
| Qwen: Qwen3.5-Flash | $0.065 / 1M | — |
| Qwen: Qwen3.6 27B | $0.29 / 1M | $0.29 / 1M · q8 · Io Net |
| Qwen: Qwen3.6 35B A3B | $0.14 / 1M | $0.14 / 1M · q8 · Io Net |
| Qwen: Qwen3.6 Flash | $0.188 / 1M | — |
| Qwen: Qwen3.6 Max Preview | $1.04 / 1M | — |
| Qwen: Qwen3.6 Plus | $0.325 / 1M | — |
| Qwen: Qwen3.7 Max | $1.25 / 1M | — |
| Qwen2.5 72B Instruct | $0.36 / 1M | $0.36 / 1M · q8 · DeepInfra |
| Qwen2.5 Coder 32B Instruct | $0.66 / 1M | — |
| Cohere · 4 models | | |
| Cohere: Command A | $2.50 / 1M | — |
| Cohere: Command R (08-2024) | $0.15 / 1M | — |
| Cohere: Command R+ (08-2024) | $2.50 / 1M | — |
| Cohere: Command R7B (12-2024) | $0.037 / 1M | — |
| Amazon · 5 models | | |
| Amazon: Nova 2 Lite | $0.3 / 1M | — |
| Amazon: Nova Lite 1.0 | $0.06 / 1M | — |
| Amazon: Nova Micro 1.0 | $0.035 / 1M | — |
| Amazon: Nova Premier 1.0 | $2.50 / 1M | — |
| Amazon: Nova Pro 1.0 | $0.8 / 1M | — |
| Microsoft · 3 models | | |
| Microsoft: Phi 4 | $0.065 / 1M | $0.065 / 1M · q4 · NextBit |
| Microsoft: Phi 4 Mini Instruct | $0.08 / 1M | — |
| WizardLM-2 8x22B | $0.62 / 1M | — |
| NVIDIA · 4 models | | |
| NVIDIA: Llama 3.3 Nemotron Super 49B V1.5 | $0.1 / 1M | $0.1 / 1M · q8 · DeepInfra |
| NVIDIA: Nemotron 3 Nano 30B A3B | $0.05 / 1M | $0.05 / 1M · q4 · DeepInfra |
| NVIDIA: Nemotron 3 Super | $0.09 / 1M | $0.09 / 1M · q8 · DekaLLM |
| NVIDIA: Nemotron Nano 9B V2 | $0.04 / 1M | — |
| Perplexity · 5 models | | |
| Perplexity: Sonar | $1.00 / 1M | — |
| Perplexity: Sonar Deep Research | $2.00 / 1M | — |
| Perplexity: Sonar Pro | $3.00 / 1M | — |
| Perplexity: Sonar Pro Search | $3.00 / 1M | — |
| Perplexity: Sonar Reasoning Pro | $2.00 / 1M | — |
| AI21 · 1 model | | |
| AI21: Jamba Large 1.7 | $2.00 / 1M | $2.00 / 1M · q8 · AI21 |
| Aion · 4 models | | |
| AionLabs: Aion-1.0 | $4.00 / 1M | — |
| AionLabs: Aion-1.0-Mini | $0.7 / 1M | — |
| AionLabs: Aion-2.0 | $0.8 / 1M | — |
| AionLabs: Aion-RP 1.0 (8B) | $0.8 / 1M | — |
| AlfredPros · 1 model | | |
| AlfredPros: CodeLLaMa 7B Instruct Solidity | $0.8 / 1M | — |
| Allen AI · 1 model | | |
| AllenAI: Olmo 3 32B Think | $0.15 / 1M | — |
| Anthracite · 1 model | | |
| Magnum v4 72B | $3.00 / 1M | $3.00 / 1M · q8 · Mancer 2 |
| Arcee · 6 models | | |
| Arcee AI: Coder Large | $0.5 / 1M | — |
| Arcee AI: Maestro Reasoning | $0.9 / 1M | — |
| Arcee AI: Spotlight | $0.18 / 1M | — |
| Arcee AI: Trinity Large Thinking | $0.22 / 1M | $0.22 / 1M · q8 · Parasail |
| Arcee AI: Trinity Mini | $0.045 / 1M | — |
| Arcee AI: Virtuoso Large | $0.75 / 1M | — |
| Baidu · 6 models | | |
| Baidu: ERNIE 4.5 21B A3B | $0.07 / 1M | — |
| Baidu: ERNIE 4.5 21B A3B Thinking | $0.07 / 1M | $0.07 / 1M · q8 · Novita |
| Baidu: ERNIE 4.5 300B A47B | $0.28 / 1M | — |
| Baidu: ERNIE 4.5 VL 28B A3B | $0.14 / 1M | — |
| Baidu: ERNIE 4.5 VL 424B A47B | $0.42 / 1M | — |
| Baidu: Qianfan-OCR-Fast | $0.68 / 1M | $0.68 / 1M · q8 · Baidu |
| ByteDance · 1 model | | |
| ByteDance: UI-TARS 7B | $0.1 / 1M | — |
| ByteDance Seed · 4 models | | |
| ByteDance Seed: Seed 1.6 | $0.25 / 1M | $0.25 / 1M · q8 · Seed |
| ByteDance Seed: Seed 1.6 Flash | $0.075 / 1M | $0.075 / 1M · q8 · Seed |
| ByteDance Seed: Seed-2.0-Lite | $0.25 / 1M | $0.25 / 1M · q8 · Seed |
| ByteDance Seed: Seed-2.0-Mini | $0.1 / 1M | $0.1 / 1M · q8 · Seed |
| DeepCogito · 1 model | | |
| Deep Cogito: Cogito v2.1 671B | $1.25 / 1M | — |
| Essential AI · 1 model | | |
| EssentialAI: Rnj 1 Instruct | $0.15 / 1M | — |
| Gryphe · 1 model | | |
| MythoMax 13B | $0.06 / 1M | $0.06 / 1M · q4 · NextBit |
| IBM Granite · 2 models | | |
| IBM: Granite 4.0 Micro | $0.017 / 1M | — |
| IBM: Granite 4.1 8B | $0.05 / 1M | — |
| Inception · 1 model | | |
| Inception: Mercury 2 | $0.25 / 1M | — |
| Inclusion AI · 3 models | | |
| inclusionAI: Ling-2.6-1T | $0.075 / 1M | — |
| inclusionAI: Ling-2.6-flash | $0.01 / 1M | — |
| inclusionAI: Ring-2.6-1T | $0.075 / 1M | — |
| Inflection · 2 models | | |
| Inflection: Inflection 3 Pi | $2.50 / 1M | — |
| Inflection: Inflection 3 Productivity | $2.50 / 1M | — |
| Kwai · 1 model | | |
| Kwaipilot: KAT-Coder-Pro V2 | $0.3 / 1M | $0.3 / 1M · q8 · AtlasCloud |
| Liquid · 1 model | | |
| LiquidAI: LFM2-24B-A2B | $0.03 / 1M | — |
| Mancer · 1 model | | |
| Mancer: Weaver (alpha) | $0.75 / 1M | $0.75 / 1M · q8 · Mancer 2 |
| MiniMax · 7 models | | |
| MiniMax: MiniMax M1 | $0.4 / 1M | — |
| MiniMax: MiniMax M2 | $0.255 / 1M | $0.255 / 1M · q8 · AtlasCloud |
| MiniMax: MiniMax M2-her | $0.3 / 1M | — |
| MiniMax: MiniMax M2.1 | $0.29 / 1M | $0.29 / 1M · q8 · AtlasCloud |
| MiniMax: MiniMax M2.5 | $0.15 / 1M | $0.15 / 1M · q8 · AkashML |
| MiniMax: MiniMax M2.7 | $0.279 / 1M | $0.3 / 1M · q8 · Minimax |
| MiniMax: MiniMax-01 | $0.2 / 1M | — |
| Moonshot · 5 models | | |
| MoonshotAI: Kimi K2 0711 | $0.57 / 1M | $0.57 / 1M · q8 · Novita |
| MoonshotAI: Kimi K2 0905 | $0.6 / 1M | $0.6 / 1M · q8 · AtlasCloud |
| MoonshotAI: Kimi K2 Thinking | $0.6 / 1M | $0.6 / 1M · q4 · AtlasCloud |
| MoonshotAI: Kimi K2.5 | $0.4 / 1M | $0.4 / 1M · q4 · ModelRun |
| MoonshotAI: Kimi K2.6 | $0.73 / 1M | $0.73 / 1M · q4 · Io Net |
| Morph · 2 models | | |
| Morph: Morph V3 Fast | $0.8 / 1M | — |
| Morph: Morph V3 Large | $0.9 / 1M | — |
| Newmen · 1 model | | |
| atlas-1 (routes to best model per call; trains LoRA adapters per operation) | Newmen rate | Newmen rate · atlas picks the variant |
| Nex AGI · 1 model | | |
| Nex AGI: DeepSeek V3.1 Nex N1 | $0.135 / 1M | $0.135 / 1M · q8 · SiliconFlow |
| Nous Research · 5 models | | |
| Nous: Hermes 3 405B Instruct | $1.00 / 1M | $1.00 / 1M · q8 · DeepInfra |
| Nous: Hermes 3 70B Instruct | $0.3 / 1M | $0.3 / 1M · q8 · DeepInfra |
| Nous: Hermes 4 405B | $1.00 / 1M | $1.00 / 1M · q8 · Nebius |
| Nous: Hermes 4 70B | $0.13 / 1M | $0.13 / 1M · q8 · Nebius |
| NousResearch: Hermes 2 Pro - Llama-3 8B | $0.14 / 1M | — |
| OpenRouter · 3 models | | |
| Auto Router | Variable | — |
| Body Builder (beta) | Variable | — |
| Pareto Code Router | Variable | — |
| Perceptron · 1 model | | |
| Perceptron: Perceptron Mk1 | $0.15 / 1M | — |
| Prime Intellect · 1 model | | |
| Prime Intellect: INTELLECT-3 | $0.2 / 1M | $0.2 / 1M · q8 · Nebius |
| Reka · 2 models | | |
| Reka Edge | $0.1 / 1M | — |
| Reka Flash 3 | $0.1 / 1M | $0.1 / 1M · q8 · Reka |
| Relace · 2 models | | |
| Relace: Relace Apply 3 | $0.85 / 1M | $0.85 / 1M · q8 · Relace |
| Relace: Relace Search | $1.00 / 1M | — |
| Sao10k · 5 models | | |
| Sao10K: Llama 3 8B Lunaris | $0.04 / 1M | $0.04 / 1M · q8 · DeepInfra |
| Sao10k: Llama 3 Euryale 70B v2.1 | $1.48 / 1M | — |
| Sao10K: Llama 3.1 70B Hanami x1 | $3.00 / 1M | — |
| Sao10K: Llama 3.1 Euryale 70B v2.2 | $0.85 / 1M | $0.85 / 1M · q8 · DeepInfra |
| Sao10K: Llama 3.3 Euryale 70B | $0.65 / 1M | — |
| StepFun · 1 model | | |
| StepFun: Step 3.5 Flash | $0.09 / 1M | $0.09 / 1M · q8 · DeepInfra |
| Switchpoint · 1 model | | |
| Switchpoint Router | $0.85 / 1M | — |
| Tencent · 2 models | | |
| Tencent: Hunyuan A13B Instruct | $0.14 / 1M | $0.14 / 1M · q8 · SiliconFlow |
| Tencent: Hy3 preview | $0.066 / 1M | — |
| TheDrummer · 4 models | | |
| TheDrummer: Cydonia 24B V4.1 | $0.3 / 1M | — |
| TheDrummer: Rocinante 12B | $0.17 / 1M | — |
| TheDrummer: Skyfall 36B V2 | $0.55 / 1M | $0.55 / 1M · q8 · Parasail |
| TheDrummer: UnslopNemo 12B | $0.4 / 1M | $0.4 / 1M · q8 · NextBit |
| Undi95 · 1 model | | |
| ReMM SLERP 13B | $0.45 / 1M | $0.5 / 1M · q8 · Mancer 2 |
| Upstage · 1 model | | |
| Upstage: Solar Pro 3 | $0.15 / 1M | — |
| Writer · 1 model | | |
| Writer: Palmyra X5 | $0.6 / 1M | — |
| Xiaomi · 5 models | | |
| Xiaomi: MiMo-V2-Flash | $0.1 / 1M | $0.1 / 1M · q8 · Xiaomi |
| Xiaomi: MiMo-V2-Omni | $0.4 / 1M | $0.4 / 1M · q8 · Xiaomi |
| Xiaomi: MiMo-V2-Pro | $1.00 / 1M | $1.00 / 1M · q8 · Xiaomi |
| Xiaomi: MiMo-V2.5 | $0.14 / 1M | $0.14 / 1M · q8 · Xiaomi |
| Xiaomi: MiMo-V2.5-Pro | $0.435 / 1M | $0.435 / 1M · q8 · Xiaomi |
| Z.AI · 12 models | | |
| Z.ai: GLM 4 32B | $0.1 / 1M | — |
| Z.ai: GLM 4.5 | $0.6 / 1M | $0.6 / 1M · q8 · Novita |
| Z.ai: GLM 4.5 Air | $0.125 / 1M | $0.125 / 1M · q8 · Io Net |
| Z.ai: GLM 4.5V | $0.6 / 1M | $0.6 / 1M · q8 · Novita |
| Z.ai: GLM 4.6 | $0.43 / 1M | $0.43 / 1M · q4 · DeepInfra |
| Z.ai: GLM 4.6V | $0.3 / 1M | $0.3 / 1M · q8 · Z.AI |
| Z.ai: GLM 4.7 | $0.4 / 1M | $0.4 / 1M · q4 · DeepInfra |
| Z.ai: GLM 4.7 Flash | $0.06 / 1M | $0.125 / 1M · q8 · Venice |
| Z.ai: GLM 5 | $0.6 / 1M | $0.6 / 1M · q4 · DeepInfra |
| Z.ai: GLM 5 Turbo | $1.20 / 1M | $1.20 / 1M · q8 · AtlasCloud |
| Z.ai: GLM 5.1 | $0.98 / 1M | $0.98 / 1M · q8 · Baidu |
| Z.ai: GLM 5V Turbo | $1.20 / 1M | $1.20 / 1M · q8 · Z.AI |
Prices are per 1M input tokens. Standard / Batch show the cheapest q8/q4 provider variant Atlas would route to today; the decision is made per call from your operation’s eval-gate history. Atlas Network rates publish once the partner desktop app ships out of invite-only beta.
On Pay as you go, what you see is what you pay (and often less, when Atlas routes to a cheaper variant). Reliability Loop adds a 10% markup itemised separately on every invoice in exchange for the per-operation tuning, eval-gated refund, and LoRA training. Card statement descriptor: NEWMEN.AI*CREDITS. Auto top-up keeps your balance above zero so calls never fail mid-flight; thresholds are configurable in /console/settings/billing.
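To make the itemisation concrete, here is a small sketch of how a monthly invoice might break down under the two self-serve plans. The $1,500 platform fee and the 10% markup come from this page; the `Plan` identifiers, the `Invoice` shape, and the assumption that the markup applies to the full metered usage are illustrative guesses.

```typescript
// Plan identifiers are hypothetical labels, not documented API values.
type Plan = "payg" | "reliability_loop";

interface Invoice {
  usageUsd: number;       // metered usage after any refunds
  markupUsd: number;      // itemised separately on Reliability Loop
  platformFeeUsd: number; // flat monthly fee
  totalUsd: number;
}

const RELIABILITY_LOOP_FEE_USD = 1500; // "$1,500 / mo" on this page
const RELIABILITY_LOOP_MARKUP = 0.1;   // "10% on usage" on this page

function monthlyInvoice(plan: Plan, usageUsd: number): Invoice {
  const onLoop = plan === "reliability_loop";
  const platformFeeUsd = onLoop ? RELIABILITY_LOOP_FEE_USD : 0;
  const markupUsd = onLoop ? usageUsd * RELIABILITY_LOOP_MARKUP : 0;
  return {
    usageUsd: usageUsd,
    markupUsd: markupUsd,
    platformFeeUsd: platformFeeUsd,
    totalUsd: usageUsd + markupUsd + platformFeeUsd,
  };
}

// Hypothetical month with $1,000 of metered usage on each plan.
const paygInvoice = monthlyInvoice("payg", 1000);
const loopInvoice = monthlyInvoice("reliability_loop", 1000);
```

For the same $1,000 of usage, the sketch yields $1,000 on Pay as you go and $2,600 on Reliability Loop ($1,000 usage, $100 markup, $1,500 platform fee).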
Atlas is sold to teams who commit to meaningful production volume. That commitment unlocks the reliability loop.