Atlas.
Faster. Cheaper. Smarter. Or the call’s on us.
One API for every model — and never priced above going direct. Usually less, because Atlas routes to a cheaper variant that holds quality. Don’t like a call? Money back. Change two env vars and ship.
Atlas speaks both OpenAI Chat Completions and Anthropic Messages on the wire. Codex CLI, Claude Code, Cursor, OpenAI / Anthropic SDKs, langchain, llamaindex — everything you already use just works.
export ANTHROPIC_BASE_URL="https://api.newmen.ai"
export ANTHROPIC_API_KEY="nm_live_..."Worked example · 30M tokens/mo · openai/gpt-4o pinned
$210/mo direct → $84/mo on Newmen
60% off, no markup, same model id. Atlas routed to a quality-equivalent cheaper variant; you pay the savings, not the premium. Don’t like a call? Thumbs-down it → instant credit-back. See the full math →
Not theory — we ran a real coding agent through Newmen and built the same app twice. Read the case study: same apps, half the bill →
Top by intelligence.
Pin any provider id you already use — openai/gpt-4o, anthropic/claude-sonnet-4, meta-llama/llama-3.1-70b-instruct — at-or-below the provider-direct rate, and Newmen routes each call to the cheapest variant that holds quality.
- #01OpenAI: GPT-5.5Closed90.7openai/gpt-5.5·from $5.00/M
- #02Anthropic: Claude Opus 4.7Closed89.8anthropic/claude-opus-4.7·from $5.00/M
- #03OpenAI: GPT-5Closed89.2openai/gpt-5·from $1.25/M
- #04OpenAI: o3Closed88.0openai/o3·from $2.00/M
- #05Anthropic: Claude Opus 4Closed87.7anthropic/claude-opus-4·from $15.00/M
- #06DeepSeek: R1Open84.7deepseek/deepseek-r1·from $0.70/M
- #07Qwen: Qwen3 235B A22BOpen82.8qwen/qwen3-235b-a22b·from $0.46/M
- #08Qwen: Qwen3 MaxOpen81.4qwen/qwen3-max·from $0.78/M
Every other broker says “trust us, the cheaper one is fine.” We promise the opposite: never more than direct, money back if you don’t like it. Two env vars to switch.
Cheaper without quality loss.
Atlas inspects the model you pinned, the provider variants available for it, and the live latency / throughput / quality telemetry across every provider. It picks the cheapest variant that holds quality — and never charges you more than the original provider would have direct.
- Pay as you go — zero markup, match-or-better rate. Margin comes from routing to a cheaper variant; you keep the savings. Free models cost nothing and need no card.
- Pro — per-operation tuning, eval-gated auto-refund, LoRA adapters. 10% markup itemised separately.
- Speed is a side effect — Atlas avoids over-served capacity and skips queues at peak hours, so p95 latency typically drops 10–30%.
Don’t like a routed call? Thumbs-down it and we credit it back automatically, no questions asked.
03 — Switch in seconds
Two env vars. Any tool.
Codex CLI, Claude Code, Cursor, the OpenAI / Anthropic SDKs in any language, langchain, llamaindex, OpenRouter clients — they all read the same env vars. Set them and ship.
export ANTHROPIC_BASE_URL="https://api.newmen.ai"
export ANTHROPIC_API_KEY="nm_live_..."Why Newmen
Never pay more than direct.
One key. Any model. We charge you the cheapest path that holds quality, never more than the original provider would direct, and credit you back automatically when a call disappoints.
Match-or-better rate, every call
Pay no more than the original provider would charge direct — usually less, because Atlas routes to a quality-equivalent cheaper variant when it can. The savings show up line-by-line in /console/usage.
Thumbs-down → instant credit
Don't like a call? Hit thumbs-down and we credit it back automatically, no questions asked. Your $5 of signup credits are real money on a real card — and free models cost nothing at all.
Two env vars to switch
Set OPENAI_BASE_URL or ANTHROPIC_BASE_URL plus a Newmen key. Codex CLI, Claude Code, Cursor, OpenAI / Anthropic SDKs, langchain, llamaindex — everything routes through Atlas. No SDK rewrite. Existing model ids keep working.
Pro tier · optional
Reliability Loop — when quality has to be guaranteed.
Tag calls with operation_id, bind evaluators with a minimum score, and Atlas routes per-operation against your live eval history. Failures refund synchronously without anyone hitting thumbs-down. LoRA adapters train on your traffic and keep beating the base model. $1,500 / month, on top of the same routing.
See it on your own prompts.
Paste 20–100 representative prompts. We’ll run them against your current model and against Newmen smart routing in parallel and show you the cost, latency, and (if you bind evaluators) eval-score deltas. Shareable result page you can forward to your team.
Talk to a solutions engineer
Atlas is sold to teams who commit to meaningful production volume. That commitment unlocks the reliability loop.