NEWMEN
NEWMEN · TWO ENV VARS · $5 FREE

Atlas.

Atlas

Faster. Cheaper. Smarter. Or the call’s on us.

One API for every model — and never priced above going direct. Usually less, because Atlas routes to a cheaper variant that holds quality. Don’t like a call? Money back. Change two env vars and ship.

Atlas speaks both OpenAI Chat Completions and Anthropic Messages on the wire. Codex CLI, Claude Code, Cursor, OpenAI / Anthropic SDKs, langchain, llamaindex — everything you already use just works.

export ANTHROPIC_BASE_URL="https://api.newmen.ai"
export ANTHROPIC_API_KEY="nm_live_..."
Sign in to insert your real key automatically.

Worked example · 30M tokens/mo · openai/gpt-4o pinned

$210/mo direct$84/mo on Newmen

60% off, no markup, same model id. Atlas routed to a quality-equivalent cheaper variant; you pay the savings, not the premium. Don’t like a call? Thumbs-down it → instant credit-back. See the full math →

Not theory — we ran a real coding agent through Newmen and built the same app twice. Read the case study: same apps, half the bill →

Platform
two env varsMigration
0% pay-as-you-goMarkup
thumbs-down, money backRefund
no card requiredFree models
OpenAI + AnthropicWire format
operation_id + LoRAsPro
01 — Every model, one key

Top by intelligence.

Pin any provider id you already use — openai/gpt-4o, anthropic/claude-sonnet-4, meta-llama/llama-3.1-70b-instruct — at-or-below the provider-direct rate, and Newmen routes each call to the cheapest variant that holds quality.

Every other broker says “trust us, the cheaper one is fine.” We promise the opposite: never more than direct, money back if you don’t like it. Two env vars to switch.

02 — How Atlas routes

Cheaper without quality loss.

Atlas inspects the model you pinned, the provider variants available for it, and the live latency / throughput / quality telemetry across every provider. It picks the cheapest variant that holds quality — and never charges you more than the original provider would have direct.

  • Pay as you go — zero markup, match-or-better rate. Margin comes from routing to a cheaper variant; you keep the savings. Free models cost nothing and need no card.
  • Pro — per-operation tuning, eval-gated auto-refund, LoRA adapters. 10% markup itemised separately.
  • Speed is a side effect — Atlas avoids over-served capacity and skips queues at peak hours, so p95 latency typically drops 10–30%.

Don’t like a routed call? Thumbs-down it and we credit it back automatically, no questions asked.

Qualityeval-gated accuracyCostadapters get cheaperSpeedoptimal routing latencyAtlasrouting

03 — Switch in seconds

Two env vars. Any tool.

Codex CLI, Claude Code, Cursor, the OpenAI / Anthropic SDKs in any language, langchain, llamaindex, OpenRouter clients — they all read the same env vars. Set them and ship.

export ANTHROPIC_BASE_URL="https://api.newmen.ai"
export ANTHROPIC_API_KEY="nm_live_..."
Sign in to insert your real key automatically.

Why Newmen

Never pay more than direct.

One key. Any model. We charge you the cheapest path that holds quality, never more than the original provider would direct, and credit you back automatically when a call disappoints.

Match-or-better rate, every call

Pay no more than the original provider would charge direct — usually less, because Atlas routes to a quality-equivalent cheaper variant when it can. The savings show up line-by-line in /console/usage.

Thumbs-down → instant credit

Don't like a call? Hit thumbs-down and we credit it back automatically, no questions asked. Your $5 of signup credits are real money on a real card — and free models cost nothing at all.

Two env vars to switch

Set OPENAI_BASE_URL or ANTHROPIC_BASE_URL plus a Newmen key. Codex CLI, Claude Code, Cursor, OpenAI / Anthropic SDKs, langchain, llamaindex — everything routes through Atlas. No SDK rewrite. Existing model ids keep working.

Pro tier · optional

Reliability Loop — when quality has to be guaranteed.

Tag calls with operation_id, bind evaluators with a minimum score, and Atlas routes per-operation against your live eval history. Failures refund synchronously without anyone hitting thumbs-down. LoRA adapters train on your traffic and keep beating the base model. $1,500 / month, on top of the same routing.

See it on your own prompts.

Paste 20–100 representative prompts. We’ll run them against your current model and against Newmen smart routing in parallel and show you the cost, latency, and (if you bind evaluators) eval-score deltas. Shareable result page you can forward to your team.

Talk to a solutions engineer

Atlas is sold to teams who commit to meaningful production volume. That commitment unlocks the reliability loop.