Newmen · Atlas-1
Atlas-1.
Faster. Cheaper. Smarter. Or the call’s on us.
Atlas is not a single static model. It is a routing layer that picks the best provider per call, runs your evaluators on the served output, refunds calls that score below threshold, and trains LoRA adapters on your production corrections — so every operation gets cheaper and more accurate every week. General models are general; yours shouldn’t be.
30+
Models available
1M
Context window
210 ms
p50 TTFT
cost + 10%
Third-party rate
99.9%
Uptime SLO
How Atlas works
Route, learn, improve.
Three roles in one model field. Atlas routes by default, passes through on demand, and trains adapters as your corrections accumulate.
Routing layer
Every call without an explicit model goes to Atlas. It picks the optimal provider based on operation type, latency history, cost, and accuracy on your past traffic. You get the right model without writing routing logic.
Direct passthrough
Specify `model: "openai/chatgpt-5.5"` and the call routes there directly, billed at cost + 10%. One API key, any provider — OpenAI, Anthropic, Google, and more.
Continuous training
As tagged corrections accumulate per operation, Atlas trains LoRA adapters specific to that task. The adapter is smaller, cheaper, and more accurate than the general model — and it improves every week.
Drop-in compatible
OpenAI-compatible endpoint. No SDK change required. `model: "atlas-1"` is the only diff from your existing integration. The loop activates on day one.
Model field
What goes in the model field.
One API surface. Three behaviors depending on what you pass.
| Value | Behavior |
|---|---|
| atlas / atlas-1 | Default routing layer — picks best model per call |
| openai/gpt-4o | Direct passthrough, cost + 10% |
| openai/gpt-4o-mini | Direct passthrough, cost + 10% |
| anthropic/claude-opus-4 | Direct passthrough, cost + 10% |
| anthropic/claude-sonnet-4 | Direct passthrough, cost + 10% |
| google/gemini-2.5-pro | Direct passthrough, cost + 10% |
| google/gemma-4-31b-it | Direct passthrough, cost + 10% |
| atlas/<adapter-id> | Trained adapter for your operation (roadmap) |
Full model list at /pricing.
The optimization
Quality, speed, cost. All three.
Routing to the right model per call breaks the classic tradeoff. Adapters that train on your traffic make the lead permanent.
Quality
Eval-gated routing and per-operation adapters trained on your corrections — accuracy improves every week.
Speed
Atlas routes to the optimal provider for latency on your operation type. Adapters are smaller, so they're faster too.
Cost
Third-party models at cost + 10%. As adapters mature they replace general models — smaller, cheaper, more accurate.
Specifications
By the numbers
| Default model field | atlas / atlas-1 |
|---|---|
| Third-party format | provider/model — e.g. openai/chatgpt-5.5 |
| Context window (Atlas) | 1,000,000 tokens |
| Knowledge cutoff | May 2026 |
| Modalities | Text in, text out · vision in roadmap |
| Latency SLO (p50) | 210 ms time-to-first-token |
| Latency SLO (p95) | 640 ms time-to-first-token |
| API surface | OpenAI-compatible · /api/v1/chat/completions |
The reliability loop
From production call to next version
Every customer call is a candidate observation. Every correction is a candidate training signal. The loop closes on your traffic — across any model you use.
See the full loop explained.
Pricing
Atlas rates. Third-party at cost.
Atlas inference runs at Newmen-controlled rates. Any other model routes at provider cost + 10%. The $1,500/mo reliability loop is a platform fee — same regardless of which models you use.
Atlas inference
Newmen rates
Routing layer picks best model. Improves over time as adapters train.
Third-party models
cost + 10%
Pass through to any provider. One key, transparent markup.
Reliability loop
$1,500 / mo
Recording, evaluators, ship gates, training workflow. Any model.
See full pricing including the complete model table.
Talk to a solutions engineer
Atlas is sold to teams who commit to meaningful production volume. That commitment unlocks the reliability loop.