Newmen · Smart Routing Case Study
Same apps.Half the bill.
We pointed a real coding agent at Newmen with nothing but an API key and asked it to build a genuine game — twice. Once on the client’s default model, once with Newmen smart routing. Same prompt, same client, same code. Only the bill changed.
lower cost across the runs — for materially the same result
It’s one line to switch
Keep your client, your prompts, and your model choices exactly as they are. Point the base URL at Newmen, drop in your key, and smart routing handles the rest.
export ANTHROPIC_BASE_URL="https://api.newmen.ai"
export ANTHROPIC_API_KEY="nm_live_..."That’s the entire integration. No SDK swaps, no prompt changes, no model config. Every call is logged in your console with full payloads for review.
Run 1 · Heavyweight coding-agent build
A premium request, routed smart
The agent built a full Line-Rider-style physics game (draw tracks, launch a sled, gravity + collisions + particles). On the client's top-tier default it billed at premium rates; with smart routing the identical task came in far cheaper — with a near-identical result.
The prompt
Build a single self-contained HTML file for a Line-Rider-style physics sandbox called Sled Sketch: draw tracks with the mouse, place a rider, press play, and watch a sled slide with gravity, collisions, momentum, bouncing and air. Include tool modes (draw / erase / move camera / place rider), play-pause-reset-clear, camera follow, zoom/minimap, live speed-distance-airtime-crash stats, particle effects, and best-distance saved locally — clean winter/arcade style, and genuinely playable. Then self-review the code and fix anything broken before finishing.
▶ Play it live
Polished: parallax mountains, layered glow on the track, crisp high-DPI canvas, zoom-to-fit, fine-grained collision. Best-looking of the four.
▶ Play it live
Feature-complete and clean: live speed bar, world grid, animated scarf, camera-follow toggle. A touch less visual depth than the default; plays just as well.
60% cheaper to build the same game. Trade-off seen here: the routed build ran ~12% slower in wall-clock — worth it for cost-sensitive or agentic workloads.
Run 2 · Standard request
A standard request, routed smart
The same game build at a lighter tier. Smart routing again cut the cost while delivering a complete, playable sandbox.
The prompt
Build a single self-contained HTML file for a Line-Rider-style physics sandbox called Sled Sketch: draw tracks with the mouse, place a rider, press play, and watch a sled slide with gravity, collisions, momentum, bouncing and air. Include tool modes (draw / erase / move camera / place rider), play-pause-reset-clear, camera follow, zoom/minimap, live speed-distance-airtime-crash stats, particle effects, and best-distance saved locally — clean winter/arcade style, and genuinely playable. Then self-review the code and fix anything broken before finishing.
▶ Play it live
Most configurable: track-color picker and live gravity / friction / bounce sliders. Strong UI; needs the user to draw the first track.
▶ Play it live
Leanest build: the core sandbox — draw, play, physics, minimap, stats. Most basic visuals of the set, but fully functional for a third of the cost.
38% cheaper for the same task. Throughput was lower than the default here — the savings come first; latency-critical paths can opt out per call.
Reading the numbers
- Cost is the headline. 38–60% off per run, ~57% blended — with no change to the app being built.
- Quality holds. Every routed build shipped a complete, genuinely playable game: drawing tools, gravity, collisions, particles, minimap, live stats, saved best distance. The default builds edge ahead on visual polish; the routed builds close most of the gap.
- Throughput is the trade-off. In these runs the routed builds were slower in wall-clock (a deliberate cost-first choice). For agentic, batch, and background work that’s a clear win; latency-critical calls can stay on the fast path.
- Effort is near-zero. Two env vars. The agent, prompts, and model names never changed.
See it in your console
None of these numbers are estimates — they’re pulled straight from your live traffic. Savings, spend, calls, tokens, and per-run breakdowns are all visible in the dashboard the moment a call lands.
Play them yourself
Both builds in each run are embedded above — open “Play it live” on any card, draw a downhill track with the mouse, place the rider, and hit Play. The cost difference is real; judge the quality difference for yourself.
Start cutting your AI bill today.
Two env vars, your existing client, and every call logged with what it cost versus direct. Get $5 in credits and watch the savings land in your console on the first call.