Newmen · Smart Routing Case Study
Same app.~42% cheaper.
We pointed OpenCode at Newmen with nothing but an API key and asked it to build a Web Audio synth studio — twice. Once on its default model, once with Newmen smart routing. Both paths shipped a complete, working app; the routed run came in ~42% cheaper on the headline build call. Every figure below is pulled straight from the gateway’s usage records — trade-offs and all.
lower cost on the build call — for a working app either way
It’s one line to switch
Keep your client, your prompts, and your model choices exactly as they are. Point the base URL at Newmen, drop in your key, and smart routing handles the rest.
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"newmen": {
"npm": "@ai-sdk/openai-compatible",
"name": "Newmen",
"options": {
"baseURL": "https://api.newmen.ai/v1",
"apiKey": "nm_live_..."
}
}
},
"model": "newmen/gpt-5.1"
}That’s the entire integration. No SDK swaps, no prompt changes, no model config. Every call is logged in your console with full payloads for review.
The build · OpenCode head-to-head
One brief, built two ways
OpenCode built the same Web Audio Synth Station twice — once on its default model, once with Newmen smart routing. Both produced a complete, genuinely working studio. The routed build cost ~42% less on the headline build call; the default build was faster and shipped a more elaborate app.
The prompt
Build a single self-contained HTML “Synth Station” — a Web Audio browser music studio: a 16-step sequencer (kick, snare, hi-hat + a melodic synth), selectable waveforms and a filter, BPM / master-volume / per-track mute, play / stop / clear / randomize, localStorage save/load, keyboard play, an animated canvas visualizer, and a clean dark-neon UI. Self-review the code and fix anything broken before finishing.
▶ Play it live
The default model's build: the richest of the pair — a 63 KB studio with a reactive spectrum canvas, layered panels, and the most polished UI. Fastest run too (~163 tok/s).
▶ Play it live
Smart routing's build: a complete, working 28 KB studio — sequencer, drums, synth, waveforms, filter, save/load, keyboard, visualizer all present. Leaner and slower to generate than the default, for ~42% less.
~42% cheaper on the build call. The trade-off is real and we won't hide it: the routed build ran ~2× longer in wall-clock, at ~2.6× lower throughput (63 vs 163 tok/s), and shipped a leaner 28 KB app vs the default's 63 KB. Both work — pick cost or richness per workload.
Reading the numbers
- Cost is the headline. The routed build billed $0.17 against the default model's $0.30 for the same brief — ~42% cheaper, with every call logged.
- Both paths completed cleanly. Routed and default each shipped a working Synth Station: 16-step sequencer, kick/snare/hi-hat plus a melodic synth, selectable waveforms and filter, transport, per-track mute, save/load, keyboard play, and a canvas visualizer.
- Throughput and richness are the trade-off. This is not near-parity. The routed run took ~2× longer in wall-clock, ran ~2.6× slower in throughput, and produced a leaner 28 KB app versus the default's more elaborate 63 KB build. A deliberate cost-first result — latency- or polish-critical work can opt out per call.
- Effort is near-zero. Two env vars. No proxy, no SDK swap, no prompt or model changes.
See it in your console
None of these numbers are estimates — they’re pulled straight from your live traffic. Savings, spend, calls, tokens, and per-run breakdowns are all visible in the dashboard the moment a call lands.
Play them yourself
Both builds in each run are embedded above — open “Play it live” on any card, draw a downhill track with the mouse, place the rider, and hit Play. The cost difference is real; judge the quality difference for yourself.
Start cutting your AI bill today.
Two env vars, your existing client, and every call logged with what it cost versus direct. Get $5 in credits and watch the savings land in your console on the first call.