Newmen · Smart Routing Case Study

Same app.~42% cheaper.

We pointed OpenCode at Newmen with nothing but an API key and asked it to build a Web Audio synth studio — twice. Once on its default model, once with Newmen smart routing. Both paths shipped a complete, working app; the routed run came in ~42% cheaper on the headline build call. Every figure below is pulled straight from the gateway’s usage records — trade-offs and all.

lower cost on the build call — for a working app either way

$0.00

Default

$0.00

Smart routing

$0.00

Saved

2 paths, both working·43,801 routed tokens·28,479 bytes shipped

It’s one line to switch

Keep your client, your prompts, and your model choices exactly as they are. Point the base URL at Newmen, drop in your key, and smart routing handles the rest.

opencode.json

{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "newmen": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Newmen",
      "options": {
        "baseURL": "https://api.newmen.ai/v1",
        "apiKey": "nm_live_..."
      }
    }
  },
  "model": "newmen/gpt-5.1"
}

That’s the entire integration. No SDK swaps, no prompt changes, no model config. Every call is logged in your console with full payloads for review.

The build · OpenCode head-to-head

One brief, built two ways

OpenCode built the same Web Audio Synth Station twice — once on its default model, once with Newmen smart routing. Both produced a complete, genuinely working studio. The routed build cost ~42% less on the headline build call; the default build was faster and shipped a more elaborate app.

The prompt

Build a single self-contained HTML “Synth Station” — a Web Audio browser music studio: a 16-step sequencer (kick, snare, hi-hat + a melodic synth), selectable waveforms and a filter, BPM / master-volume / per-track mute, play / stop / clear / randomize, localStorage save/load, keyboard play, an animated canvas visualizer, and a clean dark-neon UI. Self-review the code and fix anything broken before finishing.

DefaultSmart routing

Cost$0.30$0.17

Calls11

Tokens47,72743,801

Throughput (tok/s)16363

Build time169s335s

Cost vs default42% cheaper

Default build$0.30

▶ Play it live

The default model's build: the richest of the pair — a 63 KB studio with a reactive spectrum canvas, layered panels, and the most polished UI. Fastest run too (~163 tok/s).

Smart-routing build$0.17

▶ Play it live

Smart routing's build: a complete, working 28 KB studio — sequencer, drums, synth, waveforms, filter, save/load, keyboard, visualizer all present. Leaner and slower to generate than the default, for ~42% less.

~42% cheaper on the build call. The trade-off is real and we won't hide it: the routed build ran ~2× longer in wall-clock, at ~2.6× lower throughput (63 vs 163 tok/s), and shipped a leaner 28 KB app vs the default's 63 KB. Both work — pick cost or richness per workload.

Reading the numbers

~42%

Lower cost on the headline build call

$0.17

Routed vs $0.30 on the default model

~2.6×

Lower throughput — the trade-off

28 KB

Routed app size (default: 63 KB)

Cost is the headline. The routed build billed $0.17 against the default model's $0.30 for the same brief — ~42% cheaper, with every call logged.
Both paths completed cleanly. Routed and default each shipped a working Synth Station: 16-step sequencer, kick/snare/hi-hat plus a melodic synth, selectable waveforms and filter, transport, per-track mute, save/load, keyboard play, and a canvas visualizer.
Throughput and richness are the trade-off. This is not near-parity. The routed run took ~2× longer in wall-clock, ran ~2.6× slower in throughput, and produced a leaner 28 KB app versus the default's more elaborate 63 KB build. A deliberate cost-first result — latency- or polish-critical work can opt out per call.
Effort is near-zero. Two env vars. No proxy, no SDK swap, no prompt or model changes.

See it in your console

None of these numbers are estimates — they’re pulled straight from your live traffic. Savings, spend, calls, tokens, and per-run breakdowns are all visible in the dashboard the moment a call lands.

**Usage.** Period savings, spend vs. direct, and token totals at a glance.

**Overview.** “Saved by routing” headline plus recent calls with per-call cost and savings.

**Calls.** Every call shows the struck direct price → what you paid, and the amount saved.

**Sessions.** Calls grouped by run, so you see what a whole task cost — and saved.

Play them yourself

Both builds in each run are embedded above — open “Play it live” on any card, draw a downhill track with the mouse, place the rider, and hit Play. The cost difference is real; judge the quality difference for yourself.

Start cutting your AI bill today.

Two env vars, your existing client, and every call logged with what it cost versus direct. Get $5 in credits and watch the savings land in your console on the first call.

Create a free account →Read the docs