Leaderboard

Best. Smartest. Fastest.

Every text-generation model in the Newmen catalogue, ranked three ways and regenerated on each catalogue sync. Intelligence uses curated MMLU-Pro / code / math / GPQA scores where published; where none are published, the score is estimated from family, generation, and popularity and clearly marked “est.”. Speed is the best-of-provider p95 throughput from live traffic. Best Value is intelligence per dollar, and Atlas-1 wins by definition because it routes to whichever model is cheapest among those passing your eval gates.
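As a rough illustration of the Best Value ranking, here is a minimal sketch of sorting by intelligence per dollar. The `ModelEntry` shape, its field names, and the plain ratio are assumptions made for this example; the page does not publish its exact formula or scaling, so the numbers this produces will not match the values shown on the leaderboard.

```typescript
// Hypothetical record shape; these field names are assumptions,
// not the actual Newmen catalogue schema.
interface ModelEntry {
  name: string;
  intelligence: number;    // curated benchmark score, or an estimate
  pricePerMTokUsd: number; // price in USD per million tokens
  estimated: boolean;      // true when the score is marked "est."
}

// Intelligence per dollar as a plain ratio (assumed; the real scaling is unknown).
function valueScore(m: ModelEntry): number {
  if (m.pricePerMTokUsd <= 0) {
    throw new RangeError(`price must be positive for ${m.name}`);
  }
  return m.intelligence / m.pricePerMTokUsd;
}

// Return a new array sorted from best value to worst, leaving the input untouched.
function rankByValue(models: ModelEntry[]): ModelEntry[] {
  return [...models].sort((a, b) => valueScore(b) - valueScore(a));
}
```

Calling `rankByValue` on a list of entries returns them with the highest ratio first; a cheap model with a modest score can outrank an expensive model with a high one.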

Best Value

Atlas leads the field.

Highest intelligence-per-dollar. Atlas-1 takes the top slot because it routes across every model below it per call, picking the cheapest variant that has stayed green on your operation's evaluators.
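The routing rule described above can be sketched as "filter to the models that pass, then take the cheapest". This is an illustrative sketch only: the `Candidate` type, the `passesEvalGates` flag, and `pickCheapestPassing` are hypothetical names, not Atlas-1's actual interface, and real routing may weigh other factors.

```typescript
// Hypothetical candidate shape; names are assumptions for illustration.
interface Candidate {
  name: string;
  pricePerMTokUsd: number;  // price in USD per million tokens
  passesEvalGates: boolean; // true if the model has stayed green on the evaluators
}

// Pick the cheapest candidate that passes the eval gates.
// Returns undefined when no candidate passes.
function pickCheapestPassing(candidates: Candidate[]): Candidate | undefined {
  let best: Candidate | undefined;
  for (const c of candidates) {
    if (!c.passesEvalGates) continue; // failing models are never eligible
    if (best === undefined || c.pricePerMTokUsd < best.pricePerMTokUsd) {
      best = c;
    }
  }
  return best;
}
```

Because the choice is always the cheapest model that passes, the router's cost for a call is never higher than that of any single passing model, which is the sense in which it tops a value ranking.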

Atlas-1

routes across the catalogue per call

Value: 66250

inclusionAI: Ling-2.6-flash

$0.01/M · Inclusion AI · est.

Value: 53000

IBM: Granite 4.0 Micro

$0.02/M · IBM Granite · est.

Value: 29412

How this updates

Regenerated on every sync.

The catalogue, per-provider stats, and intelligence scores are written by pnpm models:sync. This page reads them at build time, so there is no extra API call, no cache to bust, and no third-party service in the path. To refresh, run the sync and redeploy.

See the breakdown

Click any model.

Per-model pages show every provider that serves the model, with live latency / throughput / uptime / price / quant from upstream telemetry and (when published) the underlying MMLU-Pro / HumanEval / math / GPQA scores.

Browse all models →