NEWMEN

Leaderboard

Best. Smartest. Fastest.

Every text-generation model in the Newmen catalogue, ranked three ways and regenerated on each catalogue sync. Intelligence uses curated MMLU-Pro, code, math, and GPQA scores where published; for models without published scores it is estimated from family, generation, and popularity (clearly marked “est.”). Speed is best-of-provider p95 throughput measured on live traffic. Best Value is intelligence per dollar. Atlas-1 wins that ranking by definition, because it routes to whichever model is cheapest among those passing your eval gates.
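As a rough illustration of the three rankings, here is a minimal TypeScript sketch. Everything in it is an assumption made for the example: the ModelEntry shape, the field names, the sample numbers, and the cheapestPassing helper are hypothetical and are not the Newmen schema or the Atlas-1 routing code. It only shows that "intelligence per dollar" is a ratio sorted in descending order, and that a router picking the cheapest model that passes a gate will tend to sit at the top of such a ranking.

```typescript
// Hypothetical shape of one catalogue entry; field names are illustrative only.
interface ModelEntry {
  id: string;
  intelligence: number;    // composite score, higher is better
  estimated: boolean;      // true when the score is an "est." rather than curated
  throughputTps: number;   // best-of-provider throughput, tokens per second
  pricePerMTokUsd: number; // price per million tokens in USD, assumed > 0
}

type Metric = (m: ModelEntry) => number;

// Return a sorted copy, highest metric value first.
function rankBy(models: ModelEntry[], metric: Metric): ModelEntry[] {
  return [...models].sort((a, b) => metric(b) - metric(a));
}

const smartest: Metric = (m) => m.intelligence;
const fastest: Metric = (m) => m.throughputTps;
const bestValue: Metric = (m) => m.intelligence / m.pricePerMTokUsd;

// Router-style pick: the cheapest model among those passing a caller-supplied gate.
function cheapestPassing(
  models: ModelEntry[],
  passesGate: (m: ModelEntry) => boolean,
): ModelEntry | undefined {
  return models
    .filter(passesGate)
    .sort((a, b) => a.pricePerMTokUsd - b.pricePerMTokUsd)[0];
}

// Made-up sample data, not real scores or prices.
const sample: ModelEntry[] = [
  { id: "model-a", intelligence: 80, estimated: false, throughputTps: 60, pricePerMTokUsd: 10 },
  { id: "model-b", intelligence: 60, estimated: true, throughputTps: 200, pricePerMTokUsd: 1 },
  { id: "model-c", intelligence: 70, estimated: false, throughputTps: 120, pricePerMTokUsd: 2 },
];

const smartestIds = rankBy(sample, smartest).map((m) => m.id);
const fastestIds = rankBy(sample, fastest).map((m) => m.id);
const valueIds = rankBy(sample, bestValue).map((m) => m.id);
const routed = cheapestPassing(sample, (m) => m.intelligence >= 65);
```

With this sample data the gate (a score of at least 65) excludes model-b, so the router picks model-c, the cheaper of the two models that remain. A real gate would be whatever your evals define; the threshold here is only a stand-in.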

How this updates

Regenerated on every sync.

The catalogue, per-provider stats, and intelligence scores are written by pnpm models:sync. This page reads them at build time, so there is no extra API call, no cache to bust, and no third-party service in the path. To refresh, run the sync and redeploy.
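A build-time read of synced data could look roughly like the TypeScript sketch below. The file format is not described on this page, so the SyncedModel shape and the parseCatalogue and toSyncedModel helpers are hypothetical. The sketch assumes the sync step writes a JSON array and that the build passes the file's text in as a string.

```typescript
// Hypothetical record written by the sync step; the real layout may differ.
interface SyncedModel {
  id: string;
  intelligence: number;
  estimated: boolean;
}

// Validate one untyped record. Throwing here makes a bad sync fail the build
// instead of shipping a broken page.
function toSyncedModel(value: unknown): SyncedModel {
  if (typeof value !== "object" || value === null) {
    throw new Error("catalogue entry is not an object");
  }
  const v = value as Record<string, unknown>;
  if (
    typeof v.id !== "string" ||
    typeof v.intelligence !== "number" ||
    typeof v.estimated !== "boolean"
  ) {
    throw new Error(`malformed catalogue entry: ${JSON.stringify(value)}`);
  }
  return { id: v.id, intelligence: v.intelligence, estimated: v.estimated };
}

// Parse the raw text of the synced file. At build time `raw` would come from a
// local file read, so no network request is involved.
function parseCatalogue(raw: string): SyncedModel[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) {
    throw new Error("catalogue must be a JSON array");
  }
  return data.map(toSyncedModel);
}
```

Validating at build time fits the "sync, then redeploy" flow: if the synced data is malformed, the deploy stops before the page changes.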

See the breakdown

Click any model.

Per-model pages show every provider that serves the model, with live latency, throughput, uptime, price, and quantization from upstream telemetry and, where published, the underlying MMLU-Pro, HumanEval, math, and GPQA scores.

Browse all models →