Get started

Quickstart

Two environment variables and you're live. Newmen speaks both OpenAI Chat Completions and Anthropic Messages: keep your existing client, swap the base URL, and pay at least 5% less than going direct, starting with your first call.

Get a key + $5 credits

Sign in, create an organization, and add a card. You get $5 in credits to start and a key from /console/api-keys. Keys are prefixed nm_live_ and each one is shown only once, so store it in your secret manager right away. Auto top-up keeps your balance above zero so calls don't fail mid-flight for lack of credit.

Set two env vars

Point your existing client at Newmen. No SDK change, no code rewrite.

~/.bashrc

export ANTHROPIC_BASE_URL="https://api.newmen.ai"
export ANTHROPIC_API_KEY="nm_live_..."

Claude Code calls https://api.newmen.ai/v1/messages

Note. This is the whole migration. The same two variables wire up Codex CLI, Claude Code, Cursor, langchain, llamaindex, and any OpenRouter client. See the full migration guide for per-tool snippets.
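Because the migration is only two variables, a typo in either one is the most likely thing to go wrong. Here is a minimal sketch of a pre-flight check, assuming the ANTHROPIC_* names from the snippet above. The helper `check_newmen_env` is hypothetical and not part of any Newmen SDK; it only inspects values locally and makes no network call.

```python
import os

NEWMEN_BASE_URL = "https://api.newmen.ai"  # base URL shown in the snippet above
KEY_PREFIX = "nm_live_"                    # key prefix stated in this guide


def check_newmen_env(env=None):
    """Return a list of problems; an empty list means both variables look right.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    problems = []

    base_url = env.get("ANTHROPIC_BASE_URL", "")
    if base_url.rstrip("/") != NEWMEN_BASE_URL:
        problems.append(f"ANTHROPIC_BASE_URL should be {NEWMEN_BASE_URL}")

    api_key = env.get("ANTHROPIC_API_KEY", "")
    if not api_key.startswith(KEY_PREFIX):
        # Never echo the key itself; only report that the prefix is wrong.
        problems.append(f"ANTHROPIC_API_KEY should start with {KEY_PREFIX}")

    return problems


if __name__ == "__main__":
    for problem in check_newmen_env():
        print("warning:", problem)
```

Running the file after sourcing ~/.bashrc prints nothing when both variables are set as shown.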

Make your first call

Pin any provider model id you already use (openai/gpt-5.5, anthropic/claude-opus-4.7, meta-llama/llama-4-maverick), or use atlas-1 for Atlas mode (auto-optimize). Either way you pay at least 5% less than going direct, starting with your first call. To pin the exact model with no substitutions, use Strict mode.

bash · curl

curl https://api.newmen.ai/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.5",
    "messages": [{ "role": "user", "content": "Say hello in one sentence." }]
  }'

python · openai SDK

# Your existing OpenAI code, unchanged.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_BASE_URL + OPENAI_API_KEY

res = client.chat.completions.create(
    model="openai/gpt-5.5",          # or atlas-1 for Atlas mode (auto-optimize)
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(res.choices[0].message.content)
print("call_id:", res.id)            # keep this to thumbs-down later

python · anthropic SDK

# Your existing Anthropic code, unchanged.
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_BASE_URL + ANTHROPIC_API_KEY

msg = client.messages.create(
    model="anthropic/claude-opus-4.7",
    max_tokens=512,
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(msg.content[0].text)

A successful response is OpenAI- (or Anthropic-) shaped with a call id (chatcmpl-… / msg_…), plus a Newmen delivery block telling you what was served and what it cost versus direct. The call shows up in /console/calls within about a second, and your savings roll up in /console/usage.
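If one code path handles both response shapes, the call id prefix is enough to tell them apart. The sketch below assumes the parsed JSON body is a plain dict and relies only on the standard OpenAI and Anthropic response fields. It does not read the Newmen delivery block, since its field names are not given here. `summarize_response` is a hypothetical helper name.

```python
def summarize_response(body: dict) -> dict:
    """Pull the call id and reply text out of either response shape.

    Hypothetical helper: dispatches on the id prefixes named above
    (chatcmpl-... for OpenAI-shaped, msg_... for Anthropic-shaped).
    """
    call_id = body.get("id", "")

    if call_id.startswith("chatcmpl-"):
        shape = "openai"
        text = body["choices"][0]["message"]["content"]
    elif call_id.startswith("msg_"):
        shape = "anthropic"
        # Anthropic content is a list of blocks; take the first text block.
        text = next(
            (block["text"] for block in body["content"] if block.get("type") == "text"),
            "",
        )
    else:
        raise ValueError(f"unrecognized call id: {call_id!r}")

    return {"call_id": call_id, "shape": shape, "text": text}
```

Keeping the returned call_id around is what makes a later thumbs-down possible.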

Don’t like a call?

Send a thumbs-down against the call id and we credit it back automatically, no questions asked (fair-use limits live in the terms). Nothing else to configure.

bash · curl

curl https://api.newmen.ai/v1/feedback \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "call_id": "chatcmpl-abc123",
    "rating": "thumbs_down"
  }'
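The same request can be sent from Python without any SDK. This is a minimal sketch using only the standard library; the URL, headers, and body fields are copied from the curl call above, and `build_thumbs_down` is a hypothetical helper name. The shape of the feedback response is not described here, so the sketch stops at building the request.

```python
import json
import urllib.request

FEEDBACK_URL = "https://api.newmen.ai/v1/feedback"  # endpoint from the curl example


def build_thumbs_down(call_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a thumbs-down request for one call id."""
    body = json.dumps({"call_id": call_id, "rating": "thumbs_down"}).encode("utf-8")
    return urllib.request.Request(
        FEEDBACK_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# To actually send it (performs a network call):
#   import os
#   req = build_thumbs_down("chatcmpl-abc123", os.environ["OPENAI_API_KEY"])
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status)
```

Separating request construction from sending keeps the helper easy to unit-test and lets you log or queue the feedback before it goes out.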

Pro tier · optional

Pro quickstart

Everything above is pay as you go: at least 5% off from your first call, free models with no card, $5 credits, auto top-up, thumbs-down refunds. The Reliability Loop ($1,500/mo) is an opt-in that adds per-operation tuning and eval-gated auto-refund on top of Atlas mode. The only code change is one field.

Tag calls with metadata.operation_id — a stable, slug-like string naming one prompt-template-plus-task in your app (summarize_ticket, classify_intent). Calls with the same operation id roll up together and become eligible for per-operation datasets, evaluators, ship gates, and opt-in tuning.

bash · curl

curl https://api.newmen.ai/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "atlas-1",
    "messages": [{ "role": "user", "content": "Summarize this ticket." }],
    "metadata": { "operation_id": "summarize_ticket" }
  }'

Passing a new operation_id auto-registers it. To add evaluators and ship gates — which unlock the eval-gated auto-refund (a call scoring below your min_score is refunded synchronously, without anyone hitting thumbs-down) — define the operation in the console or via the API. See Operations.
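Since any new operation_id is registered automatically, a misspelled id would quietly create a second operation. Here is a minimal sketch of a guard that tags a request payload and rejects ids that are not slug-like. The exact slug rule (lowercase letters, digits, underscores) is an assumption inferred from the examples summarize_ticket and classify_intent, not a documented constraint, and `with_operation` is a hypothetical helper.

```python
import re

# Assumed slug shape, inferred from the example ids; the real rule may differ.
_SLUG = re.compile(r"[a-z][a-z0-9_]*")


def with_operation(payload: dict, operation_id: str) -> dict:
    """Return a copy of payload with metadata.operation_id set."""
    if not _SLUG.fullmatch(operation_id):
        raise ValueError(f"operation_id is not slug-like: {operation_id!r}")

    tagged = dict(payload)  # shallow copy so the caller's dict is untouched
    tagged["metadata"] = {**payload.get("metadata", {}), "operation_id": operation_id}
    return tagged


request_body = with_operation(
    {
        "model": "atlas-1",
        "messages": [{"role": "user", "content": "Summarize this ticket."}],
    },
    "summarize_ticket",
)
```

Defining operation ids as constants in one module, rather than as inline string literals, is another way to keep the same id stable across call sites.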

Where to next

Migration guide — per-tool env-var snippets (Codex, Claude Code, Cursor, langchain, …).
Feedback — thumbs-down refunds, ratings, and corrections.
Operations (Pro) — defining schemas, ship gates, and requirement changes.
Evaluators (Pro) — automated quality checks that gate dataset promotion and drive the eval-gated refund.
Datasets (Pro) — building reviewed datasets from tagged production calls.