SDKs

Python SDK

newmen-ai · sync and async · Python 3.10+ · httpx · Pydantic v2.

The Python SDK is hand-written, with Pydantic v2 models generated from the OpenAPI spec so the client stays locked to the wire format. It is published on PyPI as newmen-ai.

Do you even need the SDK?

Note. Probably not to start. Newmen is drop-in compatible with the OpenAI and Anthropic SDKs: most people just set OPENAI_BASE_URL (or ANTHROPIC_BASE_URL) and keep their existing client. See the migration guide. Reach for newmen-ai only when you want first-class helpers for the Pro reliability loop (operations, feedback, datasets, and evaluators) in one typed client.
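
As a rough sketch of the drop-in path, the snippet below builds the constructor arguments for the stock OpenAI client and points it at Newmen. It assumes the official `openai` package is installed and that the Newmen key is accepted where an OpenAI key would normally go; the helper names `openai_client_kwargs` and `make_openai_client` are hypothetical. The base URL is the one shown under Initialize.

```python
import os

# Base URL as shown in the Initialize section of this page.
NEWMEN_BASE_URL = "https://api.newmen.ai/v1"


def openai_client_kwargs(env=None):
    """Build constructor kwargs for an OpenAI-compatible client (hypothetical helper)."""
    env = os.environ if env is None else env
    return {
        # Assumption: the Newmen key is passed where an OpenAI key would go.
        "api_key": env["NEWMEN_API_KEY"],
        # An explicit OPENAI_BASE_URL wins; otherwise fall back to Newmen.
        "base_url": env.get("OPENAI_BASE_URL", NEWMEN_BASE_URL),
    }


def make_openai_client(env=None):
    """Return the stock OpenAI client pointed at Newmen (needs `pip install openai`)."""
    from openai import OpenAI  # imported lazily so the helper above has no dependency

    return OpenAI(**openai_client_kwargs(env))
```

Existing application code that already uses the OpenAI client can then stay as it is; only the place where the client is constructed changes.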

Install

bash
# pip
pip install newmen-ai

# uv (recommended)
uv pip install newmen-ai

# poetry
poetry add newmen-ai

Initialize

python
import os

from newmen import Newmen

client = Newmen(api_key=os.environ["NEWMEN_API_KEY"])
# Or override the base URL:
# client = Newmen(api_key=..., base_url="https://api.newmen.ai/v1")

operation_id (Pro)

Note. Optional. metadata["operation_id"] is what unlocks the opt-in Pro Reliability Loop: per-operation tuning, eval-gated auto-refund, datasets, and evaluators. Calls without one are recorded and billed normally and still get pay-as-you-go pricing and thumbs-down refunds; they just aren’t grouped into an operation.

Passing a new operation_id in metadata auto-registers a lightweight entry. To attach evaluators, ship gates, or an output schema, formally define the operation via the console or API. See Operations.

Sync and async

Two clients ship in the same package — pick the one that matches your service’s event loop.

python
import asyncio
import os

from newmen import Newmen, AsyncNewmen

key = os.environ["NEWMEN_API_KEY"]

# Sync
client = Newmen(api_key=key)
res = client.chat.completions.create(
    model="atlas-1",
    messages=[{"role": "user", "content": "Summarise this ticket."}],
    metadata={"operation_id": "summarize_ticket"},
)
print(res.choices[0].message.content)
print("call_id:", res.id)  # save this to attach feedback

# Async
async def main():
    aclient = AsyncNewmen(api_key=key)
    res = await aclient.chat.completions.create(
        model="atlas-1",
        messages=[{"role": "user", "content": "Summarise this ticket."}],
        metadata={"operation_id": "summarize_ticket"},
    )
    print(res.choices[0].message.content)

asyncio.run(main())
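
Where the async client is the better fit, a common pattern is to fan out several calls concurrently while capping the number in flight. The sketch below is illustrative only: `FakeAsyncClient` is a stand-in so the example runs without the SDK or an API key, and `summarize_all` and `max_concurrency` are hypothetical names. The real async client would be passed in place of the fake.

```python
import asyncio
from types import SimpleNamespace


class FakeCompletions:
    """Stand-in for `aclient.chat.completions`; echoes the prompt back."""

    async def create(self, *, model, messages, metadata=None):
        await asyncio.sleep(0)  # yield to the event loop like a network call would
        text = "summary of: " + messages[-1]["content"]
        message = SimpleNamespace(content=text)
        return SimpleNamespace(id="call_fake", choices=[SimpleNamespace(message=message)])


class FakeAsyncClient:
    """Mimics only the attribute path used below: client.chat.completions.create."""

    def __init__(self):
        self.chat = SimpleNamespace(completions=FakeCompletions())


async def summarize_all(aclient, tickets, max_concurrency=4):
    """Summarise tickets concurrently, with at most `max_concurrency` in flight."""
    sem = asyncio.Semaphore(max_concurrency)

    async def one(ticket):
        async with sem:
            res = await aclient.chat.completions.create(
                model="atlas-1",
                messages=[{"role": "user", "content": ticket}],
                metadata={"operation_id": "summarize_ticket"},
            )
            return res.choices[0].message.content

    # gather preserves input order, so results line up with `tickets`.
    return await asyncio.gather(*(one(t) for t in tickets))


results = asyncio.run(summarize_all(FakeAsyncClient(), ["ticket A", "ticket B"]))
print(results)
```

The semaphore keeps a burst of tickets from turning into a burst of simultaneous requests, which matters once rate limits (429 responses) come into play.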

Streaming

python
msgs = [{"role": "user", "content": "Summarise this ticket."}]

# Sync streaming
for chunk in client.chat.completions.stream(
    model="atlas-1",
    messages=msgs,
    metadata={"operation_id": "summarize_ticket"},
):
    print(chunk.choices[0].delta.content or "", end="", flush=True)

# Async streaming: `async for` is only valid inside a coroutine.
async def stream_async(aclient):
    async for chunk in aclient.chat.completions.stream(
        model="atlas-1",
        messages=msgs,
        metadata={"operation_id": "summarize_ticket"},
    ):
        print(chunk.choices[0].delta.content or "", end="", flush=True)

# Run with: asyncio.run(stream_async(aclient))
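
Streaming loops often need the full text afterwards, for logging or for attaching feedback. The helper below is a minimal sketch that accumulates the deltas into one string. It assumes only the chunk shape used above (`chunk.choices[0].delta.content`, which may be empty); `collect_text` and `fake_chunk` are hypothetical names, and the fake chunks exist so the example runs without the SDK.

```python
from types import SimpleNamespace


def collect_text(chunks):
    """Join the text deltas from an iterable of streaming chunks."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # defensively skip chunks that carry no choices
        piece = chunk.choices[0].delta.content
        if piece:  # content can be None or "" on some chunks
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Build an object with the same attribute path as a streaming chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Billing "), fake_chunk(None), fake_chunk("error.")]
full_text = collect_text(demo)
print(full_text)  # Billing error.
```

With the sync client, the same helper could be handed the stream object directly in place of `demo`.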

Schema version

When you change the output_schema on an operation, pass schema_version alongside operation_id so traffic is bucketed by the schema it was generated against.

python
# When you change output_schema on an operation, bump schema_version
# so production traffic is bucketed by the schema it was generated against.
client.chat.completions.create(
    model="atlas-1",
    messages=msgs,
    metadata={
        "operation_id": "summarize_ticket",
        "schema_version": "2",          # bumped after adding the "confidence" field
    },
)

For requirement changes — when the quality bar shifts but the output structure stays the same — do not bump schema_version. Update the evaluator rubric and re-run evaluation instead. See Requirement changes.
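
One way to keep this rule honest in a deploy script is to bump the version only when the schema itself changes. The sketch below compares a canonical hash of the old and new `output_schema`. It assumes versions are integer strings like the "2" in the example above; `schema_fingerprint` and `next_schema_version` are hypothetical helpers, not part of the SDK.

```python
import hashlib
import json


def schema_fingerprint(output_schema):
    """Hash a JSON schema in a key-order-independent way."""
    canonical = json.dumps(output_schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def next_schema_version(current_version, old_schema, new_schema):
    """Return the version to send: unchanged unless the schema structure changed."""
    if schema_fingerprint(old_schema) == schema_fingerprint(new_schema):
        return current_version  # requirement-only change: keep the version
    return str(int(current_version) + 1)  # assumption: integer-string versions


old = {"type": "object", "properties": {"summary": {"type": "string"}}}
new = {
    "type": "object",
    "properties": {"summary": {"type": "string"}, "confidence": {"type": "number"}},
}

print(next_schema_version("1", old, new))  # 2
print(next_schema_version("2", new, new))  # 2
```

A rubric change never touches `output_schema`, so it leaves the fingerprint, and therefore the version, alone.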

Resources

python · feedback
client.feedback.create(
    call_id=res.id,
    rating="thumbs_up",
    correction="Better output here.",
    tags=["accuracy:good"],
)

python · operations
# Formally define an operation before configuring evaluators or ship gates.
op = client.operations.create(
    key="summarize_ticket",
    name="Support ticket summary",
    description="One-sentence summary fed into routing. Must not contain PII.",
    output_schema={
        "type": "object",
        "properties": {
            "summary":  {"type": "string", "maxLength": 200},
            "priority": {"type": "string", "enum": ["low", "medium", "high"]},
        },
        "required": ["summary", "priority"],
    },
    ship_gates=[
        {"evaluator_id": "ev_regex_pii",     "min_score": 1.0},
        {"evaluator_id": "ev_judge_quality", "min_score": 0.88},
    ],
)

python · datasets
# Create a dataset scoped to the operation.
ds = client.datasets.create(
    operation_id="summarize_ticket",
    name="Tickets v3 — review candidate",
)

# Bulk-add items from tagged production calls.
client.datasets.items.add(ds.id, [
    {
        "source_call_id": "chatcmpl-abc123",
        "input": {"messages": [{"role": "user", "content": "..."}]},
        "expected_output": {"summary": "Billing error on invoice #4421.", "priority": "high"},
    },
])

# Promote: raises ShipGatesUnmetError if any gate fails.
client.datasets.promote(ds.id)
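
To make the gate semantics concrete, here is a local sketch of the check that promotion implies. It assumes a gate passes when the evaluator's score is at least `min_score`, and that a missing score counts as a failure. The `ship_gates` shape comes from the operations example above, while the scores are made up for illustration and `failed_gates` is a hypothetical helper. The real evaluation happens server-side.

```python
def failed_gates(ship_gates, scores):
    """Return the evaluator ids whose score is missing or below min_score."""
    failed = []
    for gate in ship_gates:
        score = scores.get(gate["evaluator_id"])
        if score is None or score < gate["min_score"]:
            failed.append(gate["evaluator_id"])
    return failed


ship_gates = [
    {"evaluator_id": "ev_regex_pii", "min_score": 1.0},
    {"evaluator_id": "ev_judge_quality", "min_score": 0.88},
]

# Made-up evaluator scores, for illustration only.
scores = {"ev_regex_pii": 1.0, "ev_judge_quality": 0.85}

print(failed_gates(ship_gates, scores))  # ['ev_judge_quality']
```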

Errors

python
from newmen import (
    Newmen,
    NewmenAPIError,
    AuthenticationError,
    ShipGatesUnmetError,
)

try:
    client.datasets.promote(ds.id)  # ds is the dataset created above
except ShipGatesUnmetError as e:
    print("gates failed:", e.failed_gates)
except AuthenticationError:
    raise  # re-issue key
except NewmenAPIError as e:
    print("api error:", e.status_code, e.body)
    raise

The error hierarchy mirrors the TypeScript SDK, so cross-language applications can share an error-handling vocabulary. The SDK retries 429 and 5xx responses twice by default; opt out via max_retries=0.
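
For readers who want to see what such a retry policy amounts to, the sketch below retries 429 and 5xx failures up to `max_retries` times and re-raises everything else immediately. It does not show the SDK's internals: `FakeAPIError` stands in for an API error carrying `status_code`, and the exponential backoff and the `base_delay` and `sleep` parameters are assumptions.

```python
import time


class FakeAPIError(Exception):
    """Stand-in for an API error that carries an HTTP status code."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_retryable(status_code):
    """429 (rate limited) and any 5xx are treated as transient."""
    return status_code == 429 or 500 <= status_code <= 599


def call_with_retries(fn, max_retries=2, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on a retryable error, retry up to max_retries more times."""
    attempt = 0
    while True:
        try:
            return fn()
        except FakeAPIError as err:
            if attempt >= max_retries or not is_retryable(err.status_code):
                raise  # out of retries, or a non-transient error such as 401
            sleep(base_delay * 2 ** attempt)  # assumed backoff: 0.5s, 1s, ...
            attempt += 1
```

With `max_retries=2`, a call is attempted at most three times in total; with `max_retries=0`, the first failure propagates straight to the caller.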