Core API

Embeddings

The embeddings endpoint is OpenAI-compatible. Pass `model: "atlas-embed-1"` for Atlas mode, which auto-optimizes across supported embedding models, or pin a specific model id for Strict mode. Authentication, operation tagging, and the `delivery` block (served model and cost versus going direct) work the same as on `/chat/completions`.

Basic request

Send a single string and receive a single embedding vector. Newmen forwards to the chosen upstream and surfaces the response in OpenAI’s standard shape.

bash · curl
curl https://api.newmen.ai/v1/embeddings \
  -H "Authorization: Bearer $NEWMEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "atlas-embed-1",
    "input": "The quick brown fox jumps over the lazy dog"
  }'

typescript · @newmen-ai/sdk
import Newmen from "@newmen-ai/sdk";

const client = new Newmen({ apiKey: process.env.NEWMEN_API_KEY });

const res = await client.embeddings.create({
  model: "atlas-embed-1",
  input: "The quick brown fox jumps over the lazy dog",
});

console.log(res.data[0].embedding); // number[]
console.log("dimensions:", res.data[0].embedding.length);

python · newmen-ai
from newmen import Newmen
import os

client = Newmen(api_key=os.environ["NEWMEN_API_KEY"])

res = client.embeddings.create(
    model="atlas-embed-1",
    input="The quick brown fox jumps over the lazy dog",
)

print(res.data[0].embedding[:8], "…")
print("dimensions:", len(res.data[0].embedding))

Batch processing

Pass an array of strings and get a single response with one embedding per input. A batched call is cheaper and faster than N single-input calls.

typescript
const res = await client.embeddings.create({
  model: "atlas-embed-1",
  input: [
    "Machine learning is a subset of artificial intelligence",
    "Deep learning uses neural networks with multiple layers",
    "NLP enables computers to understand text",
  ],
});

res.data.forEach((row, i) => {
  console.log(`embedding ${i}: ${row.embedding.length} dims`);
});

Note. Per-input order is preserved in the response: data[i] corresponds to input[i].
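When there are more inputs than you want to send in one request, the order guarantee makes it straightforward to split the work into batches and stitch the results back together. The following is a minimal sketch, not part of the SDK: `embed_in_batches`, the `embed_fn` callback, and the `batch_size` default of 64 are illustrative assumptions (no per-request input limit is stated here). `embed_fn` stands in for a real embeddings call and is expected to return one vector per input, in input order.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def embed_in_batches(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], List[Vector]],
    batch_size: int = 64,  # assumed value, not a documented limit
) -> List[Tuple[str, Vector]]:
    """Embed texts in fixed-size batches and pair each text with its vector."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    paired: List[Tuple[str, Vector]] = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        vectors = embed_fn(batch)
        # Rely on the order guarantee: vectors[i] belongs to batch[i].
        if len(vectors) != len(batch):
            raise ValueError("embed_fn returned a different number of vectors than inputs")
        paired.extend(zip(batch, vectors))
    return paired


def fake_embed(batch: List[str]) -> List[Vector]:
    """Hypothetical stand-in for a real embeddings call: one 1-dim vector per input."""
    return [[float(len(text))] for text in batch]


if __name__ == "__main__":
    for text, vector in embed_in_batches(["a", "bb", "ccc"], fake_embed, batch_size=2):
        print(text, vector)
```

With the Python client from the basic request, `embed_fn` could plausibly be `lambda batch: [row.embedding for row in client.embeddings.create(model="atlas-embed-1", input=batch).data]`.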

Multimodal inputs

Some embedding models (e.g. voyageai/voyage-3-large, cohere/embed-multilingual-v3) accept text + image content blocks for joint embeddings. The format matches the standard multimodal embedding shape used by upstream embedding providers.

typescript
const res = await client.embeddings.create({
  model: "voyageai/voyage-3-large", // pin a multimodal-capable model
  input: [
    {
      content: [
        { type: "text", text: "A scenic boardwalk through a green meadow" },
        {
          type: "image_url",
          image_url: {
            url: "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/640px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
    },
  ],
  encoding_format: "float",
});
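If you build many multimodal inputs, a small helper keeps the content-block structure consistent. This is a sketch that mirrors the shape of the request above; `multimodal_input` is a hypothetical helper name, not an SDK function, and the example URL is a placeholder.

```python
from typing import Any, Dict, List, Optional


def multimodal_input(
    text: Optional[str] = None,
    image_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Build one input item made of text and/or image content blocks."""
    content: List[Dict[str, Any]] = []
    if text:
        content.append({"type": "text", "text": text})
    if image_url:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    if not content:
        raise ValueError("provide text, image_url, or both")
    return {"content": content}


if __name__ == "__main__":
    item = multimodal_input(
        text="A scenic boardwalk through a green meadow",
        image_url="https://example.com/boardwalk.jpg",  # placeholder URL
    )
    print(item)
```

The returned dictionaries can be collected into a list and passed as `input` alongside a pinned multimodal-capable model.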

Tagged for the Reliability Loop (Pro)

Embedding calls accept metadata.operation_id exactly like chat completions. Use it to group RAG-index calls together so the console can report on indexing latency and cost, and so the eval loop can flag bad chunks if you bind an evaluator to the operation.

typescript
const res = await client.embeddings.create({
  model: "atlas-embed-1",
  input: chunks,
  metadata: { operation_id: "rag_index_pages" },
});

// res.delivery → {
//   served_model: "openai/text-embedding-3-small",
//   provider:     "OpenAI",
//   your_price:   0.0006,   // what you paid
//   direct_price: 0.0008,   // what it costs going direct
// }
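To make the two price fields concrete, the saving relative to going direct is `(direct_price - your_price) / direct_price`. The sketch below works on a plain dictionary copied from the example values above; `savings_vs_direct` is a hypothetical helper, and how the Python client exposes the delivery block (attribute or dictionary) is an assumption to check against the SDK.

```python
from typing import Any, Mapping


def savings_vs_direct(delivery: Mapping[str, Any]) -> float:
    """Return the fraction saved relative to the direct price (0.25 means 25%)."""
    direct = float(delivery["direct_price"])
    paid = float(delivery["your_price"])
    if direct <= 0:
        raise ValueError("direct_price must be positive")
    return (direct - paid) / direct


# Values copied from the example delivery block above.
delivery = {
    "served_model": "openai/text-embedding-3-small",
    "provider": "OpenAI",
    "your_price": 0.0006,
    "direct_price": 0.0008,
}

if __name__ == "__main__":
    print(f"saved {savings_vs_direct(delivery):.0%} vs direct")  # prints: saved 25% vs direct
```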

Available models

atlas-embed-1 is the Atlas mode selector for embeddings: it auto-optimizes for the cheapest model that holds quality on your workload. The served model is always visible on the delivery block. For Strict behavior, pin any of these directly:

Open-weight:

qwen/qwen3-embedding-0.6b — 1024 dims
qwen/qwen3-embedding-4b — 2560 dims
qwen/qwen3-embedding-8b — 4096 dims
baai/bge-large-en-v1.5 / baai/bge-m3
snowflake/snowflake-arctic-embed-l
mixedbread-ai/mxbai-embed-large-v1

Closed-weight:

openai/text-embedding-3-small — 1536 dims, $0.020 / 1M tokens
openai/text-embedding-3-large — 3072 dims, $0.130 / 1M tokens
voyageai/voyage-3 / voyageai/voyage-3-large — text + multimodal
cohere/embed-english-v3 / cohere/embed-multilingual-v3
google/gemini-embedding-001

Note. Whichever model serves a call, you see it on the delivery block along with your price versus going direct. Embedding calls are billed at least 5% under direct, same as chat completions.
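As a rough worked example of the pricing note, the sketch below estimates the direct cost of a pinned OpenAI model and the ceiling implied by the "at least 5% under direct" statement. It assumes the prices listed above are the direct prices in US dollars per 1M tokens; the function names and the 50M-token workload are illustrative, and actual billing is whatever the delivery block reports.

```python
# Listed prices, read as direct USD per 1M tokens (an assumption).
DIRECT_PRICE_PER_1M_TOKENS = {
    "openai/text-embedding-3-small": 0.020,
    "openai/text-embedding-3-large": 0.130,
}

MIN_DISCOUNT = 0.05  # "at least 5% under direct"


def direct_cost(model: str, tokens: int) -> float:
    """Estimated cost in USD of embedding `tokens` tokens when going direct."""
    return DIRECT_PRICE_PER_1M_TOKENS[model] * tokens / 1_000_000


def max_billed_cost(model: str, tokens: int) -> float:
    """Upper bound on the billed cost implied by the minimum discount."""
    return direct_cost(model, tokens) * (1 - MIN_DISCOUNT)


if __name__ == "__main__":
    tokens = 50_000_000  # hypothetical indexing workload
    for model in DIRECT_PRICE_PER_1M_TOKENS:
        print(
            f"{model}: direct ${direct_cost(model, tokens):.2f}, "
            f"billed at most ${max_billed_cost(model, tokens):.2f}"
        )
```

For 50M tokens on openai/text-embedding-3-small this works out to $1.00 direct and at most $0.95 billed.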