Open-model inference

Inference with zero trail

Run production inference on EU infrastructure with fast streaming, zero retention, and zero training.

Retention: None
Training: Never
Infra: EU

First request: fast path · private by default

chat.completions.create · 200 OK

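import OpenAI from "openai";

// Point the standard OpenAI client at the Zro base URL.
const zro = new OpenAI({
  apiKey: process.env.ZRO_API_KEY,
  baseURL: "https://inference.moonmath.ai/v1"
});

// stream: true returns an async iterable of chunks instead of one response.
const stream = await zro.chat.completions.create({
  model: "glm-5.2", // or "minimax-m3"
  stream: true,
  messages: [{ role: "user", content: "Hello" }] // example message; replace with your own
});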

privacy_defaults: on every request

prompt_bodies: not retained
completion_bodies: not retained
training: disabled
boundary: EU

EU-FI: Finland (live)
EU-FR: France (live)

Details

Built for private inference

Zro exposes OpenAI-compatible access for chat completions, so existing clients and agent tools can point at the Zro base URL.

Zro also supports Anthropic-compatible Messages requests at /v1/messages for tools that expect that API shape.
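As a rough illustration of the Messages shape, the sketch below builds a request without sending it. Several details are assumptions rather than documented Zro behavior: that /v1/messages lives on the same host as the chat endpoint, that the `x-api-key` and `anthropic-version` headers used by Anthropic clients are accepted, and that `glm-5.2` is a valid model name on this route. `buildMessagesRequest` and `MessagesRequest` are hypothetical names used only for this example.

```typescript
// Hypothetical helper: builds an Anthropic-style Messages request.
interface MessagesRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildMessagesRequest(apiKey: string, prompt: string): MessagesRequest {
  return {
    // Assumption: same host as the OpenAI-compatible base URL.
    url: "https://inference.moonmath.ai/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey, // assumption: Anthropic-style auth header
        "anthropic-version": "2023-06-01", // assumption: version header Anthropic clients send
      },
      body: JSON.stringify({
        model: "glm-5.2", // assumption: same model names as the chat endpoint
        max_tokens: 256, // the Messages shape requires an explicit output cap
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

To send it, pass the pieces to `fetch`: `const { url, init } = buildMessagesRequest(key, "Hello"); const res = await fetch(url, init);`. Keeping construction separate from sending makes the payload easy to inspect before any prompt leaves the machine.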

Prompt and completion bodies are not retained by default after inference is processed.

Zro is built for responsive streaming inference, so developer tools, agents, and production apps do not have to trade speed for privacy.
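One way to consume a streamed response is to accumulate the text deltas as they arrive. The sketch below assumes chunks follow the OpenAI chat-completions streaming shape (`choices[0].delta.content`). `collectStream`, `ChatChunk`, and `fakeStream` are illustrative names, and the fake stream stands in for a real response so the example runs without network access.

```typescript
// Minimal shape of an OpenAI-style streaming chunk (only the fields used here).
interface ChatChunk {
  choices: Array<{ delta: { content?: string | null } }>;
}

// Concatenate the text deltas from a stream, calling onToken for each non-empty piece.
async function collectStream(
  stream: AsyncIterable<ChatChunk>,
  onToken: (text: string) => void = () => {}
): Promise<string> {
  let full = "";
  for await (const chunk of stream) {
    const text = chunk.choices[0]?.delta?.content ?? "";
    if (text) {
      onToken(text);
      full += text;
    }
  }
  return full;
}

// Stand-in for a real response stream.
async function* fakeStream(): AsyncGenerator<ChatChunk> {
  yield { choices: [{ delta: { content: "Hel" } }] };
  yield { choices: [{ delta: { content: "lo" } }] };
  yield { choices: [{ delta: {} }] }; // a chunk may carry no content
}
```

With a real client, the object returned by `chat.completions.create({ ..., stream: true })` can be passed in place of `fakeStream()`, for example `await collectStream(stream, (t) => process.stdout.write(t))` to print tokens as they arrive.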

MiniMax M3 and GLM-5.2 are available now, with more open-model options being added across regions.

Customer prompts and completions are never used for training, fine-tuning, evaluations, analytics, or dataset creation.

Zro runs on privacy-forward EU infrastructure. Current regions include Finland and France.

Pro includes monthly credits that do not roll over. Active Pro accounts can add one-time top-ups, and top-up credits expire after 180 days.