ARouter — the routing layer for OpenAI & Anthropic, in Rust

// data plane

A control plane for your LLM traffic.

Everything happens at the proxy — your application keeps speaking plain OpenAI or Anthropic.

Two wire formats in

Wire-compatible /v1/chat/completions and /v1/messages: bring the OpenAI or the Anthropic SDK. Streaming and non-streaming are both supported, and pointing a client at ARouter is a one-line change.

Two providers, either API

OpenAI and Anthropic — call either native API and the model name picks the upstream; the request and response are translated to and from each provider's native shape on the way through.

Right-size every request

A YAML policy DSL. Match on token budget, tools, images, or response format — the first match swaps the model before dispatch, usually to a cheaper tier of the same provider, and labels every metric.

Self-healing fallback

Validate a response against its JSON schema and silently retry down a fallback chain, whether the response is buffered or streamed (streams go through an 8 KB pre-flush buffer). Faithful proxy: the last attempt always wins.
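The validate-then-fall-back loop can be sketched in a few lines of Python. This is an illustration of the idea, not ARouter's Rust implementation: `conforms` is a deliberately tiny stand-in for a real JSON Schema validator (it only checks `required` keys), `call_model` is a hypothetical callable that returns the raw response text for a model name, and reading "the last attempt always wins" as "return the final response even if it fails validation" is an assumption.

```python
import json


def conforms(text, schema):
    """Stand-in validator: the payload must be a JSON object that
    contains every key listed under the schema's `required`."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(key in data for key in schema.get("required", []))


def complete_with_fallback(call_model, models, schema):
    """Try each model in order and stop at the first valid response.
    If none validates, the last attempt is returned unchanged."""
    if not models:
        raise ValueError("need at least one model")
    for model in models:
        response = call_model(model)
        if conforms(response, schema):
            break
    return model, response


# Hypothetical canned replies: the cheap model fails, the fallback succeeds.
replies = {"gpt-4o-mini": "not json", "gpt-4.1": '{"name": "Ada"}'}
schema = {"type": "object", "required": ["name"]}
print(complete_with_fallback(replies.__getitem__, ["gpt-4o-mini", "gpt-4.1"], schema))
```

The caller sees a single response either way; which model produced it is the only thing that changes.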

Resilience built in

Per-provider retries with jittered backoff, circuit breakers, connection pooling, and opt-in request hedging — so one flaky upstream never takes the gateway down.
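As a rough picture of what retries with jittered backoff mean, here is a minimal Python sketch. The source does not say which jitter variant or which limits ARouter uses; the "full jitter" formula, the `base` and `cap` defaults, and the function names below are all assumptions chosen for illustration.

```python
import random
import time


def backoff_delay(attempt, base=0.1, cap=2.0):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt)] seconds."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))


def call_with_retries(fn, max_attempts=3, sleep=time.sleep):
    """Call fn(); on ConnectionError, wait a jittered delay and try again.
    The final failure is re-raised to the caller."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt))
```

Randomizing the delay keeps many clients that failed at the same moment from retrying in lockstep against an upstream that is already struggling.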

Observable by default

OpenTelemetry traces (OTLP), Prometheus metrics, per-request x-arouter-* headers, and a real /healthz that probes every upstream. Attribute savings to a rule.

// policy

Routing is a YAML file, not a redeploy.

Rules evaluate top-down on every request. The first match swaps the model — most often to a cheaper tier of the same provider — and can attach validators and a fallback chain.

policies.yaml · policy DSL

rules:
  # trivial -> cheap tier (same provider)
  - name: trivial_to_mini
    when:
      tokens_input_lt: 250
      no_tools: true
    then:
      model: gpt-4o-mini

  # extract: start cheap, self-heal up the ladder
  - name: extract_with_self_healing
    when:
      response_format_type: json_schema
    then:
      model: gpt-4o-mini
      validate: [json_schema]
      fallback_chain: [gpt-4.1]

what it does · first-match-wins

A small, tool-free prompt quietly drops to gpt-4o-mini — same provider, a fraction of the cost, just a different model string. A structured-extraction request also starts on gpt-4o-mini, checks the JSON against the requested schema, and — if it doesn't conform — self-heals up to gpt-4.1. No provider hop, no app change.

→ Conditions AND together: tokens, tools, images, response_format, current model.

→ Fallbacks climb the same provider's ladder by default and can cross providers when you want; the chain re-runs selection per attempt.

→ Per-request headers (x-arouter-fallback-chain, …) can override the policy inline.
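The evaluation order described above (top-down, conditions ANDed, first match wins) can be modeled with a short Python sketch. The rules mirror the policies.yaml sample; the request shape (`tokens_input`, `tools`, `response_format_type` keys) and the `CONDITIONS` table are hypothetical, invented here for illustration, and say nothing about how ARouter represents requests internally.

```python
# The two rules from policies.yaml, written as plain Python data.
RULES = [
    {
        "name": "trivial_to_mini",
        "when": {"tokens_input_lt": 250, "no_tools": True},
        "then": {"model": "gpt-4o-mini"},
    },
    {
        "name": "extract_with_self_healing",
        "when": {"response_format_type": "json_schema"},
        "then": {
            "model": "gpt-4o-mini",
            "validate": ["json_schema"],
            "fallback_chain": ["gpt-4.1"],
        },
    },
]

# One predicate per condition key; each takes (request, configured value).
CONDITIONS = {
    "tokens_input_lt": lambda req, limit: req["tokens_input"] < limit,
    "no_tools": lambda req, want: (not req.get("tools")) == want,
    "response_format_type": lambda req, kind: req.get("response_format_type") == kind,
}


def select_rule(rules, req):
    """Return the first rule whose conditions all hold, or None."""
    for rule in rules:
        if all(CONDITIONS[key](req, value) for key, value in rule["when"].items()):
            return rule
    return None


small_extract = {"tokens_input": 100, "tools": [], "response_format_type": "json_schema"}
print(select_rule(RULES, small_extract)["name"])
```

Note the consequence of ordering: a small, tool-free extraction request satisfies both rules, but only the first one applies, so it would not get the validator or the fallback chain unless the rules were reordered.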

// drop-in

Change the base URL. That's the integration.

Your code doesn't know ARouter exists. The model name decides the upstream — and a Claude model gets translated to Anthropic's native API on the way through. ARouter is a validated drop-in for the official OpenAI and Anthropic Python SDKs; a conformance suite runs both SDKs against real endpoints in CI.

app.py · python · openai sdk

# point your existing client at ARouter —
# the api_key your app sends is ignored.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="anything",
)

# routed + translated to Anthropic:
client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "hello"}],
)
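Streaming should look the same from the application's side. The helper below is a sketch that assumes the standard OpenAI SDK streaming shape (`stream=True`, chunks exposing `choices[0].delta.content`); `stream_text` is a name made up for this example, and the model string is the one from the snippet above.

```python
def stream_text(client, model, prompt):
    """Yield text pieces from a streamed chat completion as they arrive."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content


# Usage against a running gateway (needs the openai package):
#
#   from openai import OpenAI
#   client = OpenAI(base_url="http://localhost:8080/v1", api_key="anything")
#   for piece in stream_text(client, "claude-sonnet-4-5-20250929", "hello"):
#       print(piece, end="", flush=True)
```

The helper takes the client as a parameter, so it can be exercised with a stub in tests and with the real SDK client in the application.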

arouter.toml · config

listen_addr = "0.0.0.0:8080"

[router]
policy_file = "policies.yaml"

[providers.openai]
api_key = "sk-..."

[providers.anthropic]
api_key = "sk-ant-..."

[telemetry]
# OTLP traces when set; no-op otherwise
otlp_endpoint = "http://localhost:4317"