v3.0.0 production-ready · LLM routing proxy · written in Rust

The routing layer for OpenAI & Anthropic.

Speak the OpenAI or Anthropic API to a single endpoint. ARouter right-sizes requests by policy, self-heals bad responses, and translates across OpenAI and Anthropic — with retries, circuit breakers, and OpenTelemetry traces, all without touching application code.

shell
$ docker run -p 8080:8080 \
-v $PWD/arouter.toml:/etc/arouter/arouter.toml \
ghcr.io/sricola/arouter
arouter listening · 0.0.0.0:8080 · providers ready
[Diagram: your app (OpenAI or Anthropic SDK) -> arouter (route · heal · observe) -> OpenAI (gpt-*) or Anthropic (claude-*). Two wire formats · two native providers · model-routed.]

// data plane

A control plane for your LLM traffic.

Everything happens at the proxy — your application keeps speaking plain OpenAI or Anthropic.

01

Two wire formats in

Wire-compatible /v1/chat/completions and /v1/messages: bring the OpenAI or the Anthropic SDK. Streaming and non-streaming are both supported, and pointing a client at the proxy is a one-line change.

02

Two providers, either API

OpenAI and Anthropic — call either native API and the model name picks the upstream; the request and response are translated to and from each provider's native shape on the way through.
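To make the "model name picks the upstream" idea concrete, here is a minimal Python sketch of prefix routing plus one direction of translation. It is an illustration, not ARouter's implementation: the prefix table is inferred from the gpt-* / claude-* labels on this page, the function names and the `default_max_tokens` parameter are hypothetical, and only plain-text messages are handled.

```python
from typing import Any

# Assumption: prefixes inferred from the "gpt-*" / "claude-*" labels;
# the real matching rules may differ.
PREFIXES = {"gpt-": "openai", "claude-": "anthropic"}


def pick_upstream(model: str) -> str:
    """Return the provider whose prefix matches the model name."""
    for prefix, provider in PREFIXES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"no provider for model {model!r}")


def openai_to_anthropic(body: dict[str, Any], default_max_tokens: int = 1024) -> dict[str, Any]:
    """Reshape an OpenAI chat-completions body into an Anthropic messages body.

    System messages move to the top-level `system` field, and `max_tokens`
    (required by Anthropic's API) falls back to a hypothetical default.
    """
    system_parts = [m["content"] for m in body["messages"] if m["role"] == "system"]
    turns = [
        {"role": m["role"], "content": m["content"]}
        for m in body["messages"]
        if m["role"] in ("user", "assistant")
    ]
    out: dict[str, Any] = {
        "model": body["model"],
        "messages": turns,
        "max_tokens": body.get("max_tokens", default_max_tokens),
    }
    if system_parts:
        out["system"] = "\n".join(system_parts)
    return out
```

A real translator also has to cover tools, images, streaming events, and the response path; this sketch only shows the shape of the problem.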

03

Right-size every request

A YAML policy DSL. Match on token budget, tools, images, or response format — the first match swaps the model before dispatch, usually to a cheaper tier of the same provider, and labels every metric.

04

Self-healing fallback

Validate a response against its JSON schema and silently retry along a fallback chain. This works for buffered responses and for streamed ones, via an 8 KB pre-flush buffer. Faithful proxy: the last attempt always wins.
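The validate-then-retry loop can be sketched in a few lines of Python. This is a sketch under stated assumptions: `conforms` is a tiny stand-in that only checks the schema's `required` keys (not a full JSON Schema validator), `call` is a hypothetical function that sends the request to one model, and "the last attempt always wins" is read here as "if nothing validates, return the final response unchanged".

```python
import json
from typing import Callable


def conforms(text: str, schema: dict) -> bool:
    """Stand-in validator: text must parse to an object with all `required` keys."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(value, dict) and all(k in value for k in schema.get("required", []))


def complete_with_fallback(
    call: Callable[[str], str], models: list[str], schema: dict
) -> tuple[str, str]:
    """Try each model in order; stop at the first response that validates.

    If no response validates, the last attempt's (model, text) is returned as-is.
    """
    if not models:
        raise ValueError("models must not be empty")
    for model in models:
        text = call(model)
        if conforms(text, schema):
            break
    return model, text
```

With `models=["gpt-4o-mini", "gpt-4.1"]`, a malformed reply from the first model triggers exactly one more attempt on the second.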

05

Resilience built in

Per-provider retries with jittered backoff, circuit breakers, connection pooling, and opt-in request hedging — so one flaky upstream never takes the gateway down.
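For readers unfamiliar with the pattern, jittered retries behind a circuit breaker look roughly like this in Python. Everything here is illustrative: the class, the thresholds, and the "full jitter" formula are generic choices, not ARouter's actual parameters or code.

```python
import random
import time


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; allows a probe after `cooldown` seconds."""

    def __init__(self, threshold=3, cooldown=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        return self.clock() - self.opened_at >= self.cooldown

    def record(self, ok: bool) -> None:
        if ok:
            self.failures = 0
            self.opened_at = None
            return
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()


def backoff_delay(attempt, base=0.2, cap=5.0, rng=random.random):
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * 2 ** attempt)


def call_with_retries(fn, breaker, attempts=3, sleep=time.sleep):
    """Call fn() up to `attempts` times, sleeping a jittered delay between tries."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_error = None
    for attempt in range(attempts):
        if not breaker.allow():
            raise RuntimeError("circuit open")
        try:
            result = fn()
        except Exception as exc:
            breaker.record(False)
            last_error = exc
            if attempt + 1 < attempts:
                sleep(backoff_delay(attempt))
        else:
            breaker.record(True)
            return result
    raise last_error
```

Keeping one breaker per provider is what stops a single failing upstream from absorbing every request's retry budget.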

06

Observable by default

OpenTelemetry traces (OTLP), Prometheus metrics, per-request x-arouter-* headers, and a real /healthz that probes every upstream. Attribute savings to a rule.


// policy

Routing is a YAML file, not a redeploy.

Rules evaluate top-down on every request. The first match swaps the model — most often to a cheaper tier of the same provider — and can attach validators and a fallback chain.

policies.yaml · policy DSL
rules:
  # trivial -> cheap tier (same provider)
  - name: trivial_to_mini
    when:
      tokens_input_lt: 250
      no_tools: true
    then:
      model: gpt-4o-mini

  # extract: start cheap, self-heal up the ladder
  - name: extract_with_self_healing
    when:
      response_format_type: json_schema
    then:
      model: gpt-4o-mini
      validate: [json_schema]
      fallback_chain: [gpt-4.1]
what it does · first-match-wins

A small, tool-free prompt quietly drops to gpt-4o-mini — same provider, a fraction of the cost, just a different model string. A structured-extraction request also starts on gpt-4o-mini, checks the JSON against the requested schema, and — if it doesn't conform — self-heals up to gpt-4.1. No provider hop, no app change.

Conditions AND together: tokens, tools, images, response_format, current model.
Fallbacks climb the same provider's ladder by default — and can cross providers when you want; the chain re-runs selection per attempt.
Per-request headers (x-arouter-fallback-chain, …) can override the policy inline.
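The first-match-wins evaluation described above can be sketched in Python. The rules mirror policies.yaml as plain dicts; the `CHECKS` table, the `select` function, and the request "facts" (`tokens_input`, `tools`, `response_format_type`) are hypothetical names chosen for illustration, not ARouter's internals.

```python
# The two rules from policies.yaml, written as plain Python data.
RULES = [
    {
        "name": "trivial_to_mini",
        "when": {"tokens_input_lt": 250, "no_tools": True},
        "then": {"model": "gpt-4o-mini"},
    },
    {
        "name": "extract_with_self_healing",
        "when": {"response_format_type": "json_schema"},
        "then": {
            "model": "gpt-4o-mini",
            "validate": ["json_schema"],
            "fallback_chain": ["gpt-4.1"],
        },
    },
]

# One predicate per condition key; each takes (request facts, configured value).
CHECKS = {
    "tokens_input_lt": lambda req, v: req["tokens_input"] < v,
    "no_tools": lambda req, v: (not req.get("tools")) == v,
    "response_format_type": lambda req, v: req.get("response_format_type") == v,
}


def select(rules, req):
    """Return (rule name, then-block) for the first rule whose conditions all hold."""
    for rule in rules:
        # Conditions AND together: every key under `when` must pass.
        if all(CHECKS[key](req, value) for key, value in rule["when"].items()):
            return rule["name"], rule["then"]
    return None  # no rule matched: the request keeps its original model
```

Because evaluation stops at the first hit, a small tool-free request that also asks for json_schema output matches trivial_to_mini and never reaches the second rule; rule order is part of the policy.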

// drop-in

Change the base URL. That's the integration.

Your code doesn't know ARouter exists. The model name decides the upstream, and a request for a Claude model gets translated to Anthropic's native API on the way through. ARouter is a validated drop-in for the official OpenAI and Anthropic Python SDKs; a conformance suite runs both SDKs against real endpoints in CI.

app.py · python · openai sdk
# point your existing client at ARouter —
# the api_key your app sends is ignored.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="anything",
)

# routed + translated to Anthropic:
response = client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "hello"}],
)
print(response.choices[0].message.content)
arouter.toml · config
listen_addr = "0.0.0.0:8080"

[router]
policy_file = "policies.yaml"

[providers.openai]
api_key = "sk-..."

[providers.anthropic]
api_key = "sk-ant-..."

[telemetry]
# OTLP traces when set; no-op otherwise
otlp_endpoint = "http://localhost:4317"
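The other wire format, /v1/messages, can be exercised without any SDK. The Python sketch below only builds the request, using the standard library. Assumptions: the proxy is on localhost:8080 as in the docker example, the header set mirrors what Anthropic's own API expects (whether ARouter requires `anthropic-version` is not stated here), the placeholder key is ignored on this path as it is in app.py, and `build_messages_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: same address as the docker example on this page.
BASE_URL = "http://localhost:8080"


def build_messages_request(model: str, prompt: str, max_tokens: int = 256) -> urllib.request.Request:
    """Build (but do not send) an Anthropic-style POST to /v1/messages."""
    body = {
        "model": model,
        "max_tokens": max_tokens,  # required by the Anthropic wire format
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": "anything",  # placeholder; assumed to be ignored by the proxy
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )
```

With a proxy running, pass the result to `urllib.request.urlopen` and decode the JSON response body.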

Rust. Tokio · Axum · rustls. One static binary.
2× providers · both native APIs.
OTLP+ OpenTelemetry traces & Prometheus metrics.
Docker·Helm: multi-arch image & chart, shipped per release.
Apache-2.0. Permissive. Bring your own keys.