Speak the OpenAI or Anthropic API to a single endpoint. ARouter right-sizes requests by policy, self-heals bad responses, and translates across OpenAI and Anthropic — with retries, circuit breakers, and OpenTelemetry traces, all without touching application code.
Everything happens at the proxy — your application keeps speaking plain OpenAI or Anthropic.
Wire-compatible /v1/chat/completions and /v1/messages — bring the OpenAI or the Anthropic SDK. Streaming and non-streaming, one line to point at it.
OpenAI and Anthropic — call either native API and the model name picks the upstream; the request and response are translated to and from each provider's native shape on the way through.
A YAML policy DSL. Match on token budget, tools, images, or response format — the first match swaps the model before dispatch, usually to a cheaper tier of the same provider, and labels every metric.
Validate a response against its JSON schema and silently retry through a fallback chain, whether buffered or streamed via an 8 KB pre-flush buffer. Faithful proxy: the last attempt always wins.
Per-provider retries with jittered backoff, circuit breakers, connection pooling, and opt-in request hedging — so one flaky upstream never takes the gateway down.
OpenTelemetry traces (OTLP), Prometheus metrics, per-request x-arouter-* headers, and a real /healthz that probes every upstream. Attribute savings to a rule.
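The resilience features listed above (retries with jittered backoff, circuit breakers) can be illustrated with a minimal Python sketch. This is not ARouter's implementation: the class and function names, the failure threshold, the cool-down, and the "full jitter" formula are all assumptions chosen for illustration.

```python
import random
import time


class CircuitOpen(Exception):
    """Raised when the breaker refuses to call an upstream."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        if self.opened_at is None:
            return True
        # Half-open: let a probe through once the cool-down has elapsed.
        return self.clock() - self.opened_at >= self.reset_after

    def record(self, ok):
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random.random):
    """Full jitter: a uniform delay in [0, min(cap, base * 2**attempt))."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_retries(fn, breaker, max_attempts=3, sleep=time.sleep):
    """Call fn(), retrying failures with jittered backoff behind the breaker."""
    last_error = None
    for attempt in range(max_attempts):
        if not breaker.allow():
            raise CircuitOpen("upstream circuit is open")
        try:
            result = fn()
        except Exception as exc:
            breaker.record(False)
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(backoff_delay(attempt))
        else:
            breaker.record(True)
            return result
    raise last_error
```

Keeping one breaker per provider means a failing upstream is skipped quickly while the others keep serving. The jitter spreads retries out so clients do not retry in lockstep.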
Rules evaluate top-down on every request. The first match swaps the model — most often to a cheaper tier of the same provider — and can attach validators and a fallback chain.
rules:
  # trivial -> cheap tier (same provider)
  - name: trivial_to_mini
    when:
      tokens_input_lt: 250
      no_tools: true
    then:
      model: gpt-4o-mini

  # extract: start cheap, self-heal up the ladder
  - name: extract_with_self_healing
    when:
      response_format_type: json_schema
    then:
      model: gpt-4o-mini
      validate: [json_schema]
      fallback_chain: [gpt-4.1]
A small, tool-free prompt quietly drops to gpt-4o-mini — same provider, a fraction of the cost, just a different model string. A structured-extraction request also starts on gpt-4o-mini, checks the JSON against the requested schema, and — if it doesn't conform — self-heals up to gpt-4.1. No provider hop, no app change.
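The first-match and self-healing behavior described above can be sketched in Python. This is an illustrative model only, not ARouter's code: the `Rule` dataclass, the request fields (`tokens_input`, `tools`, `required_keys`), and the `conforms` check (a toy stand-in for real JSON Schema validation) are assumptions. The rule names and conditions mirror the policy shown earlier.

```python
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    name: str
    when: dict
    model: str
    validate: list = field(default_factory=list)
    fallback_chain: list = field(default_factory=list)


# Mirrors the two rules in the policy above.
RULES = [
    Rule("trivial_to_mini", {"tokens_input_lt": 250, "no_tools": True}, "gpt-4o-mini"),
    Rule(
        "extract_with_self_healing",
        {"response_format_type": "json_schema"},
        "gpt-4o-mini",
        validate=["json_schema"],
        fallback_chain=["gpt-4.1"],
    ),
]


def matches(when: dict, req: dict) -> bool:
    """Every condition in `when` must hold for the rule to match."""
    checks = {
        "tokens_input_lt": lambda limit: req.get("tokens_input", 0) < limit,
        "no_tools": lambda want: (not req.get("tools")) == want,
        "response_format_type": lambda t: req.get("response_format_type") == t,
    }
    return all(checks[key](value) for key, value in when.items())


def first_match(rules, req) -> Optional[Rule]:
    """Rules are evaluated top-down; the first one that matches wins."""
    for rule in rules:
        if matches(rule.when, req):
            return rule
    return None


def conforms(text: str, required_keys: list) -> bool:
    """Toy validator: a parseable JSON object containing the required keys."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and all(k in data for k in required_keys)


def dispatch(req: dict, call_model, rules=RULES):
    """Swap the model per policy, then walk the fallback chain on invalid output."""
    rule = first_match(rules, req)
    if rule is None:
        return req["model"], call_model(req["model"], req)
    for model in [rule.model, *rule.fallback_chain]:
        body = call_model(model, req)
        if not rule.validate or conforms(body, req.get("required_keys", [])):
            break
    # If every attempt failed validation, the last attempt is returned as-is.
    return model, body
```

Because evaluation stops at the first match, rule order matters: in this sketch a small, tool-free request that also asks for `json_schema` output would match `trivial_to_mini` and never reach the extraction rule.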
response_format, current model.x-arouter-fallback-chain, …) can override the policy inline.
Your code doesn't know ARouter exists. The model name decides the upstream, and a Claude model is translated to Anthropic's native API on the way through. ARouter is a validated drop-in for the official OpenAI and Anthropic Python SDKs; a conformance suite runs both SDKs against real endpoints in CI.
# point your existing client at ARouter:
# the api_key your app sends is ignored.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="anything",
)

# routed + translated to Anthropic:
client.chat.completions.create(
    model="claude-sonnet-4-5-20250929",
    messages=[{"role": "user", "content": "hello"}],
)
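For comparison, a structured-extraction request of the kind the `extract_with_self_healing` rule matches might look like the sketch below. The schema, prompt text, and helper names (`build_extraction_request`, `main`) are hypothetical. The `response_format` shape follows the OpenAI Chat Completions structured-output format. Whether `response.model` reports the swapped model is an assumption about the gateway, not something the source states.

```python
def build_extraction_request(text: str) -> dict:
    """Keyword arguments for client.chat.completions.create(...)."""
    # Hypothetical schema for illustration only.
    schema = {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "email": {"type": "string"},
        },
        "required": ["name", "email"],
        "additionalProperties": False,
    }
    return {
        # The app asks for gpt-4.1; the policy may start on a cheaper tier.
        "model": "gpt-4.1",
        "messages": [
            {"role": "system", "content": "Extract the contact details as JSON."},
            {"role": "user", "content": text},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "contact", "schema": schema, "strict": True},
        },
    }


def main() -> None:
    # Imported lazily so the request builder works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="anything")
    response = client.chat.completions.create(
        **build_extraction_request("Ada <ada@example.com>")
    )
    print(response.model)  # assumption: reflects the model that answered
    print(response.choices[0].message.content)
```

Calling `main()` against a running gateway sends the request. The application code is the same as it would be against the provider directly, apart from `base_url`.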
listen_addr = "0.0.0.0:8080"

[router]
policy_file = "policies.yaml"

[providers.openai]
api_key = "sk-..."

[providers.anthropic]
api_key = "sk-ant-..."

[telemetry]
# OTLP traces when set; no-op otherwise
otlp_endpoint = "http://localhost:4317"