Claude Code spillover to OpenRouter: flip before the 429, flip back after reset.

Every guide on this tells you the four env vars and stops. The hard part is timing. Flip too late and a 429 kills your in-flight tool turn. Flip permanently and you pay OpenRouter API rates for hours you did not need to. Here is the shell hook that uses your real plan quota to flip at the right moment, and flip back the second the 5-hour clock rolls.

Matthew Diakonov, Written with AI

Published May 6, 20268 min read

Direct answer (verified 2026-05-06)

To spill Claude Code over to OpenRouter when your Pro or Max plan caps, set four env vars per OpenRouter’s Claude Code integration doc: ANTHROPIC_BASE_URL="https://openrouter.ai/api", ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY", ANTHROPIC_API_KEY="" (must be empty), and model overrides like ANTHROPIC_DEFAULT_OPUS_MODEL="anthropic/claude-opus-4.7". Flip back by unsetting them. The interesting part is when to flip. ClaudeMeter exposes the same server-truth utilization fractions that claude.ai/settings/usage renders, as JSON via claude-meter --json, so a 12-line shell hook can flip ANTHROPIC_BASE_URL the moment five_hour.utilization crosses 0.95, and unflip it the moment the rolling window rolls.

Why people search for “spillover”

You are 40 minutes into a Claude Code refactor on Max. The agent has touched 14 files, has a half-applied migration, and is mid-tool-call. A 429 fires. Settings/usage says “rate limit reached” and gives you a reset time five hours from now. You are not waiting five hours. So you Google around and learn that Claude Code respects an ANTHROPIC_BASE_URL override; you can point it at OpenRouter, paste in your OpenRouter API key, and the same agent loop keeps running on metered API pricing. That is “spillover”: plan as the default, OpenRouter as the safety net.

The pattern works. The problem is most people only set it up after the wall has already cost them an in-flight turn, and most people forget to switch back, so they pay API rates for hours their plan would have happily covered. Both of those are timing problems, and both are solvable if you can see the plan-side quota in advance.

The four env vars (verified against OpenRouter’s doc)

This is the part every other guide covers. It is correct, it is short, and it is straight from OpenRouter’s Claude Code integration doc. The empty ANTHROPIC_API_KEY matters: if your real plan key is in your environment, Claude Code prefers it and ignores the auth token, so the spillover quietly fails and you keep eating plan quota.

~/.spillover-env

What ClaudeMeter adds that nothing else does

Knowing the four env vars is table stakes. The questions that break people in practice are the ones nothing else answers.

Feature	Manual flip, no plan-side visibility	With ClaudeMeter driving timing
Knowing the wall is coming, not just that it has hit	Wait for the 429, then flip. Lose the in-flight turn.	Watch five_hour.utilization climb in the menu bar. Flip at 95%, never see a 429.
Knowing which bucket fired	Settings page says 'rate limited'. You do not know if it was 5-hour or 7-day or per-model.	ClaudeMeter renders one row per bucket with the exact percent and resets_at. You see the wall by name.
Knowing when to flip back	Stay on OpenRouter until you remember to switch back. Pay API rates for hours you did not need to.	five_hour.resets_at is a real timestamp. Hook fires and unsets ANTHROPIC_BASE_URL the moment it rolls.
Scripting it	Most guides give you the env vars. You wire the rest manually.	claude-meter --json is structured output. Pipe it into jq and you have a 12-line direnv hook.
Cookie paste step	Other plan-side tools ask you to paste your sessionKey cookie into a config file.	Browser extension forwards the live claude.ai cookies via Manifest V3 (extension/background.js, POLL_MINUTES = 1). No paste.

Watching the wall climb (and the hook firing)

The interesting moment is the one between “everything is fine” and “everything is on fire.” ClaudeMeter exposes that moment as a number. Here is the same machine 36 minutes apart on a Wednesday afternoon, with the hook firing at 95%.

claude-meter

And here is the same shell once the 5-hour window rolls. Utilization drops from 96% to 3% in a single tick. The hook unsets ANTHROPIC_BASE_URL and you are back on plan billing without thinking about it.

claude-meter

The spillover hook (12 lines, MIT, copy as is)

This is the anchor. Drop it in ~/.config/spillover.sh and source it from your shell prehook, or run it on a 60-second cron and source the output file. It depends on claude-meter --json (declared in src/main.rs lines 8 to 11) and jq. The schema it relies on is in src/models.rs lines 18 to 28 and is part of the public Rust types, so it will not move silently.

~/.config/spillover.sh

The schema you depend on:

claude-meter/src/models.rs

Setup, end to end

Install ClaudeMeter and load the browser extension

brew install --cask m13v/tap/claude-meter. Then load the extension from claude-meter/extension/ as an unpacked extension in Chrome, Arc, Brave, or Edge. The extension picks up your live claude.ai cookies via Manifest V3 and posts a snapshot to the menu-bar app every 60 seconds; no cookie paste, no keychain prompt. Verify with claude-meter --json: you should see a usage object with five_hour.utilization populated.

Get an OpenRouter API key with paid credits on it

Sign up at openrouter.ai and add credits. Free-tier limits are around 20 requests per minute and 200 per day per model, which a single Claude Code session burns through in one refactor. Save the key as OPENROUTER_API_KEY in your shell rc, but do NOT export ANTHROPIC_BASE_URL yet. You want spillover to be conditional, not permanent.

Drop the 12-line spillover hook into ~/.config/spillover.sh

It reads claude-meter --json, jq's the five_hour.utilization, and exports or unsets ANTHROPIC_BASE_URL based on a threshold. Source it from your zsh/bash prehook (so each new prompt re-checks) or wire it as a direnv .envrc that re-evaluates every 60 seconds. The full hook is in the Spillover hook section.

Pick the threshold and the cooldown

0.95 / 0.50 is a reasonable default: flip to OpenRouter at 95% utilization on the rolling 5-hour bucket, flip back when it drops under 50% (which usually means the window has just rolled). Tighten the threshold to 0.85 if you want more headroom; loosen the cooldown to 0.30 if you want to be sure you are fully out of the wall before paying API rates again.

Verify the flip in both directions

Trigger spillover manually by setting THRESHOLD=0.10 temporarily, source the hook, and check echo $ANTHROPIC_BASE_URL — it should print https://openrouter.ai/api. Run a small claude command and confirm the request actually went to OpenRouter (their dashboard logs requests in real time). Then restore THRESHOLD=0.95 and source again; ANTHROPIC_BASE_URL should be unset.

What this does not solve

A 5-hour rolling wall is the easy case. The harder case is when the weekly bucket fires. If seven_day or seven_day_opus or seven_day_oauth_apps is the gate, the rolling window will not save you for days. In that case spillover stops being “a few hours of overflow” and starts being “the rest of the cycle on API rates,” which is meaningfully more expensive. The hook above only watches five_hour on purpose; you want a human eye on a weekly wall, not an autopilot.

The fix when a weekly bucket fires is usually one of three things: switch the model bucket that is hot (drop from Opus to Sonnet for the rest of the week), turn on Anthropic’s metered billing inside the plan, or ride spillover until reset. ClaudeMeter shows you which weekly bucket is hot so you can pick.

Why this is hard to do without server-truth quota

The two other tools people reach for here are ccusage and Claude-Code-Usage-Monitor. Both are good at what they do; neither sees plan quota. They read the local ~/.claude/projects/<project>/<session>.jsonl files Claude Code writes to disk and price the tokens against a model card. That is local-truth: tokens that left your machine, dollars per million.

Plan caps live on Anthropic’s servers as utilization fractions on /api/organizations/{org_uuid}/usage. That endpoint is undocumented, but it is the same one claude.ai/settings/usage calls in your browser tab right now, and the response is publicly inspectable through DevTools. ClaudeMeter replays that call every 60 seconds with the cookies your browser already holds (no cookie paste, no keychain prompt) and surfaces the result as JSON. That is what makes a spillover hook driveable. Local-truth tools cannot give you the wall-is-coming signal because the wall is not in the local logs.

Want help wiring this into your shell?

Book 20 minutes and we will set up the spillover hook against your real claude-meter --json output and verify the flip in both directions.

Frequently asked

Frequently asked questions

What is Claude Code spillover to OpenRouter, in one sentence?

It is the workaround where you point Claude Code at OpenRouter's API endpoint instead of Anthropic's once your Pro or Max plan hits a rolling 5-hour or weekly cap, so the agent loop keeps running on metered API pricing instead of stalling. You flip four environment variables (ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, ANTHROPIC_API_KEY blanked, plus model overrides), and Claude Code happily talks the Anthropic protocol to OpenRouter as if it were Anthropic.

What four environment variables do I need to set?

Per OpenRouter's own integration doc: ANTHROPIC_BASE_URL="https://openrouter.ai/api", ANTHROPIC_AUTH_TOKEN="$OPENROUTER_API_KEY", ANTHROPIC_API_KEY="" (explicitly empty so it does not collide with your real plan key), and model overrides like ANTHROPIC_DEFAULT_OPUS_MODEL="anthropic/claude-opus-4.7" / ANTHROPIC_DEFAULT_SONNET_MODEL="anthropic/claude-sonnet-4.6" / ANTHROPIC_DEFAULT_HAIKU_MODEL="anthropic/claude-haiku-4.5". Reverse the same four vars to flip back to plan billing.

Why not just stay on OpenRouter the whole time?

Because if you are already paying $100 to $200 a month for Claude Max, the plan is meaningfully cheaper than API rates for the same Opus/Sonnet traffic until you saturate it. Spillover only makes financial sense as overflow: run the plan as your default, flip to OpenRouter the moment a plan bucket is about to 429, and flip back when the 5-hour rolling window resets. That cycle is what people actually want when they search for 'spillover'. The Anthropic plan cannot give you 'wait it out' for the rest of the week, but it does give you another full bucket every five hours.

Why does timing the flip matter? Why not flip AFTER the 429 fires?

Two reasons. One, a 429 mid-agent-loop usually leaves Claude Code holding partial tool outputs and a pending edit; restarting from the same checkpoint costs you a few minutes of context rebuild every time. Two, you do not know which of the eight server-side buckets fired the 429, so you cannot tell if waiting 5 hours unblocks you or if the wall is a 7-day bucket that will not reset until Sunday. Flipping at 95% utilization on five_hour, before the wall, avoids both: no half-finished turn, and you keep the option to fall back to plan if the next bucket up is healthy.

Why can ccusage not drive this for me?

ccusage reads the local ~/.claude/projects/<project>/<session>.jsonl files Claude Code writes to disk. That is local-truth: tokens that left your machine, priced against a model card. The plan caps live on Anthropic's servers as utilization fractions on /api/organizations/{org_uuid}/usage. They are different numbers, and ccusage cannot see the server one. ClaudeMeter reads the server one (the same JSON claude.ai/settings/usage renders) and exposes it as JSON, which is why a spillover hook can be driven by it. ccusage and ClaudeMeter are complementary; one tells you tokens-spent, the other tells you plan-quota-remaining.

What does the actual shell hook look like?

Twelve lines. Run claude-meter --json, jq the five_hour.utilization fraction, compare it against a threshold (0.95 is a sane default), and either export ANTHROPIC_BASE_URL=https://openrouter.ai/api or unset it. The full snippet is in the Spillover hook section above. Wire it as a direnv hook, a starship/zsh prehook, or run it on a 60-second cron and write to a file your shell sources. The schema you depend on is documented in src/models.rs lines 18 to 28 of the open-source repo, so the field names will not move on you silently.

Does OpenRouter rate-limit me too?

Yes, but on a different axis. OpenRouter's free models cap around 20 requests per minute and 200 per day per model. Paid OpenRouter credits raise that significantly and add automatic provider failover (if one Anthropic provider in their pool is throttled, OpenRouter routes your next request to a different one). For agent-loop spillover, you generally want paid OpenRouter credits, not free-tier; a single Claude Code session blasts past 200 requests/day on the first refactor.

Will OpenRouter pricing exactly match anthropic.com pricing?

On the Anthropic-routed providers in OpenRouter's pool, base model cost is the same per-token as Anthropic's API. OpenRouter takes its margin on top, which they publish per provider on each model page. So spilled-over traffic is a few percent more expensive than calling Anthropic's API directly with your own API key, and meaningfully more expensive per token than driving the same model through your plan. The point of the cycle is to use the plan as long as possible and only spill the overflow.

What about flipping back? How do I know my plan unlocked?

Same JSON, different field. claude-meter --json returns five_hour.resets_at as an ISO-8601 timestamp. Either compare it to now() in the shell hook ('if resets_at <= now and utilization < 0.5, flip back'), or just rerun the threshold check; the moment the 5-hour clock rolls, utilization on five_hour drops back near zero and the hook unsets ANTHROPIC_BASE_URL on its next tick. ClaudeMeter's menu bar shows the same number as a relative duration ('in 4h 12m'), so eyeballing it works too.

What happens to my claude.ai/settings/usage view while I am spilled over?

Nothing changes there. Spillover traffic flows through OpenRouter's API key, which is billed against your OpenRouter account, not your Claude plan. The plan-side buckets keep their utilization frozen until the rolling window rolls. So you can spill over for an hour, watch claude.ai/settings/usage stay still, and then spill back the moment your bucket comes off the wall. The two billing surfaces never cross.

Keep reading

Triage

Claude Max plan still hitting limits? It is eight buckets, not one

Why Max users still 429 mid-week. Eight independent server-side gates, the 60-second triage to find which one fired, and the right exit per gate.

Read

Internals

Claude Max weekly quota enforcement

The exact data path from server state to the BLOCKED string. How each gate decides whether the next prompt 429s, and where the resets_at clock comes from.

Read

Compare

ClaudeMeter vs ccusage

ccusage reads local Claude Code JSONL files and prices them against a model card. ClaudeMeter reads the server-truth plan quota Anthropic actually enforces.

Read