Claude Code session burn rate vs chat: same bucket, very different burn
Both surfaces deplete one shared org-scoped bucket on the server. What people miss: a Claude Code agentic loop drains that bucket roughly 5 to 10 times faster than the same minute spent in claude.ai chat. Below is the JSON field that proves it (seven_day_oauth_apps), along with the two endpoints that surface the split and the exact lines inside claude-meter that read both.
Direct answer (verified 2026-05-11)
Claude Code and claude.ai chat share the same five_hour and seven_day buckets on your Anthropic org. In our snapshots Code typically burns 5 to 10 times faster per wall-clock minute, because each agentic turn fans out into tool calls, file reads, and full-context replay. The server already tracks the split via the seven_day_oauth_apps sub-bucket on /api/organizations/{org}/usage, the same JSON claude.ai/settings/usage renders. ClaudeMeter pins both the Code-only view (/api/oauth/usage) and the combined view to the menu bar so the split is visible without doing the subtraction by hand.
Two surfaces, one bucket, two probes
The thing that confuses people is that there is one cap and two completely different code paths into it. claude-meter polls both paths because each one tells you something the other cannot.
How claude-meter sees the split
One bucket on the left. Two views into it on the right. The server decides the gate; we decide which lens to read it through.
A 6-minute window, side by side
Same account, same plan, same wall-clock window. On the left, a Claude Code refactor. On the right, a focused chat session reading and discussing the same code. Watch the 5-hour delta. This is the actual shape of the difference, not a synthetic benchmark.
The chat side moves the five_hour bucket by about a percentage point. The Code side pegs the same bucket inside the same 6 minutes. Same plan, same account, very different burn shape, because every tool call counts.
The numbers that survive the wall clock
Rough order-of-magnitude figures, from a typical Max-plan workday where the same engineer is using both surfaces. Your mileage varies (Opus vs Sonnet, file sizes, how many tests run per turn), but the shape is consistent across the accounts we have looked at.
Field-by-field: Code burn vs chat burn
Same bucket on the server. Different code paths in, different per-turn shape, different model-class skew. The columns below come from reading both endpoints in real time, not from a marketing deck.
| Feature | claude.ai chat (cookie path) | Claude Code (OAuth path) |
|---|---|---|
| Where the request is authenticated | Browser session cookies. claude.ai sets sessionKey on your domain after login; every request includes the cookie header. | OAuth bearer token stored by the Claude Code CLI in the macOS Keychain under service 'Claude Code-credentials'. Rotated automatically by the CLI. |
| Which endpoint reveals it | GET claude.ai/api/organizations/{org_uuid}/usage. Same call the in-app Settings > Usage page makes. | GET api.anthropic.com/api/oauth/usage. Different host, different gate (no Cloudflare bot check), same JSON shape. |
| Bucket it depletes | five_hour and seven_day on the org. Chat counts here alongside IDE and OAuth-app usage; the parent total is what the gate checks. | Same five_hour and seven_day, plus the seven_day_oauth_apps sub-bucket, which isolates the Claude Code (and other OAuth-app) slice of the weekly. |
| Typical burn per wall-clock minute | Slow. 1-3 prompts per minute, each one mostly outbound text. A 4-minute chat exchange might be 2-3 turns. | Fast. A 4-minute agentic refactor is closer to 20-40 turns. Each turn fans out into Read, Bash, Edit, type-check, replay context. |
| What dominates the cost | Reply length and any image or file you attached. Long Opus replies and PDF uploads are the biggest single hits. | Tool-call overhead and full-context replay. A Read on a 1500-line file is a thousands-of-tokens charge. Multi-turn loops re-pay context every turn. |
| How model class affects it | You usually pick a model and stay there. Opus burns roughly 5x faster per byte than Sonnet on the seven_day_opus sub-bucket. | Code defaults to Opus on Max more often than chat does. Same Opus-heavy weighting, but you trip it faster because the turn count is higher. |
| Is it visible in /usage inside Claude Code? | Yes, but the in-tool /usage is a snapshot of the OAuth view only. It does not show your chat-only spend. | Same view, same limitation. Both share the parent bucket; neither in-tool screen shows you the share split. |
| How ClaudeMeter shows it | Cookie path snapshot in the menu bar dropdown, broken out as seven_day all + seven_day_sonnet + seven_day_opus. | OAuth path snapshot in the same dropdown, broken out via seven_day_oauth_apps so you can subtract Code from the parent seven_day to estimate chat. |
The JSON that already knows the split
Anthropic does not expose a chat-vs-Code breakdown in the in-app settings page, but the server has already done the math. The response below is what the browser actually receives when the Settings > Usage page loads. Notice seven_day_oauth_apps sitting next to seven_day. Subtract one from the other and you have the chat share of the week.
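Here is a trimmed, illustrative version of that response. The field names are the ones claude-meter parses; utilization is shown as a 0-100 percentage and the timestamps are placeholders, so treat the shape as the point, not the exact encoding:

```json
{
  "five_hour":            { "utilization": 34, "resets_at": "2026-05-11T21:00:00Z" },
  "seven_day":            { "utilization": 68, "resets_at": "2026-05-14T09:00:00Z" },
  "seven_day_sonnet":     { "utilization": 22, "resets_at": "2026-05-14T09:00:00Z" },
  "seven_day_opus":       { "utilization": 74, "resets_at": "2026-05-14T09:00:00Z" },
  "seven_day_oauth_apps": { "utilization": 61, "resets_at": "2026-05-14T09:00:00Z" },
  "seven_day_cowork":     null
}
```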
On the snapshot above, the weekly bucket is at 68%; the OAuth-app (Code) slice is at 61%. That implies the chat-and-everything-else share of the week is about 7 percentage points. Code is roughly nine times the chat burn for this account this week. The server already knows; the Settings page just does not show it.
The two endpoints, the two clients
claude-meter is a Rust menu bar app, so this part is short. There are two endpoints, and each needs its own auth and its own HTTP client (claude.ai sits behind Cloudflare and requires Chrome fingerprint emulation; api.anthropic.com does not). The Code-only view:
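A minimal sketch of that probe, assuming an async reqwest client (with the json feature) and a bearer token already read from the Keychain. The real code in src/oauth.rs also handles token refresh and error mapping; UsageResponse is the struct shown a little further down:

```rust
use reqwest::Client;

// Sketch only: fetch the Claude Code (OAuth) view of the bucket.
// `token` is assumed to be the bearer token the CLI stores in the
// macOS Keychain under "Claude Code-credentials".
async fn fetch_code_usage(client: &Client, token: &str) -> Result<UsageResponse, reqwest::Error> {
    client
        .get("https://api.anthropic.com/api/oauth/usage")
        .bearer_auth(token)
        .send()
        .await?
        .error_for_status()?
        .json::<UsageResponse>()
        .await
}
```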
And the combined view, which counts chat plus Code plus IDE plus any other OAuth-app traffic on the org:
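And a sketch of the cookie-side probe. This is the half a bare HTTP client cannot fully reproduce: claude.ai sits behind Cloudflare, so the real client in src/api.rs goes through Chrome-fingerprint emulation. The snippet below only pins down the URL shape and the cookie header; org_uuid and session_key are placeholders:

```rust
use reqwest::Client;

// Sketch only: fetch the combined org view (chat + Code + IDE + other OAuth apps).
// A bare reqwest call like this may be challenged by Cloudflare; it shows the
// endpoint and auth, it is not a drop-in replacement for src/api.rs.
async fn fetch_org_usage(
    client: &Client,
    org_uuid: &str,
    session_key: &str,
) -> Result<UsageResponse, reqwest::Error> {
    let url = format!("https://claude.ai/api/organizations/{org_uuid}/usage");
    client
        .get(&url)
        .header("Cookie", format!("sessionKey={session_key}"))
        .send()
        .await?
        .error_for_status()?
        .json::<UsageResponse>()
        .await
}
```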
Both responses deserialize into the same struct, because the JSON shape is identical. That struct declares every sub-bucket as nullable, so when the server starts returning a new one (seven_day_cowork appeared mid-2026), the parser keeps working:
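A simplified version of that struct. The real definitions live in src/models.rs; the field types here are assumptions, the Option-everywhere nullability is the point:

```rust
use serde::Deserialize;

// Every sub-bucket is optional, so a field the server adds, renames, or drops
// becomes a missing menu-bar row instead of a deserialization error.
#[derive(Debug, Deserialize)]
pub struct UsageResponse {
    pub five_hour: Option<Window>,
    pub seven_day: Option<Window>,
    pub seven_day_sonnet: Option<Window>,
    pub seven_day_opus: Option<Window>,
    pub seven_day_oauth_apps: Option<Window>,
    pub seven_day_cowork: Option<Window>,
}

#[derive(Debug, Deserialize)]
pub struct Window {
    pub utilization: f64,          // assumed 0-100; rendered as a percent in the dropdown
    pub resets_at: Option<String>, // ISO-8601 timestamp, not present on every bucket
}
```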
What the menu bar dropdown actually says
On a Monday afternoon, mid-refactor, after a weekend of light chat use, the dropdown for the account looks like the run below. The implied chat slice is calculated as seven_day − seven_day_oauth_apps. That is the number that does not appear anywhere in claude.ai's own UI.
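Rendered as text it looks roughly like this. The opus and five-hour rows are the ones discussed below; the email, reset times, and remaining numbers are illustrative:

```
dev@example.com (Max)
  five_hour              81%   resets in 2h 10m
  seven_day              72%   resets Thu 09:00
  seven_day_sonnet       38%
  seven_day_opus         94%   resets Sat 11:00
  seven_day_oauth_apps   63%
  implied chat (7d)      ~9 pts   = seven_day − seven_day_oauth_apps
```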
The seven_day_opus row at 94% is the real failure mode here, not the oauth-apps row. The five_hour at 81% is climbing because the refactor is still going. If you switched to chat for the rest of the afternoon, oauth-apps would stop climbing, but seven_day would keep climbing because chat still adds to it. The right move on this snapshot is to drop the agent to Sonnet for the rest of the week so seven_day_opus bleeds back down.
Three burn-shape failure modes
The mid-loop wall. Five_hour climbs through 80, 90, 100 inside a 90-minute Code session. The 7-day bucket is still fine; you just sprinted. Either wait an hour for old prompts to age out, or accept that the cap is keeping you from looping faster than the cluster wants you to.
The Tuesday weekly wall on Opus. Five_hour is fine, seven_day_opus is at 100. Opus weighting is heavier per byte and Code defaults to Opus on Max. Resets_at on seven_day_opus is five days out. Switch the agent to Sonnet today and the seven_day_opus float bleeds down over 24 hours while Sonnet picks up the slack.
The phantom “chat is fine” bias. People assume chat is cheap because the bar barely moves when they type. That holds for typing. It does not hold for uploads. One claude.ai upload of a PDF deck or a screenshot of a repo tree can be five percentage points on the 5-hour in one click. Most of the chat surprises come from attachments, not from prompts.
Why no local-log tool can answer this question
ccusage, ccburn, Claude-Code-Usage-Monitor all walk ~/.claude/projects/*.jsonl and total token counts. That is excellent for cost attribution per pull request and per project. It cannot answer the chat-vs-Code question because:
- chat usage never writes to that path. claude.ai does not put anything on your disk. So the chat side is invisible to any tool that reads JSONL.
- tool-call cost is weighted on the server and the weighting is not in the local file. A 1500-line file Read is one JSONL row plus a token total; the bucket-cost weighting on top of that lives only in the server response.
- peak-hour multiplier on the five_hour (Anthropic March 2026 statement: weekdays 5 to 11 a.m. Pacific) shifts the wall-clock cost of the same byte. Local files have no clock.
- seven_day_opus weighting charges Opus heavier per byte than Sonnet against the weekly. JSONL files record tokens, not the model-class weighting.
Run ccusage when you need cost-per-PR. Run claude-meter when you need to see which surface is eating your week. They answer different questions; on a Code-heavy week the answers can drift by 30 to 40 points and both are correct.
The honest caveat
The /api/oauth/usage and /api/organizations/{org}/usage endpoints are both undocumented. Anthropic can rename a field, split a bucket, or change the weighting at any time. claude-meter declares every nullable field as Option in Rust so a missing field downgrades to a missing row rather than a crash. macOS 12+ only today; Safari extension is not supported. The repo is open at github.com/m13v/claude-meter if you want to see exactly which requests it makes on your behalf. Install is one brew install --cask m13v/tap/claude-meter.
Want me to look at your Code-vs-chat split with you?
15 minutes. Share your screen, walk me through a typical week, and I will show you which surface is actually eating your seven_day and what to switch so you stop blowing through the wrong bucket.
Frequently asked questions
Do Claude Code and claude.ai chat draw from the same usage bucket?
Yes. Both deplete the same org-scoped buckets on the server. The five_hour utilization and the seven_day utilization on /api/organizations/{org_uuid}/usage count every prompt your account makes, whether it came in over the cookie-authenticated browser chat or over the OAuth token Claude Code uses. The seven_day_oauth_apps sub-bucket isolates the OAuth slice, but seven_day itself is the sum. There is no separate quota for Code, and there is no way on a Pro or Max plan to give Code its own pool.
How much faster does Claude Code burn the bucket than chat?
Roughly 5 to 10 times faster per wall-clock minute on a typical agentic loop, in the workloads we have measured. A 4-minute chat exchange (write a prompt, read a 500-token reply, send a follow-up) is maybe 2-3 messages. A 4-minute Claude Code loop on a real refactor is closer to 20-40 turns, each one fanning out into file reads, ripgrep, edits, type-check runs, and a multi-turn replay of the conversation context. Same wall clock, an order of magnitude more units against the same shared bucket.
Why does Claude Code burn so much more than typing in chat?
Three multipliers stack. First, agentic loops produce many short turns where chat produces a few long ones, and each turn carries the full context window forward. Second, tool calls (Read, Bash, Edit) and large file reads are billed as input tokens; a single Read on a 1500-line file is a thousands-of-tokens charge that a chat session never makes. Third, Opus weighting on the seven_day_opus sub-bucket is heavier per byte than Sonnet, and Code defaults to Opus on the Max plan more often than chat users explicitly switch to it.
Can I see Claude Code burn rate separately from chat in claude-meter?
Yes. The menu bar app polls two endpoints in parallel: api.anthropic.com/api/oauth/usage with the Claude Code OAuth token from your keychain (src/oauth.rs:114), and claude.ai/api/organizations/{org_uuid}/usage with your browser session cookies (src/api.rs:19). The OAuth endpoint shows only Claude Code traffic; the cookie endpoint shows the combined total plus a seven_day_oauth_apps sub-bucket that backs out the Code slice. Subtract one from the other and you have your chat burn.
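In code terms the subtraction is a one-liner over the struct from src/models.rs; a hypothetical helper (not a function in the repo) makes the shape obvious:

```rust
// Hypothetical helper: the chat share of the weekly bucket is whatever the
// OAuth-apps slice does not account for. Returns None if either row is missing.
fn implied_chat_share(combined: &UsageResponse) -> Option<f64> {
    let total = combined.seven_day.as_ref()?.utilization;
    let code = combined.seven_day_oauth_apps.as_ref()?.utilization;
    Some((total - code).max(0.0))
}
```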
What is seven_day_oauth_apps and why does it exist?
It is one of the sibling Window structs on the same JSON the settings page renders. claude-meter parses it in src/models.rs as Option<Window> alongside five_hour, seven_day, seven_day_sonnet, seven_day_opus, and seven_day_cowork. The OAuth-apps row covers every OAuth-token client (Claude Code is the main one, plus any third-party app that authenticates via the Anthropic OAuth flow). The reason it is broken out separately is that the server already knows which side of the wall the request came from; it just does not put that breakdown anywhere in the in-app UI.
Does ccusage show Claude Code burn rate?
It shows the local token count, which is a token-flow measurement, not a bucket-drain measurement. ccusage walks ~/.claude/projects/*.jsonl, sums input + output tokens, and divides by elapsed minutes. That number is real, but it does not match what the rate limiter checks. The server weights utilization by per-tool-call cost, per-attachment cost, peak-hour multiplier (per Anthropic's March 2026 statement on weekday 5 to 11 a.m. Pacific throttling), and per-model factor (Opus heavier than Sonnet). And ccusage cannot see your chat usage at all because chat never writes to that path. So ccusage shows Code's tokens, not Code's burn against the bucket Anthropic enforces.
If chat barely moves the needle, why am I hitting the wall on chat days?
Two reasons. One, attachments. A single claude.ai upload of a PDF, screenshot, or repo dump is a big chunk of input tokens, and the rolling 5-hour weighting includes that. Two, peak hours. Anthropic stated in March 2026 that the 5-hour window is adjusted during weekdays 5 to 11 a.m. Pacific. The same chat session at 10 a.m. on a Tuesday fills the bucket faster than the same chat at 11 p.m. on a Saturday. Burn rate is wall-clock-dependent for chat in a way it is not for Code.
Where is the difference visible in the menu bar dropdown?
Open the dropdown and you see two rows per signed-in account. The five_hour row is the rolling burst cap and reflects the sum of chat and Code over the last 5 hours. The seven_day_oauth_apps row, if your plan exposes it, is the Code-only slice of the weekly. Subtracting seven_day_oauth_apps from seven_day gives you your chat slice. On a Max plan you also get seven_day_sonnet and seven_day_opus, which is the same JSON sliced by model class instead of by surface, so you can cross-check.
I run Claude Code on two machines. Does each one have its own burn rate?
No. The bucket is org-scoped. Two laptops logged into the same Anthropic account share one five_hour and one seven_day. If laptop A is running an Opus refactor at 78 percent five-hour, laptop B starts at 78 percent five-hour. claude-meter dedupes by account_email in dedupe_by_account so the menu bar shows one row per account, not one row per machine, and that mirrors how the server treats them.
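For the curious, that dedupe amounts to something like the sketch below, assuming a per-probe snapshot type with an account_email and a fetch timestamp; the real dedupe_by_account may differ in its details:

```rust
use std::collections::HashMap;
use std::time::SystemTime;

// Assumed shape: one snapshot per signed-in account per machine/probe.
struct AccountSnapshot {
    account_email: String,
    fetched_at: SystemTime,
}

// Keep one row per account_email, preferring the freshest snapshot,
// which mirrors how the server treats two machines: as one org bucket.
fn dedupe_by_account(snapshots: Vec<AccountSnapshot>) -> Vec<AccountSnapshot> {
    let mut latest: HashMap<String, AccountSnapshot> = HashMap::new();
    for snap in snapshots {
        let keep = latest
            .get(&snap.account_email)
            .map_or(true, |kept| snap.fetched_at > kept.fetched_at);
        if keep {
            latest.insert(snap.account_email.clone(), snap);
        }
    }
    latest.into_values().collect()
}
```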
Will switching to chat for a while let the Code bucket cool down?
It cools the OAuth-apps slice (because nothing is adding to it) but it does NOT cool the parent seven_day or five_hour buckets if you keep chatting, because both surfaces deplete the same parent. The real way to give Code room is to stop using both Code and chat, let the rolling window age out, or switch the Code agent to Sonnet for the rest of the week so the seven_day_opus sub-bucket bleeds down while the Code burn continues at a lower per-byte weight.
Keep reading
Rolling 5-hour vs weekly quota: two caps, one JSON
The two caps sit as sibling fields on the same response. Either one independently 429s your loop. Which one bites depends on plan tier and how Opus-heavy your week is.
Burn rate as Δu/Δt, not tokens per minute
Why the only honest burn rate is the change in server utilization over time, and why ccusage's tokens/minute is the wrong number for predicting when you hit the wall.
ClaudeMeter vs ccusage
ccusage walks local JSONL files and totals tokens. ClaudeMeter polls server quota. Different questions, both useful, run both on Code-heavy weeks.