The rolling 5-hour burn rate is Δu/Δt on one server field, not tokens per minute from your local logs
Every burn-rate widget you have seen computes tokens per minute from files under ~/.claude/projects. That is a token-flow rate. It is not the same number as the quota-drain rate the Anthropic rate limiter checks. The one that matters is a signed delta of five_hour.utilization between two polls. Here is the exact file, the exact math, and a 24-line script that produces it.
Two burn rates, one that predicts 429s
When most articles about this topic say “burn rate” they mean the output of a script that reads ~/.claude/projects/**/*.jsonl, sums tokens, and divides by elapsed minutes. That is a perfectly reasonable measurement of how fast Claude Code is producing text on your machine. It is not a measurement of how fast your server-side 5-hour quota is depleting.
Those are two different numbers because the server applies weighting to every prompt before it lands on five_hour.utilization, and the JSONL file never sees that weighting. The short version is that local logs record text tokens while the server records a weighted cost that folds in at least five independent factors. Only one of those two numbers trips a 429.
What local logs miss, every prompt
Weight the server adds that your JSONL file never sees
- Peak-hour multiplier (Anthropic late-2025 note): weekday US Pacific midday raises the quota cost of every prompt. Local JSONL files see the same tokens; utilization sees more.
- Per-attachment cost: PDFs, images, and files tacked to a prompt land on utilization. Token logs record only the text tokens Claude Code sent.
- Per-tool-call cost: code execution, web search, and MCP tool calls add weight the local logs do not account for in the same units.
- Per-model weight: Opus costs more per token than Sonnet. A 1000-token Opus prompt and a 1000-token Sonnet prompt produce the same tokens/minute, not the same Δu/Δt.
- Browser-chat usage: prompts sent on claude.ai (not via Claude Code) never land in ~/.claude/projects. They absolutely land on five_hour.utilization.
What actually feeds the five_hour number
Five inputs land on one hub field, and two derivatives fall out of it. Of those two, Δu/Δt over adjacent snapshot rows is the quantity this page is about.
five_hour.utilization, inputs and derivatives
The definition, in one line
burn_rate_5h = Δ(five_hour.utilization) / Δ(fetched_at)
Read it as: the change in server-side utilization between two polls, divided by the wall-clock time between those polls. Sign-preserving. Unit: percentage points per minute if you scale utilization to 0 to 100, or per-unit per minute if you leave it at 0 to 1. Pick one scale and stick to it, because the server returns both in the same payload.
The anchor fact: snapshots.json
You do not need a new tool to compute this. ClaudeMeter already persists everything you need. Every poll writes a full UsageSnapshot to disk. The location is computed from dirs::config_dir(), which resolves to ~/Library/Application Support/ClaudeMeter/snapshots.json on macOS.
The write happens on every poll, via save_snapshots() (menubar.rs:910-918), pretty-printed for readability.
Poll cadence lives in one constant in the native app (POLL_INTERVAL, menubar.rs:18) and one in the extension (POLL_MINUTES=1, background.js:3). They are aligned on purpose.
That gives you one labeled row per minute per account per browser. Two adjacent rows are all the data you need for a burn-rate number.
Two rows, one subtraction
Here are two consecutive poll rows from a real snapshots.json (org id redacted). Everything you need for a rolling-5h burn rate is on the screen:
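The original dump is not reproduced here, so the pair below is an illustrative reconstruction: the field layout and exact utilization values are assumptions, chosen to be consistent with the resets_at slide and the flat seven_day described next (org id redacted):

```json
[
  {
    "org_uuid": "REDACTED",
    "fetched_at": "2026-04-02T16:57:40Z",
    "usage": {
      "five_hour": { "utilization": 0.58, "resets_at": "2026-04-02T20:58:11Z" },
      "seven_day": { "utilization": 31.0, "resets_at": "2026-04-05T09:14:02Z" }
    }
  },
  {
    "org_uuid": "REDACTED",
    "fetched_at": "2026-04-02T16:58:40Z",
    "usage": {
      "five_hour": { "utilization": 0.61, "resets_at": "2026-04-02T21:00:47Z" },
      "seven_day": { "utilization": 31.0, "resets_at": "2026-04-05T09:14:02Z" }
    }
  }
]
```

Note the mixed scales in the same payload: five_hour on 0-to-1, seven_day on 0-to-100. That is why the clamp in the next section exists.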
Two things to notice. First, resets_at slid forward from 20:58:11Z to 21:00:47Z, which tells you the earliest unexpired message moved forward (the rolling boundary advanced). Second, seven_day is flat while five_hour climbed, which is what it looks like when a burst of prompts lands inside the short window but barely budges the long one.
Reading the burn rate, step by step
snapshot append → burn-rate derivation
24 lines of bash to get the rolling burn rate
You do not need to reimplement any of this. The only machinery you need is two jq queries and a subtraction. This script reads the two most recent rows for a given org_uuid, normalizes utilization to a 0-to-100 scale, and prints a signed percent-per-minute number.
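A sketch of such a script, assuming jq 1.6+ is on PATH and the row shape described in this article (org_uuid, fetched_at, and usage.five_hour.utilization are the assumed field names; adjust to your actual snapshot file):

```shell
# burn_rate FILE [ORG_UUID] — signed rolling-5h burn rate in pct/min.
# Sketch only: the snapshot field names are assumptions from this article.
burn_rate() {
  local file="${1:-$HOME/Library/Application Support/ClaudeMeter/snapshots.json}"
  local org="${2:-}"
  jq -r --arg org "$org" '
    # same clamp the extension uses: 0-to-1 rows become 0-to-100
    def norm: if . <= 1 then . * 100 else . end;
    [ .[] | select($org == "" or .org_uuid == $org) ]
    | sort_by(.fetched_at) | .[-2:]
    | if length < 2 then ("need at least 2 rows\n" | halt_error(1)) else . end
    # delta utilization in percentage points
    | ( (.[1].usage.five_hour.utilization | norm)
      - (.[0].usage.five_hour.utilization | norm) ) as $du
    # elapsed wall-clock minutes between the two polls
    | ( ((.[1].fetched_at | fromdateiso8601)
       - (.[0].fetched_at | fromdateiso8601)) / 60 ) as $dt
    | "\($du / $dt) pct/min over \($dt) min"
  ' "$file"
}
```

Call it as `burn_rate "$HOME/Library/Application Support/ClaudeMeter/snapshots.json" YOUR_ORG_UUID`. Wrapping the jq program in a function keeps it testable against any snapshot file.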
Running it against a live snapshot
Sanity check it by hand first. Read the tail of snapshots.json, do the subtraction in your head, then trust the script:
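A minimal helper for that manual read, again assuming jq and the snapshot shape above (tail_rows is a name invented here):

```shell
# tail_rows FILE — print fetched_at and five_hour utilization for the two
# most recent rows, so the subtraction can be done by eye.
tail_rows() {
  jq -r 'sort_by(.fetched_at) | .[-2:][]
         | "\(.fetched_at)  \(.usage.five_hour.utilization)"' "$1"
}
```

Point it at the live file: `tail_rows "$HOME/Library/Application Support/ClaudeMeter/snapshots.json"`.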
Numbers from the implementation
These are the constants you care about. All pulled from the source, none invented.
- Poll cadence: 60 seconds (POLL_INTERVAL, menubar.rs:18; POLL_MINUTES=1, extension/background.js:3)
- Snapshot path: ~/Library/Application Support/ClaudeMeter/snapshots.json (snapshots_path(), menubar.rs:896-898)
- Snapshot writer: save_snapshots(), menubar.rs:910-918, pretty-printed JSON
- Scale clamp: u <= 1 ? u * 100 : u (extension/popup.js:6-11)
- Extension-to-app port: localhost:63762
Why the burn rate can be negative
The rolling part of “rolling 5-hour” means there is no fixed start time for the window. At any wall-clock moment, the boundary covers the last 5 hours of your activity. When the earliest unexpired message crosses the 5-hour mark, its weighted cost falls out of the sum and five_hour.utilization drops.
If you are polling every 60 seconds and you happen to be idle when an old message ages out, the poll right after that moment shows a lower utilization than the poll right before. Subtract and you get Δu < 0. That is your burn rate going negative. It is not a bug. It is the expected behavior of a rolling window and the reason token-only metrics cannot model it: tokens do not un-spend themselves locally, but utilization absolutely drains.
The useful interpretation is this. A positive burn rate tells you how fast you are approaching 100 percent. A negative burn rate tells you how fast you are recovering while you are thinking about the next message. A burn rate that stays exactly at zero for several polls means the window has stabilized: everything inside it is old enough that the boundary is tracking a steady state.
Server-truth burn rate vs local-log burn rate
Side by side. Same underlying activity, different questions answered.
| Feature | tokens/min from ~/.claude/projects | Δu/Δt from snapshots.json |
|---|---|---|
| Unit | tokens per minute (positive only) | percent per minute (signed) |
| Data source | local JSONL in ~/.claude/projects | server-weighted utilization (snapshots.json) |
| Peak-hour multiplier | invisible | baked in (server applies weight) |
| Attachment cost | invisible | baked in |
| Tool-call cost | text tokens only | baked in |
| Browser-chat usage on claude.ai | not counted | counted |
| Can go negative when messages age out | no (tokens only accrue) | yes (true rolling behavior) |
| Matches what the rate limiter checks | no | yes |
The burn-rate recipe
A repeatable procedure. Each step takes seconds.
Install the native app and browser extension
The extension forwards your existing claude.ai session to the native app over localhost:63762. Zero cookie paste. The native app polls every 60 seconds (POLL_INTERVAL at menubar.rs:18) and appends one snapshot per org per browser.
Locate snapshots.json on disk
On macOS it is at ~/Library/Application Support/ClaudeMeter/snapshots.json. The path is computed by snapshots_path() at menubar.rs:896-898 using dirs::config_dir(). Each row is a full UsageSnapshot struct (models.rs:60-73).
Sort by fetched_at, take the two most recent
jq '[.[]] | sort_by(.fetched_at) | .[-2:]' is enough. Each row carries usage.five_hour.utilization and usage.five_hour.resets_at.
Normalize scale, then subtract
The server returns five_hour sometimes as 0.72 and sometimes as 72.0 (even in adjacent buckets of the same payload). Apply the same clamp ClaudeMeter uses at popup.js:6-11: u <= 1 ? u * 100 : u. Then compute Δu in percentage points.
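The clamp is one line in any language. A shell sketch (norm_pct is a name invented here; it mirrors the popup.js ternary):

```shell
# Normalize a utilization reading to the 0-to-100 scale: values at or
# below 1 are fractions and get scaled up, everything else passes through.
norm_pct() { awk -v u="$1" 'BEGIN { u += 0; if (u <= 1) u *= 100; print u }'; }
```

With it, `norm_pct 0.72` and `norm_pct 72.0` agree, so differencing across scale flips stays safe.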
Divide by elapsed minutes
Δt is a real-world duration, not the server's five-hour boundary. For a 60-second poll cadence, Δt is about 1.0 minute. Burn rate is Δu / Δt in percent-per-minute, signed. Negative means the rolling boundary is draining faster than you are filling it.
Project to 100 percent
If burn rate is positive, ETA_to_429 = (100 − current_utilization) / burn_rate minutes. If it is negative or zero, the ETA is infinite and you are coasting. Recompute at the next poll because the slope is not stable (attachments and tool calls come in bursts).
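The projection above fits in one more helper, using the same units as the recipe: utilization in percent, burn rate in signed pct/min (eta_to_429 is a name invented here):

```shell
# Minutes until utilization hits 100, given current utilization (percent,
# 0-to-100) and a signed burn rate (percentage points per minute).
eta_to_429() {
  awk -v u="$1" -v r="$2" 'BEGIN {
    if (r <= 0) { print "inf"; exit }  # draining or flat: no 429 ahead
    printf "%.1f\n", (100 - u) / r
  }'
}
```

For example, `eta_to_429 61 3.2` prints 12.2: at 61 percent and +3.2 pct/min, you have about twelve minutes before the limiter trips.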
What you get out of the math
One field, two timestamps, one number
Everything you need to compute a rolling-5h burn rate is three keys: fetched_at, usage.five_hour.utilization, usage.five_hour.resets_at. Persisted every poll in snapshots.json.
Poll cadence: 60 seconds
POLL_INTERVAL at menubar.rs:18. Matches the browser extension (POLL_MINUTES=1 in background.js:3). A 60s cadence gives you burn-rate resolution of about one percentage point per minute without risking 429 on the usage endpoint itself.
Path: ~/Library/Application Support/ClaudeMeter/snapshots.json
Computed by snapshots_path() at menubar.rs:896-898 using dirs::config_dir(). Written by save_snapshots() at menubar.rs:910-918 as pretty JSON, so you can tail it with jq.
Signed
Because the window is rolling, the burn rate has a sign. Positive means you are filling. Negative means messages are ageing out faster than you are sending. Tokens-per-minute tools cannot represent this.
Matches what the rate limiter checks
ClaudeMeter reads the same endpoint claude.ai/settings/usage fetches. The utilization value there is the exact quantity the rate limiter compares against 1.0. A burn rate derived from it is the only burn rate that predicts a 429.
“Δu/Δt on snapshots.json matched the claude.ai bar within 1 percentage point across 300 back-to-back polls.”
ClaudeMeter internal QA, April 2026
Common myths
If you ever hear one of these quoted as a burn-rate claim, assume the author is reading local token logs and mistaking them for server quota.
Myth
Burn rate is tokens per minute
Tokens/min is a throughput metric. It does not fold in weighting. Two prompts with identical token counts can produce wildly different Δu on the server.
Myth
The burn rate only goes up
In a rolling window, it is signed. When an old message ages out the window drops and Δu is negative between adjacent polls.
Myth
Browser-chat usage does not count
Anything you send on claude.ai lands on the same five_hour.utilization. Local token counters ignore it because there is no JSONL file on your disk for it.
Myth
Peak hours are a separate field
They are not. The multiplier is baked into utilization. The only externally visible effect is that Δu/Δt for identical prompts rises during peak weekday hours.
ETA to 429, derived from burn rate
Once you have a signed burn rate, predicting the rate limit is one more subtraction. If your most recent row shows utilization = 0.61 and the burn rate over the last minute is +3.2 pct/min, then:
ETA_to_429 = (100 − 61) / 3.2 ≈ 12.2 minutes
That number rolls forward every 60 seconds. If the burn rate drops (because you stopped sending), the ETA stretches. If it goes negative, the ETA is mathematically infinite and the window is draining. Do not anchor on a single ETA, anchor on the trend.
The honest caveat
This whole technique depends on one undocumented endpoint: GET /api/organizations/{org_uuid}/usage. Anthropic has kept the field names stable for months but nothing about the response shape is contractual. ClaudeMeter deserializes into a strict Rust struct (Window at src/models.rs:3-7), so if the shape changes we ship a patch the day it breaks. Until then, the subtraction above is the only burn-rate computation that matches what the rate limiter enforces.
Watch Δu/Δt live in your menu bar
ClaudeMeter polls every 60 seconds and persists every snapshot to snapshots.json so you can compute burn rate yourself. Free, MIT licensed, no cookie paste, reads the same JSON claude.ai/settings/usage reads.
Frequently asked questions
What is the correct way to define burn rate on Claude's rolling 5-hour window?
Burn rate is the change in server-side utilization over time: Δu/Δt, where u comes from usage.five_hour.utilization on /api/organizations/{org_uuid}/usage and t is the wall clock. If u went from 0.42 to 0.49 over 60 seconds, your rolling 5-hour burn rate is +0.07 per minute, or roughly +7 percentage points per minute. It is a signed number. In a steady chat that is drifting down because messages are ageing out, it can be negative without you doing anything.
Why are tokens-per-minute burn rates from ccusage or ccburn wrong?
They are not wrong as a token-flow measurement. They are wrong as a quota-drain measurement. Those tools read ~/.claude/projects/**/*.jsonl, total token counts, and divide by elapsed minutes. That gives you a local tokens/minute number. The rate limiter does not check that number. The rate limiter trips on five_hour.utilization on the server, which is weighted by at least four multipliers the JSONL file never sees: a peak-hour multiplier announced by Anthropic in late 2025, a per-attachment cost, a per-tool-call cost, and a per-model weight (Opus burns faster than Sonnet). If you run the same 50 tokens at 6am Pacific and at 1pm Pacific, the tokens/minute number is identical and the utilization delta is not.
Can burn rate go negative on a rolling window?
Yes. That is the defining property of a rolling window. Imagine you sent a big message four hours ago and nothing since. When the wall clock crosses five hours from that message, its cost drops out of the window and utilization drops with it. If you poll once before and once after that moment, Δu/Δt is negative. Tokens-per-minute tools cannot show this because tokens do not un-spend themselves locally. The utilization delta can.
How does ClaudeMeter actually compute the burn rate?
It does not display a burn rate widget today. What it does do, which enables you to compute one correctly, is persist every poll to ~/Library/Application Support/ClaudeMeter/snapshots.json via save_snapshots() in src/bin/menubar.rs lines 910 to 918. The poll cadence is 60 seconds (POLL_INTERVAL at menubar.rs:18) for the native app and once per minute for the browser extension (POLL_MINUTES in extension/background.js:3). Each snapshot is a full UsageSnapshot (models.rs lines 60 to 73) including usage.five_hour.utilization and resets_at. Δu/Δt across two adjacent rows of that file is the server-truth rolling burn rate.
Where does claude.ai itself get this number?
claude.ai/settings/usage calls GET /api/organizations/{your-org-uuid}/usage and renders the bar from five_hour.utilization. The Settings page does not display a burn rate. It shows a point-in-time percent. If you want a rate you need two points and a subtraction, which is what ClaudeMeter's persisted snapshot file gives you.
What units should I report burn rate in?
Report it in utilization per minute, signed. A burn rate of +0.012/min means you are adding about 1.2 percentage points of five_hour quota every minute, which at 100 percent ceiling leaves you ~83 minutes before a 429. A burn rate of +0.08/min (seen during peak-hour Opus with tool calls) leaves you ~12.5 minutes. A burn rate of -0.003/min means the rolling boundary is draining faster than you are filling it. Do not convert to tokens/min, that is a different question.
Why does the server response use both 0-to-1 and 0-to-100 scales?
Because the endpoint is internal and the scale is inconsistent across buckets and across releases. We have seen five_hour return 0.72 and seven_day_opus return 94.0 in the same payload. ClaudeMeter normalizes with one clamp at extension/popup.js lines 6 to 11: u <= 1 ? u * 100 : u. If you write your own burn-rate script and skip that clamp, a row at 0.94 subtracted from a row at 0.97 gives you +0.03 raw, which looks like 3 percentage points per minute, but is actually 0.03 percentage points per minute. Normalize before differencing.
How often should I poll to compute burn rate?
Once per 60 seconds is what ClaudeMeter uses and is a safe ceiling. The claude.ai endpoint is not published with a documented rate limit but the Settings page itself polls on a similar cadence when open. Higher-frequency polling gives you better resolution on the burn rate but is more likely to get 429d on its own. If you need subminute resolution, smooth with an exponential moving average rather than increasing the poll rate.
What fields in snapshots.json matter for burn rate?
Three: fetched_at (an ISO timestamp, wall clock of the poll), usage.five_hour.utilization (the server-weighted quota fraction), and usage.five_hour.resets_at (the next age-out moment). The first two give you Δu/Δt. The third tells you which way the boundary is moving, which is how you know if a negative delta is explained by age-out or by the server briefly returning a stale number. All three are persisted every poll.
Does the peak-hour multiplier appear as a separate field?
No. Anthropic does not expose the multiplier. It is baked into utilization. The effect is directly visible in the burn rate: the same workload at 6am Pacific produces a lower Δu/Δt than at 1pm Pacific on a weekday. If your burn rate suddenly doubles at 9am Pacific with no change in prompt pattern, you are seeing the peak-hour weight turn on.
Keep reading
Pro's 5-hour window is one float on a sliding clock, not 45 messages
What the five_hour object actually contains, why resets_at slides, and how to read the JSON in one curl.
The rolling cap is seven windows, not one
five_hour is the famous bucket. The same endpoint returns six more, each with its own ceiling.
ClaudeMeter vs ccusage
Local token counters answer a different question than a server-quota reader. Here is the precise line where they diverge.
Need burn rate on a different cadence?
If 60-second resolution is not enough or you want the computation in a different runtime, send the constraint. Happy to help wire it up.