Server truth vs local Claude logs: the five weights your JSONL cannot see
Your local Claude Code logs live in ~/.claude/projects/<project>/<session>.jsonl. Each row records input and output tokens for one assistant turn. Anthropic does not rate-limit on that number. It rate-limits on a weighted server-side fraction that includes five things the JSONL never sees. That is why ccusage can sit at 5% while claude.ai answers your next prompt with a 429. Here is the gap, named.
Direct answer (verified 2026-05-19)
Local JSONL holds raw token counts per turn on one machine. Server truth lives at claude.ai/settings/usage (the page calls /api/organizations/{org_uuid}/usage) and is already weighted by peak-hour multiplier, attachments, tool calls, model class, and browser-chat usage. ccusage and the server disagree because they measure different things, not because one is broken. For the rolling 5-hour window, the float Anthropic enforces is five_hour.utilization in that JSON; no local token count stands in for it.
Two files, two stories
The cleanest way to understand the gap is to put the two payloads on the same screen. Same Claude account, same Tuesday afternoon.
A JSONL row records one assistant turn, and ccusage sums those rows across the day. The server payload for the same hour is what claude.ai actually rate-limits against.
The JSONL row reports 4,000 tokens. The server reports 94% of the rolling 5-hour bucket used. Those are not two estimates of the same quantity. They are two quantities. Both are correct.
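To make the distinction concrete, here is a minimal Python sketch that reads both kinds of payload. The sample values are invented to match the numbers quoted above. The row layout (usage nested under message) and the 0-to-1 scale of utilization are assumptions, not confirmed shapes.

```python
import json

# Hypothetical JSONL content: one row for one assistant turn. Only
# input_tokens and output_tokens are named in the text; the nesting under
# "message" -> "usage" is an assumption.
SAMPLE_JSONL = json.dumps(
    {"message": {"usage": {"input_tokens": 3200, "output_tokens": 800}}}
) + "\n"

# Hypothetical server payload; a 0..1 scale for utilization is assumed.
SAMPLE_SERVER_JSON = json.dumps(
    {"five_hour": {"utilization": 0.94, "resets_at": "2026-05-19T21:00:00Z"}}
)


def local_token_sum(jsonl_text: str) -> int:
    """Sum input_tokens + output_tokens over every row that carries usage."""
    total = 0
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        usage = json.loads(line).get("message", {}).get("usage", {})
        total += usage.get("input_tokens", 0) + usage.get("output_tokens", 0)
    return total


def server_utilization(server_json: str) -> float:
    """Read the enforced float directly; no token arithmetic is involved."""
    return float(json.loads(server_json)["five_hour"]["utilization"])


tokens = local_token_sum(SAMPLE_JSONL)
fraction = server_utilization(SAMPLE_SERVER_JSON)
print(f"local: {tokens} tokens, server: {fraction:.0%} of the 5-hour bucket")
```

Nothing in the first function can produce the second number: one is a count of tokens on disk, the other is a fraction the server has already computed.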
The five weights
Every one of these adds to five_hour.utilization on the server and writes nothing to the local JSONL. Any single one can push you to 94% while the local sum still reads 5%.
Five things your JSONL does not see
- Peak-hour multiplier. Weekday 5 to 11 a.m. Pacific fills five_hour faster. Same prompt count, different fraction.
- Per-attachment cost. One PDF upload can move the server bucket more than a hundred plain prompts. JSONL records the prompt, not the upload weight.
- Per-tool-call cost. Code execution and web browsing each carry server-side cost on top of tokens. ccusage counts the surrounding tokens, not the tool.
- Per-model weight. Opus burns faster than Sonnet for the same byte count. On Max, seven_day_opus is the sub-bucket that 429s you on Tuesday afternoons.
- claude.ai browser chat. Any prompt you sent in a browser tab on the same account hits the same bucket and never writes to ~/.claude/projects.
Stack two or three of them on the same hour and the gap goes from cosmetic to job-ending. A 9 a.m. Pacific Tuesday with three PDF attachments and a few web-browsing tool calls in Opus can put your server bucket at the wall while every local readout swears you are fine.
Side by side
Local JSONL on the left, server JSON on the right. Same plan, same account, same minute.
| Feature | Local Claude logs (~/.claude/projects/*.jsonl) | Server truth (claude.ai/api/.../usage) |
|---|---|---|
| What it measures | input_tokens + output_tokens per assistant turn, summed off disk. | five_hour.utilization, a weighted fraction the server enforces. |
| Where the data lives | ~/.claude/projects/<project>/<session>.jsonl on your laptop. One file per session. | claude.ai/api/organizations/{org_uuid}/usage. One JSON, your session cookie. |
| Peak-hour weighting | Invisible. ccusage has no field for it. A 9 a.m. Pacific weekday prompt looks identical to a 2 a.m. one. | Built in. Anthropic confirmed in March 2026 that the 5-hour bucket fills faster on weekday 5 to 11 a.m. Pacific. |
| Attachments and tool calls | JSONL has no per-attachment cost and no per-tool-call cost. A 40-page PDF upload writes the same row as a one-line prompt. | Server applies both, then folds them into the utilization float. |
| Per-model weight | Token counts are raw. Opus and Sonnet rows look the same. | Opus burns the bucket faster than Sonnet for the same prompt. seven_day_opus is a separate sub-bucket on Max plans. |
| claude.ai browser chats | Never land in JSONL. ccusage cannot see them at all. | Counted in five_hour and seven_day. Same account, same bucket. |
| Predicts the next 429 | Only if you are a Claude-Code-only user, never browse claude.ai, never attach files, never use tools, and never prompt at peak hours. | Yes. The float you read is the float the rate limiter checks. |
One HTTPS request per minute
You do not need to invent a new metric or re-implement the weighting. The float Anthropic enforces is already in the JSON the settings page reads. The claude-meter extension requests that JSON with your own cookies, once a minute, from a background script whose relevant part is eight lines long.
That is the entire trick. POLL_MINUTES = 1 at line 3. One fetch with credentials: "include" so your existing claude.ai session ships the cookie. The response is posted to a localhost-only bridge so the macOS menu bar app can render it without a second round-trip. No telemetry, no manual cookie paste, no token-cost approximation.
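The fetch-and-forward pattern can be sketched in Python with the network calls injected as plain functions. This is not the extension's background script: fetch_usage and post_to_bridge are hypothetical stand-ins for the cookie-backed fetch and the localhost bridge, and only POLL_MINUTES = 1 comes from the text.

```python
import time
from typing import Callable, Optional

POLL_MINUTES = 1  # the interval quoted in the text


def poll_once(fetch_usage: Callable[[], dict],
              post_to_bridge: Callable[[dict], None]) -> Optional[dict]:
    """Fetch one snapshot and forward it; return None if the fetch fails."""
    try:
        snapshot = fetch_usage()
    except Exception:
        # Keep the loop alive: a failed poll just means no new snapshot.
        return None
    post_to_bridge(snapshot)
    return snapshot


def run(fetch_usage: Callable[[], dict],
        post_to_bridge: Callable[[dict], None],
        max_polls: Optional[int] = None,
        sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll every POLL_MINUTES minutes; max_polls bounds the loop for testing."""
    done = 0
    while max_polls is None or done < max_polls:
        poll_once(fetch_usage, post_to_bridge)
        done += 1
        if max_polls is None or done < max_polls:
            sleep(POLL_MINUTES * 60)
    return done
```

Injecting the fetch and the bridge keeps the loop testable without a network connection; in a real client they would wrap the HTTPS request and the local POST.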
The moment that proves it
Two terminal calls, one screen. ccusage on the same machine reports 5.1% of the session used. The localhost bridge serves the snapshot claude-meter just fetched from the server. claude.ai answers the next prompt with a 429. All three readings happen inside the same minute.
If you only watch the local number, you walk into the rate limit. If you only watch the server number, you never know which project burned the budget. Most heavy users run both.
When ccusage is still the right tool
ccusage is the right tool when the question is which project burned how many tokens this week. It walks JSONL files per project and gives you a breakdown the server JSON cannot, because the server only knows about your org, not your local directory tree. If you bill clients per project, attribute spend across repos, or need to find the file that ate twenty thousand output tokens in one prompt, ccusage is the answer.
ccusage is the wrong tool when the question is will my next prompt 429. For that you need the float on the server. Different question, different surface.
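The per-project attribution that the local logs make possible can be approximated with a short directory walk. This is a sketch of the idea, not ccusage's implementation. It assumes the ~/.claude/projects/<project>/<session>.jsonl layout quoted earlier and a usage object nested under message in each row.

```python
import json
from collections import Counter
from pathlib import Path


def tokens_by_project(root: Path) -> Counter:
    """Return {project directory name: input + output tokens} under root."""
    totals: Counter = Counter()
    for session_file in root.glob("*/*.jsonl"):
        project = session_file.parent.name
        with session_file.open(encoding="utf-8") as handle:
            for line in handle:
                try:
                    row = json.loads(line)
                except json.JSONDecodeError:
                    continue  # skip blank or truncated lines
                if not isinstance(row, dict):
                    continue
                # Assumed layout: row["message"]["usage"] holds the counts.
                message = row.get("message")
                usage = message.get("usage") if isinstance(message, dict) else None
                if not isinstance(usage, dict):
                    continue
                totals[project] += usage.get("input_tokens", 0)
                totals[project] += usage.get("output_tokens", 0)
    return totals


if __name__ == "__main__":
    root = Path.home() / ".claude" / "projects"
    for project, tokens in tokens_by_project(root).most_common():
        print(f"{tokens:>12,}  {project}")
```

The result is keyed by directory, which is exactly the dimension the server JSON lacks.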
Frequently asked
If ccusage says 5% and claude.ai 429s me, which one is wrong?
Neither. ccusage walks ~/.claude/projects/*.jsonl and sums input_tokens + output_tokens for sessions on this machine. That is a local token-flow measurement. claude.ai 429s on five_hour.utilization, a weighted server-side fraction that includes peak-hour multiplier, per-attachment cost, per-tool-call cost, per-model weight, and any browser-chat usage. The local count and the server count are two different metrics with two different denominators. They agree only by coincidence.
Where is the server-truth number, exactly?
GET https://claude.ai/api/organizations/{your-org-uuid}/usage with your logged-in claude.ai cookies. The response is JSON with five_hour and seven_day objects, each containing a utilization float and a resets_at ISO timestamp. The bars on claude.ai/settings/usage are rendered from those two fields. ClaudeMeter's extension calls this endpoint once a minute (POLL_MINUTES = 1 at extension/background.js line 3) and pushes the response to the menu-bar app.
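For readers who want to see the shape of that call outside the browser, here is a hedged Python sketch. The URL and the five_hour, seven_day, utilization, and resets_at names come from the text above. The function names are hypothetical, the Referer value is a guess, and passing the cookie in by hand is an assumption made only for illustration (the extension reuses the browser session instead). Whether a non-browser client is accepted by the endpoint is not something the text confirms.

```python
import json
import urllib.request

USAGE_URL = "https://claude.ai/api/organizations/{org_uuid}/usage"


def build_usage_request(org_uuid: str, cookie: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for the usage JSON."""
    return urllib.request.Request(
        USAGE_URL.format(org_uuid=org_uuid),
        headers={
            "Cookie": cookie,
            "Accept": "application/json",
            # Assumed value; the text only says a Referer header is sent.
            "Referer": "https://claude.ai/settings/usage",
        },
        method="GET",
    )


def parse_usage(body: str) -> dict:
    """Pull (utilization, resets_at) out of each bucket present in the JSON."""
    payload = json.loads(body)
    buckets = {}
    for name in ("five_hour", "seven_day"):
        bucket = payload.get(name)
        if isinstance(bucket, dict):
            buckets[name] = (bucket["utilization"], bucket["resets_at"])
    return buckets


def fetch_usage(org_uuid: str, cookie: str, timeout: float = 10.0) -> dict:
    """Send the request and parse the response; raises on HTTP errors."""
    request = build_usage_request(org_uuid, cookie)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_usage(response.read().decode("utf-8"))
```

Splitting request construction from parsing means both halves can be checked against a saved response without touching the network.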
Why does the gap feel so big in practice, like 5% local vs 94% server?
Because the two numbers do not share a denominator and the server applies weights local logs cannot see. The five weights: a peak-hour multiplier on weekday 5 to 11 a.m. Pacific, a per-attachment cost that fires the moment you upload a PDF or image, a per-tool-call cost on code execution and web browsing, a per-model weight (Opus burns faster than Sonnet), and any prompt you sent on claude.ai in the browser, which never lands in JSONL. Stack a few of those up and the gap goes from cosmetic to existential.
Can I just trust ccusage if I only use Claude Code, no attachments, no browser?
Closer, but still no. Even with no attachments and no browser chat, the peak-hour multiplier still applies, the per-tool-call cost still applies to the tools Claude Code itself invokes, and the per-model weight still applies. ccusage is accurate as a local token-flow measurement and great for cost attribution per project. It is not a faithful proxy for the rate limiter. If you want to predict whether your next prompt will 429, the only reliable signal is five_hour.utilization on the server.
Do I have to give up ccusage to read server truth?
No. They answer different questions and most heavy users run both. ccusage tells you which projects burned the most tokens this week. ClaudeMeter tells you whether your next prompt is about to 429. Many people pair ccusage in a terminal split with ClaudeMeter in the menu bar.
Will Anthropic block the endpoint claude-meter reads?
It is the same endpoint claude.ai/settings/usage already calls with the same headers your browser sends (Cookie, Referer, Accept). ClaudeMeter sends one HTTPS request per minute per org membership, identical to what an open settings tab would do if you hit reload. If Anthropic changes the response shape, the Rust serde deserializer fails fast and the menu bar shows '!' instead of a wrong number. The README documents that risk explicitly.
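The fail-fast behaviour can be illustrated with a strict parser. This is a Python analogue written for illustration, not the Rust serde code. The required fields are taken from the response shape described earlier in this FAQ, render_menu_bar is a hypothetical name, and a 0-to-1 scale for utilization is assumed.

```python
import json


class ShapeError(ValueError):
    """Raised when the usage JSON does not match the expected shape."""


def strict_utilization(body: str) -> float:
    """Return five_hour.utilization or raise ShapeError; never guess."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ShapeError("response is not JSON") from exc
    if not isinstance(payload, dict):
        raise ShapeError("top level is not an object")
    bucket = payload.get("five_hour")
    if not isinstance(bucket, dict):
        raise ShapeError("five_hour is missing or not an object")
    value = bucket.get("utilization")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ShapeError("five_hour.utilization is missing or not a number")
    if not isinstance(bucket.get("resets_at"), str):
        raise ShapeError("five_hour.resets_at is missing or not a string")
    return float(value)


def render_menu_bar(body: str) -> str:
    """Show a percentage when the shape is valid, '!' otherwise."""
    try:
        # Assumes utilization is a 0..1 fraction.
        return f"{strict_utilization(body):.0%}"
    except ShapeError:
        return "!"
```

The design point is that a renamed or retyped field produces a visible error marker rather than a silently wrong percentage.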
Is the server reading instant, or does it lag like the JSONL log?
It is at most 60 seconds behind real-time, because the extension polls once per minute. The float Anthropic enforces is the same one you read, with no extra weighting added between read time and rate-limit time. The JSONL log, by contrast, is fully real-time on this machine but blind to the entire weighting layer plus any other surface (Max sharing, browser chat, etc.).
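The 60-second bound can be checked mechanically. The sketch below flags a snapshot as stale once it is older than one poll interval and computes the time left until resets_at. The fetched_at argument is a hypothetical timestamp recording when the poll ran, and the resets_at format is assumed to be ISO 8601 with a trailing Z or an explicit offset.

```python
from datetime import datetime, timedelta, timezone

POLL_INTERVAL = timedelta(seconds=60)  # one poll per minute, per the text


def parse_iso(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp; rewrite a trailing 'Z' for older Pythons."""
    if timestamp.endswith("Z"):
        timestamp = timestamp[:-1] + "+00:00"
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_stale(fetched_at: datetime, now: datetime) -> bool:
    """True when the snapshot is older than one poll interval."""
    return now - fetched_at > POLL_INTERVAL


def time_until_reset(resets_at: str, now: datetime) -> timedelta:
    """Time left in the window; clamped at zero once the reset has passed."""
    return max(parse_iso(resets_at) - now, timedelta(0))
```

Pass timezone-aware datetimes for fetched_at and now, for example datetime.now(timezone.utc), so the subtraction is well defined.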
Keep reading
Pages built around the same gap, from different angles.
Why local token counters disagree with the rate limit
Deep dive on the exact mechanism behind the local-vs-server gap, with the JSON shape, the field names, and how to predict a 429 a minute before it lands.
How to verify a Claude usage tracker actually reads server truth
Three checkable steps. Open DevTools on claude.ai/settings/usage, curl the localhost bridge at 127.0.0.1:63762, then disable the extension and watch the staleness flip.
ClaudeMeter vs ccusage
ccusage reads local Claude Code tokens off disk. ClaudeMeter reads plan quota off claude.ai. Two different questions, two different surfaces. Many users run both.