Claude Code Opus token usage: two numbers that never match, and only one decides your next 429
You probably came here from a thread asking how to actually see Opus token usage when you run Claude Code. The honest answer is that there are two different numbers and they read from different places. ccusage sums tokens out of your local JSONL files. claude.ai's server returns only a utilization fraction, no raw token count. This page walks through each one with the actual field path, then shows the pair you can run together so you stop being surprised mid-loop.
Direct answer (verified 2026-05-08)
Two different ledgers, both legitimately called "Opus token usage". (1) Run ccusage --model opus for raw token counts pulled from ~/.claude/projects/*.jsonl. (2) Read seven_day_opus.utilization on claude.ai/api/organizations/{org_uuid}/usage (the same JSON claude.ai/settings/usage renders, and the same JSON ClaudeMeter polls in its menu bar). The server payload contains zero raw Opus token counts; only a utilization fraction. The two never reconcile because the 4.7 tokenizer expands tokens server-side after JSONL is written, thinking tokens are written incompletely, and browser-chat usage skips JSONL entirely. Verified against claude-meter/src/models.rs today.
The anchor fact: the server response has no token field at all
Before walking the tools, fix one thing in your head. The endpoint claude.ai uses to know whether to 429 your next Opus request is GET /api/organizations/{org_uuid}/usage. That JSON has buckets named five_hour, seven_day, seven_day_opus, seven_day_sonnet, and a few more. Every single one of those buckets has the same shape: a fraction and a reset time. No tokens.
Two fields per bucket: utilization (a float, almost always 0.0 to 1.0, though the API sometimes scales it 0 to 100) and resets_at (a timestamp). If you want to verify this yourself, grep ClaudeMeter's deserializer (src/models.rs) for tokens. You will not find it. The Opus rate limiter never tells you how many tokens you have used or have left. It only ever tells you how close you are to a percentage ceiling whose denominator it also does not publish.
Tool 1: ccusage --model opus
ccusage tails ~/.claude/projects/**/*.jsonl and aggregates the input, output, cache_read, and cache_creation token counts that Claude Code wrote during streaming. With --model opus it filters to records where model starts with claude-opus. You get a per-project breakdown that looks like this:
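The exact table layout varies by ccusage version, and these rows are invented for illustration; the shape is what matters: raw token columns, summed per project, all from local JSONL.

```text
Project          Input     Output   Cache Read   Cache Create      Total
api-server     412,332     98,104    1,204,551        310,887  2,025,874
infra-scripts   88,910     21,433      301,282         77,120    488,745
------------------------------------------------------------------------
Opus total                                                     2,514,619
```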
Useful for: knowing what your CLI sent on this machine, broken down by project and message type. Not useful for: knowing whether your next Opus request will go through. The numbers in this terminal are the only place "raw Opus tokens" appears in the entire stack and they live entirely on your local disk. Anthropic's rate limiter never sees this output.
Tool 2: the server fraction at seven_day_opus
ClaudeMeter (and the bar chart on claude.ai/settings/usage) both read the same JSON. Run claude-meter status --json and you get the parsed structure verbatim:
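The bucket names below are the ones this article walks; every value except the 0.94 Opus fraction is invented for illustration. Note the shape: each bucket is exactly the two-field Window, and no bucket anywhere carries a token count.

```json
{
  "five_hour":        { "utilization": 0.31, "resets_at": "2026-05-08T21:00:00Z" },
  "seven_day":        { "utilization": 0.62, "resets_at": "2026-05-12T09:00:00Z" },
  "seven_day_opus":   { "utilization": 0.94, "resets_at": "2026-05-12T09:00:00Z" },
  "seven_day_sonnet": { "utilization": 0.27, "resets_at": "2026-05-12T09:00:00Z" }
}
```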
Read the Opus line. Utilization is 0.94. There is no token count. There is no "you have used 1.7M Opus tokens out of 1.8M" rendering happening anywhere. The fraction by itself is what the server checks. Anthropic does not tell its own client what the denominator is. The bar on Settings is drawn from this exact float; ClaudeMeter's extension reads it on a 60-second alarm in extension/background.js line 24.
Why the two numbers diverge
If both ledgers are accurate, why don't they reconcile? Because they are accurate about different events.
- Server-side tokenizer expansion (Opus 4.7). Per Anthropic's 4.7 release notes, the new tokenizer maps the same text to 1.0x to 1.35x as many tokens as 4.6 did. The expansion runs after Claude Code has already streamed a chunk and written it to JSONL. Your local sum is the pre-expansion count; the seven_day_opus float is updated against the post-expansion count.
- Adaptive thinking tokens. Opus 4.7 generates thinking tokens by default when the caller opts in. They count against seven_day_opus on the server. They are written to JSONL inconsistently (some streams write the full thinking block, some write a redacted token count, some write neither). ccusage sees whatever made it into the file.
- Browser-chat traffic. If you also use claude.ai in a browser tab on the same Anthropic account, that traffic depletes the same seven_day_opus float. It never writes a Claude Code JSONL. ccusage cannot see it.
- Other machines and other Claude Code sessions. seven_day_opus is org-scoped on the server. ccusage on this laptop only sees this laptop's JSONL. A teammate on the same org or a second machine of yours will move the server number while ccusage stays flat.
- The denominator changes by plan. The same absolute token spend is a different fraction on Pro vs Max 5x vs Max 20x. The server bakes this into utilization. Local-token tools have no idea which plan you are on, so they cannot compute the equivalent fraction even if every other gap were closed.
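The gaps above can be bounded but never closed. A minimal sketch, with an invented local count and only the 1.0x to 1.35x expansion spread from the 4.7 release notes as a grounded ratio:

```python
# Illustrative only: local_opus_tokens is invented; the expansion range
# (1.0x to 1.35x) is the spread cited in the 4.7 release notes.
local_opus_tokens = 1_200_000  # what ccusage summed from this machine's JSONL

expanded_low = local_opus_tokens * 1.00   # no tokenizer expansion
expanded_high = local_opus_tokens * 1.35  # worst-case expansion

# Even expanded_high is only a floor on the server-side count: thinking
# tokens, browser-chat traffic, and other machines on the org are all
# invisible to the local sum. And without the plan's denominator, no token
# count converts to the utilization fraction the limiter actually checks.
print(f"{expanded_low:,.0f} to {expanded_high:,.0f} tokens, before invisible traffic")
```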
ccusage vs the server endpoint, as objects of measurement
Same product, two ledgers. They answer different questions.
| Feature | ccusage (local JSONL) | server /usage (seven_day_opus) |
|---|---|---|
| Where does the data come from? | ~/.claude/projects/*.jsonl on this machine | claude.ai/api/organizations/{org}/usage (server) |
| What units does it report? | Raw input/output/cache token counts | A utilization fraction (0.0 to 1.0+), no raw tokens |
| Sees browser-chat usage? | No. Browser chats never write a JSONL. | Yes. Server tracks all Opus traffic against the org. |
| Includes 4.7 tokenizer expansion? | No. JSONL is written before server expansion. | Yes. Expansion is applied before utilization is updated. |
| Includes thinking tokens? | Partial. JSONL writes are inconsistent. | Yes. Generated server-side, written to seven_day_opus. |
| Sees other machines on the same org? | No. Only this machine's CLI traffic. | Yes. Org-scoped server count. |
| Is it the number that decides the next 429? | No. The rate limiter never reads JSONL. | Yes. The rate limiter reads this float. |
The Reddit-thread version: what to actually do
You came here looking for one number. You are leaving with two and an honest reason why. Here is the smallest pair that covers both questions.
Three commands, two numbers
- brew install ryoppippi/tap/ccusage. Then ccusage --model opus whenever you want to see what your CLI actually sent, broken down by project. Raw tokens, your machine.
- brew install --cask m13v/tap/claude-meter plus the browser extension from the releases page. Visit claude.ai once. The menu bar shows seven_day_opus live within a minute. No cookie paste.
- For status lines: claude-meter status --json from any shell drops the same JSON into your tmux, Starship, or a CI gate.
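The CI-gate idea can be sketched in a dozen lines. The seven_day_opus.utilization path matches the payload described earlier; the 0.9 threshold is an arbitrary choice, and in CI you would pipe in real claude-meter status --json output instead of the canned string shown here.

```python
import json

THRESHOLD = 0.9  # arbitrary cutoff; tune to taste

def opus_utilization(payload: str) -> float:
    """Pull seven_day_opus.utilization out of /usage-shaped JSON, 0.0 if absent."""
    window = json.loads(payload).get("seven_day_opus") or {}
    return float(window.get("utilization", 0.0))

# In CI, capture the live payload instead of this canned example:
#   payload = subprocess.run(["claude-meter", "status", "--json"],
#                            capture_output=True, text=True).stdout
payload = '{"seven_day_opus": {"utilization": 0.94, "resets_at": null}}'

if opus_utilization(payload) > THRESHOLD:
    print("seven_day_opus above threshold: hold Opus-heavy jobs")
```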
What each tool can and can't tell you
- ccusage --model opus is the only way to see raw token counts broken down by project and message type
- ccusage cannot see browser-chat usage, server-side tokenizer expansion, or thinking tokens that didn't make it into JSONL
- claude.ai/settings/usage and ClaudeMeter both read seven_day_opus.utilization, the float that gates your next request
- Neither one gives you a raw 'tokens left on Opus' integer; the server does not expose one
- The honest answer to 'what is my Opus token usage' is two numbers, not one
Common questions about Opus token usage
Does Anthropic's usage endpoint return a raw Opus token count?
No. The endpoint is GET https://claude.ai/api/organizations/{org_uuid}/usage, the same one claude.ai/settings/usage renders behind the bar charts. Per ClaudeMeter's Rust model in src/models.rs lines 3-7, the Window struct that represents each bucket has exactly two fields: utilization (f64) and resets_at (Option<DateTime<Utc>>). The seven_day_opus key on the response uses that same Window. There is nowhere on the response to read 'tokens used' or 'tokens remaining' as integers. Any tool that claims to read your live server-side Opus tokens is reading something else (typically your local JSONL).
Then what does ccusage actually show me?
ccusage reads ~/.claude/projects/**/*.jsonl on your machine and sums the input, output, cache_creation, and cache_read token counts that Claude Code wrote to those files when it streamed. With --model opus it filters to records where model is claude-opus-* and shows that subset. Those numbers are accurate as a local accounting of what your CLI sent and received, but they are not the number Anthropic checks before deciding whether to 429 your next Opus request. Two different ledgers.
Why don't ccusage's tokens add up to the server's utilization?
Three reasons documented in Anthropic's Opus 4.7 release notes and visible in actual usage data. First, the 4.7 tokenizer maps the same text to 1.0x to 1.35x as many tokens as 4.6 did. That expansion runs server-side, after Claude Code has already written its JSONL line. Your local sum sees the pre-expansion count. Second, adaptive thinking on Opus 4.7 generates thinking tokens that count against your seven_day_opus float; those tokens are not always written to JSONL in full. Third, your Opus quota is also touched by browser-chat sessions on claude.ai, which never write to a Claude Code JSONL at all. ccusage cannot see those.
Then how do I read 'real' Opus token usage at all?
Read both numbers, side by side, and treat them as different signals. ccusage --model opus shows what your Claude Code CLI sent on this machine, in tokens. claude.ai/settings/usage (or any tool that polls /api/organizations/{org}/usage, like ClaudeMeter) shows the seven_day_opus utilization the rate limiter actually checks, in a fraction. The first answers 'what did I send'. The second answers 'will my next Opus request 200, meter to overage, or 429'. Both are useful. Neither is a substitute for the other.
Does /usage inside Claude Code show server-truth Opus tokens?
No. Typing /usage inside an active Claude Code session prints a snapshot of your session and weekly percentages, but it interrupts the loop and gives you a fraction, not a token count. It also reads through Claude Code's own session, so it does not see browser-chat usage, and it cannot see what other Claude Code sessions on other machines have used against the same org. It is fine for a one-off check. It is not a continuous meter.
What about Claude-Code-Usage-Monitor or other local-only tools?
Claude-Code-Usage-Monitor and similar tools also tail ~/.claude/projects/*.jsonl. They are slightly different presentations of the same local data ccusage reads. They share the same blind spots: no tokenizer expansion, no full thinking-token accounting, no browser-chat visibility. A user who relies on a local-only tool and then gets rate-limited mid-session is almost always looking at a local count that seems comfortably low while the server fraction has already climbed close to 1.0.
Can I just multiply ccusage's Opus output by 1.35x to get the server number?
Not reliably. The 1.0x to 1.35x expansion is not a fixed multiplier. It varies by content type (code, prose, JSON, markdown all hit different ratios) and Anthropic does not publish the per-content-type weighting. On top of that, the seven_day_opus fraction is not raw tokens divided by a fixed ceiling: it is utilization, and the denominator depends on your plan tier (Pro, Max 5x, Max 20x). Multiplying ccusage by 1.35 gives a worst-case input estimate. It does not reproduce the server fraction.
Where exactly does ClaudeMeter read the Opus number from?
From the same JSON returned by GET https://claude.ai/api/organizations/{org}/usage, which it deserializes into the UsageResponse struct in src/models.rs. The relevant key is seven_day_opus, which is an Option<Window>. The popup at extension/popup.js line 63 conditionally renders that window if present. The Rust CLI exposes the same field via claude-meter status. No raw Opus token count appears anywhere in the pipeline because the server does not return one.
If the server doesn't expose tokens, how does it bill extra-usage in dollars?
Anthropic computes extra-usage credits server-side from the same token counts they use to update utilization, but they only expose used_credits and monthly_limit on the overage_spend_limit endpoint, not the raw tokens. So you can see your overage spend tick up in dollars without ever seeing how many extra Opus tokens that came from. ClaudeMeter pulls overage_spend_limit alongside the usage payload (src/api.rs lines 31 to 45) and shows the dollar figure. The token-to-dollar ratio is not exposed.
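Going by the two field names cited above, the overage payload presumably looks something like this (shape assumed from those names, values invented):

```json
{
  "used_credits": 12.40,
  "monthly_limit": 50.00
}
```

Dollars on both fields, tokens on neither.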
Does the same divergence apply to Sonnet, or only Opus?
Same shape, different magnitude. seven_day_sonnet exists on the same usage payload. Sonnet token usage from ccusage and the seven_day_sonnet fraction also disagree, but the gap tends to be smaller because Sonnet adaptive thinking is less aggressive than Opus 4.7's. The structural answer is identical though: the server returns utilization, never raw tokens, for any model.
What's the smallest setup to see both numbers without writing my own poller?
Two commands. brew install ryoppippi/tap/ccusage gives you the local JSONL summary. brew install --cask m13v/tap/claude-meter plus the browser extension gives you the live seven_day_opus fraction in the macOS menu bar. The extension picks up your existing claude.ai session, so there is no cookie paste step. Run ccusage when you want the raw token breakdown by model and project. Glance at the menu bar to know whether your next Opus request will go through.
More on the same buckets, the same fields, the same wall.
Related guides
Claude Code Opus 4.7 usage limits: the two server floats that gate you
The 4.7 tokenizer's 1.0x to 1.35x expansion explained, plus the seven_day_opus field path.
Claude Code rolling 5-hour usage: three ledgers, three answers
How /usage, ccusage, and ClaudeMeter all answer 'where am I in the 5-hour window' differently.
Claude Code Opus cost per PR: the one field that actually moves
The delta on seven_day_opus.utilization across a branch is the one number that captures Opus PR cost.
ccusage and Settings are giving you different answers?
If your team runs Claude Code on Opus and the local-token sum keeps disagreeing with the bar on claude.ai, book a 20-minute call and I will help you wire the right pair into your workflow.