The Claude Max 5-hour window is server-tracked, not counted.

People picture the rolling 5-hour limit as a tally of prompts that ticks up and zeroes out every five hours. It is not. The server tracks it as one floating-point fraction it recomputes continuously over a rolling age-off. That single design choice explains why your bar drifts while you send nothing, why the reset never lands cleanly, and why no local token tool can show you the real number.

Matthew Diakonov, Written with AI

Published May 21, 2026 · 5 min read

Direct answer (verified 2026-05-21)

Claude Max's 5-hour window is server-tracked as a single utilization float, not a prompt count. On claude.ai/api/organizations/{org_uuid}/usage the five_hour field is a Window object with utilization (0.0 to 1.0) and resets_at (an ISO 8601 timestamp). The server recomputes the fraction over a rolling age-off, so it can fall while you are idle. The rate limiter checks that float; at 1.0 your next message returns a 429. You can confirm the shape in the open-source claude-meter source (github.com/m13v/claude-meter, src/models.rs lines 18-28).

What the server actually returns

Here is the relevant slice of the live response for a Max account mid-session. Note what is missing: there is no messages_used and no tokens field for the window. Just a fraction and a reset time.

usage.json

The whole window is two numbers

The open-source claude-meter Rust app deserializes the same response. The 5-hour window maps to a Window struct with exactly two members. That is the entire server-side state for the window: a utilization float and a reset timestamp. No counter lives anywhere on your side.

src/models.rs

How the server computes the float, step by step

The reason a local token sum can never match this number is that the window is a server-side division, not a client-side count. Here is what happens to a single message.

You send a message.

The server timestamps it and weights it. Not every message costs the window equally: peak-hour multipliers and model weighting fold in on the server. The weight is invisible to your machine, which is the first reason a local token sum cannot reproduce the number.

The server divides, it does not count.

It sums the weighted, still-in-window messages and divides by your plan's effective 5-hour ceiling. The result is five_hour.utilization, a fraction. The ceiling (the denominator) is never sent on the wire, so the only honest source of the fraction is the server itself.

Old messages age off on their own clock.

Each message leaves the window 5 hours after it was sent. As the oldest ones drop, the numerator shrinks and utilization falls, with no input from you. resets_at is the next age-off boundary, so it slides as the window contents change.

At utilization 1.0, the next prompt 429s.

The rate limiter checks the float, not a prompt tally. When five_hour.utilization reaches 1.0 your next message is rejected. It clears not at a fixed reset moment but as enough messages age off to push the fraction back under the ceiling.

claude-meter samples the float every 60 seconds.

Because the number slides continuously, claude-meter (MIT, github.com/m13v/claude-meter) polls GET /api/organizations/{org_uuid}/usage once a minute and pins five_hour to the menu bar with its live resets_at. The browser extension hands the logged-in claude.ai session to the macOS app, so there is no cookie paste and no expiring token.
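A once-a-minute sampler along these lines can be sketched in a few lines of JavaScript. This is a minimal sketch, not claude-meter's actual code: `startPolling`, `fetchUsage`, and `onSample` are hypothetical names, and `fetchUsage` stands in for whatever authenticated call returns the parsed /usage JSON. Only `POLL_MINUTES = 1` comes from the project itself.

```javascript
const POLL_MINUTES = 1; // interval named in extension/background.js

// fetchUsage: hypothetical async function returning the parsed /usage JSON.
// onSample:   hypothetical callback that receives the five_hour Window object.
// timers:     injectable so the loop can be exercised without real timers.
function startPolling(fetchUsage, onSample, timers = globalThis) {
  const tick = async () => {
    try {
      const usage = await fetchUsage();
      onSample(usage.five_hour);
    } catch (err) {
      // A failed poll should not kill the loop; the next tick retries.
      console.error("usage poll failed:", err);
    }
  };
  tick(); // sample once immediately, then on the interval
  const id = timers.setInterval(tick, POLL_MINUTES * 60 * 1000);
  return () => timers.clearInterval(id); // call the returned function to stop
}

// Usage (assumes you supply your own fetchUsage):
//   const stop = startPolling(fetchUsage, (w) => console.log(w.utilization));
//   ...later: stop();
```

Injecting the fetch function keeps the sketch independent of how the session is obtained, which the article leaves to the browser extension.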

What it looks like once you stop guessing

Same server response, formatted one row per bucket. This is what claude-meter status prints, and what the macOS menu bar dropdown renders. The 5-hour row carries its own live reset clock, so you can watch the float age off instead of refreshing the Settings page.

claude-meter status

Frequently asked questions

How does the server track the Claude Max 5-hour window?

As a single float. On GET claude.ai/api/organizations/{org_uuid}/usage the five_hour field is a Window object with two members: utilization (a number between 0.0 and 1.0) and resets_at (an ISO 8601 timestamp). The server does not return a prompt count or a token count for the window. It returns the fraction of your effective ceiling you are currently using, recomputed on its side. The open-source claude-meter app deserializes exactly that shape in src/models.rs lines 18 to 28.

Why does my 5-hour bar move when I am not sending anything?

Because it is a rolling window, not a fixed timer that resets at one instant. Each message you send has its own 5-hour age-off clock. As the oldest messages cross the 5-hour mark they drop out of the window, so utilization falls on its own, with no action from you. The server recomputes the fraction continuously, which is why the number drifts down between prompts and why resets_at slides forward as new messages enter the window.
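The idle drift is easy to reproduce with a toy model. The sketch below assumes made-up message weights and a made-up ceiling of 100; the real weights and the real ceiling exist only on the server, so the function names and every number here are hypothetical and illustrate the mechanism, not Anthropic's implementation.

```javascript
// Illustrative model only: real weights and the real ceiling are server-side.
const WINDOW_MS = 5 * 60 * 60 * 1000; // 5 hours in milliseconds

// A message counts while it is less than 5 hours old.
function inWindow(m, nowMs) {
  return m.sentAtMs <= nowMs && nowMs - m.sentAtMs < WINDOW_MS;
}

// messages: [{ sentAtMs, weight }]; ceiling: hypothetical plan ceiling.
function utilizationAt(messages, nowMs, ceiling) {
  const weighted = messages
    .filter((m) => inWindow(m, nowMs))
    .reduce((sum, m) => sum + m.weight, 0);
  return weighted / ceiling;
}

// Next age-off boundary: the moment the oldest in-window message leaves.
function nextAgeOffMs(messages, nowMs) {
  const exits = messages
    .filter((m) => inWindow(m, nowMs))
    .map((m) => m.sentAtMs + WINDOW_MS);
  return exits.length ? Math.min(...exits) : null;
}

const HOUR = 60 * 60 * 1000;
const messages = [
  { sentAtMs: 0, weight: 30 },        // first message of the session
  { sentAtMs: 1 * HOUR, weight: 10 }, // one more, an hour later
];

// Nothing is sent after hour 1, yet the fraction still changes.
for (const h of [2, 4.5, 5, 6]) {
  console.log(`t=${h}h`, utilizationAt(messages, h * HOUR, 100));
}
```

In this model the fraction steps down only when a message crosses its own 5-hour mark. At the first age-off boundary the later message is still in the window, so the bar is lower but not zero.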

Is utilization the same as a percentage?

Almost. The server is inconsistent about scale, so the claude-meter browser extension normalizes both shapes. In extension/background.js the helper pctFromWindow does: const u = w.utilization; return u <= 1 ? u * 100 : u. A value of 0.97 means 97 percent, and a raw 97 also means 97 percent. If you call the endpoint yourself, branch on u <= 1 before multiplying or you will show 9700 percent.
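As a standalone function, the normalization described above might look like the following. The branch on `u <= 1` mirrors the one-liner quoted from extension/background.js; the guard for a missing or non-numeric value is an addition for illustration and is not taken from the source.

```javascript
// Normalize a Window's utilization to a 0-100 percentage.
// Values at or below 1 are treated as fractions, larger values as percents.
function pctFromWindow(w) {
  const u = w && w.utilization;
  // Added guard (assumption): return null rather than NaN for bad input.
  if (typeof u !== "number" || Number.isNaN(u)) return null;
  return u <= 1 ? u * 100 : u;
}

console.log(pctFromWindow({ utilization: 0.97 })); // fraction form
console.log(pctFromWindow({ utilization: 97 }));   // percent form
console.log(pctFromWindow({}));                    // missing field
```

One limit of this heuristic is worth knowing: a percent-scale value at or below 1 (for example, a raw 1 meaning 1 percent) cannot be told apart from a fraction, so it would be read as 100 percent.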

Can ccusage or Claude-Code-Usage-Monitor show me the server-tracked 5-hour number?

No. Those tools read ~/.claude/projects/**/*.jsonl on your disk and sum tokens against a model price card. That gives an accurate local numerator, but it is not utilization. Everything the server folds in (peak-hour weighting, browser chat traffic, traffic from other devices on the same account, OAuth-app calls) never appears in your local files, and the denominator Anthropic divides by is private. Local tools answer 'what did Claude Code burn locally', not 'how close is the server to 429ing me'. The two kinds of tool are complementary, not replacements for each other.

When exactly does the 5-hour window start?

The window has no fixed start. The first age-off lands 5 hours after your first message of the current rolling window, and each subsequent message ages off on its own clock. resets_at is the next age-off boundary, not a single moment that zeroes the bar the way a weekly reset does. If you show up at the listed resets_at expecting zero and the bar still reads 40 percent, that is because later messages in the window have not aged off yet.

Did the May 6, 2026 rate-limit change touch the 5-hour window?

Yes. Anthropic doubled the 5-hour rate limit on Pro, Max, Team, and seat-based Enterprise plans on May 6, 2026 (announcement: 'Increased rate limits on Claude Code for Pro, Max, Team and Enterprise users'). That changes the denominator, so the same workload produces a lower five_hour.utilization than before. The weekly caps were not doubled, so the most common Max surprise after May 6 is a healthy 5-hour bar while a weekly bucket quietly climbs.

How do I read the server-tracked number myself, without installing anything?

Open claude.ai/settings/usage with DevTools, switch to the Network tab, filter for /usage, and refresh. The response of GET /api/organizations/{org_uuid}/usage is the raw JSON, with five_hour.utilization and five_hour.resets_at right there. That is the exact value the rate limiter checks, the same one the Settings page renders as its top horizontal bar.
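Once you have copied the response body out of the Network tab, a few lines of JavaScript (in the DevTools console or in Node) turn it into a readable summary. This is a sketch under stated assumptions: `summarizeFiveHour` is a hypothetical helper, and the pasted payload below is invented to match the two-field shape described in this article, not captured from a real account.

```javascript
// Replace this placeholder with the body copied from the Network tab.
// The values here are invented for illustration.
const pasted = `{
  "five_hour": { "utilization": 0.42, "resets_at": "2026-05-21T18:30:00Z" }
}`;

function summarizeFiveHour(rawJson, now = new Date()) {
  const win = JSON.parse(rawJson).five_hour;
  if (!win || typeof win.utilization !== "number") {
    throw new Error("five_hour.utilization missing from response");
  }
  const u = win.utilization;
  const percent = u <= 1 ? u * 100 : u; // handle both scales the server uses
  const resetsAt = new Date(win.resets_at); // ISO 8601 timestamp
  const minutesLeft = Math.round((resetsAt.getTime() - now.getTime()) / 60000);
  return { percent, resetsAt, minutesLeft };
}

// A fixed "now" keeps the example deterministic; omit it to use the clock.
const summary = summarizeFiveHour(pasted, new Date("2026-05-21T17:00:00Z"));
console.log(
  `${summary.percent.toFixed(1)}% used, next age-off in ${summary.minutesLeft} min`
);
```

Passing `now` explicitly makes the helper easy to check by hand; in everyday use you would leave it out.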

How often does claude-meter re-read the server number?

Every 60 seconds (POLL_MINUTES = 1 in extension/background.js). Because utilization slides continuously as messages age off, sampling once a minute is fine-grained enough for a human to act on, even though the float can change more often than that in a heavy session. The macOS menu bar app and the browser extension share one fetch over a localhost bridge so the request rate to claude.ai is not doubled.

Want the live float pinned to your menu bar?

Book 15 minutes. I will walk through your actual /usage JSON, show where the 5-hour float sits right now, and get claude-meter reading your session with no cookie paste.

More on the server-tracked rolling-window primitive.

Related guides

Buckets

Claude Max rolling 5-hour weekly limit: the two limits are actually four

Max enforces five_hour plus several weekly buckets, each its own utilization float and its own 429. The full field list and the May 6, 2026 doubling.


Server truth

Why token counters cannot see what Anthropic actually enforces

Utilization is a fraction with a private denominator. A local token sum has the numerator but never the denominator, so it cannot equal the server quota.


Reset semantics

Claude rolling 5-hour reset: each message ages off on its own clock

The window does not reset to zero at one instant. Each message has its own 5-hour age-off clock. Why people arrive at resets_at and find the bar still high.

