The rolling 5-hour window is not a 5-hour timer. Here is how Claude Code actually behaves around it.

Most guides describe the window like a session timer that resets 5 hours after your first prompt. That model is wrong, and Claude Code's behavior makes more sense once you see the real shape: messages age out one by one, the bucket drains gradually, and the 429 lands with no client-side warning.

Matthew Diakonov

Direct answer (verified 2026-05-06)

Claude Code keeps sending until the server's five_hour.utilization crosses 1.0, then 429s. There is no client-side warning. The window is rolling: each message you send counts against your budget for exactly 5 hours from when you sent it, then ages out. The reset you see in resets_at is when your oldest still-in-window message will age out, not a clock that resets the whole bucket. Verified against ClaudeMeter's Rust struct at src/models.rs and the live response from claude.ai/api/organizations/{org}/usage.

The shape Anthropic returns

The single source of truth for what your Claude Code session is burning is the JSON the server returns from /api/organizations/{org}/usage. ClaudeMeter deserializes it into a Rust struct in claude-meter/src/models.rs. That struct has two fields per window: utilization and resets_at. That is the whole API surface. No timer, no session id, no start time.

The point: there is no "when did this 5-hour session start" on the server. There cannot be, because no such moment exists. The window is always the 5 hours ending now. The only forward-looking thing the server tells you is resets_at: when your single oldest in-window message ages out. Everything else you might want to know (when did I start? how long have I been at it?) is your problem on the client.
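As a rough illustration of that two-field shape, here is a minimal Python sketch. The `five_hour` key and the field names `utilization` and `resets_at` come from the text above. The `Window` class, the `parse_five_hour` helper, the ISO 8601 timestamp format, and the sample payload are all assumptions made for illustration; this is not ClaudeMeter's code or a captured server response.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Window:
    """Hypothetical mirror of the two-field window described above."""
    utilization: float
    resets_at: Optional[datetime]  # assumed ISO 8601 timestamp, possibly null


def parse_five_hour(payload: str) -> Window:
    """Pull the five_hour window out of a usage JSON document."""
    raw = json.loads(payload)["five_hour"]
    resets = raw.get("resets_at")
    return Window(
        utilization=float(raw["utilization"]),
        resets_at=datetime.fromisoformat(resets) if resets else None,
    )


# Made-up sample payload, for illustration only.
sample = '{"five_hour": {"utilization": 0.18, "resets_at": "2026-05-06T14:14:00+00:00"}}'
window = parse_five_hour(sample)
print(window.utilization, window.resets_at)
```

Note what the sketch has no place to store: a start time or a session id. Anything like that would have to be tracked by the client.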

What you actually see in Claude Code

Two moments in the terminal are worth getting clear on: first, the snapshot you can ask for; second, the wall when it lands.

The snapshot Claude Code prints when you ask
The wall, no warning

The first is a manual snapshot you have to request by typing /usage. It interrupts the agentic loop, prints once, does not stream, and the 5-hour bar is a single integer. The second is what shows up the first time the tally crosses 1.0 on a tool-heavy prompt. Nothing in the Claude Code TUI told you 90% was about to become 100%.

One real lifecycle, minute by minute

To make the rolling part concrete, here is what happens to one actual prompt as the day progresses. Numbers are illustrative, shape is real.

1. T = 09:14, first heavy prompt

You send a 'refactor X across N files' prompt. Cost lands at, say, 0.18 of your budget. five_hour.utilization is now 0.18. resets_at is 14:14: that is when this specific message will age out.

2. T = 09:14 to 12:30, normal coding

More prompts get added to the tally. Utilization climbs steadily. resets_at stays at 14:14 because the 09:14 message is still the oldest in-window thing. Your 'reset countdown' is decreasing, but only because clock time is advancing toward that one fixed timestamp.

3. T = 12:30, you 429 mid-loop

The tally crosses 1.0 on a tool-heavy prompt. The server returns 429 and Claude Code stops mid-loop. retry_after = 14:14, because that is when the oldest message (the 09:14 one) finally ages out and a chunk of your budget comes back.

4. T = 14:14, the 09:14 message ages out

The server drops it from your window. Utilization steps down by 0.18 in a single tick. The next-oldest prompt (say, the 09:21 one) now determines resets_at: 14:21. You can send again. You did not get a clean reset; you got back exactly what the 09:14 prompt cost.

5. T = 14:14 onward, gradual drain

As each old prompt ages out, utilization steps down by that prompt's specific weighted cost. The bucket fills up and drains continuously rather than emptying at a clock. Watch the meter and you can see the steps in real time.
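The lifecycle above can be replayed with a small Python model. This is a sketch of the rolling-window idea only, not the server's implementation: the helper names are hypothetical, and the costs for the 09:21, 11:00, and 12:30 prompts are invented so that the tally crosses 1.0 at 12:30.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=5)


def in_window(messages, now):
    """Messages sent within the 5 hours ending at `now`."""
    return [(sent, cost) for sent, cost in messages if sent <= now < sent + WINDOW]


def utilization(messages, now):
    """Sum of the costs of every message that has not yet aged out."""
    return round(sum(cost for _, cost in in_window(messages, now)), 4)


def resets_at(messages, now):
    """When the oldest still-in-window message ages out (None if the window is empty)."""
    live = in_window(messages, now)
    return min(sent for sent, _ in live) + WINDOW if live else None


def at(hour, minute):
    return datetime(2026, 5, 6, hour, minute)


# (sent_at, cost) pairs; all costs except 0.18 are illustrative assumptions.
messages = [(at(9, 14), 0.18), (at(9, 21), 0.10), (at(11, 0), 0.40), (at(12, 30), 0.35)]

for now in (at(12, 30), at(14, 13), at(14, 14)):
    print(now.time(), utilization(messages, now), resets_at(messages, now).time())
```

At 12:30 and 14:13 the tally is the same and resets_at is still 14:14. At 14:14 the tally drops by exactly 0.18 and resets_at moves to 14:21.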

What most guides get wrong vs. what actually happens

The model in most posts (start a 5-hour timer, get 100%, timer resets) is wrong in several places. Here are the same questions answered against the actual server mechanic.

| Question | What most guides imply | What actually happens |
| --- | --- | --- |
| When does the window start? | Most guides say it starts at your first message of the day. | It does not 'start'. The window is the 5 hours ending now. |
| What does Claude Code show as I approach the wall? | Nothing. You find out the wall is there at the 429. | Nothing in the CLI. ClaudeMeter shows the live percent in the menu bar. |
| What does the 429 look like? | Same shape. The 429 itself is identical regardless of tracker. | rate_limit_exceeded with retry_after = oldest in-window message + 5h. |
| Does a long break reset me? | Many guides imply a hard reset every 5 hours from the first message. | No. Only the specific message that is 5h old ages out. Reset is gradual. |
| Does browser chat use the same window? | Local-log tools (ccusage, Claude-Code-Usage-Monitor) cannot see browser usage. | Yes, per Anthropic's help docs. Same five_hour bucket. |
| Can the bucket drain while I keep coding? | Token-counter burn rates cannot go negative; they only count up locally. | Yes. As old messages age out, utilization decreases mid-session. |

So how do you stop being surprised?

You watch the same number Claude Code is about to 429 against, before it does. The number is five_hour.utilization on the JSON Anthropic's server returns to you. The path is one brew command plus a one-time browser-extension load.

  1. Run `brew install --cask m13v/tap/claude-meter`. This installs the macOS menu bar app and the claude-meter CLI in one command.
  2. Load the unpacked browser extension from github.com/m13v/claude-meter/tree/main/extension into Chrome, Arc, Brave, or Edge. It forwards your existing claude.ai session so the menu bar app does not need a second login.
  3. Visit claude.ai once. Within 60 seconds the bar shows your live five_hour percent and the resets_at countdown. The polling cadence is set in src/bin/menubar.rs line 18 (POLL_INTERVAL = Duration::from_secs(60)). MIT-licensed, no telemetry.

More questions about Claude Code's rolling 5-hour behavior

Does the rolling 5-hour window start when I start coding?

No. There is no 'session start' event on the server. The window is always defined as the 5 hours ending right now. When you send a message at 09:14, that message is in-window from 09:14 until 14:14 sharp. Send another at 11:30 and it is in-window until 16:30. The bucket measures the cost of every message that has not yet aged out, summed at this moment. Anthropic's help docs talk about a '5-hour session', which is true in the sense that no message counts against you for longer than 5 hours, but not in the sense of a session timer that ticks down.

What does Claude Code actually do as I get close to the limit?

Nothing. There is no client-side warning, no progress bar, no 'you have 12% left' message. Claude Code keeps streaming responses normally until the server returns a 429 on the next prompt. The /usage slash command will print a snapshot, but it interrupts your loop and prints once. The first time most people learn the wall is right there is when the agentic loop hits it mid-refactor and aborts.

What is the exact moment Claude Code 429s?

When the server's five_hour.utilization for your organization passes 1.0 on the request you just sent. The cost is computed server-side from message bytes, model used, peak-hour multiplier, attachments, and tool-call overhead, then added to the running tally. If the new tally crosses 1.0, that request returns 429 with a retry-after pointing at resets_at. Per ClaudeMeter's Rust deserializer at src/models.rs lines 4-7, the Window struct that holds this number has exactly two fields: utilization and resets_at. There is no 'started_at', no 'session_id', no 'first_message_at'. The server does not need any of those to enforce the cap.
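The decision described above can be sketched in a few lines of Python. This is a hedged illustration, not Anthropic's server code: the `check_request` function, its return shape, and the strict `> 1.0` comparison are hypothetical, and the per-request cost is passed in as a plain number because the real cost formula is not given here.

```python
from datetime import datetime, timezone


def check_request(tally: float, cost: float, resets_at: datetime, now: datetime) -> dict:
    """Hypothetical admission check: add the request's cost, reject if the tally crosses 1.0."""
    new_tally = tally + cost
    if new_tally > 1.0:  # assumed comparison; the text only says "crosses 1.0"
        wait = max(0, int((resets_at - now).total_seconds()))
        return {"status": 429, "retry_after_seconds": wait}
    return {"status": 200, "tally": new_tally}


now = datetime(2026, 5, 6, 12, 30, tzinfo=timezone.utc)
resets_at = datetime(2026, 5, 6, 14, 14, tzinfo=timezone.utc)

admitted = check_request(0.50, 0.25, resets_at, now)
rejected = check_request(0.90, 0.25, resets_at, now)
print(admitted)
print(rejected)
```

The check needs only the running tally and resets_at, which is consistent with the two-field struct: no start time or session id is involved.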

Why doesn't a 4-hour break reset me to zero?

Because the window is rolling, not session-based. Imagine your messages from the last hour cost 0.6 of your budget. You step away for 4 hours. When you come back, NONE of those messages have aged out yet, because even the oldest one is still just under 5 hours old. The server still sees 0.6 of your budget consumed. Your only real reset point is when your specific oldest message ages out, and after that the next-oldest, and so on. Reset is gradual, not a cliff.
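To see how the budget comes back after a break, here is a minimal Python sketch that lists each age-out step. It assumes no new prompts are sent and that every listed message is still in the window. The `drain_schedule` helper and the three 0.2-cost prompts are hypothetical, chosen so that the hour of work sums to 0.6.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=5)


def drain_schedule(messages):
    """Return (age-out time, utilization left afterwards) for each message, oldest first."""
    remaining = sum(cost for _, cost in messages)
    schedule = []
    for sent, cost in sorted(messages):
        remaining -= cost
        schedule.append((sent + WINDOW, round(remaining, 4)))
    return schedule


day = datetime(2026, 5, 6)
# One hour of work, 0.6 of the budget in total (illustrative numbers).
messages = [
    (day.replace(hour=9, minute=10), 0.2),
    (day.replace(hour=9, minute=30), 0.2),
    (day.replace(hour=9, minute=50), 0.2),
]

back_at = day.replace(hour=14, minute=0)  # return after a 4-hour break
schedule = drain_schedule(messages)
for when, left in schedule:
    print(when.time(), left)
```

Every step in the schedule falls after 14:00, so at the moment you return the tally is still 0.6. It then steps down to 0.4, 0.2, and 0.0 over the following 50 minutes.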

What does resets_at actually mean?

It is the timestamp when the OLDEST currently-counted message in your window will age out. On a steady Claude Code loop sending one message every couple of minutes, resets_at is roughly 5 hours after your earliest still-in-window message, and it shifts forward by a few minutes every time that earliest message ages out and the next-earliest takes its place. ClaudeMeter renders this in src/format.rs lines 75-98 by formatting the timestamp in your local time and computing the delta from now in days and hours; it never assumes the value is fixed.
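A countdown of this kind can be sketched in Python as follows. This is not a port of ClaudeMeter's src/format.rs: the `format_countdown` name and the hours-and-minutes output format are assumptions. The point it illustrates is that the delta is recomputed from the latest resets_at on every poll rather than cached.

```python
from datetime import datetime, timezone


def format_countdown(resets_at: datetime, now: datetime) -> str:
    """Render the time until resets_at as 'Hh MMm' (hypothetical format)."""
    seconds = int((resets_at - now).total_seconds())
    if seconds <= 0:
        return "now"
    hours, rem = divmod(seconds, 3600)
    return f"{hours}h {rem // 60:02d}m"


now = datetime(2026, 5, 6, 12, 30, tzinfo=timezone.utc)
first = datetime(2026, 5, 6, 14, 14, tzinfo=timezone.utc)
print(format_countdown(first, now))

# After the oldest message ages out, the server reports a later resets_at.
later_now = datetime(2026, 5, 6, 14, 14, tzinfo=timezone.utc)
second = datetime(2026, 5, 6, 14, 21, tzinfo=timezone.utc)
print(format_countdown(second, later_now))
```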

Can I avoid the 429 by sending fewer prompts?

Only partly. Sending fewer prompts slows the climb, but it does not lower the tally. Your utilization right now is the sum of every still-in-window prompt's weighted cost. Sending zero new prompts does NOT decrease utilization until the oldest in-window prompt ages out. If your earliest message in the window cost 30% of your budget, your utilization will not drop by that 30% until 5 hours after you sent it, regardless of what you do in between.

Does my claude.ai browser chat affect Claude Code's rolling window?

Yes. Pro and Max plans share one usage pool across claude.ai and Claude Code, per Anthropic's help article on using Claude Code with subscription plans. A long PR description you drafted in the browser at 10am is in your five_hour bucket until 3pm, even if you only run Claude Code in the afternoon. Local-token tools like ccusage cannot see the browser-chat half because nothing writes to ~/.claude/projects/*.jsonl when you talk to Claude in a browser tab.

Is /usage in Claude Code the same number as the website?

Yes for the snapshot. Claude Code's /usage hits an OAuth-authenticated endpoint at api.anthropic.com/api/oauth/usage. claude.ai/settings/usage hits a cookie-authenticated endpoint at claude.ai/api/organizations/{org_uuid}/usage. Both return the same five_hour.utilization float (sometimes 0-1, sometimes 0-100 in the same payload, normalized by ClaudeMeter as `u <= 1 ? u * 100 : u`). The difference is /usage interrupts your CLI loop; the settings page is a tab; ClaudeMeter polls the same JSON every 60 seconds (POLL_INTERVAL at src/bin/menubar.rs:18) so neither breaks your flow.
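The normalization expression quoted above translates directly to Python. The function name `to_percent` is an assumption; the rule itself is the one from the text.

```python
def to_percent(u: float) -> float:
    """Treat values up to 1 as fractions and larger values as percentages."""
    return u * 100 if u <= 1 else u


print(to_percent(0.5))   # fraction form
print(to_percent(87.0))  # already a percentage
```

One consequence of this heuristic is that a genuine reading below 1% on the 0-100 scale would be interpreted as a fraction, which seems an acceptable trade-off for a meter whose job is to warn you near the top of the range.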

Why does my burn rate sometimes go negative?

Because messages age out. If you poll five_hour.utilization at 13:42 and again at 13:43, and between those two polls a heavy message from 08:42 aged out, the utilization at 13:43 is lower than at 13:42 even though you sent prompts in between. Local token-counting tools cannot show this because tokens do not un-spend themselves on disk. The server-side rolling tally can and does decrease.
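A burn rate computed from two server polls can be sketched like this in Python. The `burn_rate_per_minute` helper and the two readings are hypothetical; the sketch only shows why the sign can flip when a heavy message ages out between polls.

```python
from datetime import datetime


def burn_rate_per_minute(prev, curr):
    """Percentage points per minute between two (timestamp, percent) polls."""
    (t0, p0), (t1, p1) = prev, curr
    minutes = (t1 - t0).total_seconds() / 60
    return (p1 - p0) / minutes


# Illustrative readings: a heavy 08:42 message ages out between the two polls.
poll_a = (datetime(2026, 5, 6, 13, 42), 91.0)
poll_b = (datetime(2026, 5, 6, 13, 43), 78.0)

rate = burn_rate_per_minute(poll_a, poll_b)
print(rate)
```

A tool that only sums tokens from local logs has no subtraction in its model, so it can never produce a negative value here.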

How do I actually see what's coming so the 429 doesn't surprise me?

Run ClaudeMeter alongside Claude Code. The macOS menu bar shows the live five_hour.utilization percent and the resets_at countdown, polled every 60 seconds from the same endpoint claude.ai/settings/usage uses. When you watch the bar instead of guessing, the wall stops being a surprise: you see it climb past 80%, you wrap up the current task, and you let the oldest messages age out. The browser extension forwards your existing claude.ai session so there is no cookie paste and no second login.
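The "watch the bar, wrap up at 80%" habit can be expressed as a small polling loop. This Python sketch is not ClaudeMeter's implementation: `watch` is a hypothetical helper, the reading comes from a caller-supplied function so no real endpoint or authentication is assumed, and the 80% threshold and 60-second interval are taken from the text.

```python
import time


def watch(fetch_percent, threshold=80.0, interval_s=60.0, max_polls=None, sleep=time.sleep):
    """Yield (percent, should_wrap_up) once per poll."""
    polls = 0
    while max_polls is None or polls < max_polls:
        pct = fetch_percent()
        yield pct, pct >= threshold
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_s)


# Demo with canned readings and a no-op sleep so it runs instantly.
readings = iter([62.0, 79.5, 83.0])
results = list(watch(lambda: next(readings), max_polls=3, sleep=lambda _: None))
for pct, wrap_up in results:
    print(pct, "wrap up now" if wrap_up else "ok")
```

In real use, `fetch_percent` would return the normalized five_hour percent from whatever source you trust, and the default `time.sleep` would pace the loop.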