Active recall sessions burn the 5-hour rolling window faster than your card count suggests
Quizzing yourself with Claude Pro looks like a sequence of one-line turns. Under the hood, every “next card” re-sends the entire growing transcript: the system prompt, the running quiz frame, the cards you have already answered, your last replies. By card 27 a two-word prompt is a 14k-token request. The 5-hour cap weighs input tokens, which is why a 30-card recall session can light up the orange band the way a long technical conversation does. This page covers the mechanic, the JSON, and how to plan the break around resets_at instead of around the rate-limit error.
Direct answer (verified 2026-05-05)
To track Claude usage during active recall study sessions, install ClaudeMeter (free, MIT, browser extension plus macOS menu bar app). It polls GET /api/organizations/{org}/usage on claude.ai every 60 seconds with your existing logged-in cookie, reads five_hour.utilization and five_hour.resets_at, and renders a row labelled 5-hour · {Nm/Nh} · {percent}% in the popup and on the menu bar. Source verified at extension/background.js.
Why a recall quiz is heavier than it looks
Active recall as a study technique works by retrieval, not review. You ask Claude to quiz you, you answer, Claude grades, you move on. From the human side every turn is a card. From the API side every turn is the entire prior conversation plus one new sentence.
That asymmetry is what surprises students who run their first serious study session. Card 1 is cheap. Card 12 is meaningfully heavier. Card 27 is a small essay every time you say “next”, because the transcript Claude needs to keep grading consistently is 26 prior question/answer pairs and your running notes:
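The growth can be sketched with invented numbers. Nothing below comes from the extension or from Anthropic; the per-card sizes and the four-characters-per-token heuristic are assumptions, there only to show the shape of the curve:

```javascript
// Hypothetical sketch: why card N costs far more input than card 1.
// CHARS_PER_TOKEN is a crude heuristic; the per-card sizes are invented.
const CHARS_PER_TOKEN = 4;     // rough approximation, not the real tokenizer
const SYSTEM_AND_FRAME = 1200; // chars: system prompt + quiz frame (assumed)
const PER_CARD = 1800;         // chars: one question + answer + grading (assumed)

// Input size of the request that asks for card n (1-indexed): the whole
// transcript so far travels with the two-word "next".
function inputTokensForCard(n) {
  const transcriptChars = SYSTEM_AND_FRAME + (n - 1) * PER_CARD;
  return Math.round(transcriptChars / CHARS_PER_TOKEN);
}

inputTokensForCard(1);  // the frame only
inputTokensForCard(27); // the whole session so far: ~12k tokens under these assumptions
```

The constants are made up, but the linear growth is the point: the weight of "next" is the weight of everything said before it.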
The visible turn is two words. The request body is whatever the running thread weighs at that point: every prompt and reply, serialized back to the model, weighed against the 5-hour cap. That is why this page exists, and why a guide that talks about active recall without mentioning the rolling cap leaves out the part that fires.
What the menu bar actually shows during the session
One row, three columns, updated every 60 seconds. The countdown is the time until the rolling window has shed enough usage that the fraction goes back under the cap; the percent is five_hour.utilization rounded for display.
The 7-day row is also there, but during a single recall session it barely moves. The signal that matters for “am I about to be cut off mid-card” is the 5-hour row. At 71 percent with 38 minutes on the clock, you have somewhere between three and seven more heavy turns before the bar trips orange, depending on how the transcript has grown. That is enough information to decide consciously whether to push or stop.
The countdown is one function, four bands
The label after the “5-hour” word comes from a single humanization function, fmtResets, in popup.js. It bands the difference between now and resets_at into one of four shapes: “now” if the bucket is already clear, “Nm” under one hour, “Nh” up to forty-eight hours, “Nd” beyond. On a study session the 5-hour row lives in “Nm” or low “Nh”, which is the band that lets you plan a break:
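A minimal sketch of that banding. The band edges come from the description above; the exact rounding in the real popup.js may differ:

```javascript
// Hedged sketch of fmtResets: "now" if clear, Nm under an hour,
// Nh up to forty-eight hours, Nd beyond. Rounding is an assumption.
function fmtResets(resetsAt, nowMs) {
  const mins = Math.ceil((Date.parse(resetsAt) - nowMs) / 60000);
  if (mins <= 0) return "now";         // bucket already clear
  if (mins < 60) return `${mins}m`;    // under one hour
  const hours = Math.round(mins / 60);
  if (hours < 48) return `${hours}h`;  // up to forty-eight hours
  return `${Math.round(hours / 24)}d`; // beyond two days
}

fmtResets("2026-05-05T17:29:00Z", Date.parse("2026-05-05T16:51:00Z")); // "38m"
```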
So “38m” is a thirty-eight minute coffee break. “1h” is a walk. “3h” is “the rest of this study session will be Sonnet, not Opus”. The function is ten lines and it is the part of the tracker that makes the percent actionable, because a percent without a clock is just an apology in advance.
The poll is fixed at 60 seconds, on purpose
The extension registers a chrome.alarms tick on install and on browser startup with a one-minute period. The same alarm fires refresh(), which calls fetchSnapshots() and posts to the menu bar app over localhost. The number is POLL_MINUTES = 1 on line 3:
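In sketch form, the wiring could look like the following. POLL_MINUTES = 1 is the documented constant; the alarm name and the injected refresh callback are illustrative, and chrome.* only exists inside the extension, so the API object is passed in here to keep the sketch self-contained:

```javascript
// Sketch of the background.js registration described above: one alarm,
// re-armed on install and on browser startup, firing refresh() each tick.
const POLL_MINUTES = 1; // the documented 60-second cadence

function registerPolling(chromeApi, refresh) {
  const arm = () =>
    chromeApi.alarms.create("poll", { periodInMinutes: POLL_MINUTES });
  chromeApi.runtime.onInstalled.addListener(arm); // on install
  chromeApi.runtime.onStartup.addListener(arm);   // on browser startup
  chromeApi.alarms.onAlarm.addListener((alarm) => {
    // fetchSnapshots() + the localhost post to the menu bar app live here
    if (alarm.name === "poll") refresh();
  });
}
```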
A faster poll would not get you a fresher answer because the server-side bucket only recomputes that often. A slower poll would miss the moment a long Opus turn lands on the bucket, which is the moment the percent jumps from 64 to 78 in one tick during a recall session. Sixty seconds is the cadence claude.ai's own Settings page recomputes against, and matching it keeps the numbers in lockstep.
The endpoint, and why nothing else can see your study traffic
The extension calls fetch with credentials: "include" so the existing claude.ai session cookie travels with the request. The endpoint is the same internal one claude.ai/settings/usage renders against:
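A sketch of that cookie-authenticated read. The endpoint path and the two fields are the ones named in this article; the error handling and return shape are illustrative, not the extension's actual code:

```javascript
// Run from the extension's background context, the browser attaches the
// existing claude.ai session cookie because of credentials: "include".
async function fetchSnapshot(orgUuid) {
  const res = await fetch(
    `https://claude.ai/api/organizations/${orgUuid}/usage`,
    { credentials: "include" } // ride the logged-in session, no key to paste
  );
  if (!res.ok) throw new Error(`usage endpoint returned ${res.status}`);
  const body = await res.json();
  // The two fields the 5-hour row needs:
  return {
    utilization: body.five_hour?.utilization, // fraction of the cap, 0..1
    resetsAt: body.five_hour?.resets_at,      // ISO timestamp
  };
}
```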
The reason this matters specifically for active recall sessions: most Claude usage trackers (ccusage, Claude-Code-Usage-Monitor) read ~/.claude/projects/<project>/<session>.jsonl on disk. Those files are written by the Claude Code CLI. A claude.ai chat session, which is where flashcard quizzing happens for most students, never writes to that path. Local-log tools therefore show zero usage during a recall session even when the server-side bucket is at 71 percent. The only honest read on the cap is the cookie-authenticated GET, which is what the extension is doing once a minute on your behalf.
Server-truth tracker vs local-log tool, on a recall session
Both kinds of tools are useful for different things. They do not replace each other. For a study session running through claude.ai chat, only one of them sees the traffic.
| Feature | ccusage (local Claude Code log read) | ClaudeMeter (server-truth read) |
|---|---|---|
| What the tool counts | Tokens summed from ~/.claude/projects/{project}/{session}.jsonl on disk | five_hour.utilization fraction the server returns for your account |
| Sees claude.ai chat usage (where flashcard quizzing happens) | No (no jsonl is written for browser chats) | Yes (chat traffic is part of five_hour on the server) |
| Counts the input transcript carried by every recall turn | Browser chat is not in the file at all | Yes (server already weighed it) |
| Refresh cadence during a session | On demand (re-runs the JSONL scan when you ask) | 60 seconds (POLL_MINUTES = 1) |
| Countdown to the reset | Not part of the local-log signal | Banded label off resets_at: Nm, Nh, Nd |
| Surface | Terminal only | macOS menu bar, browser toolbar popup, CLI |
| Cost | Free, MIT licensed | Free, MIT licensed |
Six things to keep in mind during the session
Active recall and the rolling cap
- Every active recall turn re-sends the prior transcript. The visible message is short; the request body is not. The 5-hour cap weighs input tokens, so card 27 of a quiz is a heavier prompt than card 1 even when the words you typed are the same.
- claude.ai chat usage shows up in five_hour.utilization on the server but does not get written to ~/.claude/projects/. Local-log tools cannot see it. The only ground-truth read is the cookie-authenticated GET on /api/organizations/{org}/usage.
- The popup row reads '5-hour · {countdown} · {percent}%'. The countdown comes from fmtResets banding the resets_at timestamp into Nm, Nh, or Nd. On a study session it almost always lives in Nm or low-Nh territory.
- The window is rolling. resets_at moves as your oldest weighted prompts age out 5 hours after they were sent. A break of 'until the countdown clears' restores the budget; a break of 'an hour' gets you most of the way there.
- ClaudeMeter polls every 60 seconds. That is the cadence claude.ai's own Settings page recomputes against. Tighter polling gains nothing because the server-side bucket only updates that often anyway.
- Color thresholds: under 80 percent green, 80 to 99 orange, 100 and up red. Treating orange as 'finish this card and stop' takes the 429 surprise out of a recall session almost entirely.
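The color thresholds in the last bullet reduce to a three-line band function. The cut points are the documented ones; the function name is illustrative:

```javascript
// Under 80 green, 80-99 orange, 100 and up red.
function bandColor(percent) {
  if (percent >= 100) return "red";   // cap hit: the next prompt gets a 429
  if (percent >= 80) return "orange"; // finish this card and stop
  return "green";                     // keep quizzing
}
```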
Install in four steps, before your next session
Step 1: brew install the menu bar app
brew install --cask m13v/tap/claude-meter. The cask installs ClaudeMeter.app under /Applications and registers a launch agent so the menu bar icon comes back after reboot.
Step 2: load the unpacked extension in your study browser
Clone github.com/m13v/claude-meter, open chrome://extensions (or arc://extensions, brave://extensions, edge://extensions), enable Developer mode, click 'Load unpacked', select the extension/ folder. The browser pins the icon next to the URL bar.
Step 3: open claude.ai once
If you are not already logged in, sign in. The extension reads your existing session cookie via fetch with credentials: 'include' against /api/organizations/{org}/usage; you do not paste anything. Within one minute the badge lights up with a percent.
Step 4: start your recall session and watch the row
Begin quizzing. The popup updates every 60 seconds with the current 5-hour percent and a countdown. When the row turns orange (80 percent) you have one or two more cards before the cap. Plan the break, switch models, or close the chat and start a fresh thread.
The pattern that keeps the bar green for two hours
Three habits, in combination, change a 30-card recall session from a sprint to the orange band into a steady run that ends with the bar around 60 percent.
First, end the chat after every 8 to 10 cards and start a fresh one. The new thread carries no transcript, so card 1 of the new chat is genuinely cheap. The recall benefit is the retrieval, not the running history; the model does not need to remember card 7 to grade card 14.
Second, use Sonnet for the quiz turns and reserve Opus for the moments you actually need a deeper explanation. The 5-hour cap weights usage by model: Sonnet on a recall card costs measurably less of the bucket than Opus on the same card.
Third, glance at the menu bar between cards. When the row reads 5-hour · 38m · 81%, finish the current card and stop. The bucket is going to clear in 38 minutes; that is exactly the length of a real break. Coming back after those 38 minutes, to a bucket that has rolled forward, is the difference between two productive hours and one productive hour followed by a 429.
The honest caveat
The endpoint is internal and undocumented. Anthropic can rename a field in any release. The Rust struct in src/models.rs declares each known field as Option<Window>, so a missing field deserializes cleanly and the menu bar shows an error chip instead of a wrong percent. That is a forward-compat hedge, not a guarantee. If the shape of the response moves, the open-source repo gets a same-day patch and you pull the next brew release.
Studying through Claude Pro and want to compare bucket math?
15 minutes, no slide deck. Happy to swap notes on the rolling-bucket edges, what an Opus turn really costs on a recall session, and the moments the JSON shape shifts.
Frequently asked questions
Why does an active recall session with Claude burn quota faster than the same number of normal questions?
Because each turn includes the entire prior transcript. When you ask Claude to quiz you on chapter three, then answer, then ask for the next card, every prompt sends back the system prompt, the running quiz frame, the cards you have already seen, your previous answers, and Claude's previous responses. By card twenty the input is a small essay even though the visible message is one sentence. The 5-hour rolling cap weighs input tokens too, so a 30-card recall session can register against the cap like a long technical conversation.
How do I see the actual percent of my 5-hour window mid-session without leaving the chat?
Install ClaudeMeter (free, MIT, github.com/m13v/claude-meter). It runs as a macOS menu bar app plus a browser extension that polls /api/organizations/{org_uuid}/usage on claude.ai once every 60 seconds with your existing logged-in cookie, reads five_hour.utilization and five_hour.resets_at, and renders a row labelled '5-hour · {countdown} · {percent}%' in the popup and on the menu bar. No cookie paste, no API key, no extra tab.
Will ccusage tell me how much of my Claude Pro session quota I have left during a study session?
No. ccusage reads ~/.claude/projects/{project}/{session}.jsonl on disk, which is what the Claude Code CLI writes locally. A claude.ai chat session does not write to that path at all, so ccusage shows zero usage even after an hour of recall quizzing in the browser. The number Anthropic actually weighs against your 5-hour cap only comes through the cookie-authenticated GET on /api/organizations/{org_uuid}/usage, which is what ClaudeMeter polls.
How long does the 5-hour rolling window actually take to reset?
The window is rolling, not fixed. resets_at points at the moment the oldest weighted prompt in your bucket ages out 5 hours after it was sent. So if you fire your first heavy prompt at 14:02 and your last one at 15:30, the bucket does not empty all at once at 19:30; it tapers as each prompt rolls past its 5-hour anniversary. The popup countdown ('38m', '4h', '12m') is the time until the bucket utilization drops below the cap, computed off the resets_at field the server returns.
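The taper can be illustrated with invented weights. Only the 5-hour ageing rule comes from the description above; the prompt weights are made up:

```javascript
// Each weighted prompt leaves the bucket exactly 5 hours after it was
// sent, so utilization steps down in pieces rather than zeroing at once.
const WINDOW_MS = 5 * 60 * 60 * 1000;

// prompts: [{ sentAt: epoch ms, weight: fraction of the cap (invented) }]
function utilizationAt(prompts, nowMs) {
  return prompts
    .filter((p) => nowMs - p.sentAt < WINDOW_MS) // still inside the window
    .reduce((sum, p) => sum + p.weight, 0);
}

const t = Date.parse("2026-05-05T14:02:00Z");
const prompts = [
  { sentAt: t, weight: 0.25 },              // heavy prompt at 14:02
  { sentAt: t + 88 * 60000, weight: 0.5 },  // another at 15:30
];
utilizationAt(prompts, t + 4 * 3600000); // 18:02 -> 0.75 (both still counted)
utilizationAt(prompts, t + 5 * 3600000); // 19:02 -> 0.5 (14:02 prompt aged out)
```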
What is the cheapest active recall pattern that does not eat the cap?
Three things in combination. First, start each card in a fresh chat instead of one long thread; that drops the carried transcript to zero. Second, use Sonnet or Haiku for the quiz turns and reserve Opus for the moments you actually need a deep explanation; the per-model weighting matters. Third, watch the menu bar percent. When the 5-hour row crosses 80, switch to lighter models or let the next 30 minutes roll the bucket forward before you push more cards through.
Does ClaudeMeter work if I am studying through claude.ai in the browser, not Claude Code?
Yes. The extension fetches /api/organizations/{org}/usage with credentials: 'include', which means it travels on the same session cookie claude.ai uses for the chat surface. Whether your last hour of usage came from Claude Code, the desktop app, claude.ai chat, or all three at once, the server reports the unified five_hour.utilization fraction and the extension reads it. The chat-side burn is exactly what active recall study sessions generate, and it is exactly the burn local-log tools cannot see.
How do I time a study break so the window resets while I am away from the desk?
Look at the popup row. If it reads '5-hour · 1h · 84%', a 60 to 75 minute break gets you back to a fresh bucket, because the next batch of weighted prompts ages out during that window. If the countdown reads 'Nh' (e.g. '3h'), a single break is not going to clear it; push the remaining cards through lighter models in fresh threads, switch to a notebook, or accept that the rest of the session is going to be Sonnet-only.
What does the popup actually show while I am quizzing?
Two rows on a Pro account. The first is '5-hour · {countdown} · {percent}%' for the rolling cap; the second is '7-day · {countdown} · {percent}%' for the weekly bucket. Both come from the same /usage JSON. The toolbar badge holds the 5-hour percent (because that is the cap that fires next on a heavy session), and the icon tooltip shows both. The menu bar app surfaces the same numbers, color-coded: green under 80, orange from 80, red at 100.
Can I run claude-meter as a CLI and pipe the percent into a status line?
Yes. The brew cask installs a CLI next to the app at /Applications/ClaudeMeter.app/Contents/MacOS/claude-meter. Pass --json for a machine-readable snapshot. A common pattern during a study session is to bind a shell alias that prints '5h: NN%' so a tmux pane can show it without opening the popup; the data is exactly the same fields, served once on demand instead of polled.
What if the active recall session pushes me over the cap mid-card?
Anthropic returns a 429 and the chat refuses the next prompt. ClaudeMeter cannot prevent this; the cap is enforced server side. What the popup gives you is the 5 to 15 minute warning. The bar turns orange at 80 and red at 100. If you treat orange as 'finish the current card and stop', you almost never see the 429. If you ignore the bar, the failure mode is the same as it would be without ClaudeMeter, with one extra signal you chose not to act on.
Keep reading
5-hour window tracker: the countdown math most guides skip
The resets_at humanization function (now / Nm / Nh / Nd) and why a fixed 60-second poll keeps the countdown honest.
Local counter vs server quota: why ccusage and claude.ai disagree
Why ccusage at 8 percent and claude.ai at 71 percent are both correct. Two ledgers, two sources, neither replaces the other.
Burn rate on the 5-hour window: what one heavy prompt costs
How a single Opus turn lands on the bucket, why the curve is not linear, and how to read the percent jump in real time.