fix(observer): JSONL file pruning and token budget cap #2067

thedotmack wants to merge 1 commit into main from
Conversation
Fix #1937: JSONL files under ~/.claude/projects/ now get cleaned up after processing. Adds a periodic cleanup (hourly) that deletes processed files older than 7 days and enforces a 1 GB total size cap.

Fix #1938: Observer background sessions now respect a daily token budget (default 100k tokens/day) and a throttle interval (default 5 s between runs). Adds the CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY and CLAUDE_MEM_OBSERVER_THROTTLE_MS settings. Budget status is exposed via the /api/health endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
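The budget-and-throttle gate described above can be sketched as follows. This is a minimal illustration, not the shipped implementation: the field and method names follow the snippets quoted later in this review, but the constructor taking explicit limits (instead of reading the settings file) is a simplification.

```typescript
// Minimal sketch of the observer's daily token budget + throttle gate.
// Assumption: limits are injected; the real class reads them from settings.
class ObserverBudgetTracker {
  private tokensConsumedToday = 0;
  private lastObservationTimestamp = 0;
  private currentBudgetDay = this.getTodayString();

  constructor(
    private maxTokensPerDay = 100_000,
    private throttleMs = 5000,
  ) {}

  canProcessObservation(now = Date.now()): boolean {
    this.maybeResetDailyBudget();
    if (this.tokensConsumedToday >= this.maxTokensPerDay) return false; // daily cap hit
    if (now - this.lastObservationTimestamp < this.throttleMs) return false; // throttled
    return true;
  }

  markObservationProcessed(now = Date.now()): void {
    this.lastObservationTimestamp = now; // start of the quiet period
  }

  recordTokensUsed(tokenCount: number): void {
    this.maybeResetDailyBudget();
    this.tokensConsumedToday += tokenCount;
  }

  private maybeResetDailyBudget(): void {
    const today = this.getTodayString();
    if (today !== this.currentBudgetDay) {
      this.currentBudgetDay = today;
      this.tokensConsumedToday = 0; // new day, fresh budget
    }
  }

  private getTodayString(): string {
    return new Date().toISOString().slice(0, 10);
  }
}
```

With a 1,000-token budget and a 5 s throttle, an observation is refused within 5 s of the last queued one, and refused for the rest of the day once the budget is spent.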
**Code Review**

**Overview**

This PR fixes two resource-leak bugs: JSONL transcript files accumulating indefinitely (#1937) and the observer consuming unbounded tokens (#1938). The approach is solid: a cleanup service for JSONL files and an in-memory budget tracker for the observer. Overall the code is well-structured and the feature logic is correct.

**Issues**

- Settings loaded on every
- Phase 2 size check uses stale baseline

**Minor Points**

**Positive Notes**

**Summary**

The core logic is correct and well-structured. The main actionable items before merge are: (1) cache the settings read in

🤖 Generated with Claude Code
**Greptile Summary**

This PR adds two independent features: periodic JSONL file pruning (7-day retention / 1 GB cap, hourly interval) and an in-memory observer token budget + throttle gate. Both are well-structured with good error handling, clean lifecycle management, and health-endpoint exposure. All remaining findings are style/P2.

**Confidence Score: 5/5**

Safe to merge; all findings are P2 quality-of-life improvements with no blocking correctness issues. Both features are narrow in scope and independently safe. The cleanup logic only deletes files already confirmed fully processed, and budget tracking degrades gracefully on misconfiguration. The four P2 comments (UTC day rollover, `|| 0` fallback, double timestamp update, per-call settings I/O) are worth addressing, but none break the primary paths. `src/services/observer/ObserverBudgetTracker.ts` contains all four P2 findings.

**Important Files Changed**
**Sequence Diagram**

```mermaid
sequenceDiagram
    participant Hook as Claude Hook
    participant SR as SessionRoutes
    participant BT as ObserverBudgetTracker
    participant Queue as ObservationQueue
    participant Agent as SDKAgent/OtherAgent
    participant RP as ResponseProcessor
    participant TW as TranscriptWatcher
    participant CL as JsonlCleanup
    Hook->>SR: POST /api/sessions/observations
    SR->>BT: canProcessObservation()
    BT->>BT: maybeResetDailyBudget()
    BT->>BT: check throttle (lastObservationTimestamp)
    BT->>BT: check budget (tokensConsumedToday >= max)
    alt budget or throttle exceeded
        BT-->>SR: false
        SR-->>Hook: {status: 'skipped', reason: 'budget_or_throttle'}
    else allowed
        BT-->>SR: true
        SR->>Queue: queueObservation()
        SR->>BT: markObservationProcessed()
        SR-->>Hook: {status: 'queued'}
        Agent->>Queue: dequeue observations
        Agent->>RP: processAgentResponse(..., discoveryTokens)
        RP->>BT: recordTokensUsed(discoveryTokens)
        BT->>BT: tokensConsumedToday += tokens
    end
    Note over TW,CL: Hourly cleanup loop
    TW->>CL: startPeriodicJsonlCleanup(state, statePath)
    CL->>CL: cleanStaleOffsets()
    CL->>CL: Phase 1: delete processed files older than 7 days
    CL->>CL: Phase 2: delete oldest processed if total > 1GB
    CL->>TW: saveWatchState() (removes deleted offsets)
```
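The two cleanup phases in the loop above can be sketched as a pure selection function. This is an illustration only: JsonlCleanup's actual signature, and the idea of passing in pre-computed file metadata, are assumptions made for the sketch.

```typescript
// Hypothetical metadata for a fully-processed JSONL file.
interface ProcessedFile {
  path: string;
  sizeBytes: number;
  mtimeMs: number; // last-modified time, epoch ms
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;
const ONE_GB = 1024 ** 3;

// Returns the paths to delete. Phase 1 removes processed files older than
// the retention window; Phase 2 evicts the oldest remaining files until the
// total size fits under the cap. Recomputing the total from the Phase 1
// survivors avoids the "stale baseline" issue flagged in the review.
function selectFilesToPrune(
  files: ProcessedFile[],
  nowMs: number,
  retentionMs = SEVEN_DAYS_MS,
  maxTotalBytes = ONE_GB,
): string[] {
  const toDelete: string[] = [];

  // Phase 1: age-based retention.
  const survivors: ProcessedFile[] = [];
  for (const f of files) {
    if (nowMs - f.mtimeMs > retentionMs) toDelete.push(f.path);
    else survivors.push(f);
  }

  // Phase 2: size cap, oldest-first eviction.
  let total = survivors.reduce((sum, f) => sum + f.sizeBytes, 0);
  survivors.sort((a, b) => a.mtimeMs - b.mtimeMs);
  for (const f of survivors) {
    if (total <= maxTotalBytes) break;
    toDelete.push(f.path);
    total -= f.sizeBytes;
  }
  return toDelete;
}
```

Keeping the selection pure makes both phases easy to unit-test without touching the filesystem; the caller then performs the actual deletes and offset cleanup.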
Reviews (1): Last reviewed commit: "fix(observer): add JSONL file pruning an..."
```ts
recordTokensUsed(tokenCount: number): void {
  this.maybeResetDailyBudget();
  this.tokensConsumedToday += tokenCount;
  this.lastObservationTimestamp = Date.now();

  logger.debug('OBSERVER', 'Token usage recorded', {
    tokensUsed: tokenCount,
    tokensConsumedToday: this.tokensConsumedToday,
    budgetDay: this.currentBudgetDay,
  });
}
```
**`recordTokensUsed` extends the throttle window unexpectedly**

`recordTokensUsed()` sets `this.lastObservationTimestamp = Date.now()` at processing-completion time. `canProcessObservation()` uses that same field for its throttle check. This means every time a long-running AI call finishes, the throttle window restarts: in effect adding a `throttleMs` quiet period after each processing result, on top of the queue-time quiet period set by `markObservationProcessed()`. For a 30-second AI call with `throttleMs = 5000`, observations could be blocked for an extra 5 s after every completion even when the queue is drained and ready for new work.

If the intent is purely "don't flood the queue", only `markObservationProcessed()` (called at queue time) should update `lastObservationTimestamp`. Consider removing the timestamp assignment from `recordTokensUsed()`.
Suggested change:

```diff
 recordTokensUsed(tokenCount: number): void {
   this.maybeResetDailyBudget();
   this.tokensConsumedToday += tokenCount;
-  this.lastObservationTimestamp = Date.now();
   logger.debug('OBSERVER', 'Token usage recorded', {
     tokensUsed: tokenCount,
     tokensConsumedToday: this.tokensConsumedToday,
     budgetDay: this.currentBudgetDay,
   });
 }
```
```ts
private getTodayString(): string {
  return new Date().toISOString().slice(0, 10);
}
```
**Budget day uses UTC date, not local date**

`toISOString().slice(0, 10)` returns the date in UTC. For users in non-UTC timezones the daily budget resets at their UTC-midnight offset; for example, US Eastern users see a reset at 8 pm local time. Consider using local date parts instead:
```diff
 private getTodayString(): string {
-  return new Date().toISOString().slice(0, 10);
+  const d = new Date();
+  const y = d.getFullYear();
+  const m = String(d.getMonth() + 1).padStart(2, '0');
+  const day = String(d.getDate()).padStart(2, '0');
+  return `${y}-${m}-${day}`;
 }
```
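An equivalent one-liner is possible via `toLocaleDateString`. This relies on the `en-CA` locale formatting dates as YYYY-MM-DD, which holds in Node.js builds with full ICU (the default since Node 13) but is worth verifying in the target runtime:

```typescript
// Local calendar date as YYYY-MM-DD, using the en-CA locale's ISO-like format.
function getTodayString(): string {
  return new Date().toLocaleDateString('en-CA');
}
```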
```ts
const settings = SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH);
const maxTokensPerDay = parseInt(settings.CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY, 10) || 100_000;
const throttleMs = parseInt(settings.CLAUDE_MEM_OBSERVER_THROTTLE_MS, 10) || 5000;
```
**`||` fallback makes it impossible to disable the budget cap**

`parseInt(settings.CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY, 10) || 100_000` treats 0 as falsy and falls back to 100 000. A user who sets the value to "0" (to effectively disable the cap) will silently get the default limit instead. The same issue applies to the `throttleMs` line. Consider using an explicit NaN guard:
```diff
 const settings = SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH);
-const maxTokensPerDay = parseInt(settings.CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY, 10) || 100_000;
-const throttleMs = parseInt(settings.CLAUDE_MEM_OBSERVER_THROTTLE_MS, 10) || 5000;
+const parsed = parseInt(settings.CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY, 10);
+const maxTokensPerDay = !isNaN(parsed) && parsed > 0 ? parsed : 100_000;
+const parsedThrottle = parseInt(settings.CLAUDE_MEM_OBSERVER_THROTTLE_MS, 10);
+const throttleMs = !isNaN(parsedThrottle) && parsedThrottle >= 0 ? parsedThrottle : 5000;
```
```ts
canProcessObservation(): boolean {
  this.maybeResetDailyBudget();

  const settings = SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH);
  const maxTokensPerDay = parseInt(settings.CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY, 10) || 100_000;
  const throttleMs = parseInt(settings.CLAUDE_MEM_OBSERVER_THROTTLE_MS, 10) || 5000;
```
**Settings file read on every `canProcessObservation()` call**

`SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH)` performs a synchronous disk read each time this method is invoked (potentially on every incoming observation). The same pattern is repeated in `getBudgetStatus()` and `recordTokensUsed()` (via `maybeResetDailyBudget()`). For high-frequency hook calls this could add measurable I/O overhead. A short-lived in-memory cache (e.g., re-read settings at most once per minute) would avoid repeated disk hits without sacrificing configurability.
Closing to start fresh from main; will redo the fixes in an isolated Docker container.
Summary
Test plan
Closes #1937, closes #1938
🤖 Generated with Claude Code