
fix(observer): JSONL file pruning and token budget cap #2067

Closed
thedotmack wants to merge 1 commit into main from bugfix/observer-bugs

Conversation

@thedotmack
Owner

Summary

Test plan

  • Verify processed JSONL files are cleaned up after 7 days
  • Verify JSONL cleanup respects 1GB size cap
  • Verify observer respects daily token budget
  • Verify observer throttling works between runs

Closes #1937, closes #1938

🤖 Generated with Claude Code

Fix #1937: JSONL files under ~/.claude/projects/ now get cleaned up after
processing. Adds periodic cleanup (hourly) that deletes processed files
older than 7 days and enforces a 1GB total size cap.

Fix #1938: Observer background sessions now respect a daily token budget
(default 100k tokens/day) and throttle interval (default 5s between runs).
Adds CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY and CLAUDE_MEM_OBSERVER_THROTTLE_MS
settings. Budget status is exposed via /api/health endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai Bot commented Apr 18, 2026

Warning

Rate limit exceeded

@thedotmack has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 35 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 4 minutes and 35 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 06fc829e-74ce-44a2-89ac-929a092bea31

📥 Commits

Reviewing files that changed from the base of the PR and between beea789 and 3965edc.

📒 Files selected for processing (8)
  • src/services/observer/ObserverBudgetTracker.ts
  • src/services/server/Server.ts
  • src/services/transcripts/cleanup.ts
  • src/services/transcripts/watcher.ts
  • src/services/worker/agents/ResponseProcessor.ts
  • src/services/worker/http/routes/SessionRoutes.ts
  • src/shared/SettingsDefaultsManager.ts
  • src/utils/logger.ts
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bugfix/observer-bugs

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude

claude Bot commented Apr 18, 2026

Code Review

Overview

This PR fixes two resource-leak bugs: JSONL transcript files accumulating indefinitely (#1937) and the observer consuming unbounded tokens (#1938). The approach is solid — a cleanup service for JSONL files and an in-memory budget tracker for the observer. Overall the code is well-structured and the feature logic is correct.


Issues

Settings loaded on every canProcessObservation() call
SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH) is called inside canProcessObservation() and again inside getBudgetStatus(). If this involves disk I/O (and the name suggests it does), this is a hot-path concern — every observation request hits disk twice. Consider caching settings with a short TTL (e.g. 60s) or loading once at construction and reloading only on explicit signal.

markObservationProcessed() is called before processing completes
In SessionRoutes.ts, the throttle timestamp is stamped immediately after queuing the observation — before the AI agent actually runs. This means a burst of requests can saturate the queue before the throttle kicks in. If the intent is to throttle submissions, this is fine, but the name markObservationProcessed implies completion. Consider renaming to markObservationQueued to avoid confusion, or moving it post-processing.

discoveryTokens source not visible in diff
ResponseProcessor.ts references discoveryTokens but its definition isn't in the diff. If this variable is undefined at the call site for non-observer sessions, the > 0 guard handles it (since undefined > 0 is false), but this is fragile. Worth adding a typeof discoveryTokens === 'number' guard or ensuring the variable is always defined in scope.
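A type-narrowing guard along these lines would make the check explicit. This is a sketch only; the surrounding ResponseProcessor code is not visible in the diff, so the function name and placement are assumptions.

```typescript
// Accept only finite, positive numeric token counts; rejects undefined,
// null, NaN, and zero, so the recording call site stays safe.
function shouldRecordTokens(discoveryTokens: unknown): discoveryTokens is number {
  return (
    typeof discoveryTokens === 'number' &&
    Number.isFinite(discoveryTokens) &&
    discoveryTokens > 0
  );
}
```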

Phase 2 size check uses stale baseline
In runJsonlCleanup, totalSizeBytes is computed from all files (processed + unprocessed). After Phase 1 deletes some files, the running total is updated correctly. But if unprocessed files alone exceed 1GB, Phase 2 will try to delete processed files that can't bring the total under the cap — the loop exits without achieving the target. This won't cause data loss, but the log message will mislead (it reports remaining size as if cleanup succeeded). Consider logging a warning when the cap can't be reached due to unprocessed files.
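One way to surface the unreachable-cap case is to plan the deletions first and report whether the target is attainable. This is a minimal sketch of the Phase 2 idea under assumed names; the real runJsonlCleanup is not shown here.

```typescript
interface FileInfo {
  path: string;
  sizeBytes: number;
  processed: boolean;
  mtime: number;
}

// Delete oldest processed files until the total fits under capBytes.
// capReachable is false when unprocessed files alone exceed the cap,
// so the caller can log a warning instead of a misleading success.
function planSizeCapDeletions(
  files: FileInfo[],
  capBytes: number,
): { toDelete: string[]; capReachable: boolean } {
  let total = files.reduce((sum, f) => sum + f.sizeBytes, 0);
  const processed = files
    .filter((f) => f.processed)
    .sort((a, b) => a.mtime - b.mtime); // oldest first
  const toDelete: string[] = [];
  for (const f of processed) {
    if (total <= capBytes) break;
    toDelete.push(f.path);
    total -= f.sizeBytes;
  }
  return { toDelete, capReachable: total <= capBytes };
}
```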


Minor Points

  • Issue number references in comments/code — CLAUDE.md says not to reference specific bugs or tasks in source comments (e.g. // Issue #1938, // Fix #1937). These belong in the PR description, not the code.
  • Cleanup interval is hardcoded: CLEANUP_INTERVAL_MS = 60 * 60 * 1000 isn't configurable via settings. Unlike maxAgeMs/maxTotalSizeBytes, which are exposed, the interval might be worth adding for power users.
  • No automated tests — The test plan is entirely manual. ObserverBudgetTracker has resetInstance() which signals testability intent; unit tests for the day-rollover, throttle, and budget-exhaustion paths would catch regressions cheaply.
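The day-rollover test could be sketched like this, against a toy stand-in with an injectable clock. The real ObserverBudgetTracker API is not reproduced here; every name below is illustrative.

```typescript
// Minimal stand-in for a daily token budget tracker. Injecting the
// date function makes the rollover path testable without a real clock.
class ToyBudgetTracker {
  private consumed = 0;
  private day: string;

  constructor(
    private readonly maxPerDay: number,
    private readonly today: () => string,
  ) {
    this.day = today();
  }

  private rollover(): void {
    const d = this.today();
    if (d !== this.day) {
      this.day = d;
      this.consumed = 0; // new budget day: reset the counter
    }
  }

  canProcess(): boolean {
    this.rollover();
    return this.consumed < this.maxPerDay;
  }

  record(tokens: number): void {
    this.rollover();
    this.consumed += tokens;
  }
}
```

With this shape, exhausting the budget blocks processing until the injected date function reports a new day, which covers the rollover and budget-exhaustion paths the review calls out.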

Positive Notes

  • Clean singleton with resetInstance() for test isolation
  • Proper cleanup of setInterval via the returned stop function and stopJsonlCleanup in stop()
  • Log throttling (every-50th / every-100th) prevents log spam without losing signal
  • Exposing observerBudget in the health endpoint is excellent for observability
  • cleanStaleOffsets preventing state file bloat is a nice touch alongside the main cleanup

Summary

The core logic is correct and well-structured. The main actionable items before merge are: (1) cache the settings read in ObserverBudgetTracker, (2) rename markObservationProcessed to reflect that it stamps queue time, and (3) strip issue-number comments from source. Everything else is low-priority polish.

🤖 Generated with Claude Code

@greptile-apps
Contributor

greptile-apps Bot commented Apr 18, 2026

Greptile Summary

This PR adds two independent features: periodic JSONL file pruning (7-day retention / 1 GB cap, hourly interval) and an in-memory observer token budget + throttle gate. Both are well-structured with good error handling, clean lifecycle management, and health-endpoint exposure. All remaining findings are style/P2.

Confidence Score: 5/5

Safe to merge; all findings are P2 quality-of-life improvements with no blocking correctness issues.

Both features are narrow in scope and independently safe. The cleanup logic only deletes files already confirmed fully-processed, and budget tracking degrades gracefully on misconfiguration. The four P2 comments (UTC day rollover, || 0 fallback, double timestamp update, per-call settings I/O) are worth addressing but none break the primary paths.

src/services/observer/ObserverBudgetTracker.ts — contains all four P2 findings.

Important Files Changed

Filename Overview
src/services/observer/ObserverBudgetTracker.ts New singleton tracking daily token budget and throttle interval; has three P2 issues: UTC-based day rollover, `|| 0` fallback on settings parsing, and double timestamp update.
src/services/transcripts/cleanup.ts New JSONL cleanup service with age-based (7-day) and size-based (1GB) pruning of fully-processed files; logic is sound with proper error handling and state persistence.
src/services/transcripts/watcher.ts Integrates periodic JSONL cleanup into TranscriptWatcher start/stop lifecycle; correctly shares state reference so cleanup always sees live offsets.
src/services/worker/http/routes/SessionRoutes.ts Budget/throttle gate added to both observation endpoints; markObservationProcessed() correctly called only after successful queue, not on early-return paths.
src/services/worker/agents/ResponseProcessor.ts Records discoveryTokens against the budget after successful storage; correctly scoped to observer AI calls only.
src/shared/SettingsDefaultsManager.ts Adds CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY and CLAUDE_MEM_OBSERVER_THROTTLE_MS with sensible defaults.
src/services/server/Server.ts Exposes observerBudget status in the health endpoint via getBudgetStatus().
src/utils/logger.ts Adds TRANSCRIPT and OBSERVER to the Component union type to support new logging calls.

Sequence Diagram

```mermaid
sequenceDiagram
    participant Hook as Claude Hook
    participant SR as SessionRoutes
    participant BT as ObserverBudgetTracker
    participant Queue as ObservationQueue
    participant Agent as SDKAgent/OtherAgent
    participant RP as ResponseProcessor
    participant TW as TranscriptWatcher
    participant CL as JsonlCleanup

    Hook->>SR: POST /api/sessions/observations
    SR->>BT: canProcessObservation()
    BT->>BT: maybeResetDailyBudget()
    BT->>BT: check throttle (lastObservationTimestamp)
    BT->>BT: check budget (tokensConsumedToday >= max)
    alt budget or throttle exceeded
        BT-->>SR: false
        SR-->>Hook: {status: 'skipped', reason: 'budget_or_throttle'}
    else allowed
        BT-->>SR: true
        SR->>Queue: queueObservation()
        SR->>BT: markObservationProcessed()
        SR-->>Hook: {status: 'queued'}
        Agent->>Queue: dequeue observations
        Agent->>RP: processAgentResponse(..., discoveryTokens)
        RP->>BT: recordTokensUsed(discoveryTokens)
        BT->>BT: tokensConsumedToday += tokens
    end

    Note over TW,CL: Hourly cleanup loop
    TW->>CL: startPeriodicJsonlCleanup(state, statePath)
    CL->>CL: cleanStaleOffsets()
    CL->>CL: Phase 1: delete processed files older than 7 days
    CL->>CL: Phase 2: delete oldest processed if total > 1GB
    CL->>TW: saveWatchState() (removes deleted offsets)
```

Reviews (1): Last reviewed commit: "fix(observer): add JSONL file pruning an..."

Comment on lines +104 to +113
```typescript
recordTokensUsed(tokenCount: number): void {
  this.maybeResetDailyBudget();
  this.tokensConsumedToday += tokenCount;
  this.lastObservationTimestamp = Date.now();

  logger.debug('OBSERVER', 'Token usage recorded', {
    tokensUsed: tokenCount,
    tokensConsumedToday: this.tokensConsumedToday,
    budgetDay: this.currentBudgetDay,
  });
```
Contributor


P2 recordTokensUsed extends the throttle window unexpectedly

recordTokensUsed() sets this.lastObservationTimestamp = Date.now() at processing-completion time. canProcessObservation() uses that same field for its throttle check. This means every time a long-running AI call finishes, the throttle window restarts — in effect adding a throttleMs quiet period after each processing result, on top of the queue-time quiet period set by markObservationProcessed(). For a 30-second AI call with throttleMs = 5000, observations could be blocked for an extra 5 s after every completion even when the queue is drained and ready for new work.

If the intent is purely "don't flood the queue", only markObservationProcessed() (called at queue time) should update lastObservationTimestamp. Consider removing the timestamp assignment from recordTokensUsed().

Suggested change

Before:

```typescript
recordTokensUsed(tokenCount: number): void {
  this.maybeResetDailyBudget();
  this.tokensConsumedToday += tokenCount;
  this.lastObservationTimestamp = Date.now();
  logger.debug('OBSERVER', 'Token usage recorded', {
    tokensUsed: tokenCount,
    tokensConsumedToday: this.tokensConsumedToday,
    budgetDay: this.currentBudgetDay,
  });
```

After:

```typescript
recordTokensUsed(tokenCount: number): void {
  this.maybeResetDailyBudget();
  this.tokensConsumedToday += tokenCount;
  logger.debug('OBSERVER', 'Token usage recorded', {
    tokensUsed: tokenCount,
    tokensConsumedToday: this.tokensConsumedToday,
    budgetDay: this.currentBudgetDay,
  });
}
```


Comment on lines +168 to +170
```typescript
private getTodayString(): string {
  return new Date().toISOString().slice(0, 10);
}
```
Contributor


P2 Budget day uses UTC date, not local date

toISOString().slice(0, 10) returns the date in UTC. For users in non-UTC timezones the daily budget resets at their UTC midnight offset — for example, US Eastern users see a reset at 8 pm local time. Consider using local date parts instead:

Suggested change

Before:

```typescript
private getTodayString(): string {
  return new Date().toISOString().slice(0, 10);
}
```

After:

```typescript
private getTodayString(): string {
  const d = new Date();
  const y = d.getFullYear();
  const m = String(d.getMonth() + 1).padStart(2, '0');
  const day = String(d.getDate()).padStart(2, '0');
  return `${y}-${m}-${day}`;
}
```


Comment on lines +64 to +66
```typescript
const settings = SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH);
const maxTokensPerDay = parseInt(settings.CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY, 10) || 100_000;
const throttleMs = parseInt(settings.CLAUDE_MEM_OBSERVER_THROTTLE_MS, 10) || 5000;
```
Contributor


P2 || 0 fallback makes it impossible to disable the budget cap

parseInt(settings.CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY, 10) || 100_000 treats 0 as falsy and falls back to 100 000. A user who sets the value to "0" (to effectively disable the cap) will silently get the default limit instead. The same issue applies to the throttleMs line. Consider using an explicit NaN guard:

Suggested change

Before:

```typescript
const settings = SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH);
const maxTokensPerDay = parseInt(settings.CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY, 10) || 100_000;
const throttleMs = parseInt(settings.CLAUDE_MEM_OBSERVER_THROTTLE_MS, 10) || 5000;
```

After:

```typescript
const settings = SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH);
const parsed = parseInt(settings.CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY, 10);
const maxTokensPerDay = !isNaN(parsed) && parsed > 0 ? parsed : 100_000;
const parsedThrottle = parseInt(settings.CLAUDE_MEM_OBSERVER_THROTTLE_MS, 10);
const throttleMs = !isNaN(parsedThrottle) && parsedThrottle >= 0 ? parsedThrottle : 5000;
```


Comment on lines +61 to +66
```typescript
canProcessObservation(): boolean {
  this.maybeResetDailyBudget();

  const settings = SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH);
  const maxTokensPerDay = parseInt(settings.CLAUDE_MEM_OBSERVER_MAX_TOKENS_PER_DAY, 10) || 100_000;
  const throttleMs = parseInt(settings.CLAUDE_MEM_OBSERVER_THROTTLE_MS, 10) || 5000;
```
Contributor


P2 Settings file read on every canProcessObservation() call

SettingsDefaultsManager.loadFromFile(USER_SETTINGS_PATH) performs a synchronous disk read each time this method is invoked (potentially on every incoming observation). The same pattern is repeated in getBudgetStatus() and recordTokensUsed() (via maybeResetDailyBudget()). For high-frequency hook calls this could add measurable I/O overhead. A short-lived in-memory cache (e.g., re-read settings at most once per minute) would avoid repeated disk hits without sacrificing configurability.


@thedotmack
Owner Author

Closing to start fresh from main — will redo fixes isolated in Docker container.

@thedotmack thedotmack closed this Apr 20, 2026


Development

Successfully merging this pull request may close these issues.

  • [BUG] Observer background sessions burn excessive tokens with no budget cap
  • [BUG] Observer JSONL files accumulate indefinitely, reaching tens of GB
