Replace flat restart counter with windowed restart guard #2057
Closed
- Create shared RestartGuard module (src/services/worker/RestartGuard.ts) that counts restarts within a 60-second window (max 5) instead of using a flat counter that only resets on clean completion.
- Update ActiveSession interface with restartTimestamps field.
- Update both SessionRoutes.ts and worker-service.ts to use the shared windowed guard.
- SessionRoutes.ts now marks stranded messages as abandoned when the guard trips (previously left them stuck forever).
- Add comprehensive tests for the windowed restart guard.
- Update architecture docs to reflect new circuit-breaker behavior.

Agent-Logs-Url: https://github.com/thedotmack/claude-mem/sessions/1a7b5a77-2012-40b0-bb05-a4a4cd293148
Co-authored-by: thedotmack <683968+thedotmack@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix generator restart guard to recover pending messages" to "Replace flat restart counter with windowed restart guard" on Apr 17, 2026
The `consecutiveRestarts` counter (cap=3) only resets on fully clean completion, so any long-running session that legitimately restarts more than 3 times across its lifetime gets permanently aborted, stranding pending messages with no recovery. The guard itself is necessary to prevent tight crash-loops from burning tokens, so it can't simply be removed.
Approach
Replace the flat counter with a time-windowed guard: only restarts within a 60-second sliding window count toward the limit (max 5). Tight loops trip the guard in seconds; occasional restarts across hours never accumulate.
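
A minimal sketch of the windowed guard described above, not the code from this PR. The function names `recordRestart()`, `resetRestarts()`, and `getRecentRestartCount()` and the `restartTimestamps: number[]` field come from the change list; the signatures, the constant names, the `RestartTracked` interface, the explicit `now` parameter, and the reading of "max 5" as "the sixth restart inside the window trips the guard" are all assumptions made for illustration.

```typescript
// Hypothetical sketch: the 60-second window and max of 5 are from the PR text,
// the constant names and signatures below are assumed.
const RESTART_WINDOW_MS = 60_000;
const MAX_RESTARTS_IN_WINDOW = 5;

// Assumed minimal shape; the PR adds restartTimestamps to ActiveSession.
interface RestartTracked {
  restartTimestamps: number[];
}

// Drop timestamps that have fallen out of the sliding window.
function pruneOld(session: RestartTracked, now: number): void {
  const cutoff = now - RESTART_WINDOW_MS;
  session.restartTimestamps = session.restartTimestamps.filter((t) => t > cutoff);
}

// Record a restart. Returns true if the restart is allowed,
// false if the guard has tripped (assumed semantics).
function recordRestart(session: RestartTracked, now: number = Date.now()): boolean {
  pruneOld(session, now);
  session.restartTimestamps.push(now);
  return session.restartTimestamps.length <= MAX_RESTARTS_IN_WINDOW;
}

// Number of restarts still inside the window.
function getRecentRestartCount(session: RestartTracked, now: number = Date.now()): number {
  pruneOld(session, now);
  return session.restartTimestamps.length;
}

// Forget all recorded restarts, e.g. after a clean completion.
function resetRestarts(session: RestartTracked): void {
  session.restartTimestamps = [];
}

// Tight loop: six restarts one second apart.
const tight: RestartTracked = { restartTimestamps: [] };
const tightResults = [0, 1, 2, 3, 4, 5].map((i) => recordRestart(tight, 1_000_000 + i * 1_000));

// Occasional restarts: ten restarts one hour apart.
const spread: RestartTracked = { restartTimestamps: [] };
const spreadResults = Array.from({ length: 10 }, (_, i) => recordRestart(spread, i * 3_600_000));
```

Under these assumptions the tight loop is allowed five restarts and refused on the sixth, while the hourly restarts are always allowed because each earlier timestamp is pruned before the next one is recorded. Passing `now` explicitly keeps the sketch deterministic and easy to test.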
Changes
- `src/services/worker/RestartGuard.ts`: new shared module with `recordRestart()`, `resetRestarts()`, `getRecentRestartCount()`. Prunes timestamps outside the window on each call.
- `src/services/worker-types.ts`: add `restartTimestamps: number[]` to `ActiveSession`
- `src/services/worker/SessionManager.ts`: initialize `restartTimestamps: []`
- `SessionRoutes.ts` + `worker-service.ts`: both call sites now use the shared windowed guard instead of independent flat counters
- `SessionRoutes.ts`: guard trip now calls `markAllSessionMessagesAbandoned()` instead of just aborting (previously left messages stranded forever)
Tests
16 tests covering windowed pruning, guard trip/allow logic, and the two acceptance scenarios: