Skip to content

fix: isolate claude hook installs + dedupe on reinstall (GRA-1233)#269

Open
Gradata wants to merge 2 commits into
mainfrom
gra-108-hook-test-isolation-dedupe
Open

fix: isolate claude hook installs + dedupe on reinstall (GRA-1233)#269
Gradata wants to merge 2 commits into
mainfrom
gra-108-hook-test-isolation-dedupe

Conversation

@Gradata

@Gradata Gradata commented Jun 8, 2026

Copy link
Copy Markdown
Owner

Fixes GRA-1233: gradata install --agent claude-code now installs all 5 hooks including PostToolUse auto_correct, session_close, pre_compact, and context_inject — no longer PreToolUse-only.

  • Adds HOOKS tuple with all 5 hook registrations
  • Replaces single-hook install() with per-module loop
  • Adds has_exact_installed_hook_set for idempotency
  • Adds remove_existing_module_hook to avoid stale entries on reinstall
  • Updates uninstall() to sweep all hook events

Ref: GRA-1233 (02508378-fdad-40ad-96a1-23e99872f78f)

@greptile-apps greptile-apps Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough
  • Fixed hook installation: gradata install --agent claude-code now installs all five hooks (PostToolUse auto_correct, session_close, pre_compact, context_inject, and PreToolUse) instead of just PreToolUse
  • Idempotent installation: Introduced _has_exact_installed_hook_set() to verify all hooks are installed and prevent duplicate entries on reinstall
  • Stale hook cleanup: Added _remove_existing_module_hook() to remove outdated hook entries before reinstalling, preserving unrelated user-owned hooks
  • Enhanced diagnostics: New _check_missing_hook_brain_dirs() diagnostic detects and optionally removes Claude hooks that reference missing BRAIN_DIR targets
  • Centralized settings loading: New _read_settings() helper in stale hook check for more reliable Claude settings JSON loading
  • Improved uninstall: Updated uninstall() to comprehensively sweep and remove all hook events across the configuration
  • Test coverage: Added tests validating multi-hook installation, stale hook removal on reinstall, and missing brain directory detection

Walkthrough

This PR detects and removes Claude hook entries that reference missing BRAIN_DIR targets. It refactors stale-hook detection, adds a removal function to clean invalid hooks from settings JSON, updates the adapter's install logic to deduplicate hooks idempotently, and integrates the check into the doctor's diagnostics pipeline.

Changes

Hook brain directory detection and removal workflow

Layer / File(s) Summary
Stale hook detection and removal infrastructure
Gradata/src/gradata/hooks/stale_hook_check.py
_missing_hook_brain_dirs() refactored to use a new _read_settings() helper. New _remove_missing_hook_brain_dirs() traverses the nested hooks structure across events, groups, and individual hooks; filters entries whose derived BRAIN_DIR does not exist; deduplicates removals; prunes empty event/group nodes; and persists the cleaned settings JSON back to disk.
Claude Code adapter idempotent installation
Gradata/src/gradata/hooks/adapters/claude_code.py
install() now calls _has_exact_installed_hook_set() to verify all desired Gradata module hooks are already present. If not, it removes any existing Gradata-owned hooks per (event, module) pair before fresh installation. New helpers _is_gradata_module_hook(), _has_exact_installed_hook_set(), and _remove_existing_module_hook() manage precise hook identification and selective removal while preserving unrelated user-owned entries.
Doctor diagnostic integration
Gradata/src/gradata/_doctor.py
New _check_missing_hook_brain_dirs(auto_remove=True) detects missing BRAIN_DIR targets in Claude hooks and optionally removes them, returning a structured status with counts and capped previews. Integrated into diagnose() as a local check in the ordered pipeline.
Test coverage for detection and adapter cleanup
Gradata/tests/test_doctor_hooks.py, Gradata/tests/test_hook_adapters.py
New doctor test validates detection and removal of missing brain directories with mixed existing/missing targets. New adapter tests verify stale brain-path deduplication during initial installation and reinstall scenarios. Includes json import for hook configuration serialization.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Gradata/gradata#248: Contains the same functions and behavior changes in _doctor.py, hooks/adapters/claude_code.py, hooks/stale_hook_check.py, and related tests.
  • Gradata/gradata#242: Modifies the same Claude hook staleness detection path in stale_hook_check.py, including _missing_hook_brain_dirs() and brain-root handling logic.

Suggested labels

bug

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.44% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title accurately summarizes the main change: isolating Claude hook installs and deduplicating on reinstall, with a reference to the issue number.
Description check ✅ Passed The description directly relates to the changeset, detailing the specific changes made to fix GRA-1233 including hook installations, new helper functions, and updates to install/uninstall logic.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch gra-108-hook-test-isolation-dedupe

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.22.0)

OpenGrep fatal error (exit code 2):
┌──────────────┐
│ Opengrep CLI │
└──────────────┘

�[32m✔�[39m �[1mOpengrep OSS�[0m
�[32m✔�[39m Basic security coverage for first-party code vulnerabilities.

�[1m Loading rules from local config...�[0m
[00.17][ERROR]: Error: exception Glob.Lexer.Syntax_error("malformed glob pattern: missing ']'")
Raised at Glob__Lexer.syntax_error in file "libs/glob/Lexer.mll", line 8, characters 2-26
Called from Glob__Lexer.__ocaml_lex_token_rec in file "libs/glob/Lexer.mll", line 29, characters 26-53
Cal


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the bug Something isn't working label Jun 8, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Gradata/src/gradata/hooks/adapters/claude_code.py`:
- Around line 101-113: The function _has_exact_installed_hook_set currently can
miss duplicate installed hooks inside a single event group; update the loop so
you explicitly detect duplicates by ensuring module_entries contains exactly one
matching entry (len(module_entries) == 1) and that the single entry exactly
equals the expected signature (compare str(module_entries[0]) ==
f"{sig}:{event}:{module}" rather than using substring containment); if
len(module_entries) != 1 or the equality check fails, return False. Use the
existing HOOKS and _is_gradata_module_hook helpers to build module_entries but
rely on explicit count+exact-string-check to catch duplicates.

In `@Gradata/src/gradata/hooks/stale_hook_check.py`:
- Around line 260-262: The code currently calls path.parent.mkdir(...) then
path.write_text(json.dumps(settings,...)) in stale_hook_check.py which can
produce a truncated settings.json if interrupted; replace the direct write with
the project's atomic JSON write helper (e.g., use the atomic-write helper
function used elsewhere) to write the serialized settings atomically: ensure the
directory is created as before, then call the helper (passing the target Path
and the settings dict/serialized JSON) instead of path.write_text; keep the same
formatting (indent and sort_keys) and return removed as before.
- Around line 210-216: The _remove_missing_hook_brain_dirs function can mutate
real settings when GRADATA_HOOK_ROOT or GRADATA_HOOK_ROOT_POST are set; add the
same isolation guard used earlier (see the check around line 166) to
short-circuit and return an empty list if either environment overrides are
present, before reading or modifying settings. Specifically, in
_remove_missing_hook_brain_dirs (and using the same helpers _settings_path and
_read_settings), check for GRADATA_HOOK_ROOT/GRADATA_HOOK_ROOT_POST and
immediately return [] to preserve hook-root isolation for test/dev runs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f8490051-7ad3-4cb2-b53e-96678102d254

📥 Commits

Reviewing files that changed from the base of the PR and between acde088 and 8597b76.

📒 Files selected for processing (5)
  • Gradata/src/gradata/_doctor.py
  • Gradata/src/gradata/hooks/adapters/claude_code.py
  • Gradata/src/gradata/hooks/stale_hook_check.py
  • Gradata/tests/test_doctor_hooks.py
  • Gradata/tests/test_hook_adapters.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: pytest ubuntu-latest / py3.12
  • GitHub Check: pytest ubuntu-latest / py3.11
  • GitHub Check: pytest windows-latest / py3.11
  • GitHub Check: pytest macos-latest / py3.11
  • GitHub Check: pytest windows-latest / py3.12
  • GitHub Check: pytest macos-latest / py3.12
  • GitHub Check: pytest (py3.12)
  • GitHub Check: pytest (py3.11)
🧰 Additional context used
📓 Path-based instructions (2)
Gradata/src/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/src/**/*.py: Prefer sentence-transformers for local embeddings, google-genai for Gemini embeddings, cryptography for AES-GCM encrypted system.db, bm25s for BM25 rule ranking, and mem0ai for external memory adapters — guard all optional dependency imports with try / except ImportError at the call site, never at module level
Maintain strict layering: Layer 0 (Primitives: _types.py, _db.py, _events.py, _paths.py, _file_lock.py; Patterns: contrib/patterns/) must never import from Layer 1 (Enhancements: enhancements/, rules/) or Layer 2 (Public API: brain.py, cli.py, daemon.py, mcp_server.py)
Never use bare except: pass — use typed exceptions or at minimum logger.warning(...) with exc_info=True to avoid silent failure in a memory product
Never import from out-of-scope sibling directories ../Sprites/ or ../Hausgem/ within gradata/* code — that is a layering bug
Never leak private-sibling paths into public docs/code — no references to ../Sprites/, ../Hausgem/, email addresses, OneDrive paths, or Sprites-specific examples from inside gradata/*
Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes

Files:

  • Gradata/src/gradata/_doctor.py
  • Gradata/src/gradata/hooks/adapters/claude_code.py
  • Gradata/src/gradata/hooks/stale_hook_check.py
Gradata/tests/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/tests/**/*.py: Set BRAIN_DIR environment variable via tmp_path in conftest.py for test isolation — ensure _paths.py module cache refreshes when calling Brain.init() directly inside tests
Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic); mark integration tests with @pytest.mark.integration and skip them by default (they hit real LLM APIs)

Files:

  • Gradata/tests/test_hook_adapters.py
  • Gradata/tests/test_doctor_hooks.py
🧠 Learnings (3)
📓 Common learnings
Learnt from: Gradata
Repo: Gradata/gradata PR: 0
File: :0-0
Timestamp: 2026-04-17T17:18:07.439Z
Learning: In PR `#102` (gradata/gradata), Round 2 addressed: cli.py env-first brain resolution (GRADATA_BRAIN > --brain-dir > cwd), _tenant.py corrupt .tenant_id overwrite, _env_int default clamping to minimum, and _events.py tenant-scoped fallback SELECT for dedup. All ruff and 99 tests green after these fixes.
📚 Learning: 2026-05-01T15:50:32.772Z
Learnt from: CR
Repo: Gradata/gradata PR: 0
File: Gradata/AGENTS.md:0-0
Timestamp: 2026-05-01T15:50:32.772Z
Learning: Applies to Gradata/tests/**/*.py : Set `BRAIN_DIR` environment variable via `tmp_path` in conftest.py for test isolation — ensure `_paths.py` module cache refreshes when calling `Brain.init()` directly inside tests

Applied to files:

  • Gradata/tests/test_hook_adapters.py
  • Gradata/tests/test_doctor_hooks.py
  • Gradata/src/gradata/hooks/stale_hook_check.py
📚 Learning: 2026-05-01T15:50:32.772Z
Learnt from: CR
Repo: Gradata/gradata PR: 0
File: Gradata/AGENTS.md:0-0
Timestamp: 2026-05-01T15:50:32.772Z
Learning: Use `from gradata import Brain` as the public entry point — `brain.correct()` is THE entry point for the headline product promise

Applied to files:

  • Gradata/src/gradata/hooks/stale_hook_check.py
🔇 Additional comments (5)
Gradata/src/gradata/hooks/stale_hook_check.py (1)

166-207: LGTM!

Also applies to: 347-349

Gradata/src/gradata/hooks/adapters/claude_code.py (1)

67-99: LGTM!

Also applies to: 117-143

Gradata/src/gradata/_doctor.py (1)

236-261: LGTM!

Also applies to: 525-526

Gradata/tests/test_hook_adapters.py (1)

3-3: LGTM!

Also applies to: 135-215

Gradata/tests/test_doctor_hooks.py (1)

1-60: LGTM!

Comment on lines +101 to +113
def _has_exact_installed_hook_set(hooks: dict, sig: str) -> bool:
"""Return true only when every Gradata module hook is the desired one."""
for event, _matcher, module in HOOKS:
entries = hooks.get(event)
if not isinstance(entries, list):
return False
module_entries = [
entry for entry in entries if _is_gradata_module_hook(entry, event, module)
]
if len(module_entries) != 1:
return False
if f"{sig}:{event}:{module}" not in str(module_entries[0]):
return False

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

_has_exact_installed_hook_set() can miss duplicates inside a single group.

Line 107 builds module_entries from top-level event entries. If one group contains two gradata.hooks.<module> commands, it still counts as one entry and Line 110 passes, so install() returns already_present and skips dedupe.

Suggested fix
 def _has_exact_installed_hook_set(hooks: dict, sig: str) -> bool:
     """Return true only when every Gradata module hook is the desired one."""
     for event, _matcher, module in HOOKS:
         entries = hooks.get(event)
         if not isinstance(entries, list):
             return False
-        module_entries = [
-            entry for entry in entries if _is_gradata_module_hook(entry, event, module)
-        ]
-        if len(module_entries) != 1:
+        module_hooks: list[object] = []
+        for entry in entries:
+            if isinstance(entry, dict) and isinstance(entry.get("hooks"), list):
+                module_hooks.extend(
+                    hook
+                    for hook in entry["hooks"]
+                    if _is_gradata_module_hook(hook, event, module)
+                )
+            elif _is_gradata_module_hook(entry, event, module):
+                module_hooks.append(entry)
+        if len(module_hooks) != 1:
             return False
-        if f"{sig}:{event}:{module}" not in str(module_entries[0]):
+        if f"{sig}:{event}:{module}" not in str(module_hooks[0]):
             return False
     return True
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Gradata/src/gradata/hooks/adapters/claude_code.py` around lines 101 - 113,
The function _has_exact_installed_hook_set currently can miss duplicate
installed hooks inside a single event group; update the loop so you explicitly
detect duplicates by ensuring module_entries contains exactly one matching entry
(len(module_entries) == 1) and that the single entry exactly equals the expected
signature (compare str(module_entries[0]) == f"{sig}:{event}:{module}" rather
than using substring containment); if len(module_entries) != 1 or the equality
check fails, return False. Use the existing HOOKS and _is_gradata_module_hook
helpers to build module_entries but rely on explicit count+exact-string-check to
catch duplicates.

Comment on lines +210 to +216
def _remove_missing_hook_brain_dirs(settings_path: Path | None = None) -> list[Path]:
"""Remove Gradata hook commands whose BRAIN_DIR targets no longer exist."""
path = settings_path or _settings_path()
settings = _read_settings(path)
if not settings:
return []
hooks = settings.get("hooks")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve hook-root isolation in the removal path.

Line 210 starts a destructive cleanup path, but unlike Line 166 it does not short-circuit when GRADATA_HOOK_ROOT/GRADATA_HOOK_ROOT_POST are set. Since doctor calls this remover by default, isolated test/dev runs can still mutate a real Claude settings file.

Suggested fix
 def _remove_missing_hook_brain_dirs(settings_path: Path | None = None) -> list[Path]:
     """Remove Gradata hook commands whose BRAIN_DIR targets no longer exist."""
+    if settings_path is None and (
+        env_str("GRADATA_HOOK_ROOT") or env_str("GRADATA_HOOK_ROOT_POST")
+    ):
+        return []
     path = settings_path or _settings_path()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Gradata/src/gradata/hooks/stale_hook_check.py` around lines 210 - 216, The
_remove_missing_hook_brain_dirs function can mutate real settings when
GRADATA_HOOK_ROOT or GRADATA_HOOK_ROOT_POST are set; add the same isolation
guard used earlier (see the check around line 166) to short-circuit and return
an empty list if either environment overrides are present, before reading or
modifying settings. Specifically, in _remove_missing_hook_brain_dirs (and using
the same helpers _settings_path and _read_settings), check for
GRADATA_HOOK_ROOT/GRADATA_HOOK_ROOT_POST and immediately return [] to preserve
hook-root isolation for test/dev runs.

Comment on lines +260 to +262
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text(json.dumps(settings, indent=2, sort_keys=True) + "\n", encoding="utf-8")
return removed

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use the atomic JSON write helper for settings persistence.

Line 261 writes settings.json directly; a mid-write interruption can leave a truncated file.

As per coding guidelines, “Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Gradata/src/gradata/hooks/stale_hook_check.py` around lines 260 - 262, The
code currently calls path.parent.mkdir(...) then
path.write_text(json.dumps(settings,...)) in stale_hook_check.py which can
produce a truncated settings.json if interrupted; replace the direct write with
the project's atomic JSON write helper (e.g., use the atomic-write helper
function used elsewhere) to write the serialized settings atomically: ensure the
directory is created as before, then call the helper (passing the target Path
and the settings dict/serialized JSON) instead of path.write_text; keep the same
formatting (indent and sort_keys) and return removed as before.

Source: Coding guidelines

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant