Skip to content

fix: add iteration cap to label_propagation to prevent infinite loop#1434

Open
octo-patch wants to merge 1 commit intogetzep:mainfrom
octo-patch:fix/issue-1397-label-propagation-iteration-cap
Open

fix: add iteration cap to label_propagation to prevent infinite loop#1434
octo-patch wants to merge 1 commit intogetzep:mainfrom
octo-patch:fix/issue-1397-label-propagation-iteration-cap

Conversation

@octo-patch
Copy link
Copy Markdown

Fixes #1397

Problem

label_propagation in community_operations.py used an unbounded while True loop that could oscillate forever on mid-sized graphs (~1000+ nodes). The issue reporter observed their build_communities() call hanging for 44 minutes at 100% CPU before they killed it.

The root cause is the tiebreaker rule: when a node's neighbor count is tied, the code did new_community = max(community_candidate, curr_community), which creates a systematic bias towards higher-numbered communities. Combined with deterministic iteration order, this causes pairs of nodes to flip communities on every pass without ever converging.

Solution

Two complementary changes:

  1. Add an iteration cap (MAX_LABEL_PROPAGATION_ITERATIONS = 200): Replaces the while True loop with for iteration in range(MAX_LABEL_PROPAGATION_ITERATIONS). Real-world graphs converge in <20 iterations; the 200-iteration cap is a safety ceiling that produces approximate community labels instead of hanging indefinitely.

  2. Fix the tiebreaker: Prefer curr_community when tied instead of max(community_candidate, curr_community). This matches the reference label-propagation algorithm specification and short-circuits oscillation. When a node has no clear plurality neighbor community, keeping its current assignment is stable.

A logger.debug message is emitted on convergence; a logger.warning is emitted if the cap is hit.

Testing

The fix is a logic correction with a clear behavioral improvement:

  • On well-structured graphs: converges in <20 iterations (unchanged semantics)
  • On pathological graphs: terminates after 200 iterations with approximate labels instead of hanging
  • No external dependencies required to verify the fix

…ixes getzep#1397)

label_propagation used an unbounded while True loop that could oscillate
forever on mid-sized graphs (~1000+ nodes) due to a tiebreaker rule that
caused pairs of nodes to flip communities on every pass without converging.

Changes:
- Add MAX_LABEL_PROPAGATION_ITERATIONS = 200 safety cap
- Replace while True with for loop using the iteration cap
- Fix tiebreaker: prefer curr_community to avoid oscillation
- Log debug on convergence, warning when cap is hit
@RexLeee
Copy link
Copy Markdown

RexLeee commented Apr 25, 2026

+1 — adopting this patch as a monkey-patch in our production graphiti-core 0.28.2 deployment after hitting the bug on a bipartite supply-chain graph (details: #1397 (comment)).

The two-line change (iteration cap + curr_community tiebreaker) is minimal and easy to audit. Whether or not the maintainers prefer the more comprehensive async-LPA approach in #1388, shipping any fix would unblock production users — the loop genuinely runs forever on real-world graphs.

Copy link
Copy Markdown

@xkonjin xkonjin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: fix: add iteration cap to label_propagation to prevent infinite loop

Overall: Excellent safety fix. Replaces an unbounded while True with a finite cap and adds tie-breaking logic to prevent oscillation — both are real convergence hazards in label propagation on cyclic graphs.

Positives:

  • MAX_LABEL_PROPAGATION_ITERATIONS = 200 is a reasonable, non-magic default. Good to see it is module-level and easy to tune.
  • The tie-breaking change (new_community = curr_community instead of max(community_candidate, curr_community)) directly addresses oscillation between equal-rank neighbors, which is a classic label propagation failure mode.
  • Logging is well-placed: debug on convergence, warning on cap-hit. This helps operators distinguish "fast community detection" from "degraded approximate labels."

Potential issues:

  1. 200 may be too low for large graphs: If a projection has hundreds of nodes with long dependency chains, 200 iterations might not converge. Consider exposing max_iterations as a parameter to label_propagation() so callers can tune it, or document the chosen value in a docstring.
  2. Convergence semantics change: Previously, the function would loop until absolute stability. Now it returns approximate labels after 200 iterations. Downstream consumers (community summarization, search pruning) should be aware that labels may be unstable under the cap. The warning log helps, but a docstring note on the function itself would be stronger.
  3. No test for cap-hit path: The test coverage appears to be manual. Consider adding a unit test that constructs a deliberately oscillating graph (e.g., a 2-node cycle with symmetric edge weights) and asserts:
    • The function returns within the cap.
    • The warning is emitted.
    • Labels do not oscillate indefinitely.

Security: No issues. This is a pure algorithmic safety bound.

Verdict: Good fix, ready to merge. Recommend adding a docstring and an oscillation test in a follow-up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

label_propagation has unbounded while True, oscillates forever on mid-sized graphs

3 participants