Skip to content

label_propagation has unbounded while True, oscillates forever on mid-sized graphs #1397

@rmulligan

Description

@rmulligan

label_propagation has unbounded while True and can oscillate forever on mid-sized graphs

Summary

graphiti_core/utils/maintenance/community_operations.py::label_propagation
contains an unbounded while True: loop with no iteration cap. On
any graph where two or more adjacent nodes have tied neighbor counts,
the tiebreaker rule new_community = max(community_candidate, curr_community)
combined with the deterministic update order causes pairs of nodes
to flip each other's community labels on every pass without ever
converging.

We observed this on a 1,286-entity / 2,051-edge graph: the
function ran for 44 minutes at 100% CPU with zero database
activity
before we killed the build. Python flame graph showed
100% of the time in label_propagation's inner community_candidates
defaultdict accumulation loop — no I/O, no LLM calls, no GC, just
pure Python churn on tie flipping.

Environment

  • graphiti-core 0.28.2
  • Python 3.12
  • FalkorDB driver
  • Graph: 1,286 entities, 2,051 edges built up over ~2 weeks of
    autonomous agent ingest

Source (community_operations.py:92-137)

def label_propagation(projection: dict[str, list[Neighbor]]) -> list[list[str]]:
    # Implement the label propagation community detection algorithm.
    # 1. Start with each node being assigned its own community
    # 2. Each node will take on the community of the plurality of its neighbors
    # 3. Ties are broken by going to the largest community
    # 4. Continue until no communities change during propagation

    community_map = {uuid: i for i, uuid in enumerate(projection.keys())}

    while True:                                          # <-- unbounded loop
        no_change = True
        new_community_map: dict[str, int] = {}

        for uuid, neighbors in projection.items():
            curr_community = community_map[uuid]

            community_candidates: dict[int, int] = defaultdict(int)
            for neighbor in neighbors:
                community_candidates[community_map[neighbor.node_uuid]] += neighbor.edge_count
            community_lst = [
                (count, community) for community, count in community_candidates.items()
            ]

            community_lst.sort(reverse=True)
            candidate_rank, community_candidate = community_lst[0] if community_lst else (0, -1)
            if community_candidate != -1 and candidate_rank > 1:
                new_community = community_candidate
            else:
                new_community = max(community_candidate, curr_community)   # <-- tiebreaker

            new_community_map[uuid] = new_community

            if new_community != curr_community:
                no_change = False

        if no_change:
            break

        community_map = new_community_map
    # ...

Why it oscillates

Consider two nodes A and B connected by an edge with weight 1.
Initial assignment: A=0, B=1.

Iteration 1:

  • Update A: its only neighbor is B with community 1. Rank 1, not

    1, so hits the tiebreaker: new_community = max(1, 0) = 1.
    A → 1.

  • Update B: its only neighbor is A, which in community_map
    (not new_community_map!) still shows 0. Rank 1, not > 1,
    tiebreaker: new_community = max(0, 1) = 1. B → 1.

Iteration 2 with community_map = {A: 1, B: 1}:

  • Update A: neighbor B has community 1. Tiebreaker: max(1, 1) = 1.
    A → 1. No change.
  • Update B: same. No change.
  • no_change = True, break.

So for a trivially symmetric two-node case, it converges in 2
iterations. The problem is non-symmetric cases where the update
order causes asymmetric flips. On our observed graph, the result
was a persistent oscillation with no_change never reaching True.

A second reason the tiebreaker fails to converge: max(community_candidate, curr_community)
prefers higher-numbered communities, but the initial assignment
gives each node a community index equal to its position in the
projection dict. If the iteration order is stable (Python 3.7+),
this can create a systematic bias that flips nodes back and forth.

Reproducer

The simplest repro we observed is running build_communities() on
a live knowledge graph with ~1,000+ loosely-connected nodes. A
synthetic repro requires a specific edge topology; we haven't
narrowed down the minimum-viable graph yet, but we can share an
export of the FalkorDB graph that triggers it (2.8 MB dump) if
that would help.

If you want an algorithmic test, Erdős–Rényi random graphs at
n=1000, p=0.005 tend to produce the oscillation pattern.

Suggested fix

Two complementary changes:

  1. Add an iteration cap. Real-world label propagation on
    well-structured graphs converges in fewer than 20 iterations.
    A cap of 200 is a safety ceiling — if we hit it, the labels are
    approximate but the build moves forward instead of hanging.

    MAX_ITERATIONS = 200
    
    for iteration in range(MAX_ITERATIONS):
        no_change = True
        new_community_map: dict[str, int] = {}
        # ... body ...
        if no_change:
            logger.info('label_propagation converged in %d iterations', iteration + 1)
            break
        community_map = new_community_map
    else:
        logger.warning(
            'label_propagation hit the %d-iteration safety cap without '
            'converging; community labels are approximate', MAX_ITERATIONS,
        )
        community_map = new_community_map
  2. Fix the tiebreaker to prefer stability over bias.
    Instead of max(community_candidate, curr_community), prefer
    curr_community when the rank is tied — this is how the
    reference label-propagation algorithm is usually specified,
    and it short-circuits oscillation almost entirely:

    if community_candidate != -1 and candidate_rank > 1:
        new_community = community_candidate
    else:
        # Prefer the current community when tied — avoids oscillation
        new_community = curr_community if community_candidate in community_candidates else community_candidate

Our workaround

We monkey-patch label_propagation with an iteration-capped version
before calling graphiti.build_communities(). Full patch available
at scripts/build_communities.py::_apply_label_propagation_patch.
Cap is 200, converges in <20 on healthy graphs, hits the cap and
logs a warning on pathological ones.

Impact

For any graph that hits the oscillation pattern, build_communities
hangs indefinitely (we killed ours at 44 minutes). This is a
production-blocking issue for anyone using Graphiti's community
feature at scale, and we suspect many users never notice because
they silently run out of patience and kill the build.

Labels suggestion

bug, performance, community, infinite loop

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions