CAGRA: variable graph degree for CPU-based algorithm #2031

Open
achirkin wants to merge 19 commits into NVIDIA:main from achirkin:fea-variable-graph-degree

Conversation

@achirkin

Contributor

Modify the optimize routine of the CAGRA build process to allow a variable graph degree.

Introduce the variable_graph_degree_fraction parameter (default: 1.0, which keeps the normal constant-degree behavior). This parameter defines the minimum allowed graph degree for any graph node.
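
As a rough illustration of the parameter described above, the sketch below shows one way a fraction could translate into a minimum per-node degree. The helper name min_allowed_degree and the ceil-based rounding are assumptions made for this example and are not taken from the PR; the actual implementation may round or clamp differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not part of the PR): maps a fraction in (0, 1] to a
// minimum per-node degree, ASSUMING min = ceil(fraction * graph_degree).
inline std::size_t min_allowed_degree(std::size_t graph_degree, double fraction)
{
  assert(graph_degree >= 1);
  assert(fraction > 0.0 && fraction <= 1.0);
  const auto scaled =
    static_cast<std::size_t>(std::ceil(fraction * static_cast<double>(graph_degree)));
  // Keep the result inside [1, graph_degree] regardless of rounding.
  return std::min(std::max(scaled, std::size_t{1}), graph_degree);
}
```

Under this assumption, a fraction of 1.0 returns the full graph degree, which matches the constant-degree default stated in the description.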

@copy-pr-bot

copy-pr-bot Bot commented Apr 16, 2026


Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@achirkin achirkin self-assigned this Apr 16, 2026
@achirkin achirkin added the feature request (New feature or request) and non-breaking (Introduces a non-breaking change) labels Apr 16, 2026
@achirkin achirkin moved this to In Progress in Unstructured Data Processing Apr 16, 2026
@achirkin

Contributor Author

/ok to test

@achirkin

Contributor Author

/ok to test

@achirkin

Contributor Author

/ok to test

@mfoerste4 mfoerste4 left a comment

Contributor

graph_core.cuh looks good to me.

Comment on lines +1868 to +1873
if (variable_graph_degree) {
  RAFT_LOG_INFO("# Pruning kNN graph (size=%lu, degree=%lu, target_pruned_degree=%lu)",
                graph_size,
                knn_graph_degree,
                target_pruned_degree);
}

Contributor

You might want to move the legacy RAFT_LOG call down here as well and combine the two.

0.0,
false,
normalize_mean);
raft::copy(res, raft::make_host_scalar_view(&avg_natural), d_avg_natural.view());

Contributor

Why did you not choose map_reduce on the vector directly?

Comment on lines +355 to +364
if constexpr (VariableDegree) {
  if (i + 1 == target_pruned_degree) {
    // Freeze the detour level after we've placed exactly target_pruned_degree edges.
    target_detour_level = warp_min_count;
  } else if (i >= target_pruned_degree && warp_min_count > target_detour_level &&
             natural_degree == output_graph_degree) {
    // The detour level just rose above the target band. Record the natural degree once.
    natural_degree = i;
  }
}

Contributor

So we just track the 'natural_degree' here but continue to fill up the output graph - is this required?

Contributor Author

Yes, because we need the full graph in the other steps; we only use the natural degree later during the merging step.
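
The tracking logic discussed in this thread can be sketched on the CPU as follows. This is a simplified, single-threaded illustration and not the kernel from the PR: the function name track_natural_degree, the PruneStats struct, and the assumption that edges are placed in order of nondecreasing detour count are all hypothetical. The variable names mirror the quoted snippet, with detour_counts[i] standing in for warp_min_count at iteration i.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch.
struct PruneStats {
  std::size_t target_detour_level;  // detour count of the target_pruned_degree-th edge
  std::size_t natural_degree;       // number of edges inside the target detour band
};

// detour_counts[i] is the detour count of the i-th placed edge; this sketch
// ASSUMES the sequence is nondecreasing. Its length plays the role of
// output_graph_degree.
inline PruneStats track_natural_degree(const std::vector<std::size_t>& detour_counts,
                                       std::size_t target_pruned_degree)
{
  const std::size_t output_graph_degree = detour_counts.size();
  assert(target_pruned_degree >= 1 && target_pruned_degree <= output_graph_degree);

  PruneStats stats{0, output_graph_degree};
  for (std::size_t i = 0; i < output_graph_degree; ++i) {
    const std::size_t level = detour_counts[i];
    if (i + 1 == target_pruned_degree) {
      // Freeze the detour level once exactly target_pruned_degree edges are placed.
      stats.target_detour_level = level;
    } else if (i >= target_pruned_degree && level > stats.target_detour_level &&
               stats.natural_degree == output_graph_degree) {
      // First edge above the target band: record the natural degree once.
      stats.natural_degree = i;
    }
    // Edge i is still placed in the output graph; only the bookkeeping changes.
  }
  return stats;
}
```

Note that the loop never stops early: every edge is still placed, which reflects the answer above that the full graph is needed by the other steps and the natural degree is only consumed later.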

@achirkin

Contributor Author

/ok to test

@achirkin achirkin marked this pull request as ready for review July 1, 2026 12:50
@achirkin achirkin requested a review from a team as a code owner July 1, 2026 12:50
@cjnolet

cjnolet commented Jul 1, 2026

Contributor

@achirkin can we get some benchmarks attached to this PR please demonstrating the improvements?

It's a lot easier to point users to PRs to see benefits of changes than it is to have to look up slide decks or internal convos.

@achirkin

achirkin commented Jul 3, 2026

Contributor Author

Here are results on a couple of datasets. We compare the red line (CAGRA graph with default parameters) against the brown line (HNSW graph). I'm not yet happy with the performance and want to tweak the merging logic a little more. Apart from this, the PR is ready for review.

[Figures: search-wiki-split-10M-vardegree, search-openai-5M-vardegree]

(The build time is virtually unchanged with or without the variable degree option.)

4 participants