Skip to content
This repository was archived by the owner on Oct 10, 2025. It is now read-only.

vector extension optional params + features#5955

Merged
carminite merged 20 commits intomasterfrom
origin/vector-index-optional-params
Sep 8, 2025
Merged

vector extension optional params + features#5955
carminite merged 20 commits intomasterfrom
origin/vector-index-optional-params

Conversation

@carminite
Copy link
Copy Markdown
Contributor

@carminite carminite commented Sep 3, 2025

Description

  • Implements skip_if_exists optional param for CREATE_VECTOR_INDEX cypher function
  • Implements skip_if_not_exists optional param for DROP_VECTOR_INDEX cypher function
  • Basic test of all above in error_suppress.test
  • ~~Implemented DROP_ALL_VECTOR_INDEXES(tableName) to erase all (and only) vector indexes from a table.~~FEATURE REMOVED
  • Basic test of above in drop_all.test ditto

Fixes #5808
Associated docs (issue or PR):
#5808
#5755 (not too sure whether I should cover this as well)

Contributor agreement

@carminite
Copy link
Copy Markdown
Contributor Author

General changes and worries:

  • skip_if_exists is an optional parameter in CREATE_VECTOR_INDEX, default set to false
  • skipIfExists is a new field in HNSWIndexConfig (though it doesn't have any use after the cypher query resolves)
  • skipAfterBind is a new flag in CreateHNSWIndexBindData that gets set to true if the correct BinderException::what() message is detected
  • Since many of the required fields in CreateHNSWIndexBindData are invalid after erroneous bind, placeholder (bad) values are used (m a g i c n u m b e r s)

@carminite carminite marked this pull request as draft September 3, 2025 16:07
Comment thread test_list Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Sep 3, 2025

Codecov Report

❌ Patch coverage is 89.83051% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.76%. Comparing base (45a8b88) to head (aa33657).
⚠️ Report is 5 commits behind head on master.

Files with missing lines Patch % Lines
extension/vector/src/index/hnsw_index_utils.cpp 70.00% 6 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5955   +/-   ##
=======================================
  Coverage   85.76%   85.76%           
=======================================
  Files        1641     1641           
  Lines       75467    75525   +58     
  Branches     8985     9003   +18     
=======================================
+ Hits        64721    64777   +56     
- Misses      10488    10490    +2     
  Partials      258      258           
Flag Coverage Δ
extension 63.82% <89.83%> (+0.02%) ⬆️
in-mem 80.98% <ø> (+0.01%) ⬆️
on-disk 86.32% <ø> (-0.01%) ⬇️
recovery 86.33% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Sep 3, 2025

Benchmark Result

Master commit hash: 7924145a5c02d5c89f1d95c28d9ec4dc5d0abc1b
Branch commit hash: f79223a2dd4feae532347f48c19e094585bde83e

Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
join q31 7.60 5.05 2.56 (50.67%)
join SelectiveTwoHopJoin 42.50 54.20 -11.70 (-21.58%)
ldbc_snb_is q33 16.87 13.03 3.84 (29.47%)
multi-rel multi-rel-lookup 6.16 10.83 -4.67 (-43.11%)
recursive_join recursive-join-dense 5424.52 7110.42 -1685.90 (-23.71%)
recursive_join recursive-join-trail 5674.21 7097.19 -1422.98 (-20.05%)
Other queries
Query Group Query Name Mean Time - Commit (ms) Mean Time - Master (ms) Diff
aggregation q24 699.80 709.60 -9.80 (-1.38%)
aggregation q28 7678.89 7649.37 29.52 (0.39%)
filter q14 59.81 59.58 0.23 (0.38%)
filter q15 62.64 62.56 0.08 (0.12%)
filter q16 273.44 275.96 -2.53 (-0.92%)
filter q17 378.66 382.22 -3.56 (-0.93%)
filter q18 1837.74 1866.12 -28.38 (-1.52%)
filter zonemap-node 23.62 23.60 0.02 (0.08%)
filter zonemap-node-lhs-cast 23.95 24.06 -0.11 (-0.48%)
filter zonemap-node-null 23.71 23.52 0.19 (0.81%)
filter zonemap-rel 5566.94 5588.44 -21.50 (-0.38%)
fixed_size_expr_evaluator q07 620.51 618.55 1.96 (0.32%)
fixed_size_expr_evaluator q08 905.07 905.80 -0.73 (-0.08%)
fixed_size_expr_evaluator q09 904.55 904.86 -0.31 (-0.03%)
fixed_size_expr_evaluator q10 190.27 189.75 0.52 (0.28%)
fixed_size_expr_evaluator q11 189.83 190.34 -0.52 (-0.27%)
fixed_size_expr_evaluator q12 168.37 167.38 1.00 (0.59%)
fixed_size_expr_evaluator q13 1494.99 1507.17 -12.19 (-0.81%)
fixed_size_seq_scan q23 46.97 44.76 2.21 (4.93%)
join q29 756.20 789.36 -33.16 (-4.20%)
join q30 1688.06 1754.01 -65.95 (-3.76%)
ldbc_snb_ic q35 9.92 8.78 1.14 (13.04%)
ldbc_snb_ic q36 98.64 90.76 7.88 (8.69%)
ldbc_snb_is q32 5.12 5.00 0.12 (2.37%)
ldbc_snb_is q34 1.17 1.26 -0.09 (-7.26%)
limit push-down-limit-into-distinct 1986.04 1932.60 53.44 (2.77%)
multi-rel multi-rel-large-scan 1509.42 1465.64 43.78 (2.99%)
multi-rel multi-rel-small-scan 194.93 205.28 -10.36 (-5.04%)
order_by q25 63.31 66.21 -2.91 (-4.39%)
order_by q26 375.23 375.89 -0.65 (-0.17%)
order_by q27 1298.90 1306.02 -7.12 (-0.55%)
recursive_join recursive-join-bidirection 291.44 314.81 -23.37 (-7.42%)
recursive_join recursive-join-path 22528.65 22957.64 -428.99 (-1.87%)
recursive_join recursive-join-sparse 7.28 7.38 -0.10 (-1.40%)
scan_after_filter q01 104.37 105.13 -0.75 (-0.72%)
scan_after_filter q02 89.00 91.24 -2.24 (-2.46%)
shortest_path_ldbc100 q37 75.84 77.53 -1.69 (-2.18%)
shortest_path_ldbc100 q38 281.95 340.38 -58.43 (-17.17%)
shortest_path_ldbc100 q39 83.43 87.19 -3.77 (-4.32%)
shortest_path_ldbc100 q40 544.87 524.13 20.74 (3.96%)
var_size_expr_evaluator q03 2119.40 2087.13 32.27 (1.55%)
var_size_expr_evaluator q04 2171.20 2167.10 4.10 (0.19%)
var_size_expr_evaluator q05 2518.35 2517.90 0.45 (0.02%)
var_size_expr_evaluator q06 1265.66 1274.50 -8.84 (-0.69%)
var_size_seq_scan q19 1343.33 1345.01 -1.69 (-0.13%)
var_size_seq_scan q20 2639.06 2667.46 -28.40 (-1.06%)
var_size_seq_scan q21 2157.58 2156.60 0.98 (0.05%)
var_size_seq_scan q22 109.60 109.01 0.58 (0.54%)

@carminite carminite self-assigned this Sep 3, 2025
Comment thread extension/vector/src/function/create_hnsw_index.cpp Outdated
Comment thread extension/vector/src/function/drop_hnsw_index.cpp Outdated
Copy link
Copy Markdown
Contributor

@sdht0 sdht0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll let @ray6080 do the main review. Just leaving some observations.

Comment thread extension/vector/src/function/create_hnsw_index.cpp Outdated
Comment thread extension/vector/src/function/create_hnsw_index.cpp Outdated
@carminite
Copy link
Copy Markdown
Contributor Author

New solution checks if indexExists twice with the same values, might be redundant. In the future, split bindNodeTable and move some logic to the operation bindFunc

@carminite
Copy link
Copy Markdown
Contributor Author

Above comment resolved, should be good to go

@carminite carminite marked this pull request as ready for review September 3, 2025 21:42
@@ -0,0 +1,87 @@
#include "catalog/catalog.h"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need such function right now. @ray6080 can comment there as well.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread extension/vector/src/include/function/hnsw_index_functions.h
Comment thread extension/vector/src/include/function/hnsw_index_functions.h Outdated
Comment thread extension/vector/src/include/index/hnsw_config.h Outdated
Comment thread extension/vector/src/include/index/hnsw_index_utils.h Outdated
Comment thread extension/vector/src/index/hnsw_index_utils.cpp Outdated
Comment thread extension/vector/src/main/vector_extension.cpp Outdated
Comment thread extension/vector/test/test_files/drop_all.test Outdated
Comment thread extension/vector/src/index/hnsw_index_utils.cpp
Comment thread extension/vector/src/index/hnsw_index_utils.cpp Outdated
Comment thread extension/vector/src/index/hnsw_index_utils.cpp Outdated
Comment thread extension/vector/src/index/hnsw_index_utils.cpp
Comment thread extension/vector/src/index/hnsw_config.cpp
@carminite carminite requested a review from acquamarin September 5, 2025 17:42
Comment thread extension/vector/src/function/create_hnsw_index.cpp
Comment thread extension/vector/src/index/hnsw_index_utils.cpp Outdated
Copy link
Copy Markdown
Contributor

@ray6080 ray6080 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks!

Let me know to take a final look when you're done with comments and rebasing.

const auto nodeTableEntry = HNSWIndexUtils::bindNodeTable(*context, tableName);
if (HNSWIndexUtils::validateIndexExistence(*context, nodeTableEntry, indexName,
HNSWIndexUtils::IndexOperation::CREATE, config.conflictAction)) {
return std::make_unique<CreateHNSWIndexBindData>(context, indexName, nullptr, 0, 0,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We inline comments as follows:

Suggested change
return std::make_unique<CreateHNSWIndexBindData>(context, indexName, nullptr, 0, 0,
return std::make_unique<CreateHNSWIndexBindData>(context, indexName, nullptr /*nodeTableEntry*/, 0 /*propertyID*/, 0 /*numNodes*/,

catalog::TableCatalogEntry* tableEntry;
common::property_id_t propertyID;
HNSWIndexConfig config;
bool skipAfterBind;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename this to skipIndexCreation, which is more accurate.

struct DropHNSWIndexBindData final : TableFuncBindData {
catalog::NodeTableCatalogEntry* tableEntry;
std::string indexName;
bool skipAfterBind;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename to something along the line of skipIndexDropping.

@carminite carminite merged commit 3ebd904 into master Sep 8, 2025
27 of 28 checks passed
@carminite carminite deleted the origin/vector-index-optional-params branch September 8, 2025 18:32
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

extensions(fts,vector): support IF NOT EXISTS

4 participants