statement-store: new api implementation #11989

Open

DenzelPenzel wants to merge 18 commits into master from denzelpenzel/statement-store-api

Conversation

@DenzelPenzel (Contributor) commented May 5, 2026

Implements #10997.

Summary

In this PR, we add the unstable statement-store JSON-RPC surface and wire it into the parachain node RPC stack. This lets clients submit SCALE-encoded statements over RPC, open one long-lived statement subscription, and then attach or remove topic filters on that subscription without opening a new stream for each filter.

The subscription flow is split into statement_unstable_subscribe and statement_unstable_add_filter. A subscription starts empty, each added filter gets its own filterId, and live notifications carry the ids of the filters that matched the statement. That lets a client track several statement topics over a single RPC subscription while still knowing which filters produced each replay or live event.
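
To make the flow concrete, here is a minimal client-side sketch using jsonrpsee. Only the submit/subscribe/add_filter method names come from this PR; the unsubscribe method name, parameter encodings, and response shapes are illustrative assumptions.

```rust
// Hypothetical client flow; see the note above for which names are assumptions.
use jsonrpsee::core::client::{ClientT, Subscription, SubscriptionClientT};
use jsonrpsee::rpc_params;
use jsonrpsee::ws_client::WsClientBuilder;
use serde_json::Value;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = WsClientBuilder::default().build("ws://127.0.0.1:9944").await?;

    // Submit a SCALE-encoded statement (hex encoding is an assumption).
    let outcome: Value =
        client.request("statement_unstable_submit", rpc_params!["0xdeadbeef"]).await?;
    println!("submit outcome: {outcome}");

    // One long-lived subscription; it starts with no filters attached.
    let mut sub: Subscription<Value> = client
        .subscribe(
            "statement_unstable_subscribe",
            rpc_params![],
            "statement_unstable_unsubscribe", // assumed unsubscribe method name
        )
        .await?;

    // Attach a topic filter to the already-open subscription; the returned
    // filterId identifies which filter matched each later notification.
    let filter_id: Value = client
        .request(
            "statement_unstable_add_filter",
            rpc_params![sub.subscription_id(), Value::Null /* filter, shape omitted */],
        )
        .await?;
    println!("filter id: {filter_id}");

    // Replay events for the new filter arrive first, then live events.
    while let Some(event) = sub.next().await {
        println!("event: {:?}", event?);
    }
    Ok(())
}
```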

RPC Shape

We add statement_unstable_submit, statement_unstable_subscribe, statement_unstable_add_filter, and statement_unstable_remove_filter under sc-rpc-spec-v2::statement.

statement_unstable_submit decodes submitted statement bytes and maps store results into RPC-level outcomes: new, known, rejected, or invalid. Subscription state is scoped to the jsonrpsee connection that created it, so a filter can only be added to or removed from a subscription owned by the same connection.

For filters, the unstable RPC accepts any and matchAll. matchAny is rejected at the RPC boundary for now, which keeps the external API aligned with the current unstable contract while the store internals can still use the optimized filter representation.
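
As a sketch, the externally visible filter type could look like the following. The type name and topic encoding are assumptions; only the accepted kinds, any and matchAll, come from the contract described above.

```rust
// Sketch of an RPC-boundary filter type. `matchAny` is deliberately absent:
// the store-internal optimized representation stays internal, and an incoming
// `matchAny` is rejected before it reaches the store.
#[derive(Debug, serde::Deserialize)]
#[serde(rename_all = "camelCase")]
enum RpcTopicFilter {
    /// Match every statement, regardless of topics.
    Any,
    /// Match only statements carrying all of the listed topics
    /// (topic encoding here is a placeholder).
    MatchAll(Vec<String>),
}
```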

Subscription Semantics

Multi-filter subscriptions are handled by the existing statement subscription matcher workers. add_filter validates capacity, allocates a filterId, queues an AddFilter message for the matcher, and returns without waiting for replay snapshot collection. The matcher then collects the replay snapshot and registers the filter in the same critical section, so live statements cannot slip between the snapshot and filter registration.
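
A condensed sketch of that critical section (all type and field names here are stand-ins, not the PR's actual code):

```rust
use std::collections::HashMap;
use std::sync::Mutex;

type FilterId = u64;
type Statement = Vec<u8>; // stand-in for the real SCALE-encoded statement

// Stand-in for the store-internal optimized filter representation.
struct OptimizedTopicFilter;
impl OptimizedTopicFilter {
    fn matches(&self, _statement: &Statement) -> bool {
        true
    }
}

struct MatcherState {
    admitted: Vec<Statement>,
    pending_replays: HashMap<FilterId, Vec<Statement>>,
    active_filters: HashMap<FilterId, OptimizedTopicFilter>,
}

struct Matcher {
    state: Mutex<MatcherState>,
}

impl Matcher {
    /// Snapshot collection and filter registration share one critical
    /// section, so no live statement can slip in between the two steps.
    fn handle_add_filter(&self, filter_id: FilterId, filter: OptimizedTopicFilter) {
        let mut state = self.state.lock().expect("not poisoned");
        let replay: Vec<Statement> =
            state.admitted.iter().filter(|s| filter.matches(s)).cloned().collect();
        state.pending_replays.insert(filter_id, replay);
        state.active_filters.insert(filter_id, filter);
    }
}
```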

For each added filter the subscription emits:

  • replayStatements batches for already-admitted matching statements
  • replayDone once that filter's replay is drained
  • newStatements for live statements, including all matching filterIds
  • stop if local subscription resource caps are hit
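
A rough sketch of how these notifications could be shaped. The event names are the ones listed above; the tag, field names, and hex statement encoding are assumptions.

```rust
// Hypothetical wire shape for the events above.
#[derive(Debug, serde::Serialize)]
#[serde(tag = "event", rename_all = "camelCase")]
enum StatementSubscriptionEvent {
    /// A batch of already-admitted statements replayed for one filter.
    #[serde(rename_all = "camelCase")]
    ReplayStatements { filter_id: u64, statements: Vec<String> },
    /// The given filter's replay snapshot is fully drained.
    #[serde(rename_all = "camelCase")]
    ReplayDone { filter_id: u64 },
    /// Live statements, tagged with every filter that matched them.
    #[serde(rename_all = "camelCase")]
    NewStatements { filter_ids: Vec<u64>, statements: Vec<String> },
    /// Local subscription resource caps were hit; the stream is ending.
    Stop,
}
```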

Live statements that arrive while a replay is still in progress are kept in matcher-owned pending state, then released once replay ordering allows it. Statements already delivered by replay are kept out of the live path for that filter, avoiding duplicate delivery for the common "submit, then subscribe" case.

Each subscription is capped at 128 active filters. Filter removal is idempotent, and dropping the RPC subscription cleans up matcher state.

@DenzelPenzel marked this pull request as draft May 5, 2026 21:12

@DenzelPenzel (Contributor, Author) commented:
/cmd fmt

@DenzelPenzel requested a review from alexggh May 8, 2026 15:57
@DenzelPenzel marked this pull request as ready for review May 8, 2026 15:57

@DenzelPenzel (Contributor, Author) commented:
/cmd fmt

@DenzelPenzel added the T0-node (This PR/Issue is related to the topic “node”.) and T10-tests (This PR/Issue is related to tests.) labels May 8, 2026

@DenzelPenzel (Contributor, Author) commented:
/cmd prdoc --audience node_dev --bump patch

github-actions Bot and others added 2 commits May 11, 2026 11:55
…ent-store-api

# Conflicts:
#	cumulus/zombienet/zombienet-sdk/tests/zombie_ci/statement_store/integration.rs
Comment thread substrate/client/rpc-spec-v2/src/statement/statement.rs
    filter: OptimizedTopicFilter,
) -> Result<AddFilterOutcome, Error> {
    let handle = state.lock().handle.clone();
    match handle.add_filter(filter) {
Contributor commented:
Alright, I thought a bit about what the approach should be. I don't like having 4 locks for this subscription at all, so I would go about it the following way:

  1. Make this lock actually protect the SubscriptionHandle, which makes sure the operation is allowed and makes sense.
  2. Maybe consider using an async mutex here, as the critical section might be long.
  3. After validating, while still holding the lock, send an "ADD_FILTER" to the subscription worker.
  4. The loop in SubscriptionsHandle should then collect the needed pendingReplys statements and park any statements that arrive while we send the pendingReplys.
  5. run_subscription_task should then need just the channel from the matcher and do nothing but serve the subscription.

I think this will simplify the implementation; let me know if you find any roadblocks.

@DenzelPenzel (Contributor, Author) commented May 12, 2026:

I reworked this so add_filter no longer waits for replay collection. The handle only validates capacity, allocates the filterId under its local lock, queues an AddFilter control message, and returns the id.

The matcher worker now owns the replay path: once it receives AddFilter, it collects the snapshot and registers the filter in the same critical section, then wakes any pending stream request. LiveEventStream::poll_next no longer does replay/pending-live coordination itself, it only asks the matcher for the next ready event.
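
For illustration, the handle side now looks roughly like this; the names, error strings, and channel type are assumptions, but the flow is as described above:

```rust
use std::collections::HashSet;
use std::sync::Mutex;

use futures::channel::mpsc;

const MAX_FILTERS_PER_SUBSCRIPTION: usize = 128;

type FilterId = u64;

// Stand-in for the store-internal filter representation.
struct OptimizedTopicFilter;

enum ControlMessage {
    AddFilter { filter_id: FilterId, filter: OptimizedTopicFilter },
}

struct HandleState {
    next_filter_id: FilterId,
    active_filter_ids: HashSet<FilterId>,
}

struct SubscriptionHandle {
    state: Mutex<HandleState>,
    to_matcher: mpsc::UnboundedSender<ControlMessage>,
}

impl SubscriptionHandle {
    /// Validate capacity and allocate the id under the handle's local lock,
    /// queue the AddFilter control message, and return without waiting for
    /// replay snapshot collection (the matcher owns that).
    fn add_filter(&self, filter: OptimizedTopicFilter) -> Result<FilterId, &'static str> {
        let filter_id = {
            let mut state = self.state.lock().expect("not poisoned");
            if state.active_filter_ids.len() >= MAX_FILTERS_PER_SUBSCRIPTION {
                return Err("filter capacity exceeded");
            }
            let id = state.next_filter_id;
            state.next_filter_id += 1;
            state.active_filter_ids.insert(id);
            id
        };
        self.to_matcher
            .unbounded_send(ControlMessage::AddFilter { filter_id, filter })
            .map_err(|_| "matcher worker gone")?;
        Ok(filter_id)
    }
}
```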

@alexggh

#[derive(Debug, Clone, Eq, PartialEq)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[cfg_attr(feature = "serde", serde(tag = "reason", rename_all = "camelCase"))]
pub enum RejectionReason {

Use #[serde(rename_all_fields = "camelCase")] to rename the enum variants' fields. (details)
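
A minimal demo of the combined attributes; the StoreFull variant is hypothetical, and rename_all_fields needs a recent serde (~1.0.186+):

```rust
// `rename_all` camelCases the variant names used in the "reason" tag, while
// `rename_all_fields` camelCases the fields inside every struct variant.
#[derive(Debug, serde::Serialize)]
#[serde(tag = "reason", rename_all = "camelCase", rename_all_fields = "camelCase")]
enum RejectionReason {
    StoreFull { max_size: u64 }, // hypothetical variant for illustration
}

fn main() {
    let r = RejectionReason::StoreFull { max_size: 1024 };
    // Prints: {"reason":"storeFull","maxSize":1024}
    println!("{}", serde_json::to_string(&r).unwrap());
}
```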

SubmitResult::Rejected(reason) => Ok(SubmitOutcome::Rejected(reason)),
SubmitResult::Invalid(reason) => Ok(SubmitOutcome::Invalid(reason)),
SubmitResult::KnownExpired => {
    Err(Error::InternalError("store returned KnownExpired for local submit".into()))
Contributor commented:
Shouldn't this also be SubmitOutcome::Invalid(reason)? The spec expects only new, known, rejected, and invalid.

{
    let state = self.state.lock();
    if state.active_filter_ids.len() >= MAX_FILTERS_PER_SUBSCRIPTION ||
        state.pending_replays.len() >= PENDING_REPLAYS_HARD_CAP
Contributor commented:
Although this is capped, the size of the response is not: 128 matchAny filters can be open on each of 16 connections, so that is 2048 matchAny filters that basically want to return the entire statement store. And while one filter is streaming to the client, everything else sits in RAM, because as soon as a filter request is received we SCALE-decode the statements and keep them in store.pending_replays inside register_filter_with_snapshot. Even with 100 MB of state, that is ~200 GB.
There should be a way to take a "lazy" approach.
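
For example, a lazy replay could park only the matching statement hashes at registration time and fetch plus decode a bounded chunk per poll. A sketch with assumed names, not a concrete proposal for this PR:

```rust
// Park only hashes; a matchAny replay then holds O(batch) decoded statements
// in RAM instead of a decoded copy of the whole store.
use std::collections::VecDeque;

type Hash = [u8; 32];
type Statement = Vec<u8>;

const REPLAY_BATCH: usize = 64;

struct LazyReplay {
    remaining: VecDeque<Hash>, // snapshot of matching keys, cheap to hold
}

impl LazyReplay {
    // `load` stands in for a store lookup; misses (pruned statements) are skipped.
    fn next_batch(&mut self, load: impl Fn(&Hash) -> Option<Statement>) -> Vec<Statement> {
        let mut batch = Vec::with_capacity(REPLAY_BATCH);
        while batch.len() < REPLAY_BATCH {
            let Some(hash) = self.remaining.pop_front() else { break };
            if let Some(statement) = load(&hash) {
                batch.push(statement);
            }
        }
        batch
    }
}
```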

@DenzelPenzel (Contributor, Author) commented May 13, 2026:

Statement Store RPC Bench Report

| Parameter | Value |
| --- | --- |
| Nodes | 5 collators |
| Clients | 50,000 |
| Rounds | 1 |
| Interval | 10,000 ms |
| Messages per client | 5 |
| Message size | 512 bytes |
| Messages per round | 250,000 |
| RPC pool | 5,000 connections x 5 nodes = 25,000 total |
| Collator RPC max connections | 51,000 |
| Collator RPC max subscriptions per connection | 160 |

Aggregate Result

Lower latency and send time are better. On the main latency metric, v2 is
slower than v1 by 12.22%.

| Metric | v1 bench | v2 unstable_bench | v2 vs v1 |
| --- | --- | --- | --- |
| Send avg, s | 35.729000 | 39.170000 | +9.63% |
| Receive avg, s | 0.000000 | 0.925000 | n/a |
| Latency avg, s | 35.729000 | 40.095000 | +12.22% |
| Attempts avg/msg | 1.000000 | 1.000000 | 0.00% |
| Elapsed avg, s | 218.00 | 256.00 | +17.43% |

Per-Run Data

v1: bench

| Run | Send avg, s | Receive avg, s | Latency avg, s | Latency max, s | Elapsed, s |
| --- | --- | --- | --- | --- | --- |
| 1 | 35.729 | 0.000 | 35.729 | 58.866 | 218 |

v2: unstable_bench

| Run | Send avg, s | Receive avg, s | Latency avg, s | Latency max, s | Elapsed, s |
| --- | --- | --- | --- | --- | --- |
| 1 | 39.170 | 0.925 | 40.095 | 52.248 | 256 |

Tail Behavior

This run has only one sample per bench, so median, mean max, and worst max are
all single-run values. v2 has worse average latency, but better max latency in
this sample.

| Metric | v1 bench | v2 unstable_bench | v2 vs v1 |
| --- | --- | --- | --- |
| Median latency avg, s | 35.729 | 40.095 | +12.22% |
| Mean of latency max, s | 58.866 | 52.248 | -11.24% |
| Worst latency max, s | 58.866 | 52.248 | -11.24% |

Conclusion

v2 is worse on the primary latency path in this run:

| Comparison | Result |
| --- | --- |
| Average latency | v2 is 12.22% slower |
| Average send time | v2 is 9.63% slower |
| Median per-run latency | v2 is 12.22% slower |
| Mean max latency | v2 is 11.24% better |
| Worst observed max latency | v2 is 11.24% better |
| Average elapsed time | v2 is 17.43% slower |

@alexggh

        .and_then(|connection| connection.get(sub_id).cloned())
}

pub async fn get_with_timeout(
@DenzelPenzel (Contributor, Author) commented:
Added get_with_timeout because accept().await can return the subscription id to the client before we finish storing it in StatementSubscriptions.

A fast client can then call add_filter immediately, and a plain get may falsely return InvalidSubscription.

So this is a short bounded wait for the subscribe/register handoff, not a general retry mechanism.
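
The shape of that wait, as a standalone sketch; the constants and the tokio dependency are assumptions, and the real code lives in StatementSubscriptions:

```rust
// Bounded-wait lookup: retry a plain getter until the subscribe/register
// handoff completes or a short deadline passes. Not a general retry loop.
use std::time::Duration;

const HANDOFF_TIMEOUT: Duration = Duration::from_millis(500); // assumed value
const POLL_INTERVAL: Duration = Duration::from_millis(10); // assumed value

async fn get_with_timeout<T>(get: impl Fn() -> Option<T>) -> Option<T> {
    let deadline = tokio::time::Instant::now() + HANDOFF_TIMEOUT;
    loop {
        if let Some(entry) = get() {
            return Some(entry);
        }
        if tokio::time::Instant::now() >= deadline {
            // Still missing after the handoff window: treat the id as
            // genuinely invalid.
            return None;
        }
        tokio::time::sleep(POLL_INTERVAL).await;
    }
}
```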

Contributor commented:
Did you ever hit this problem or just defensive programming here?

@DenzelPenzel (Contributor, Author) commented:
No, I identified it during the bench tests. This seems to be the only viable option I see to fix it without changing the protocol contract.

Contributor commented:
> No, I identified it during the bench tests.

You mean you actually hit this problem, or just thought it might happen?

> This seems to be the only viable option I see to fix it without changing the protocol contract.

The core issue is that we call accept before we call register. We should be able to call subscriptions.register before accept; all we need is a getter in PendingSubscriptionSink for sub_id.

Can you see how much friction you would get if you go this route?

@DenzelPenzel (Contributor, Author) commented:
Yes, I hit this problem while running benchmark tests.


let fut = async move {
    run_subscription_task(sink, live_stream).await;
    drop(entry);
Contributor commented:
Is this really needed?

@DenzelPenzel (Contributor, Author) commented:

It keeps the SubscriptionEntry alive inside the async block, so its cleanup runs when the subscription task finishes.
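
As a minimal illustration of the pattern (stand-in types, not the PR's code):

```rust
// Moving `entry` into the async block ties the registry entry's lifetime to
// the subscription task: its Drop impl runs exactly when the task ends.
struct SubscriptionEntry; // stand-in; its Drop impl deregisters the subscription

impl Drop for SubscriptionEntry {
    fn drop(&mut self) {
        println!("subscription deregistered");
    }
}

async fn run_subscription_task() { /* serve the sink */ }

fn spawn(entry: SubscriptionEntry) -> impl std::future::Future<Output = ()> {
    async move {
        run_subscription_task().await;
        drop(entry); // explicit: entry must outlive the task body
    }
}
```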

