statement-store: new API implementation #11989
Conversation
/cmd fmt

/cmd fmt

/cmd prdoc --audience node_dev --bump patch
```rust
filter: OptimizedTopicFilter,
) -> Result<AddFilterOutcome, Error> {
    let handle = state.lock().handle.clone();
    match handle.add_filter(filter) {
```
Alright, I thought a bit about what the approach should be. I don't like at all having 4 locks for this subscription, so I would go about it the following way:
- Make this lock actually protect the SubscriptionHandle, which makes sure the operation is allowed and makes sense.
- Maybe consider using an async mutex here, as the critical section might be long.
- After validating, while holding the lock, send an "ADD_FILTER" message to the subscription worker.
- Then the loop in SubscriptionsHandle should collect the needed statements into pendingReplys and park any statements that arrive while we send the pendingReplys.
- Then run_subscription_task should need just the channel from the matcher and will do just the serving of the subscription.

I think this will simplify the implementation; let me know if you find any roadblocks.
I reworked this so add_filter no longer waits for replay collection. The handle only validates capacity, allocates the filterId under its local lock, queues an AddFilter control message, and returns the id.
The matcher worker now owns the replay path: once it receives AddFilter, it collects the snapshot and registers the filter in the same critical section, then wakes any pending stream request. LiveEventStream::poll_next no longer does replay/pending-live coordination itself, it only asks the matcher for the next ready event.
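The reworked handoff described above can be sketched with std-only stand-ins. All names here (`Control`, `HandleState`, `worker_step`, the string-based "store") are illustrative, not the actual `substrate/client/statement-store` types, and the real code uses async channels rather than `std::sync::mpsc`:

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

// Illustrative control message; the real worker handles more variants.
#[derive(Debug)]
enum Control {
    AddFilter { filter_id: u64, topic: String },
}

const MAX_FILTERS: usize = 128;

struct HandleState {
    next_filter_id: u64,
    active_filters: usize,
}

struct SubscriptionHandle {
    state: Mutex<HandleState>,
    to_worker: Sender<Control>,
}

impl SubscriptionHandle {
    /// Validate capacity, allocate an id, queue the control message,
    /// and return immediately -- no replay collection on this path.
    fn add_filter(&self, topic: String) -> Result<u64, &'static str> {
        let mut state = self.state.lock().unwrap();
        if state.active_filters >= MAX_FILTERS {
            return Err("too many filters");
        }
        let filter_id = state.next_filter_id;
        state.next_filter_id += 1;
        state.active_filters += 1;
        // Send while still holding the local lock so messages reach the
        // worker in allocation order.
        self.to_worker
            .send(Control::AddFilter { filter_id, topic })
            .map_err(|_| "worker gone")?;
        Ok(filter_id)
    }
}

/// Worker side: the replay snapshot and the (elided here) live-filter
/// registration happen in one critical section over the store, so no live
/// statement can slip in between snapshot and registration.
fn worker_step(
    rx: &Receiver<Control>,
    store: &Mutex<Vec<String>>,
) -> Option<(u64, Vec<String>)> {
    match rx.recv().ok()? {
        Control::AddFilter { filter_id, topic } => {
            let store = store.lock().unwrap();
            let replay: Vec<String> = store
                .iter()
                .filter(|s| s.contains(&topic))
                .cloned()
                .collect();
            Some((filter_id, replay))
        },
    }
}
```

The point of the split is that the caller's lock only guards id allocation and capacity, while the expensive snapshot work is serialized inside the worker.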
```rust
#[derive(Debug, Clone, Eq, PartialEq)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[cfg_attr(feature = "serde", serde(tag = "reason", rename_all = "camelCase"))]
pub enum RejectionReason {
```
You can use `#[serde(rename_all_fields = "camelCase")]` to rename the fields across all enum variants.
```rust
SubmitResult::Rejected(reason) => Ok(SubmitOutcome::Rejected(reason)),
SubmitResult::Invalid(reason) => Ok(SubmitOutcome::Invalid(reason)),
SubmitResult::KnownExpired => {
    Err(Error::InternalError("store returned KnownExpired for local submit".into()))
```
Shouldn't this also be SubmitOutcome::Invalid(reason)? The spec expects only new, known, rejected, and invalid.
```rust
{
    let state = self.state.lock();
    if state.active_filter_ids.len() >= MAX_FILTERS_PER_SUBSCRIPTION ||
        state.pending_replays.len() >= PENDING_REPLAYS_HARD_CAP
```
Although the filter count is capped, the size of the response is not: 128 matchAny requests can be open per connection, across 16 connections, so that is 2048 matchAny filters that basically want to return the entire statement store. And while one filter is streaming to the client, everything else sits in RAM, because as soon as a filter request is received we SCALE-decode the matching statements and keep them in store.pending_replays inside register_filter_with_snapshot. Even with a 100 MB store state that is roughly 200 GB.
There should be a way to take a "lazy" approach.
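One shape such a "lazy" approach could take: at registration time capture only the (cheap) ordered keys of matching statements, and decode a bounded batch per poll, so undelivered statements stay in the store instead of in per-subscription buffers. This is a sketch under assumed names (`LazyReplay`, the `fetch` callback), not the actual statement-store API:

```rust
/// Lazy replay cursor: holds keys, not decoded statements.
struct LazyReplay {
    keys: Vec<u64>, // ordered keys captured when the filter was registered
    pos: usize,
    batch: usize,   // max statements materialized per poll
}

impl LazyReplay {
    fn new(keys: Vec<u64>, batch: usize) -> Self {
        Self { keys, pos: 0, batch }
    }

    /// Decode at most `batch` statements; anything not yet sent stays in
    /// the store rather than pinned in RAM per subscription.
    fn next_batch(&mut self, fetch: impl Fn(u64) -> Option<Vec<u8>>) -> Vec<Vec<u8>> {
        let end = (self.pos + self.batch).min(self.keys.len());
        let out: Vec<Vec<u8>> = self.keys[self.pos..end]
            .iter()
            .filter_map(|k| fetch(*k)) // a statement may expire mid-replay
            .collect();
        self.pos = end;
        out
    }

    fn done(&self) -> bool {
        self.pos >= self.keys.len()
    }
}
```

The trade-off is that statements can expire between snapshot and delivery, so the replay is "keys as of registration, bodies as of delivery" rather than a fully consistent snapshot.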
Statement Store RPC Bench Report
Aggregate Result

Lower latency and send time are better. On the main latency metric, v2 is slower in this run.

Per-Run Data

v1:
| Run | Send avg, s | Receive avg, s | Latency avg, s | Latency max, s | Elapsed, s |
|---|---|---|---|---|---|
| 1 | 35.729 | 0.000 | 35.729 | 58.866 | 218 |
v2 (unstable_bench):
| Run | Send avg, s | Receive avg, s | Latency avg, s | Latency max, s | Elapsed, s |
|---|---|---|---|---|---|
| 1 | 39.170 | 0.925 | 40.095 | 52.248 | 256 |
Tail Behavior
This run has only one sample per bench, so median, mean max, and worst max are
all single-run values. v2 has worse average latency, but better max latency in
this sample.
| Metric | v1 bench | v2 unstable_bench | v2 vs v1 |
|---|---|---|---|
| Median latency avg, s | 35.729 | 40.095 | +12.22% |
| Mean of latency max, s | 58.866 | 52.248 | -11.24% |
| Worst latency max, s | 58.866 | 52.248 | -11.24% |
Conclusion
v2 is worse on the primary latency path in this run:
| Comparison | Result |
|---|---|
| Average latency | v2 is 12.22% slower |
| Average send time | v2 is 9.63% slower |
| Median per-run latency | v2 is 12.22% slower |
| Mean max latency | v2 is 11.24% better |
| Worst observed max latency | v2 is 11.24% better |
| Average elapsed time | v2 is 17.43% slower |
```rust
.and_then(|connection| connection.get(sub_id).cloned())
}

pub async fn get_with_timeout(
```
Added get_with_timeout because accept().await can return the subscription id to the client before we finish storing it in StatementSubscriptions.
A fast client can then call add_filter immediately, and a plain get may falsely return InvalidSubscription.
So this is a short bounded wait for the subscribe/register handoff, not a general retry mechanism.
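The bounded wait described above can be sketched like this. The real implementation is async over jsonrpsee types; this std-only version (with an illustrative `HashMap` registry and polling loop instead of an async notify/timeout race) just shows the shape of the logic:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};

/// Poll the registry until the subscription shows up or a short deadline
/// passes. Returning None only after the deadline means a genuinely
/// unknown subscription, not just a not-yet-registered one.
fn get_with_timeout(
    subs: &Arc<Mutex<HashMap<u64, String>>>,
    sub_id: u64,
    timeout: Duration,
) -> Option<String> {
    let deadline = Instant::now() + timeout;
    loop {
        if let Some(entry) = subs.lock().unwrap().get(&sub_id).cloned() {
            return Some(entry); // already registered: the common fast path
        }
        if Instant::now() >= deadline {
            return None; // treat as InvalidSubscription
        }
        // Std-only stand-in for awaiting a registration notification.
        std::thread::sleep(Duration::from_millis(1));
    }
}
```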
Did you ever hit this problem, or is this just defensive programming?
No, I identified it during the bench tests. This seems to be the only viable option I see to fix it without changing the protocol contract
> No, I identified it during the bench tests.

You mean you actually hit this problem, or just thought it might happen?

> This seems to be the only viable option I see to fix it without changing the protocol contract

The core issue is that we call accept before we call register. We should be able to call subscriptions.register before accept; all we need is a getter on PendingSubscriptionSink for the sub_id.
Can you see how much friction you would get if you go this route?
Yes, I hit this problem while running benchmark tests.
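The register-before-accept ordering suggested in this thread could look roughly like the following. `PendingSink` is a stand-in for jsonrpsee's PendingSubscriptionSink, and the `sub_id()` getter is the hypothetical accessor the reviewer proposes, not an existing API:

```rust
use std::collections::HashSet;

/// Stand-in for PendingSubscriptionSink with the proposed sub_id getter.
struct PendingSink {
    sub_id: u64,
}

impl PendingSink {
    fn sub_id(&self) -> u64 {
        self.sub_id // hypothetical getter, available before accept()
    }

    fn accept(self) -> u64 {
        self.sub_id // only now does the client learn its subscription id
    }
}

/// Register first: by the time accept() reveals the id to the client, a
/// fast add_filter call can already find the subscription, so no bounded
/// wait is needed on the lookup path.
fn subscribe(registry: &mut HashSet<u64>, pending: PendingSink) -> u64 {
    registry.insert(pending.sub_id());
    pending.accept()
}
```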
```rust
let fut = async move {
    run_subscription_task(sink, live_stream).await;
    drop(entry);
```
This keeps the SubscriptionEntry alive inside the async block until the subscription task finishes.
Implements #10997
Summary
In this PR, we add the unstable statement-store JSON-RPC surface and wire it into the parachain node RPC stack. This lets clients submit SCALE-encoded statements over RPC, open one long-lived statement subscription, and then attach or remove topic filters on that subscription without opening a new stream for each filter.
The subscription flow is split into `statement_unstable_subscribe` and `statement_unstable_add_filter`. A subscription starts empty, each added filter gets its own `filterId`, and live notifications carry the ids of the filters that matched the statement. That lets a client track several statement topics over a single RPC subscription while still knowing which filters produced each replay or live event.

RPC Shape
We add `statement_unstable_submit`, `statement_unstable_subscribe`, `statement_unstable_add_filter`, and `statement_unstable_remove_filter` under `sc-rpc-spec-v2::statement`. `statement_unstable_submit` decodes submitted statement bytes and maps store results into RPC-level outcomes: `new`, `known`, `rejected`, or `invalid`. Subscription state is scoped to the jsonrpsee connection that created it, so a filter can only be added to or removed from a subscription owned by the same connection.

For filters, the unstable RPC accepts `any` and `matchAll`. `matchAny` is rejected at the RPC boundary for now, which keeps the external API aligned with the current unstable contract while the store internals can still use the optimized filter representation.

Subscription Semantics
Multi-filter subscriptions are handled by the existing statement subscription matcher workers.
`add_filter` validates capacity, allocates a `filterId`, queues an `AddFilter` message for the matcher, and returns without waiting for replay snapshot collection. The matcher then collects the replay snapshot and registers the filter in the same critical section, so live statements cannot slip between the snapshot and filter registration.

For each added filter the subscription emits:
- `replayStatements` batches for already-admitted matching statements
- `replayDone` once that filter's replay is drained
- `newStatements` for live statements, including all matching `filterId`s
- `stop` if local subscription resource caps are hit

Live statements that arrive while a replay is still in progress are kept in matcher-owned pending state, then released once replay ordering allows it. Statements already delivered by replay are kept out of the live path for that filter, avoiding duplicate delivery for the common "submit, then subscribe" case.
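The per-filter sequence above can be modeled as a small enum. These are illustrative type names, not the actual `sc-rpc-spec-v2` wire types:

```rust
/// One notification on a statement subscription stream.
#[derive(Debug, PartialEq)]
enum Notification {
    /// Batch of already-admitted statements matching a newly added filter.
    ReplayStatements { filter_id: u64, statements: Vec<Vec<u8>> },
    /// This filter's replay is drained; only live events follow for it.
    ReplayDone { filter_id: u64 },
    /// A live statement, tagged with every filter id that matched it.
    NewStatements { filter_ids: Vec<u64>, statement: Vec<u8> },
    /// Local subscription resource caps were hit; the stream ends.
    Stop,
}

/// A well-formed stream for one filter: replay batches, then replayDone,
/// then live events that carry the matching filter ids.
fn example_stream(filter_id: u64) -> Vec<Notification> {
    vec![
        Notification::ReplayStatements { filter_id, statements: vec![vec![1, 2, 3]] },
        Notification::ReplayDone { filter_id },
        Notification::NewStatements { filter_ids: vec![filter_id], statement: vec![4, 5] },
    ]
}
```

The invariant worth noting is per-filter ordering: a client never sees `newStatements` for a filter's replayed statements, and `replayDone` cleanly separates the two phases.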
Each subscription is capped at 128 active filters. Filter removal is idempotent, and dropping the RPC subscription cleans up matcher state.