Skip to content

[improve][broker] PIP-473 P5.1: metadata-driven transaction coordinator (NEW_TXN / END_TXN)#25863

Open
merlimat wants to merge 2 commits into
apache:masterfrom
merlimat:mmerli/pip-473-tc-v5
Open

[improve][broker] PIP-473 P5.1: metadata-driven transaction coordinator (NEW_TXN / END_TXN)#25863
merlimat wants to merge 2 commits into
apache:masterfrom
merlimat:mmerli/pip-473-tc-v5

Conversation

@merlimat
Copy link
Copy Markdown
Contributor

Summary

Adds the PIP-473 metadata-driven v5 transaction coordinator — the broker-side service that serves NEW_TXN / END_TXN against the metadata-store transaction layout (P2's TxnMetadataStore) instead of the legacy TransactionMetadataStoreService.

Gated behind a new transactionCoordinatorScalableTopicsEnabled flag (default off). When on, ServerCnx routes the TC wire commands to TransactionCoordinatorV5; when off, behavior is unchanged. The flag is meant to be flipped together with the scalable-topic transaction buffer and pending-ack store providers (a later phase).

Leader election

Per-partition coordinator, same as the legacy TC: a broker runs the v5 TC for partition N iff it owns partition N of transaction_coordinator_assign. handleClientConnect mirrors the legacy ownership check, so the client-side discovery surface is unchanged. (A pure metadata-store election is a later phase; reusing the assign-topic keeps this PR's scope tight.)

Wire commands

Command Behavior
TC_CLIENT_CONNECT ownership check for the tcId's assign partition
NEW_TXN create /txn/id/<tcId>_<seq> header in OPEN
END_TXN CAS header to COMMITTED/ABORTED, fan out participant events
ADD_PARTITION_TO_TXN, ADD_SUBSCRIPTION_TO_TXN no-op

ADD_PARTITION / ADD_SUBSCRIPTION are no-ops per the PIP: v5 participants advertise themselves by writing /txn/op records when they apply ops, so the pre-registration step is unnecessary.

txnId generation — no reuse

leastSigBits is drawn from a per-tcId monotonic counter at /txn/tc-seq/<tcId> (TcSequence, CAS-incremented with retry on BadVersionException). Monotonic-per-tcId ⟹ txnIds are never reused — the participant-side aborted set is keyed by txnId, and reuse would corrupt it.

endTransaction fan-out

END_TXN CAS-updates the header to the terminal state, then enumerates the txn's participants via the new idx:ops-by-txn secondary index on /txn/op (so it doesn't scan the whole namespace), and publishes one segment-event per affected segment + one subscription-event per affected (segment, subscription) pair. Idempotent on retry — a header already in the requested terminal state short-circuits.

Test plan

  • pulsar-broker:test --tests TransactionCoordinatorV5Test — 8 cases: sequential txnId per tcId, commit/abort fan-out, idempotent retry, mismatched-action failure, unknown-txn failure, add* no-ops, per-tcId scoping.
  • pulsar-broker:test --tests TxnMetadataStoreTest / MetadataTransactionBufferTest / MetadataPendingAckStoreTest — green after the idx:ops-by-txn index addition.
  • Checkstyle clean (main + test).

Deferred / follow-ups

  • Timeout sweep + GC sweep. This PR is happy-path newTxn / endTxn only.
  • Pure metadata-store leader election (replacing the assign-topic reuse).

merlimat added 2 commits May 23, 2026 09:50
… NEW_TXN / END_TXN

Adds the v5 transaction coordinator behind the new `transactionCoordinatorV5Enabled`
flag (default off). When the flag is on, `ServerCnx` routes `TC_CLIENT_CONNECT`,
`NEW_TXN`, `END_TXN`, `ADD_PARTITION_TO_TXN`, and `ADD_SUBSCRIPTION_TO_TXN` to
`TransactionCoordinatorV5` instead of the legacy `TransactionMetadataStoreService`.

The TC writes the txn header at `/txn/id/<tcId>_<seq>`, assigns `seq` from a
per-tcId monotonic counter (`/txn/tc-seq/<tcId>`) so txnIds are never reused, and
at end-txn enumerates the txn's participants via the new `idx:ops-by-txn`
secondary index on `/txn/op`, publishing one segment-event per affected segment
and one subscription-event per affected (segment, subscription) pair.

`ADD_PARTITION_TO_TXN` / `ADD_SUBSCRIPTION_TO_TXN` are no-ops per PIP-473 —
participants advertise themselves by writing `/txn/op` records when they apply
ops, so the pre-registration step is unnecessary.

Leader election reuses the existing `transaction_coordinator_assign` topic
ownership check (same surface the legacy TC uses); a pure metadata-store
election lands in P5.3.
Addresses review feedback:
- Rename `transactionCoordinatorV5Enabled` →
  `transactionCoordinatorScalableTopicsEnabled` and drop PIP / phase references
  from the user-facing config doc.
- Import `TransactionCoordinatorV5` in `PulsarService` instead of using the
  fully qualified name inline.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant