awong/iceberg v2/debezium #30205

Draft
andrwng wants to merge 5 commits into awong/iceberg-v2/equality-deletes from awong/iceberg-v2/debezium

@andrwng andrwng commented Apr 16, 2026

Stack created with GitHub Stacks CLI

andrwng added 5 commits April 16, 2026 14:51
Introduce a compile-time schema descriptor framework that defines the
Redpanda system struct as a type-level description. This replaces
hand-coded field IDs and struct construction with a single source of
truth that derives runtime struct_type, struct_value, and field
accessors automatically. The build_rp_struct function moves from a
local definition in record_translator to a shared public API in
table_definition, eliminating duplication for future translators.
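To illustrate the single-source-of-truth idea, here is a minimal runtime sketch in Python. The PR does this at compile time in C++, so this is only an analogy: the field names, types, and the helpers `struct_type`, `struct_value`, and `accessor` below are hypothetical and are not taken from the PR's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDesc:
    """One field of the descriptor: the only place a field is declared."""
    name: str
    type_name: str
    required: bool = True


# Hypothetical field list, for illustration only.
SYSTEM_STRUCT = (
    FieldDesc("partition", "int"),
    FieldDesc("offset", "long"),
    FieldDesc("timestamp", "timestamp"),
    FieldDesc("key", "binary", required=False),
)


def struct_type(desc, first_id=1):
    """Derive (field_id, name, type, required) tuples; IDs follow declaration order."""
    return [
        (first_id + i, f.name, f.type_name, f.required)
        for i, f in enumerate(desc)
    ]


def struct_value(desc, **values):
    """Build a positional value tuple, rejecting unknown or missing required fields."""
    unknown = set(values) - {f.name for f in desc}
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    missing = [f.name for f in desc if f.required and values.get(f.name) is None]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return tuple(values.get(f.name) for f in desc)


def accessor(desc, name):
    """Return a getter that reads the named field out of a value tuple."""
    index = [f.name for f in desc].index(name)
    return lambda value: value[index]
```

Because IDs, construction, and accessors are all derived from the one descriptor, adding a field means editing a single list rather than several hand-maintained sites.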

Add translated_record struct and key_field_names to record_type,
enabling translators to produce both data rows and equality delete
keys. The multiplexer now creates a separate partitioning_writer for
delete keys when key_field_names is set, validating that key columns
include all partition source columns before writing.
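The key-column check can be sketched as follows. This is a simplified Python model, not the multiplexer's actual code: `TranslatedRecord`, `check_key_covers_partitioning`, and `route` are hypothetical names. The assumed rationale for the check is that a delete key can only be routed to a partition if it carries every column the partition spec reads from.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranslatedRecord:
    """A translator output: a data row, an equality delete key, or both."""
    data_row: Optional[dict] = None
    delete_key: Optional[dict] = None


def check_key_covers_partitioning(key_field_names, partition_source_columns):
    """Raise if some partition source column is not part of the key."""
    missing = [c for c in partition_source_columns if c not in key_field_names]
    if missing:
        raise ValueError(f"key columns do not cover partition columns: {missing}")


def route(records, key_field_names, partition_source_columns):
    """Split records into data rows and delete keys, validating before writing."""
    if key_field_names:
        check_key_covers_partitioning(key_field_names, partition_source_columns)
    data_rows, delete_keys = [], []
    for rec in records:
        if rec.data_row is not None:
            data_rows.append(rec.data_row)
        if rec.delete_key is not None:
            if not key_field_names:
                raise ValueError("delete key produced but no key fields configured")
            delete_keys.append(rec.delete_key)
    return data_rows, delete_keys
```

For example, with key columns `["id", "region"]` and a table partitioned on `region`, the check passes; with key columns `["id"]` alone it raises before anything is written.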

Add debezium_translator that parses Debezium CDC envelopes, extracting
before/after payloads and operation types to produce data rows and
equality delete keys. The translator uses schema descriptors from
table_definition for field ID assignment and build_rp_struct for the
redpanda system struct.
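The envelope handling described above might look roughly like this in Python. It is a sketch under assumptions, not the translator in this PR: it relies only on the standard Debezium envelope fields (`before`, `after`, `op`, optionally wrapped in `payload`), and the function name `translate_envelope` and the fallback for a missing `before` on updates are hypothetical choices.

```python
import json


def translate_envelope(raw, key_field_names):
    """Return (data_row, delete_key) for one Debezium change event.

    Either element may be None. Key values are projected from the row image.
    """
    envelope = json.loads(raw)
    # With schemas enabled the event is wrapped as {"schema": ..., "payload": ...}.
    payload = envelope.get("payload", envelope)
    op = payload["op"]
    before = payload.get("before")
    after = payload.get("after")

    def key_of(row):
        return {name: row[name] for name in key_field_names}

    if op in ("c", "r"):  # create, or snapshot read: new row only
        return after, None
    if op == "u":  # update: delete the old key, then write the new row
        # Assumption: if "before" is absent, the key columns are unchanged.
        return after, key_of(before if before is not None else after)
    if op == "d":  # delete: key only, no data row
        if before is None:
            raise ValueError("delete event without a 'before' image")
        return None, key_of(before)
    raise ValueError(f"unsupported op: {op!r}")
```

Whether `before` is populated, and with which columns, depends on the source connector's configuration, so a real translator may need a different way to obtain the key.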

The coordinator now handles equality delete files alongside data files
using row_delta_action, partitioning delete files by the same spec as
data files. A new debezium_schema_id_prefix iceberg mode is added to
model::iceberg_mode and wired through datalake_manager.
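To see why delete files are partitioned like data files, here is a simplified Python model of how a reader applies equality deletes under the Iceberg spec: a delete applies to a data file in the same partition whose data sequence number is strictly lower. The tuple layout and the `live_rows` helper are hypothetical, and the model ignores the spec's case of deletes written with an unpartitioned spec.

```python
def live_rows(data_files, delete_files, key_field_names):
    """Return rows that survive equality deletes.

    data_files:   list of (sequence_number, partition, rows)
    delete_files: list of (sequence_number, partition, keys)
    """
    def key_of(row):
        return tuple(row[name] for name in key_field_names)

    result = []
    for seq, partition, rows in data_files:
        # Only deletes from a later commit in the same partition apply.
        deleted = {
            key_of(key)
            for dseq, dpart, keys in delete_files
            if dseq > seq and dpart == partition
            for key in keys
        }
        result.extend(row for row in rows if key_of(row) not in deleted)
    return result
```

Under this rule, an update committed as one row delta (delete key plus new row at the same sequence number) removes the old row but not the row written alongside it.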

Add Docker install scripts for PostgreSQL and Debezium Server to the
ducktape image, along with PostgresService and DebeziumServerService
wrappers for use in integration tests.

Add debezium_cdc_e2e_test that exercises the full pipeline: PostgreSQL
source -> Debezium Server -> Redpanda -> Iceberg datalake translation,
verifying that inserts, updates, and deletes produce the expected data
and equality delete files in the Iceberg table.