[docs]:Add Flink CDC integration doc under Flink #3347

Open
ThorneANN wants to merge 3 commits into apache:main from ThorneANN:add-flink-cdc-docs

@ThorneANN
Contributor

Purpose

Linked issue: close #1945

Brief change log

Added a new Flink CDC integration page under "Flink Engine" at website/docs/engine-flink/flink-cdc-intergartion.md.
Introduced two integration methods:
- Flink CDC Pipeline Connector: defines the synchronization pipeline in a YAML file.
- Flink CDC SQL Connector: syncs PostgreSQL data using SQL statements.
Added a reference link to the official Flink CDC documentation.

Tests

API and Format

Documentation

@ThorneANN ThorneANN changed the title [Docs]:add Flink CDC integration section under Flink Engine [docs]:add Flink CDC integration section under Flink Engine May 19, 2026
@ThorneANN ThorneANN changed the title [docs]:add Flink CDC integration section under Flink Engine [docs]:Add Flink CDC integration doc under Flink May 19, 2026
@beryllw
Contributor

beryllw commented May 26, 2026

Thanks for the contribution! One small thing — the PR description references close #1945, which is a previously closed PR. The tracking issue for this work should be #1939.


- A running **Fluss cluster** (CoordinatorServer + TabletServer). See [Deploying with Docker](../install-deploy/deploying-with-docker.md) for setup instructions.
- A running **Flink cluster** with the required connector JARs. See [Getting Started with Flink](getting-started.md) for Flink setup.
- The required connector JARs placed under `<FLINK_HOME>/lib/`. The examples below use MySQL as the source, but other databases (PostgreSQL, Oracle, etc.) are also supported — see [Further Reading](#further-reading) for the full list of connectors.
Contributor

Would it make sense to use PostgreSQL as the source example instead of MySQL? The original issue (#1939) describes the use case with PostgreSQL, and it's more widely adopted internationally, which might resonate better with readers.

Submit the pipeline using the Flink CDC CLI:

```shell
./bin/flink-cdc.sh mysql-to-fluss.yaml
```
Contributor

./bin/flink-cdc.sh comes from the standalone Flink CDC distribution (flink-cdc-3.x.y-bin.tar.gz), not the standard Flink distribution. It might be helpful to mention in the Prerequisites section that users need to download the Flink CDC distribution separately and set FLINK_HOME.
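A preflight check along these lines could accompany that Prerequisites note. This is only a minimal sketch: the function name `check_flink_cdc_prereqs` and both paths in the usage comment are hypothetical placeholders, and it assumes nothing about the Flink CDC distribution beyond the `bin/flink-cdc.sh` location and the `FLINK_HOME` variable mentioned above.

```shell
#!/usr/bin/env bash
# Preflight check before submitting a pipeline (illustrative helper, not part of Flink CDC).
# Argument: root directory of the unpacked Flink CDC distribution.
check_flink_cdc_prereqs() {
  local cdc_home="$1"
  # FLINK_HOME must point at an existing Flink installation directory.
  if [ -z "${FLINK_HOME:-}" ] || [ ! -d "$FLINK_HOME" ]; then
    echo "FLINK_HOME is not set or is not a directory" >&2
    return 1
  fi
  # The CLI script ships with the Flink CDC distribution, not with Flink itself.
  if [ ! -x "$cdc_home/bin/flink-cdc.sh" ]; then
    echo "flink-cdc.sh not found under $cdc_home/bin" >&2
    return 1
  fi
  echo "ok"
}

# Usage (both paths are placeholders):
# export FLINK_HOME=/path/to/flink
# check_flink_cdc_prereqs /path/to/flink-cdc \
#   && (cd /path/to/flink-cdc && ./bin/flink-cdc.sh mysql-to-fluss.yaml)
```

The function prints `ok` and returns 0 only when both conditions hold, so it can gate the submit command as shown in the usage comment.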

Contributor Author

> ./bin/flink-cdc.sh comes from the standalone Flink CDC distribution (flink-cdc-3.x.y-bin.tar.gz), not the standard Flink distribution. It might be helpful to mention in the Prerequisites section that users need to download the Flink CDC distribution separately and set FLINK_HOME.

I will change the source to PostgreSQL.

---
sidebar_label: Flink CDC
title: Flink CDC Integration
sidebar_position: 9
Contributor

sidebar_position: 9 conflicts with the existing options.md in the same directory. Would you mind changing it to 10 to avoid non-deterministic sidebar ordering?
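Collisions like this can be spotted with a small check before pushing. The following is a minimal sketch rather than an existing project script: the helper name `find_duplicate_positions` is hypothetical, and the directory in the usage comment is the one named in the PR description.

```shell
#!/usr/bin/env bash
# List sidebar_position values that appear in more than one Markdown file
# directly under the given docs directory (illustrative helper).
find_duplicate_positions() {
  local dir="$1"
  # -h drops file names so identical values compare equal;
  # tr strips spaces and carriage returns so "9" and " 9" match.
  grep -h '^sidebar_position:' "$dir"/*.md 2>/dev/null \
    | tr -d ' \r' \
    | sort \
    | uniq -d
}

# Usage:
# find_duplicate_positions website/docs/engine-flink
```

With two files that both declare `sidebar_position: 9`, the function prints `sidebar_position:9`; empty output means every position in the directory is unique.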

@ThorneANN
Contributor Author

ThorneANN commented May 26, 2026

@beryllw PTAL
