Chapter 29

CDC Data Pipeline

CDC: Change Data Capture

CDC captures every row-level change from a database's transaction log and streams it as events to downstream systems in near real-time — the core infrastructure for real-time data pipelines, heterogeneous sync, and microservice decoupling.

What is CDC

Method Latency Capture DELETE DB Load Complexity
Full poll Minutes No High Low
Timestamp poll Seconds No Medium Low
Triggers ms Yes High (write amp) High
CDC (log parsing) ms Yes Minimal Medium

MySQL Binlog Requirements

server_id        = 1
log_bin          = /var/log/mysql/mysql-bin.log
binlog_format    = ROW   # required for CDC
binlog_row_image = FULL  # capture full before/after images
gtid_mode        = ON
enforce_gtid_consistency = ON

Debezium Quickstart

# Grant CDC permissions
CREATE USER 'debezium'@'%' IDENTIFIED BY 'pass';
GRANT SELECT, RELOAD, SHOW DATABASES, REPLICATION SLAVE,
      REPLICATION CLIENT ON *.* TO 'debezium'@'%';

# Register connector via Kafka Connect REST API
curl -X POST http://localhost:8083/connectors -H 'Content-Type: application/json' -d '{
  "name": "mysql-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql", "database.port": "3306",
    "database.user": "debezium", "database.password": "pass",
    "database.server.id": "184054",
    "topic.prefix": "dbserver",
    "database.include.list": "mydb",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "schema-changes.mydb"
  }
}'

Each event contains: before, after, op (c/u/d/r), ts_ms, and source (gtid, file, pos).

Full Kafka Pipeline

MySQL → Debezium → Kafka → Flink CDC → ClickHouse (OLAP)
                                     → Elasticsearch (search)
                                     → Redis (cache invalidation)

Exactly-Once Delivery

CDC natively provides at-least-once delivery. Make consumers idempotent by tracking processed GTID sets. Use Schema Registry with Avro to handle DDL evolution gracefully.

Rate this chapter
4.9  / 5  (3 ratings)

💬 Comments