Cluster replication

This how-to guide covers setting up cluster replication using MirrorMaker through Kafka Connect.

For a brief explanation of how MirrorMaker works, see the MirrorMaker explanation page.

Pre-requisites

To set up cluster replication we need:

  • Two Charmed Apache Kafka clusters:
    • An “active” source cluster to replicate from.
    • A “passive” target cluster to replicate to.
  • A Charmed Kafka Connect cluster to run the MirrorMaker connectors.

It is best practice to co-locate the Kafka Connect cluster with the target Apache Kafka cluster, for example in the same cloud region.

For guidance on how to set up Charmed Apache Kafka, please refer to the following resources:

Deploy a MirrorMaker integrator

The MirrorMaker integrator charm manages tasks on a Charmed Kafka Connect cluster that replicate data from an active Apache Kafka cluster to a passive cluster.

Current deployment

Check the status of the deployed applications by running the juju status command. The result should be similar to:

Model  Controller  Cloud/Region         Version  SLA          Timestamp
k      vms         localhost/localhost  3.6.3    unsupported  10:45:37+02:00

App            Version  Status  Scale  Charm          Channel      Rev  Exposed  Message
active         3.9.0    active      1  kafka          3/edge       205  no
passive        3.9.0    active      1  kafka          3/edge       205  no
kafka-connect           active      1  kafka-connect  latest/edge   20  no

Unit              Workload  Agent  Machine  Public address  Ports           Message
active/0*         active    idle   0        10.86.75.171    19092/tcp
passive/0*        active    idle   1        10.86.75.153    9092,19092/tcp
kafka-connect/0*  active    idle   2        10.86.75.45     8083/tcp
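Before adding integrations, it can help to confirm that every unit is active/idle. The following is a minimal sketch that filters the tabular output of juju status; the function name units_not_ready is hypothetical, and the sketch assumes the Unit table layout shown above (Workload in the second column, Agent in the third).

```shell
# units_not_ready: read `juju status` tabular output on stdin and print the
# name of every unit whose Workload is not "active" or whose Agent is not
# "idle". Prints nothing when every unit is ready.
# (Hypothetical helper; assumes the Unit table layout shown above.)
units_not_ready() {
  awk '
    $1 == "Unit" { in_units = 1; next }    # header row of the Unit table
    in_units && NF == 0 { in_units = 0 }   # a blank line ends the table
    in_units && ($2 != "active" || $3 != "idle") {
      sub(/\*$/, "", $1)                   # drop the leader marker
      print $1
    }
  '
}

# Usage (requires a Juju client connected to the model):
#   juju status | units_not_ready
```

An empty result means that all units listed in the Unit table report active/idle.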

To integrate Kafka Connect with the passive cluster (as recommended for active-passive replication), run:

juju integrate kafka-connect passive

Set up active-passive replication

First, deploy the MirrorMaker integrator charm:

juju deploy mirrormaker-connect-integrator mirrormaker

After some time the app should show up in the model as blocked:

Unit              Workload  Agent  Machine  Public address  Ports           Message
mirrormaker/0*    blocked   idle   4        10.86.75.16                     Integrator not ready to start, check if all relations are setup successfully

Set up the necessary relations:

juju integrate kafka-connect mirrormaker
juju integrate mirrormaker:source active
juju integrate mirrormaker:target passive

After some time, the mirrormaker application should show up as active/idle in the juju status output:

Model  Controller  Cloud/Region         Version  SLA          Timestamp
k      vms         localhost/localhost  3.6.3    unsupported  10:59:37+02:00

App            Version  Status  Scale  Charm          Channel      Rev  Exposed  Message
active         3.9.0    active      1  kafka          3/edge       205  no       
kafka-connect           active      1  kafka-connect  latest/edge   20  no       
mirrormaker             active      1  mirrormaker                   0  no       Task Status: UNASSIGNED
passive        3.9.0    active      1  kafka          3/edge       205  no       

Unit              Workload  Agent  Machine  Public address  Ports           Message
active/0*         active    idle   0        10.86.75.171    9092,19092/tcp  
kafka-connect/0*  active    idle   2        10.86.75.45     8083/tcp        
mirrormaker/0*    active    idle   3        10.86.75.189    8080/tcp        Task Status: UNASSIGNED
passive/0*        active    idle   1        10.86.75.153    9092,19092/tcp  

[note] The task status might show as UNASSIGNED because no replication tasks are running yet. This is expected if the active Kafka cluster is idle. The task status changes to RUNNING once the replication tasks are created and started. [/note]
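To follow the task status without reading the whole table, the value can be extracted from the unit's status message. This is a rough sketch: the function name task_status is hypothetical, and it assumes the message keeps the "Task Status: VALUE" form shown in the output above.

```shell
# task_status UNIT: read `juju status` output on stdin and print the value
# that follows "Task Status:" on the row of the given unit, for example
# UNASSIGNED or RUNNING. Prints nothing if the unit or the message is absent.
# (Hypothetical helper; assumes the message format shown above.)
task_status() {
  awk -v unit="$1" '
    {
      name = $1
      sub(/\*$/, "", name)                 # drop the leader marker
    }
    name == unit && match($0, /Task Status: [A-Z_]+/) {
      # "Task Status: " is 13 characters long; print only the value
      print substr($0, RSTART + 13, RLENGTH - 13)
      exit
    }
  '
}

# Usage (requires a Juju client connected to the model):
#   juju status | task_status mirrormaker/0
# Poll until replication tasks are running:
#   until [ "$(juju status | task_status mirrormaker/0)" = "RUNNING" ]; do sleep 10; done
```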

With this, the deployment is complete. The Charmed Kafka Connect cluster will now start tasks to replicate data from the active cluster to the passive cluster.

Set up active-active replication

MirrorMaker also supports a deployment where both clusters are active, so that data is replicated from each cluster to the other. Two flows are needed in this scenario, one from cluster A to cluster B and one from cluster B to cluster A, each with its own MirrorMaker integrator.

In essence, this is equivalent to running two active-passive deployments, one for each direction.

We recommend having two Kafka Connect deployments ready, one on each end of the replication.

Deployment

To ensure that the topics are prefixed with the cluster name and do not collide with each other, deploy two separate MirrorMaker integrators with the config option prefix_topics=true:

juju deploy mirrormaker-connect-integrator --config prefix_topics=true mirrormaker-a-b
juju deploy mirrormaker-connect-integrator --config prefix_topics=true mirrormaker-b-a

Check the status of the deployed applications by running the juju status command. The result should be similar to:

Model  Controller  Cloud/Region         Version  SLA          Timestamp
k      vms         localhost/localhost  3.6.3    unsupported  10:59:37+02:00

App              Version  Status  Scale  Charm          Channel      Rev  Exposed  Message
kafka-a          3.9.0    active      1  kafka          3/edge       205  no       
kafka-b          3.9.0    active      1  kafka          3/edge       205  no       
kafka-connect-a           active      1  kafka-connect  latest/edge   20  no       
kafka-connect-b           active      1  kafka-connect  latest/edge   20  no       
mirrormaker-a-b           active      1  mirrormaker                   0  no       Task Status: UNASSIGNED
mirrormaker-b-a           active      1  mirrormaker                   0  no       Task Status: UNASSIGNED

Unit                Workload  Agent  Machine  Public address  Ports           Message
kafka-a/0*          active    idle   0        10.86.75.171    9092,19092/tcp  
kafka-b/0*          active    idle   1        10.86.75.153    9092,19092/tcp  
kafka-connect-a/0*  active    idle   2        10.86.75.45     8083/tcp        
kafka-connect-b/0*  active    idle   2        10.86.75.46     8083/tcp        
mirrormaker-a-b/0*  active    idle   3        10.86.75.189    8080/tcp        Task Status: UNASSIGNED
mirrormaker-b-a/0*  active    idle   3        10.86.75.190    8080/tcp        Task Status: UNASSIGNED

Then set up the required integrations as follows:

# active-passive  A -> B
juju integrate kafka-connect-b kafka-b
juju integrate kafka-connect-b mirrormaker-a-b
juju integrate mirrormaker-a-b:source kafka-a
juju integrate mirrormaker-a-b:target kafka-b

# active-passive  B -> A
juju integrate kafka-connect-a kafka-a
juju integrate kafka-connect-a mirrormaker-b-a
juju integrate mirrormaker-b-a:source kafka-b
juju integrate mirrormaker-b-a:target kafka-a
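Because both directions follow the same pattern, the eight commands above can be generated from the two cluster suffixes. The sketch below only prints the commands so that they can be reviewed first; the function names print_flow and print_active_active are hypothetical, and the application names assume the kafka-X, kafka-connect-X and mirrormaker-X-Y naming used in this guide.

```shell
# print_flow SOURCE TARGET CONNECT INTEGRATOR
# Print (without running) the four integrations for one replication
# direction. Kafka Connect is integrated with the target cluster.
print_flow() {
  src="$1"; dst="$2"; connect="$3"; integrator="$4"
  printf '# %s -> %s\n' "$src" "$dst"
  printf 'juju integrate %s %s\n' "$connect" "$dst"
  printf 'juju integrate %s %s\n' "$connect" "$integrator"
  printf 'juju integrate %s:source %s\n' "$integrator" "$src"
  printf 'juju integrate %s:target %s\n' "$integrator" "$dst"
}

# print_active_active A B
# Print the commands for both directions, assuming the naming used above.
print_active_active() {
  a="$1"; b="$2"
  print_flow "kafka-$a" "kafka-$b" "kafka-connect-$b" "mirrormaker-$a-$b"
  print_flow "kafka-$b" "kafka-$a" "kafka-connect-$a" "mirrormaker-$b-$a"
}

# Usage:
#   print_active_active a b        # review the generated commands
#   print_active_active a b | sh   # run them against the current model
```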

With this, the deployment is complete. Two replication flows now run between kafka-a and kafka-b, one in each direction. Topics are prefixed with the source cluster name so that they do not collide with each other. For example, a topic called demo created on kafka-a is replicated to kafka-b as a new topic named kafka-a.replica.demo, and vice versa.
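The naming in the example above can be expressed as a one-line helper, which may be handy when scripting consumers on the remote side. The function name replica_topic is hypothetical, and the pattern is inferred only from the single example in this guide; the integrator itself determines the actual topic names.

```shell
# replica_topic SOURCE_CLUSTER TOPIC
# Print the replicated topic name, following the pattern of the example
# above (<source cluster>.replica.<topic>). This is an assumption based on
# that one example, not a documented naming contract.
replica_topic() {
  printf '%s.replica.%s\n' "$1" "$2"
}

# Usage:
#   replica_topic kafka-a demo
```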
