Kafka Stretch Clusters

Sonam Vermani
4 min read · Oct 2, 2020

Introduction

Kafka runs as a cluster of one or more servers that can span multiple datacenters or cloud regions. A Kafka cluster is highly scalable and fault-tolerant: if any of its servers fails, the other servers take over its work to ensure continuous operation without any loss of data.

Stretch Clusters

Stretch clusters are intended to protect the Kafka cluster in the event an entire datacenter fails. This is done by installing a single Kafka cluster across multiple datacenters. This is NOT a multi-cluster setup: it is just one cluster whose brokers sit in different datacenters.
Kafka's normal replication mechanism is used, as usual, to keep all brokers in the cluster in sync.

The main advantage of this architecture is synchronous replication: some types of business simply require that their DR site is always 100% synchronised with the primary site. The other advantage is that both datacenters and all brokers in the cluster are actively used.
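To make this replication effectively synchronous in practice, producers are usually configured with acks=all, so a write is acknowledged only after the full set of in-sync replicas has it, including brokers in the second datacenter. A minimal sketch, assuming hypothetical broker addresses kafka-dc1:9092 and kafka-dc2:9092 and a hypothetical topic named stretch-topic:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class SyncReplicationProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Hypothetical bootstrap addresses; in practice list brokers from both datacenters.
        props.put("bootstrap.servers", "kafka-dc1:9092,kafka-dc2:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // acks=all: the leader waits for the full in-sync replica set before acknowledging.
        props.put("acks", "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("stretch-topic", "key", "value"));
            producer.flush();
        }
    }
}
```

On the topic side, min.insync.replicas controls how many replicas must be in sync before such a write succeeds; for a stretch cluster it is typically set so that an acknowledged write cannot live in only one datacenter.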

Let’s understand how this works

Assume we have a cluster of 5 broker nodes and 3 ZooKeeper nodes stretched across two datacenters, DC1 and DC2. Let’s first understand how Kafka partition distribution works:

When you first create a topic, Kafka decides how to allocate its partitions between the brokers. For a given topic, if there are 5 partitions and the replication factor is 3, there will be a total of 3 × 5 = 15 replicas spread across the 5 brokers.
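As a concrete illustration, here is a minimal sketch of creating such a topic with Kafka’s AdminClient; the topic name and bootstrap address are hypothetical:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateStretchTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Hypothetical bootstrap address for the stretch cluster.
        props.put("bootstrap.servers", "kafka-dc1:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 5 partitions x replication factor 3 = 15 replicas across the 5 brokers.
            NewTopic topic = new NewTopic("stretch-topic", 5, (short) 3);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```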

While doing these allocations, Kafka’s goals are:

  • Spread replicas evenly among brokers.
  • Make sure that for each partition each replica is on a different broker. If partition 0 has the leader on broker 2, the followers can be on brokers 3 and 4, but not on 2 and not both on 3.

Every partition has a partition leader, and all requests are served by that leader broker in order to guarantee consistency. Once the leader assignment is done, the remaining replicas (as determined by the replication factor) are placed on other brokers, which are marked as followers of the partition.
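You can see the resulting leader and follower assignment for each partition with the AdminClient as well. A minimal sketch, reusing the hypothetical topic and bootstrap address from above:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

public class DescribeTopicPlacement {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-dc1:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription description = admin
                    .describeTopics(Collections.singleton("stretch-topic"))
                    .all().get()
                    .get("stretch-topic");

            for (TopicPartitionInfo p : description.partitions()) {
                // leader() is the broker serving reads/writes for this partition;
                // replicas() are all assigned brokers; isr() are the in-sync ones.
                System.out.printf("P%d leader=%d replicas=%s isr=%s%n",
                        p.partition(), p.leader().id(), p.replicas(), p.isr());
            }
        }
    }
}
```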

There can be various possible combinations of replicas and partitions for any topic, so let’s understand how Kafka places the replicas onto the brokers.

For better understanding, each partition is represented with a different colour. Each broker is given a unique broker id, which is set in the broker’s server.properties.

Scenario #1 : Topic created with 5 Partitions and 3 Replicas

Kafka Cluster — Partition Distribution

In this example, a total of 15 replicas are placed across the brokers, and the following is one possible sequence of leaders and followers:

P0 — B4 is Leader, B5 & B1 followers

P1 — B5 is Leader, B1 & B2 followers

P2 — B1 is Leader, B2 & B3 followers

P3 — B2 is Leader, B3 & B4 followers

P4 — B3 is Leader, B4 & B5 followers

There can be various possible combinations of leader and follower brokers, depending on which broker Kafka chooses first. In the event of a failure, we would still have enough in-sync replicas for each of the partitions to serve requests.

Scenario #2 : Topic created with 1 Partition and 5 Replicas

In this example, there are only 5 copies to be placed across the brokers, and once the partition leader is decided, the remaining brokers will be the followers.

The disadvantage here is that, since there is only one partition and therefore only one leader, all requests will be served from that single leader broker and the load will not be balanced.
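If such a topic later becomes a bottleneck, its partition count can be increased so that the leaders (and the load) spread across more brokers; note that new messages with existing keys will then hash to different partitions. A minimal sketch, assuming the topic is named one-partition-topic:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewPartitions;

public class IncreasePartitions {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-dc1:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            // Grow the topic from 1 to 5 partitions so the load can be
            // spread over all 5 brokers in the stretch cluster.
            admin.createPartitions(
                    Collections.singletonMap("one-partition-topic", NewPartitions.increaseTo(5))
            ).all().get();
        }
    }
}
```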

Scenario #3 : Topic created with 3 Partitions and 3 Replicas

In this example there are 9 copies/replicas to be placed across the brokers. Kafka will make sure that copies of the same partition are not placed on the same broker. The following can be the sequence of leaders and followers:

P0 — B1 is Leader, B2 & B3 followers

P1 — B2 is Leader, B3 & B4 followers

P2 — B3 is Leader, B4 & B5 followers

Scenario #4 : Topic created with 1 Partition and 3 Replicas

In this scenario there are 3 copies to be placed. Let’s say the leader broker is B2; the remaining 2 copies will then be placed on, say, B3 and B4. This scenario may give you high availability in terms of replicas, but brokers B1 and B5 are idle as they have no processing to do, which means we are wasting resources. This scenario is not recommended for production clusters.
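For completeness, Kafka also lets you pin replicas to specific brokers when creating a topic, which reproduces exactly this kind of placement. A minimal sketch, assuming broker ids 2, 3 and 4 and a hypothetical topic name:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class ManualReplicaAssignment {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-dc1:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            // Partition 0 gets its replicas on brokers 2, 3 and 4; the first broker
            // in the list (2) becomes the preferred leader. Brokers 1 and 5 hold nothing.
            NewTopic topic = new NewTopic("scenario4-topic",
                    Collections.singletonMap(0, Arrays.asList(2, 3, 4)));
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```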

In a 5-node production cluster we should consider the following:

  • There should be enough in-sync replicas available so that Kafka can elect a new leader broker in the event of a failure (a quick way to check this is sketched after this list).
  • The load should be balanced, which means there should not be a single leader broker for the topic busy serving all the client requests while the other brokers in the cluster are just replicating the partition data.
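A minimal sketch of such a sanity check, reusing the hypothetical topic and bootstrap address from above and an assumed threshold of 2 in-sync replicas per partition:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.TopicDescription;
import org.apache.kafka.common.TopicPartitionInfo;

public class IsrSanityCheck {
    private static final int MIN_ISR = 2; // assumed threshold for this example

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-dc1:9092"); // hypothetical address

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription description = admin
                    .describeTopics(Collections.singleton("stretch-topic"))
                    .all().get()
                    .get("stretch-topic");

            for (TopicPartitionInfo p : description.partitions()) {
                if (p.isr().size() < MIN_ISR) {
                    System.out.printf("P%d is under-replicated: only %d in-sync replicas%n",
                            p.partition(), p.isr().size());
                }
            }
        }
    }
}
```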
