A Hitchhiker's Guide to Apache Kafka® Geo-Replication
Sanjana Kaundinya | Senior Software Engineer
Rajini Sivaram | Principal Software Engineer
Overview of Kafka Replication
Kafka Overview
● Broker - Stores messages in partitions
● Topic - Virtual Group of one or more partitions
● Partitions - Log files on disk with only sequential writes. Kafka guarantees message ordering within a partition.
[Figure: a broker with topic T1 and partitions P1 and P2, with producers (P) and consumers (C)]
Kafka Log Offsets
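[Figure: Partition 1 as a log with offsets 0-9, marked with startOffset, the positions of consumer groups CG1 and CG2, the high watermark (HW), and the log end offset (LEO); a producer appends to the log, consumers C1 and C2 read from it, and the __consumer_offsets topic is shown alongside]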
● Append Only Log
● Log End Offset
● High Watermark
● Consumer Offsets
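The four offset concepts above can be sketched as a toy in-memory log. This is an illustrative model only, not Kafka's implementation: the class `PartitionLog` and its method names are hypothetical, and it assumes consumers may read only up to the high watermark.

```python
class PartitionLog:
    """Toy model of one partition's log; all names here are hypothetical."""

    def __init__(self):
        self.records = []            # append-only: list index == offset
        self.high_watermark = 0      # first offset that is not yet committed
        self.consumer_offsets = {}   # consumer group -> next offset to read

    @property
    def log_end_offset(self):
        # LEO: the offset the next appended record will get
        return len(self.records)

    def append(self, value):
        offset = self.log_end_offset
        self.records.append(value)
        return offset

    def advance_high_watermark(self, new_hw):
        # The HW only moves forward and never passes the LEO
        self.high_watermark = max(self.high_watermark,
                                  min(new_hw, self.log_end_offset))

    def poll(self, group, max_records=10):
        # Consumers see only records below the high watermark
        start = self.consumer_offsets.get(group, 0)
        end = max(start, min(start + max_records, self.high_watermark))
        return self.records[start:end], end

    def commit(self, group, offset):
        # Stand-in for writing the group's position to __consumer_offsets
        self.consumer_offsets[group] = offset


log = PartitionLog()
for value in ["a", "b", "c", "d", "e", "f"]:
    log.append(value)
log.advance_high_watermark(4)          # offsets 0-3 are committed
batch, next_offset = log.poll("CG1")   # reads offsets 0-3 only
log.commit("CG1", next_offset)
```

Records at offsets 4 and 5 exist in the log (below the LEO) but stay invisible to the consumer group until the high watermark advances past them.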
Why Do We Need Replication?
How can a broker go down?
● Controlled shutdown
● Uncontrolled shutdown
What happens when a broker goes down?
● Durability
● Availability
Kafka Replication
● Partition replicas are evenly distributed
● Byte for byte copy of each other
● One replica is a leader and all writes go to the leader
● Leader decides when to commit data
[Figure: partitions P1-P4 spread across brokers, one leader (L) per partition, replication factor = 3]
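A minimal sketch of what "evenly distributed" can look like, assuming a plain round-robin placement. The function `assign_replicas` is hypothetical; Kafka's actual assignment logic is more involved (for example, it can take racks into account), so treat this only as an illustration of the idea.

```python
from collections import Counter


def assign_replicas(num_partitions, brokers, replication_factor):
    """Round-robin placement: the first replica in each list acts as leader."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for partition in range(num_partitions):
        assignment[partition] = [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
    return assignment


assignment = assign_replicas(4, ["b0", "b1", "b2", "b3"], 3)
leaders = [replicas[0] for replicas in assignment.values()]
replicas_per_broker = Counter(b for replicas in assignment.values() for b in replicas)
```

With four partitions, four brokers, and a replication factor of 3, each broker leads one partition and hosts three replicas in total.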
How Are Messages Committed?
● Leader maintains the set of in-sync replicas (ISR); a message is committed once all replicas in the ISR have it
● Failure cases are handled with the use of a leader epoch
● The leader epoch is part of the message (KIP-101)
[Figure: replicas R1-R4 spread across brokers, with leaders marked (L)]
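The commit rule can be sketched as follows, assuming the leader tracks each replica's log end offset (LEO) and the time it last caught up. Both functions are hypothetical simplifications; `max_lag_ms` is loosely modeled on the broker setting `replica.lag.time.max.ms`, and leader epochs are left out entirely.

```python
def high_watermark(leo_by_replica, isr):
    """Committed boundary: the smallest log end offset among in-sync replicas."""
    return min(leo_by_replica[replica] for replica in isr)


def shrink_isr(isr, leader, last_caught_up_ms, now_ms, max_lag_ms):
    """Drop followers that have not caught up recently; the leader always stays."""
    return {
        replica for replica in isr
        if replica == leader or now_ms - last_caught_up_ms[replica] <= max_lag_ms
    }


leo = {"b1": 10, "b2": 8, "b3": 5}
isr = {"b1", "b2", "b3"}
hw_before = high_watermark(leo, isr)   # held back by the slowest ISR member

# b3 last caught up 39 seconds ago, so it falls out of the ISR
isr = shrink_isr(isr, "b1", {"b1": 40_000, "b2": 39_000, "b3": 1_000},
                 now_ms=40_000, max_lag_ms=30_000)
hw_after = high_watermark(leo, isr)
```

Removing a lagging follower from the ISR lets the high watermark advance, at the cost of fewer copies of the newly committed messages.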
Salient Points for Replication
● Intra-cluster replication helps improve durability and availability for node-level failures.
● Offsets are a core piece of the Kafka producer and consumer ecosystem.
● The Kafka replication protocol ensures strong consistency through byte-for-byte replication and message ordering guarantees.
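Multi-Zone (MZ) HA Kafka Cluster
[Figure: brokers (B) and ZooKeeper nodes (ZK) spread across availability zones AZ1, AZ2, and AZ3, with a producer (P) and a consumer (C)]
● Inter-zone latency: < 10 ms (typical ~3 ms)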
Why Do We Need to Replicate Globally?
● Global Availability
● Protection against disasters
○ Natural disaster
○ Cloud provider outage
● Regulatory Compliance
● Aggregate Clusters
● IoT use cases
● Migration from one region to another
Differences Among Multi-DC Solutions
Stretched Clusters vs. Connected Clusters
Kafka Stretched Clusters
Stretched Clusters
● Offset Preserving
● Fast Disaster Recovery
● Automated Client Failover with No Custom Code
● Sync or Async Replication per Topic with Confluent’s Multi-Region Clusters
3 DC Stretched Cluster
2.5 DC Stretched Cluster
Fetch from Followers
● With KIP-392, consumers can read from the closest replica
● This helps to save on networking costs and helps with overall latency
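A rough sketch of the selection idea behind fetching from followers, assuming each replica and each consumer is tagged with a rack (or zone) label. The function `select_read_replica` is hypothetical. In Apache Kafka the feature is configured through the consumer setting `client.rack` and the broker setting `replica.selector.class`, and the built-in rack-aware selector applies its own rules for choosing among replicas in the same rack.

```python
def select_read_replica(client_rack, leader, isr_racks):
    """Prefer an in-sync replica in the client's rack; otherwise use the leader.

    isr_racks maps replica id -> rack label for in-sync replicas only.
    """
    same_rack = sorted(
        replica for replica, rack in isr_racks.items() if rack == client_rack
    )
    # Picking the lowest id is an arbitrary tie-break for this sketch
    return same_rack[0] if same_rack else leader


isr_racks = {1: "az1", 2: "az2", 3: "az3"}
local_choice = select_read_replica("az2", leader=1, isr_racks=isr_racks)
fallback_choice = select_read_replica("az9", leader=1, isr_racks=isr_racks)
```

A consumer in `az2` reads from replica 2 and avoids cross-zone traffic, while a consumer with no local replica falls back to the leader.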
Confluent Multi-Region Clusters (MRC)
[Figure: replicas labeled Leader, Follower, and Observer]
● Sync vs Async replication
● Replica placement
MRC: Automatic Observer Promotion
[Figure: replicas labeled Leader, Follower, and Observer]
observerPromotionPolicy
● under-min-isr
● under-replicated
● leader-is-observer
Network Considerations
● Single Kafka Cluster with bi-directional connectivity
● Cost of cross-DC traffic
● Network Latency: < 50 ms between DCs
○ Sync: client impact
○ Async: durability impact
● Network partitions
● Replication tuning: buffer sizes, fetcher threads
Security Considerations
● Authentication using SSL or SASL_SSL for inter-broker connections
● Wire encryption using TLS
● Single Kafka Cluster
○ Single account and access management for clients
○ ACLs apply across the whole cluster
Kafka Connected Clusters
Clusters Can Replicate Using Kafka Connect
● Two separate Kafka clusters are in use
● Different from a single stretched cluster
● Offset Translation
● MirrorMaker 2.0 and Confluent Replicator
[Figure: Connect-based replication]
Fundamentals of Kafka Connect
● Offset management
● Elastic scalability
● Parallelization
● Task distribution
● Failure & Retries
● Configuration Management
● REST API
Multi-Geo Replication Through MirrorMaker 2
Offset Translation in MirrorMaker 2.0
● offset_sync: topic, partition, src offset, matching dest offset
● checkpoints: topic, partition, group name, consumer group src offset, matching dest offset
[Figure: a consumer, translateOffsets, and the destination cluster]
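A simplified sketch of how a consumer group's source offset could be mapped to a destination offset from offset-sync records like those listed above. The function `translate_offset` is hypothetical and assumes offsets advance one-for-one between sync points, which does not always hold (for example, when the source log contains gaps); real implementations may translate more conservatively.

```python
import bisect


def translate_offset(offset_syncs, src_offset):
    """Map a source-cluster offset to a destination-cluster offset.

    offset_syncs is a list of (src offset, matching dest offset) pairs
    for one topic partition, sorted by src offset.
    """
    src_points = [src for src, _ in offset_syncs]
    index = bisect.bisect_right(src_points, src_offset) - 1
    if index < 0:
        return None  # no sync point at or before this offset
    sync_src, sync_dest = offset_syncs[index]
    # Assumption: one-for-one offset growth since the sync point
    return sync_dest + (src_offset - sync_src)


syncs = [(0, 0), (100, 80), (200, 150)]
translated = translate_offset(syncs, 120)   # 20 records past the (100, 80) sync
```

A checkpoint record would then pair the group's source offset (120 here) with the translated destination offset, so that a consumer failing over to the destination cluster can resume near where it left off.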
Offset Translation in Replicator
Network Considerations
● Where to run Connect-based clusters?
○ Local producer, remote consumer
● Connectivity from Connect to source and destination brokers
○ Firewalls
● High-latency networks
○ Kafka batch sizes
○ TCP buffers: OS level and application level
○ Automatic window scaling
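For the TCP buffer point, a common rule of thumb is the bandwidth-delay product: the amount of data that must be in flight to keep a link full. The sketch below is plain arithmetic with hypothetical example numbers. Mapping the result onto Kafka settings such as `send.buffer.bytes` and `receive.buffer.bytes` on clients, or `socket.send.buffer.bytes` and `socket.receive.buffer.bytes` on brokers, is an assumption about what "application level" refers to here.

```python
def bandwidth_delay_product_bytes(bandwidth_mbps, rtt_ms):
    """Bytes in flight needed to saturate a link with the given round-trip time."""
    bits_in_flight = bandwidth_mbps * 1_000_000 * rtt_ms // 1000
    return bits_in_flight // 8


# Hypothetical links: 1 Gbps with 100 ms RTT (cross-continent) vs. 3 ms RTT (cross-zone)
wan_buffer = bandwidth_delay_product_bytes(1000, 100)
zone_buffer = bandwidth_delay_product_bytes(1000, 3)
```

On the high-latency link a single connection needs roughly 12.5 MB of buffer to use the full bandwidth, compared with under 400 KB across zones, which is why buffer sizes that work within a region can throttle cross-region replication.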
Security Considerations
● Credentials
○ Source credentials
○ Destination credentials
○ Externalize passwords
● Wire encryption using TLS
● Access control
○ Access to read from source cluster
○ Access to write to destination cluster
○ Naming conventions: prefixed ACLs
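The naming-convention point can be illustrated with a toy ACL check. Kafka ACLs support literal and prefixed resource patterns; the function `is_allowed`, the dictionary layout, and the principal and topic names below are hypothetical, and the sketch ignores deny rules, wildcards, and host restrictions.

```python
def is_allowed(acls, principal, operation, topic):
    """Return True if any allow rule matches the principal, operation, and topic."""
    for acl in acls:
        if acl["principal"] != principal or acl["operation"] != operation:
            continue
        if acl["pattern_type"] == "LITERAL" and acl["name"] == topic:
            return True
        if acl["pattern_type"] == "PREFIXED" and topic.startswith(acl["name"]):
            return True
    return False


# One prefixed rule covers every topic replicated from the "west" cluster
acls = [
    {"principal": "User:replicator", "operation": "WRITE",
     "pattern_type": "PREFIXED", "name": "west."},
]
can_write_replicated = is_allowed(acls, "User:replicator", "WRITE", "west.orders")
can_write_local = is_allowed(acls, "User:replicator", "WRITE", "orders")
```

If replicated topics share a prefix, a single prefixed rule grants the replication principal write access to all of them without touching locally produced topics.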
Connecting Clusters Sans Kafka Connect
Cluster Linking
● Multi-continent replication without an external system
● Offset preserving, eliminating the need for offset translation
● Has similar use cases to Kafka Connect-based architectures
Multi-Geographic Deployment Strategies with Apache Kafka
Active-Passive
● One cluster is the primary, the other cluster is the standby
● The primary cluster is the only one written to
● Commonly used topology for regulatory compliance
[Figure: a producer writes to the active DC, which replicates to the passive DC; each DC has a consumer]
Active-Active
● Two clusters replicate to each other
● Records are produced to both clusters and seen by clients in both clusters
● Used for a globally distributed architecture where data needs to be regionally available
[Figure: two active DCs, each with a producer and a consumer, replicating to each other]
Preventing Cyclic Replication in an Active-Active Setup
How do connected clusters prevent cyclic replication?
● MirrorMaker 2.0 uses alias detection
● Confluent Replicator adds a provenance header to each record, which contains:
○ ID of the origin cluster
○ Name of the topic
○ Timestamp
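Both approaches can be sketched in a few lines. This is an illustration under stated assumptions, not either tool's actual code: the function names and the header layout are hypothetical, the alias check assumes replicated topics are named `<source alias>.<topic>`, and the provenance check assumes each header carries the three fields listed above.

```python
def should_replicate_by_alias(topic, target_alias, separator="."):
    """Alias detection: skip topics that already carry the target cluster's alias."""
    return not topic.startswith(target_alias + separator)


def should_replicate_by_provenance(provenance_headers, destination_cluster_id):
    """Provenance check: skip records that originated on the destination cluster."""
    return all(
        header["cluster_id"] != destination_cluster_id
        for header in provenance_headers
    )


# "west.orders" was replicated from west, so it must not be sent back to west
back_to_origin = should_replicate_by_alias("west.orders", target_alias="west")
local_topic = should_replicate_by_alias("orders", target_alias="west")

provenance = [{"cluster_id": "west-id", "topic": "orders", "timestamp": 1650000000}]
send_to_west = should_replicate_by_provenance(provenance, "west-id")
send_to_east = should_replicate_by_provenance(provenance, "east-id")
```

The alias approach decides per topic from the name alone, while the provenance approach decides per record, so it also works when topic names are kept identical on both clusters.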
Fan-In, AKA Aggregation
● Multiple clusters write to one centralized cluster
● Can aggregate into one centralized topic or do this on the central cluster
● Use cases: aggregation, analytics, IoT
[Figure: producers in three DCs replicate into an aggregate DC, where a consumer reads]
Fan-Out
● One cluster writes out to multiple other clusters
● Only one cluster is actively produced to
● Use cases: expanded version of active-passive setups, IoT
[Figure: a producer writes to a central DC, which replicates to three DCs, each with a consumer]
Disaster Recovery: Failing Over
● If the primary cluster goes down, all producers have to be moved to the secondary cluster
● Need to ensure that consumer applications can resume where they last left off
[Figure: cluster A (primary) replicates to cluster B (secondary); each cluster has three replicas (R) and ZooKeeper (ZK), with a producer and a consumer]
Disaster Recovery: Failing Back
● Once the disaster is mitigated, switch back to the primary cluster
● Have to ensure client applications can write back to the original cluster
[Figure: clusters A (primary) and B (secondary), each with three replicas (R) and ZooKeeper (ZK), with a producer and a consumer; labels: Resume Replication, Reconciliation]
Disaster Recovery: Metrics
1. Operational - Last point system was operational
2. Disaster - Disaster strikes and system goes down
3. Recovery - Begin recovery after disaster strikes
4. Normalcy - System back to being operational
● Recovery Point Objective (RPO)
● Recovery Time Objective (RTO)
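A small worked example of the two metrics, assuming the common definitions: the recovery point is the window of data at risk between the last replicated record and the disaster, and the recovery time is the span from the disaster until service is restored. The function names, timestamps, and targets below are hypothetical.

```python
from datetime import datetime, timedelta


def measure_recovery(last_replicated_at, disaster_at, restored_at):
    """Return (achieved recovery point, achieved recovery time) as timedeltas."""
    recovery_point = disaster_at - last_replicated_at   # data that may be lost
    recovery_time = restored_at - disaster_at           # how long service was down
    return recovery_point, recovery_time


def meets_objectives(recovery_point, recovery_time, rpo, rto):
    return recovery_point <= rpo and recovery_time <= rto


recovery_point, recovery_time = measure_recovery(
    last_replicated_at=datetime(2022, 4, 25, 12, 0, 0),
    disaster_at=datetime(2022, 4, 25, 12, 0, 30),
    restored_at=datetime(2022, 4, 25, 12, 10, 30),
)
within_targets = meets_objectives(
    recovery_point, recovery_time,
    rpo=timedelta(minutes=1), rto=timedelta(minutes=15),
)
```

With asynchronous replication the recovery point is bounded by replication lag at the moment of failure; synchronous replication in a stretched cluster aims to bring it to zero.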
Which multi-geo deployment to choose?
● It really depends!
● Considerations:
○ Cost
○ Business Requirements
○ Use Case
○ Regulatory Compliance
● Two must-haves:
○ Resilient to disasters
○ Security
Questions?

A Hitchhiker's Guide to Apache Kafka Geo-Replication with Sanjana Kaundinya and Rajini Sivaram | Kafka Summit London 2022
