Lessons from managing a Pulsar cluster
Who am I?
● Senior Developer at Nutanix, responsible for all things Pulsar
● Love spending time with data (stores, streams, analytics, etc.)
● Ex-MySQL: started out with 3 great years building MySQL Replication
● Contributions to Pulsar & MySQL
https://www.linkedin.com/in/shivjijha/
https://twitter.com/ShivjiJha
What do we do?
● Helping customers manage cost and security for hybrid cloud
● Crunch (& stream) data to find insights about cost and security
● Needed pub/sub to store events and replay them when required
https://www.nutanix.com/products/beam
Platforms We Use
How do we choose a platform?
Avoid bias towards familiar technology
The First Steps
Summarising the GitHub comment
1. Kafka alternative - incubating Apache project Pulsar
2. Open sourced by Yahoo
3. Hundreds of billions of messages per day in Pulsar at Yahoo
4. Solves annoying problems in Kafka, such as:
a. Topic management
b. Disruptive rebalances
5. Same raw power (throughput, latencies, etc.)
6. Stateless brokers
7. Apache BookKeeper for storage
8. Stream + queue
Wow, that is a lot of promise!
First principles - Requirements?
1. Coordination
2. Persistence
3. Scale compute and storage independently
4. High Availability
5. Fault tolerance
6. Client ecosystem
Requirement # 1
✓ Coordination
Requirement # 2
✓ Persistence
Requirement # 3
✓ Scale compute and storage independently
Bookies => store
Brokers => serve msg
Requirement # 4
✓ High Availability
Replicated brokers
Replicated bookies
Requirement # 5
✓ Fault tolerance
✓ Replicated compute (brokers)
✓ Replicated store (bookkeeper / bookies)
✓ Tunable fault tolerance (bookkeeper)
✓ Ensemble size
✓ Write quorum size
✓ Ack quorum size
https://www.splunk.com/en_us/blog/it/why-apache-bookkeeper-part-1-consistency-durability-availability.html
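When scaling the bookie cluster, fine-tune the quorum sizes per namespace:
pulsar-admin namespaces set-persistence --bookkeeper-ensemble 5 --bookkeeper-write-quorum 3 --bookkeeper-ack-quorum 2 <tenant>/<namespace>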
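How the three knobs relate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `PersistencePolicy` and its methods are hypothetical names, not part of any Pulsar or BookKeeper API, and the striping shown is the round-robin placement as commonly described for BookKeeper ledgers.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PersistencePolicy:
    """Hypothetical helper mirroring the three BookKeeper knobs."""
    ensemble: int      # bookies a ledger is striped across
    write_quorum: int  # copies written for each entry
    ack_quorum: int    # confirmations awaited before the write is acked

    def __post_init__(self):
        # The sizes must satisfy ensemble >= write quorum >= ack quorum.
        if not (self.ensemble >= self.write_quorum >= self.ack_quorum >= 1):
            raise ValueError("need ensemble >= write_quorum >= ack_quorum >= 1")

    def slow_bookies_tolerated_per_write(self) -> int:
        # A write is acked without waiting for this many of its replicas.
        return self.write_quorum - self.ack_quorum

    def bookies_for_entry(self, entry_id: int) -> List[int]:
        # Round-robin striping: entry i lands on write_quorum consecutive
        # bookies (by index in the ensemble), starting at i mod ensemble.
        start = entry_id % self.ensemble
        return [(start + k) % self.ensemble for k in range(self.write_quorum)]


policy = PersistencePolicy(ensemble=5, write_quorum=3, ack_quorum=2)
print(policy.bookies_for_entry(0))                 # [0, 1, 2]
print(policy.bookies_for_entry(4))                 # [4, 0, 1]
print(policy.slow_bookies_tolerated_per_write())   # 1
```

With ensemble 5 and write quorum 3, each bookie holds roughly three fifths of the entries, which is why adding bookies without revisiting these sizes changes how load and durability are spread.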
Requirement # 6
✓ Client ecosystem
✓ Work in progress
✓ Compensating factors:
✓ Clients are easier to change, just a library after all!
✓ Very active community (Slack)
✓ Quick turnaround (and quick fixes) for critical issues
✓ Bonus features
✓ Load balancer auto balances topics among brokers
✓ Tiered storage
✓ Unified platform (Stream + Queue)
✓ Multi-tenant topic structure
Tuning Configurations
✓ Default configurations may be optimized for backward compatibility
✓ Not necessarily for performance
✓ Not necessarily for the latest features
✓ Perf test for your use cases and tune!
Performance Testing Pulsar with Locust
https://locust.io/
[Screenshots: sync message test, async message test]
Tuning Configurations
✓ Durability vs throughput (bookkeeper.conf)
# Maximum latency to impose on a journal write to achieve grouping
journalMaxGroupWaitMSec=2
Tuning Configurations
✓ Disable auto recovery in bookkeeper when out for maintenance!
bookkeeper shell autorecovery -disable
STOP / MAINTENANCE / START
bookkeeper shell autorecovery -enable
Tuning Configurations
✓ Auto recovery vs throughput (bookkeeper.conf)
✓ If you have a small number of bookies and one of them goes down, auto recovery
may overwhelm the remaining bookies
✓ Limit on pending read requests per read worker thread
maxPendingReadRequestsPerThread=2500
✓ Number of entries that a replication worker will re-replicate in parallel
rereplicationEntryBatchSize=100
Contribute to stay in sync
1. Development is fast, in fact very fast
a. Don’t maintain forks, easier to contribute
2. We do the same!
https://github.com/apache/pulsar/graphs/contributors
Pulsar Use cases In Beam
&
Event Sourcing
1. Persisting your application's state by storing the history that
determines the current state of your application.
State of application at
any point in time
State of application at
this instant of time
https://docs.microsoft.com/en-us/previous-versions/msp-n-p/jj591559(v=pandp.10)
Event Sourcing
● History of events
● Past-tense verbs
● Immutable
● Ordered
● Restore state at any point in time
● Use: CQRS, audit trail, etc.
https://docs.microsoft.com/en-us/azure/architecture/patterns/event-sourcing
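A minimal Python sketch of the idea, assuming a made-up budget domain: the event kinds (`BudgetCreated`, `CostIncurred`, `BudgetIncreased`) and the `replay` helper are hypothetical and only illustrate how an immutable, ordered history yields the state at any point in time.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)  # frozen: events are immutable once recorded
class Event:
    seq: int     # position in the ordered history
    kind: str    # past-tense verb describing what happened
    amount: int


def apply(state: int, event: Event) -> int:
    """Fold one event into the current state (here, a remaining budget)."""
    if event.kind == "BudgetCreated":
        return event.amount
    if event.kind == "BudgetIncreased":
        return state + event.amount
    if event.kind == "CostIncurred":
        return state - event.amount
    return state  # unknown event kinds leave the state untouched


def replay(history: Iterable[Event], up_to_seq: Optional[int] = None) -> int:
    """Rebuild state by replaying events in order, optionally stopping early."""
    state = 0
    for event in sorted(history, key=lambda e: e.seq):
        if up_to_seq is not None and event.seq > up_to_seq:
            break
        state = apply(state, event)
    return state


history = (
    Event(1, "BudgetCreated", 100),
    Event(2, "CostIncurred", 30),
    Event(3, "BudgetIncreased", 50),
)

print(replay(history))               # 120: state at this instant
print(replay(history, up_to_seq=2))  # 70: state as of event 2
```

In a Pulsar setting the history would live on a topic and the replay would be a reader starting from the earliest message; the in-memory tuple here only stands in for that log.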
Representing Events (Schema)
1. Pulsar supports bytes, string, Avro, Protobuf, JSON, etc.
2. Schemaless?
a. Any code that manipulates the data needs to make some assumptions about its
structure.
b. All producers and consumers know the hidden implicit schema.
3. Opinion: Use schema as far as possible.
a. Pulsar supports schema registry out of the box.
Representing Events (Schema)
1. Of course, Schemalessness offers a pragmatic alternative at times.
https://martinfowler.com/articles/schemaless/#non-uniform-types
Add custom fields for UI etc.
Different attributes depending on kind of event
Obviously easy for schemaless, but still needs care!
What to put on ONE topic?
1. Two choices:
a. Topic == collection of events of same type
b. Topic == events that need relative ordering guarantee.
2. Winner: choice (b)
https://martin.kleppmann.com/2018/01/18/event-types-in-kafka-topic.html
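Why choice (b) wins can be shown with a small Python sketch. Everything here is hypothetical (event names, the `consume_per_type` helper); plain lists stand in for topics, and the point is only that ordering is preserved within a topic, not across topics.

```python
from collections import defaultdict

# Hypothetical events for one account, in the order they were produced.
produced = [
    ("AccountCreated", "acct-1"),
    ("BudgetSet", "acct-1"),
    ("AccountDeleted", "acct-1"),
]

# Choice (b): one topic carries every event that needs relative ordering.
single_topic = list(produced)

# Choice (a): one topic per event type. Order is kept only within a type.
per_type_topics = defaultdict(list)
for kind, key in produced:
    per_type_topics[kind].append((kind, key))


def consume_per_type(topics, topic_order):
    """Drain each per-type topic in turn, as independent consumers might."""
    seen = []
    for name in topic_order:
        seen.extend(topics[name])
    return seen


# One possible interleaving when the topics are read independently.
interleaved = consume_per_type(
    per_type_topics, ["AccountDeleted", "AccountCreated", "BudgetSet"]
)

print([kind for kind, _ in single_topic])
# ['AccountCreated', 'BudgetSet', 'AccountDeleted']
print([kind for kind, _ in interleaved])
# ['AccountDeleted', 'AccountCreated', 'BudgetSet']
```

With per-type topics a consumer can see the deletion before the creation, so it has to reconstruct the order itself; a single topic makes that problem disappear at the cost of mixed event types on one stream.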
Avro / Proto (Struct) Schema
1. Language-agnostic schema. Being stuck with one language sucks!
2. JSON seems the first pick if you use REST, but it is
a. slow and
b. too verbose.
c. The complete schema (field names) is shipped with every message
3. Avro and Protobuf are good.
4. We like Avro for its wide adoption.
a. And we use Pulsar's built-in schema registry
5. Consider keeping the schema flat and fat (denormalize)!
https://martin.kleppmann.com/2012/12/05/schema-evolution-in-avro-protocol-buffers-thrift.html
Schema Evolution
1. Choose a schema-auto-update strategy that suits your use case.
a. We keep it forward compatible (add fields, delete optional fields)
b. Data produced with the new schema can be read by consumers using the previous schema
c. Update the producer first, then consumers when they have time / need.
2. Each Avro message carries an Avro schema ID & version.
3. Decode with the exact writer schema.
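Forward compatibility can be illustrated with a simplified Python model of reader-side schema resolution. This is not the Avro library: `READER_SCHEMA_V1`, `REQUIRED`, and `read_with_schema` are hypothetical names, and the schema is reduced to a mapping from field name to default value.

```python
# Sentinel marking reader fields that have no default value.
REQUIRED = object()

# Schema still used by a consumer that has not been upgraded yet.
READER_SCHEMA_V1 = {
    "account_id": REQUIRED,
    "cost": REQUIRED,
    "region": "unknown",  # optional: the reader has a default for it
}


def read_with_schema(record: dict, reader_schema: dict) -> dict:
    """Project a record written with a newer schema onto the reader schema."""
    decoded = {}
    for field, default in reader_schema.items():
        if field in record:
            decoded[field] = record[field]
        elif default is not REQUIRED:
            decoded[field] = default  # writer dropped an optional field
        else:
            raise ValueError(f"missing required field: {field}")
    return decoded  # fields the reader does not know are ignored


# Written by an upgraded producer: adds "currency", drops optional "region".
record_v2 = {"account_id": "acct-1", "cost": 42, "currency": "USD"}

print(read_with_schema(record_v2, READER_SCHEMA_V1))
# {'account_id': 'acct-1', 'cost': 42, 'region': 'unknown'}
```

The two allowed changes map directly onto the two branches: an added field is ignored by the old reader, and a deleted optional field falls back to the reader's default. Deleting a field without a default would raise, which is the kind of change a forward-compatible strategy rejects.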
Summarizing Lessons
✓ Avoid bias towards the "known" when choosing a platform.
✓ Tune replication (ensemble, write quorum, ack quorum) when
scaling out bookies horizontally.
✓ Use schema, as far as possible!
✓ Tune configuration for size, resources, throughput, durability, etc.
Defaults may be optimized for backward compatibility.
✓ Disable auto-recovery before taking a bookie down.
✓ Balance recovery with incoming user traffic.
✓ Put events that require ordering on the same topic.
Stay Connected:
● Pulsar Mailing Lists
○ users@pulsar.apache.org
○ dev@pulsar.apache.org
● Pulsar Slack
○ https://apache-pulsar.slack.com
● You can contact me at:
○ https://twitter.com/ShivjiJha
○ https://www.linkedin.com/in/shivjijha/
Q & A Time
