APACHE ZOOKEEPER
Viet-Dung TRINH (Bill), 03/2016
Saltlux – Vietnam Development Center
Agenda
•  Overview
•  The ZooKeeper Service
•  The ZooKeeper Data Model
•  Recipes
Overview – What is ZooKeeper?
•  An open source, high-performance
coordination service for distributed
applications.
•  Exposes common services through a simple
interface:
•  Naming
•  Configuration management
•  Locks & synchronization
•  Group services
•  Build your own services on top of it for specific needs
Overview – Who uses ZooKeeper?
•  Companies:
•  Yahoo!
•  Zynga
•  Rackspace
•  LinkedIn
•  Netflix, and many more…
•  Projects:
•  Apache Hadoop MapReduce (YARN)
•  Apache HBase
•  Apache Kafka
•  Apache Storm
•  Neo4j, and many more…
Overview – ZooKeeper Use Cases
•  Configuration Management
•  Cluster member nodes bootstrap their configuration from a
centralized source in an unattended way
•  Distributed Cluster Management
•  Node join / leave
•  Node statuses in real time
•  Naming service – e.g. DNS
•  Distributed synchronization – locks, barriers, queues
•  Leader election in a distributed system
The ZooKeeper Service (ZKS)
•  The ZooKeeper service is replicated over a set of machines
•  All machines store a copy of the data (in-memory)
•  A leader is elected on service startup
•  Each client connects to a single ZooKeeper server and maintains a
TCP connection to it
The ZKS - Sessions
•  Before executing any request, a client must establish a
session with the service
•  All operations a client submits to the service are associated with
a session
•  A client initially connects to any server in the ensemble, and
only to a single server.
•  Sessions offer ordering guarantees: requests in a session are
executed in FIFO order
The ZKS – Session States and Lifetime
•  Main possible states: CONNECTING, CONNECTED,
CLOSED, NOT_CONNECTED
The ZooKeeper Data Model (ZDM)
•  Hierarchical namespace
•  Each node is called a ZNode
•  Every ZNode has data (given as byte[])
and can optionally have children
•  ZNode paths:
•  Canonical, absolute, slash-separated
•  No relative references
•  Names can have Unicode characters
•  Each ZNode maintains a stat structure
ZDM - Versions
•  Each ZNode has a version number, which is incremented every
time its data changes
•  setData and delete take a version as input; the operation
succeeds only if the client’s version equals the server’s
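The version check above amounts to a compare-and-set. The following is a minimal in-memory sketch of that idea, not the ZooKeeper client API: `VersionedNode` and `BadVersionError` are hypothetical names, and the use of -1 to skip the check is included as an assumption modelled on ZooKeeper's convention.

```python
class BadVersionError(Exception):
    """Raised when the caller's expected version does not match the stored one."""


class VersionedNode:
    """Toy stand-in for a znode: a byte payload plus a data version counter."""

    def __init__(self, data: bytes = b""):
        self.data = data
        self.version = 0  # assumption: a freshly created node starts at version 0

    def set_data(self, data: bytes, expected_version: int) -> int:
        # Assumption: -1 means "skip the version check".
        if expected_version != -1 and expected_version != self.version:
            raise BadVersionError(
                f"expected {expected_version}, node is at {self.version}"
            )
        self.data = data
        self.version += 1  # every successful data change bumps the version
        return self.version


node = VersionedNode(b"a")
node.set_data(b"b", expected_version=0)      # matches, version becomes 1
try:
    node.set_data(b"c", expected_version=0)  # stale version, rejected
    stale_write_rejected = False
except BadVersionError:
    stale_write_rejected = True
```

A client that loses this race would typically re-read the node to learn the current version and then retry its update.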
ZDM – ZNodes – Stat Structure
•  The Stat structure for each znode in ZooKeeper is made
up of the following fields:
•  czxid
•  mzxid
•  pzxid
•  ctime
•  mtime
•  dataVersion
•  cversion
•  aclVersion
•  ephemeralOwner
•  dataLength
•  numChildren
ZDM – Types of ZNode
•  Persistent ZNode
•  Remains in ZooKeeper’s namespace until it is explicitly
deleted (by a delete API call)
•  Ephemeral ZNode
•  Is deleted by the ZooKeeper service when the creating client’s session
ends
•  Can also be explicitly deleted
•  Is not allowed to have children
•  Sequential ZNode
•  Is assigned a sequence number by ZooKeeper as part of its name
during creation
•  The sequence number is a 4-byte integer, formatted as 10 digits with
zero padding. E.g. /path/to/znode-0000000001
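As a small illustration of the naming format above, the sketch below builds sequential names with 10-digit zero padding. `sequential_name` is a hypothetical helper, not a ZooKeeper call; the server assigns the number itself.

```python
def sequential_name(prefix: str, counter: int) -> str:
    """Append a 10-digit, zero-padded counter to the given path prefix."""
    return f"{prefix}{counter:010d}"


names = [sequential_name("/path/to/znode-", n) for n in (1, 2, 10)]

# Zero padding makes plain string sorting agree with numeric order,
# which is why clients can simply sort the child names.
names_sort_numerically = sorted(names) == names
```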
ZDM – Znode Operations
ZDM – Znode – Reads & Writes
•  Read requests are processed locally at the ZooKeeper
server to which client is currently connected
•  Write requests are forwarded to leader and go through
majority consensus before a response is generated
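To make "majority consensus" concrete, here is a short arithmetic sketch. It assumes that majority means a strict majority of the servers in the ensemble; the function names are illustrative only.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# (ensemble size, quorum size, tolerated failures)
table = [(n, quorum_size(n), tolerated_failures(n)) for n in (3, 4, 5)]
```

Under this assumption a 4-server ensemble tolerates no more failures than a 3-server one, which is one reason odd ensemble sizes are commonly chosen.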
ZDM – Consistency Guarantees
•  Sequential Consistency
•  Atomicity
•  Single System Image
•  Reliability
•  Timeliness (Eventual Consistency)
ZDM - Watches
•  A watch is a one-time trigger: when the data it was set on
changes, a watch event is sent to the client that set the
watch.
•  Watches let clients get notified when a znode
changes in any way (NodeChildrenChanged,
NodeCreated, NodeDataChanged, NodeDeleted)
•  All read operations (getData(), getChildren(), exists())
have the option of setting a watch
•  ZooKeeper guarantees about watches:
•  Watches are ordered: the order of watch events corresponds to the
order of the updates
•  A client will see a watch event for a znode it is watching before
seeing the new data that corresponds to that znode
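The one-time nature of a watch can be sketched with a toy in-memory store. This is an illustration only: `WatchedStore` is a hypothetical class, it models just the NodeDataChanged case, and it does not model sessions or the ordering guarantees listed above.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional

Watcher = Callable[[str, str], None]  # called with (event_type, path)


class WatchedStore:
    """Toy key/value store whose watches fire once and are then discarded."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}
        self._watches: Dict[str, List[Watcher]] = defaultdict(list)

    def get_data(self, path: str, watcher: Optional[Watcher] = None) -> Optional[bytes]:
        # Reading is the only way to register a watch, as with getData().
        if watcher is not None:
            self._watches[path].append(watcher)
        return self._data.get(path)

    def set_data(self, path: str, data: bytes) -> None:
        self._data[path] = data
        # pop() removes the registered watchers, so each fires at most once.
        for watcher in self._watches.pop(path, []):
            watcher("NodeDataChanged", path)


events = []
store = WatchedStore()
store.set_data("/config", b"v1")
store.get_data("/config", watcher=lambda event, path: events.append((event, path)))
store.set_data("/config", b"v2")  # fires the watch
store.set_data("/config", b"v3")  # no watch left, nothing fires
```

To keep receiving notifications, the client in this sketch would have to pass a watcher again on its next `get_data` call.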
ZDM – Watches (cont)
ZDM – Access Control List
•  ZooKeeper uses ACLs to control access to its znodes
•  ACLs are made up of pairs of (scheme:id, permissions)
•  Built-in ACL schemes:
•  world: has a single id, anyone
•  auth: doesn’t use any id; represents any authenticated user
•  digest: uses a username:password string as the ACL id
•  host: uses the client host name as the ACL id
•  ip: uses the client host IP as the ACL id
•  ACL Permissions:
•  CREATE
•  READ
•  WRITE
•  DELETE
•  ADMIN
•  E.g. (ip:192.168.0.0/16, READ)
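The example pair (ip:192.168.0.0/16, READ) can be evaluated with a short sketch. This is not how the server implements ACLs: `ip_acl_allows` is a hypothetical helper that handles only the ip scheme, and the permission bit values are an assumption chosen to resemble ZooKeeper's constants (only their distinctness matters here).

```python
import ipaddress
from typing import Tuple

# Assumed bit values, one bit per permission.
READ, WRITE, CREATE, DELETE, ADMIN = 1, 2, 4, 8, 16


def ip_acl_allows(acl: Tuple[str, int], client_ip: str, wanted: int) -> bool:
    """Check one (scheme:id, permissions) pair against a client address."""
    scheme_id, perms = acl
    scheme, _, ident = scheme_id.partition(":")
    if scheme != "ip":
        return False  # other schemes are out of scope for this sketch
    network = ipaddress.ip_network(ident, strict=False)
    in_network = ipaddress.ip_address(client_ip) in network
    # Every requested permission bit must be granted by the ACL.
    return in_network and (perms & wanted) == wanted


acl = ("ip:192.168.0.0/16", READ)
inside_can_read = ip_acl_allows(acl, "192.168.10.7", READ)
inside_can_write = ip_acl_allows(acl, "192.168.10.7", WRITE)
outside_can_read = ip_acl_allows(acl, "10.0.0.1", READ)
```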
Recipe #1: Queue
•  A distributed queue is a very common data structure in
distributed systems.
•  Producer: generates new items and puts them into the
queue
•  Consumer: removes items from the queue and processes them
•  Items are added and removed in FIFO order
Recipe #1: Queue (cont)
•  One ZNode is designated to hold a queue instance: the
queue-znode
•  All queue items are stored as znodes under the queue-znode
•  Producers add an item to the queue by creating a znode under
the queue-znode
•  Consumers retrieve items by getting and then deleting a
child of the queue-znode
QUEUE-ZNODE : “queue instance”
|-- QUEUE-0000000001 : “item1”
|-- QUEUE-0000000002 : “item2”
|-- QUEUE-0000000003 : “item3”
Recipe #1: Queue (cont)
•  Let /_QUEUE_ be the top-level znode, called the queue-znode
•  A producer puts an item into the queue by creating a
SEQUENCE_EPHEMERAL znode named “queue-N”, where
N is a monotonically increasing number
create(“queue-”, SEQUENCE_EPHEMERAL)
•  A consumer calls getChildren() on the queue-znode with
the watch flag set to true
M = getChildren(/_QUEUE_, true)
•  The client picks up items from the list M and keeps processing
until it reaches the end of the list, then checks again
•  The algorithm continues until getChildren() returns
an empty list
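The producer and consumer steps above can be sketched against a toy in-memory queue-znode. This is a single-process illustration, not code for a ZooKeeper ensemble: `ToyQueue` and its methods are hypothetical, watches and ephemeral behaviour are omitted, and the counter starts at 1 only to match the figure shown earlier.

```python
from typing import Dict, List, Optional


class ToyQueue:
    """In-memory stand-in for a queue-znode whose children are the items."""

    def __init__(self, prefix: str = "queue-") -> None:
        self._prefix = prefix
        self._items: Dict[str, bytes] = {}
        self._counter = 0

    def offer(self, data: bytes) -> str:
        # Producer step: create a child with a sequential, zero-padded name.
        self._counter += 1
        name = f"{self._prefix}{self._counter:010d}"
        self._items[name] = data
        return name

    def get_children(self) -> List[str]:
        return list(self._items)

    def take(self) -> Optional[bytes]:
        # Consumer step: list children, then try them in sequence order.
        for name in sorted(self.get_children()):
            data = self._items.pop(name, None)  # None: another consumer got it
            if data is not None:
                return data
        return None  # empty list of children, the queue is drained


queue = ToyQueue()
for payload in (b"item1", b"item2", b"item3"):
    queue.offer(payload)

drained = []
item = queue.take()
while item is not None:
    drained.append(item)
    item = queue.take()
```

Sorting the child names is what gives FIFO order here, since the zero-padded sequence numbers sort in creation order.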
Recipe #2: Group Membership
•  A persistent ZNode /membership represents the root of the
group in the ZooKeeper tree
•  Any client that joins the cluster creates an ephemeral znode
under /membership to mark its membership in the tree and sets
a watch on /membership
•  When another node joins or leaves the cluster, the client
gets a notification and becomes aware of the change in
group membership
Recipe #2: Group Membership (cont)
•  Let /_MEMBERSHIP_ be the root of the group membership
•  A client joining the group creates an ephemeral node under the root
•  All members of the group register for watch events on
/_MEMBERSHIP_, and so stay aware of the other members in the
group
L = getChildren(“/_MEMBERSHIP_”, true)
•  When a new client joins the group, all other members are notified
•  Similarly, when a client leaves due to failure or otherwise,
ZooKeeper automatically deletes its node, which triggers an event
•  Live members learn which node joined or left by looking at
the list of children L
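The last step, working out who joined or left from the list of children L, is a set difference between the previous and the current list. A minimal sketch, with `membership_change` and the member names as hypothetical examples:

```python
from typing import Iterable, List, Tuple


def membership_change(
    previous: Iterable[str], current: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (joined, left) by comparing two snapshots of the children list."""
    before, after = set(previous), set(current)
    return sorted(after - before), sorted(before - after)


# Snapshot taken before the watch fired, and the one re-read afterwards.
joined, left = membership_change(["node-a", "node-b"], ["node-b", "node-c"])
```

Each member would keep its last snapshot, re-read the children when notified, and compare the two lists this way.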
References
[1]. Apache ZooKeeper, http://zookeeper.apache.org
[2]. Introduction to Apache ZooKeeper,
http://www.slideshare.net/sauravhaloi
[3]. Saurav Haloi, Apache Zookeeper Essentials, 2015
Questions?
Thank You!
