Dok Talks #124 - Intro to Druid on Kubernetes

Apache Druid
on Kubernetes
Apache Druid Database Overview
Kubernetes & Helm Charts
Apache Druid’s Helm Chart
Overview
Scaling Up and Down
Auto-Scaling Ingestion
What it Doesn’t (yet) Do

What is Apache Druid?
It is a database that provides:
● Full scalability
● Batch and real-time data
● Ad-hoc statistical queries
● Low latency delivery

High Performance Real-time Analytics
Fully scalable, combining:
● log search, real-time ingest, flexible schema, text search
● timeseries, low latency ingest, time-based storage, time functions
● columnar, efficient storage, fast analytic queries, data distribution

Apache®, Apache Druid®, Druid®, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.
The Architecture

The Druid Architecture
Overview & High Availability
● Query Services: router, broker
● Data Services: middle-manager, historical
● Master Services: overlord, coordinator
● Deep Storage: HDFS; S3, GCP, Azure; local (test only)
[Diagram: several instances of each service are shown.]

Data Ingestion Processing
[Diagram, built up over four slides: the cluster (router, brokers, overlords, coordinators, middle-managers, historicals, deep storage); a REST API entry point; streaming and batch data sources; then additional middle-managers, each with its own streaming data source.]

Data Management Processing
[Diagram: router, brokers, overlords, coordinators, middle-managers, historicals, and deep storage, with streaming and batch data sources.]

Query Processing
[Diagram: router, brokers, overlords, coordinators, middle-managers, historicals, and deep storage, with streaming and batch data sources and a REST API entry point.]

A Little History first… in their own words
Kubernetes

Kubernetes Cluster
High Level Functions
Kubernetes Control Plane
➔ Acquire/manage Nodes and Storage
➔ Accept new object requests
➔ Schedule and manage containers on Nodes
➔ Instantiate containers for object deployment
➔ Monitor object state
➔ Apply application policies
◆ Restart policy
◆ Upgrade
◆ Fault tolerance
namespace my-dev
● Dev Node: containers for zookeeper 2.1.4, postgresql 8.6.4, and druid 0.22.1 coordinator, overlord, broker, router, historical, and middle-manager
namespace qa-test
● Master Nodes: containers for zookeeper 2.1.4, postgresql 8.6.4, and druid 0.22.1 coordinator and overlord
● Query Nodes: containers for druid 0.22.1 router and broker
● Data Nodes: containers for druid 0.22.1 historical
● Data / Realtime Nodes: containers for druid 0.22.1 middle-manager
Every node runs an Operating System and a Container Runtime that hosts its containers.

Why Apache Druid on Kubernetes
Kubernetes provides Orchestration at Scale
● High Availability
○ Recovery - Actively monitors pods and restarts them if appropriate
○ Anti-affinity - Ensures no single point of failure by placing services on separate nodes
○ Persistent storage enables fast Historical recovery
● Scalability
○ Manage individual components’ scale by changing one property
○ Autoscaling based on resource utilization
● Security
○ Encryption
○ Ingress control & network isolation
● Upgrades
○ Roll out changes automatically and with controlled disruption

Helm Charts
A Parameterization of Complex Deployments
In general, a Helm chart is a set of templates that describe Kubernetes objects that, in turn, provide services & applications.
Apache Druid® helm chart @ https://github.com/apache/druid/tree/master/helm/druid
- Dependencies - zookeeper, postgresql or mysql
- Templates for each microservice (historical, broker, middlemanager, etc.)
- Default values.yaml - these are the parameters for an installation.
Users override values to create different deployments with their own values.yaml:
historical:
  replicaCount: 10 # scale of historical data
middleManager:
  replicaCount: 6 # scale of real-time ingestion

Template Objects
Apache Druid Helm Chart
● Deployment - manages a set of stateless pods and keeps them running
● Ingress - outside access
● Service - Logical persistent network access, HTTP(S) port

Template Objects (middle-manager)
● StatefulSet - local files hold intermediate ingestion files, so a stateful pod helps jobs pick up where they left off.
● PodDisruptionBudget - determines how many pods can be offline at a time -> upgrades
● Service - Logical persistent access, HTTP(S) port
● Horizontal Pod Autoscaler - controls autoscaling
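As a rough illustration of the PodDisruptionBudget bullet above, the manifest below is a minimal sketch of the kind of object such a template might render. It is not copied from the Druid chart: the object name and the `app`/`component` labels are hypothetical, and `maxUnavailable: 1` is only an example value.

```yaml
# Sketch only: the name and labels are hypothetical, not taken from the Druid chart.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: druid-middle-manager
spec:
  # Allow at most one matching pod to be voluntarily evicted at a time,
  # for example while nodes are drained during a cluster upgrade.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: druid
      component: middle-manager
```

A budget like this limits voluntary disruptions only; it does not protect against node failures.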

Template Objects (historical)
● StatefulSet - persistent storage is extremely important at recovery time => very fast recovery.
● PodDisruptionBudget - determines how many pods can be offline at a time -> upgrades
● Service - Logical persistent access, HTTP(S) port
● No HPA - Autoscaling is not a good idea here.
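To make the link between a StatefulSet and persistent storage concrete, here is a minimal sketch of a StatefulSet with a volume claim template. It is an illustration under assumptions, not the chart's actual template: the labels, volume name, mount path, and storage size are hypothetical, while `druid-historical` and the 0.22.1 version follow names shown elsewhere in this deck.

```yaml
# Sketch only: labels, volume name, mount path, and size are hypothetical.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: druid-historical
spec:
  serviceName: druid-historical
  replicas: 2
  selector:
    matchLabels:
      component: historical
  template:
    metadata:
      labels:
        component: historical
    spec:
      containers:
        - name: historical
          image: apache/druid:0.22.1
          args: ["historical"]
          volumeMounts:
            - name: data
              mountPath: /opt/druid/var   # hypothetical location for local segment data
  volumeClaimTemplates:
    # One PersistentVolumeClaim is created per pod (data-druid-historical-0, ...)
    # and is re-attached when that pod is rescheduled.
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 4Gi
```

Because each pod gets its claim back after a restart, data already on the volume is still there when the pod comes up again, which is the fast-recovery property the slide refers to.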

A very simple example
How to Use - Druid Helm Chart
An example helps, so if we deploy vanilla:
> git clone https://github.com/apache/druid
> cd druid
> helm dependency update helm/druid
> helm install a-druid helm/druid -n a-space --create-namespace
> kubectl get pods -n a-space
NAME                                 READY   STATUS    RESTARTS   AGE
druid-broker-744c5f46b7-5l4r7        1/1     Running   0          8m12s
druid-coordinator-7c79f9c6c9-9wlqk   1/1     Running   0          8m12s
druid-historical-0                   1/1     Running   0          8m12s
druid-middle-manager-0               1/1     Running   0          8m12s
druid-postgresql-0                   1/1     Running   0          8m12s
druid-router-84d7cc6d87-w546r        1/1     Running   0          8m12s
druid-zookeeper-0                    1/1     Running   0          8m12s

Create a change file like values_2_historicals.yaml:
historical:
  replicaCount: 2 # scale of historical data
Best practice (requires the helm-diff plugin):
> helm diff upgrade -C 2 a-druid helm/druid -n a-space -f values_2_historicals.yaml
reading three way merge from env
default, druid-historical, StatefulSet (apps) has changed:
  ...
  spec:
    serviceName: druid-historical
-   replicas: 1
+   replicas: 2
    selector:
      matchLabels:

Apply the change:
> helm upgrade a-druid helm/druid -n a-space -f values_2_historicals.yaml
> kubectl get pods -n a-space
NAME                                 READY   STATUS    RESTARTS   AGE
druid-broker-744c5f46b7-5l4r7        1/1     Running   0          13m
druid-coordinator-7c79f9c6c9-9wlqk   1/1     Running   0          13m
druid-historical-0                   1/1     Running   0          13m
druid-historical-1                   0/1     Running   0          23s
druid-middle-manager-0               1/1     Running   0          13m
druid-postgresql-0                   1/1     Running   0          13m
druid-router-84d7cc6d87-w546r        1/1     Running   0          13m
druid-zookeeper-0                    1/1     Running   0          13m

Configuration with Helm Chart
Deep Storage: s3, local, hdfs
my_values.yaml:
configVars:
  druid_storage_type
(See also more properties @ https://druid.apache.org/docs/latest/configuration/index.html#deep-storage)
Metadata DB: postgresql, mysql
my_values.yaml:
configVars:
  druid_metadata_storage_type
  …connector_connectURI
  …connector_user
  …connector_password
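For illustration, a values file that points Druid at S3 deep storage and a PostgreSQL metadata store might look roughly like the sketch below. It assumes that `configVars` entries are passed to Druid as environment variables in which underscores stand for the dots of the corresponding Druid property (for example `druid.storage.type`). The bucket name, base key, host name, and credentials are placeholders, and S3 credentials are deliberately left out.

```yaml
# my_values.yaml (sketch): bucket, base key, host, and credentials are placeholders.
configVars:
  # Extensions needed for S3 deep storage and a PostgreSQL metadata store.
  druid_extensions_loadList: '["druid-s3-extensions", "postgresql-metadata-storage"]'
  # Deep storage on S3.
  druid_storage_type: s3
  druid_storage_bucket: my-druid-bucket
  druid_storage_baseKey: druid/segments
  # Metadata store on PostgreSQL.
  druid_metadata_storage_type: postgresql
  druid_metadata_storage_connector_connectURI: jdbc:postgresql://my-postgres:5432/druid
  druid_metadata_storage_connector_user: druid
  druid_metadata_storage_connector_password: change-me
```

Setting the extension load list in an override replaces the whole list, so any other extensions a deployment relies on would need to be repeated there.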

Deep Storage: s3
Metadata DB: postgresql
my_values.yaml:
<service>:
  resources:
    requests:
      cpu: 250m
      memory: 1Gi
    limits:
      cpu: 1000m
      memory: 2Gi
(A great resource to determine good values is @ https://druid.apache.org/docs/latest/operations/basic-cluster-tuning.html)

Data Ingestion and Helm Chart
[Diagram: the cluster with a REST API entry point.]
my_values.yaml:
router:
  replicaCount: 2
  ingress:
    enabled: true
my_values.yaml:
overlord:
  replicaCount: 2
coordinator:
  replicaCount: 2
[Diagram: the cluster with a REST API entry point.]
my_values.yaml:
middleManager:
  replicaCount: 2
  antiaffinity
  nodeSelector
  config:
    druid_indexer_runn…
    druid_indexer_fork…

Highly Available Data Ingestion
[Diagram: the cluster with streaming and batch data sources.]
kafka_ingestion.json:
{
  …
  "ioConfig": {
    "taskCount": 2,
    "replicas": 2,
    "taskDuration": "PT1H"
  }
}

[Diagram: additional middle-managers, each with its own streaming data source.]
my_values.yaml:
middleManager:
  replicaCount: 6
my_values.yaml:
middleManager:
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 6
    metrics: memory and cpu thresholds
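The "memory and cpu thresholds" above correspond to resource metrics on a Horizontal Pod Autoscaler. As a hedged sketch, a standalone HPA for the middle-manager StatefulSet could look like the manifest below. It is written against the generic Kubernetes `autoscaling/v2` API rather than copied from the chart; the 60% targets are example values, and the target name is inferred from the pod name `druid-middle-manager-0` in the earlier listing.

```yaml
# Sketch only: thresholds are example values, and the target name is inferred
# from the pod listing (druid-middle-manager-0).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: druid-middle-manager
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: druid-middle-manager
  minReplicas: 2
  maxReplicas: 6
  metrics:
    # Scale out when average CPU use exceeds 60% of the pods' CPU requests.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    # Scale out when average memory use exceeds 60% of the pods' memory requests.
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 60
```

Utilization targets are computed relative to the containers' resource requests, so requests need to be set on the pods for this kind of autoscaling to work.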

Historicals and Helm Chart
Data Management Processing
[Diagram: the cluster with deep storage and streaming and batch data sources.]
my_values.yaml:
historical:
  replicaCount: 2
  antiaffinity
  nodeSelector
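The `antiaffinity` and `nodeSelector` entries ultimately shape the pod spec. The fragment below is a sketch of what such settings can translate to in plain Kubernetes terms. It is not the chart's rendered output: the node label `node-role: druid-data` and the pod label `component: historical` are hypothetical.

```yaml
# Pod spec fragment (sketch): label keys and values are hypothetical.
nodeSelector:
  node-role: druid-data            # schedule only on nodes carrying this label
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            component: historical
        # No two pods matching the selector may be placed on the same node.
        topologyKey: kubernetes.io/hostname
```

With a hard rule like this, the number of schedulable replicas is capped by the number of matching nodes; a `preferredDuringSchedulingIgnoredDuringExecution` rule is the softer alternative.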

Query Processing & Helm Chart
[Diagram: the cluster with deep storage, streaming and batch data sources, and a REST API entry point.]
my_values.yaml:
broker:
  replicaCount: 3
  antiaffinity
  nodeSelector

Summarizing
● Apache Druid is a real-time OLAP database
● Kubernetes makes deploying and managing the database easier
○ Increased availability (monitored, auto-recovered, persistent)
○ Better RTO and RPO (recovery time and recovery point objectives)
○ Autoscaled components for ingestion and real-time query
● helm install makes it easy to deploy many different configurations:
○ Create and manage different values.yaml for each config:
■ dev-min-cluster.yaml
■ qa-ha-cluster.yaml
■ prod-ha-cluster-autoscaling.yaml
● Changes to the configs can be applied live
○ helm diff and helm upgrade
○ Not just scaling
○ Rolling upgrades too

What can you do to help
What doesn’t it do?
● Metrics configuration - enable metrics collection and display
○ Metrics are part of Apache Druid
○ Metric emitters have been contributed by the community
■ Influxdb-metrics-emitter, prometheus-emitter, kafka-emitter… and many more
○ Helm chart could use a set of options to turn on metrics and enable specific emitters.
● Multi-tier configurations are not yet enabled
○ Apache Druid supports multiple temperature levels, e.g.
■ High-speed SSDs vs. high-volume HDDs
○ Helm chart could use a dynamic tier configuration mechanism
● The Apache Druid Community:
○ You are invited!
○ Fork the repo at https://github.com/apache/druid
○ Make your changes
○ Submit a PR!

ASF Slack
#druid
Google Groups
https://groups.google.com/forum/#!forum/druid-user
Druid Meetups
https://www.meetup.com/pro/apache-druid/
Druid News & Info
@druidio #apachedruid @implydata
Druid Professionals Group
https://www.linkedin.com/groups/8791983/
Druid User Forum by Imply
https://www.druidforum.org
Imply Community Team
community@imply.io
&
Imply Training Program
https://learn.imply.io

Thank you
