Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
Clinton Gormley
@clintongormley
Scaling real-time search and analytics with elasticsearch
elasticsearch.org/guide
elasticsearch
• real-time
• distributed
• search
• analytics
how to use it?
how does it work?
step 1:
making text searchable
WHERE content LIKE '%darling%buds%'
slow & inflexible
Term    Doc 1    Doc 2    Doc 3
breathe
brings
buds
but
by
can
…
damasked
darling
date
day
deaf
death
declines
delight
sorted list of
unique terms
where they occur
inverted index
• term frequencies » relevance
• text length » doc weight
• term positions » word proximity
• char offsets » highlighting
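A minimal sketch of the idea, assuming whitespace tokenisation and three made-up sample documents (the function names are hypothetical, and this is not how Lucene lays out its index). It maps each term to the documents it occurs in, and keeps positions so that frequencies and word proximity can be read back out:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(position)
    # Keep the terms sorted, as in the term list on the slide.
    return dict(sorted(index.items()))

def phrase_match(index, first, second):
    """Return doc ids where `second` directly follows `first` (uses positions)."""
    hits = []
    for doc_id, positions in index.get(first, {}).items():
        following = set(index.get(second, {}).get(doc_id, []))
        if any(p + 1 in following for p in positions):
            hits.append(doc_id)
    return sorted(hits)

# Made-up sample documents, for illustration only.
docs = {
    1: "rough winds do shake the darling buds of may",
    2: "the darling of the day",
    3: "buds of may",
}
index = build_inverted_index(docs)

# Term frequency of "the" in doc 2 is the number of recorded positions.
the_frequency_in_doc_2 = len(index["the"][2])
# Word proximity: only doc 1 has "darling" immediately followed by "buds".
darling_buds_docs = phrase_match(index, "darling", "buds")
```

A lookup is then a dictionary access on the term rather than a scan over every document, which is the contrast with the LIKE query earlier in the deck.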
inverted index
not just for text
numbers, dates, bools, enums
geopoints, geoshapes, etc
step 2:
analytics
for search
map values → doc_ids
for analytics
map doc_ids → values
uninvert the index
cache values in memory
called “fielddata”
data access from RAM
very fast
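As a rough illustration of "uninverting", assuming a tiny hand-written inverted index for a single field (the data and function names are hypothetical, and real fielddata uses compact columnar structures rather than Python dicts):

```python
from collections import Counter, defaultdict

def uninvert(inverted):
    """Turn value -> doc_ids (search direction) into doc_id -> values (analytics direction)."""
    fielddata = defaultdict(list)
    for value, doc_ids in inverted.items():
        for doc_id in doc_ids:
            fielddata[doc_id].append(value)
    return dict(fielddata)

def count_values(fielddata, matching_doc_ids):
    """Count field values across only the docs that matched a query."""
    counts = Counter()
    for doc_id in matching_doc_ids:
        counts.update(fielddata.get(doc_id, []))
    return counts

# Hypothetical inverted index for one field.
inverted = {"blue": [1, 3], "green": [2], "red": [1, 2]}
fielddata = uninvert(inverted)          # built once, then kept in memory
counts = count_values(fielddata, [1, 2])
```

Once the doc_id → values map is in memory, each analytics request only has to walk the matching doc ids.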
on-the-fly analytics
in the context of
a user’s query
relevant analytics
for each user
calculate metrics
count, min, max, sum, avg,
percentiles, cardinality,
stddev, variance, sum of squares
grouped by
popular terms, significant terms,
ranges, dates, geolocation, etc
groups can
… contain subgroups
… which contain subgroups
etc
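One way such nested groups might be expressed is as a search request body with sub-aggregations. The sketch below only builds the body as a Python dict; the field names (`description`, `category`, `price`) are assumptions for illustration, and the exact options available depend on the Elasticsearch version in use:

```python
import json

# Hypothetical fields: description (text), category (keyword-like), price (number).
request_body = {
    "query": {"match": {"description": "darling buds"}},  # analytics run in the context of this query
    "size": 0,                                             # only the aggregation results are wanted
    "aggs": {
        "by_category": {                                   # group by popular terms
            "terms": {"field": "category"},
            "aggs": {
                "price_ranges": {                          # ... which contain subgroups
                    "range": {
                        "field": "price",
                        "ranges": [{"to": 50}, {"from": 50}],
                    },
                    "aggs": {
                        "avg_price": {"avg": {"field": "price"}}  # metric per subgroup
                    },
                }
            },
        }
    },
}

print(json.dumps(request_body, indent=2))
```

Each level of `aggs` nests inside a group, which mirrors the "groups contain subgroups" structure on the slide.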
step 3:
building the inverted index
inverted index
immutable
• cache friendly
• reads from RAM
• fielddata never changes
• compressible
• no locking
but, immutable…
step 4:
dynamic inverted index
[diagram: docs in the in-memory buffer are written to a new segment on each commit; segments listed in the commit point are searchable]
lucene commit
• write new segment
• write new commit point
• fsync ← expensive!
• clear buffer
• reopen index
step 5:
near real-time search
[diagram: each flush writes the in-memory buffer to a new searchable segment; a later commit records the segments in a commit point]
lucene flush
• write new segment
• clear buffer
• reopen index
• no fsync → lightweight
but…
data not safe until fsync’ed!
step 6:
don’t lose data
→ transaction log
[diagram: changes go to the in-memory buffer and the translog; each flush writes a new searchable segment; a commit writes the commit point]
elasticsearch “refresh”
• lucene “flush”
• makes changes searchable
• lightweight
elasticsearch “flush”
• lucene “commit”
• clears transaction log
• persists changes
• heavy
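The difference between the two operations can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions: the class and attribute names are hypothetical, nothing is written to disk, and a real commit involves an fsync that is not modelled here:

```python
class ToyShard:
    """Toy model of the write path; not how Lucene stores data on disk."""

    def __init__(self):
        self.buffer = []      # new docs, not yet searchable
        self.segments = []    # immutable, searchable segments
        self.committed = 0    # number of segments covered by the last commit point
        self.translog = []    # operations since the last flush

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)  # logged so it could be replayed after a crash

    def refresh(self):
        """Lightweight: turn the buffer into a new searchable segment, no fsync."""
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def flush(self):
        """Heavy: refresh, write a commit point (fsync in real life), clear the translog."""
        self.refresh()
        self.committed = len(self.segments)
        self.translog = []

    def search(self, word):
        # Only docs that have reached a segment are visible to search.
        return [doc for seg in self.segments for doc in seg if word in doc.split()]


shard = ToyShard()
shard.index("darling buds")
before_refresh = shard.search("buds")   # not searchable yet
shard.refresh()
after_refresh = shard.search("buds")    # searchable, but only protected by the translog
shard.flush()                           # now committed, translog cleared
```

In this model a crash after `refresh()` but before `flush()` would lose the uncommitted segment, which is why the translog still holds the operation at that point.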
refresh every second
near real-time search!
near real-time analytics!
but…
too many segments
• slow searches
• poor term frequencies
• poor compression
step 7:
reduce segments
merge process
• many small → one big
• removes deleted docs
• runs in background
• throttled
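A simplified sketch of what a merge does, assuming each segment is just a dict of doc_id → document and deletions are tracked as a separate set of ids (both assumptions made for illustration; background scheduling and throttling are not modelled):

```python
def merge_segments(segments, deleted_ids):
    """Merge several small segments into one new segment, dropping deleted docs.

    The input segments are left untouched, in keeping with immutability;
    the caller would swap in the merged segment and discard the old ones.
    """
    merged = {}
    for segment in segments:
        for doc_id, doc in segment.items():
            if doc_id not in deleted_ids:
                merged[doc_id] = doc
    return merged

# Hypothetical segments: three small ones, with doc 3 marked as deleted.
small = [{1: "a"}, {2: "b", 3: "c"}, {4: "d"}]
big = merge_segments(small, deleted_ids={3})
```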
but…
“Any wonder it broke down” by Brian Snelson is licensed under CC BY 2.0
sometimes you
need another truck
step 8:
scale out, not up
shard your data
transparent in elasticsearch
many segments → one shard
many shards → one index
“node”
running instance of elasticsearch
≈ one server
“shard”
bucket of data
lives on one node
physical worker unit
“index”
logical namespace
points to one or more shards
shard = hash(_id) % no_of_shards
PUT doc _id:1
hash(1) % 3 → shard_2
node_A: shard_0
node_B: shard_1
node_C: shard_2
GET doc _id:2
hash(2) % 3 → shard_0
node_A: shard_0
node_B: shard_1
node_C: shard_2
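The routing rule can be sketched as below. `zlib.crc32` is used only as a stand-in hash and is not the function Elasticsearch uses, so the shard numbers it produces will not match the ones on the slides:

```python
import zlib

def route(doc_id, number_of_shards):
    """Pick a shard for a document id: hash(_id) % no_of_shards."""
    # Stand-in hash for illustration; NOT the hash Elasticsearch uses.
    return zlib.crc32(str(doc_id).encode("utf-8")) % number_of_shards

# The same _id always lands on the same shard, so a GET can go straight to it.
put_shard = route(1, 3)
get_shard = route(1, 3)

# Changing the shard count changes where existing ids would be looked up.
moved = [i for i in range(100) if route(i, 3) != route(i, 4)]
```

Because the result depends on the shard count, documents routed under one count would not be found under another, which is consistent with the number of primary shards being fixed for an index.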
Search all docs
shard = hash(_id) % no_of_shards
node_A: shard_0
node_B: shard_1
node_C: shard_2
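A search has no single _id to route on, so it has to consult every shard. A minimal scatter-gather sketch, assuming each shard is a dict of doc_id → document and using a raw term count as a stand-in for a relevance score (all names here are hypothetical):

```python
import heapq

def search_all_shards(shards, term, size=10):
    """Query every shard, then merge the per-shard top hits by score."""
    per_shard_hits = []
    for shard in shards:
        # Each shard scores its own docs; (score, doc_id) pairs, score = term count.
        hits = [(doc["text"].split().count(term), doc_id) for doc_id, doc in shard.items()]
        hits = [hit for hit in hits if hit[0] > 0]
        per_shard_hits.append(heapq.nlargest(size, hits))
    # The coordinating step merges the partial results into one global top list.
    merged = heapq.nlargest(size, (hit for hits in per_shard_hits for hit in hits))
    return [doc_id for _score, doc_id in merged]

# Hypothetical shards with one document each.
shards = [
    {"a": {"text": "buds buds"}},
    {"b": {"text": "darling buds"}},
    {"c": {"text": "may"}},
]
results = search_all_shards(shards, "buds")
```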
step 9:
scaling elastically
start small
node_A: shard_0, shard_1, shard_2
add more nodes
node_A: shard_0, shard_1, shard_2
node_B: (empty)
node_C: (empty)
shards migrate
shard_1: node_A → node_B
shard_2: node_A → node_C
rebalanced
node_A: shard_0
node_B: shard_1
node_C: shard_2
add new index
node_A: shard_0, shard_1
node_B: shard_1, shard_2
node_C: shard_0, shard_2
but…
more hardware?
more hardware failure
at 3am on sunday…
node_A: shard_0, shard_1
node_B: shard_1, shard_2
node_C: shard_0, shard_2
boom!
step 10:
add redundancy
for every shard
…make a copy
“primary shard”
main shard
“replica shard(s)”
copy of primary shard
one node
node_A: P0, P1, P2
add a node
node_A: P0, P1, P2
node_B: R0, R1, R2
redundancy
add a node (node_C) → rebalanced
lose a node → replica → primary (R0 on node_B becomes P0) → allocate replicas → rebalanced
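The promotion step can be sketched as follows, assuming a cluster layout represented as node name → set of shard copies. The layout and function name are hypothetical, and re-allocating the missing replicas afterwards is not modelled:

```python
def promote_replicas(allocation, failed_node):
    """Drop a failed node and promote a surviving replica for each primary it held.

    `allocation` maps node name -> set of shard copies such as "P0" or "R0".
    Returns a new allocation; the input is not modified.
    """
    lost = allocation[failed_node]
    survivors = {node: set(copies) for node, copies in allocation.items() if node != failed_node}
    for copy in sorted(lost):
        if copy.startswith("P"):
            replica = "R" + copy[1:]
            for copies in survivors.values():
                if replica in copies:
                    copies.remove(replica)
                    copies.add(copy)  # same data, new role
                    break
    return survivors

# Hypothetical layout: every primary has its replica on a different node.
allocation = {
    "node_A": {"P1", "R2"},
    "node_B": {"R0", "P2"},
    "node_C": {"P0", "R1"},
}
after_failure = promote_replicas(allocation, "node_C")
```

Here losing node_C turns R0 on node_B into P0, while the lost R1 would still need to be allocated again on a surviving node.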
primary shard
• just a role
• receives doc changes first
• forwards new doc to replicas in parallel
• number of primaries fixed
replica shard
• copy of primary shard
• serves read/search requests
• number of replicas can be changed
• more replicas → more read throughput *if you have more hardware*
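In index settings these two numbers appear as `number_of_shards` and `number_of_replicas`. The sketch below only builds the request bodies as Python dicts; the values are arbitrary examples and the helper function is hypothetical:

```python
# Body for creating an index: the primary count is chosen up front.
create_index_body = {
    "settings": {
        "number_of_shards": 3,     # primaries: fixed once the index is created
        "number_of_replicas": 1,   # replica copies per primary
    }
}

# Body for a later settings update: only the replica count is changed.
update_settings_body = {"number_of_replicas": 2}

def total_shard_copies(primaries, replicas):
    """Each primary plus its replicas: the copies the cluster has to place on nodes."""
    return primaries * (1 + replicas)

copies_before = total_shard_copies(3, 1)
copies_after = total_shard_copies(3, 2)
```

Raising the replica count adds copies that can serve reads, but they only help throughput if there are nodes to host them.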
but…
who controls all this?
step 11:
the master node
“Master Yoda” by Gonzalo Martín is licensed under CC BY-SA 2.0
“node”
running instance of elasticsearch
node_A
“cluster”
one or more nodes
with same cluster name
working together
discover a cluster
with multicast/unicast
request routing
• send request to any node
• forwards to correct node
how?
every node knows where
every document is
cluster state
cluster level information:
indices, shards, nodes
can only be updated by
the master node
master node
• elected when cluster forms
• just a role
• re-elected if master fails
• only manages cluster level changes
• not doc-level get/put/search
the result?
distributed
real-time
search & analytics
which works in the same way
on your laptop…
…as on your
1,000 node cluster
who is using it?
• full text search
• highlighted search snippets
• search-as-you-type
• did-you-mean suggestions
• combine visitor logs with social network data
• real-time feedback to editors
• combines full text search with geolocation
• uses more-like-this to find related questions and answers
• search repositories, users, 

issues, pull requests
• search 130 billion lines of code
• track all alerts, events, logs
Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
• index and analyse 

5TB of log data every day
Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
thank you
@clintongormley
elasticsearch.org/downloads
elasticsearch.com/support
elasticsearch.com/jobs


Scaling real-time search and analytics with Elasticsearch

  • 1. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. Clinton Gormley @clintongormley Scaling real time search and analytics with elasticsearch
  • 3. elasticsearch.org/guide
  • 4. elasticsearch
  • 5. elasticsearch • real-time
  • 6. elasticsearch • real-time • distributed
  • 7. elasticsearch • real-time • distributed • search
  • 8. elasticsearch • real-time • distributed • search • analytics
  • 9. how to use it?
  • 10. how to use it?
  • 11. how does it work?
  • 12. step 1: making text searchable
  • 14. where content like “%darling%buds%”
  • 15. slow & inflexible
  • 17. Term Doc 1 Doc 2 Doc 3 breathe brings buds but by can … damasked darling date day deaf death declines delight sorted list of unique terms
  • 18. Term Doc 1 Doc 2 Doc 3 breathe brings buds but by can … damasked darling date day deaf death declines delight where they occur
  • 19. Term Doc 1 Doc 2 Doc 3 breathe brings buds but by can … damasked darling date day deaf death declines delight
  • 20. Term Doc 1 Doc 2 Doc 3 breathe brings buds but by can … damasked darling date day deaf death declines delight
  • 21. inverted index
  • 22. inverted index • term frequencies
  • 23. inverted index • term frequencies » relevance
  • 24. inverted index • term frequencies » relevance • text length
  • 25. inverted index • term frequencies » relevance • text length » doc weight
  • 26. inverted index • term frequencies » relevance • text length » doc weight • term positions
  • 27. inverted index • term frequencies » relevance • text length » doc weight • term positions » word proximity
  • 28. inverted index • term frequencies » relevance • text length » doc weight • term positions » word proximity • char offsets
  • 29. inverted index • term frequencies » relevance • text length » doc weight • term positions » word proximity • char offsets » highlighting
  • 30. inverted index not just for text
  • 31. inverted index numbers, dates, bools, enums, geopoints, geoshapes, etc
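
A minimal Python sketch of the inverted index described in step 1, under stated assumptions: the three sample documents, the regex tokenizer and the function names are hypothetical and are not Lucene's implementation. It keeps a sorted list of unique terms and, per document, the term positions and character offsets that the slides tie to word proximity and highlighting.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and yield (position, start_offset, end_offset, term)."""
    for position, match in enumerate(re.finditer(r"\w+", text.lower())):
        yield position, match.start(), match.end(), match.group()

def build_inverted_index(docs):
    """Map each term to {doc_id: [(position, start, end), ...]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, start, end, term in tokenize(text):
            index[term].setdefault(doc_id, []).append((position, start, end))
    # Keep the terms sorted, like the "sorted list of unique terms" slide.
    return dict(sorted(index.items()))

def search(index, *terms):
    """Return the ids of docs containing every term (a simple AND query)."""
    postings = [set(index.get(term, {})) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []

# Hypothetical sample documents, chosen only to exercise the sketch.
docs = {
    1: "Rough winds do shake the darling buds of May",
    2: "And summer's lease hath all too short a date",
    3: "The darling buds of May",
}
index = build_inverted_index(docs)
hits = search(index, "darling", "buds")     # docs containing both terms
term_frequency = len(index["darling"][1])   # occurrences of "darling" in doc 1
```

Looking up a term is a dictionary access rather than a scan of every document, which is the contrast the slides draw with `like “%darling%buds%”`.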
  • 32. step 2: analytics
  • 33. for search: map values → doc_ids
  • 34. for search: map values → doc_ids; for analytics: map doc_ids → values
  • 35. uninvert the index
  • 36. uninvert the index: cache values in memory, called “fielddata”
  • 37. uninvert the index: data access from RAM, very fast
  • 38. on-the-fly analytics in the context of a user’s query
  • 39. on-the-fly analytics: relevant analytics for each user
  • 40. calculate metrics: count, min, max, sum, avg, percentiles, cardinality, stddev, variance, sum of squares
  • 41. grouped by: popular terms, significant terms, ranges, dates, geolocation, etc
  • 42. grouped by: groups can … contain subgroups … which contain subgroups etc
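
The "uninvert the index" idea from step 2 can be sketched in a few lines of Python. This is an illustration only: the field names (`city`, `price`), the sample values and the helper functions are assumptions, and real fielddata is a far more compact structure than a dict.

```python
from collections import defaultdict

# Hypothetical inverted index for one field: value -> doc_ids (what search uses).
inverted_city = {"berlin": [1, 3], "london": [2, 4], "paris": [5]}
# Hypothetical numeric field, already in doc_id -> value form.
price = {1: 10.0, 2: 30.0, 3: 20.0, 4: 50.0, 5: 40.0}

def uninvert(inverted):
    """Flip value -> doc_ids into doc_id -> value (the "fielddata" direction)."""
    fielddata = {}
    for value, doc_ids in inverted.items():
        for doc_id in doc_ids:
            fielddata[doc_id] = value
    return fielddata

def aggregate(hits, group_field, metric_field):
    """Group the docs matched by a query and compute metrics per group."""
    groups = defaultdict(list)
    for doc_id in hits:
        groups[group_field[doc_id]].append(metric_field[doc_id])
    return {
        key: {"count": len(vals), "min": min(vals), "max": max(vals),
              "sum": sum(vals), "avg": sum(vals) / len(vals)}
        for key, vals in groups.items()
    }

city_fielddata = uninvert(inverted_city)
# Analytics run only over the docs matching the user's query, here docs 1 to 4.
stats = aggregate([1, 2, 3, 4], city_fielddata, price)
```

Because the metrics are computed over the hits of a query, each user gets analytics for their own result set, which is the "on-the-fly" point made above.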
  • 43. step 3: building the inverted index
  • 44. inverted index
  • 45. immutable
  • 46. immutable • cache friendly
  • 47. immutable • cache friendly • reads from RAM
  • 48. immutable • cache friendly • reads from RAM • fielddata never changes
  • 49. immutable • cache friendly • reads from RAM • fielddata never changes • compressible
  • 50. immutable • cache friendly • reads from RAM • fielddata never changes • compressible • no locking
  • 51. but, immutable…
  • 52. step 4: dynamic inverted index
  • 53. in-memory buffer commit segment
  • 54. commit point segment searchable
  • 55. commit point segment searchable commit
  • 56. commit point segment searchable
  • 57. commit point segment searchable commit
  • 58. commit point segment searchable
  • 59. lucene commit
  • 60. lucene commit • write new segment
  • 61. lucene commit • write new segment • write new commit point
  • 62. lucene commit • write new segment • write new commit point • fsync
  • 63. lucene commit • write new segment • write new commit point • fsync • clear buffer
  • 64. lucene commit • write new segment • write new commit point • fsync • clear buffer • reopen index
  • 65. lucene commit • write new segment • write new commit point • fsync ← expensive! • clear buffer • reopen index
  • 66. step 5: near real-time search
  • 67. in-memory buffer flush segment
  • 68. segment searchable
  • 69. segment searchable flush
  • 70. segment searchable
  • 71. segment searchable flush
  • 72. segment searchable
  • 73. segment searchable commit
  • 74. commit point segment searchable
  • 75. lucene flush
  • 76. lucene flush • write new segment • clear buffer • reopen index
  • 77. lucene flush • write new segment • clear buffer • reopen index • no fsync
  • 78. lucene flush • write new segment • clear buffer • reopen index • no fsync → lightweight
  • 79. but… data not safe until fsync’ed!
  • 80. step 6: don’t lose data
  • 81. step 6: don’t lose data → transaction log
  • 82. in-memory buffer flush segment translog
  • 83. segment searchable translog
  • 84. segment searchable flush translog
  • 85. segment searchable translog
  • 86. segment searchable translog commit
  • 87. segment searchable translog commit point
  • 88. elasticsearch “refresh” • lucene “flush” • makes changes searchable • lightweight
  • 89. elasticsearch “flush” • lucene “commit” • clears transaction log • persists changes • heavy
  • 90. refresh every second
  • 91. near real-time search!
  • 92. near real-time search! near real-time analytics!
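
Steps 4 to 6 can be tied together with a toy model. The sketch below is a deliberately simplified assumption, not Elasticsearch or Lucene code: the `ToyShard` class and its method names are hypothetical, segments are plain tuples, and the translog is assumed to survive a crash while the buffer and uncommitted segments do not.

```python
class ToyShard:
    """Toy model of the buffer / segment / translog cycle from the slides."""

    def __init__(self):
        self.buffer = []      # new docs, not yet searchable
        self.segments = []    # immutable, searchable segments
        self.translog = []    # changes not yet covered by a commit point
        self.committed = 0    # number of segments in the last commit point

    def index(self, doc):
        self.translog.append(doc)   # record the change so it can be replayed
        self.buffer.append(doc)

    def refresh(self):
        """Lightweight: write the buffer as a new segment, with no fsync."""
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer.clear()

    def flush(self):
        """Heavy: refresh, write a commit point (fsync in a real system), clear the translog."""
        self.refresh()
        self.committed = len(self.segments)
        self.translog.clear()

    def search(self, predicate):
        return [doc for seg in self.segments for doc in seg if predicate(doc)]

    def crash_and_recover(self):
        """Keep only committed segments, then replay the translog."""
        self.segments = self.segments[:self.committed]
        self.buffer = list(self.translog)
        self.refresh()

shard = ToyShard()
shard.index("doc-1")
before_refresh = shard.search(lambda d: True)   # the buffer is not searchable yet
shard.refresh()
after_refresh = shard.search(lambda d: True)
shard.flush()                                   # doc-1 is now in a commit point
shard.index("doc-2")
shard.refresh()                                 # searchable, but not committed
shard.crash_and_recover()
after_recovery = shard.search(lambda d: True)
```

In this model `refresh` corresponds to the cheap operation that makes changes searchable, and `flush` to the expensive one that persists them and clears the transaction log.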
  • 93. but…
  • 94. too many segments • slow searches • poor term frequencies • poor compression
  • 95. step 7: reduce segments
  • 96-107. [12 frames of the segment diagram, each labelled only “searchable”]
  • 108. merge process • many small → one big • removes deleted docs • runs in background • throttled
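
A rough sketch of the merge process in Python, assuming segments are tuples of `(doc_id, text)` pairs and deletions are tracked as a set of ids. Both representations and the function name are hypothetical; a real merge policy picks which segments to merge and throttles the work, which is not modelled here.

```python
def merge_segments(segments, deleted_ids):
    """Merge many small segments into one big one, dropping deleted docs.

    The input segments are not modified, so searches can keep using them
    until the merged segment is ready.
    """
    return tuple(
        (doc_id, text)
        for segment in segments
        for doc_id, text in segment
        if doc_id not in deleted_ids
    )

segments = [
    ((1, "first"),),
    ((2, "second"), (3, "third")),
    ((4, "fourth"),),
]
deleted_ids = {3}                  # doc 3 was marked as deleted
merged = merge_segments(segments, deleted_ids)
segments_after = [merged]          # swap the small segments for the big one
```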
  • 109. but…
  • 110. “Any wonder it broke down” by Brian Snelson is licensed under CC BY 2.0
  • 111. sometimes you need another truck
  • 112. step 8: scale out, not up
  • 113. shard your data
  • 114. shard your data transparent in elasticsearch
  • 115. many segments
  • 116. many segments → one shard
  • 117. many segments → one shard, many shards →
  • 118. many segments → one shard, many shards → one index
  • 119. “node” running instance of elasticsearch ≈ one server
  • 120. “shard” bucket of data lives on one node physical worker unit
  • 121. “index” logical namespace points to one or more shards
  • 122. “index” logical namespace points to one or more shards shard = hash(_id) % no_of_shards
  • 123. PUT doc _id:1 hash(1) % 3 shard_2 node_A shard_0 node_B shard_1 node_C shard_2
  • 124. GET doc _id:2 hash(2) % 3 shard_0 node_A shard_0 node_B shard_1 node_C shard_2
  • 125. Search all docs shard = hash(_id) % no_of_shards node_A shard_0 node_B shard_1 node_C shard_2
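
The routing formula `shard = hash(_id) % no_of_shards` can be tried out with a short Python sketch. CRC32 is used here only as a stand-in hash so that results are stable between runs; it is an assumption and not the hash function Elasticsearch uses, and the `put`, `get` and `search` helpers are hypothetical.

```python
import zlib

def shard_for(doc_id, number_of_shards):
    """shard = hash(_id) % no_of_shards, with CRC32 as a stand-in hash."""
    return zlib.crc32(str(doc_id).encode("utf-8")) % number_of_shards

NUMBER_OF_SHARDS = 3
shards = {n: {} for n in range(NUMBER_OF_SHARDS)}   # shard number -> docs

def put(doc_id, doc):
    shards[shard_for(doc_id, NUMBER_OF_SHARDS)][doc_id] = doc

def get(doc_id):
    # The same formula finds the single shard that can hold this _id.
    return shards[shard_for(doc_id, NUMBER_OF_SHARDS)].get(doc_id)

def search(predicate):
    # A search cannot know where matches live, so it asks every shard.
    return sorted(d for docs in shards.values() for d in docs.values() if predicate(d))

for i in range(1, 7):
    put(i, f"doc {i}")
```

A PUT or GET touches exactly one shard, while a search fans out to all of them. The modulo also means that changing `NUMBER_OF_SHARDS` would send existing ids to different shards.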
  • 126. step 9: scaling elastically
  • 127. start small node_A shard_0 shard_1 shard_2
  • 128. add more nodes node_A shard_0 shard_1 shard_2 node_B node_C
  • 129. shards migrate node_A shard_0 shard_1 shard_2 node_B shard_1 node_C shard_2
  • 130. rebalanced node_A shard_0 node_B shard_1 node_C shard_2
  • 131. add new index node_A shard_0 shard_1 node_B shard_1 shard_2 node_C shard_0 shard_2
  • 132. but…
  • 133. but… more hardware?
  • 134. but… more hardware? more hardware failure
  • 135. at 3am on sunday… node_A shard_0 shard_1 node_B shard_1 shard_2 node_C shard_0 shard_2
  • 136. boom! node_A shard_0 shard_1 node_B shard_1 shard_2 node_C shard_0 shard_2
  • 137. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. step 10: add redundancy
  • 138. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. for every shard …make a copy
  • 139. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. “primary shard” main shard
  • 140. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. “replica shard(s)” copy of primary shard
  • 141. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. one node node_A P0 P1 P2
  • 142. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. add a node node_A P0 P1 P2 node_B
  • 143. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. add a node node_A P0 P1 P2 node_B R0 R1 R2
  • 144. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. redundancy node_A P0 P1 P2 node_B R0 R1 R2
  • 145. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. add a node node_A P0 P1 P2 node_B R0 R1 R2 node_C
  • 146. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. add a node node_A P0 P1 P2 node_B R0 R1 R2 R1 node_C P0
  • 147. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. rebalanced node_A P0 P1 P2 node_B R0 R1 R2 node_C P0
  • 148. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. lose a node node_A P0 P1 P2 node_B R0 R2 R1 node_C P0
  • 149. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. replica primary node_A P0 P1 P2 node_B R0 R2
  • 150. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. replica primary node_A P0 P1 P2 node_B P0 R2
  • 151. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. allocate replicas node_A P0 P1 P2 node_B P0 R2 R0 R1
  • 152. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. rebalanced node_A P0 P1 P2 node_B P0 R2 R0 R1
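
The "lose a node" slides above can be sketched as a small simulation. This is a toy model, not Elasticsearch code: the `Shard` tuple, the `lose_node` helper, and the rule "promote the first surviving replica found" are assumptions made for illustration. It covers only the promotion step and does not allocate new replicas afterwards.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: ("P", 0) is primary shard 0, ("R", 0) is a replica of shard 0.
Shard = Tuple[str, int]


def lose_node(cluster: Dict[str, List[Shard]], dead: str) -> Dict[str, List[Shard]]:
    """Return a new layout after node `dead` disappears.

    For every primary that lived on the dead node, the first surviving
    replica of the same shard is promoted to primary.
    """
    # Copy the surviving nodes so the input layout is left untouched.
    survivors = {name: list(shards) for name, shards in cluster.items() if name != dead}
    for role, shard_id in cluster[dead]:
        if role != "P":
            continue  # a lost replica needs no promotion
        for shards in survivors.values():
            if ("R", shard_id) in shards:
                shards[shards.index(("R", shard_id))] = ("P", shard_id)
                break
    return survivors


# Two-node layout from the "redundancy" slide: primaries on node_A, replicas on node_B.
layout = {
    "node_A": [("P", 0), ("P", 1), ("P", 2)],
    "node_B": [("R", 0), ("R", 1), ("R", 2)],
}
after_failure = lose_node(layout, "node_A")
```

In this sketch, losing `node_A` leaves `node_B` holding three primaries, so no data is lost. A replacement node would be needed before replicas could be allocated again.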
  • 153. primary shard • just a role • receives doc changes first • forwards new doc to replicas in parallel • number of primaries fixed
  • 154. replica shard • copy of primary shard • serves read/search requests • number of replicas can be changed • more replicas → more read throughput *if you have more hardware*
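
The difference between "number of primaries fixed" and "number of replicas can be changed" shows up in the index settings. The sketch below only builds the HTTP requests and sends nothing. The helper names and the index name `my_index` are hypothetical; the `number_of_shards` and `number_of_replicas` settings and the `_settings` endpoint are the ones Elasticsearch exposes.

```python
import json


def create_index_request(index: str, primaries: int, replicas: int):
    """Build the request that creates an index; the primary count is set here."""
    body = {"settings": {"number_of_shards": primaries, "number_of_replicas": replicas}}
    return "PUT", f"/{index}", json.dumps(body)


def set_replicas_request(index: str, replicas: int):
    """Build the request that changes the replica count of an existing index."""
    body = {"index": {"number_of_replicas": replicas}}
    return "PUT", f"/{index}/_settings", json.dumps(body)


def total_shard_copies(primaries: int, replicas: int) -> int:
    """Each primary has `replicas` copies, so every shard exists 1 + replicas times."""
    return primaries * (1 + replicas)


create = create_index_request("my_index", primaries=3, replicas=1)
scale_reads = set_replicas_request("my_index", replicas=2)
```

With 3 primaries and 1 replica there are 6 shard copies to place. Raising the replica count to 2 gives 9, which only helps read throughput if there are enough nodes to host them.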
  • 155. but…
  • 156. but… who controls all this?
  • 157. step 11: the master node
  • 158. “Master Yoda” by Gonzalo Martín is licensed under CC BY-SA 2.0
  • 159. “node” running instance of elasticsearch
  • 160. “node” running instance of elasticsearch node_A
  • 161. “cluster” one or more nodes with same cluster name working together
  • 162. “cluster” node_A node_B
  • 163. node_A node_B node_C discover a cluster with multicast/unicast
  • 164. node_A node_B node_C discover a cluster with multicast/unicast
  • 165. node_A node_B node_C request routing send request to any node
  • 166. node_A node_B node_C request routing forwards to correct node
  • 167. how?
  • 168. how? every node knows where every document is
  • 169. cluster state every node knows where every document is
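
Request routing can be sketched as follows. Elasticsearch picks a shard as `hash(routing) % number_of_primary_shards`, where the routing value defaults to the document ID. In this sketch `zlib.crc32` is a stand-in for the real hash function, and `ROUTING_TABLE`, `shard_for`, and `route` are hypothetical names. The table reuses the three-node layout from the earlier slides.

```python
import zlib

NUM_PRIMARIES = 3

# Toy routing table (part of the cluster state): shard id -> nodes holding a copy.
ROUTING_TABLE = {
    0: ["node_A", "node_C"],
    1: ["node_A", "node_B"],
    2: ["node_B", "node_C"],
}


def shard_for(doc_id: str, num_primaries: int = NUM_PRIMARIES) -> int:
    """Map a document ID to a shard number (crc32 stands in for the real hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primaries


def route(doc_id: str, receiving_node: str):
    """Return (shard, node) that should handle a request received by any node."""
    shard = shard_for(doc_id)
    holders = ROUTING_TABLE[shard]
    # Serve locally if the receiving node holds the shard, otherwise forward.
    target = receiving_node if receiving_node in holders else holders[0]
    return shard, target


shard, target = route("user-42", receiving_node="node_A")
```

Because the shard number depends on the primary count, changing that count would send existing document IDs to different shards. This is why the number of primaries is fixed once the index exists.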
  • 170. cluster state cluster level information
  • 171. cluster state cluster level information indices shards nodes
  • 172. cluster state can only be updated by the master node
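
A minimal sketch of the rule that only the master may update the cluster state. The `ClusterState` class, its fields, and `create_index` are simplified assumptions for illustration and do not mirror the actual Elasticsearch data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ClusterState:
    """Toy cluster state: which nodes exist and which indices they host."""
    master: str
    version: int = 0
    nodes: Set[str] = field(default_factory=set)
    indices: Dict[str, int] = field(default_factory=dict)  # index name -> primary count


def create_index(state: ClusterState, applied_by: str, index: str, primaries: int) -> None:
    """Apply a cluster-level change; rejected unless the master applies it."""
    if applied_by != state.master:
        raise PermissionError(f"{applied_by} is not the master ({state.master})")
    state.indices[index] = primaries
    state.version += 1  # every accepted change produces a new state version


state = ClusterState(master="node_A", nodes={"node_A", "node_B", "node_C"})
create_index(state, applied_by="node_A", index="my_index", primaries=3)
```

In this model a non-master node that wants a cluster-level change would have to hand it to the master rather than apply it itself.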
  • 173. node_A master node elected when cluster forms
  • 174. node_A node_B master node
  • 175. node_A node_B master node node_C
  • 176. node_A node_B node_C master node
  • 177. node_A node_B node_C master node just a role
  • 178. node_A node_B node_C master node re-elected if master fails
  • 179. node_B node_C master node node_A re-elected if master fails
  • 180. node_B node_C master node re-elected if master fails
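
The election slides can be sketched with a toy rule: the live node with the lowest name becomes master. That rule, the `Cluster` class, and `elect_master` are assumptions chosen to keep the example deterministic; they are not a description of the actual election algorithm.

```python
from typing import Iterable, Set


def elect_master(live_nodes: Iterable[str]) -> str:
    """Toy election: pick the live node with the lowest name."""
    candidates = sorted(live_nodes)
    if not candidates:
        raise ValueError("no live nodes to elect from")
    return candidates[0]


class Cluster:
    def __init__(self, nodes: Iterable[str]):
        self.nodes: Set[str] = set(nodes)
        self.master = elect_master(self.nodes)  # elected when the cluster forms

    def join(self, node: str) -> None:
        """A joining node does not trigger an election; the master keeps its role."""
        self.nodes.add(node)

    def fail(self, node: str) -> None:
        """Remove a node and re-elect only if the failed node was the master."""
        self.nodes.discard(node)
        if node == self.master:
            self.master = elect_master(self.nodes)


cluster = Cluster(["node_A"])
cluster.join("node_B")
cluster.join("node_C")
cluster.fail("node_A")  # master fails, one of the survivors takes over the role
```

Being master is "just a role": any surviving node can take it over, and no data moves as a result of the election itself.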
  • 181. master node only manages cluster level changes
  • 182. master node not doc-level get/put/search
  • 183. the result?
  • 184. distributed real-time search & analytics
  • 185. which works in the same way on your laptop…
  • 186. …as on your 1,000-node cluster
  • 187. who is using it?
  • 188. • full text search • highlighted search snippets • search-as-you-type • did-you-mean suggestions
  • 189. • combine visitor logs with social network data • real-time feedback to editors
  • 190. • combines full text search with geolocation • uses more-like-this to find related questions and answers
  • 191. • search repositories, users, issues, pull requests • search 130 billion lines of code • track all alerts, events, logs
  • 192. • index and analyse 5TB of log data every day
  • 193. thank you @clintongormley elasticsearch.org/downloads elasticsearch.com/support elasticsearch.com/jobs