Is your Elasticsearch Cluster Production Ready?
Itamar Syn-Hershko
http://code972.com | @synhershko
http://BigDataBoutique.co.il
Me?
http://bdbq.co.il
What does it take?
• Cluster deployed using best practices
• Thorough monitoring
• Inspect. Fix. Repeat.
• Good capacity planning
• Memory management
• Indexing and sharding strategy
• Security
Cluster Topology
• Master-eligible nodes (3)
• Data nodes (sizing by data)
• Client nodes, aka coordinating nodes (scalable, sizing by traffic)
Deployments
• Prefer immutable images & scripted deployments
• For AWS see https://github.com/synhershko/elasticsearch-cloud-deploy/
• GCP coming soon
Backups
• Very efficient
• Very important
• Several storage types supported
  • Shared file system
  • HDFS
  • Azure / GCP / AWS repositories via plugins
What to monitor (on the cluster, per host)?
• CPU load
• Memory utilization
• Heap utilization
• GC time
• Disk utilization
• Disk IOPS
• Merges
• Deleted docs
• Requests per sec (indexing, search)
• Load average < number of cores
• Network in / out
• Thread pool rejections
• Number of nodes
• Cache sizes
• Cache evictions
• Cluster state / health
• Number of shards per type
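One of the checks above, "load average < number of cores", is simple enough to sketch per host. This is a minimal illustration, not a monitoring agent: the function names are hypothetical, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def load_is_healthy(load_average: float, cores: int) -> bool:
    """Return True while the load average stays below the number of cores."""
    return load_average < cores


def current_host_is_healthy() -> bool:
    """Apply the check to this host, using the 1-minute load average (Unix only)."""
    one_minute, _five, _fifteen = os.getloadavg()
    return load_is_healthy(one_minute, os.cpu_count() or 1)
```

In practice the same comparison would run inside whatever polling script already ships the other metrics.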
X-Pack monitoring (aka Marvel)
Grafana dashboards
• More fine-grained, cluster-wide view
• Provided with metrics polling script (Python)
https://github.com/synhershko/elasticsearch-grafana-monitoring
Monitoring Destination
• To the same cluster
• To a different cluster (recommended)
• External systems (e.g. Graphite), only if already in use in the organization
• X-Pack subscribers can now send metrics to Elastic Cloud
Typical garbage collection sawtooth
CPU monitoring
Correlating metrics
• Shards on the same node have issues?
• During merges?
• CPU and GC
• HTTP traffic and indexing or search operations
Threadpools & Throughput
Boosting slow operations
• Search or indexing heavy?
• Measure operations from the application side too!
• Slow searches
  • Queries need optimization
  • Scoring (not using filters)
  • Numeric ranges pre-5.x
  • Scripts
• Slow indexing
  • Sharding strategy
  • Use bulk indexing (optimize for 10-15MB of data per request, regardless of the number of documents / operations)
• Slow analyzers affect both! (e.g. n-grams)
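The bulk-sizing advice (batch by payload size, not by document count) can be sketched as a small generator. This is an illustrative sketch only: `bulk_bodies` and `TARGET_BULK_BYTES` are hypothetical names, the action line omits `_type` (older versions need it in the action or the URL), and sending the body to the `_bulk` endpoint is left to whatever client is in use.

```python
import json
from typing import Dict, Iterable, Iterator, List

TARGET_BULK_BYTES = 10 * 1024 * 1024  # lower end of the 10-15MB range


def bulk_bodies(index: str, docs: Iterable[Dict],
                max_bytes: int = TARGET_BULK_BYTES) -> Iterator[str]:
    """Yield newline-delimited bulk bodies, each at most max_bytes when possible."""
    action = json.dumps({"index": {"_index": index}})
    action_size = len(action.encode("utf-8"))
    lines: List[str] = []
    size = 0
    for doc in docs:
        source = json.dumps(doc)
        # Each document adds an action line and a source line, both newline-terminated.
        entry_size = action_size + len(source.encode("utf-8")) + 2
        if lines and size + entry_size > max_bytes:
            yield "\n".join(lines) + "\n"
            lines, size = [], 0
        lines.extend([action, source])
        size += entry_size
    if lines:
        yield "\n".join(lines) + "\n"
```

A single document larger than `max_bytes` still gets its own body, so the limit is a target rather than a hard cap.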
Don’t use NGrams!
• Being used for “contains” search
  • You ain’t gonna need it, use the WordDelimiter Token Filter instead
• Useful for fuzzy search / auto-correction
  • Best used via Elasticsearch’s Suggesters
• Useful for languages without spaces, or with compound words
  • min_gram, max_gram
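To get a feel for why n-grams are expensive, the sketch below counts how many tokens a single term produces for a given `min_gram` / `max_gram`. It is a plain-Python approximation for illustration; the token order is not meant to match what Elasticsearch's n-gram tokenizer emits.

```python
from typing import List


def ngrams(token: str, min_gram: int, max_gram: int) -> List[str]:
    """Return every substring of token whose length is in [min_gram, max_gram]."""
    return [
        token[start:start + length]
        for length in range(min_gram, max_gram + 1)
        for start in range(len(token) - length + 1)
    ]


# One six-letter word already becomes nine index terms with 2..3 grams.
print(len(ngrams("search", 2, 3)))  # 9
```

Widening the gap between `min_gram` and `max_gram` grows the term count quickly, which costs both indexing time and disk space.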
Caches
• Query cache
• Request cache
• Measure eviction rate & cache usage
Memory Allocation
• ES_HEAP_SIZE
  • DocValues used?
  • Fielddata usage
  • Query cache (for queries in filter context)
  • Request cache (for aggregations and count queries)
  • Never over 32GB!
• Default cache sizes do not always fit the usage
  • Set appropriate static configs in elasticsearch.yml
• At least 50% of memory to the file-system cache
  • Usually more
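The two heap rules above (leave at least half of the memory to the file-system cache, never go over 32GB) combine into a one-line calculation. A minimal sketch: the function name is hypothetical, and the 31GB default cap is an assumption chosen to stay safely under the 32GB limit, not a figure from the slides.

```python
def recommended_heap_gb(total_ram_gb: float, heap_cap_gb: float = 31.0) -> float:
    """Give the heap at most half of the RAM, capped below the 32GB limit."""
    return min(total_ram_gb / 2, heap_cap_gb)


print(recommended_heap_gb(16))   # 8.0
print(recommended_heap_gb(128))  # 31.0
```

Since the slide says the file-system cache usually deserves more than 50%, the result is an upper bound rather than a target.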
Server Sizing
• Master nodes
  • 1-2 cores, 2-4 GB memory, 50% ES_HEAP_SIZE
• Data nodes
  • > 4 cores, measure and preserve disk/mem ratio (can start with 1/24)
  • ES_HEAP_SIZE as per previous slide
• Client nodes
  • CPU and network heavy, 4GB memory should be enough for most use cases
Index Management Patterns
• A monolith index
  • Search façade on top of your data
  • Record linkage
  • Anomaly detection
• Rolling indexes (time-based events)
  • Centralized logging
  • Auditing
  • IoT
logs-2016.11.19 logs-2016.11.20 logs-2016.11.21 logs-2016.11.22 logs-2016.11.23
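The rolling-index naming shown above (one index per day, `logs-YYYY.MM.DD`) is easy to generate on the client side. A small sketch with hypothetical helper names, assuming the daily granularity and dotted date format from the example:

```python
from datetime import date, timedelta
from typing import List


def daily_index_name(prefix: str, day: date) -> str:
    """Name of the index that holds events for the given day."""
    return f"{prefix}-{day:%Y.%m.%d}"


def recent_indices(prefix: str, today: date, days: int) -> List[str]:
    """Index names for the last `days` days, oldest first, ending with today."""
    return [
        daily_index_name(prefix, today - timedelta(days=offset))
        for offset in range(days - 1, -1, -1)
    ]


print(recent_indices("logs", date(2016, 11, 23), 5))
```

Searching only the indices for the relevant days, and dropping whole indices when they expire, is what makes this pattern cheap for time-based data.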
Optimal shard size
• A few million documents per shard, for search performance
  • A bit more if only doing aggregations
• 5-8GB on disk max, for startup times and network reallocation
• doc_values are enabled by default; turn them off for fields not used in aggregations to save space
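Given the 5-8GB per-shard ceiling, a first estimate of the primary shard count follows from the expected index size. This is a back-of-the-envelope sketch: the function name is hypothetical, it uses 8GB as the default ceiling, and it ignores the document-count guideline.

```python
import math


def primary_shard_count(expected_index_gb: float, max_shard_gb: float = 8.0) -> int:
    """Smallest number of primary shards that keeps each shard under max_shard_gb."""
    return max(1, math.ceil(expected_index_gb / max_shard_gb))


print(primary_shard_count(40))  # 5
```

Because resharding is not supported, the estimate has to be made before the index is created, from measured or projected data volume.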
Sharding
• Index shards
  • Resharding / auto-sharding not supported
• Index-level sharding
  • Avoid using types (deprecated as of 6.x)
  • Multi-tenancy
  • Rollover API (5.x and later)
• Cluster level
  • Cluster per project
  • Cross-cluster search capability
Multitenancy
• Silos: every tenant gets its own index
  • Index sizes vary
  • Potentially wasting resources
• Pool: all tenants are in one big index
  • Sharding isn’t dynamic
  • Effects on tf/idf, aggregations, throughput
• Hybrid: big tenants in their own index, pool(s) for small ones
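The hybrid layout boils down to a routing decision made by the application. A minimal sketch, assuming the application keeps its own list of "big" tenants; the index naming scheme and the function name are hypothetical.

```python
from typing import Set


def index_for_tenant(tenant_id: str, big_tenants: Set[str],
                     pool_index: str = "tenants-pool") -> str:
    """Big tenants get a dedicated index; everyone else shares the pool index."""
    if tenant_id in big_tenants:
        return f"tenant-{tenant_id}"
    return pool_index


big = {"acme"}
print(index_for_tenant("acme", big))   # tenant-acme
print(index_for_tenant("small", big))  # tenants-pool
```

Queries against the pool index still need a filter on the tenant identifier, since documents from many tenants live side by side there.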
Use Explicit Mapping (aka Avoid Schemaless)
• In one of two ways:
  • Disable dynamic mapping in settings (index.mapper.dynamic: false). Indexing that would need a dynamic mapping is then refused.
  • Create a catch-all dynamic template with an enabled: false mapping
• Why?
  • Avoids creating hundreds of fields by mistake
  • Saves effort on indexing and disk space
  • Defaults are bad anyhow, don’t rely on them
• Prefer using index templates (especially for rolling indices)
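As an illustration of an explicit mapping, the sketch below builds a mapping body that lists every field and rejects anything else. Note that it uses the `dynamic: "strict"` mapping parameter, a different mechanism from the two options named above, and the field names are made up. Where the body goes (under a type name or not, inside an index template or a create-index request) depends on the Elasticsearch version.

```python
import json
from typing import Dict


def strict_mapping(field_types: Dict[str, str]) -> dict:
    """Build a mapping body that declares every field explicitly."""
    return {
        # "strict" makes Elasticsearch reject documents containing unmapped fields.
        "dynamic": "strict",
        "properties": {name: {"type": es_type} for name, es_type in field_types.items()},
    }


mapping = strict_mapping({"timestamp": "date", "level": "keyword", "message": "text"})
print(json.dumps(mapping, indent=2))
```

Putting a body like this into an index template means every new rolling index picks it up automatically.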
Re-balancing is your enemy
• Lock down shard rebalancing
  • cluster.routing.rebalance.enable
    • none
  • cluster.routing.allocation.enable
    • primaries
    • new_primaries
    • none
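Both settings above are dynamic cluster settings, so they can be changed through the cluster settings API. The sketch below only builds the request body (for `PUT _cluster/settings`) and validates the value; the helper name is hypothetical and sending the request is left to the client in use.

```python
ALLOWED_VALUES = {
    "cluster.routing.rebalance.enable": {"all", "primaries", "replicas", "none"},
    "cluster.routing.allocation.enable": {"all", "primaries", "new_primaries", "none"},
}


def routing_settings_body(setting: str, value: str, persistent: bool = False) -> dict:
    """Build a cluster-settings body for one of the routing switches."""
    if value not in ALLOWED_VALUES.get(setting, set()):
        raise ValueError(f"unsupported value {value!r} for {setting!r}")
    scope = "persistent" if persistent else "transient"
    return {scope: {setting: value}}


print(routing_settings_body("cluster.routing.rebalance.enable", "none"))
```

Setting the value back to `all` with the same helper re-enables normal behavior once maintenance is done.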
More safe configs
• action.disable_delete_all_indices: true
• action.auto_create_index: false
Deep paging (don’t!)
• Don’t page deep with from / size
• search_after (5.x and later)
• Scroll and sliced scroll (5.x and later)
  • Not for normal operation
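With `search_after`, each page is requested by passing the sort values of the last hit from the previous page instead of an ever-growing offset. The sketch below only builds search request bodies; the function name and the `timestamp` / `event_id` fields are assumptions, and the sort needs a unique tiebreaker field for paging to be stable.

```python
from typing import Any, Dict, List, Optional


def page_body(query: Dict[str, Any], sort: List[Dict[str, str]], size: int,
              last_sort_values: Optional[List[Any]] = None) -> Dict[str, Any]:
    """Build a search body; pass the last hit's sort values to get the next page."""
    body: Dict[str, Any] = {"size": size, "query": query, "sort": sort}
    if last_sort_values is not None:
        body["search_after"] = list(last_sort_values)
    return body


sort = [{"timestamp": "desc"}, {"event_id": "asc"}]  # event_id acts as the tiebreaker
first_page = page_body({"match_all": {}}, sort, size=50)
# The values would normally come from the "sort" array of the last hit returned.
next_page = page_body({"match_all": {}}, sort, size=50,
                      last_sort_values=[1479859200000, "evt-0042"])
```

The first request carries no `search_after` key; every later request reuses the same query and sort and only swaps in the new sort values.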
Deletions
• Deletions have an overhead
  • Slow searches
  • Segmentation
  • More work on segment merging
  • Non-exact tf/idf
• Every document update is a deletion
• No need to avoid it completely, just design accordingly
Geographic Distribution
• Never with the same cluster!
• Cross-cluster search (formerly Tribe Node)
  • For geographic sharding
  • Different indexes in different regions
• xDCR for HA / DR
  • Can be solved by infra: replicating queues (Kafka), DBs
  • Solution coming in X-Pack
Your ingestion architecture?
• Favor external ingestion, relieve Elasticsearch of that responsibility
• Upgrade Logstash to 5.x
• Consider using Filebeat instead of Logstash for log tailing
• Prefer Logstash machines over ingest nodes
• Use queues (Kafka, Redis) to protect against surges
Security
Protecting your cluster
• Don’t bind to a public IP
  • Use only private IPs / DNS names, preferably in subnets (e.g. AWS VPC)
  • network.host in elasticsearch.yml
• Proxy all client requests to ES
  • Disable HTTP where not needed
  • Also, don’t use default ports
• Secure publicly available client nodes
  • Access via VPN only
  • At the very least SSL + authentication if VPN is not an option
• Disable dynamic scripting (pre-5.x)
Securing Indexes and Documents
• Heavy Kibana user?
  • Authentication and authorization
  • Index, document and field level security
  • Requires X-Pack Security
• Application level authentication and authorization
  • Application filtering of content (fields, documents)
  • Index level (e.g. index per tenant)
  • Document level (using permissions)
• Inter-node comms, encryption at rest (X-Pack only)
Upcoming in ES land
• Elasticsearch 6
• Machine Learning
  • Anomaly detection on time series data
• Enterprise Cloud
  • Elastic Cloud deployed on-premise
• Any plugin authors in the crowd?
Elasticsearch Training
Elasticsearch for Developers & Maintaining Elasticsearch in Production
• September (10,11,17/9)
• November (12,13,16/11)
http://bdbq.co.il/courses
Consultancy and Development services
http://bdbq.co.il/services/elasticsearch
Questions?
@synhershko on social (Twitter, GitHub, …)
Blog at http://code972.com
Training and consultancy at http://BigDataBoutique.co.il
