Igor Motov
igor@motovs.org
Twitter: @imotov
GitHub: imotov
Boston elasticsearch meetup October 2012
Sonian Inc.
•Cloud-based email archiving
•Founded in 2007
•Headquarters: Newton, MA
Small team of about 15 developers, distributed from Campinas, Brazil to Vancouver, Canada
Using elasticsearch since June 2010, v0.8.0
We have about 6 billion records indexed in elasticsearch
100,000           Netflix DVD titles
3,000,000         Pages in en.wikipedia.org
22,000,000        Books in Library of Congress catalog
150,000,000       LinkedIn profiles
3,000,000,000     Estimated bing.com index size
6,000,000,000     Sonian Inc. index size
50,000,000,000    Estimated google.com index size
Infrastructure
http://www.sonian.com/awssonian-technical-diagram/
Ingestion (safe): Clojure
Search Engine: elasticsearch
Web App: Ruby on Rails

Deployment: Chef
Monitoring: Sensu
10 clusters
     6 AWS Regions
2-17 nodes in each cluster
Custom version of elasticsearch, based on 0.19.9, with several plugins
jetty plugin

• jetty-based http transport
• SSL support
• Authentication
• Request logging (json, plain)
Request logs are also indexed in elasticsearch
Open source
https://github.com/sonian/elasticsearch-jetty
Zookeeper plugin

 Zookeeper-based discovery
Replacement for zen discovery

            Experimental!
Open source
https://github.com/sonian/elasticsearch-zookeeper
Valve plugin

• Custom jetty plugin filter
• Rejects bulk indexing requests if cluster is overloaded
Lessons learned in the last two years
or
Proper Care and Feeding of Elasticsearch Nodes
Rule 1: Give nodes plenty of space

Running out of disk space or memory is the simplest way to corrupt your index.
Make sure elasticsearch doesn’t swap.
Swapping reduces performance and causes nodes to leave the cluster.
elasticsearch.yml

bootstrap.mlockall: true
Increase the number of open file descriptors to 64k.
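
A quick way to confirm the limit is to read it from inside a process. The sketch below uses Python's standard `resource` module (Unix only). The 65,536 threshold is one reading of "64k", and the helper names are made up for illustration.

```python
import resource

REQUIRED_FDS = 64 * 1024  # assumed interpretation of "64k"


def fd_limit_ok(soft_limit, required=REQUIRED_FDS):
    """Return True if the soft open-file limit meets the requirement."""
    # RLIM_INFINITY means "no limit", which always satisfies the requirement.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required


def current_fd_limits():
    """Return (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = current_fd_limits()
    print("soft=%s hard=%s ok=%s" % (soft, hard, fd_limit_ok(soft)))
```

Limits are per process, so this only reflects what elasticsearch will see if it is run as the same user and from the same shell or init script that starts the node.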
Rule 2: Distributed but well connected

All nodes should be able to talk to each other all the time.
Otherwise your cluster might develop split-brain syndrome.
Consider setting

discovery.zen.minimum_master_nodes
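
The usual guidance for this setting is a majority (quorum) of the master-eligible nodes, so that two halves of a partitioned cluster cannot both elect a master. A small sketch of that arithmetic follows; the function name is illustrative and not part of elasticsearch.

```python
def minimum_master_nodes(master_eligible):
    """Smallest majority of the master-eligible nodes: floor(n / 2) + 1."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible // 2 + 1


for n in (2, 3, 5, 17):
    print(n, "->", minimum_master_nodes(n))
```

For three master-eligible nodes this gives 2, which would go into elasticsearch.yml as `discovery.zen.minimum_master_nodes: 2`.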
Rule 3: Throttle the bulk indexing load

Asynchronous architecture makes elasticsearch scalable and fast, but susceptible to running out of memory under excessive bulk indexing load.
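
One way to throttle on the client side is to cap how many bulk requests are in flight at once. The sketch below is only an illustration of that idea, not how the valve plugin works: `BulkThrottle` and the `send_bulk` callable are hypothetical names, and `send_bulk` stands in for whatever function actually posts a batch to the cluster.

```python
import threading


class BulkThrottle:
    """Caps the number of bulk requests in flight at once."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def submit(self, send_bulk, batch, timeout=None):
        """Run send_bulk(batch) if a slot frees up in time.

        Returns (True, result) when the batch was sent, or (False, None)
        when no slot became free, so the caller can back off and retry.
        """
        if not self._slots.acquire(timeout=timeout):
            return False, None
        try:
            return True, send_bulk(batch)
        finally:
            # Always free the slot, even if send_bulk raised.
            self._slots.release()
```

Indexing workers share one `BulkThrottle` instance; a worker that gets `(False, None)` back should wait and resubmit rather than drop the batch.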
Rule 4: Try to make all shards approximately the same size

Elasticsearch allocates shards based on the number of shards per node. It doesn’t consider shard sizes or available disk space.
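
A toy model shows why this matters. The sketch below is not elasticsearch's allocator; it only imitates the behaviour described above by spreading shards evenly by count and then adding up the data on each node. The shard sizes and node names are invented.

```python
def allocate_by_count(shard_sizes_gb, nodes):
    """Place each shard on the node that currently holds the fewest shards.

    Sizes are ignored on purpose; ties go to the earliest node in the list.
    Returns {node: [sizes of shards placed on it]}.
    """
    placement = {node: [] for node in nodes}
    for size in shard_sizes_gb:
        target = min(nodes, key=lambda n: len(placement[n]))
        placement[target].append(size)
    return placement


def disk_used(placement):
    """Total GB stored per node."""
    return {node: sum(sizes) for node, sizes in placement.items()}


uneven = allocate_by_count([200, 5, 180, 10, 190, 5], ["node-a", "node-b"])
even = allocate_by_count([100, 100, 100, 100, 100, 100], ["node-a", "node-b"])
print(disk_used(uneven))
print(disk_used(even))
```

In both runs each node ends up with three shards. With the uneven sizes, node-a holds 570 GB and node-b holds 20 GB; with equal sizes, each holds 300 GB.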
4 rules for happy elasticsearch

1. Give nodes plenty of space
2. Distributed but well connected
3. Throttle the load
4. Make all shards the same size
Questions?
More Information

Latest stable release: 0.19.10

Web Site: http://www.elasticsearch.org/

Follow @elasticsearch on Twitter

IRC: #elasticsearch on irc.freenode.net

GitHub: https://github.com/elasticsearch/elasticsearch

Mailing list: elasticsearch on http://groups.google.com/

Stack Overflow tag: elasticsearch


Editor's Notes

  • #38: http://www.flickr.com/photos/drachmann/327122302/
  • #39: http://www.flickr.com/photos/4nitsirk/3778043845/