(Elastic)search in Big Data
Radu Gheorghe
@radu0gheorghe @sematext
Agenda
What is “search in Big Data”? Challenges?
Some solutions?
How does Elasticsearch do it?
Search Expectations
headphones for iPhone 4, iPhone 5, iPhone 6 and iPhone 7
iPhone 5
iPhone 4
Relevancy...
iphone
iphone
iphone 5
Institute of Public Health
...and autocomplete...
iph
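A minimal sketch of how the suggestions for "iph" could be produced. The slide does not say how "Institute of Public Health" ends up among the suggestions; this sketch assumes it matches through its acronym (IPH). The function names and the extra "Galaxy S4" candidate are hypothetical.

```python
def acronym(text):
    # Initials of capitalized words only, so "of" is skipped:
    # "Institute of Public Health" -> "iph"
    return "".join(word[0] for word in text.split() if word[0].isupper()).lower()


def suggest(prefix, candidates):
    # Keep candidates whose text or acronym starts with the typed prefix
    p = prefix.lower()
    return [c for c in candidates
            if c.lower().startswith(p) or acronym(c).startswith(p)]


CANDIDATES = ["iphone", "iphone 5", "Institute of Public Health", "Galaxy S4"]
suggestions = suggest("iph", CANDIDATES)
print(suggestions)
```

A real suggester would also rank the candidates (for example by popularity); this only shows the matching step.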
No results found for “iphnoe”
iPhone 5
iPhone 4
...and fuzziness...
iphnoe
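To make the "iphnoe" example concrete, here is a sketch of fuzzy matching by edit distance. It uses the optimal string alignment variant, where swapping two adjacent letters counts as a single edit, so "iphnoe" is one edit away from "iphone". The term list and the one-edit threshold are assumptions for illustration, not something the slides specify.

```python
def osa_distance(a, b):
    # Edit distance with insert, delete, substitute and adjacent transposition
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[rows - 1][cols - 1]


def fuzzy_match(query, terms, max_edits=1):
    # Return indexed terms within max_edits of the (possibly misspelled) query
    return [t for t in terms if osa_distance(query.lower(), t.lower()) <= max_edits]


TERMS = ["iphone", "ipad", "galaxy"]  # hypothetical indexed terms
print(fuzzy_match("iphnoe", TERMS))
```

The same distance can drive a "Did you mean" correction: suggest the closest indexed term instead of silently returning its results.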
Did you mean “iPhone”?
iPhone 5
iPhone 4
...and corrections...
iphnoe
shows results
anyway
iPhone 5
iPhone 4
iPhone 3
Galaxy S4
...and similar terms...
iphone
iPhone 5
iPhone 4
...and don’t forget the statistics!
iphone
☑ iOS
☐ other
☑ <100RON
☐ 100-200RON
☐ >200RON
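The checkboxes on this slide are backed by counts per category. A rough sketch of that idea, using the slide's OS and price-range buckets; the product records and field names are made up for illustration.

```python
from collections import Counter

# Hypothetical catalog; only the bucket labels come from the slide
PRODUCTS = [
    {"name": "headphones A", "os": "iOS", "price_ron": 80},
    {"name": "headphones B", "os": "iOS", "price_ron": 150},
    {"name": "headphones C", "os": "other", "price_ron": 250},
    {"name": "headphones D", "os": "iOS", "price_ron": 99},
]


def price_bucket(price):
    # Map a price to one of the three ranges shown next to the checkboxes
    if price < 100:
        return "<100RON"
    if price <= 200:
        return "100-200RON"
    return ">200RON"


def facet_counts(products):
    # One counter per facet: how many products fall into each bucket
    os_counts = Counter(p["os"] for p in products)
    price_counts = Counter(price_bucket(p["price_ron"]) for p in products)
    return dict(os_counts), dict(price_counts)


def apply_filters(products, os=None, bucket=None):
    # Keep only products matching the ticked checkboxes
    return [p for p in products
            if (os is None or p["os"] == os)
            and (bucket is None or price_bucket(p["price_ron"]) == bucket)]


print(facet_counts(PRODUCTS))
print([p["name"] for p in apply_filters(PRODUCTS, os="iOS", bucket="<100RON")])
```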
Wait. Fancy search == Big Data?
Fancy stuff isn’t free
iphone
☑ iOS
☐ other
☑ <100RON
☐ 100-200RON
☐ >200RON
N requests for autocomplete
Did you mean...
iPhone 5
iPhone 4
iPhone 3
Galaxy S4
1 request for each of the stats
1 request for synonyms, 1 for exact matches, etc.
1 request for corrections
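A back-of-the-envelope sketch of the fan-out this slide describes. The slide leaves N undefined; the assumption here is one autocomplete request per typed character. The count of two stats widgets (OS and price) is read off the mock-up, and the function name is hypothetical.

```python
def backend_requests(query, stats_widgets, extra_lookups):
    # Assumption: autocomplete fires once per typed character (the slide's "N")
    autocomplete = len(query)
    return autocomplete + stats_widgets + extra_lookups


# "iphone" typed letter by letter, 2 stats widgets (OS, price),
# 3 extra lookups: synonyms, exact matches, corrections
total = backend_requests("iphone", stats_widgets=2, extra_lookups=3)
print(total)
```

Under these assumptions a single user search turns into 11 backend requests, which is the sense in which fancy search gets expensive at scale.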
Distributed search. When one server doesn’t cut it
Log Search
web_server01
database01
backend01
search engine
10:01 - webapp - DB connect error
10:00 - DB - I/O error
error
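A small sketch of what the search engine in this picture does with logs: collect lines from several servers, build an inverted index, and look up a term such as "error". Which host produced which line, and the third log line, are assumptions added for illustration.

```python
import re
from collections import defaultdict

# (host, line) pairs; the host-to-line pairing is assumed, not from the slide
LOGS = [
    ("web_server01", "10:01 - webapp - DB connect error"),
    ("database01", "10:00 - DB - I/O error"),
    ("backend01", "10:02 - backend - request served"),  # hypothetical line
]


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(logs):
    # Inverted index: token -> positions of the log lines containing it
    index = defaultdict(set)
    for position, (_host, line) in enumerate(logs):
        for token in tokenize(line):
            index[token].add(position)
    return index


def search(index, logs, term):
    positions = sorted(index.get(term.lower(), set()))
    return [logs[p] for p in positions]


INDEX = build_index(LOGS)
errors = search(INDEX, LOGS, "error")
for host, line in errors:
    print(host, "|", line)
```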
Log Analytics
unique IPs: 7584
best sellers: iPhone 5, iPhone 4, Galaxy S4
users per country: Romania: 200, France: 150, Hungary: 120
revenue per day
Distributed search solutions
Elasticsearch
Solr
Others: SenseiDB, Sphinx…
SaaS: CloudSearch, Logsene...
Elasticsearch
built on top of Lucene
Document-oriented
Lucene awesome: index & store data, relevancy, fuzzy, suggesters...
...all wrapped up in JSON over HTTP
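As a sketch of what "JSON over HTTP" looks like in practice, the snippet below only builds the request for a fuzzy match query and does not send it. The index name "products" and the field "name" are hypothetical; the body follows the Elasticsearch query DSL for a match query with fuzziness.

```python
import json


def fuzzy_search_request(index, field, text):
    # Build (method, path, JSON body) for a fuzzy match search
    path = f"/{index}/_search"
    body = {
        "query": {
            "match": {
                field: {"query": text, "fuzziness": "AUTO"}
            }
        }
    }
    return "POST", path, json.dumps(body)


# "products" and "name" are made-up names for illustration
method, path, payload = fuzzy_search_request("products", "name", "iphnoe")
print(method, path)
print(payload)
```

Any HTTP client can send this to a node's HTTP port; the response comes back as JSON as well.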
Aggregations
revenue per day
unique IPs: 7584
unique IPs per country: Romania: 200, France: 150, Hungary: 120
unique IPs per country per day (e.g. Romania)
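The progression on these slides (unique IPs, then per country, then per country per day) is a nesting of buckets with a metric at the bottom. A plain-Python sketch of that nesting, with made-up hits and dates; it computes exact counts, whereas Elasticsearch's cardinality aggregation returns an approximate count.

```python
from collections import defaultdict

# Hypothetical access-log hits
HITS = [
    {"ip": "1.1.1.1", "country": "Romania", "day": "2024-01-01"},
    {"ip": "1.1.1.1", "country": "Romania", "day": "2024-01-01"},
    {"ip": "1.1.1.2", "country": "Romania", "day": "2024-01-01"},
    {"ip": "2.2.2.1", "country": "France", "day": "2024-01-01"},
    {"ip": "1.1.1.1", "country": "Romania", "day": "2024-01-02"},
]


def group_by(hits, field):
    # Bucket hits by the value of one field
    buckets = defaultdict(list)
    for hit in hits:
        buckets[hit[field]].append(hit)
    return buckets


def unique_ips(hits):
    # Metric: number of distinct IPs in a bucket
    return len({hit["ip"] for hit in hits})


def unique_ips_per_country(hits):
    return {country: unique_ips(bucket)
            for country, bucket in group_by(hits, "country").items()}


def unique_ips_per_country_per_day(hits):
    # Two levels of buckets, then the metric
    return {country: {day: unique_ips(day_bucket)
                      for day, day_bucket in group_by(bucket, "day").items()}
            for country, bucket in group_by(hits, "country").items()}


print(unique_ips(HITS))
print(unique_ips_per_country(HITS))
print(unique_ips_per_country_per_day(HITS))
```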
[Diagram sequence: a cluster of Node 1; then Node 1 and Node 2; then Node 1, Node 2 and Node 3; then Node 1 and Node 2 again]
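A sketch of the scatter-gather idea behind spreading one index over several nodes: documents are routed to shards, each shard computes its own top hits, and a coordinator merges them. The crc32 routing, the term-frequency score and the documents are assumptions for illustration, not how Elasticsearch routes or scores.

```python
import heapq
import zlib


def shard_for(doc_id, num_shards):
    # Hypothetical routing: stable hash of the document id
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def index_docs(docs, num_shards):
    shards = [[] for _ in range(num_shards)]
    for doc_id, text in docs:
        shards[shard_for(doc_id, num_shards)].append((doc_id, text))
    return shards


def search_shard(shard, term, k):
    # Local top-k on one shard, scored by how often the term occurs
    scored = [(text.lower().split().count(term), doc_id) for doc_id, text in shard]
    return heapq.nlargest(k, [s for s in scored if s[0] > 0])


def search(shards, term, k):
    # Scatter to every shard, then gather and merge the partial results
    partial = []
    for shard in shards:
        partial.extend(search_shard(shard, term, k))
    return [doc_id for _score, doc_id in heapq.nlargest(k, partial)]


DOCS = [
    ("1", "iphone 5 case"),
    ("2", "iphone iphone charger"),
    ("3", "galaxy s4"),
    ("4", "iphone 4"),
]

print(search(index_docs(DOCS, 1), "iphone", 2))
print(search(index_docs(DOCS, 3), "iphone", 2))
```

Because every shard returns its own top k, the merged result is the same whether the documents sit on one shard or three, which is what lets a cluster add or lose nodes without changing the answers.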
Big Data distributed search
search and real-time analytics
more search features
clients
usage (logs)
Thank you!
radu.gheorghe@sematext.com
@radu0gheorghe @sematext
