ELASTICSEARCH
A real-time distributed search and analytics engine
By Gautam Kumar
AGENDA
• Introduction
• Terminology
• Installation & Talking to Elasticsearch
• Inside cluster
• Data In & Data out
• Mapping & analysis
• Exact Values vs Full-Text Search
• Advanced Search
• Index Management
INTRODUCTION
Why Elasticsearch?
• A real-time distributed search and analytics engine.
• Full-text search, structured search, analytics, and all three in combination.
• Every field is indexed and searchable.
• Accessible through a simple RESTful API, from a web client or from your
favorite programming language.
Who is using Elasticsearch:
Wikipedia, GitHub, Stack Overflow
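TERMINOLOGY
Relational database -> Elasticsearch
Database -> Index
Table -> Type
Row -> Document
Column -> Field
Example: index "megacorp", type "employee", document ID 1
PUT /megacorp/employee/1
{
"first_name" : "John",
"last_name" : "Smith"
}
The JSON body is the document; "first_name" and "last_name" are its fields.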
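The relational analogy on this slide can be sketched in a few lines of Python. This is only an illustration of the naming, not anything Elasticsearch ships: the helper names row_to_document and document_path are hypothetical, and the sketch assumes the row's primary key is used as the document ID.

```python
import json


def row_to_document(columns, row):
    """Turn one relational row into a JSON-style document (column -> field)."""
    return dict(zip(columns, row))


def document_path(database, table, primary_key):
    """Database -> index, table -> type, primary key -> document ID (assumed)."""
    return f"/{database}/{table}/{primary_key}"


columns = ("first_name", "last_name")
row = ("John", "Smith")

# Prints the same request line and body as the slide's example.
print("PUT", document_path("megacorp", "employee", 1))
print(json.dumps(row_to_document(columns, row), indent=2))
```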
INSTALLATION & TALKING TO
ELASTICSEARCH
• Java API
Node client: it doesn’t hold any data itself, but it knows what data lives on
which node in the cluster and can forward requests directly to the correct node.
Transport client: used to send requests to a remote cluster.
• RESTful API with JSON over HTTP
curl -XGET 'http://localhost:9200/_count?pretty' -d '
{
"query": {
"match_all": {}
}
}
'
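For readers who prefer a programming language over curl, the same _count request could be prepared with Python's standard library roughly as follows. This is a sketch under assumptions: the function name build_count_request is made up for this example, the node is assumed to listen on localhost:9200 as in the curl command, and the Content-Type header is added here although the slide's command does not set one. The request is only built, not sent.

```python
import json
import urllib.request


def build_count_request(base_url="http://localhost:9200"):
    """Build (but do not send) the _count request shown in the curl example."""
    body = json.dumps({"query": {"match_all": {}}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_count?pretty",
        data=body,
        method="GET",
        # Assumption: newer Elasticsearch versions expect an explicit content type.
        headers={"Content-Type": "application/json"},
    )


request = build_count_request()
print(request.get_method(), request.full_url)

# Sending it requires a running node, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```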
DATA IN & DATA OUT
Indexing a Document (Index API)
• With your own ID
PUT /{index}/{type}/{id}
{
"field": "value"
}
• With an autogenerated ID
POST /website/blog/
{
"title": "My second blog entry",
"text": "Still trying this out...",
"date": "2014/01/01"
}
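The two ways of indexing differ only in the HTTP verb and in whether the path ends with an ID. A minimal sketch of that choice, assuming a hypothetical helper called index_request that returns the pieces of the request instead of sending it:

```python
import json


def index_request(index, doc_type, document, doc_id=None):
    """Return (method, path, body) for indexing a document.

    With an ID we PUT to /{index}/{type}/{id}; without one we POST to
    /{index}/{type}/ and let Elasticsearch generate the ID.
    """
    body = json.dumps(document)
    if doc_id is None:
        return "POST", f"/{index}/{doc_type}/", body
    return "PUT", f"/{index}/{doc_type}/{doc_id}", body


entry = {
    "title": "My second blog entry",
    "text": "Still trying this out...",
    "date": "2014/01/01",
}
print(index_request("website", "blog", entry)[:2])
print(index_request("website", "blog", entry, doc_id=123)[:2])
```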
DATA IN & DATA OUT (continued)
Retrieving a Document
GET /website/blog/123?pretty
GET /website/blog/123?_source=title,text
GET /website/blog/123/_source
output
{
"_index" : "website",
"_type" : "blog",
"_id" : "123",
"_version" : 1,
"found" : true,
"_source" : {
"title": "My first blog entry",
"text": ...,
"date": ...
}
}
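A client usually cares about two things here: asking for only some fields, and pulling the document out of the response envelope. The sketch below shows both under stated assumptions: source_path and extract_source are hypothetical helper names, and the sample response is a shortened stand-in containing only the title, not real server output.

```python
import json


def source_path(index, doc_type, doc_id, fields=None):
    """Build the GET path, optionally limiting _source to the given fields."""
    path = f"/{index}/{doc_type}/{doc_id}"
    if fields:
        path += "?_source=" + ",".join(fields)
    return path


def extract_source(response_text):
    """Return the stored document, or None when "found" is false or missing."""
    response = json.loads(response_text)
    if not response.get("found"):
        return None
    return response["_source"]


# Shortened, hand-written sample in the shape of the output above.
sample = (
    '{"_index": "website", "_type": "blog", "_id": "123", "_version": 1, '
    '"found": true, "_source": {"title": "My first blog entry"}}'
)

print(source_path("website", "blog", "123", ["title", "text"]))
print(extract_source(sample))
```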
DATA IN & DATA OUT (continued)
Updating a Whole Document
PUT /website/blog/123
{
"title": "My first blog entry",
"text": "I am starting to get the hang of this...",
"date": "2014/01/02"
}
output
{
"_index" : "website",
"_type" : "blog",
"_id" : "123",
"_version" : 2,
"created": false
}
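DATA IN & DATA OUT (continued)
Creating a New Document
PUT /website/blog/123/_create
{
"title": "My first blog entry",
"text": "I am starting to get the hang of this...",
"date": "2014/01/02"
}
output
If no document with this ID exists, it is created and the response reports
"created": true. If one already exists, as document 123 does here, the
request is rejected with 409 Conflict rather than overwriting the document.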
DATA IN & DATA OUT (continued)
Deleting a Document
DELETE /website/blog/123
output
{
"found" : true,
"_index" : "website",
"_type" : "blog",
"_id" : "123",
"_version" : 3
}
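The version numbers and the created/found flags in the update, create and delete slides follow a simple pattern that a toy in-memory model can reproduce. This is only a sketch of the observable behaviour, not how Elasticsearch stores data: the class ToyIndex and its methods are invented for this example, and the KeyError stands in for the conflict that a create request on an existing ID would hit.

```python
class ToyIndex:
    """In-memory stand-in that mimics _version, created and found."""

    def __init__(self):
        self._docs = {}      # id -> current document body
        self._versions = {}  # id -> last version number, kept after deletes

    def index(self, doc_id, body):
        """Whole-document replace: a new version is stored, nothing is edited in place."""
        created = doc_id not in self._docs
        version = self._versions.get(doc_id, 0) + 1
        self._docs[doc_id] = dict(body)
        self._versions[doc_id] = version
        return {"_id": doc_id, "_version": version, "created": created}

    def create(self, doc_id, body):
        """Like index(), but refuses to overwrite an existing document."""
        if doc_id in self._docs:
            raise KeyError(f"document {doc_id} already exists")
        return self.index(doc_id, body)

    def delete(self, doc_id):
        """Remove the document; the version counter still moves forward."""
        found = doc_id in self._docs
        version = self._versions.get(doc_id, 0) + 1
        self._docs.pop(doc_id, None)
        self._versions[doc_id] = version
        return {"found": found, "_id": doc_id, "_version": version}


store = ToyIndex()
first = store.index("123", {"title": "My first blog entry"})
second = store.index("123", {"title": "My first blog entry", "date": "2014/01/02"})
deleted = store.delete("123")
print(first, second, deleted, sep="\n")
```

Run in this order, the three calls walk through versions 1, 2 and 3, matching the sequence of outputs on the slides.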
MAPPING & ANALYSIS
Mapping: how the data in each field is interpreted
GET /gb/_mapping/tweet
{
"gb": {
"mappings": {
"tweet": {
"properties": {
"name": {"type":"text"},
...
}
}
}
}
}
Analysis: how full text is processed to make it searchable
GET /_analyze?analyzer=standard
Text to analyze
{
"tokens": [
{
"token": ...,
"start_offset": ...,
"end_offset": ...,
"type": ...,
"position": ...
},
{
"token": ...,
"start_offset": ...,
"end_offset": ...,
"type": ...,
"position": 2
},
...
]
}
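To give a feel for what the token list contains, here is a rough imitation of an analyzer in Python. It is a simplification and not the real standard analyzer: it only splits on word characters and lowercases, the function name toy_standard_analyze is made up, the "<ALPHANUM>" type is used as a fixed placeholder, and positions are assumed to start at 1 so that the second token gets position 2 as on the slide.

```python
import re


def toy_standard_analyze(text):
    """Split text into lowercase word tokens with offsets and positions."""
    tokens = []
    for position, match in enumerate(re.finditer(r"\w+", text), start=1):
        tokens.append({
            "token": match.group().lower(),
            "start_offset": match.start(),  # index of the first character
            "end_offset": match.end(),      # index just past the last character
            "type": "<ALPHANUM>",           # placeholder type for every token
            "position": position,           # assumed to be 1-based
        })
    return {"tokens": tokens}


result = toy_standard_analyze("Text to analyze")
for token in result["tokens"]:
    print(token)
```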
EXACT VALUES VS FULL TEXT SEARCH
Exact values are exactly what they sound like. Examples are a date or a user ID.
GET /_search?q=date:2014-09-15
GET /_search?q=date:2014
Full text, on the other hand, refers to textual data like the text of a tweet
or the body of an email.
GET /_search?q=2014
GET /_search?q=2014-09-15
Searching across indices and types:
/gb/_search
/gb,us/_search
/gb/user/_search
/gb,us/user,tweet/_search
/_all/user,tweet/_search
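The search paths listed above follow one pattern: comma-separated indices, then comma-separated types, then _search, with _all standing in when types are given without an index. A small sketch of that pattern, using a hypothetical helper named search_path:

```python
def search_path(indices=None, types=None):
    """Build a search path from optional lists of index and type names."""
    parts = []
    if indices:
        parts.append(",".join(indices))
    elif types:
        # Types without an index need the _all placeholder in the index slot.
        parts.append("_all")
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    return "/" + "/".join(parts)


print(search_path())
print(search_path(["gb"]))
print(search_path(["gb", "us"], ["user", "tweet"]))
print(search_path(types=["user", "tweet"]))
```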
ADVANCED SEARCH
Structure of Query
GET /_search
{
"query": {
"match": {
"tweet": "elasticsearch"
}
}
}
Combining Multiple Clauses
{
"bool": {
"must": { "match": { "tweet": "elasticsearch" }},
"must_not": { "match": { "name": "mary" }},
"should": { "match": { "tweet": "full text" }}
}
}
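In client code, a bool query like this is usually assembled as a nested dictionary and serialized to JSON. A minimal sketch, assuming a hypothetical helper bool_query that simply omits the clauses it is not given:

```python
import json


def bool_query(must=None, must_not=None, should=None):
    """Assemble a bool query from whichever clauses are provided."""
    clauses = {}
    if must is not None:
        clauses["must"] = must
    if must_not is not None:
        clauses["must_not"] = must_not
    if should is not None:
        clauses["should"] = should
    return {"bool": clauses}


query = bool_query(
    must={"match": {"tweet": "elasticsearch"}},
    must_not={"match": {"name": "mary"}},
    should={"match": {"tweet": "full text"}},
)

# Wrap it in the "query" key to get a complete search request body.
search_body = json.dumps({"query": query}, indent=2)
print(search_body)
```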
ADVANCED SEARCH (continued)
term filter: filters by exact values
{"term":{"age":26}}
range filter: finds values that fall within a specified range
{"range": {"age": {
"gte":20,
"lt":30
}}}
terms filter: like term, but lets you specify multiple values to match
{ "terms": { "tag": [ "search", "full_text", "nosql" ] }}
match_all query: matches all documents
{ "match_all": {}}
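What each of these clauses accepts or rejects can be shown by evaluating them against a plain Python dictionary. This is a toy evaluator written for this note, not Elasticsearch's implementation: the function names matches and field_values are hypothetical, only the four clause kinds from the slide are handled, and a list-valued field is assumed to match when any of its values matches.

```python
import operator

RANGE_OPERATORS = {
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def field_values(document, field):
    """Return the field's values as a list (empty when the field is missing)."""
    if field not in document:
        return []
    value = document[field]
    return value if isinstance(value, list) else [value]


def matches(clause, document):
    """Evaluate a term, terms, range or match_all clause against a dict."""
    (kind, spec), = clause.items()
    if kind == "match_all":
        return True
    (field, condition), = spec.items()
    values = field_values(document, field)
    if kind == "term":
        return condition in values
    if kind == "terms":
        return any(value in condition for value in values)
    if kind == "range":
        return any(
            all(RANGE_OPERATORS[op](value, bound) for op, bound in condition.items())
            for value in values
        )
    raise ValueError(f"unsupported clause: {kind}")


doc = {"age": 26, "tag": ["search", "nosql"]}
print(matches({"term": {"age": 26}}, doc))
print(matches({"range": {"age": {"gte": 20, "lt": 30}}}, doc))
print(matches({"terms": {"tag": ["search", "full_text", "nosql"]}}, doc))
```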
ADVANCED SEARCH (continued)
match query: queries for a full-text or exact value in almost any field
{ "match": { "tweet": "About Search" }}
For exact values
{"match":{"age":26}}
multi_match query: runs the same match query on multiple fields
{
"multi_match": {
"query": "full text search",
"fields": [ "title", "body" ]
}
}
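One way to read "same match query on multiple fields" is to expand a multi_match clause into one match clause per field. The sketch below does exactly that and nothing more; the helper name expand_multi_match is hypothetical, and how Elasticsearch combines the per-field scores is outside what the slide covers and is not modelled here.

```python
def expand_multi_match(clause):
    """Turn a multi_match clause into the equivalent list of match clauses."""
    spec = clause["multi_match"]
    return [{"match": {field: spec["query"]}} for field in spec["fields"]]


multi = {
    "multi_match": {
        "query": "full text search",
        "fields": ["title", "body"],
    }
}

for match_clause in expand_multi_match(multi):
    print(match_clause)
```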
INDEX MANAGEMENT
• Creating an Index:
PUT /my_index
{
"settings": { ... any settings ... },
"mappings": {
"type_one": { ... any mappings ... },
"type_two": { ... any mappings ... },
...
}
}
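The placeholders in the create-index request can be filled in from client code along these lines. This is a sketch with illustrative values only: create_index_request is a hypothetical helper, the shard and replica counts are arbitrary, and the single "title" field under type_one is an assumption made to have something concrete in the mapping.

```python
import json


def create_index_request(name, settings, mappings):
    """Return (method, path, body) for creating an index explicitly."""
    body = {"settings": settings, "mappings": mappings}
    return "PUT", f"/{name}", json.dumps(body, indent=2)


method, path, body = create_index_request(
    "my_index",
    settings={"number_of_shards": 1, "number_of_replicas": 0},  # example values
    mappings={"type_one": {"properties": {"title": {"type": "text"}}}},
)
print(method, path)
print(body)
```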
Questions?
Thank you

Editor's Notes

  • #9: Documents in Elasticsearch are immutable; we cannot change them. Instead, if we need to update an existing document, we reindex or replace it.
  • #17: Prevent the automatic creation of indices by adding the following setting to the config/elasticsearch.yml file on each node: action.auto_create_index: false