Lisa Jung
Developer Advocate @Elastic
Beginner’s Crash Course to the Elastic Stack Series
Part 1.2: Understanding the Relevance of Your Search Using
Elasticsearch and Kibana
Beginner’s Crash Course to the Elastic Stack Series
● Part 1.1: Intro to Elasticsearch and Kibana
○ use cases of Elasticsearch and Kibana
○ the basic architecture of Elasticsearch
○ performing CRUD (Create, Read, Update, and Delete) operations with
Elasticsearch and Kibana
Missed the first workshop? No worries!
● Part 1.1: Intro to Elasticsearch and Kibana
○ Repo: https://ela.st/workshop-1-repo
The Elastic Stack
Reliably and securely take data from
any source, in any format, then search,
analyze, and visualize it in real time.
Elasticsearch
Store | Search | Analyze
Elastic is a search company.
We focus on value to users by producing fast results
that operate at scale and are relevant. This is our
DNA. We believe search is an experience. It is what
defines us, and makes us unique.
Scale, Relevance
Relevance
How do we measure relevance?
● Precision
● Recall
I store data as documents!
Documents with similar traits are grouped into an index!
[Diagram: an index containing six documents]
When a search query is sent, Elasticsearch retrieves the relevant
documents and presents them as search results.
[Diagram: an index of six documents, some of which are returned as search results]
These two diagrams depict the same thing!
[Diagram: two depictions of the same index of documents]
True positives are relevant documents that are returned to the user.
False positives are irrelevant documents that are returned to the user.
True negatives are irrelevant documents that are not returned to the user.
False negatives are relevant documents that are not returned to the user.
[Diagram: an index whose documents are marked as true positives, false positives, true negatives, and false negatives]
What is precision?
Precision = True positives / (True positives + False positives)
What portion of the retrieved data is actually relevant to
the search query?
What is recall?
Recall = True positives / (True positives + False negatives)
What portion of relevant data is being returned as search
results?
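The two ratios can be checked with a few lines of Python. This is a minimal sketch: the counts (5 true positives, 2 false positives, 3 false negatives) are made up for illustration and do not come from a real search.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Share of the returned documents that are relevant."""
    returned = true_positives + false_positives
    return true_positives / returned if returned else 0.0


def recall(true_positives: int, false_negatives: int) -> float:
    """Share of the relevant documents that were returned."""
    relevant = true_positives + false_negatives
    return true_positives / relevant if relevant else 0.0


# Hypothetical counts for one search: 5 relevant documents returned,
# 2 irrelevant documents returned, 3 relevant documents missed.
tp, fp, fn = 5, 2, 3
p = precision(tp, fp)  # 5 / (5 + 2)
r = recall(tp, fn)     # 5 / (5 + 3)
print(f"precision={p:.3f} recall={r:.3f}")
```

True negatives appear in neither ratio, which is why a very large index full of irrelevant documents does not by itself change precision or recall.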
Precision and recall are inversely related
Precision: I want all the retrieved results to be a perfect match to
the query, even if it means returning fewer documents.
Recall: I want to retrieve more results, even if some documents may
not be a perfect match to the query.
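The trade-off can be illustrated with a toy example. The sketch below is not how Elasticsearch decides what to return. It assumes a hypothetical set of eight documents, each with a made-up match strength and a relevance label, and returns only the documents at or above a cutoff.

```python
from typing import Tuple

# Each tuple is (match_strength, is_relevant) for one hypothetical document.
docs = [(0.9, True), (0.8, True), (0.7, False), (0.6, True),
        (0.4, False), (0.3, True), (0.2, False), (0.1, False)]


def precision_recall(threshold: float) -> Tuple[float, float]:
    """Return (precision, recall) when only documents at or above threshold are returned."""
    returned = [relevant for strength, relevant in docs if strength >= threshold]
    true_positives = sum(returned)
    total_relevant = sum(relevant for _, relevant in docs)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / total_relevant if total_relevant else 0.0
    return precision, recall


strict = precision_recall(0.8)  # return only the strongest matches
loose = precision_recall(0.3)   # return weaker matches too
print("strict:", strict)
print("loose:", loose)
```

With the strict cutoff every returned document is relevant but half of the relevant documents are missed. With the loose cutoff every relevant document is returned, along with some irrelevant ones.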
Precision and recall determine which documents are
included in the search results.
Precision and recall do not determine which of the returned
documents are more relevant than the others!
Ranking refers to the ordering of the results (from the most relevant
results at the top to the least relevant at the bottom).
[Diagram: hits for the search query "How to form good habits", ordered from most relevant (highest score) to least relevant (lowest score)]
What is score?
● The score is a value that represents how relevant a document is to that
specific query
● A score is computed for each document that is a hit
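Ranking by score amounts to sorting the hits in descending order of score. A minimal sketch, assuming three hypothetical hits with made-up scores (in an Elasticsearch response each hit carries its score in the `_score` field; the plain `score` key here is only for illustration):

```python
# Hypothetical hits with made-up scores; real scores are computed by Elasticsearch.
hits = [
    {"title": "Good chicken recipe", "score": 0.4},
    {"title": "Good habits 101", "score": 2.1},
    {"title": "How to form a band", "score": 1.3},
]

# Ranking: order the hits from the highest score to the lowest.
ranked = sorted(hits, key=lambda hit: hit["score"], reverse=True)
for position, hit in enumerate(ranked, start=1):
    print(position, hit["title"], hit["score"])
```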
What is score?
Factors used to calculate a score include:
● Term Frequency (TF)
● Inverse Document Frequency (IDF)
Term Frequency (TF) measures how many times each search
term appears in a document.
If search terms are found in high frequency in a document, the
document is considered more relevant to the search query.
[Diagram: search query "How to form good habits", with example hits labeled TF = 4 and TF = 1]
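Counting term frequency can be sketched in a few lines. This is a simplified illustration, not Elasticsearch's implementation: the tokenizer is a plain lowercase word split, and the example document text is made up.

```python
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def term_frequencies(query: str, document: str) -> dict:
    """Count how often each query term occurs in the document."""
    counts = Counter(tokenize(document))
    return {term: counts[term] for term in tokenize(query)}


query = "How to form good habits"
doc = "Good habits are easy to master when you form good habits daily"  # made-up text
tf = term_frequencies(query, doc)
print(tf)
```

Here "good" and "habits" each occur twice, so under TF alone this document would count as a stronger match for those terms than a document that mentions them once.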
What is Inverse Document Frequency (IDF)?
Search query: How to form good habits
Hits:
● How to form a meetup group
● Good chicken recipe
● How to form a band
● Good times rolling
● Good habits 101
● Good habits are easy to master!
Some hits contain some of the search terms but have nothing to do
with forming good habits!
IDF diminishes the weight of terms that occur very frequently in the
document set and increases the weight of terms that occur rarely!
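The effect of IDF can be seen on the six hit titles from the slide. The sketch below uses the textbook formula log(N / df), where N is the number of documents and df is the number of documents containing the term. This is an assumption made for illustration: Elasticsearch's default BM25 scoring uses a smoothed variant of IDF, so real scores will differ.

```python
import math
import re

# The hit titles from the slide, used here as a tiny document set.
documents = [
    "How to form a meetup group",
    "Good chicken recipe",
    "How to form a band",
    "Good times rolling",
    "Good habits 101",
    "Good habits are easy to master!",
]


def tokenize(text: str) -> set:
    """Lowercase the text and return its distinct word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def idf(term: str) -> float:
    """Textbook IDF: log(N / df). Returns 0.0 for a term found in no document."""
    df = sum(term in tokenize(doc) for doc in documents)
    return math.log(len(documents) / df) if df else 0.0


for term in ["how", "to", "form", "good", "habits"]:
    print(term, round(idf(term), 3))
```

"good" appears in four of the six titles and gets the lowest weight, while "habits" appears in only two and gets a higher one, so a match on "habits" says more about relevance than a match on "good".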
Fine tuning precision or recall using
Elasticsearch and Kibana
Click on the link to the workshop repo.
https://ela.st/workshop-2-repo
Scroll down to the Resources section and open the Free
Elastic Cloud Trial link in a new tab.
Go to the email account you signed up with and
verify your email.
Open the email from Elastic. Click on the verify and accept
button.
Enter your password.
Click on start your free trial.
Select the Elastic Stack.
Configure your settings.
Name your deployment (for example, Beginner’s Crash Course), then create the deployment.
Save your deployment credentials.
Open Kibana.
Click on the Explore on my own option.
Click on the Upload a file option.
Download and unzip the News Category Dataset from Kaggle.
Drag and drop the file you want to upload.
Kibana will analyze the first 1000 lines of your data and give you
a summary of your dataset.
The field section displays the fields identified, high-level statistics,
and top occurring values.
Click on the import button.
Name your index (news_headlines) and click on import.
Then Elasticsearch will do the rest!
Click on the menu icon and open Dev Tools.
Click on the Explore on my own option.
Fine tuning precision or recall using
Elasticsearch and Kibana
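One way precision and recall can be tuned in Elasticsearch is the `operator` parameter of the `match` query. The default `or` returns documents that contain any of the search terms (leaning toward recall), while `and` requires every term to be present (leaning toward precision). The sketch below only builds and prints the two request bodies and does not contact a cluster. It assumes the uploaded data has a text field named `headline`, which is an assumption to check against your own index, and the search text is just an example.

```python
import json

INDEX = "news_headlines"  # index name chosen in the upload step
FIELD = "headline"        # assumed text field; check your index mapping


def match_query(text: str, operator: str = "or") -> dict:
    """Build a match query body; operator is 'or' (any term) or 'and' (all terms)."""
    if operator not in ("or", "and"):
        raise ValueError("operator must be 'or' or 'and'")
    return {"query": {"match": {FIELD: {"query": text, "operator": operator}}}}


recall_leaning = match_query("how to form good habits")
precision_leaning = match_query("how to form good habits", operator="and")

# Print each request in a form that can be pasted into Kibana Dev Tools.
for body in (recall_leaning, precision_leaning):
    print(f"GET {INDEX}/_search")
    print(json.dumps(body, indent=2))
```

Running both requests in Dev Tools and comparing the number of hits should show the `and` version returning fewer, more closely matching documents.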
Questions?
Join the Elastic Austin User Group for
updates on future workshops!
Lisa Jung
https://discuss.elastic.co/
