HOW TO BUILD A
SEARCH ENGINE
(IN TWO DAYS)
How to build a search engine in 2 days
SEARCH IS
NOT SEXY
I don’t have anything
against the search idea,
but it just doesn’t click for
me at this point.
SEARCH IS
SUPERSEXY
HOW TO BUILD A
SEARCH ENGINE
(IN TWO DAYS)
SEMANTIC
FAST
RELIABLE
USER-FRIENDLY
SEMANTIC
SEARCH?
Semantic search seeks to improve search
accuracy by understanding the searcher's intent
and the contextual meaning of search terms
HANDLING
- synonyms
- generalizations
- concept matching
- natural language queries and questions
SEMANTIC
QUERY PARSING
ILLNESS
I’m sick
=
I’m ill
=
What is the policy for
sickness?
PARKING
Where can I park?
=
Where can I park my car?
=
I’m looking for a parking
spot
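A minimal sketch of the idea on these slides: differently worded queries should resolve to the same concept. The trigger-word table and the function names below are assumptions made for illustration; the architecture slide says the talk itself relies on spaCy for query parsing, which this plain-Python lookup does not reproduce.

```python
import re

# Hypothetical trigger words per concept (not from the talk);
# a real system would curate or learn these.
CONCEPT_TRIGGERS = {
    "ILLNESS": {"sick", "ill", "sickness", "illness"},
    "PARKING": {"park", "parking"},
}


def tokenize(query):
    """Lowercase the query and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", query.lower())


def match_concepts(query):
    """Return the concepts whose trigger words occur in the query, sorted by name."""
    tokens = set(tokenize(query))
    return sorted(
        concept
        for concept, triggers in CONCEPT_TRIGGERS.items()
        if tokens & triggers
    )


if __name__ == "__main__":
    for q in ["I'm sick", "What is the policy for sickness?", "Where can I park my car?"]:
        print(q, "->", match_concepts(q))
```

A lookup table like this only covers words someone thought of in advance, which is one reason to move on to embeddings for synonyms.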
HANDLING
- synonyms
- generalizations
- concept matching
- natural language queries and questions
PARKING
Where can I park? -> VB
=
Where can I park my car? -> VB NN
=
I’m looking for a parking spot -> NN NN
ILLNESS
I’m sick
=
I’m ill
=
What is the policy for
sickness
NN
NN
NN
NN
HANDLING
- synonyms
- generalizations
- concept matching
- natural language queries and questions
CONCEPT
MATCHING
SYNONYMY
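One common way to handle synonymy is to compare word vectors: words used in similar contexts end up close together. The sketch below assumes tiny hand-made 3-dimensional vectors purely for illustration; in the setup described on the architecture slide the vectors would come from a trained fastText model loaded through Gensim.

```python
import math

# Toy vectors invented for this example; real embeddings have hundreds of dimensions.
VECTORS = {
    "sick": [0.9, 0.1, 0.0],
    "ill": [0.8, 0.2, 0.0],
    "park": [0.0, 0.1, 0.9],
    "car": [0.1, 0.0, 0.8],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(word):
    """Return the other vocabulary word closest to `word` by cosine similarity."""
    target = VECTORS[word]
    candidates = (w for w in VECTORS if w != word)
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))


if __name__ == "__main__":
    print(most_similar("sick"))
    print(most_similar("park"))
```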
INTERMEZZO
Funny fuckup
SEMANTIC
DOCUMENT ANALYSIS
FULL-TEXT?
Master_WIT_Master_1_Sem2_Crypto_Readings_Pickpocketing.Mifare.pdf
Master
WIT -> Wiskundige Ingenieurstechnieken
Master
1
Sem -> Semester
2
Crypto -> Cryptography
Readings
Pickpocketing
Mifare
[pdf]
+ NER
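The file-name breakdown above can be sketched roughly as follows. The abbreviation table holds only the three expansions shown on the slide, and the helper names are hypothetical. The NER step is left out, since it needs a trained model.

```python
import re

# Abbreviation table seeded with the expansions from the slide; anything else is left as-is.
ABBREVIATIONS = {
    "WIT": "Wiskundige Ingenieurstechnieken",
    "Sem": "Semester",
    "Crypto": "Cryptography",
}


def split_filename(filename):
    """Split a file name into (tokens, extension)."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension present
        stem, ext = filename, ""
    tokens = []
    for part in re.split(r"[_.\-\s]+", stem):
        # Separate letters from digits, so "Sem2" becomes "Sem", "2".
        tokens.extend(re.findall(r"[A-Za-z]+|\d+", part))
    return tokens, ext


def expand(tokens):
    """Replace known abbreviations with their full form."""
    return [ABBREVIATIONS.get(token, token) for token in tokens]


if __name__ == "__main__":
    name = "Master_WIT_Master_1_Sem2_Crypto_Readings_Pickpocketing.Mifare.pdf"
    tokens, ext = split_filename(name)
    print(expand(tokens), ext)
```

The expanded tokens can then be indexed like ordinary text, which gives a document some searchable meaning even before its contents are read.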
MAKING IT
FAST
MAKING IT
USER-FRIENDLY
TESTING IT’S
RELIABLE
OVERVIEW
ARCHITECTURE
Flask -> Web Framework
Query
spaCy -> Query Parsing
Concepts
Gensim & fastText -> Word Embedding Documents
Word Clouds
Results
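To make the flow from query to results concrete, here is a rough sketch of the retrieval step only. It assumes each document is represented by the average of its word vectors and that results are ranked by cosine similarity to the query; the slides do not say how the talk actually scores documents. The vectors, documents, and function names are invented for illustration, and Flask, spaCy, Gensim, and fastText are deliberately left out so the example runs on the standard library alone.

```python
import math
import re

# Toy 2-dimensional word vectors, invented for this example.
WORD_VECTORS = {
    "sick": [0.9, 0.1],
    "ill": [0.8, 0.2],
    "park": [0.1, 0.9],
    "parking": [0.0, 1.0],
    "garage": [0.2, 0.8],
}
DIMENSIONS = 2


def embed(text):
    """Average the vectors of the known words in `text` (zero vector if none are known)."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w in WORD_VECTORS]
    if not words:
        return [0.0] * DIMENSIONS
    return [sum(WORD_VECTORS[w][i] for w in words) / len(words) for i in range(DIMENSIONS)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query, documents):
    """Return document titles ordered from most to least similar to the query."""
    query_vector = embed(query)
    return sorted(
        documents,
        key=lambda title: cosine(query_vector, embed(documents[title])),
        reverse=True,
    )


if __name__ == "__main__":
    docs = {
        "Sick leave policy": "What to do when you are ill or sick",
        "Parking guide": "The garage offers parking for staff",
    }
    print(search("Where can I park?", docs))
```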
Feedback?
Questions?
Want to start a study group? :-)
hello@willworkfor.coffee
twitter.com/@lienmichiels
medium.com/@Paliendroom
