Content Analytics for Better Search
Otis Gospodnetić • Sematext International
Agenda
Intro: Otis & Sematext
Basic Search
Taming Search Results
Key Phrases
Beyond Search
About Otis Gospodnetić
Member: Apache Lucene, Solr, Nutch, Mahout
Author: Lucene in Action 1 & 2
Entrepreneur: Simpy (2004), Lucene Consulting (2005), Sematext Int'l (2007)
Organizer: NY Search & Discovery Meetup
About Sematext
Consulting, development, support:
Big Data (Hadoop, HBase, Voldemort...)
Search (Lucene, Solr, Elasticsearch...)
Web Crawling (Nutch)
Machine Learning (Mahout)
Basic Search
Taming Search Results
Related searches (high query volume)
Search results clustering (fuzzy)
Named Entity Recognition (NER)
Faceted search (structured input)
…
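To make the "faceted search (structured input)" item concrete, here is a minimal sketch of facet counting over documents that already carry structured fields. The documents, field names, and the `facet_counts` helper are hypothetical illustrations, not taken from the talk or from any Lucene/Solr API.

```python
from collections import Counter

# Hypothetical documents with structured fields (assumed for illustration).
DOCS = [
    {"title": "Lucene in Action", "category": "book", "topic": "search"},
    {"title": "Solr tuning notes", "category": "article", "topic": "search"},
    {"title": "Mahout primer", "category": "article", "topic": "machine learning"},
]


def facet_counts(docs, field):
    """Count how many documents carry each value of the given field."""
    return Counter(doc[field] for doc in docs if field in doc)


if __name__ == "__main__":
    # Each (value, count) pair is what a UI would render as a clickable facet.
    for value, count in facet_counts(DOCS, "category").most_common():
        print(f"{value} ({count})")
```

The sketch only works because every document has clean field values, which is the sense in which faceting depends on structured input.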
Example: Related Searches
Example: Results Clustering
Example: Named Entities
Sorry, no screenshot, but I know sites use this! Really, I do! :)
Example: Faceted Search
Content Analysis: Key Phrases
Related searches
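As a rough illustration of how key phrases could be pulled from content and offered as related searches, the sketch below counts two-word phrases that contain no stopwords. Raw frequency is an assumed stand-in here; the slides do not say which extraction method is used, and the `key_phrases` helper, stopword list, and sample texts are all hypothetical.

```python
import re
from collections import Counter

# Small assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}

# Hypothetical content (or query log lines) to analyze.
TEXTS = [
    "Faceted search with Solr",
    "Tuning faceted search for speed",
    "Named entity recognition in search results",
]


def key_phrases(texts, top_n=3):
    """Return the most frequent adjacent word pairs that contain no stopwords."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        for first, second in zip(words, words[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                counts[(first, second)] += 1
    return [" ".join(pair) for pair, _ in counts.most_common(top_n)]


if __name__ == "__main__":
    # The top phrases could be shown next to the results as related searches.
    print(key_phrases(TEXTS, top_n=1))
```

A production extractor would more likely score phrases against a background corpus than rank by raw counts, but the input and output shapes would be similar.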

Key Phrases for Better Search

Editor's Notes

  • #7: 10 days of data (5K/min)
  • #12: 10 days of data (5K/min)