Implementing Search with Solr at 7digital
James Atherton
Content Discovery Team Lead
@mr_road
Who is 7digital?
Online digital content provider
Covering over 47 territories
Online music store: www.7digital.com
API: api.7digital.com
We power a number of music services:
Samsung
BlackBerry
Turntable.fm
Pure
Where we came from...
SQL Searches
SELECT *
FROM <table>
WHERE name LIKE '<search_term>%';
This was SLOW and BAD!!
Wrapped Solr in an API
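As a rough illustration of what wrapping Solr in an API can involve, the sketch below builds a request URL for Solr's standard `/select` handler. The host, the core name (`artists`) and the field name (`name`) are assumptions made for the example, not 7digital's actual configuration.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the slides do not give the real host or core layout.
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(core, field, term, rows=10):
    """Build a URL for Solr's /select handler that searches a single field."""
    # Quote the term so Solr treats it as a phrase, escaping backslashes
    # and embedded double quotes first.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": f'{field}:"{escaped}"', "rows": rows, "wt": "json"}
    return f"{SOLR_BASE}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_select_url("artists", "name", "daft punk"))
```

A wrapper like this keeps Solr's query syntax out of client code, so the index layout can change without breaking API consumers.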
Old Architecture
API
DB
Domain Objects
Artist Documents
Release Documents (e.g. album or single)
Track Documents
First Attempt - 2011
• Artists and Releases
• Solr 1.4
• 17 stores
• ~40GB
• Dropped DIH (DataImportHandler) as it had issues
2011 Architecture
[Diagram: HTTP, API, Search API, Solr (two boxes), DB; labels: Tracks, Artists, Releases]
2012
• Added Tracks Core
• Solr 3.5
• 47 stores
• ~400GB
• More than 430 M docs
• Didn't revisit DIH
Current Architecture
[Diagram: HTTP, API, Search API, Artist/Release Solrs (two groups), Track Solrs (four groups)]
Things Learnt
We should have split by <X>; for us, that is shops.
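One way to read "split by shops" is to give each shop (or group of shops) its own index and route every query by shop ID. A minimal sketch, assuming made-up shop IDs and core URLs that do not come from the slides:

```python
# Hypothetical mapping from shop ID to the Solr core holding that shop's catalogue.
SHOP_CORES = {
    34: "http://solr-uk:8983/solr/tracks_uk",
    52: "http://solr-us:8983/solr/tracks_us",
}


def core_for_shop(shop_id):
    """Return the Solr core URL for a shop, failing loudly for unknown shops."""
    try:
        return SHOP_CORES[shop_id]
    except KeyError:
        raise ValueError(f"no Solr core configured for shop {shop_id}") from None
```

With this kind of routing each index only holds the documents one shop can sell, so no single index has to grow with the total number of territories.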
Beware Inflection Points
Data size: 400GB != 40GB * 10
Throughput: 600 rpm IS NOT 4 * 150 rpm
What do we want in our servers?
RAM?
Fast Disks?
CPUs?
Virtual?
Bare Metal?
Optimize really...?
Cache Warming/First search?
Testing
Test ingestion/data import, then test again
Your data is not as clean as you think
Load test early and often
We still need to get better at this
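A cheap way to find out how clean the data really is: validate documents before they reach the indexer and count what gets rejected. The sketch below assumes a hypothetical schema with `id`, `name` and `shop_id` fields; the real field names are not given in the slides.

```python
# Hypothetical required fields; substitute the fields your schema actually needs.
REQUIRED_FIELDS = ("id", "name", "shop_id")


def validate_doc(doc):
    """Return a list of problems for one document; an empty list means it looks clean."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = doc.get(field)
        # Treat missing values and blank strings as the same problem.
        if value is None or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing or empty field: {field}")
    return problems


def split_clean_and_dirty(docs):
    """Partition documents into those safe to index and those needing attention."""
    clean, dirty = [], []
    for doc in docs:
        problems = validate_doc(doc)
        if problems:
            dirty.append((doc, problems))
        else:
            clean.append(doc)
    return clean, dirty
```

Running this over a full import before indexing gives a number to track across test runs, rather than discovering bad records through odd search results.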
Logs
Logging is worth its weight in gold
But don't get weighed down
Monitoring
We use statsd/graphite and NewRelic
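For a sense of how little code statsd needs, the sketch below sends a timing metric using the plain-text statsd line protocol over UDP. The metric name and the host are assumptions for illustration; 8125 is the port statsd listens on by default.

```python
import socket


def format_timing(metric, millis):
    """Format a timer in the statsd line protocol: <name>:<value>|ms."""
    return f"{metric}:{int(millis)}|ms"


def send_timing(metric, millis, host="127.0.0.1", port=8125):
    """Send one timing over UDP; fire-and-forget, no reply is expected."""
    payload = format_timing(metric, millis).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    # Hypothetical metric name for an artist search taking 42 ms.
    send_timing("search.artist.query_time", 42)
```

Because the transport is UDP, instrumenting the search path this way adds very little latency and does not fail requests when the metrics daemon is down.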
Visualise Indexing
Which territory's data has been indexed?
Instant Search
Magic Deploys
We recently adopted CFEngine; it is awesome!!
The Future
[Diagram: HTTP, API, Search API, Artist Solrs, Release Solrs, Track Solrs (four groups)]
Solr Cloud, in the Cloud??
Questions?
Resources
https://github.com/etsy/statsd/
https://github.com/7digital
http://d3js.org/
James Atherton
@mr_road
@7digital
