Elastic @Deezer
Aurélien Saint Requier, Search Data Scientist
Table of contents
/01 Where?
/02 Elasticsearch architecture
/03 Querying Elasticsearch
/04 ELK stack for analysis
ELASTIC #DEEZER
/01 Where?
For search features
For chart and new release features
For recommendation features
/02 Elasticsearch architecture
Elasticsearch architecture
Our needs
● Search and recommend across:
○ 3 million artists
○ 5 million albums
○ 50 million tracks
○ 2 million playlists
● Search and recommend content based on:
○ metadata and other features
○ tag descriptions
● New releases should become available in less than 2 hours
● Queries must respond in less than 100 ms
Elasticsearch architecture
Overview
Elasticsearch architecture
Data workflow
How do we deploy full indexes in production?
1. Get JSON data from the Hadoop cluster (using WebHDFS)
2. Index documents on mastersearch (using the ES Bulk API)
3. Package the new index:
3.1. Compress the ES index directory
3.2. Generate a deployment script
4. Copy the package to the temporary node of each cluster (using assassin, a homemade rsync deploy script)
5. Run the deployment script:
5.1. Start a temporary ES instance and load the new index
5.2. Set the required number of replicas
5.3. Wait until the data is replicated, then shut down the temporary ES instance
5.4. Warm the new index
5.5. Switch the alias to the new index and close the old index
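Steps 2, 5.2 and 5.5 above map onto three Elasticsearch request bodies, sketched below. This is a minimal sketch, not the actual deployment script: the index names (`tracks_v1`, `tracks_v2`), the alias `tracks`, the `track` type, and the sample documents are hypothetical. The code only builds the payloads for `POST /_bulk`, `PUT /<index>/_settings` and `POST /_aliases`; it sends nothing. The `_type` field applies to older Elasticsearch versions such as the 1.x line used in these slides; recent versions have dropped it.

```python
import json


def bulk_body(index, doc_type, docs):
    """Build the newline-delimited body for POST /_bulk (step 2).

    Each document becomes two lines: an action line and a source line.
    The body must end with a trailing newline.
    """
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


def replica_settings(replicas):
    """Body for PUT /<index>/_settings to set the replica count (step 5.2)."""
    return {"index": {"number_of_replicas": replicas}}


def alias_switch(alias, old_index, new_index):
    """Body for POST /_aliases (step 5.5).

    Both actions go in one request so the alias never points at nothing.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical documents, for illustration only.
    docs = [{"id": 1, "title": "Track A"}, {"id": 2, "title": "Track B"}]
    print(bulk_body("tracks_v2", "track", docs), end="")
    print(json.dumps(replica_settings(2)))
    print(json.dumps(alias_switch("tracks", "tracks_v1", "tracks_v2")))
```

Switching an alias rather than renaming an index lets clients keep querying a stable name while the new index is loaded and warmed behind it.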
/03 Querying Elasticsearch
How do we analyze musical data?
Use custom analyzers
Black Pearl (He's A Pirate) [feat. Sidney Housen] - EP
The Black Eyed Peas
● Lowercase, asciifolding and char filters, music field synonyms:
● Edge_ngram tokenizer:
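To make the two analysis steps concrete, here is a rough pure-Python emulation of what such a chain produces. It is an illustration under stated assumptions only: the function names, the gram sizes (`min_gram=2`, `max_gram=5`) and the whitespace split are hypothetical, and the real work is done inside Elasticsearch by the analyzers configured on the index. Stripping combining marks is also only an approximation of the `asciifolding` filter, which handles more characters.

```python
import unicodedata


def fold(text):
    """Lowercase and strip diacritics, roughly like lowercase + asciifolding."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edge_ngrams(token, min_gram=2, max_gram=5):
    """Return the prefixes of token with lengths min_gram..max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze(text):
    """Fold the text, split on whitespace, and emit edge n-grams per token."""
    grams = []
    for token in fold(text).split():
        grams.extend(edge_ngrams(token))
    return grams


if __name__ == "__main__":
    print(analyze("The Black Eyed Peas"))
```

With these assumed settings, the token `black` is indexed as `bl`, `bla`, `blac` and `black`, so a user who has typed only `bla` already matches an indexed term.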
How do we search our data?
● Using an internal Java Elasticsearch plugin:
● Using the Multi Search API and Query DSL:
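As a hedged illustration of the Multi Search API, the sketch below builds the body of a single `POST /_msearch` request that runs one Query DSL `match` query per content type. The index names (`artists`, `albums`, `tracks`) and field names (`name`, `title`) are assumptions made for the example, not the production schema.

```python
import json


def match_query(field, text, size=5):
    """A basic Query DSL match query on one field."""
    return {"query": {"match": {field: text}}, "size": size}


def msearch_body(searches):
    """Build the newline-delimited body for POST /_msearch.

    searches is a list of (index, search_body) pairs. Each pair becomes a
    header line naming the index, followed by a line with the search body.
    """
    lines = []
    for index, body in searches:
        lines.append(json.dumps({"index": index}))
        lines.append(json.dumps(body))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    user_input = "black eyed"
    print(msearch_body([
        ("artists", match_query("name", user_input)),
        ("albums", match_query("title", user_input)),
        ("tracks", match_query("title", user_input)),
    ]), end="")
```

Sending the three searches in one request saves round trips; the response carries one result set per search, in the same order as the request.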
How do we recommend our data?
● Using function score queries:
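A function score query wraps a normal query and rescales each hit's score with one or more functions. The sketch below is one possible shape, not the query used in production: the `tags` and `popularity` fields are hypothetical, and `field_value_factor` with the `log1p` modifier is just one of the available scoring functions. The helper `log1p_factor` reproduces the multiplier that modifier yields, as documented for Elasticsearch (common logarithm of one plus the scaled field value).

```python
import json
import math


def recommend_query(tags, popularity_field="popularity"):
    """Match on tags, then multiply each hit's score by a popularity factor."""
    return {
        "query": {
            "function_score": {
                "query": {"terms": {"tags": tags}},
                "functions": [
                    {
                        "field_value_factor": {
                            "field": popularity_field,
                            "modifier": "log1p",
                            "factor": 1.0,
                        }
                    }
                ],
                "boost_mode": "multiply",
            }
        }
    }


def log1p_factor(value, factor=1.0):
    """Multiplier given by the log1p modifier: log10(1 + factor * value)."""
    return math.log10(1 + factor * value)


if __name__ == "__main__":
    print(json.dumps(recommend_query(["rock", "indie"]), indent=2))
    print(log1p_factor(99))
```

The logarithm dampens the popularity signal, so a very popular item gets a boost without completely drowning out how well the tags match.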
How do we explore our data?
● Using aggregations:
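For illustration, a `terms` aggregation counts documents per distinct value of a field. The sketch below builds such a request and reads the buckets out of a response. The `genre` field, the aggregation name `by_genre`, and the sample response are all made up for the example; only the request and response shapes follow the Elasticsearch aggregation format.

```python
def terms_aggregation(name, field, size=10):
    """Search body that returns only aggregation buckets, no hits."""
    return {
        "size": 0,
        "aggs": {name: {"terms": {"field": field, "size": size}}},
    }


def buckets(response, name):
    """Extract (key, doc_count) pairs from a terms aggregation response."""
    return [
        (bucket["key"], bucket["doc_count"])
        for bucket in response["aggregations"][name]["buckets"]
    ]


if __name__ == "__main__":
    print(terms_aggregation("by_genre", "genre"))

    # Made-up response with the documented shape, for illustration only.
    sample_response = {
        "aggregations": {
            "by_genre": {
                "buckets": [
                    {"key": "rock", "doc_count": 3},
                    {"key": "jazz", "doc_count": 1},
                ]
            }
        }
    }
    print(buckets(sample_response, "by_genre"))
```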
Some feedback
● In numbers:
○ More than 25 million queries a day, around 5000 queries / minute
○ Around 95% of queries respond in less than 100 ms
● Lessons learned:
○ Be careful with fielddata usage
○ Big JVM ES instance = long GC time
○ Avoid prefix queries: use an edge-ngram tokenizer and match queries instead*
● In the future:
○ Use a dedicated client/data/master architecture
○ Stop fuzzy queries (replaced by a "Did you mean" approach)*
○ Migrate to Elasticsearch v2
*https://www.elastic.co/blog/elasticsearch-queries-or-term-queries-are-really-fast
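The advice to replace prefix queries can be sketched as follows. A prefix query has to expand the prefix against the term dictionary at search time, whereas edge n-grams move that cost to index time so that a plain `match` query becomes a cheap term lookup. The analyzer and tokenizer names, the gram sizes and the `title.autocomplete` field are assumptions for the example. The tokenizer type is spelled `edge_ngram` here; older releases also document it as `edgeNGram`, so check the version in use. The query text should be analyzed at search time without n-gramming, which is configured in the field mapping and not shown here.

```python
import json

# Hypothetical index settings: an analyzer that emits edge n-grams at index time.
AUTOCOMPLETE_SETTINGS = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "autocomplete_tokenizer": {
                    "type": "edge_ngram",
                    "min_gram": 2,
                    "max_gram": 10,
                    "token_chars": ["letter", "digit"],
                }
            },
            "analyzer": {
                "autocomplete": {
                    "type": "custom",
                    "tokenizer": "autocomplete_tokenizer",
                    "filter": ["lowercase", "asciifolding"],
                }
            },
        }
    }
}


def prefix_query(field, prefix):
    """The pattern to avoid: the prefix is expanded at search time."""
    return {"query": {"prefix": {field: prefix}}}


def match_query(field, text):
    """The replacement: a match query against the n-grammed field."""
    return {"query": {"match": {field: text}}}


if __name__ == "__main__":
    print(json.dumps(AUTOCOMPLETE_SETTINGS, indent=2))
    print(json.dumps(match_query("title.autocomplete", "black pe")))
```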
/04 ELK for analysis
Use of ELK
● Elasticsearch v1.7.5:
○ cluster of 3 nodes
○ indexes logs from Logstash and homemade scripts
○ around 2 billion documents
● Logstash 1.5
● Kibana v4.1.1
○ 26 dashboards / 189 visualisations
● Tools:
○ curator for index retention
○ elasticdump for saving Kibana settings
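Index retention for time-based log indices usually means deleting daily indices older than some cutoff. The sketch below shows only that selection logic and is not curator's API or configuration: the function name and the 30-day retention are hypothetical, and it assumes Logstash's default daily index naming, `logstash-YYYY.MM.dd`.

```python
from datetime import date, datetime, timedelta


def expired_indices(index_names, today, keep_days, prefix="logstash-"):
    """Return the daily indices (prefix + YYYY.MM.DD) older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not a Logstash index, e.g. .kibana
        try:
            day = datetime.strptime(name[len(prefix):], "%Y.%m.%d").date()
        except ValueError:
            continue  # suffix is not a date; leave the index alone
        if day < cutoff:
            expired.append(name)
    return sorted(expired)


if __name__ == "__main__":
    names = ["logstash-2016.01.01", "logstash-2016.03.01", ".kibana"]
    print(expired_indices(names, today=date(2016, 3, 10), keep_days=30))
```

Skipping every name that does not parse as a dated index keeps settings indices such as `.kibana` out of the deletion list.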
Use cases
Monitoring
Use cases
Analyzing what our users search for
Thanks for your attention
We are hiring!
jobs.deezer.com
Questions?
