Ad Personalization 
at Spotify 
Building personalized ad experiences through iterative 
engineering and product development 
Kinshuk Mishra 
Noel Cody
Ads Personalization at Spotify - NYC Data Engineering 10/23
Music… 
...is with you throughout the day. 
...fits your mood. 
...fits your activity.
Music… 
...is personal.
If your day looks like this: 
Wake up Work out Commute Focus at Work Relax at Home Sleep
Ads should follow. 
Wake up Work out Commute Focus at Work Relax at Home Sleep 
Ads
Ads should follow. 
Wake up Work out Commute Focus at Work (Classical) Relax at Home Sleep 
Electronic Music ad 
Not bad. WTF?
Why Personalization? 
“...it works well the advertisements are annoying though I am not a fan of 
mainstream music so hearing about pop bands is also driving me crazy” 
“Great way to listen to whatever music you want. The ads can be really 
annoying though since they don't seem to be targeted. I HATE rap music, yet I 
seem to get a lot of ads for it.”
Data confirms anecdotal evidence
AD PERSONALIZATION
User Stories 
Hypotheses + Goals 
Product MVPs + Experiments
User Stories to Hypotheses & Goals: 
● Context-aware ads 
● Music ads like music recommendations 
● Ads that learn
Hypotheses to Products: 
● Real-time genre targeting 
● Historic genre targeting 
● Real-time moment targeting
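The first two products can be illustrated with a minimal sketch. It assumes real-time genre targeting keys off the genre of the track playing now, and historic genre targeting keys off the user's most-played genres. The function names, the ad dictionaries, and the `top_n` cutoff are hypothetical and are not taken from the talk.

```python
from collections import Counter


def historic_genres(play_history, top_n=2):
    """Return the user's most-played genres (hypothetical 'historic' signal)."""
    counts = Counter(play_history)
    return [genre for genre, _ in counts.most_common(top_n)]


def eligible_ads(ads, current_genre, play_history, top_n=2):
    """Keep ads whose genre matches the current track or the listening history."""
    targets = {current_genre, *historic_genres(play_history, top_n)}
    return [ad["name"] for ad in ads if ad["genre"] in targets]


# Example: a listener who is mostly into classical and jazz.
history = ["classical"] * 5 + ["jazz"] * 3 + ["rap"]
ads = [
    {"name": "edm_festival", "genre": "electronic"},
    {"name": "piano_album", "genre": "classical"},
    {"name": "jazz_club", "genre": "jazz"},
]
matching = eligible_ads(ads, "classical", history)
```

Under these assumptions the electronic music ad is filtered out for a listener focused on classical music.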
(Product MVPs to Experiments) 
Control Variation 1 Variation 2
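One common way to split users across a control group and two variations is deterministic hashing. The sketch below is an assumption about how such an assignment could work, not a description of Spotify's experiment system. The bucket names and `assign_bucket` are hypothetical.

```python
import hashlib

BUCKETS = ("control", "variation_1", "variation_2")


def assign_bucket(user_id, experiment, buckets=BUCKETS):
    """Map a user to a bucket; the same inputs always give the same bucket."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return buckets[int(digest, 16) % len(buckets)]


bucket = assign_bucket("user-123", "realtime-genre-targeting")
```

Hashing on the experiment name as well as the user ID keeps assignments independent across experiments.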
INFRASTRUCTURE
Ad Targeting Architecture 
Feedback Loop
Ad Targeting Architecture 
OSS Data 
Infrastructure 
Spotify Backend 
Infrastructure
Ad Targeting Architecture V1.0 
COTS Data 
Infrastructure 
Real-time Targeting 
Spotify Backend 
Infrastructure
Ad Targeting Architecture V2.0 
Real-time + Batch Targeting 
(a.k.a. Lambda Architecture)
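In a Lambda Architecture, a query merges a precomputed batch view with a real-time view that covers events since the last batch run. A minimal sketch, assuming both views are per-user genre play counts (the view shapes and `merge_views` are hypothetical):

```python
from collections import Counter


def merge_views(batch_view, realtime_view):
    """Combine batch counts with counts from events seen since the batch run."""
    merged = Counter(batch_view)
    merged.update(realtime_view)  # adds counts key by key
    return dict(merged)


batch_view = {"classical": 40, "jazz": 5}    # e.g. from a nightly batch job
realtime_view = {"jazz": 2, "electronic": 1}  # e.g. from the streaming layer
profile = merge_views(batch_view, realtime_view)
```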
Ad Targeting Architecture V2.5 
Transition to Persistent User Profile
Ad Targeting Architecture V3.0 
Richer Profile Schema with Persistence
Tech Choices
Kafka 
● Kafka is a distributed, partitioned, replicated commit log service. 
● Guarantees 
● Kafka provides a total order over messages within a partition 
● Fault tolerance: tolerates up to N-1 server failures for replication factor N.
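The ordering guarantee can be illustrated with a toy in-memory log. This is not the Kafka client API. It assumes messages with the same key are routed to the same partition, and it uses CRC32 as a stand-in hash. `ToyPartitionedLog` and `tolerated_failures` are hypothetical names.

```python
import zlib


class ToyPartitionedLog:
    """In-memory stand-in for a partitioned, append-only log."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        """Append to the partition chosen by the key's hash; return its index."""
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        return partition

    def read(self, partition):
        """Return messages in the order they were appended to this partition."""
        return list(self.partitions[partition])


def tolerated_failures(replication_factor):
    """N replicas survive up to N-1 failures without losing committed data."""
    return replication_factor - 1


log = ToyPartitionedLog(num_partitions=3)
p_first = log.append("user-1", "play:track-a")
log.append("user-2", "play:track-x")
p_second = log.append("user-1", "play:track-b")
user1_events = [v for k, v in log.read(p_first) if k == "user-1"]
```

Order is only total within one partition, so events for a single user stay ordered only if they share a key.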
Ad Targeting Architecture V1.0 
COTS Data 
Infrastructure 
Real-time Targeting 
Spotify Backend 
Infrastructure
Storm
● Real time stream processing 
● Like Hadoop without HDFS
● Like Map/Reduce with many reducer steps 
● Fault tolerant and guaranteed message processing
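The "many reducer steps" idea can be sketched as a chain of stages that each consume a stream and emit a new one. The code below uses plain Python generators as an analogy. It is not the Storm API and it provides none of Storm's fault tolerance or processing guarantees. The stage names and the `user,genre` line format are assumptions.

```python
from collections import defaultdict


def parse(lines):
    """Stage 1: turn raw 'user,genre' lines into (user, genre) tuples."""
    for line in lines:
        user, genre = line.split(",")
        yield user, genre


def running_genre_counts(events):
    """Stage 2: keep a running play count per (user, genre)."""
    counts = defaultdict(int)
    for user, genre in events:
        counts[(user, genre)] += 1
        yield user, genre, counts[(user, genre)]


def top_genre(updates):
    """Stage 3: emit each user's current top genre after every update."""
    best = {}
    for user, genre, count in updates:
        if count > best.get(user, ("", 0))[1]:
            best[user] = (genre, count)
        yield user, best[user][0]


stream = ["u1,jazz", "u1,rock", "u1,rock"]
results = list(top_genre(running_genre_counts(parse(stream))))
```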
Storm: Testing (since 0.8.1) 
Storm: Visualization (since 0.9.2) 
Ad Targeting Architecture V2.0 
Real-time + Batch Targeting
Apache Crunch 
● Framework for writing, testing, and running MapReduce pipelines 
● Pipelines are composed of user-defined functions and higher-level 
abstractions of common MR tasks (filter, join, etc.)
Apache Crunch 
Data structures: 
● PCollection<T> 
● PTable<K,V> 
● PGroupedTable<K,V> 
Functions: 
● MapFn<T1,T2>: T1 → T2 
● CombineFn<K,V>: (K, Iterable<V>) → (K, V)
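The shapes of these functions can be mimicked in plain Python. This is an analogy only: Crunch itself is a Java library, and none of the names below are Crunch calls. Lists stand in for `PCollection`, lists of pairs for `PTable`, and a dict of lists for `PGroupedTable`.

```python
from collections import defaultdict


def map_fn(record):
    """MapFn analogue (T1 -> T2): a (user, genre) play becomes (genre, 1)."""
    _user, genre = record
    return genre, 1


def group_by_key(pairs):
    """PTable -> PGroupedTable analogue: collect values under each key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def combine_fn(key, values):
    """CombineFn analogue: (K, Iterable<V>) -> (K, V), here by summing."""
    return key, sum(values)


def count_by_genre(records):
    pairs = [map_fn(record) for record in records]
    grouped = group_by_key(pairs)
    return dict(combine_fn(key, values) for key, values in grouped.items())


plays = [("u1", "jazz"), ("u2", "jazz"), ("u1", "rock")]
genre_counts = count_by_genre(plays)
```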
Apache Crunch 
What’s wrong with plain Python Streaming MapReduce? 
● Testability 
● Optimization 
● Performance 
● IDE support 
● Type Safety 
● Lack of higher-level operations (filter/join/aggregate) 
From Spotify Presentation: Scalding the Crunchy Pig for Cascading into the Hive
Apache Crunch 
● About a 5x performance improvement over Python streaming MapReduce 
● Readable functional-style API in plain Java 
● Great local testing support 
● First-class support for Avro records. 
From Spotify Presentation: Scalding the Crunchy Pig for Cascading into the Hive
Ad Targeting Architecture V2.5 
Transition to Persistent User Profile
CASSANDRA vs. MEMCACHED
CASSANDRA
● Rich wide-column schema support
● Solid persistence and replication
● Slower reads
MEMCACHED
● K/V only
● TTL is default (in-memory mgmt)
● Rich schema
● Persistence
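The trade-off behind a TTL-based key/value cache can be shown with a toy example: entries vanish after their time to live, which suits short-lived real-time state but not a profile that must persist. The `TtlCache` class and its injectable clock are hypothetical and are not the Memcached client API.

```python
class TtlCache:
    """Toy key/value cache in which every entry expires after ttl_seconds."""

    def __init__(self, ttl_seconds, clock):
        self.ttl = ttl_seconds
        self.clock = clock  # callable returning the current time in seconds
        self.data = {}

    def set(self, key, value):
        self.data[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.data[key]  # expired: the profile data is gone
            return None
        return value


now = [0]  # fake clock so the example does not need to sleep
cache = TtlCache(ttl_seconds=60, clock=lambda: now[0])
cache.set("user-1", {"genre": "classical"})
fresh = cache.get("user-1")
now[0] = 61
stale = cache.get("user-1")
```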
Ad Targeting Architecture V3.0 
Richer Profile Schema with Persistence
DATA INGESTION: 
CASSANDRA 
CRUNCH 
STORM 
HDFS 
KAFKA 
LOGS
TESTING
User Stories 
Hypotheses + Goals 
Product MVPs + Experiments
AAR 
Vital Signs 
Ad-Specific Metrics
AAR 
Vital Signs 
Higher-level metrics 
are hard to move 
Ad-Specific Metrics
USER EXPERIENCE 
TEST ITERATION 
IMPACTS AAR
AAR 
Vital Signs 
Our focus Ad-Specific Metrics
Test evaluation 
● Positive Signals: CTR, Downstream Effects 
● Avoidance Signals: Volume, Audio Output 
● An “Ad Quality Score”
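One way to combine these signals into a single number is sketched below. The talk does not give the formula, so everything here is an assumption: avoidance is read as users lowering the volume or switching audio output during an ad, and the score is CTR minus a weighted avoidance rate. All function names and the default weight are hypothetical.

```python
def ctr(clicks, impressions):
    """Click-through rate: clicks per impression."""
    return clicks / impressions if impressions else 0.0


def avoidance_rate(volume_drops, output_switches, impressions):
    """Share of impressions with an assumed avoidance action."""
    if not impressions:
        return 0.0
    return (volume_drops + output_switches) / impressions


def ad_quality_score(clicks, volume_drops, output_switches, impressions,
                     avoidance_weight=1.0):
    """Hypothetical score: reward clicks, penalize avoidance."""
    penalty = avoidance_weight * avoidance_rate(
        volume_drops, output_switches, impressions
    )
    return ctr(clicks, impressions) - penalty


score = ad_quality_score(
    clicks=50, volume_drops=10, output_switches=10, impressions=1000
)
```

A score like this lets two test variations be compared on one axis, at the cost of hiding which signal moved.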
Thanks! 
(We’re hiring): 
spotify.com/us/jobs/
