Beyond Data
From User Information to Business Value

October, 2013
Xavier Amatriain
Director - Algorithms Engineering - Netflix

@xamat
“In a simple Netflix-style item recommender, we would simply apply some form of matrix factorization (i.e. NMF)”
From the Netflix Prize (2006) to today (2013)
Everything is personalized
Ranking

Over 75% of what people watch comes from a recommendation
Top 10
Personalization awareness

Diversity
But…
Support for Recommendations

Social Support
Genre Rows
Similars
EVERYTHING is a Recommendation
Consumer (Data) Science
1. Start with a hypothesis:
  ■ Algorithm/feature/design X will increase member engagement with our service, and ultimately member retention
2. Design a test
  ■ Develop a solution or prototype
  ■ Think about dependent & independent variables, control, significance…
3. Execute the test
4. Let data speak for itself
Offline/Online testing process
Executing A/B tests
Measure differences in metrics across statistically identical populations that each experience a different algorithm.

■ Decisions on the product are always data-driven
■ Overall Evaluation Criteria (OEC) = member retention
■ Use long-term metrics whenever possible
  ■ Short-term metrics can be informative and allow faster decisions
  ■ But they are not always aligned with the OEC
■ Significance and hypothesis testing (1000s of members and 2-20 cells)
■ A/B tests allow testing many (radical) ideas at the same time (typically 100s of customer A/B tests running)
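
The significance step can be illustrated with a two-proportion z-test on retention between a control cell and one treatment cell. This is only a minimal sketch: the choice of test, the function name `two_proportion_z_test`, and all counts are assumptions made for illustration, not something the slides specify.

```python
import math

def two_proportion_z_test(retained_a, n_a, retained_b, n_b):
    """Two-sided z-test for a difference in retention rate between two cells."""
    p_a = retained_a / n_a
    p_b = retained_b / n_b
    # Pooled retention rate under the null hypothesis that the cells are identical.
    p_pool = (retained_a + retained_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: members retained out of members allocated to each cell.
z, p = two_proportion_z_test(retained_a=8000, n_a=10000, retained_b=8120, n_b=10000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With many cells compared against the same control, a multiple-comparison correction would normally be applied on top of this; that step is left out here.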
Offline testing
■ Measure model performance using (IR) metrics
■ Offline performance is used as an indication to make informed decisions on follow-up A/B tests
■ A critical (and mostly unsolved) issue is how well offline metrics correlate with A/B test results
■ Extremely important to define an offline evaluation framework that maps to the online OEC
  ■ e.g. how to create training/testing datasets may not be trivial
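
One common way to build training/testing datasets that mimic online conditions is to split by time instead of at random, so the model is only evaluated on events that happened after its training data. The sketch below assumes that approach; the `(user_id, item_id, timestamp)` event layout, the helper names, and the toy data are hypothetical.

```python
from typing import List, Set, Tuple

Event = Tuple[str, str, int]  # (user_id, item_id, timestamp); assumed layout

def temporal_split(events: List[Event], cutoff: int) -> Tuple[List[Event], List[Event]]:
    """Train on events before `cutoff`; test on later events of users seen in training."""
    train = [e for e in events if e[2] < cutoff]
    known_users = {user for user, _, _ in train}
    test = [e for e in events if e[2] >= cutoff and e[0] in known_users]
    return train, test

def recall_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)

# Toy events, for illustration only.
events = [("u1", "a", 10), ("u1", "b", 20), ("u2", "a", 15), ("u2", "c", 30), ("u3", "d", 35)]
train, test = temporal_split(events, cutoff=25)
print(len(train), len(test))
print(recall_at_k(["c", "a", "b"], {"c", "x"}, k=2))
```

Dropping test users that never appear in training is a design choice; a cold-start evaluation would keep them and measure them separately.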
Data & Models
Big Data @Netflix
■ > 40M subscribers
■ Ratings: ~5M/day
■ Searches: > 3M/day
■ Plays: > 50M/day
■ Streamed hours:
  ○ 5B hours in Q3 2013

Data types (diagram labels): Time, Impressions, Metadata, Social, Geo-information, Member Behavior, Ratings, Device Info, Demographics
Smart Models

■ Regression models (Logistic, Linear, Elastic nets)
■ SVD & other MF models
■ Factorization Machines
■ Restricted Boltzmann Machines
■ Markov Chains & other graph models
■ Clustering (from k-means to HDP)
■ Deep ANN
■ LDA
■ Association Rules
■ GBDT/RF
■ …
SVD for Rating Prediction
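■ User factor vectors $p_u \in \mathbb{R}^f$ and item-factor vectors $q_v \in \mathbb{R}^f$
■ Baseline (bias) $b_{uv} = \mu + b_u + b_v$ (user & item deviation from average)
■ Predict rating as $\hat{r}_{uv} = b_{uv} + p_u^T q_v$
■ SVD++ (Koren et al.) asymmetric variation w. implicit feedback
$$\hat{r}_{uv} = b_{uv} + q_v^T \left( |R(u)|^{-1/2} \sum_{j \in R(u)} (r_{uj} - b_{uj})\, x_j + |N(u)|^{-1/2} \sum_{j \in N(u)} y_j \right)$$
■ Where
  ■ $q_v, x_v, y_v \in \mathbb{R}^f$ are three item factor vectors
  ■ Users are not parametrized, but rather represented by:
    ■ R(u): items rated by user u & N(u): items for which the user has given implicit preference (e.g. rated/not rated)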
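
The basic biased factorization model (global mean plus user and item biases plus the dot product of user and item factors) can be sketched in a few lines. This is an illustrative sketch, not the production model: training by stochastic gradient descent, the hyperparameters, and the toy ratings are all assumptions, and the asymmetric variant with implicit feedback is not implemented here.

```python
import math
import random

def predict(mu, b_u, b_v, p_u, q_v):
    """Predicted rating: global mean + user bias + item bias + p_u . q_v."""
    return mu + b_u + b_v + sum(p * q for p, q in zip(p_u, q_v))

def train_mf(ratings, n_factors=2, lr=0.01, reg=0.05, epochs=300, seed=0):
    """Fit biases and factors with plain SGD on (user, item, rating) triples."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({v for _, v, _ in ratings})
    b_u = {u: 0.0 for u in users}
    b_v = {v: 0.0 for v in items}
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {v: [rng.gauss(0, 0.1) for _ in range(n_factors)] for v in items}
    for _ in range(epochs):
        for u, v, r in ratings:
            err = r - predict(mu, b_u[u], b_v[v], p[u], q[v])
            # Gradient steps on the L2-regularized squared error.
            b_u[u] += lr * (err - reg * b_u[u])
            b_v[v] += lr * (err - reg * b_v[v])
            for f in range(n_factors):
                pu, qv = p[u][f], q[v][f]
                p[u][f] += lr * (err * qv - reg * pu)
                q[v][f] += lr * (err * pu - reg * qv)
    return mu, b_u, b_v, p, q

def rmse(ratings, model):
    """Root mean squared error of the model on the given triples."""
    mu, b_u, b_v, p, q = model
    sq = [(r - predict(mu, b_u[u], b_v[v], p[u], q[v])) ** 2 for u, v, r in ratings]
    return math.sqrt(sum(sq) / len(sq))

# Toy ratings, for illustration only.
ratings = [("alice", "m1", 5), ("alice", "m2", 3), ("bob", "m1", 4),
           ("bob", "m3", 1), ("carol", "m2", 2), ("carol", "m3", 1)]
model = train_mf(ratings)
print(f"training RMSE: {rmse(ratings, model):.3f}")
```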
Restricted Boltzmann Machines
■ Restrict the connectivity in ANN to make learning easier.
■ Only one layer of hidden units.
  ■ Although multiple layers are possible
■ No connections between hidden units.
■ Hidden units are independent given the visible states.
■ RBMs can be stacked to form Deep Belief Networks (DBN), the 4th generation of ANNs
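
Because hidden units are independent given the visible states, each hidden activation can be computed on its own as a sigmoid of its bias plus its weighted visible inputs. The sketch below shows only that conditional for binary units; the weight layout `weights[i][j]` and the example numbers are assumptions, and training (e.g. contrastive divergence) is not covered.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probabilities(visible, weights, hidden_bias):
    """P(h_j = 1 | v) for every hidden unit j.

    weights[i][j] connects visible unit i to hidden unit j (assumed layout).
    Each probability depends only on the visible vector, never on other hidden units.
    """
    return [
        sigmoid(hidden_bias[j] + sum(v_i * weights[i][j] for i, v_i in enumerate(visible)))
        for j in range(len(hidden_bias))
    ]

def sample_hidden(visible, weights, hidden_bias, rng):
    """Draw a binary hidden vector, one independent Bernoulli per unit."""
    return [1 if rng.random() < prob else 0
            for prob in hidden_probabilities(visible, weights, hidden_bias)]

# Hypothetical RBM with 3 visible and 2 hidden units.
weights = [[0.5, -1.0], [1.5, 0.0], [-0.5, 2.0]]
hidden_bias = [0.0, -1.0]
visible = [1, 0, 1]
print(hidden_probabilities(visible, weights, hidden_bias))
print(sample_hidden(visible, weights, hidden_bias, random.Random(0)))
```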
Ranking
■ Ranking = Scoring + Sorting + Filtering bags of movies for presentation to a user
■ Key algorithm, sorts titles in most contexts
■ Goal: Find the best possible ordering of a set of videos for a user within a specific context in real-time
■ Objective: maximize consumption & “enjoyment”
■ Factors
  ■ Accuracy
  ■ Novelty
  ■ Diversity
  ■ Freshness
  ■ Scalability
  ■ …
Example: Two features, linear model
[Figure: axes Popularity and Predicted Rating (1 to 5); result labeled Final Ranking]
Linear Model:
$f_{rank}(u,v) = w_1\, p(v) + w_2\, r(u,v) + b$
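
The two-feature linear model can be written directly as a scoring function followed by a sort. In this minimal sketch the weights, the candidate titles, and the dictionary field names are hypothetical; in practice the weights would be learned from data.

```python
def f_rank(popularity, predicted_rating, w1, w2, b):
    """Linear ranking score: w1 * p(v) + w2 * r(u, v) + b."""
    return w1 * popularity + w2 * predicted_rating + b

def rank_titles(candidates, w1, w2, b=0.0):
    """Score every candidate, then sort by descending score."""
    return sorted(
        candidates,
        key=lambda c: f_rank(c["popularity"], c["predicted_rating"], w1, w2, b),
        reverse=True,
    )

# Hypothetical candidates for one user: popularity in [0, 1], predicted rating in [1, 5].
candidates = [
    {"title": "A", "popularity": 0.9, "predicted_rating": 3.0},
    {"title": "B", "popularity": 0.2, "predicted_rating": 4.5},
    {"title": "C", "popularity": 0.5, "predicted_rating": 3.8},
]
print([c["title"] for c in rank_titles(candidates, w1=0.3, w2=1.0)])
print([c["title"] for c in rank_titles(candidates, w1=5.0, w2=0.0)])
```

Changing the balance between `w1` and `w2` moves the final ranking between a popularity-driven ordering and a purely personalized one.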
Ranking
Learning to Rank Approaches
■ ML problem: construct ranking model from training data
1. Pointwise (Ordinal regression, Logistic regression, SVM, GBDT, …)
  ■ Loss function defined on individual relevance judgment
2. Pairwise (RankSVM, RankBoost, RankNet, FRank…)
  ■ Loss function defined on pair-wise preferences
  ■ Goal: minimize number of inversions in ranking
3. Listwise
  ■ Indirect Loss Function (RankCosine, ListNet…)
  ■ Directly optimize IR measures (NDCG, MRR, FCP…)
    ■ Genetic Programming or Simulated Annealing
    ■ Use boosting to optimize NDCG (AdaRank)
    ■ Gradient descent on smoothed version (CLiMF, TFMAP, GAPfm @cikm13)
    ■ Iterative Coordinate Ascent (Direct Rank @kdd13)
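
Two of the quantities named above are easy to compute for a single ranked list: the number of inversions that pairwise methods try to minimize, and NDCG, one of the IR measures that listwise methods target. The sketch below only evaluates a ranking and does not train a ranker; the exponential-gain form of DCG is one common convention and is an assumption here, as are the relevance labels.

```python
import math

def count_inversions(ranked_relevance):
    """Pairs (i, j) with i < j where a less relevant item is ranked above a more relevant one."""
    n = len(ranked_relevance)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if ranked_relevance[i] < ranked_relevance[j]
    )

def dcg_at_k(ranked_relevance, k):
    """Discounted cumulative gain with gain 2^rel - 1 and a log2 position discount."""
    return sum(
        (2 ** rel - 1) / math.log2(pos + 2)
        for pos, rel in enumerate(ranked_relevance[:k])
    )

def ndcg_at_k(ranked_relevance, k):
    """DCG normalized by the DCG of the ideal (descending relevance) ordering."""
    ideal = dcg_at_k(sorted(ranked_relevance, reverse=True), k)
    return dcg_at_k(ranked_relevance, k) / ideal if ideal > 0 else 0.0

# Relevance labels in the order the ranker presented the items (hypothetical).
ranking = [2, 3, 0, 1]
print(count_inversions(ranking))
print(round(ndcg_at_k(ranking, k=4), 4))
```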
Other research questions we are working on
● Row selection
● Diversity
● Similarity
● Context-aware recommendations
● Explore/exploit
● Presentation bias correction
● Mood and session intent inference
● Unavailable Title Search
● ...
More data or better models?

Really?

Anand Rajaraman: Former Stanford Prof. & Senior VP at Walmart
More data or better models?
Sometimes, it’s not about more data
More data or better models?
[Banko and Brill, 2001]

Norvig: “Google does not have better Algorithms, only more Data”

Many features/low-bias models
More data or better models?
Sometimes, it’s not about more data
“Data without a sound approach = noise”
More data +
Smarter models +
More accurate metrics +
Better approaches
Lots of room for improvement!
Xavier Amatriain (@xamat)
xavier@netflix.com

Thanks!

We are hiring!

CIKM 2013 - Beyond Data: From User Information to Business Value
