Recommendation Systems
APAM E4990
Modeling Social Data
Jake Hofman
Columbia University
March 13, 2015
Jake Hofman (Columbia University) Recommendation Systems March 13, 2015 1 / 29
Personalized recommendations
http://netflixprize.com
http://netflixprize.com/rules
http://netflixprize.com/faq
Netflix prize: results
http://en.wikipedia.org/wiki/Netflix_Prize
Netflix prize: results
See [TJB09] and [Kor09] for more gory details.
http://bit.ly/beyond5stars
Recommendation systems
High-level approaches:
• Content-based methods
(e.g., w_{genre: thrillers} = +2.3, w_{director: coen brothers} = +1.7)
• Collaborative methods
(e.g., “Users who liked this also liked”)
Netflix prize: data
(userid, movieid, rating, date)
(movieid, year, title)
Collaborative filtering
Memory-based
(e.g., k-nearest neighbors)
Model-based
(e.g., matrix factorization)
http://research.yahoo.com/pub/2859
Problem statement
• Given a set of past ratings R_ui that user u gave item i
• Users may explicitly assign ratings, e.g., R_ui ∈ [1, 5] is the number of stars for a movie rating
• Or we may infer implicit ratings from user actions, e.g., R_ui = 1 if u purchased i; otherwise R_ui = ?
• Make recommendations of several forms
• Predict unseen item ratings for a particular user
• Suggest items for a particular user
• Suggest items similar to a particular item
• . . .
• Compare to natural baselines
• Guess the global average for item ratings
• Suggest globally popular items
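These baselines take only a few lines to compute. A minimal sketch, using made-up (user, item, rating) triples:

```python
from collections import defaultdict

# Baseline recommenders (sketch): predict every unseen rating with the
# global average, and suggest the globally most-rated items.
# All ids and ratings below are invented for illustration.
ratings = [("u1", "i1", 4.0), ("u2", "i1", 5.0), ("u2", "i2", 3.0)]

global_avg = sum(r for _, _, r in ratings) / len(ratings)

counts = defaultdict(int)
for _, item, _ in ratings:
    counts[item] += 1
popular = sorted(counts, key=counts.get, reverse=True)

print(round(global_avg, 2), popular)  # 4.0 ['i1', 'i2']
```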
k-nearest neighbors
Key intuition:
Take a local popularity vote amongst “similar” users
k-nearest neighbors
User similarity
Quantify similarity as a function of users’ past ratings, e.g.
• Fraction of items u and v have in common:

S_uv = |r_u ∩ r_v| / |r_u ∪ r_v| = Σ_i R_ui R_vi / Σ_i (R_ui + R_vi − R_ui R_vi)   (1)
Retain top-k most similar neighbors v for each user u
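As a sketch, with implicit ratings represented as each user's set of rated items (ids invented):

```python
# Jaccard similarity between two users' rating sets (illustrative sketch;
# assumes implicit 0/1 ratings stored as sets of item ids).
def jaccard(items_u, items_v):
    """Fraction of items rated by either user that both have rated."""
    common = items_u & items_v
    if not common:
        return 0.0
    return len(common) / len(items_u | items_v)

ru = {"movie_a", "movie_b", "movie_c"}
rv = {"movie_b", "movie_c", "movie_d"}
print(jaccard(ru, rv))  # 2 items in common out of 4 total -> 0.5
```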
k-nearest neighbors
User similarity
Quantify similarity as a function of users’ past ratings, e.g.
• Angle between rating vectors:

S_uv = (r_u · r_v) / (|r_u| |r_v|) = Σ_i R_ui R_vi / ( √(Σ_i R²_ui) √(Σ_j R²_vj) )   (1)
Retain top-k most similar neighbors v for each user u
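A sketch of the cosine version, with sparse rating vectors stored as dicts (item ids and rating values invented):

```python
import math

# Cosine similarity between two users' rating vectors, stored as sparse
# dicts mapping item id -> rating (illustrative sketch).
def cosine(ratings_u, ratings_v):
    dot = sum(r * ratings_v[i] for i, r in ratings_u.items() if i in ratings_v)
    norm_u = math.sqrt(sum(r * r for r in ratings_u.values()))
    norm_v = math.sqrt(sum(r * r for r in ratings_v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

print(cosine({"a": 5, "b": 3}, {"a": 5, "b": 3}))  # identical vectors -> 1.0
```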
k-nearest neighbors
Predicted ratings
Predict unseen ratings R̂_ui as a weighted vote over u’s neighbors’ ratings for item i:

R̂_ui = Σ_v R_vi S_uv / Σ_v S_uv   (2)
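Equation (2) in a few lines of Python (neighbor ids, similarities, and ratings are all made up):

```python
# Weighted vote over neighbors' ratings (sketch). `neighbor_sims` maps
# neighbor id -> similarity S_uv; `neighbor_ratings` maps neighbor id ->
# that neighbor's rating R_vi for the item in question.
def predict_rating(neighbor_sims, neighbor_ratings):
    num = sum(neighbor_ratings[v] * s for v, s in neighbor_sims.items()
              if v in neighbor_ratings)
    den = sum(s for v, s in neighbor_sims.items() if v in neighbor_ratings)
    return num / den if den > 0 else None

sims = {"v1": 0.9, "v2": 0.1}
ratings = {"v1": 5.0, "v2": 1.0}
print(predict_rating(sims, ratings))  # (5*0.9 + 1*0.1) / 1.0 = 4.6
```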
k-nearest neighbors
Practical notes
We expect most pairs of users to have nothing in common, so calculate
similarities as:
for each item i:
for all pairs of users u, v that have rated i:
calculate Suv (if not already calculated)
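One way to sketch that loop in Python, indexing users by item so that only pairs who co-rated something are ever touched (ids are hypothetical; the similarity computation itself is elided):

```python
from collections import defaultdict
from itertools import combinations

# Item-indexed pass over the ratings: only user pairs that co-rated some
# item are visited (sketch; `ratings` is a list of (user, item) pairs).
def corated_pairs(ratings):
    users_by_item = defaultdict(set)
    for user, item in ratings:
        users_by_item[item].add(user)
    pairs = set()
    for item, users in users_by_item.items():
        for u, v in combinations(sorted(users), 2):
            pairs.add((u, v))  # compute S_uv here, once per pair
    return pairs

print(corated_pairs([("u1", "i1"), ("u2", "i1"), ("u3", "i2")]))
# only u1 and u2 share an item -> {('u1', 'u2')}
```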
k-nearest neighbors
Practical notes
Alternatively, we can make recommendations using an item-based
approach [LSY03]:
• Compute similarities Sij between all pairs of items
• Predict ratings with a weighted vote R̂_ui = Σ_j R_uj S_ij / Σ_j S_ij
k-nearest neighbors
Practical notes
Several (relatively) simple ways to scale:
• Sample a subset of ratings for each user (by, e.g., recency)
• Use MinHash to cluster users [DDGR07]
• Distribute calculations with MapReduce
Matrix factorization
Key intuition:
Model item attributes as belonging to a set of unobserved “topics”,
and user preferences across these “topics”
Matrix factorization
Linear model
Start with a simple linear model:
R̂_ui = b_0 (global average) + b_u (user bias) + b_i (item bias)   (3)
Matrix factorization
Linear model
For example, we might predict that a harsh critic would score a
popular movie as
R̂_ui = 3.6 (global average) − 0.5 (user bias) + 0.8 (item bias)   (3)
     = 3.9   (4)
Matrix factorization
Low-rank approximation
Add an interaction term:

R̂_ui = b_0 (global average) + b_u (user bias) + b_i (item bias) + W_ui (user–item interaction)   (5)

where W_ui = p_u · q_i = Σ_k P_uk Q_ik
• P_uk is user u’s preference for topic k
• Q_ik is item i’s association with topic k
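A minimal numeric sketch of equation (5), with made-up bias and topic values (K = 2):

```python
# Biased low-rank prediction (sketch; every value here is invented).
b0, b_u, b_i = 3.6, -0.5, 0.8   # global average, user bias, item bias
p_u = [0.5, 1.0]                # P_uk: user u's preference for each topic
q_i = [1.2, -0.4]               # Q_ik: item i's association with each topic

w_ui = sum(p * q for p, q in zip(p_u, q_i))  # W_ui = p_u . q_i = 0.2
r_hat = b0 + b_u + b_i + w_ui
print(round(r_hat, 2))  # biases give 3.9, interaction adds 0.2 -> 4.1
```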
Matrix factorization
Loss function
Measure quality of model fit with squared-loss:
L = Σ_(u,i) ( R̂_ui − R_ui )²   (6)
  = Σ_(u,i) ( [P Q^T]_ui − R_ui )²   (7)
Matrix factorization
Optimization
The loss is non-convex in (P, Q), so we have no guarantee of finding a global minimum
Instead we can optimize L iteratively, e.g.:
• Alternating least squares: update each row of P, holding Q
fixed, and vice-versa
• Stochastic gradient descent: update individual rows pu and qi
for each observed Rui
Matrix factorization
Alternating least squares
L is convex in rows of P with Q fixed, and Q with P fixed, so
alternate solutions to the normal equations:
p_u = ( Q(u)^T Q(u) )⁻¹ Q(u)^T r(u)   (8)
q_i = ( P(i)^T P(i) )⁻¹ P(i)^T r(i)   (9)
where:
• Q(u) is the item association matrix restricted to items rated
by user u
• P(i) is the user preference matrix restricted to users that have
rated item i
• r(u) are ratings by user u and r(i) are ratings on item i
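A toy illustration, assuming a single topic (K = 1) so that equations (8) and (9) reduce to scalar divisions; the ratings and dimensions below are made up, and the example matrix happens to be exactly rank 1:

```python
# Rank-1 ALS sketch in pure Python. `R` holds observed ratings as
# (user, item) -> rating; unobserved entries are simply absent.
R = {(0, 0): 5.0, (0, 1): 1.0, (1, 0): 5.0, (1, 1): 1.0}
n_users, n_items = 2, 2
p = [1.0] * n_users  # user factors
q = [1.0] * n_items  # item factors

for _ in range(50):
    # update each user factor with item factors held fixed
    for u in range(n_users):
        rated = [i for i in range(n_items) if (u, i) in R]
        p[u] = sum(q[i] * R[(u, i)] for i in rated) / sum(q[i] ** 2 for i in rated)
    # update each item factor with user factors held fixed
    for i in range(n_items):
        raters = [u for u in range(n_users) if (u, i) in R]
        q[i] = sum(p[u] * R[(u, i)] for u in raters) / sum(p[u] ** 2 for u in raters)

print([round(p[u] * q[i], 2) for (u, i) in R])  # -> [5.0, 1.0, 5.0, 1.0]
```

Because the toy matrix is rank 1, the alternating updates recover the observed ratings exactly; on real data a larger K and regularization would be needed.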
Matrix factorization
Stochastic gradient descent
Alternatively, we can avoid inverting matrices by taking steps in
the direction of the negative gradient for each observed rating:
p_u ← p_u − η ∂L/∂p_u = p_u + η ( R_ui − R̂_ui ) q_i   (10)
q_i ← q_i − η ∂L/∂q_i = q_i + η ( R_ui − R̂_ui ) p_u   (11)

for some step-size η (absorbing the constant factor of 2 into η)
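The updates above can be sketched on toy data (all ids, ratings, and hyperparameters invented; biases omitted so only the interaction term is learned):

```python
import random

# SGD sketch for the interaction-only model with K = 2 topics.
# The factor of 2 from the squared loss is absorbed into the step size.
random.seed(0)
K, eta = 2, 0.05
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 1, 2.0)]
P = [[random.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(2)]
Q = [[random.uniform(-0.1, 0.1) for _ in range(K)] for _ in range(2)]

def predict(u, i):
    return sum(P[u][k] * Q[i][k] for k in range(K))

for _ in range(2000):
    for u, i, r in ratings:
        err = r - predict(u, i)              # R_ui - Rhat_ui
        for k in range(K):
            puk = P[u][k]
            P[u][k] += eta * err * Q[i][k]   # p_u <- p_u + eta*(R - Rhat)*q_i
            Q[i][k] += eta * err * puk       # q_i <- q_i + eta*(R - Rhat)*p_u

loss = sum((r - predict(u, i)) ** 2 for u, i, r in ratings)
print(f"final squared error: {loss:.4f}")
```

Note that the old value of p_u is saved before updating, so q_i is updated with the pre-step user factors, matching equations (10) and (11).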
Matrix factorization
Practical notes
Several ways to scale:
• Distribute matrix operations with MapReduce [GHNS11]
• Parallelize stochastic gradient descent [ZWSL10]
• Expectation-maximization for pLSI with MapReduce
[DDGR07]
Datasets
• MovieLens
http://www.grouplens.org/node/12
• Reddit
http://bit.ly/redditdata
• CU “million songs”
http://labrosa.ee.columbia.edu/millionsong/
• Yahoo Music KDD Cup
http://kddcup.yahoo.com/
• AudioScrobbler
http://bit.ly/audioscrobblerdata
• Delicious
http://bit.ly/deliciousdata
• . . .
References I
[DDGR07] A. S. Das, M. Datar, A. Garg, and S. Rajaram. Google News personalization: scalable online collaborative filtering. Page 280, 2007.
[GHNS11] R. Gemulla, P. J. Haas, E. Nijkamp, and Y. Sismanis. Large-scale matrix factorization with distributed stochastic gradient descent. 2011.
[Kor09] Yehuda Koren. The BellKor solution to the Netflix Grand Prize. Pages 1–10, Aug 2009.
[LSY03] G. Linden, B. Smith, and J. York. Amazon.com recommendations: item-to-item collaborative filtering. IEEE Internet Computing, 7(1):76–80, 2003.
References II
[TJB09] A. Töscher, M. Jahrer, and R. M. Bell. The BigChaos solution to the Netflix Grand Prize. 2009.
[ZWSL10] M. Zinkevich, M. Weimer, A. Smola, and L. Li. Parallelized stochastic gradient descent. In Neural Information Processing Systems (NIPS), 2010.