Class Outline 
• Introduction: Recommendation System 
• Collaborative Filtering 
– Calculating Similarities 
– Recommending Items 
• Content-based Recommendation 
• Collaborative Filtering in R: Beer 
Recommendation based on Beer Advocate
Recommendation System Demo 
http://blog.yhathq.com/posts/recommender-system-in-r.html
Examples 
• Retail: Amazon 
• Movie: Netflix 
• Friends: Facebook 
• Professional connection: LinkedIn 
• Websites: Reddit
Key Ideas 
• Intuition: Low-tech way to get a recommendation - ask your friends! 
– Some of your friends have better “taste” than others (like-minded) 
• Problem: Not scalable 
– As more and more options become available, it becomes less practical to 
decide what you want by asking a small group of people 
– They may not be aware of all the options 
• Solution: Collaborative filtering or KNN 
– Search a large group of people and find a smaller set with tastes similar to 
yours 
– Look at other things they like and combine them to create a ranked list of 
suggestions 
– Term first used by David Goldberg et al. (Xerox PARC, 1992): “Using collaborative filtering 
to weave an information tapestry.”
Input Data 
• Explicit (Questioning) 
– Explicit rating (1-5 numerical ratings) 
– Favorites (Likes): 1 (liked), 0 (No vote), -1 (disliked) 
• Implicit (Behavioral) 
– Purchase: 1 (bought), 0 (didn’t buy) 
– Clicks: 1 (clicked), 0 (didn’t click) 
– Reads: 1 (read), 0 (didn’t read) 
– Watching a Video: 1 (watched), 0 (didn’t watch) 
– Hybrid: 2 (bought), 1 (browsed), 0 (didn’t buy)
Recommendation vs. Prediction 
• Recommendations 
– Suggestions 
– Top-N 
• Predictions 
– Ratings 
– Purchase
Preference Data: Structure 
• Rows: Customers/Users 
• Columns: Items 
• Large matrix Y(u,i) 
– Many zeros (Sparse) 
– Number of users: large (order of millions) 
– Number of observations per customer: large (200+) 
– Time/sequence information ignored 

Customer ID  Lady in    Snakes on  Just My  Superman  You, Me,    The Night
             the Water  a Plane    Luck     Returns   and Dupree  Listener
Michael      2.5        3.5        3.0      3.5       2.5         3.0
Jay          -          3.5        -        -         3.5         3.0
July         -          3.5        3.0      4.0       2.5         4.5
Peter        3.0        4.0        2.0      3.0       -           3.0
Stephen      3.0        4.0        -        5.0       -           3.0
(- = not rated)
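A minimal sketch of how the preference matrix Y(u,i) might be stored so that missing ratings take no space. The nested-dictionary layout and the names `ratings`, `y`, and `sparsity` are assumptions for illustration, not something the slides prescribe. The positions of the missing cells are inferred from the per-movie totals on the "Recommending Items – 2/2" slide.

```python
# Ratings from the slide, stored sparsely: only observed cells are kept.
ratings = {
    "Michael": {"Lady in the Water": 2.5, "Snakes on a Plane": 3.5,
                "Just My Luck": 3.0, "Superman Returns": 3.5,
                "You, Me, and Dupree": 2.5, "The Night Listener": 3.0},
    "Jay": {"Snakes on a Plane": 3.5, "You, Me, and Dupree": 3.5,
            "The Night Listener": 3.0},
    "July": {"Snakes on a Plane": 3.5, "Just My Luck": 3.0,
             "Superman Returns": 4.0, "You, Me, and Dupree": 2.5,
             "The Night Listener": 4.5},
    "Peter": {"Lady in the Water": 3.0, "Snakes on a Plane": 4.0,
              "Just My Luck": 2.0, "Superman Returns": 3.0,
              "The Night Listener": 3.0},
    "Stephen": {"Lady in the Water": 3.0, "Snakes on a Plane": 4.0,
                "Superman Returns": 5.0, "The Night Listener": 3.0},
}


def y(user, item):
    """Y(u, i): the rating, or None when the user has not rated the item."""
    return ratings.get(user, {}).get(item)


# All items that appear in at least one user's ratings (the matrix columns).
items = sorted({item for prefs in ratings.values() for item in prefs})

n_cells = len(ratings) * len(items)                  # users x items
n_observed = sum(len(prefs) for prefs in ratings.values())
sparsity = 1 - n_observed / n_cells                  # share of empty cells

print(f"{n_observed} of {n_cells} cells observed, sparsity = {sparsity:.2f}")
```

This toy matrix is mostly filled in; with millions of users and a large catalog the share of empty cells is far higher, which is why a sparse layout matters.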
Collaborative Filtering Tasks 
• 1. Finding Similar Users: Calculating Similarities 
• 2. Ranking the Users 
• 3. Recommending Items based on weighted preference data
Finding Similar Users 
• Calculate pair-wise similarities 
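– Euclidean distance: simple, but sensitive to rating inflation (a user who consistently rates higher looks dissimilar) 
– Cosine similarity: works better with binary/fractional data 
– Pearson correlation: suited to continuous variables (e.g. numerical ratings); corrects for rating inflation 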
– Others: Jaccard coefficient, Manhattan distance
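The measures listed above can be sketched in a few lines of plain Python. The function names and the two toy rating vectors are assumptions for illustration. Returning 0.0 when a vector has no variance or zero length is a convention chosen here, not part of the definitions.

```python
from math import sqrt


def euclidean_distance(x, y):
    """Straight-line distance between two equally long rating vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan_distance(x, y):
    """Sum of absolute rating differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def cosine_similarity(x, y):
    """Cosine of the angle between the two vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def pearson_correlation(x, y):
    """Correlation of the mean-centred vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def jaccard_coefficient(a, b):
    """Overlap of two item sets (e.g. items bought): |A and B| / |A or B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Rating inflation: same ordering of items, but one user rates 2 points higher.
harsh = [1.0, 2.0, 3.0]
generous = [3.0, 4.0, 5.0]
print(euclidean_distance(harsh, generous))   # large distance despite same taste
print(pearson_correlation(harsh, generous))  # correlation is (close to) 1
```

The last two lines show why Euclidean distance is said to suffer from rating inflation: the two users rank the items identically, yet only the Pearson correlation treats them as alike.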
Ranking the Users 
• Focal customer 
– Toby: Preference Vector (“Snakes on a Plane”: 4.5, “You, Me, and Dupree”: 1.0, 
“Superman Returns”: 4.0) 

Customer ID  Pearson Correlation Similarity
Michael      0.99
Jay          0.38
July         0.89
Peter        0.92
Stephen      0.66

– Top 3 matches: Michael, Peter, July -> Like-minded!
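A minimal sketch of this step, assuming the Pearson correlation is computed only over the movies both customers have rated. The helper names `pearson_on_shared` and `top_matches` are hypothetical. Michael's and July's ratings come from the preference table; the remaining similarities are taken as printed on the slide rather than recomputed.

```python
from math import sqrt

toby = {"Snakes on a Plane": 4.5, "You, Me, and Dupree": 1.0,
        "Superman Returns": 4.0}
michael = {"Lady in the Water": 2.5, "Snakes on a Plane": 3.5,
           "Just My Luck": 3.0, "Superman Returns": 3.5,
           "You, Me, and Dupree": 2.5, "The Night Listener": 3.0}
july = {"Snakes on a Plane": 3.5, "Just My Luck": 3.0,
        "Superman Returns": 4.0, "You, Me, and Dupree": 2.5,
        "The Night Listener": 4.5}


def pearson_on_shared(a, b):
    """Pearson correlation over the items rated by both customers."""
    shared = [item for item in a if item in b]
    if len(shared) < 2:
        return 0.0  # convention chosen here: too little overlap to correlate
    x = [a[item] for item in shared]
    y = [b[item] for item in shared]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sd_x = sqrt(sum((p - mean_x) ** 2 for p in x))
    sd_y = sqrt(sum((q - mean_y) ** 2 for q in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def top_matches(similarity, n=3):
    """Customer IDs ordered by similarity, most similar first."""
    return sorted(similarity, key=similarity.get, reverse=True)[:n]


print(round(pearson_on_shared(toby, michael), 2))  # agrees with the slide: 0.99
print(round(pearson_on_shared(toby, july), 2))     # agrees with the slide: 0.89

# Similarities exactly as printed on the slide.
similarity = {"Michael": 0.99, "Jay": 0.38, "July": 0.89,
              "Peter": 0.92, "Stephen": 0.66}
print(top_matches(similarity))
```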
Recommending Items – 1/2 
• Problems: If we only use the single most like-minded customer 
– May accidentally turn up a customer who hasn’t reviewed some of the movies 
that I might like 
– Could return a customer who strangely liked a movie that got bad reviews from 
all other customers 
• Solution: Score each item with a weighted average of all customers’ ratings, 
weighting each customer by similarity to the focal customer
Recommending Items – 2/2 

Customer ID     Similarity  Lady in    Snakes on  Just My  Superman  You, Me,    The Night
                            the Water  a Plane    Luck     Returns   and Dupree  Listener
Michael         0.99        2.5        3.5        3.0      3.5       2.5         3.0
Jay             0.38        -          3.5        -        -         3.5         3.0
July            0.89        -          3.5        3.0      4.0       2.5         4.5
Peter           0.92        3.0        4.0        2.0      3.0       -           3.0
Stephen         0.66        3.0        4.0        -        5.0       -           3.0
Total                       8.5        18.5       8        15.5      8.5         16.5
Sim. Sum                    2.57       3.84       2.8      3.46      2.26        3.84
Total/Sim. Sum              3.31       4.82       2.86     4.48      3.76        4.30
(- = not rated)
Rank (by Total/Sim. Sum): 1, 2, 3
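A sketch of the similarity-weighted scoring described on the previous slide, restricted to the movies Toby has not rated. The function name `recommend` and the choice to skip non-positive similarities are assumptions. Note that the Total row on the slide matches the plain sum of ratings (for example 2.5 + 3.0 + 3.0 = 8.5 for Lady in the Water), whereas this sketch multiplies each rating by the customer's similarity before summing, so its scores differ from the Total/Sim. Sum row.

```python
ratings = {
    "Michael": {"Lady in the Water": 2.5, "Snakes on a Plane": 3.5,
                "Just My Luck": 3.0, "Superman Returns": 3.5,
                "You, Me, and Dupree": 2.5, "The Night Listener": 3.0},
    "Jay": {"Snakes on a Plane": 3.5, "You, Me, and Dupree": 3.5,
            "The Night Listener": 3.0},
    "July": {"Snakes on a Plane": 3.5, "Just My Luck": 3.0,
             "Superman Returns": 4.0, "You, Me, and Dupree": 2.5,
             "The Night Listener": 4.5},
    "Peter": {"Lady in the Water": 3.0, "Snakes on a Plane": 4.0,
              "Just My Luck": 2.0, "Superman Returns": 3.0,
              "The Night Listener": 3.0},
    "Stephen": {"Lady in the Water": 3.0, "Snakes on a Plane": 4.0,
                "Superman Returns": 5.0, "The Night Listener": 3.0},
}
# Similarities to Toby, as printed on the slide.
similarity = {"Michael": 0.99, "Jay": 0.38, "July": 0.89,
              "Peter": 0.92, "Stephen": 0.66}
seen_by_toby = {"Snakes on a Plane", "You, Me, and Dupree", "Superman Returns"}


def recommend(ratings, similarity, seen):
    """Rank unseen items by sum(similarity * rating) / sum(similarity)."""
    weighted_total, sim_sum = {}, {}
    for user, prefs in ratings.items():
        sim = similarity.get(user, 0.0)
        if sim <= 0:
            continue  # assumption: ignore dissimilar customers
        for item, rating in prefs.items():
            if item in seen:
                continue
            weighted_total[item] = weighted_total.get(item, 0.0) + sim * rating
            sim_sum[item] = sim_sum.get(item, 0.0) + sim
    scores = {item: weighted_total[item] / sim_sum[item]
              for item in weighted_total}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


ranked = recommend(ratings, similarity, seen_by_toby)
for item, score in ranked:
    print(f"{item}: {score:.2f}")
```

Dividing by the sum of similarities keeps a movie from scoring higher only because more customers rated it, and keeps each score on the original 1-5 rating scale.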
Problems with Collaborative Filtering 
• When data are sparse, correlations (weights) are based on very 
few common items -> unreliable 
• It cannot handle new items 
• It does not incorporate attribute information 
• Alternative way: content-based recommendations 
– Let’s use attribute information!
Content-based Recommendations 
• 1. Define features and feature values (similar to conjoint analysis) 
• 2. Describe each item as a vector of features 
• 3. Develop a user profile: the types of items this user likes 
– A weighted vector of item attributes 
– Weights denote the importance of each attribute to the user 
• 4. Recommend items that are similar to those that a user liked in the 
past 
• Note 1: Similar to information retrieval (text mining) 
• Note 2: Pre-computation possible; More scalable -> Used by Amazon
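The four steps above can be sketched as follows. Everything concrete here is hypothetical: the beer names, the four features, their values, and the user's ratings are made up for illustration, and building the profile as a rating-weighted average of item vectors is one simple choice among several.

```python
from math import sqrt

# Steps 1-2: hypothetical features [hoppy, dark, strong, sour], one vector per item.
items = {
    "IPA A":        [1.0, 0.0, 0.5, 0.0],
    "Stout B":      [0.2, 1.0, 0.8, 0.0],
    "Double IPA C": [1.0, 0.1, 1.0, 0.0],
    "Porter D":     [0.1, 0.9, 0.4, 0.0],
    "Sour E":       [0.0, 0.0, 0.2, 1.0],
}
# Hypothetical past ratings of one user.
user_ratings = {"IPA A": 5.0, "Stout B": 2.0}


def build_profile(user_ratings, items):
    """Step 3: rating-weighted average of the vectors of rated items."""
    n_features = len(next(iter(items.values())))
    total = sum(user_ratings.values())
    profile = [0.0] * n_features
    for name, rating in user_ratings.items():
        for k, value in enumerate(items[name]):
            profile[k] += rating * value / total
    return profile


def cosine_similarity(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def recommend(user_ratings, items):
    """Step 4: rank unrated items by similarity to the user profile."""
    profile = build_profile(user_ratings, items)
    scores = {name: cosine_similarity(profile, vector)
              for name, vector in items.items() if name not in user_ratings}
    return sorted(scores, key=scores.get, reverse=True)


print(recommend(user_ratings, items))
```

Because the item vectors do not depend on any user, they can be computed ahead of time, and a brand-new item can be recommended as soon as its features are known.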
More ideas for improvement 
• Ensemble methods (combining algorithms) 
– Most advanced/commercial algorithms combine kNN, matrix 
factorization (handling large/sparse matrix), and other classifiers 
• Marginal propensity to buy with/without recommendation 
(instead of probability of purchase) 
– Anand Bodapati (JMR 2008): a “Customers who buy this product buy these 
other products” kind of recommendation system frequently 
recommends what customers would have bought anyway, so the 
recommendation often only accelerates purchases rather 
than expanding sales 
• Incorporate text reviews: text review data can be used as a basis 
to calculate similarities (i.e. text mining) 
– Basic methods only rely on numerical ratings/purchase data
• Collaborative Filtering in R: Beer 
Recommender from “Beer Advocate” Data
Beer Advocate Data 
• Each record: a beer’s name, brewery, metadata (style, ABV), 
(numerical) ratings (1-5), short text reviews (250-5000 characters) 
• ~1.5 million reviews posted on Beer Advocate from 1999 to 
2011.
Collaborative Filtering in R – 1/2 
Step 0) Install “R” and Packages 
R program: http://www.r-project.org/ 
Package: http://cran.r-project.org/web/packages/tm/index.html 
Package: http://cran.r-project.org/web/packages/twitteR/index.html 
Package: http://cran.r-project.org/web/packages/wordcloud/index.html 
Manual: http://cran.r-project.org/web/packages/tm/vignettes/tm.pdf 
Step 1) Collecting preference data (ratings) from 
“Beer Advocate” : Web crawling
Recommendation System in R – 2/2 
Step 2) Cleaning/Formatting Data (Python) 
Step 3) Importing Data into R 
Step 4) Finding Similar Users: Calculating 
Similarities 
Step 5) Ranking the Users 
Step 6) Recommending Items
Reference 
Books 
• Programming Collective Intelligence (Toby Segaran) 
• Machine Learning for Hackers (Drew Conway and John Myles White) 
Articles (more technical) 
• Internet Recommendation Systems (Asim Ansari, Skander Essegaier, Rajeev Kohli) 
• Recommendation Systems with Purchase Data (Anand Bodapati)

Introduction to Recommendation System
