Recommender Systems (RS) and Active Learning (AL)
Neil Rubens & Dain Kaplan
June 2016
• Value of RS
• RS Methods
• RS Objectives for Startups
• AL for RS
Outline
Value of Recommender Systems
http://previews.123rf.com/images/lisafx/lisafx1004/lisafx100400069/6903279-Customer-getting-advice-about-tape-from-a-hardware-store-clerk-Isolated-on-white--Stock-Photo.jpg
Why Recommender Systems?
User
System
Items
• User Objectives
• finding needed item
• value
• utility
• enjoyment
• novelty
• serendipity
• etc.
• System Objectives
• revenue
• profit
• promote partners
• # of users
• # of visits
• time spent
• etc.
Objectives may overlap!
RS Objectives
Value of RS
• Amazon: 35% of sales from recommendations
• Netflix: 2/3 of the movies watched are recommended
• Choicestream: 28% of the people would buy more music if they found what they liked
• Google News: recommendations generate 38% more click-throughs
www.slideshare.net/kerveros99/machine-learning-for-recommender-systems-mlss-2015-sydney
RS Methods
User
Items
Predictions
What’s an RS?
[Figure: Sarah’s known ratings (Like, Hate) and the unknown ratings (?) that the RS must predict]
• Assumption: “similar” users/items keep having similar preferences/ratings
• Similarity: can be defined in a variety of ways
Common Approach
Use ratings to estimate “similarity”
Collaborative Filtering (CF)
Users
Items Ratings
Love
Like
Okay
Dislike
Hate
https://buildingrecommenders.wordpress.com/2015/11/23/overview-of-recommender-algorithms-part-5/
User-based CF: users with similar likes and dislikes are treated as similar, e.g., if you and Sarah have similar tastes, then you will probably like what Sarah likes (and vice versa)
Item-based CF: similar items receive similar ratings, e.g., if you liked book A, you will probably also like book B, which has been rated similarly
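A minimal sketch of user-based CF may help make the idea concrete. It assumes the five rating labels are encoded as numbers from +2 (Love) to -2 (Hate) and uses cosine similarity, which is only one of many possible similarity measures. The users, items, and function names are hypothetical.

```python
from math import sqrt

# Hypothetical ratings: Love=+2, Like=+1, Okay=0, Dislike=-1, Hate=-2.
# Items missing from a user's dict have not been rated yet.
ratings = {
    "sarah": {"A": 2, "B": 1, "C": -2},
    "you":   {"A": 2, "B": 2},
    "bob":   {"A": -2, "B": -1, "C": 2},
}

def cosine(u, v):
    """Cosine similarity computed over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(u[i] ** 2 for i in common))
    norm_v = sqrt(sum(v[i] ** 2 for i in common))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def predict(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, their in ratings.items():
        if other == user or item not in their:
            continue
        w = cosine(ratings[user], their)
        num += w * their[item]
        den += abs(w)
    return num / den if den else None

print(predict("you", "C"))
```

Here "you" agree with Sarah and disagree with Bob on items A and B, so the prediction for item C follows Sarah's rating and inverts Bob's. Item-based CF can be sketched the same way by computing similarities between item columns instead of user rows.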
MODEL
MODEL (USER) BASED ESTIMATION
MODEL (ITEM) BASED ESTIMATION
ACTIVE LEARNING
Gediminas Adomavicius, Alexander Tuzhilin, "Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions", IEEE Transactions on Knowledge & Data Engineering, vol. 17, no. 6, pp. 734-749, June 2005, doi:10.1109/TKDE.2005.99
Tailored to:
• domains
• item types
• data types
• objectives
• etc.
VARIETY OF RS APPROACHES
RS Objectives for Startups
Established Companies: “cruise mode”
• Many existing loyal users
• RS used to increase per-user metrics, e.g., revenue, profit, etc.
Startups: “launch mode”
• Still building user base
• RS used to attract/retain new users
Startups = Growth
“The only essential thing is
growth. Everything else we
associate with startups follows
from growth.”
(Paul Graham, Y Combinator)
https://caferacerlaurbanabike.files.wordpress.com/2015/02/new-store-coming-soon.jpg
Expecting many:
• new users
• new items
“Cold Start” Problem
[Figure: rating matrix in which the row of a New User and the column of a New Item contain only unknown ratings (?)]
• RS needs user/item data to make recommendations with CF
• For new users/new items, no data is available yet:
• New item problem
• New user problem
• Problem: don’t have any reviews yet (to base recommendations on)
• Solution: can use content-based item similarity (to bootstrap recommendations)
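As a rough illustration of content-based bootstrapping, the sketch below compares a new, unrated item with catalog items by the overlap of their attribute tags (Jaccard similarity). The tags are invented for the example and the function names are hypothetical; a real system would use whatever item metadata it has.

```python
def jaccard(a, b):
    """Share of attributes two items have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Catalog items that already have ratings; attribute tags are made up.
catalog = {
    "Air Jordan 1 Retro High OG": {"shoe", "high-top", "jordan", "retro"},
    "Jordan Jumpman Team II": {"shoe", "jordan", "basketball"},
    "Hurley One And Only Printed": {"apparel", "printed"},
}

# New item with no ratings yet, e.g. "Air Jordan 1 Retro High Nouveau".
new_item_tags = {"shoe", "high-top", "jordan", "retro", "nouveau"}

def most_similar(new_tags, items, k=2):
    """Names of the k catalog items whose attributes best match the new item."""
    ranked = sorted(items, key=lambda name: jaccard(new_tags, items[name]), reverse=True)
    return ranked[:k]

print(most_similar(new_item_tags, catalog))
```

The new item can then be shown to users who liked its nearest neighbours until it collects enough ratings of its own for CF.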
New Item Problem
[Figure: example items]
• Jordan Jumpman Team II
• Air Jordan 1 Retro High Nouveau
• Hurley One And Only Printed
• Air Jordan 1 Retro High OG
http://cache2.asset-cache.net/gc/558944927-side-view-of-man-opening-cafe-door-gettyimages.jpg?v=1&c=IWSAsset&k=2&d=BrH9aKEkYRiNc1pWhEX0etmgH38bczDi5XkuRcvp%2Bb9LQTmCIaIUqwLdVhpVf%2B9B
New User Problem
• It is very important to make a good first impression: a bad first impression may lose a potential user
• Problem: can’t make personalised recommendations (no data on the user yet)
Importance of Good
Recommendations
Seriously?
http://www.smh.com.au/content/dam/images/2/5/u/n/g/image.related.articleLeadwide.620x349.25ume.png/1347596915177.jpg
Learning New User Preferences
• Talking: learn about the user implicitly/explicitly
• Stalking: obtain data indirectly
• Contacts: friends may already be users of the app (and are likely to have similar interests)
• Location
• Device type
• Social profile
NOTE: should not be intrusive
Indirect Data
http://orangewebsitedesign.com/wp-content/uploads/2015/01/Jim-working-on-website-design-for-client-on-glass-wall.jpg
System Interaction Data
• How: learn about the user through implicit/explicit interaction
• clicks (or their absence)
• duration
• navigation paths
• etc.
• What: make the interaction more informative through item selection
• position
• attributes
• grouping
Active Learning (AL)
for
Recommender Systems
Item Selection
• RS presents items for two primary purposes:
  • Recommend an item that the user will like: popular items, i.e., items that everyone likes (but these reveal little about the user’s preferences)
  • Present an item to learn about the user’s preferences (Active Learning, AL): contentious items, i.e., items that many people like and many dislike (informative about the user’s preferences)
• In practice, multiple items are shown for different objectives
AL Categories
• Item-based AL: analyse items and select those that seem most informative
• Model-based AL: analyse the model and select items that seem most informative
Item Categories
• Popular: rated by many users [Rashid 2002]
• High Variance in Ratings: items that people either like or hate [Rashid 2002]
• Best/Worst: ask the user which items s/he likes most/least [Leino & Raiha 2007]
• Influential: items on which ratings of many other items depend (representative + not represented) [Rubens & Sugiyama 2007]
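The first two item categories can be scored directly from existing ratings. The following is a small sketch under simple assumptions: popularity is the number of ratings an item has received, and contentiousness is the population variance of those ratings. The item names and ratings are made up.

```python
from statistics import pvariance

# Hypothetical ratings (1-5) that each item has received so far.
item_ratings = {
    "blockbuster": [4, 4, 5, 4, 4, 5],
    "cult_film": [1, 5, 1, 5],
    "obscure": [3],
}

def popularity(ratings):
    """Number of users who rated the item."""
    return len(ratings)

def rating_variance(ratings):
    """Spread of the ratings; high when people either like or hate the item."""
    return pvariance(ratings) if len(ratings) > 1 else 0.0

most_popular = max(item_ratings, key=lambda i: popularity(item_ratings[i]))
most_contentious = max(item_ratings, key=lambda i: rating_variance(item_ratings[i]))

print(most_popular, most_contentious)
```

A popular item is likely to get rated but tells the system little, while a high-variance item is more informative about the individual user, which is the trade-off the list above describes.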
[Figure: candidate items (a), (b), (c), (d) plotted against input1 and input2]
• 3R Properties:
• Represented by the existing training set? E.g., (b) is already represented
• Representative of others? E.g., (a) is not representative of others
• Results in achieving the objective? E.g., (d) → max coverage
[Rubens & Kaplan, 2010]
Item-based AL
Illustrative Example: movies
are clustered by genre
Item Selection: Learning User Preferences
[Figure: items plotted on axes X1 and X2]
Limited information due to few items
Simply Not Useful
[Figure: items plotted on axes X1 and X2; ratings marked positive or negative]
System: limited knowledge
User: not much variety, may get bored
User Satisfaction
Drawback
[Figure: items plotted on axes X1 and X2]
User exposed to disliked items
Coverage
Drawback
[Figure: decision boundaries in three panels: Actual Model, Random Sampling, Active Learning]
Prediction Accuracy
[Figure: three panels on axes X1 and X2: Initial, Improve Margin/Confidence, Improve Orientation]
AL Model Error
$g$: optimal function (in the solution space)
$\hat{f}$: learned function
$\hat{f}_i$’s: learned functions from slightly different training sets
$E_G = B + V + C$
$B = \left(E\hat{f}(x) - g(x)\right)^2$
$V = \left(\hat{f}(x) - E\hat{f}(x)\right)^2$
$C = \left(g(x) - f(x)\right)^2$
Model Error – C: constant, so it is ignored
Bias – B: hard to estimate, but assumed to vanish (asymptotically)
Variance – V: estimate and minimize
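One way to picture the variance term V is to refit the model on slightly different training sets and look at how much its predictions disagree at a candidate point. The sketch below does this with leave-one-out refits of a one-dimensional least-squares line. The linear model, the leave-one-out scheme, and the data are all assumptions chosen for illustration, not the method used in the slides.

```python
from statistics import pvariance

def fit_line(points):
    """Ordinary least-squares line through (x, y) points; returns a predictor."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return lambda x: mean_y + slope * (x - mean_x)

def prediction_variance(train, x):
    """Variance of predictions at x across leave-one-out refits of the model."""
    models = [fit_line(train[:i] + train[i + 1:]) for i in range(len(train))]
    return pvariance([model(x) for model in models])

# Hypothetical training data: (input, rating-like output) pairs.
train = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.9), (5, 5.1)]

# Candidate inputs: one in the middle of the data, one far outside it.
candidates = [3, 10]
most_uncertain = max(candidates, key=lambda x: prediction_variance(train, x))
print(most_uncertain)
```

The refitted models agree closely in the middle of the training data and diverge far away from it, so a variance-minimizing AL strategy would ask for a label at the distant candidate first.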
Table 1: Performance comparison of active learning strategies (“XX” Very Good, “X” Good, “ ” Poor, “-” Not Available)
ML: Movielens, NF: Netflix, EM: EachMovie, AWM: Active Web Museum, MP: MyPersonality, STS: South Tyrol Suggests, LF: Last.fm
Columns: Type | Strategy | Metric (MAE/RMSE, NDCG/MAP, Precision, #Rating) | Eval. (Online, Offline) | Compar. Strategies | Datasets
Non-Personalized
Single
uncertainty based
1. variance [59, 61] X - - - - y 2, 4, 6, 9, 24 AWM, EM
2. entropy [20, 67] - - - - y 3, 6, 8, 9, 11, 13, 22 EM
3. entropy0 [67] XX - - XX y y 2, 6, 8, 11, 13, 22 ML
error reduction
4. greedy extend [68] X - - - - y 2, 3, 6, 7, 10, 11 NF
5. representative [69] - XX XX - - y 6 NF, ML, LF
attention based
6. popularity [20, 67] X - - XX y y 2, 8, 9, 11, 13, 22 ML
7. co-coverage [68] - - - - y 2, 3, 4, 6, 10, 11 NF
Combined
static combin.
8. rand-pop [20, 67] - - y y 2, 3, 6, 11, 13, 22 ML
9. log(pop)*entropy [20] XX - - X y y 3, 6, 8, 13 ML
10. sqrt(pop)*var [68] X - - - - y 2, 3, 4, 6, 7, 11 NF
11. HELF [67] XX - - y y 2, 3, 6, 8, 13, 22 ML
12. non-pers-part rand. [11] X XX X - y 1, 6, 9, 12, 14, 20, 21, 28, 29 ML, NF
Personalized
Single
acquisition prob.
13. item-item [20, 67] - - XX y y 2, 3, 6, 8, 9, 11, 22 ML
14. binary-pred [11, 12] X XX X - y 1, 6, 9, 12, 20, 21, 28, 29 ML, NF
15. personality-based [70, 97] XX XX - XX y y 3, 9, 14 STS, MP
16. impact analysis [71] XX - - - - y 9 ML
prediction based
17. aspect model [72, 73] X - - - - y 2 EM, ML
18. min rating [74] X - - - - y 19,25 ML
19. min norm [74] - - - - y 18,25 ML
20. highest-pred [11, 12] X XX X - y 1, 6, 9, 12, 14, 21, 28, 29 ML, NF
21. lowest-pred [11, 12] X X - y 1, 6, 9, 12, 14, 20, 28, 29 ML, NF
user partitioning
22. IGCN [67] XX - - X y y 2, 3, 6, 8, 11, 13 ML
23. decision tree [64] XX - - - - y 3, 4, 10, 11 NF
Combined
static combin.
24. influence based [61] XX - - - - y 1, 4, 6, 9 ML
25. non-myopic [74] X - - - - y 18, 19 ML
26. treeU [75] X - - - - y 23, 27 ML, EM, NF
27. fMF [75] XX - - - - y 23, 26 ML, EM, NF
28. pers-partially rand. [11] X XX X - y 1, 6, 9, 12, 14, 20, 21, 28, 29 ML, NF
29. voting [11, 12] XX XX - y 1, 6, 9, 12, 14, 20, 21, 28 ML, NF
adaptive combin. 30. switching [76] XX XX - XX - y 9, 20, 29 ML
Mehdi Elahi, Francesco Ricci, Neil Rubens, "A survey of active learning in collaborative filtering recommender systems", Computer Science Review, Elsevier, 2016.
The table clearly shows that different strategies can improve different aspects of recommendation quality. In terms of rating prediction accuracy (MAE/RMSE), various strategies have shown excellent performance. While some of these strategies are easy to implement (e.g., Entropy0 and Log(popularity)*Entropy), others are more complex and use more sophisticated machine learning algorithms (e.g., Decision Tree and Personality-based FM). The strategies that have shown excellent performance in terms of ranking quality (NDCG/MAP) are the Representative-based and Voting strategies. In terms of precision, prediction-based strategies (Highest-predicted and Binary-predicted) have shown excellent performance. In terms of the number of ratings acquired (# Ratings), strategies that consider the popularity of items (Popularity and Entropy0) can, as expected, acquire the largest number of ratings. However, other strategies that maximize the chance that the selected items are familiar to the user (Item-item and Personality-based) can also elicit a considerable number of ratings. For these strategies the success ratio (#acquired_ratings/#requested_items) is the largest. This is an important factor, since strategies that focus only on the informativeness of the items may fail to actually acquire ratings by selecting obscure items that users do not know and cannot rate.
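To make two of the quantities in this paragraph concrete, here is a small sketch of a Log(popularity)*Entropy item score and of the success ratio. It assumes natural logarithms and Shannon entropy over the observed rating values. The function names and data are hypothetical and not taken from the survey.

```python
from collections import Counter
from math import log

def entropy(ratings):
    """Shannon entropy (in nats) of the distribution of rating values."""
    n = len(ratings)
    return -sum((c / n) * log(c / n) for c in Counter(ratings).values())

def log_pop_entropy(ratings):
    """Balances how many users can rate an item against how divided they are."""
    return log(len(ratings)) * entropy(ratings)

def success_ratio(requested_items, acquired_ratings):
    """#acquired_ratings / #requested_items for one elicitation round."""
    if not requested_items:
        return 0.0
    return len(acquired_ratings) / len(requested_items)

# Hypothetical round: four items shown to the user, two of them rated.
requested = ["a", "b", "c", "d"]
acquired = {"a": 5, "c": 2}
print(success_ratio(requested, acquired))
print(log_pop_entropy([1, 5, 1, 5]), log_pop_entropy([5, 5, 5, 5]))
```

An item that everyone rates the same way scores zero however popular it is, while an obscure but divisive item is held back by the log(popularity) factor, which is the balance between informativeness and familiarity that the paragraph points to.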
Tailored to:
• different objectives
• different data & settings
MANY AL-RS APPROACHES
http://www.win.tue.nl/~eknutov/gaf.html
RS Complexity
• An RS is composed of many modules that need tuning to achieve high performance
Take-home Messages
• RS shows users items they want
• RS accounts for a large portion of purchases
• RS methods: user/item-based
• RS is crucial for user growth, and:
• addressing new items/users (“cold start”) with:
• indirect data acquisition
• content-based item similarity
• informative item selection with AL
• Many RS components could be tuned to achieve high
performance
EOP
