Empirical Evaluation of Active
Learning in Recommender Systems
Mehdi Elahi
Postdoc Researcher
Politecnico di Milano, Italy
1
seminar @ Politecnico di Milano
July 2015
www.linkedin.com/in/mehdielahi
My Previous Research Group
2
Ph.D. Adviser:
Francesco Ricci
Full Professor
Dean of the Faculty of Computer Science
https://www.inf.unibz.it/idse/
My Current Research Group
3
Research Adviser:
Paolo Cremonesi
Associate Professor
DEIB @ Politecnico di Milano
http://recsys.deib.polimi.it
Outline
¤ Introduction
¤ Active Learning in RS
¤ Offline Evaluation and Results
¤ Application to Mobile RS 
¤ Conclusion and Future Works
4
Introduction
¤ Recommender Systems (RSs) are tools that support users'
decision making by suggesting products that may be
interesting to them.
¤ Examples of Recommender Systems:
5
Introduction
¤ Collaborative Filtering:
¤ A technique that predicts unknown ratings by exploiting the
ratings given by users, and recommends the items with the
highest predicted ratings (a minimal sketch follows this slide).
6
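As an illustration of the prediction step described above, here is a minimal matrix-factorization sketch of the kind commonly used for collaborative filtering. It is not the exact model used in the cited experiments; all function names and hyper-parameters are assumptions.

```python
import numpy as np

def train_mf(ratings, n_users, n_items, k=10, lr=0.01, reg=0.05, epochs=50):
    """Learn latent user/item factors from (user, item, rating) triples via SGD."""
    rng = np.random.default_rng(0)
    P = rng.normal(scale=0.1, size=(n_users, k))   # user factors
    Q = rng.normal(scale=0.1, size=(n_items, k))   # item factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]                  # prediction error for a known rating
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

def recommend(P, Q, user, known_items, top_n=5):
    """Recommend the items with the highest predicted ratings that the user has not rated."""
    scores = Q @ P[user]                           # predicted ratings for every item
    scores[list(known_items)] = -np.inf            # never re-recommend already rated items
    return np.argsort(scores)[::-1][:top_n]

# toy usage with a handful of (user, item, rating) triples
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
P, Q = train_mf(ratings, n_users=3, n_items=4)
print(recommend(P, Q, user=0, known_items={0, 1}))
```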
Sparsity of the Data
¤ In Netflix: 98.8% of the ratings are unknown
¤ In Movielens: 95.7% of the ratings are unknown
[Figure: a sparse user-item rating matrix; most user-item entries have no rating]
7
Active Learning for Collaborative
Filtering
¤ Active Learning:
¤ Requests and tries to collect more ratings from the users
before offering recommendations
8
Which Items should be chosen?
¤ Not all the ratings are equally useful, i.e., they do not
bring equal information to the system.
¤ To minimize the user's rating effort, only some of them
should be requested and acquired
9
Definition of AL Strategy
¤ An active learning strategy for collaborative
filtering is a set of rules for choosing the best items
for the users to rate
10
Non-Personalized Strategies
¤  Random: selects items randomly (baseline)
¤  Popularity: scores an item according to the frequency of its ratings
and then chooses the highest scored items (Carenini, 2003)
¤  Entropy: scores each item with the entropy of its ratings and then
chooses the highest scored items (Rashid, 2002 and 2008)
¤  Variance: scores each item with the variance of its ratings and
then chooses the highest scored items (Rubens, 2011)
¤  log(Popularity)*Entropy: combines the popularity and entropy
scores and then chooses the highest scored items (Rashid, 2002
and 2008); a scoring sketch follows this slide
11
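A hedged sketch of how these non-personalized scores could be computed from the currently known ratings; the exact formulations in the cited papers may differ slightly (e.g. in how log-popularity and entropy are combined), so treat the helper below as illustrative only.

```python
import numpy as np

def _entropy(r):
    """Shannon entropy of the rating-value distribution (ratings on a 1-5 scale)."""
    if len(r) == 0:
        return 0.0
    p = np.bincount(r.astype(int), minlength=6)[1:] / len(r)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def item_scores(R, strategy, rng=np.random.default_rng(0)):
    """R: user x item rating matrix with np.nan for unknown entries. Returns one score per item."""
    scores = np.zeros(R.shape[1])
    for i in range(R.shape[1]):
        r = R[:, i][~np.isnan(R[:, i])]              # known ratings of item i
        if strategy == "random":
            scores[i] = rng.random()
        elif strategy == "popularity":
            scores[i] = len(r)                       # number of known ratings
        elif strategy == "variance":
            scores[i] = r.var() if len(r) > 1 else 0.0
        elif strategy == "entropy":
            scores[i] = _entropy(r)
        elif strategy == "log_pop_entropy":
            scores[i] = np.log(len(r) + 1) * _entropy(r)
    return scores

# the strategy then asks the user to rate the highest-scored items, e.g.:
# top_items = np.argsort(item_scores(R, "log_pop_entropy"))[::-1][:10]
```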
Personalized Single Strategies
¤  Highest Predicted: scores an item according to the prediction of
its ratings and then chooses the highest scored items (Elahi, 2011)
¤  Lowest Predicted: scores an item according to the prediction of its
ratings and then chooses the lowest scored items (Elahi, 2011)
¤  Highest-Lowest Predicted: combines the highest predicted and
lowest predicted scores and chooses the highest and lowest
scored items (Elahi, 2011)
¤  Binary Prediction: scores an item according to the prediction of its
ratings (using transformed matrix of user-item) and then chooses
the highest scored items (Elahi, 2011)
¤  Personality-based binary prediction: extends the binary prediction
strategy by using user attributes, such as the scores for the Big Five
personality traits on a scale from 1 to 5 (Elahi, 2013); a sketch follows this slide.
12
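One possible reading of the binary-prediction idea, sketched below: the rating matrix is turned into a rated/not-rated matrix, a low-rank model predicts how likely each user is to have experienced each item, and the personality-based variant adds the Big Five scores as extra user features with item-specific weights. This is an assumed stand-in only; the factorization details of the cited papers are not reproduced here.

```python
import numpy as np

def binary_prediction_scores(R, personality=None, k=8, lr=0.05, reg=0.02, epochs=30):
    """R: user x item rating matrix (np.nan = unknown). personality: optional user x 5 matrix of
    Big Five scores. Returns user x item scores estimating how likely each user can rate each item."""
    B = (~np.isnan(R)).astype(float)                 # 1 = rated, 0 = not rated (yet)
    n_users, n_items = B.shape
    rng = np.random.default_rng(0)
    P = rng.normal(scale=0.1, size=(n_users, k))
    Q = rng.normal(scale=0.1, size=(n_items, k))
    W = None if personality is None else rng.normal(scale=0.1, size=(n_items, personality.shape[1]))
    for _ in range(epochs):
        for u in range(n_users):
            for i in range(n_items):
                pred = P[u] @ Q[i] + (W[i] @ personality[u] if W is not None else 0.0)
                err = B[u, i] - pred
                P[u] += lr * (err * Q[i] - reg * P[u])
                Q[i] += lr * (err * P[u] - reg * Q[i])
                if W is not None:                    # personality-based extension
                    W[i] += lr * (err * personality[u] - reg * W[i])
    S = P @ Q.T + (personality @ W.T if W is not None else 0.0)
    S[B == 1] = -np.inf                              # do not ask for ratings the system already has
    return S                                         # request the highest-scored items per user
```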
Personalized Combined Strategies
¤ Combined with Voting: scores an item according to the
votes given by a committee of different strategies and then
chooses the highest scored items (Elahi, 2011); a sketch follows this slide
¤ Combined with Switching: adaptively selects a strategy from
a pool of individual AL strategies, based on the estimation of how
well each strategy is able to cope with the conditions at hand.
Then the selected strategy scores an item according to its
criterion (Elahi, 2012)
13
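A minimal sketch of the voting combination, assuming each committee strategy votes for its own top-N items and the items with the most votes are requested; the switching variant would instead pick one strategy per iteration based on an estimate of how well it currently performs. The helper names reuse the hypothetical item_scores() sketched earlier.

```python
import numpy as np
from collections import Counter

def voting_strategy(R, committee, score_fn, n_request=10, votes_per_strategy=20):
    """Each strategy in the committee votes for its top items; the most-voted items are requested."""
    votes = Counter()
    for strategy in committee:
        scores = score_fn(R, strategy)                       # e.g. item_scores() from the earlier sketch
        top = np.argsort(scores)[::-1][:votes_per_strategy]  # this strategy's votes
        votes.update(int(i) for i in top)
    return [item for item, _ in votes.most_common(n_request)]

# example: a committee of three non-personalized strategies
# to_request = voting_strategy(R, ["popularity", "entropy", "variance"], item_scores)
```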
Offline Evaluation (A)
¤  Datasets are partitioned into three subsets (a partitioning sketch follows this slide):
¤ Known (K): contains the rating values that are considered to be known by the
system at a certain point in time.
¤ Unknown (X): contains the ratings that are considered to be known by the
users but not by the system. These ratings are incrementally elicited, i.e., they are
transferred into K if the system asks the (simulated) users for them.
¤ Test (T): contains the ratings that are never elicited and are used only to test the
recommendation effectiveness after the system has acquired the new elicited
ratings.
Netflix
No. of users: 480189
No. of items: 17770
No. of ratings: 100M*
Time span: 1998 – 2005
*We used the first 1M ratings
Movielens
No. of users: 6040
No. of items: 3900
No. of ratings: 1M
Time span: 2000 – 2003
14
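A sketch of how such a K / X / T split can be simulated offline; the split proportions below are placeholders, not the ones used in the reported experiments.

```python
import numpy as np

def split_k_x_t(R, known_frac=0.1, test_frac=0.2, seed=0):
    """Partition the observed entries of a rating matrix into Known (K), Unknown (X) and Test (T)."""
    rng = np.random.default_rng(seed)
    users, items = np.where(~np.isnan(R))            # positions of all observed ratings
    idx = rng.permutation(len(users))
    n_known = int(known_frac * len(idx))
    n_test = int(test_frac * len(idx))
    K = np.full_like(R, np.nan)                      # known to the system
    T = np.full_like(R, np.nan)                      # held out for testing, never elicited
    X = np.full_like(R, np.nan)                      # known to the (simulated) users only
    parts = ((K, idx[:n_known]), (T, idx[n_known:n_known + n_test]), (X, idx[n_known + n_test:]))
    for matrix, sel in parts:
        matrix[users[sel], items[sel]] = R[users[sel], items[sel]]
    return K, X, T
```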
Learning Iteration
Item Score
1 151
2 44
3 7
4 1
5 42
6 34
7 9
8 55
9 20
… …
N 12
The system computes the
scores of all the items
that can be scored
(according to a strategy)
15
Learning Iteration
Top-10 items Score
1 151
8 55
43 54
11 50
2 44
5 42
6 34
22 33
75 29
13 25
The system selects
the top 10 items
and presents them
to the simulated
user
16
Learning Iteration
The items that are
rated in the unknown
set (X) are found and
transferred to the
known set (K)
Rated
items
1
2
5
75
13
17
Learning Iteration
The items that are
rated in the unknown
set (X) are found and
transferred to the
known set (K); a simulation sketch follows this slide
18
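Putting the pieces together, one learning iteration of the simulation can be sketched as below, building on the hypothetical helpers from the previous slides: items are scored on the known set K, the top items are requested from each simulated user, and the requests whose ratings exist in X are transferred into K.

```python
import numpy as np

def learning_iteration(K, X, strategy, score_fn, n_request=10):
    """One active-learning iteration; returns the number of successfully acquired ratings."""
    scores = score_fn(K, strategy)                   # score items using only the known ratings
    top = np.argsort(scores)[::-1][:n_request]       # items to request (non-personalized case)
    acquired = 0
    for user in range(K.shape[0]):
        for i in top:
            if not np.isnan(X[user, i]):             # the simulated user knows this rating...
                K[user, i] = X[user, i]              # ...so it is transferred to the known set
                X[user, i] = np.nan
                acquired += 1
    return acquired
```

A personalized strategy would compute a separate score vector per user instead of one global ranking; the transfer step stays the same.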
System-wide vs User-centered
19
We have conducted a system-wide evaluation.
Results: MAE
¤ Mean Absolute Error
¤ The lower the better.
¤ Measures the average absolute deviation of the predicted rating
from the user's true rating (see the formula after this slide).
[Figure: MAE over 200 learning iterations for the random, popularity, lowest-pred, highest-pred, and voting strategies]
(Elahi, 2011)
20
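For reference, the standard definition of MAE over the test set T, in my notation:

```latex
\mathrm{MAE} = \frac{1}{|T|} \sum_{(u,i) \in T} \left| \hat{r}_{ui} - r_{ui} \right|
```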
Effect on data distribution
[Figure: the known user-item rating matrix before and after rating elicitation; the newly acquired ratings are mostly high values]
21
Histogram of Known Set
¤ Prediction Bias
¤ Since the majority of the ratings added by the highest-predicted
strategy have high values, the predictions for the test set are biased
[Figure: rating-value histograms (probability of rating values 1-5) of the known set at iterations 1, 20, 40, and 60]
22
Evaluation: NDCG
¤ Normalized Discounted Cumulative Gain:
¤ The higher the better.
¤ The recommendations for u are sorted according to the
predicted rating values, then DCG_u is computed
(see the formula after this slide).
[Figure: NDCG over 200 learning iterations for the random, popularity, lowest-pred, highest-pred, and voting strategies]
(Elahi, 2011)
23
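For reference, a standard formulation with relevance taken from the true ratings and the ranking induced by the predicted ratings (my notation; i_p is the item ranked at position p and IDCG_u is the DCG of the ideal ordering):

```latex
\mathrm{DCG}_u = \sum_{p=1}^{N} \frac{r_{u, i_p}}{\log_2(p + 1)}, \qquad
\mathrm{NDCG}_u = \frac{\mathrm{DCG}_u}{\mathrm{IDCG}_u}
```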
Evaluation: Precision
¤ Precision: percentage of the items with rating values (as in T)
equal to 4 or 5 in the top 10 recommended items
(see the formula after this slide).
[Figure: precision over 200 learning iterations for the random, popularity, lowest-pred, highest-pred, and voting strategies]
(Elahi, 2011)
24
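Written out for the top-10 list of user u (my notation):

```latex
\mathrm{Precision@10}_u = \frac{\bigl|\{\, i \in \mathrm{Top10}(u) \;:\; r_{ui} \in T,\; r_{ui} \geq 4 \,\}\bigr|}{10}
```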
Successful Requests
¤ The ratio of the ratings acquired over those requested at
different iterations.
Offline Evaluation (B)
26
Offline Evaluation (B)
¤ All the strategies show non-monotonic behavior, with many
fluctuations, since the test set changes dynamically every week.
¤ However, the proposed strategies still perform very well in this
setting compared to the baseline.
[Figure: left, MAE over the learning iterations in the traditional evaluation setting without natural acquisition (Elahi, 2011), for the random, highest-pred, log(pop)*entropy, and voting strategies; right, MAE over the weeks in the proposed evaluation setting with natural acquisition (Elahi, 2012), additionally including the switching strategy and the natural-acquisition baseline]
27
Evaluation: MAE (normalized)
¤  The highest predicted strategy (the default strategy of RSs) does not
perform very differently from the natural acquisition of ratings.
¤  In fact, it does not acquire ratings in addition to those collected by
the natural rating acquisition, i.e., the users rate these items on their
own initiative anyway
[Figure: normalized MAE over the weeks for natural acquisition and the random, highest-pred, log(pop)*entropy, voting, and switching strategies]
(Elahi, 2012)
28
Evaluation: NDCG (normalized)
[Figure: normalized NDCG over the weeks for natural acquisition and the random, highest-pred, log(pop)*entropy, voting, and switching strategies]
(Elahi, 2012)
29
¤ Our proposed Voting and Switching strategies both
perform very well.
Conclusion of Offline Evaluations
¤  We demonstrate that it is possible to adapt to the changes in
the characteristics of the rating dataset by proposing two novel
AL strategies:
¤  Combined with Voting
¤  Combined with Switching
¤  We propose a more realistic active learning evaluation setting in
which ratings are added not only by the AL strategies, but also by
users without being prompted to rate (natural rating
acquisition).
¤  Our results show that the natural rating acquisition
considerably influences and changes the performance of
the AL strategies.
30
Application: South Tyrol Suggests(STS)
¤ A mobile Android context-aware
RS that recommends places of
interests (POIs) in the South Tyrol
region.
¤ The system was in an extreme
cold-start situation (only 700
ratings for total of 27,000 POIs).
31
STS: Personality Questionnaire
Big Five Personality Traits:
Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism
32
STS: Personality Questionnaire
Big Five Personality Traits:
Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism
33
STS: Active Learning
¤ Using the personality of the user in
the prediction model, the system
estimates which POIs the user has
likely experienced, and hence can
rate.
34
STS: Contextual Factors
35
STS: Recommendations
¤ STS computes rating predictions for
all POIs from the database, using
the personality information of the
users and the ratings they have
given to the POIs.
36
User Study
Research Hypotheses:
¤ Our proposed personality-based active learning
strategy leads to a larger number of acquired user
ratings and related contextual conditions.
¤ The prediction accuracy and context-awareness of the
recommendation model improves the most when
utilizing our proposed active learning strategy.
37
Results: MAE
[Figure: MAE of the compared strategies in the live user study]
38
Results: Ratio of the Rating Acquisition
Pairs of strategies                                               Means         p-value   # of ratings
Random / log(popularity) * entropy                                1.35 / 2.07   < 0.001   73 / 112
Random / personality-based binary prediction                      1.35 / 2.31   < 0.001   73 / 125
Personality-based binary prediction / log(popularity) * entropy   2.31 / 2.07   0.005     125 / 112
39
Results: Ratio of the Rating Acquisition
[Figure: number of acquired ratings per user over time, with regression lines, for the Random, Log(popularity) * Entropy, and Personality-Based Binary Prediction strategies]
40
Results: Context-Awareness
41
                 log(popularity) * entropy    personality-based binary pred.
Q1               3.58                          3.56
Q2               2.95                          3.31
# of contexts    1.01                          1.52
Results: Context-Awareness
42
Comparison of MAE (the lower the better) and
nDCG (the higher the better)
Conclusion of User Study
In a live user study, we have:
ü  shown that user personality has an important impact on
users' rating behavior.
ü  successfully verified both research hypotheses, i.e., the
personality-based active learning strategy acquired
more ratings and improved the rating prediction
accuracy the most.
43
Main Contributions
¤  Proposing several novel personalized active learning
strategies for collaborative filtering.
¤  Offline evaluation of several active learning strategies with
regard to their system-wide effectiveness.
¤  Comprehensive evaluation of active learning strategies with
regard to several evaluation measures.
¤  Evaluation of active learning strategies with and without
natural acquisition of ratings.
¤  Application of active learning in an up-and-running mobile
context-aware recommender system.
44
Future Works
45
¤ Gamification in Active Learning for RS: making the
rating process more fun and enjoyable for the user.
Shoot the ball to the
place you visited
and liked the most
Future Works
46
¤ Active Learning for Relevant
Context Selection: how to
select context factors that are
relevant to the items.
Which contextual condition is more
relevant to this item?
Future Works
47
¤ Sequential Active Learning: selecting and presenting
the items for the user to rate incrementally, one at a time.
¤ Hence, the system can immediately adapt the
remaining rating requests.
item 1 item 2 item 3 item 4
Future Works
48
[Figure: cross-domain active learning for a new user: rating matrices of a target domain and an auxiliary source domain, linked by user personality, with active learning applied in both domains]
Future Works
49
[Figure: example items with low color variance vs. high color variance]
Publications on AL
Book Chapter:
2015
¤  N. Rubens, M. Elahi, M. Sugiyama, and D. Kaplan, Active Learning in Recommender Systems. Book
chapter in Recommender Systems Handbook, Springer Verlag, 2015
Journal:
2016
¤  M. Elahi, F. Ricci, N. Rubens. A survey of active learning in collaborative filtering recommender systems.
Computer Science Review, 2016, Elsevier
¤  I. Fernández-Tobías, M. Braunhofer, M. Elahi, F. Ricci, and I. Cantador. Alleviating the New User Problem
in Collaborative Filtering by Exploiting Personality Information. User Modeling and User-Adapted
Interaction (UMUAI), Personality in Personalized Systems, 2016, Springer
2014
¤  M. Braunhofer, M. Elahi, and F. Ricci. Techniques for cold-starting context-aware mobile recommender
systems for tourism. Intelligenza Artificiale, 8(2):129–143, 2014
2013
¤  M. Elahi, F. Ricci, and N. Rubens. Active learning strategies for rating elicitation in collaborative filtering:
A system-wide perspective. ACM Transactions on Intelligent Systems and Technology (TIST), 5(1):13, 2013
50
Full list @ www.researchgate.net/profile/Mehdi_Elahi2
Publications on AL
Conference:
2015
¤  M. Braunhofer, M. Elahi, and F. Ricci. User personality and the new user problem in a context-aware
points of interest recommender system. In Information and Communication Technologies in Tourism
2015. Springer International Publishing, 2015
2014
¤  M. Elahi, F. Ricci, and N. Rubens. Active learning in collaborative filtering recommender systems. In
E-Commerce and Web Technologies (EC-Web), pages 113–124. Springer International Publishing, 2014
¤  M. Braunhofer, M. Elahi, M. Ge, and F. Ricci. Context dependent preference acquisition with personality-
based active learning in mobile recommender systems. In Learning and Collaboration Technologies.
Technology-Rich Environments for Learning and Collaboration, pages 105–116. Springer International
Publishing, 2014
2013
¤  M. Elahi, M. Braunhofer, F. Ricci, and M. Tkalcic. Personality-based active learning for collaborative
filtering recommender systems. In AI*IA 2013: Advances in Artificial Intelligence, pages 360–371.
Springer International Publishing, 2013
51
Full list @ www.researchgate.net/profile/Mehdi_Elahi2
Publications on AL
Conference:
2012
¤  M. Elahi, F. Ricci, and N. Rubens. Adapting to natural rating acquisition with combined active learning
strategies. In Foundations of Intelligent Systems, pages 254–263. Springer Berlin Heidelberg, 2012
2011
¤  M. Elahi, V. Repsys, and F. Ricci. Rating elicitation strategies for collaborative filtering. In E-Commerce
and Web Technologies (EC-Web), pages 160–171. Springer Berlin Heidelberg, 2011
52
Full list @ www.researchgate.net/profile/Mehdi_Elahi2
Thank you!
53
seminar @ Politecnico di Milano
July 2015
