Predicting Performance in Recommender Systems

Alejandro Bellogín
Supervised by Pablo Castells and Iván Cantador
Information Retrieval Group (IRG), Universidad Autónoma de Madrid, Spain
alejandro.bellogin@uam.es


Motivation

Is it possible to predict the accuracy of a recommendation? With such a prediction we could, for example, decide whether or not to deliver a recommendation, or even combine different recommenders according to the expected performance of each one.

Hypothesis

The data commonly available to a recommender system may contain signals that enable an a priori estimation of the success of the recommendation.

Research questions

1. Is it possible to define a performance prediction theory for recommender systems in a sound, formal way?
2. Is it possible to adapt query performance techniques (from IR) to the recommendation task?
3. What kind of evaluation should be performed? Is IR evaluation still valid in our problem?
4. What kind of recommendation problems can these models be applied to?

Research question 1
Is it possible to define a performance prediction theory for recommender systems in a sound, formal way?

a) Define a predictor of performance: γ = γ(u, i, r, …)
b) Agree on a performance metric: π = π(u, i, r, …)
c) Check predictive power by measuring correlation (see the sketch below):
   corr([γ(x_1), …, γ(x_n)], [π(x_1), …, π(x_n)])
d) Evaluate final performance: dynamic vs static

Research question 2
Is it possible to adapt query performance techniques (from IR) to the recommendation task?

• In Information Retrieval, query performance prediction is the "estimation of the system's performance in response to a specific query".
• Several predictors have been proposed; we focus on query clarity, adapted here as user clarity.

User clarity captures the uncertainty in the user's data, measured as the distance between the user's probability model p(x | u) and the system's model p_c(x):

   clarity(u) = ∑_{x ∈ X} p(x | u) · log [ p(x | u) / p_c(x) ]

We propose three formulations, depending on the choice of the event space X (a rating-based sketch follows below):
• Based on ratings
• Based on items
• Based on ratings and items

[Figure: example distributions p_c(x), p(x | u1), p(x | u2) over rating values for the rating-and-item (RatItem) formulation.]
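To make the clarity formulation concrete, here is a minimal Python sketch of the rating-based variant, assuming the event space X is the set of rating values and using maximum-likelihood estimates with additive smoothing. The function name, smoothing scheme, and example data are illustrative and not the exact estimation used in the thesis.

```python
import math
from collections import Counter

RATING_VALUES = [1, 2, 3, 4, 5]  # event space X: rating-based formulation

def clarity(user_ratings, all_ratings, smoothing=0.5):
    """KL divergence between the user's rating distribution p(x|u)
    and the system-wide distribution p_c(x), with additive smoothing."""
    user_counts = Counter(user_ratings)
    system_counts = Counter(all_ratings)
    user_total = len(user_ratings) + smoothing * len(RATING_VALUES)
    system_total = len(all_ratings) + smoothing * len(RATING_VALUES)
    value = 0.0
    for x in RATING_VALUES:
        p_u = (user_counts[x] + smoothing) / user_total
        p_c = (system_counts[x] + smoothing) / system_total
        value += p_u * math.log(p_u / p_c)
    return value

# Illustrative data: a user who only gives 5s is far from the global distribution
all_ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5]
print(clarity([5, 5, 5, 5], all_ratings))     # high clarity (low uncertainty)
print(clarity([1, 3, 3, 4, 5], all_ratings))  # closer to p_c(x), lower clarity
```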


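Step (c) of Research question 1 can then be carried out by correlating per-user predictor values with a per-user performance metric. Below is a minimal sketch using scipy.stats; the choice of per-user metric (nDCG, negated MAE, …) is left open, as in the poster, and the variable names in the usage comment are hypothetical.

```python
from scipy.stats import pearsonr, spearmanr

def predictive_power(predictor_values, performance_values):
    """Correlate per-user predictor scores (e.g. user clarity) with a
    per-user performance metric (e.g. nDCG, or negated MAE)."""
    r, _ = pearsonr(predictor_values, performance_values)       # linear correlation
    rho, _ = spearmanr(predictor_values, performance_values)    # rank correlation
    return r, rho

# Hypothetical usage, assuming clarity() from the sketch above and some
# per-user metric ndcg_of[u] computed on held-out data:
# gamma = [clarity(ratings_of[u], all_ratings) for u in users]
# pi = [ndcg_of[u] for u in users]
# r, rho = predictive_power(gamma, pi)
```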


Correlation coefficients
• Pearson: linear correlation
• Spearman: rank correlation
• Example of observed correlation between predictor and performance: r ~ 0.57

Research question 3
What kind of evaluation should be performed? Is IR evaluation still valid in our problem?

• In IR: Mean Average Precision + correlation, typically over ~50 points (queries) vs 1000+ points (users) in our setting.
• The performance metric is not clear: error-based? precision-based?
• What is performance? It may depend on the final application.
• Possible bias, e.g. towards users or items with larger profiles.

Research question 4
What kind of recommendation problems can these models be applied to?
• Whenever a combination of strategies is available.

Example 1: dynamic neighbor weighting (sketched below)
• The user's neighbors are weighted according to their similarity.
• Can we take into account the uncertainty in the neighbors' data?
• User neighbor weighting:
  • Static:  g(u, i) = C ∑_{v ∈ N[u]} sim(u, v) · rat(v, i)
  • Dynamic: g(u, i) = C ∑_{v ∈ N[u]} γ(v) · sim(u, v) · rat(v, i)
• Correlation analysis and performance:
  [Figure: MAE of standard CF vs clarity-enhanced CF, varying the % of ratings used for training (neighbourhood size 500) and varying the neighbourhood size (80% training).]

Example 2: dynamic ensemble recommendation (sketched below)
• In the static hybrid, the weight is the same for every item and user (learnt from training).
• What about boosting those users predicted to perform better for some recommender?
• Hybrid recommendation:
  • Static:  g(u, i) = λ · g_R1(u, i) + (1 − λ) · g_R2(u, i)
  • Dynamic: g(u, i) = γ(u) · g_R1(u, i) + (1 − γ(u)) · g_R2(u, i)
• Correlation analysis and performance:
  [Figure: nDCG@50 of adaptive vs static hybrid configurations H1-H4.]
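A minimal sketch of Example 1, under the assumption that the dynamic weight γ(v) is supplied as a callable (for instance a normalized clarity value per neighbor). Normalizing by the sum of absolute weights stands in for the constant C and is an illustrative choice, not necessarily the one used in the experiments.

```python
def knn_score(u, i, neighbors, sim, rat, gamma=None):
    """User-based CF score for (u, i).
    Static:  g(u, i) = C * sum_{v in N[u]} sim(u, v) * rat(v, i)
    Dynamic: g(u, i) = C * sum_{v in N[u]} gamma(v) * sim(u, v) * rat(v, i)
    `rat` is a nested dict user -> item -> rating; `sim` and `gamma` are callables."""
    total, norm = 0.0, 0.0
    for v in neighbors:
        if i not in rat.get(v, {}):
            continue                      # neighbor has not rated item i
        w = sim(u, v)
        if gamma is not None:             # dynamic variant: weight by predicted reliability
            w *= gamma(v)
        total += w * rat[v][i]
        norm += abs(w)
    return total / norm if norm > 0 else 0.0

# Hypothetical usage:
# static  = knn_score(u, i, N[u], sim, ratings)
# dynamic = knn_score(u, i, N[u], sim, ratings, gamma=lambda v: clarity_of[v] / max_clarity)
```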


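A corresponding sketch of Example 2: the static hybrid uses a single weight λ learnt from training, while the dynamic one replaces it with a per-user weight γ(u) derived from the performance predictor. Mapping clarity into [0, 1] via a simple normalization is an assumption here, not the thesis' exact scheme.

```python
def hybrid_score(u, i, g_r1, g_r2, lam=0.5, gamma=None):
    """Static:  g(u, i) = lam * g_R1(u, i) + (1 - lam) * g_R2(u, i)
    Dynamic: g(u, i) = gamma(u) * g_R1(u, i) + (1 - gamma(u)) * g_R2(u, i)"""
    w = gamma(u) if gamma is not None else lam
    return w * g_r1(u, i) + (1.0 - w) * g_r2(u, i)

# Hypothetical dynamic weighting: boost recommender R1 for users it is
# predicted to serve well, e.g.
# gamma = lambda u: min(1.0, clarity_of[u] / max_clarity)
# score = hybrid_score(u, i, g_r1=cf_score, g_r2=cb_score, gamma=gamma)
```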


Future Work

Explore other input sources:
• Implicit data (with time)
• Item predictors
  o Different recommender behavior depending on item attributes
  o They would allow us to capture popularity, diversity, etc.
• Social links
  o Use graph-based measures as indicators of user strength
  o First results are positive

We also need a theoretical background: why do some predictors work better than others?
Larger datasets are needed as well.

Publications
• A Performance Prediction Approach to Enhance Collaborative Filtering Performance. A. Bellogín and P. Castells. In ECIR 2010.
• Predicting the Performance of Recommender Systems: An Information Theoretic Approach. A. Bellogín, P. Castells, and I. Cantador. In ICTIR 2011.
• Self-adjusting Hybrid Recommenders based on Social Network Analysis. A. Bellogín, P. Castells, and I. Cantador. In SIGIR 2011.

5th ACM Conference on Recommender Systems (RecSys 2011) – Doctoral Symposium. Chicago, USA, 23-27 October 2011.
Acknowledgments to the National Science Foundation for the funding to attend the conference.
