Evaluation of Collaborative Filtering Algorithms for Recommending Articles on CiteULike
June 29th, 2009
HT 2009, Workshop “Web 3.0: Merging Semantic Web and Social Web”
Dr. Peter Brusilovsky, Associate Professor
Denis Parra, PhD Student
School of Information Sciences, University of Pittsburgh

Outline
- Motivation
- Methods: CCF, NwCF, BM25
- The Study
- Description of the Data
- Results
- Conclusions

Motivation
Based on information available on CiteULike:
- Develop user-centered recommendations of scientific articles.
- Investigate the potential of users’ tags in collaborative tagging systems to provide recommendations.
- Compare the accuracy of user-based collaborative filtering methods.
Why CiteULike?
- Popular collaborative tagging system, more topic-oriented than delicious (article references).
- Familiarity with the system.
CiteULike
Methods: CCF (1 / 2)
Classic Collaborative Filtering (CCF): user-based recommendations, using Pearson correlation (users’ similarity) and adjusted ratings to rank the items to recommend [1].
Methods: CCF (2 / 2)
[Figure: worked CCF example with sample ratings; the layout is not recoverable from the extracted text]
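The CCF description can be illustrated with a minimal sketch. It assumes the textbook user-based formulation: Pearson correlation over co-rated items, then a prediction built from the target user's mean rating plus similarity-weighted, mean-adjusted neighbor ratings. The function names and the dictionary layout are hypothetical, and the slides do not confirm that the study used exactly this variant.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two users over the items both rated.
    a and b map item -> rating (hypothetical layout)."""
    common = set(a) & set(b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * \
          sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, user, item):
    """Mean-adjusted prediction: the user's own mean plus the
    similarity-weighted deviations of neighbors who rated the item."""
    mean_u = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        sim = pearson(ratings[user], r)
        mean_o = sum(r.values()) / len(r)
        num += sim * (r[item] - mean_o)
        den += abs(sim)
    return mean_u + num / den if den else mean_u
```

Candidate items would then be ranked by their predicted value. Note that the prediction is not clipped to the rating scale in this sketch.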
Methods: NwCF (1 / 2)
Neighbor-weighted Collaborative Filtering (NwCF): similar to CCF, but incorporates the number of neighbors rating an item into the ranking formula for recommended items.
Methods: NwCF (2 / 2)
[Figure: worked NwCF example with sample ratings; the layout is not recoverable from the extracted text]
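The extracted slides do not give the NwCF ranking formula. As a sketch only, the code below assumes a logarithmic weight, log10(1 + n), applied to a CCF-style prediction, where n is the number of neighbors who rated the item. Both the weight and the names `nwcf_score` and `rank` are assumptions made for illustration.

```python
from math import log10

def nwcf_score(prediction, n_neighbors):
    """Scale a CCF-style prediction by how many neighbors rated the item.
    The log10(1 + n) weight is an assumed choice, not taken from the slides."""
    return log10(1 + n_neighbors) * prediction

def rank(candidates):
    """candidates maps item -> (prediction, number of neighbors who rated it).
    Returns the items sorted by neighbor-weighted score, best first."""
    scored = {item: nwcf_score(p, n) for item, (p, n) in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)
```

With this weight, an item predicted at 4.5 by nine neighbors outranks an item predicted at 5.0 by a single neighbor, which is the kind of effect the method description points to.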
Methods: BM25 (1 / 2)
BM25: we obtain the similarity between users (neighbors) by treating each user’s set of tags as a “document” and computing the Okapi BM25 (probabilistic IR model) Retrieval Status Value [2].
Methods: BM25 (2 / 2)
[Figure: query terms matched against Doc_1, Doc_2 and Doc_3]
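A minimal sketch of the tag-based similarity, assuming the basic BM25 Retrieval Status Value without the query-term-frequency factor: the target user's tags act as the query and every other user's tag counts act as a document. The parameter defaults `k1=1.2` and `b=0.75` are common textbook values, not values reported in the slides, and the function name is hypothetical.

```python
from collections import Counter
from math import log

def bm25_rsv(query_tags, docs, k1=1.2, b=0.75):
    """Score each user's tag 'document' against the query tags.
    docs maps user -> Counter of tag frequencies (hypothetical layout)."""
    n_docs = len(docs)
    avg_len = sum(sum(d.values()) for d in docs.values()) / n_docs
    # Document frequency: in how many users' tag sets each tag appears.
    df = Counter(tag for d in docs.values() for tag in d)
    scores = {}
    for user, d in docs.items():
        length = sum(d.values())
        score = 0.0
        for tag in set(query_tags):
            tf = d.get(tag, 0)
            if tf == 0:
                continue
            idf = log(n_docs / df[tag])
            norm = k1 * ((1 - b) + b * length / avg_len) + tf
            score += idf * (k1 + 1) * tf / norm
        scores[user] = score
    return scores
```

The highest-scoring users would serve as the neighborhood in place of the Pearson-based neighbors.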
The Study
- 7 subjects.
- For each subject, four lists of 10 recommendations each were created (CCF, NwCF, BM25_10, BM25_20).
- The four lists were combined and sorted randomly (because the recommendations overlapped, the combined list had fewer than 40 items).
- Subjects were asked to evaluate relevance (relevant / somewhat relevant / not relevant) and novelty (novel / somewhat novel / not novel).
Description of the Data
- Crawled CUL (CiteULike) for 20 “center users” (only 7 were used for the study).
- Annotation: tuple {user, article, tag}.

Item          # of unique instances
users         358
articles      186,122
tags          51,903
annotations   902,711
Results
[Figure panels: (a) nDCG, (b) Average Novelty, (c) Precision_2@5, (d) Precision_2@10, (e) Precision_2_1@5, (f) Precision_2_1@10]
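The metrics named in the results panels can be sketched as follows. The mapping is an assumption: relevance judgments are coded as gains 2 (relevant), 1 (somewhat relevant) and 0 (not relevant), Precision_2 is read as counting only gain 2, and Precision_2_1 as counting gains 2 and 1. The slides do not spell out these definitions, and the nDCG variant shown (logarithmic discount, base 2) is one common choice.

```python
from math import log2

def dcg(gains):
    """Discounted cumulative gain of a ranked list of gains."""
    return sum(g / log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal else 0.0

def precision_at_k(gains, k, min_gain):
    """Fraction of the top-k items whose gain is at least min_gain."""
    return sum(1 for g in gains[:k] if g >= min_gain) / k
```

For a hypothetical ranked list with gains `[2, 0, 1, 2, 0]`, `precision_at_k(gains, 5, 2)` would correspond to Precision_2@5 and `precision_at_k(gains, 5, 1)` to Precision_2_1@5 under the assumed reading.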
Conclusions
- The rating scale must be considered carefully in a CF approach.
- NwCF, which incorporates the number of raters, decreases the uncertainty produced by items with too few ratings.
- The tag-based user similarity approach shows interesting results, which suggests it may be a valid alternative to Pearson correlation in CF algorithms.
- We will include more users in future studies to make the results more conclusive.
Questions?
Bibliography
[1] Schafer, J., Frankowski, D., Herlocker, J. and Sen, S. 2007. Collaborative Filtering Recommender Systems. The Adaptive Web (May 2007), 291-324.
[2] Manning, C., Raghavan, P. and Schütze, H. 2008. Introduction to Information Retrieval. Cambridge University Press.
