Top-N Recommender Systems: Revisiting Item Neighborhood Methods
George Karypis
Department of Computer Science & Engineering
University of Minnesota
karypis@cs.umn.edu
http://www.cs.umn.edu/~karypis


Abstract
Top-N recommender systems are designed to generate a ranked list of items that a user will find
useful based on the user’s prior activity. These systems have become ubiquitous and are an
essential tool for information filtering and (e-)commerce. Over the years, collaborative filtering,
which derives these recommendations by leveraging the past activities of groups of users, has
emerged as the most prominent approach for solving this problem. Of the multitude of
methods that have been developed, item-based nearest neighbor algorithms are among the
simplest and yet best-performing methods for Top-N recommender systems. These methods
rank the items to be recommended based on how similar they are to the items in a user’s prior
activity history, using various co-occurrence similarity measures.
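
As a rough illustration of the item-based nearest neighbor idea described above, the following sketch scores each unseen item by summing its similarities to the items in a user's history. It is a minimal sketch under stated assumptions: cosine similarity stands in for the unspecified "co-occurrence similarity measures", the toy matrix R is made up, and the function names (cosine_item_similarity, keep_top_k, recommend) are hypothetical rather than taken from the talk or from any library.

```python
import numpy as np

def cosine_item_similarity(R):
    """Cosine similarity between the columns (items) of a user-item matrix."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0            # avoid division by zero for unseen items
    S = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(S, 0.0)           # an item is not its own neighbor
    return S

def keep_top_k(S, k):
    """Keep only the k largest similarities in each column (k >= 1)."""
    S = S.copy()
    for j in range(S.shape[1]):
        drop = np.argsort(S[:, j])[:-k]    # indices of all but the k largest
        S[drop, j] = 0.0
    return S

def recommend(R, S, user, n):
    """Rank unseen items by summed similarity to the user's history."""
    scores = R[user] @ S
    scores[R[user] > 0] = -np.inf      # never recommend items already seen
    return np.argsort(-scores)[:n]

# Toy binary user-item matrix (assumed data): rows are users, columns are items.
R = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 1, 1, 1]], dtype=float)

S = keep_top_k(cosine_item_similarity(R), k=2)
print(recommend(R, S, user=0, n=2))    # ranks item 2 ahead of item 3
```

The similarity function here is fixed in advance, which is exactly the design choice the talk goes on to question.
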
In this talk we present our recent work on these item-based neighborhood methods, which has
substantially improved the accuracy of the predictions. One shortcoming of traditional
item-based neighborhood methods is that they rely on a similarity measure that needs to be
specified a priori. To address this problem we developed a class of item-based neighborhood
methods that estimate a sparse item-item similarity matrix directly from the training data. This
similarity matrix is estimated using a structural equation modeling (SEM) framework, which
requires each column of the user-item matrix to be approximated as a sparse aggregation of some
other columns. These other columns correspond to the learned neighbors, and their aggregation
weights correspond to the learned similarities. A second shortcoming of item-based neighborhood
methods is that the item-item similarity measures rely on co-occurrences, which becomes
problematic when the datasets are very sparse and the number of item pairs with sufficiently
many co-occurrences is small. To address this problem we extended the SEM framework to
estimate a factored version of the item-item similarity matrix. This factored representation
projects the items into a lower-dimensional space, which allows for meaningful similarity
estimates between items that never co-occurred in the original user-item matrix. In addition to
the above, we also discuss and present results from our work to enhance the above SEM models by
incorporating item side information to further improve the Top-N recommendation accuracy and
to address the item cold-start recommendation problem.
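
The idea of approximating each column of the user-item matrix as a sparse aggregation of other columns can be sketched as a regularized least-squares problem. The abstract does not state the objective, the constraints, or the solver, so everything specific below is an assumption made for illustration: an L1 plus L2 penalty, non-negative weights, a zero diagonal so that no item is used to explain itself, and a plain proximal gradient loop. The names fit_sparse_item_model and top_n are hypothetical.

```python
import numpy as np

def fit_sparse_item_model(R, l1=0.1, l2=0.1, n_iter=500):
    """Learn an item-item weight matrix W so that R is approximated by R @ W.

    Assumed objective (not taken from the abstract):
        0.5 * ||R - R W||_F^2 + 0.5 * l2 * ||W||_F^2 + l1 * ||W||_1
    subject to W >= 0 and diag(W) = 0, solved by proximal gradient descent.
    """
    n_items = R.shape[1]
    G = R.T @ R                                     # item-item Gram matrix
    step = 1.0 / (np.linalg.eigvalsh(G)[-1] + l2)   # 1 / Lipschitz constant
    W = np.zeros((n_items, n_items))
    for _ in range(n_iter):
        grad = G @ W - G + l2 * W                   # gradient of the smooth part
        W = W - step * grad
        W = np.maximum(W - step * l1, 0.0)          # soft-threshold, keep W >= 0
        np.fill_diagonal(W, 0.0)                    # rule out the trivial W = I
    return W

def top_n(R, W, user, n):
    """Rank unseen items by their reconstructed score for one user."""
    scores = R[user] @ W
    scores[R[user] > 0] = -np.inf                   # skip items already seen
    return np.argsort(-scores)[:n]

# Toy binary user-item matrix (assumed data): rows are users, columns are items.
R = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 1, 1],
              [0, 1, 1, 1]], dtype=float)

W = fit_sparse_item_model(R)
print(np.round(W, 2))          # learned weights play the role of similarities
print(top_n(R, W, user=0, n=2))
```

Column j of W holds the learned neighbors of item j and their weights. A factored variant in the spirit of the second extension would replace W with a product of two low-rank item matrices, which can give nonzero scores to item pairs that never co-occur; that step is not shown here.
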

Bio
George Karypis is a professor in the Department of Computer Science & Engineering at the
University of Minnesota, Twin Cities. His research interests span the areas of data mining,
bioinformatics, cheminformatics, high performance computing, information retrieval,
collaborative filtering, and scientific computing. His research has resulted in the development of
software libraries for serial and parallel graph partitioning (METIS and ParMETIS), hypergraph
partitioning (hMETIS), parallel Cholesky factorization (PSPASES), collaborative filtering-based
recommendation algorithms (SUGGEST), clustering high-dimensional datasets (CLUTO),
finding frequent patterns in diverse datasets (PAFI), and protein secondary structure
prediction (YASSPP). He has coauthored over 200 papers on these topics and a book titled
“Introduction to Parallel Computing” (Addison Wesley, 2003, 2nd edition). In addition, he
serves on the program committees of many conferences and workshops on these topics, and
on the editorial boards of the IEEE Transactions on Knowledge and Data Engineering, Social
Network Analysis and Data Mining Journal, International Journal of Data Mining and
Bioinformatics, the journal on Current Proteomics, Advances in Bioinformatics, and Biomedicine
and Biotechnology.
