Local Collaborative Autoencoders
Minjin Choi1, Yoonki Jeong1, Joonseok Lee2, Jongwuk Lee1
Sungkyunkwan University (SKKU), South Korea1
Google Research, United States2
Motivation
Global Low-rank Assumption
➢Existing models are based on the global low-rank assumption.
• All users and items share the same latent features.
➢Limitation: some users/items may have different latent features.
[Figure: a binary user-item rating matrix approximated by the product of a user matrix and an item matrix that share the same k latent features.]
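To make the assumption concrete, here is a minimal NumPy sketch (illustrative only, not the authors' code; the toy matrix, the choice of k, and the SVD-based factorization are our assumptions) of a global rank-k approximation, where a single pair of factor matrices is shared by all users and items:

```python
import numpy as np

# Global low-rank assumption: one shared pair of rank-k factors
# approximates the whole user-item matrix.
rng = np.random.default_rng(0)
R = (rng.random((5, 5)) > 0.5).astype(float)   # toy binary user-item matrix

k = 2                                          # shared latent features
U, s, Vt = np.linalg.svd(R, full_matrices=False)
R_hat = (U[:, :k] * s[:k]) @ Vt[:k, :]         # best rank-k approximation

print(np.round(R_hat, 2))                      # dense score predictions
```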
Local Low-rank Assumption
➢Under the local low-rank assumption, a user-item matrix can be
divided into several sub-matrices.
• Each sub-matrix represents different communities.
• Local models represent various communities with different characteristics.
[Figure: overlapping sub-matrices of the user-item matrix, each corresponding to a different community.]
Limitation of Existing Local Models
➢If the local model is too large, it is close to the global model.
• Because LLORMA uses large local models, each local model may fail to
capture the unique characteristics of its community.
• The performance gain may come from an ensemble effect.
[Figure: two large, heavily overlapping sub-matrices that each cover most of the user-item matrix.]
Lee et al., "LLORMA: Local Low-Rank Matrix Approximation," JMLR 2016
Limitation of Existing Local Models
➢If the local model is too small, its accuracy suffers.
• Because sGLSVD uses small local models, some local models may have
insufficient training data.
[Figure: two small, disjoint sub-matrices, each covering only a fragment of the user-item matrix.]
Evangelia Christakopoulou and George Karypis, "Local Latent Space Models for Top-N Recommendation," KDD 2018
Research Question
How to build coherent and accurate local models?
Our Key Contributions
➢We keep each local model small and coherent, but train it with a
relatively large amount of training data.
[Figure (ML10M): accuracy comparison between local models whose training matrix equals the test matrix and local models whose training matrix is larger than the test matrix.]
Our Key Contributions
➢Autoencoder-based models are used as the base model
to train the local model.
• They are useful for capturing non-linear and complicated patterns.
[Figure: a binary user-item rating matrix fed row by row into an autoencoder.]
Sedhain et al., "AutoRec: Autoencoders Meet Collaborative Filtering," WWW 2015
[Figure: an autoencoder that encodes the input rating row r into a hidden representation h(r) with parameters W, b, and decodes it into the reconstruction r̂ with parameters W′, b′.]
Autoencoder-based models take
each row as input.
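As a rough sketch of such a base model, the following is a minimal AutoRec-style autoencoder in PyTorch (illustrative only; the hidden size, activation, and MSE loss are placeholder choices, not the paper's settings):

```python
import torch
import torch.nn as nn

# AutoRec-style autoencoder: takes one user's rating row r as input,
# encodes it as h(r) = sigmoid(W r + b), reconstructs r_hat = W' h(r) + b'.
class AutoRec(nn.Module):
    def __init__(self, n_items: int, hidden: int = 64):
        super().__init__()
        self.encoder = nn.Linear(n_items, hidden)   # W, b
        self.decoder = nn.Linear(hidden, n_items)   # W', b'

    def forward(self, r: torch.Tensor) -> torch.Tensor:
        return self.decoder(torch.sigmoid(self.encoder(r)))

model = AutoRec(n_items=1000)
r = torch.rand(32, 1000)                        # a batch of user rows
loss = nn.functional.mse_loss(model(r), r)      # reconstruction loss
loss.backward()
```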
Proposed Method
Local Collaborative Autoencoders (LOCA)
➢Overall architecture of LOCA
• Step 1: Discovering two local communities for an anchor user
• Step 2: Training a local model with an expanded community
• Step 3: Combining multiple local models
[Figure: LOCA architecture. For each anchor user, a local community is discovered and expanded; each expanded community trains one of local models 1, ..., q, and the local models are combined into the final model.]
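The three steps compose as in the following high-level sketch (hypothetical glue code, not the authors' implementation; `communities` and `combine` are sketched under Steps 1 and 3 below, and `train_local_model` is an assumed helper that fits one weighted autoencoder and returns its prediction matrix):

```python
# Hypothetical LOCA pipeline glue: Step 1 finds communities per anchor,
# Step 2 trains one weighted local model each, Step 3 combines them.
def loca(R, anchors, global_pred, alpha=0.5):
    local_preds, infer_weights = [], []
    for a in anchors:
        sims, core, expanded = communities(R, a)                  # Step 1
        local_preds.append(train_local_model(R, sims, expanded))  # Step 2
        infer_weights.append(sims)           # per-user inference weights
    return combine(global_pred, local_preds, infer_weights, alpha)  # Step 3
```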
Step 1: Discovering Local Communities
➢For an anchor user, determine a local community
and expand the local community for training.
[Figure: similarities between the anchor user and all other users are computed (e.g., 1.0, 0.7, 0.6, ...); the most similar users form the neighbors for the local community, and a larger set of expanded neighbors is used to train the local model.]
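A minimal sketch of this step (assuming cosine similarity over rating rows; the paper's similarity measure and the community sizes may differ):

```python
import numpy as np

# Step 1 sketch: rank users by similarity to the anchor, then take a small
# core community and a larger expanded community for training.
def communities(R, anchor, core_size=50, expand_size=200):
    norms = np.linalg.norm(R, axis=1) + 1e-12
    sims = (R @ R[anchor]) / (norms * norms[anchor])  # cosine similarity
    order = np.argsort(-sims)                         # most similar first
    core = order[:core_size]        # neighbors defining the local community
    expanded = order[:expand_size]  # expanded neighbors used for training
    return sims, core, expanded
```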
Step 2: Training a Local Model
➢Train the local model with the expanded community.
• Use the autoencoder-based model for training the local model.
• Note: It is possible to utilize any base models, e.g., MF, AE and EASER.
• The similarities with the anchor are used for the user weights for training.
[Figure: the expanded community's rating matrix, with each user row weighted by its similarity to the anchor (1.0, 0.6, 0.3, ...) during training.]
Harald Steck, "Embarrassingly Shallow Autoencoders for Sparse Data," WWW 2019
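A sketch of Step 2's per-user weighting (our reading of the slide: each user's reconstruction loss is scaled by that user's similarity to the anchor; `model` can be any row-wise autoencoder, such as the AutoRec sketch above):

```python
import torch

# Step 2 sketch: anchor similarities act as per-user weights t_u on the
# reconstruction loss of the local autoencoder.
def weighted_loss(model, R_batch, user_weights):
    R_hat = model(R_batch)                          # (batch, n_items)
    per_user = ((R_hat - R_batch) ** 2).sum(dim=1)  # squared error per user
    return (user_weights * per_user).sum()          # similarity-weighted sum
```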
Step 3: Combining Multiple Local Models
➢Aggregate multiple local models, each weighted by its
per-user weight.
[Figure: each local model produces a predicted score matrix over its community; the predictions of local models 1, ..., q are aggregated into a single prediction matrix.]
Training Local Models in Detail
➢The loss function for the $j$-th local model:

$$\operatorname*{argmin}_{\theta^{(j)}} \; \sum_{\mathbf{r}_u \in \mathbf{R}} t_u^{(j)} \, \mathcal{L}\!\left(\mathbf{r}_u,\, M_{\mathrm{local}}(\mathbf{r}_u; \theta^{(j)})\right) + \lambda\, \Omega(\theta^{(j)})$$

➢Aggregating all local models and a global model:

$$\hat{\mathbf{R}} = \alpha\, M_{\mathrm{global}}(\mathbf{R}; \theta^{(g)}) + (1 - \alpha) \sum_{j=1}^{q} \mathbf{w}^{(j)} \odot M_{\mathrm{local}}(\mathbf{R}; \theta^{(j)}) \oslash \mathbf{w}$$

Here, $\theta^{(j)}$ denotes the parameters of the $j$-th local model $M_{\mathrm{local}}$, $t_u^{(j)}$ is the user weight for training the $j$-th local model, and $\mathbf{w}^{(j)}$ is the user weight for inferring the $j$-th local model. The global model is used to handle the users who are not covered by local models.
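A NumPy sketch of the aggregation formula (assuming, from the $\oslash\, \mathbf{w}$ term, that $\mathbf{w}$ accumulates the per-user inference weights across local models; this normalization is our interpretation, not stated explicitly on the slide):

```python
import numpy as np

# Combine a global prediction with q local predictions: each local
# prediction is scaled row-wise by its user weight w^(j) (the ⊙), and the
# weighted sum is divided elementwise by the total weight w (the ⊘).
def combine(global_pred, local_preds, user_weights, alpha=0.5):
    w_total = np.sum(user_weights, axis=0) + 1e-12        # w, per user
    local = sum(w[:, None] * P for w, P in zip(user_weights, local_preds))
    return alpha * global_pred + (1 - alpha) * local / w_total[:, None]
```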
How to Choose Anchor Users
➢Random selection: Choose 𝑘 anchor users at random.
➢Need to maximize the coverage of users by local models.
• Finding the optimal maximum coverage is NP-hard.
➢Use the greedy method to maximize the coverage.
• Select the anchor user who has the most uncovered users iteratively.
[Figure: three user groups (Group 1, Group 2, Group 3), each covered by a greedily selected anchor user.]
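A sketch of the greedy selection (the standard greedy maximum-coverage heuristic; `cover_sets[u]` is assumed to be the set of users covered when u is chosen as an anchor):

```python
# Greedy maximum coverage: iteratively pick the anchor whose community adds
# the most users that no previously chosen anchor covers.
def select_anchors(cover_sets, q):
    covered, anchors = set(), []
    for _ in range(q):
        best = max(range(len(cover_sets)),
                   key=lambda u: len(cover_sets[u] - covered))
        anchors.append(best)
        covered |= cover_sets[best]
    return anchors
```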
Experiments
Experimental Setup: Dataset
➢We evaluate our model over five public datasets
with various characteristics (e.g., domain, sparsity).
Dataset                  # of users   # of items   # of ratings   Sparsity
MovieLens 10M (ML10M)        69,878       10,677     10,000,054     98.66%
MovieLens 20M (ML20M)       138,493       26,744     20,000,263     99.46%
Amazon Music (AMusic)         4,964       11,797         97,439     99.83%
Amazon Game (AGame)          13,063       17,408        236,415     99.90%
Yelp                         25,677       25,815        731,671     99.89%
Evaluation Protocol and Metrics
➢Evaluation protocol: leave-5-out
• Hold out the last 5 interactions of each user as the test data.
➢Evaluation metrics
• Recall@100
• Measures how many of the test items are included in the top-N list.
• NDCG@100
• Measures the ranking of test items in the top-N list.
[Figure: per-user interaction timeline; earlier interactions are used as training data and the last 5 interactions as test data.]
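For reference, a minimal sketch of the two metrics for a single user (`ranked` is the model's top-N list, `test_items` the 5 held-out interactions; the exact normalization used in the paper may differ):

```python
import numpy as np

def recall_at_k(ranked, test_items, k=100):
    # fraction of held-out items that appear in the top-k list
    hits = len(set(ranked[:k]) & set(test_items))
    return hits / min(k, len(test_items))

def ndcg_at_k(ranked, test_items, k=100):
    # discounted gain for each held-out item found, normalized by the ideal
    dcg = sum(1.0 / np.log2(i + 2)
              for i, item in enumerate(ranked[:k]) if item in test_items)
    idcg = sum(1.0 / np.log2(i + 2) for i in range(min(k, len(test_items))))
    return dcg / idcg
```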
Competitive Global/Local Models
➢Four autoencoder-based global models
• CDAE: a denoising autoencoder-based model with a latent user vector
• MultVAE: a VAE-based model
• RecVAE: a VAE-based model by improving MultVAE
• EASER: an item-to-item latent factor model
➢Two local models
• LLORMA: local model using MF as the base model
• sGLSVD: local model using SVD as the base model
Yao Wu et al., "Collaborative Denoising Auto-Encoders for Top-N Recommender Systems," WSDM 2016
Dawen Liang et al., "Variational Autoencoders for Collaborative Filtering," WWW 2018
Ilya Shenbin et al., "RecVAE: A New Variational Autoencoder for Top-N Recommendations with Implicit Feedback," WSDM 2020
Harald Steck, "Embarrassingly Shallow Autoencoders for Sparse Data," WWW 2019
Lee et al., "LLORMA: Local Low-Rank Matrix Approximation," JMLR 2016
Evangelia Christakopoulou and George Karypis, "Local Latent Space Models for Top-N Recommendation," KDD 2018
Accuracy: LOCA vs. Competing Models
Dataset   Metric       CDAE    MultVAE  EASER   RecVAE  LLORMA  sGLSVD  LOCA_VAE  LOCA_EASE
ML10M     Recall@100   0.4685  0.4653   0.4648  0.4705  0.4692  0.4468  0.4865    0.4798
          NDCG@100     0.1982  0.1945   0.2000  0.1996  0.2042  0.1953  0.2073    0.2049
ML20M     Recall@100   0.4324  0.4397   0.4468  0.4417  0.3355  0.4342  0.4419    0.4654
          NDCG@100     0.1844  0.1860   0.1948  0.1857  0.1446  0.1919  0.1884    0.2024
AMusic    Recall@100   0.0588  0.0681   0.0717  0.0582  0.0517  0.0515  0.0748    0.0717
          NDCG@100     0.0712  0.0822   0.0821  0.0810  0.0638  0.0613  0.0893    0.0826
AGame     Recall@100   0.1825  0.2081   0.1913  0.1920  0.1223  0.1669  0.2147    0.1947
          NDCG@100     0.0808  0.0920   0.0915  0.0849  0.0539  0.0777  0.0966    0.0922
Yelp      Recall@100   0.2094  0.2276   0.2187  0.2262  0.1013  0.1965  0.2354    0.2205
          NDCG@100     0.0920  0.0982   0.0972  0.0975  0.0429  0.0857  0.1103    0.0981
(CDAE, MultVAE, EASER, RecVAE: global models; LLORMA, sGLSVD: local models; LOCA_VAE, LOCA_EASE: ours)
➢LOCA consistently outperforms competitive
global/local models over five benchmark datasets.
Effect of the Number of Local Models
➢The accuracy of LOCA improves consistently as the number of
local models increases.
[Figure: NDCG@100 vs. the number of local models (0-300) on ML10M and AMusic, comparing MultVAE, Ensemble_VAE, LLORMA_VAE, and LOCA_VAE.]
Effect of Anchor Selection Method
➢Our coverage-based anchor selection outperforms
the other methods in terms of both accuracy and coverage.
[Figure: NDCG@100 vs. the number of local models (50-300) on ML10M and AMusic, comparing Random, K-means, and our coverage-based anchor selection.]
Illustration of LOCA
➢When a user has multiple tastes, LOCA captures the user's
preferences by combining different local patterns.
• For a user (66005 in ML10M) who likes Sci-Fi and Horror movies, LOCA
achieves higher accuracy.
Recommendation   Local 70           Local 179       Global            Ground Truth
Top-1            Sci-Fi, Adventure  Horror, Action  Thriller, Action  Sci-Fi, Action
Top-2            Sci-Fi, Horror     Horror, Drama   Drama             Horror, Thriller
Top-3            Sci-Fi, Action     Horror, Drama   Drama, Mystery    Horror, Action
Conclusion
Conclusion
➢We propose a new local recommender framework,
namely local collaborative autoencoders (LOCA).
➢LOCA can handle a large number of local models effectively.
• Adopts different training and inference strategies for each local model.
• Utilizes autoencoder-based models as the base model.
• Makes use of a greedy maximum coverage method to build diverse
local models.
➢LOCA outperforms the state-of-the-art global and local models
over various benchmark datasets.
Q&A
Code: https://github.com/jin530/LOCA
Email: zxcvxd@skku.edu