A Study of Smoothing Methods for Relevance-Based
Language Modelling of Recommender Systems
Daniel Valcarce, Javier Parapar, Álvaro Barreiro
{daniel.valcarce, javierparapar, barreiro}@udc.es – http://www.irlab.org
Information Retrieval Lab, Computer Science Department, University of A Coruña
Overview
Language Models have been traditionally used in several fields such as speech recognition or document retrieval. Recently,
Relevance-Based Language Models have been extended to Collaborative Filtering Recommender Systems [1]. In
this field, a Relevance Model is estimated for each user based on the probabilities of the items. As has been thoroughly studied,
smoothing plays a key role in the estimation of a Language Model [2]. Our aim in this work is to study smoothing methods
in the context of Collaborative Filtering Recommender Systems.
RM for Recommendation
IR         RecSys
Query      Target user
Document   Neighbour
Term       Item
RM1: $p(i|R_u) \propto \sum_{v \in V_u} p(v)\, p(i|v) \prod_{j \in I_u} p(j|v)$   (1)
RM2: $p(i|R_u) \propto p(i) \prod_{j \in I_u} \sum_{v \in V_u} \frac{p(i|v)\, p(v)}{p(i)}\, p(j|v)$   (2)
• $I_u$ is the set of items rated by the user u
• $V_u$ is the set of neighbours of the user u
• p(i) and p(v) are considered uniform
• p(i|u) is computed by smoothing the maximum likelihood estimate $p_{ml}(i|u) = \frac{r_{u,i}}{\sum_{j \in I_u} r_{u,j}}$
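The two estimators above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the smoothed probabilities p(i|v) are already available as a nested dictionary `p[v][i]`, and it drops the uniform priors p(i) and p(v) because they do not affect the ranking. The function names and the toy probabilities are hypothetical.

```python
import math

def rm1_scores(p, neighbours, profile_items, candidates):
    """RM1 with uniform p(v): sum over neighbours of p(i|v) * prod_j p(j|v)."""
    scores = {}
    for i in candidates:
        scores[i] = sum(
            p[v][i] * math.prod(p[v][j] for j in profile_items)
            for v in neighbours
        )
    return scores

def rm2_log_scores(p, neighbours, profile_items, candidates):
    """RM2 with uniform p(i) and p(v), returned in log space:
    sum_j log( sum_v p(i|v) * p(j|v) )."""
    scores = {}
    for i in candidates:
        scores[i] = sum(
            math.log(sum(p[v][i] * p[v][j] for v in neighbours))
            for j in profile_items
        )
    return scores

# Hypothetical smoothed probabilities p(i|v) for two neighbours.
p = {
    "v1": {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1},
    "v2": {"a": 0.1, "b": 0.2, "c": 0.3, "d": 0.4},
}
neighbours = ["v1", "v2"]
profile_items = ["a", "b"]   # I_u: items already rated by the target user
candidates = ["c", "d"]      # items to rank

rm1 = rm1_scores(p, neighbours, profile_items, candidates)
rm2 = rm2_log_scores(p, neighbours, profile_items, candidates)
```

RM2 is computed in log space because a product over a long profile can underflow. Both functions require every p(j|v) to be non-zero, which is why the estimates are smoothed.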
Smoothing methods
Smoothing deals with data sparsity and plays a similar role to the IDF using a background model:
$p(i|C) = \frac{\sum_{v \in U} r_{v,i}}{\sum_{j \in I,\, v \in U} r_{v,j}}$
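As a rough sketch, the maximum likelihood estimate and the background model can be computed directly from a rating dictionary. The layout `ratings[user][item]`, the function names and the toy ratings are assumptions made for illustration only.

```python
from collections import defaultdict

def p_ml(ratings, u):
    """Maximum likelihood estimate p_ml(i|u) = r_{u,i} / sum_j r_{u,j}."""
    total = sum(ratings[u].values())
    return {i: r / total for i, r in ratings[u].items()}

def p_background(ratings):
    """Background model p(i|C): rating mass of item i over all rating mass."""
    item_mass = defaultdict(float)
    for profile in ratings.values():
        for i, r in profile.items():
            item_mass[i] += r
    total = sum(item_mass.values())
    return {i: m / total for i, m in item_mass.items()}

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "u1": {"a": 5, "b": 3},
    "u2": {"a": 4, "c": 2},
    "u3": {"b": 1, "c": 5},
}
ml_u1 = p_ml(ratings, "u1")
background = p_background(ratings)
```

In this toy data `ml_u1` gives zero probability to item `c`, which u1 has not rated. The background model is what the smoothing methods use to fill that gap.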
Jelinek-Mercer (JM) Linear interpolation. Parameter λ.
$p_\lambda(i|u) = (1 - \lambda)\, p_{ml}(i|u) + \lambda\, p(i|C)$   (3)
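A minimal sketch of Eq. (3), assuming a user profile given as `{item: rating}` and a background model given as `{item: probability}`. The function name and the toy numbers are hypothetical.

```python
def jelinek_mercer(profile, background, lam):
    """Jelinek-Mercer: (1 - lam) * p_ml(i|u) + lam * p(i|C) for every item."""
    total = sum(profile.values())
    return {
        i: (1 - lam) * profile.get(i, 0) / total + lam * p_c
        for i, p_c in background.items()
    }

# Hypothetical toy data.
profile = {"a": 5, "b": 3}                       # ratings of user u
background = {"a": 0.45, "b": 0.20, "c": 0.35}   # p(i|C)
jm = jelinek_mercer(profile, background, lam=0.5)
```

With λ = 0 the estimate reduces to the maximum likelihood estimate, and with λ = 1 it reduces to the background model.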
Dirichlet Priors (DP) Bayesian analysis. Parameter µ.
$p_\mu(i|u) = \frac{r_{u,i} + \mu\, p(i|C)}{\mu + \sum_{j \in I_u} r_{u,j}}$   (4)
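Eq. (4) can be sketched in the same way, again assuming a profile `{item: rating}` and a background model `{item: probability}`. The function name and the toy numbers are hypothetical.

```python
def dirichlet_prior(profile, background, mu):
    """Dirichlet priors: (r_{u,i} + mu * p(i|C)) / (mu + sum_j r_{u,j})."""
    total = sum(profile.values())
    return {
        i: (profile.get(i, 0) + mu * p_c) / (mu + total)
        for i, p_c in background.items()
    }

# Hypothetical toy data.
profile = {"a": 5, "b": 3}                       # ratings of user u
background = {"a": 0.45, "b": 0.20, "c": 0.35}   # p(i|C)
dp = dirichlet_prior(profile, background, mu=8)
```

Unlike Jelinek-Mercer, the amount of smoothing here depends on the profile: the weight on the background model is µ / (µ + Σ r_{u,j}), so users with more rating mass are smoothed less. In this toy case µ = 8 equals the profile's rating mass, so the result coincides with Jelinek-Mercer at λ = 0.5.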
Absolute Discounting (AD) Subtract a constant δ.
$p_\delta(i|u) = \frac{\max(r_{u,i} - \delta,\, 0) + \delta\, |I_u|\, p(i|C)}{\sum_{j \in I_u} r_{u,j}}$   (5)
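A minimal sketch of Eq. (5) under the same assumptions (profile as `{item: rating}`, background model as `{item: probability}`); the function name and the toy numbers are hypothetical.

```python
def absolute_discounting(profile, background, delta):
    """Absolute discounting: subtract delta from each rating and give the
    discounted mass, delta * |I_u|, back through the background model."""
    total = sum(profile.values())
    n_rated = len(profile)  # |I_u|
    return {
        i: (max(profile.get(i, 0) - delta, 0) + delta * n_rated * p_c) / total
        for i, p_c in background.items()
    }

# Hypothetical toy data.
profile = {"a": 5, "b": 3}                       # ratings of user u
background = {"a": 0.45, "b": 0.20, "c": 0.35}   # p(i|C)
ad = absolute_discounting(profile, background, delta=0.5)
```

As long as δ does not exceed the smallest rating in the profile, the mass removed from the rated items is exactly δ |I_u|, so the estimate still sums to one.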
Experiments
[Plot: P@5 against the smoothing parameter (λ / δ from 0 to 1, µ from 0 to 1000) for RM1 + AD, RM1 + JM, RM1 + DP, RM2 + AD, RM2 + JM and RM2 + DP.]
Precision at 5 of the RM1 and the RM2 algorithms using Absolute Discounting (AD), Jelinek-Mercer (JM) and Dirichlet priors (DP) smoothing methods for the MovieLens 100k dataset.
[Plot: P@5 against δ and #ratings.]
Precision at 5 of the RM2 algorithm using AD when varying the smoothing intensity and considering different numbers of ratings in the user profiles for the MovieLens 1M dataset.
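For reference, precision at 5 is the fraction of the top five recommended items that are relevant to the user. The sketch below is illustrative only. The poster does not state how relevance is decided in the experiments, so the relevant set is simply passed in, and the function name and toy ranking are hypothetical.

```python
def precision_at_k(ranked_items, relevant_items, k=5):
    """Fraction of the first k recommended items that are relevant."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    return hits / k

# Hypothetical ranking and relevance judgements for one user.
ranked = ["a", "b", "c", "d", "e", "f"]
relevant = {"a", "c", "f"}
p_at_5 = precision_at_k(ranked, relevant, k=5)
```

Here two of the first five items are relevant; item `f` is relevant but ranked sixth, so it does not count.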
Conclusions
• There are no big differences in optimal precision among the studied smoothing techniques.
• Dirichlet priors and, especially, Jelinek-Mercer suffer a significant decrease in precision when a large amount of smoothing is applied.
• Absolute Discounting behaves almost as a parameter-free smoothing method.
Bibliography
[1] J. Parapar, A. Bellogín, P. Castells, and A. Barreiro. Relevance-based language modelling for recommender systems. IPM, 49(4):966–980, July 2013.
[2] C. Zhai and J. Lafferty. A study of smoothing methods for language models applied to information retrieval. ACM TOIS, 22(2):179–214, Apr. 2004.
ECIR 2015, 37th European Conference on Information Retrieval. 29 March - 2 April, 2015, Vienna, Austria
