| 0
Daniel Kershaw (@danjamker)
Building Recommenders
20th September 2017
| 1
Mendeley
• Reference Manager
• Social Network
• Publication Catalogue
| 2
Science Direct
• Scientific publication database
• Used by the majority of
university and research
institutions
• Contains 12 million articles of
content from 3,500 academic
journals and 34,000 e-books
| 4
Why Recommendations
Pull
Allow users to discover more content
Make it easier to navigate catalogue
Push
Highlight new content to users
Bring users back to service
| 5
The five core components
Data Collection
Recommender Model
Recommendation
Post Processing
Online
Modules
User Interface
| 6
Outline
Developed Algorithms – keeping it simple
Practical Considerations – don’t look stupid
Implementation – how to scale a system
Evaluation – what is good enough
Evolution – what’s changed over time
Future Direction – the future’s bright, the future’s deep
| 7
Developed Algorithms
| 8
Available Data
Implicit
User libraries (Mendeley)
User article interactions (Science Direct)
Content
Abstracts
Titles
References
| 9
Potential Methods
Content Based
• Similarity between what users have read
• Similarity in references
Collaborative
• Matrix Factorization
• KNN
• LDA
| 10
User-based CF (kNN)
User-item interaction matrix
https://buildingrecommenders.wordpress.com/2015/11/18/overview-of-recommender-algorithms-part-2/
| 11
User-based CF (kNN)
Similarity between the query user and other readers
| 12
User-based CF (kNN)
Similarity between all users
| 13
User-based CF (kNN)
Generating recommendations for the user
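The user-based kNN steps on the preceding slides (interaction matrix, user similarity, recommendation generation) can be sketched as follows. This is a minimal illustration and not the production system: it assumes binary implicit data (an item is either in a user's library or not) and cosine similarity over item sets, and the names `cosine`, `recommend`, `k` and `n` are hypothetical.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two sets of item ids (binary interactions)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend(libraries, user, k=2, n=5):
    """Recommend up to n items to `user` from the k most similar users.

    `libraries` maps a user id to the set of item ids in that user's library.
    """
    target = libraries[user]
    # Score every other user against the query user and keep the top k.
    neighbours = sorted(
        ((cosine(target, items), other)
         for other, items in libraries.items() if other != user),
        reverse=True,
    )[:k]
    # Each neighbour votes for the items the query user does not have yet,
    # weighted by how similar that neighbour is.
    scores = defaultdict(float)
    for similarity, other in neighbours:
        if similarity <= 0.0:
            continue
        for item in libraries[other] - target:
            scores[item] += similarity
    # Highest score first; ties are broken by item id for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]
```

For example, with libraries `{"u1": {"a", "b", "c"}, "u2": {"a", "b", "d"}, "u3": {"x", "y"}, "u4": {"b", "c", "e"}}`, the call `recommend(libraries, "u1")` draws only on `u2` and `u4`, since `u3` shares nothing with `u1`.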
| 14
Why not Matrix Factorization?
• Ability to scale
• Matrix incredibly sparse
| 15
Practical Considerations
| 16
Explore/Exploit (Dithering)
Recommendations generated in batch
Users want an interactive experience
Slight shuffles give the impression of
freshness
Allows exploration of the full list when only
a proportion is shown
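$\mathrm{score}_{\mathrm{dithered}} = \log(\mathrm{rank}) + \mathcal{N}(0, \log \varepsilon)$
where $\varepsilon = \Delta\mathrm{rank} / \mathrm{rank}$ and typically $\varepsilon \in [1.5, 2]$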
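A minimal sketch of the dithering formula on this slide. The slide does not say whether log ε is the standard deviation or the variance of the noise; the sketch assumes standard deviation. The function name `dither` and the `rng` parameter are hypothetical.

```python
import math
import random


def dither(ranked_items, epsilon=1.5, rng=None):
    """Reorder a ranked list by adding Gaussian noise to the log of each rank.

    ASSUMPTION: log(epsilon) is used as the standard deviation of the noise.
    Items near the top move little; items further down swap more freely.
    """
    rng = rng or random.Random()
    sigma = math.log(epsilon)
    scored = []
    for rank, item in enumerate(ranked_items, start=1):
        score = math.log(rank) + rng.gauss(0.0, sigma)
        scored.append((score, item))
    # Lower dithered score means a better (earlier) position.
    scored.sort(key=lambda pair: pair[0])
    return [item for _, item in scored]
```

Passing a seeded `random.Random` makes a shuffle reproducible, which helps when a served list needs to be reconstructed from logs.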
| 17
Impression Discounting
• Experience deteriorates if exposed to the same information
• Push recommendations seen before down the list
[Figure: axes labelled Rank and Impressions]
| 18
Impression Discounting
• Experience deteriorates if exposed to the same information
• Push recommendations seen before down the list
$\mathrm{score}_{\mathrm{new}} = \mathrm{score}_{\mathrm{original}} \cdot (w_1 \cdot g(\mathrm{impCount}) + w_2 \cdot g(\mathrm{lastSeen}))$
See Lee, P. et al. (2014)
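The discounting formula can be sketched as below. The slide does not define g, so both discount functions here are hypothetical choices made for illustration, as are the default weights and the 7-day time constant. They are not the functions from Lee et al. or from the production system.

```python
import math


def g_imp_count(imp_count):
    """HYPOTHETICAL discount: shrinks as the item is shown more often."""
    return 1.0 / (1.0 + imp_count)


def g_last_seen(days_since_seen):
    """HYPOTHETICAL recovery: returns towards 1 as the last impression ages."""
    if days_since_seen is None:  # the item has never been shown
        return 1.0
    return 1.0 - math.exp(-days_since_seen / 7.0)


def discounted_score(score_original, imp_count, days_since_seen,
                     w1=0.5, w2=0.5):
    """score_new = score_original * (w1 * g(impCount) + w2 * g(lastSeen))."""
    return score_original * (
        w1 * g_imp_count(imp_count) + w2 * g_last_seen(days_since_seen)
    )
```

With weights that sum to 1, an item that has never been shown keeps its original score, and each further impression pushes it down the list.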
| 19
Business Logic (Pre and Post Filtering)
Don’t show items they already have (bought, added, consumed)
Don’t feed the recommender positive feedback from recommender
Don’t recommend out of stock items
• A bad recommender has a cost
- Can be greater than not receiving a recommendation
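The filtering rules above could look roughly like this. It is a sketch under assumptions: the function names and the `source` field on events are hypothetical, and real business rules would be specific to each product.

```python
def post_filter(recommendations, already_owned, unavailable):
    """Drop items the user already has and items that cannot be served."""
    blocked = set(already_owned) | set(unavailable)
    return [item for item in recommendations if item not in blocked]


def training_events(events):
    """Keep only interactions that did not originate from a recommendation.

    ASSUMPTION: each event is a dict with a 'source' key, and interactions
    driven by the recommender are tagged 'recommender'.
    """
    return [event for event in events if event.get("source") != "recommender"]
```

Filtering after ranking keeps the model itself simple, at the cost of sometimes returning fewer items than were requested.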
| 20
Implementation
| 21
Systems Architecture
[Architecture diagram. Labels: Online, Offline, AWS, Logs, Candidate Selection, Content Based, Item2Item, CF, Dithering, Impression Discounting, API, Front End]
| 22
The unbundled mess
| 23
Logging
Used for both debugging and feeding information to the recommender
System
• Which run generated the recommendation
• What was served to the user
• How the score was modified
• What was removed from the recommendations
User (Feedback loop)
• What was displayed
• What was clicked
• When they were served
• Where the recommendations were displayed
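One way to capture the system-side and user-side items listed on this slide in a single structured record. The field names and the JSON-lines format are assumptions for illustration, not the schema used at Mendeley or Science Direct.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class RecommendationLog:
    # System side
    run_id: str                     # which batch run generated the list
    user_id: str
    served: List[str]               # what was served to the user
    removed: List[str]              # what filtering took out
    score_changes: Dict[str, float] = field(default_factory=dict)  # item -> score delta
    # User side (feedback loop)
    displayed: List[str] = field(default_factory=list)
    clicked: List[str] = field(default_factory=list)
    served_at: str = ""             # ISO 8601 timestamp
    surface: str = ""               # where the list was shown, e.g. "email"


def to_log_line(record):
    """Serialise one record as a single JSON line."""
    return json.dumps(asdict(record), sort_keys=True)
```

Keeping both sides in one record makes it possible to trace a click back to the run and the score modifications that produced it.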
| 24
Evolution
| 25
Mendeley – Desktop Application
• User to Item CF
• Impression Discounting
| 26
Mendeley – Online
Ordered from most personalized to least personalized:
• Implicit – serves recommendations based on user libraries
• Recent Activity – based on recent additions to a user’s library
• Research Interests – based on user-generated tags
• Discipline – based on the user’s self-identified discipline
See Hristakeva, M. et al. (2017)
| 27
Mendeley – Online
• Remove carousels
• Focus on implicit recommendations
• Fall back to content based solution
| 28
Mendeley – Email
• Recommendations based on the complete library of the user
• Don’t send the same recommendations twice
| 29
Science Direct – Email
• Item to Item
• Take user reading history
• Get recommendations for each item
• Interleave recommendations
• Don’t send the same recommendations twice
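The email steps above (per-item recommendations, interleaving, no repeats) can be sketched as a round-robin merge. The function name, its parameters and the default `limit` are hypothetical, and round-robin is only one plausible reading of "interleave".

```python
from itertools import zip_longest


def interleave(per_item_recs, already_sent, history, limit=5):
    """Round-robin over the recommendation lists of each item a user read.

    Skips anything the user has already read or has already been emailed.
    """
    seen = set(already_sent) | set(history)
    merged = []
    # zip_longest pads the shorter lists with None so every list is exhausted.
    for round_of_items in zip_longest(*per_item_recs):
        for item in round_of_items:
            if item is None or item in seen:
                continue
            seen.add(item)
            merged.append(item)
            if len(merged) == limit:
                return merged
    return merged
```

After sending, the returned items would be added to `already_sent` so that the next email does not repeat them.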
| 30
Science Direct – Article Page
• Item to Item
• Dither recommendations every 30 minutes
| 31
Evaluation
| 32
Off-line Methodology
[Diagram: user interactions ordered by time, split into a portion used to train the model and a test portion providing the query and ground truth]
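A time-based split of this kind could be sketched as below. The event layout `(user, item, timestamp)` and the choice of precision@k as the metric are assumptions made for illustration; the slide does not name the metric used.

```python
def temporal_split(events, cutoff):
    """Split (user, item, timestamp) events at a cutoff timestamp.

    Events before the cutoff train the model; events at or after it
    serve as ground truth.
    """
    train = [event for event in events if event[2] < cutoff]
    test = [event for event in events if event[2] >= cutoff]
    return train, test


def precision_at_k(recommended, ground_truth, k):
    """Fraction of the top-k recommended items that appear in the ground truth."""
    top = recommended[:k]
    if not top:
        return 0.0
    hits = sum(1 for item in top if item in ground_truth)
    return hits / len(top)
```

Splitting on time rather than at random avoids training on interactions that happened after the ones being predicted.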
| 33
Off-line evaluation - Mendeley
From Hristakeva, M. et al. (2017)
| 34
Science Direct – Item-to-item
| 35
Static Recommendations for quick learnings
• Infrastructure takes a long time to build
• Need feedback from users to learn
1. Generate recommendations off-line
2. Send to users via email (A/A)
3. Modify method based on feedback
4. Send to a second set of users split into A/B buckets
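Splitting users into buckets for the A/A and A/B email sends could be done with a deterministic hash, as in this sketch. The function name, the experiment label and the use of SHA-256 are assumptions, not details from the talk.

```python
import hashlib


def assign_bucket(user_id, experiment, buckets=("A", "B")):
    """Deterministically map a user to a bucket for a named experiment.

    Hashing the experiment name together with the user id means the same
    user can land in different buckets across experiments, but always in
    the same bucket within one experiment.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return buckets[int(digest, 16) % len(buckets)]
```

For the A/A step, both buckets receive the same recommendations, for example `assign_bucket(user_id, "email-aa", buckets=("A1", "A2"))`, so any difference in click rates reflects noise rather than the method.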
| 36
Future Direction
| 37
Learning to rank (LtR)
Currently only using implicit feedback
No content used
Use CF as candidate selection
Re-rank results based on a learnt model
optimised for CTR
Use item and user features
| 38
Deep Learning
Use to learn more complex features
Use as features in LtR
Build on the existing framework developed
Use pre-trained models before developing own
| 39
Conclusion (Take Homes)
• Log EVERYTHING
• Start Simple
• Iterate quickly
• Get recommendations out quickly to learn
• Don’t look stupid
• CTR ≇ Off-line Evaluation
| 40
www.elsevier.com/rd-solutions
Thank you,
Book chapter being written based on the content in this presentation
| 41
References
Hristakeva, M., Kershaw, D., Rossetti, M., Knoth, P., Pettit, B., Vargas, S., & Jack, K. (2017). Building recommender systems for scholarly information. the 1st Workshop (pp. 25–32). New York, NY, USA: ACM. http://doi.org/10.1145/3057148.3057152
Rossetti, M., Stella, F., & Zanker, M. (2016). Contrasting Offline and Online Results when Evaluating Recommendation Algorithms (pp. 31–34). Presented at the Proceedings of the 10th ACM Conference on Recommender Systems, New York, NY, USA: ACM. http://doi.org/10.1145/2959100.2959176
Lee, P., Lakshmanan, L. V. S., Tiwari, M., & Shah, S. (2014). Modeling impression discounting in large-scale recommender systems (pp. 1837–1846). Presented at the Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA: ACM Press. http://doi.org/10.1145/2623330.2623356
Koren, Y. (2010). Collaborative filtering with temporal dynamics. Communications of the ACM, 53(4), 89–97. http://doi.org/10.1145/1721654.1721677
