© 2019 Fiverr Int. Lmt. All Rights Reserved. Proprietary &
Confidential.
Deep Learning for
Recommender
Systems
Asi Messica
April 2020
Fiverr is the world’s largest
marketplace for freelance
services. We change how the
world works together by
connecting business decision
makers with talented
freelancers.
Personalization - Why?
Help users find services/products/content
that are relevant to them, to maximize
user satisfaction and retention
~2012
Deep Learning becomes popular in
Machine Learning
~2016
Deep Learning becomes popular in
Recommender Systems
• Feature extraction directly from the content
• Heterogeneous data handled easily
• Sequential behavior modeling with RNNs/CNNs
• More accurate representation learning of users
and items
• Nonlinear transformations
• Deep learning worked well in other complex
domains. Worth a try!
Item to Item Collaborative
Filtering
Recommend items that similar users have
chosen
Similar items are items which were chosen by
the same users
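A minimal sketch of item-to-item similarity, assuming implicit feedback stored as a set of chosen items per user. The function names, the cosine co-occurrence measure, and the toy data are illustrative assumptions, not taken from the slides.

```python
import math
from collections import defaultdict

def item_similarities(interactions):
    """interactions: dict user -> set of item ids. Returns cosine similarity per item pair."""
    users_by_item = defaultdict(set)
    for user, items in interactions.items():
        for item in items:
            users_by_item[item].add(user)
    sims = {}
    items = sorted(users_by_item)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            common = len(users_by_item[a] & users_by_item[b])  # users who chose both items
            if common:
                sim = common / math.sqrt(len(users_by_item[a]) * len(users_by_item[b]))
                sims[(a, b)] = sims[(b, a)] = sim
    return sims

def recommend(user, interactions, sims, k=3):
    """Score unseen items by their summed similarity to the items the user already chose."""
    seen = interactions[user]
    scores = defaultdict(float)
    for (a, b), s in sims.items():
        if a in seen and b not in seen:
            scores[b] += s
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]

# Hypothetical toy data: three users and the services they chose.
toy = {"u1": {"logo", "banner"}, "u2": {"logo", "banner", "video"}, "u3": {"video", "voiceover"}}
sims = item_similarities(toy)
print(recommend("u1", toy, sims))
```

In this toy data, "video" is the only unseen item that shares users with the items u1 already chose, so it is the only recommendation.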
Matrix Factorization
Approximate the rating matrix by the product of two
lower-rank matrices
Latent variables are introduced to represent the
underlying reasons a user purchases a product
Each user and each item is represented by a
d-dimensional vector of latent features
…
Koren et al. (2009)
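A rough sketch of the factorization idea, assuming explicit ratings given as (user, item, rating) triples and plain stochastic gradient descent with L2 regularization. The hyperparameters, function names, and toy ratings are assumptions for illustration only.

```python
import math
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of the user and item latent vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def factorize(ratings, n_users, n_items, d=2, lr=0.05, reg=0.02, epochs=200, seed=0):
    """Learn d-dimensional user (P) and item (Q) vectors with plain SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.1) for _ in range(d)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(d)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(d):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(P, Q, ratings):
    return math.sqrt(sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings))

# Hypothetical observed entries of a 3 x 3 rating matrix.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
P0, Q0 = factorize(ratings, 3, 3, epochs=0)   # untrained factors, for comparison
P, Q = factorize(ratings, 3, 3)
print(rmse(P0, Q0, ratings), rmse(P, Q, ratings))
```

Only the observed entries drive the updates; unobserved cells of the matrix are filled in afterwards by the same dot product.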
Factorization Machines
In many cases we want to incorporate user or
item metadata or context into the
recommendation
(U, I, R) → (U, I, F1, F2, ..., R)
Matrix Factorization
Factorization machines
Rendle (2010)
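A sketch of the second-order factorization machine score from Rendle (2010): a bias, a linear term, and pairwise interactions whose weights are dot products of per-feature latent vectors. The feature layout and all numeric values below are made-up assumptions; the naive double loop is included only to check the faster reformulation.

```python
def fm_predict(x, w0, w, V):
    """Second-order FM score using the O(k*n) reformulation of the pairwise term."""
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):
        vx = [V[i][f] * x[i] for i in range(len(x))]
        pairwise += 0.5 * (sum(vx) ** 2 - sum(t * t for t in vx))
    return linear + pairwise

def fm_predict_naive(x, w0, w, V):
    """Same score with the explicit double loop over feature pairs, for checking."""
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            score += dot * x[i] * x[j]
    return score

# One (U, I, F1) row: one-hot user (2 users), one-hot item (2 items), one real-valued feature.
x = [1.0, 0.0, 0.0, 1.0, 0.5]
w0 = 0.1
w = [0.2, -0.1, 0.05, 0.3, 0.4]
V = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1], [-0.2, 0.4], [0.5, 0.5]]
print(fm_predict(x, w0, w, V))
```

With only the user and item one-hot blocks present, the pairwise term reduces to the user-item dot product, which is why matrix factorization can be seen as a special case.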
Use Cases
Entity (Item) Embedding
Sequence Prediction
Hybrid
Explore-Exploit
Word Embedding
Goal: Learning vector space representations of
words capturing fine-grained semantic
regularities among words
Mikolov et al. (2013) – Word2Vec
Pennington et al. (2014) - GloVe
Word Analogy
Word Similarity
Word2Vec CBOW
Goal: Learning vector space representations of
words capturing fine-grained semantic
regularities among words
● Continuous Bag of Words
● Maximizes the probability of the target word
given its context words
● Input: one-hot encoded words
● Input to hidden weights: Embedding
matrix of words
● Hidden to output weights
● Softmax transformation
Word2Vec - CBOW
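The layers listed above can be sketched as a single forward pass. This is a simplified illustration with a full softmax and randomly initialized weights; the vocabulary size, dimensions, and variable names are assumptions, and no training step is shown.

```python
import math
import random

def cbow_forward(context_ids, W_in, W_out):
    """Return a probability distribution over the vocabulary for the target word."""
    d = len(W_in[0])
    # Averaging one-hot inputs times W_in is the same as averaging the context rows of W_in.
    hidden = [sum(W_in[c][f] for c in context_ids) / len(context_ids) for f in range(d)]
    scores = [sum(hidden[f] * W_out[f][v] for f in range(d)) for v in range(len(W_out[0]))]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
vocab_size, dim = 6, 3
W_in = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(vocab_size)]   # embedding matrix
W_out = [[rng.uniform(-0.5, 0.5) for _ in range(vocab_size)] for _ in range(dim)]  # hidden-to-output weights
probs = cbow_forward([1, 2, 4, 5], W_in, W_out)
print(probs)
```

After training, the rows of `W_in` are the word embeddings that get reused downstream.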
Item (Entity) Embedding
Replace words with items in a session/user
profile
-Embedding: a (learned) real-valued vector
representing an entity
-Similar entities have similar embeddings
Used in recommenders as:
-Features in more advanced algorithms [Grbovic
& Cheng 2018, Airbnb]
-Item-to-item recommendations
-Model clustering (e.g. user country or
sub-categories)
Prod2Vec
-Skip-gram model on products
Input: i-th product purchased by the user
Context: the other purchases of the user
-Learning user representation
Follows paragraph2vec
Input: user + products purchased except for the i-th
Target: i-th product purchased by the user
[Grbovic et al. 2015]
-Skip-gram with Negative Sampling (SGNS) is
applied to event data
[Barkan & Koenigstein, 2016]
User Embedding
Grbovic et al. 2015
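To make the skip-gram setup concrete, the sketch below turns one user's purchase sequence into (input product, context product) training pairs. The fixed window size and the product names are assumptions; the slides only say that the context is the user's other purchases.

```python
def skipgram_pairs(sequence, window=2):
    """Pair every product with the products bought within `window` positions of it."""
    pairs = []
    for i, center in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs

# Hypothetical purchase history of a single user, in chronological order.
purchases = ["logo", "business_card", "website"]
print(skipgram_pairs(purchases, window=1))
```

These pairs are what a skip-gram model (with or without negative sampling) would be trained on, exactly as word pairs are in Word2Vec.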
Item Similarity
Similarity & Analogy Tests
Greenstein-Messica et al. (2017)
Use Cases
Entity (Item) Embedding
Sequence Prediction
Hybrid
Explore-Exploit
Sequence Prediction
Session-based recommendation
-Anonymous user recommendation/Intent
-Sequence of events (sometimes short)
-Predict the next event (ranking)
GRU4REC
Network architecture:
Input: one-hot encoded item ID
Optional embedding layer
GRU layer(s)
Output: scores over all items
Target: the next item in the session
Hidasi et al. (2015)
GRU4REC
Output sampling
Computing scores for all items in every step is slow
One positive item + several negative samples
Loss functions
Cross-entropy + Softmax
Average of BPR scores
Top1 score
Filtering:
1-click sessions and unrealistically long sessions
Items with support lower than 5
Hidasi et al. (2015)
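The two pairwise ranking losses can be sketched for one positive item score and a handful of sampled negative scores, following the formulas in Hidasi et al. (2015). The function names and the example scores are assumptions for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def bpr_loss(pos_score, neg_scores):
    """BPR: average of -log sigmoid(positive score - negative score) over sampled negatives."""
    return -sum(math.log(sigmoid(pos_score - n)) for n in neg_scores) / len(neg_scores)

def top1_loss(pos_score, neg_scores):
    """TOP1: relative-rank term plus a penalty that pushes negative scores toward zero."""
    return sum(sigmoid(n - pos_score) + sigmoid(n * n) for n in neg_scores) / len(neg_scores)

# The loss is lower when the positive item outscores the sampled negatives.
print(bpr_loss(2.0, [0.0, -1.0]), bpr_loss(0.0, [2.0, 1.0]))
print(top1_loss(2.0, [0.0, -1.0]), top1_loss(0.0, [2.0, 1.0]))
```

Both losses only need the scores of the sampled items, which is what makes output sampling cheaper than scoring the full catalog at every step.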
GRU4REC
Key Observations
Similar accuracy with/without embedding
Multiple layers rarely help
Quick convergence (small changes after 5-10
epochs)
Overall 20-30% improvement vs. item-to-item
recommendations
Hidasi et al. (2015)
Session Based
Recommendation
Apply the advances in sequence modeling from
deep learning
- RNN architectures trained on the sequence
of user events in a session to predict next
item in session
- Report more than 10% accuracy gain over
baselines
- Adding context and attention
15,000 products, 999,000 filtered sessions, Embedded layer = 50
Feature Rich Session Based
Recommendations
-Items have rich feature representations such as
pictures and text descriptions
-Incorporating image and text features into GRU4REC
should improve prediction accuracy
Image encoding:
-GoogLeNet implementation, pre-trained on
ImageNet
-Features were extracted from the last average
pooling layer
Text encoding:
-Bag-of-words + TF-IDF
Hidasi et al. (2016)
….
Feature Rich Session Based
Recommendations
Key Observations
-The sequence of item features by itself is not
enough to model the session well
-Incorporating the items' features increases
MRR (by about 6%), but does not find more relevant
items (Recall)
YouTube-like dataset, not in English
Hidasi et al. (2016)
….
Use Cases
Entity (Item) Embedding
Sequence Prediction
Hybrid
Explore-Exploit
Wide & Deep
- Combines the strengths of linear models with deep learning models
- Used for Google Play app store recommendations
- Sparsity in the deep model is handled by using embeddings
- The feature set includes raw input features and transformed features
- Both models are trained jointly, at the same time
- Deep works better than wide (+2.9%), but Wide & Deep works best (+3.9%)
Cheng et al. 2016
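A simplified forward pass showing how the two parts combine into one prediction: the wide part is a linear model over raw and cross-product features, the deep part is a small network over embeddings, and their outputs are summed into one logit. The layer sizes, feature layout, and all weights below are made-up assumptions, and training is not shown.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def wide_deep_predict(wide_x, w_wide, category_ids, emb_tables, W_hidden, b_hidden, w_deep, bias):
    # Deep part: look up one dense embedding per sparse categorical feature and concatenate.
    deep_in = [v for table, idx in zip(emb_tables, category_ids) for v in table[idx]]
    hidden = [max(0.0, dot(row, deep_in) + b) for row, b in zip(W_hidden, b_hidden)]  # ReLU layer
    # Wide part: a linear model; both parts feed one logit, so they can be trained jointly.
    logit = dot(w_wide, wide_x) + dot(w_deep, hidden) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical setup: 2 users and 3 apps with 2-dimensional embeddings, 3 wide features.
emb_tables = [
    [[0.1, -0.2], [0.4, 0.3]],               # user embeddings
    [[0.0, 0.5], [-0.3, 0.2], [0.6, -0.1]],  # app embeddings
]
W_hidden = [[0.2, -0.1, 0.4, 0.3], [-0.5, 0.2, 0.1, 0.0], [0.3, 0.3, -0.2, 0.1]]
b_hidden = [0.0, 0.1, -0.1]
w_deep = [0.7, -0.4, 0.2]
wide_x = [1.0, 0.0, 1.0]   # e.g. indicator and cross-product features
w_wide = [0.3, -0.2, 0.5]
p = wide_deep_predict(wide_x, w_wide, [1, 2], emb_tables, W_hidden, b_hidden, w_deep, 0.0)
print(p)
```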
YouTube Recommender
Key challenges: scale, freshness, noise, latency
Two deep neural nets: one for candidate generation,
one for ranking
Candidate generation
-Extreme multi-class classification
-Embeddings of both user and video
-The embeddings are learned jointly with the rest of
the architecture
-To train the model, they used negative sampling
Ranking
-The evaluation metric is watch time
-Hundreds of features (including image thumbnails
and hand-crafted features)
[Covington et al. 2016]
Overview
Candidate Generation
DeepFM
Wide and Deep architecture aiming to leverage the
strengths of Factorization Machines for the
linear component
- The two parts are trained together and share
the same input embeddings
- Flexible handling of mixed real/categorical
variables
Guo et al. 2017
AB Test – Huawei App Store
Control: Logistic regression; Test: DeepFM
Multi-Modal Recommendations
Personalized tag recommendation for images
using deep transfer learning
- Visual image feature extraction via pre-trained
VGG-16
- Object detection via pre-trained YOLOv2
- Tagging history and image features are fed
into an adapted factorization model
Ge et al. 2018
Nguyen et al. 2017
Latest
xDeepFM (2018) – Combining Explicit and
Implicit Feature Interactions for Recommender
Systems
Deep Interest Evolution Network (2018)
NFFM (2019) - Operation-aware Neural
Networks for User Response Prediction
https://github.com/shenweichen/DeepCTR
Lian et al. 2018
Use Cases
Entity (Item) Embedding
Sequence Prediction
Hybrid
Explore-Exploit
Gating
Goal: Optimize Exploration-Exploitation for
Payoff/Reward Maximization
In RecSys: the reward can be hits and the machines (arms) are the items to
recommend.
Exploration Principles
The best long-term strategy may involve
short-term sacrifices
Gather information to make the best overall
decision
● Naive exploration: Add noise to the
greedy policy [ε-greedy]
● Optimism in the face of uncertainty:
prefer actions with uncertain values.
[Upper Confidence Bound (UCB)]
● Probability matching: select the
actions according to the probability they
are the best. [Thompson Sampling]
Probability density over mean reward
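The three principles above can be sketched as arm-selection rules for a click/no-click (Bernoulli) bandit, where each arm is an item with a count of impressions and clicks. The UCB1 bonus and the Beta(1, 1) prior for Thompson Sampling are common textbook choices assumed here, not details from the slides.

```python
import math
import random

def epsilon_greedy(clicks, shows, eps, rng):
    """With probability eps pick a random arm, otherwise the best empirical CTR."""
    if rng.random() < eps:
        return rng.randrange(len(shows))
    return max(range(len(shows)), key=lambda a: clicks[a] / shows[a] if shows[a] else 0.0)

def ucb1(clicks, shows):
    """Empirical CTR plus an uncertainty bonus that shrinks as an arm is shown more."""
    for a, n in enumerate(shows):
        if n == 0:
            return a  # try every arm once before comparing bounds
    total = sum(shows)
    return max(range(len(shows)),
               key=lambda a: clicks[a] / shows[a] + math.sqrt(2.0 * math.log(total) / shows[a]))

def thompson(clicks, shows, rng):
    """Sample a CTR from each arm's Beta posterior and play the arm with the largest sample."""
    samples = [rng.betavariate(1 + clicks[a], 1 + shows[a] - clicks[a]) for a in range(len(shows))]
    return max(range(len(shows)), key=lambda a: samples[a])

rng = random.Random(0)
clicks, shows = [1, 5, 2], [10, 10, 10]
print(epsilon_greedy(clicks, shows, 0.1, rng), ucb1(clicks, shows), thompson(clicks, shows, rng))
```

Thompson Sampling is the rule that needs a probability density over the mean reward of each arm, which is what the rest of this section tries to obtain from a deep model.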
Evaluation
Policy evaluation
Assumption: the logging policy that was used to
gather the logged data chose each arm at each
time step uniformly at random
Li et al. 2010. A contextual-bandit approach to
personalized news article recommendation.
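Under that uniform-logging assumption, offline evaluation can be sketched as a replay loop: walk through the log, keep only the events where the evaluated policy would have chosen the logged arm, and average their rewards. The event format and the `policy(context, history)` signature are hypothetical.

```python
def replay_evaluate(policy, logged_events):
    """Estimate a policy's average reward from uniformly-random logged data."""
    matched, total_reward = 0, 0.0
    history = []  # only matched events are revealed to the policy
    for context, logged_arm, reward in logged_events:
        if policy(context, history) == logged_arm:
            matched += 1
            total_reward += reward
            history.append((context, logged_arm, reward))
    average = total_reward / matched if matched else 0.0
    return average, matched

def always_first_arm(context, history):
    """Trivial policy used only for illustration."""
    return 0

# Hypothetical log of (context, arm shown, click) tuples.
log = [("c", 0, 1), ("c", 1, 0), ("c", 0, 0), ("c", 1, 1)]
print(replay_evaluate(always_first_arm, log))
```

Events where the policy disagrees with the log are discarded, so the estimate gets noisier as the number of arms grows.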
Estimate Probability Density
over CTR in DLRS?
Deep Bayesian Bandits Showdown (ICLR
2018) – An empirical comparison + a
Python library
Simple methods (not the best)
● Neural Greedy
● Dropout
● Bootstrap
Online decision algorithm
https://github.com/tensorflow/models/tree/master/research/deep_contextual_bandits
Russo et al. A tutorial on Thompson sampling. 2018
MCDropout as a Bayesian
Approximation: Representing
Model Uncertainty in Deep
Learning [Gal et al. 2016]
Use MC Dropout at inference time to estimate
model uncertainty
The authors argue that keeping Dropout active at inference
time amounts to approximate Bayesian
inference
Can be leveraged for TS and UCB to combine exploration
and exploitation in Recommender Systems
[Zeldes et al. 2017 (Taboola)]
[Zeldes et al. 2017 (Taboola)]
https://github.com/yaringal/DropoutUncertaintyExps
https://www.cs.ox.ac.uk/people/yarin.gal/website/blog_3d801aa532c1ce.html
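A toy sketch of the idea: keep dropout switched on at prediction time, run several stochastic forward passes, and read the spread of the outputs as model uncertainty. The one-hidden-layer network, its weights, and the inverted-dropout scaling are assumptions made for illustration, not the setup of the cited papers.

```python
import random
import statistics

def mc_dropout_predict(x, W1, w2, p_drop=0.5, passes=100, seed=0):
    """Mean and standard deviation of the output over stochastic forward passes."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(passes):
        hidden = []
        for row in W1:
            h = max(0.0, sum(w * xi for w, xi in zip(row, x)))  # ReLU unit
            keep = rng.random() >= p_drop                       # dropout stays active at inference
            hidden.append(h / (1.0 - p_drop) if keep else 0.0)  # inverted-dropout scaling (assumed)
        outputs.append(sum(w * h for w, h in zip(w2, hidden)))
    return statistics.mean(outputs), statistics.pstdev(outputs)

# Hypothetical tiny network: 2 inputs, 3 hidden units, 1 output score.
x = [1.0, 2.0]
W1 = [[0.5, -0.2], [0.3, 0.8], [-0.6, 0.4]]
w2 = [1.0, -0.5, 0.7]
mean, std = mc_dropout_predict(x, W1, w2)
print(mean, std)
```

For exploration, a single stochastic pass can play the role of a Thompson sample, and mean plus a multiple of the standard deviation can play the role of an upper confidence bound.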
Take Home Points
● There are advantages in using deep learning for recommender systems
● Proceed with caution
● Stay tuned
Questions?
Thank You!
asi.messica@gmail.com
