1
A COMBINATION OF SIMPLE MODELS BY FORWARD
PREDICTOR SELECTION FOR JOB RECOMMENDATION
Dávid Zibriczky, PhD (DaveXster)
Budapest University of Technology and Economics,
Budapest, Hungary
2
The Dataset – Data preparation
• Events (interactions, impressions)
› Target format: (time,user_id,item_id,type,value)
› Interactions → Format OK
› Impressions (flattened as sketched below):
• Generating unique (time,user_id,item_id) triples
• Value → count of their occurrences
• Time → 12:00 on the Thursday of the week
• Type → 5
• Catalog (items, users)
› Target format: (id,key1,key2,…,keyN)
› Items and users → Format OK
› Unknown "0" values → empty values
› Inconsistency: geo-location vs. country/region → metadata enhancement based on geo-location
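A minimal sketch of the impression-flattening step above, written in Python for readability (the original framework was Java-based); the input tuple layout (year, week, user_id, item_ids) is an assumption for illustration, not the challenge's actual schema:

```python
from collections import Counter
from datetime import datetime, timedelta

def flatten_impressions(impressions):
    """Turn weekly impression lists into (time, user_id, item_id, type, value) events."""
    counts = Counter()
    for year, week, user_id, item_ids in impressions:
        for item_id in item_ids:
            # Count occurrences of each unique (week, user, item) triple.
            counts[(year, week, user_id, item_id)] += 1
    events = []
    for (year, week, user_id, item_id), value in counts.items():
        # Anchor the event at 12:00 on the Thursday of that ISO week.
        ts = datetime.fromisocalendar(year, week, 4) + timedelta(hours=12)
        events.append((ts, user_id, item_id, 5, value))  # type 5 = impression
    return events
```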
3
The Dataset – Basic statistics
Size of training set
• 211M events, 2.8M users, 1.3M items
• Effect: huge and very sparse matrix
Distribution
• 95% of events are impressions
• 72% of the users have impressions only
• Item support for interactions is low (~9)
• Effect: weak collaboration using interactions
Target users
• 150K users
• 73% active, 16% inactive, 12% new
• Effect: user cold start and warm-up problem
Data source #events #users #items
Interactions 8,826,678 784,687 1,029,480
Impressions 201,872,093 2,755,167 846,814
All events 210,698,777 2,792,405 1,257,422
Catalog - 1,367,057 1,358,098
Catalog OR Events - 2,829,563 1,362,890
4
Methods – Concept
Terminology
• Method: A technique of estimating the relevance of an item for a user (p-Value)
• Predictor/model: An instance of a method with a specified parameter setting
• Combination: Linear combination of prediction values for user-item pairs (see the sketch below)
Approach
1. Exploring the properties of the data set
2. Definition of "simple" methods with different functionality (time-decay is commonly used)*
3. Finding a set of relevant predictors and an optimal combination of them
4. Top-N ranking of available, event-supported items with non-zero p-Values (~200K)
* Equations of the methods can be found in the paper
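A minimal sketch of the combination and ranking steps, assuming each predictor exposes a predict(user_id, item_id) method returning its p-Value (an illustrative interface, not the author's actual API):

```python
import heapq

def combine_and_rank(predictors, weights, user_id, candidate_items, n=30):
    """Score candidates with a weighted sum of per-predictor p-Values and
    return the top-N items with a non-zero combined score."""
    scored = []
    for item_id in candidate_items:
        p = sum(w * pred.predict(user_id, item_id)
                for pred, w in zip(predictors, weights))
        if p != 0:
            scored.append((p, item_id))
    return [item for _, item in heapq.nlargest(n, scored)]
```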
5
Methods – Item-kNN
• Observation: Very sparse user-item matrix (0.005%), 211M events
• Goal: Next best items to click, estimating recommendations of Xing
• Method: Standard Item-based kNN with special features (sketched below)
› Input-output event types
› Controlling popularity factor
› Similarity of the same item is 0
› Efficient implementation
• Notation: IKNN(I,O)
› I: input event type
› O: output event type
• Comment: No improvement from combining with other CF algorithms (MF, FM, User-kNN)
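A rough sketch of such an Item-kNN with separate input/output event types; the co-occurrence similarity and the popularity-damping exponent alpha are assumptions standing in for the paper's exact formulas, and time decay is omitted:

```python
from collections import defaultdict

def train_iknn(events, in_type, out_type, alpha=0.5, k=100):
    """Item-kNN sketch: top-k similar 'output' items per 'input' item."""
    in_items, out_items = defaultdict(set), defaultdict(set)
    for _, user, item, etype, _ in events:
        if etype == in_type:
            in_items[user].add(item)
        if etype == out_type:
            out_items[user].add(item)
    supp_in, supp_out = defaultdict(int), defaultdict(int)
    for items in in_items.values():
        for i in items:
            supp_in[i] += 1
    for items in out_items.values():
        for j in items:
            supp_out[j] += 1
    cooc = defaultdict(lambda: defaultdict(int))
    for user, items in in_items.items():
        for i in items:
            for j in out_items.get(user, ()):
                if i != j:  # similarity of an item to itself is fixed at 0
                    cooc[i][j] += 1
    model = {}
    for i, row in cooc.items():
        # alpha controls how strongly popular items are damped.
        sims = {j: c / ((supp_in[i] * supp_out[j]) ** alpha)
                for j, c in row.items()}
        model[i] = dict(sorted(sims.items(), key=lambda kv: -kv[1])[:k])
    return model

def iknn_predict(model, user_input_items, item):
    """Score = sum of similarities from the user's input items to `item`."""
    return sum(model.get(i, {}).get(item, 0.0) for i in user_input_items)
```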
6
Methods – Recalling recommendations
• Chart: The distribution of impression events by the number of weeks in which the same item has already been shown
• Observation: 38% of recommendations are recurring items
• Goal: Reverse engineering, recalling recommendations
• Method (sketched below):
› Recommendation of already shown items
› Weighted by expected CTR
• Notation: RCTR
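A hedged sketch of RCTR, assuming CTR is estimated per item with additive smoothing; the event-type codes and the smoothing constant are illustrative, and time decay is omitted:

```python
from collections import defaultdict

def train_rctr(events, impression_type=5, click_type=1):
    """RCTR sketch: re-recommend items already shown to a user,
    weighted by the item's expected click-through rate."""
    shown, clicked = defaultdict(float), defaultdict(float)
    shown_to = defaultdict(set)
    for _, user, item, etype, value in events:
        if etype == impression_type:
            shown[item] += value        # impression value = occurrence count
            shown_to[user].add(item)
        elif etype == click_type:
            clicked[item] += 1
    ctr = {i: clicked[i] / (shown[i] + 10.0) for i in shown}  # smoothed CTR
    def predict(user, item):
        return ctr.get(item, 0.0) if item in shown_to[user] else 0.0
    return predict
```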
7
Methods – Already seen items
• Chart: The probability of returning to an already seen item after interacting with other items
• Observation: Significant probability of re-clicking an already clicked item
• Goal: Capturing the re-clicking phenomenon
• Method: Recommendation of already clicked items (sketched below)
• Notation: AS(I)
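A minimal sketch of AS(I), returning a constant score for already-clicked items; the paper applies time decay, which is omitted here:

```python
from collections import defaultdict

def train_already_seen(events, interaction_type):
    """AS(I) sketch: recommend items the user has already interacted
    with via event type I."""
    seen = defaultdict(set)
    for _, user, item, etype, _ in events:
        if etype == interaction_type:
            seen[user].add(item)
    return lambda user, item: 1.0 if item in seen[user] else 0.0
```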
8
Methods – User metadata-based popularity
• Observation:
› Significant number of passive and new users
› All target users have metadata
• Goal:
› Semi-personalized recommendations for new users
› Improving accuracy on inactive users
• Method (sketched below):
1. Item model: Expected popularity of an item in each user group
2. Prediction: Average popularity of an item for a user
› Applied keys: jobroles, edu_fieldofstudies
• Notation: UPOP
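A sketch of UPOP, assuming user_groups maps each user to its metadata values (e.g. individual jobrole or edu_fieldofstudies ids); the normalization is illustrative and may differ from the paper's:

```python
from collections import defaultdict

def train_upop(events, user_groups):
    """UPOP sketch: item popularity within each user group; prediction is
    the item's average group popularity over the user's groups."""
    group_item = defaultdict(lambda: defaultdict(float))
    group_total = defaultdict(float)
    for _, user, item, _, value in events:
        for g in user_groups.get(user, ()):
            group_item[g][item] += value
            group_total[g] += value
    def predict(user, item):
        groups = user_groups.get(user, ())
        if not groups:
            return 0.0
        # Average the item's popularity share across the user's groups.
        return sum(group_item[g][item] / group_total[g]
                   for g in groups if group_total[g] > 0) / len(groups)
    return predict
```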
9
Methods – MS: Meta cosine similarity
• Observation:
› Item cold-start problem; many low-supported items
› Almost all items have metadata
• Goal:
› Model building for new items
› Improving the model of low-supported items
• Method (sketched below):
1. Item model: Meta-data representation, tf-idf
2. User model: Meta-words of items seen by the user
3. Prediction: Average cosine similarity between user-item models
› Keys: tags, title, industry_id, geo_country, geo_region,
discipline_id
• Notation: MS
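A sketch of MS using scikit-learn's TfidfVectorizer over pre-concatenated metadata strings; the per-field handling is simplified relative to the paper:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def train_ms(item_meta, user_seen_items):
    """MS sketch: tf-idf vectors over item metadata; a user's score for an
    item is the average cosine similarity between that item and the items
    the user has already seen."""
    item_ids = list(item_meta)
    vec = TfidfVectorizer()
    X = vec.fit_transform(item_meta[i] for i in item_ids)  # items x terms
    row = {i: r for r, i in enumerate(item_ids)}
    def predict(user, item):
        seen = [row[i] for i in user_seen_items.get(user, ()) if i in row]
        if not seen or item not in row:
            return 0.0
        sims = cosine_similarity(X[row[item]], X[seen])
        return float(np.mean(sims))
    return predict
```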
10
Methods – AP: Age-based popularity change
• Observation: Significant drop in the popularity of items at ~30 and ~60 days of age
• Goal: Under-scoring (down-weighting) these items
• Method: Expected ratio of an item's popularity in the next week to the current week (sketched below)
• Notation: AP
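A sketch of AP, assuming per-item weekly popularity counts are available; the aggregation by item age is an assumption about the method's details:

```python
from collections import defaultdict

def train_ap(weekly_pop, first_week):
    """AP sketch: for each item age (in weeks), the expected ratio of next
    week's popularity to the current week's, averaged over items; ages near
    the ~30- and ~60-day marks come out well below 1. `weekly_pop` maps
    (item, week) -> popularity; `first_week` maps item -> first week seen."""
    num, den = defaultdict(float), defaultdict(float)
    for (item, week), pop in weekly_pop.items():
        nxt = weekly_pop.get((item, week + 1), 0.0)
        if pop > 0:
            age = week - first_week[item]
            num[age] += nxt
            den[age] += pop
    ratio = {age: num[age] / den[age] for age in den}
    # Predict the popularity-change ratio for an item of a given age.
    return lambda age: ratio.get(age, 1.0)
```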
11
Methods – OM: The omit method
• Observation: Unwanted items in recommendation lists
• Goal: Omitting poorly modelled items of a predictor or combination
• Method (sketched below):
1. Sub-train-test split
2. Retraining a new combination
3. Generating top-N recommendations
4. Measuring how the total evaluation score would change if each item were omitted
5. Omitting the worst K items from the original combination
• Notation: OM
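A sketch of the omit method; fit, evaluate and recommend are assumed callables standing in for the author's framework, and the per-item re-evaluation is done naively rather than incrementally:

```python
def find_items_to_omit(train, test, fit, evaluate, recommend, k=100):
    """OM sketch: retrain on a sub-split, estimate how the evaluation score
    changes when each recommended item is dropped, return the K items whose
    removal helps the most."""
    model = fit(train)                       # 1-2. sub-split + retrain
    recs = recommend(model)                  # 3. user -> top-N item list
    base = evaluate(recs, test)
    gains = {}
    all_items = {i for items in recs.values() for i in items}
    for item in all_items:                   # 4. score change per omitted item
        pruned = {u: [i for i in items if i != item]
                  for u, items in recs.items()}
        gains[item] = evaluate(pruned, test) - base
    # 5. the K items whose omission improves the score the most
    return sorted(gains, key=gains.get, reverse=True)[:k]
```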
12
Methods – Optimization
1. Time-based train-test split (test set: last week)
2. Coordinate gradient descent optimization of various methods → candidate predictor set
3. Support-based distinct user groups (new users, inactive users, 10 equal-sized groups of active users)
4. Forward Predictor Selection (sketched below)
1. Initialization:
1. Selected predictor set: the predictors chosen from the candidate set for the final combination
2. The selected predictor set is empty at the beginning
2. Loop:
1. Calculate the accuracy of the selected predictor set
2. For each remaining candidate predictor, calculate the gain in accuracy it would yield if moved to the selected set
3. Move the best one to the selected set and recalculate the combination weights
4. Repeat while there is an improvement and a remaining candidate predictor
3. Return: the set of selected predictors and the corresponding weights
5. Retrain selected predictors on the full data set
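A minimal sketch of the Forward Predictor Selection loop above; evaluate(predictors) is assumed to fit the optimal combination weights on the validation split and return (accuracy, weights):

```python
def forward_predictor_selection(candidates, evaluate):
    """Greedily add the candidate with the largest accuracy gain until no
    remaining candidate improves the combination."""
    selected, best_score, best_weights = [], 0.0, []
    remaining = list(candidates)
    while remaining:
        gains = []
        for cand in remaining:
            score, weights = evaluate(selected + [cand])
            gains.append((score - best_score, cand, score, weights))
        gain, cand, score, weights = max(gains, key=lambda g: g[0])
        if gain <= 0:
            break                      # no remaining candidate improves
        remaining.remove(cand)
        selected.append(cand)          # move best candidate, keep new weights
        best_score, best_weights = score, weights
    return selected, best_weights
```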
13
… let’s put it together and see how it performs!
14
Evaluation – Forward Predictor Selection
• Best single model
› Item-kNN trained on positive interactions
› 2.5 min training time
› 7 ms prediction time
# Predictor tTR(s)* tPR(ms)* Score Rank
1 IKNN(C,C) 148 7 450,046 24
* tTR: training time (s); tPR: prediction time (ms). Java-based framework, 8-core 3.4 GHz CPU, 32 GB memory
15
Evaluation – Forward Predictor Selection
• Best single model
› Item-kNN trained on positive interactions
› 2.5 min training time
› 7 ms prediction time
• Sub-combinations
› 4 models: 600K+ score (w/o item metadata)
# Predictor tTR(s)* tPR(ms)* Score Rank
1 IKNN(C,C) 148 7 450,046 24
2 +RCTR 208 15 548,338 9
3 +AS(1) 237 17 590,526 6
4 +UPOP 247 50 614,674 5
16
Evaluation – Forward Predictor Selection
• Best single model
› Item-kNN trained on positive interactions
› 2.5 min training time
› 7 ms prediction time
• Sub-combinations
› 4 models: 600K+ score (w/o item metadata)
› 5 models: 3rd place
# Predictor tTR(s)* tPR(ms)* Score Rank
1 IKNN(C,C) 148 7 450,046 24
2 +RCTR 208 15 548,338 9
3 +AS(1) 237 17 590,526 6
4 +UPOP 247 50 614,674 5
5 +MS 364 122 623,909 3
17
Evaluation – Forward Predictor Selection
• Best single model
› Item-kNN trained on positive interactions
› 2.5 min training time
› 7 ms prediction time
• Sub-combinations
› 4 models: 600K+ score (w/o item metadata)
› 5 models: 3rd place
› 6 models: 95% of final score
# Predictor tTR(s)* tPR(ms)* Score Rank
1 IKNN(C,C) 148 7 450,046 24
2 +RCTR 208 15 548,338 9
3 +AS(1) 237 17 590,526 6
4 +UPOP 247 50 614,674 5
5 +MS 364 122 623,909 3
6 +IKNN(R,R) 1,150 168 635,278 3
18
Evaluation – Forward Predictor Selection
• Best single model
› Item-kNN trained on positive interactions
› 2.5 min training time
› 7 ms prediction time
• Sub-combinations
› 4 models: 600K+ score (w/o item metadata)
› 5 models: 3rd place
› 6 models: 95% of final score
› 10 models: 650K+ score (<30 min training time)
# Predictor tTR(s)* tPR(ms)* Score Rank
1 IKNN(C,C) 148 7 450,046 24
2 +RCTR 208 15 548,338 9
3 +AS(1) 237 17 590,526 6
4 +UPOP 247 50 614,674 5
5 +MS 364 122 623,909 3
6 +IKNN(R,R) 1,150 168 635,278 3
7 +AS(3) 1,205 178 636,498 3
8 +IKNN(R,C) 1,557 197 643,145 3
9 +AS(4) 1,582 202 644,710 3
10 +AP 1,621 207 652,802 3
19
Evaluation – Forward Predictor Selection
• Best single model
› Item-kNN trained on positive interactions
› 2.5 min training time
› 7 ms prediction time
• Sub-combinations
› 4 models: 600K+ score (w/o item metadata)
› 5 models: 3rd place
› 6 models: 95% of final score
› 10 models: 650K+ score (<30 min training time)
• Final combination
› 3rd place
› ~666K leaderboard score
› 11 instances
› user-support-based weighting
› 3h+ training time, 200 ms prediction time
# Predictor tTR(s)* tPR(ms)* Score Rank
1 IKNN(C,C) 148 7 450,046 24
2 +RCTR 208 15 548,338 9
3 +AS(1) 237 17 590,526 6
4 +UPOP 247 50 614,674 5
5 +MS 364 122 623,909 3
6 +IKNN(R,R) 1,150 168 635,278 3
7 +AS(3) 1,205 178 636,498 3
8 +IKNN(R,C) 1,557 197 643,145 3
9 +AS(4) 1,582 202 644,710 3
10 +AP 1,621 207 652,802 3
SUPP_C(1-10) 1,639 194 661,359 3
11 +OM 11,790 199 665,592 3
* tTR: training time (s); tPR: prediction time (ms). Java-based framework, 8-core 3.4 GHz CPU, 32 GB memory
20
Evaluation – Timeline
[Chart: Leaderboard score (thousands, left axis) and leaderboard rank (right axis) by submission date, Apr-25 to Jun-27, across three phases: initial setup, model design and implementation, final sprint. The score climbs from 115.4K to the final 665.6K while the rank improves from 39th to 3rd.]
21
Lessons learnt
• Exploiting the specificity of the dataset
• Using Item-kNN over factorization in a very sparse dataset
• Paying attention to recurrence
• Forward Predictor Selection is effective
• Different optimization for different user groups
• Under-scoring or omitting weak items
• Ranking 200K items is slow
• Keep it simple and transparent!
22
Presenter
Contact
Thank you for your attention!
Dávid Zibriczky, PhD
david.zibriczky@gmail.com
More Related Content
• Temporal Learning and Sequence Modeling for a Job Recommender System
• A Scalable, High-performance Algorithm for Hybrid Job Recommendations
• RecSys Challenge 2016: Job Recommendation Based on...
• RecSys Challenge 2016
• Matrix Factorization Technique for Recommender Systems
• Item Based Collaborative Filtering Recommendation Algorithms
• LinkedIn talk at Netflix ML Platform meetup Sep 2019
• Collaborative Filtering 2: Item-based CF
What's hot (20)
• Recsys2021_slides_sato
• Collaborative filtering at scale
• Artwork Personalization at Netflix
• Collaborative Filtering using KNN
• GTC 2021: Counterfactual Learning to Rank in E-commerce
• Facebook Talk at Netflix ML Platform meetup Sep 2019
• Collaborative filtering
• Summary of a Recommender Systems Survey paper
• Replicable Evaluation of Recommender Systems
• Survey of Recommendation Systems
• Movie Recommendation engine
• Recommender Systems: Advances in Collaborative Filtering
• Recommender Systems
• ACM SIGIR 2020 Tutorial - Reciprocal Recommendation: matching users with the ...
• Collaborative Filtering 1: User-based CF
• Recent advances in deep recommender systems
• Collaborative filtering
• Recommender Systems! @ASAI 2011
• Large-scale Parallel Collaborative Filtering and Clustering using MapReduce f...
• HT2014 Tutorial: Evaluating Recommender Systems - Ensuring Replicability of E...
Similar to A Combination of Simple Models by Forward Predictor Selection for Job Recommendation (20)
• Recommender Systems and Active Learning
• Recommender Systems and Active Learning (for Startups)
• Download
• Download
• PPT by Jannach_organized.pdf presentation on the recommendation
• Introduction to recommendation system
• PredictionIO - Building Applications That Predict User Behavior Through Big D...
• Item basedcollaborativefilteringrecommendationalgorithms
• Mobile App Recommendations Using Deep Learning and Big Data
• IRJET- Online Course Recommendation System
• CSE545_Porject
• Recommender systems
• Kddcup2011
• RS NAIVE BAYES ASSOCIATION RULE MINING AND BLACK BOX
• Quest for prediction accuracy by Mekbib Awono
• Collaborative Filtering Recommendation System
• Introduction to Recommender Systems
• Recommender Systems
• Lessons learnt at building recommendation services at industry scale
More from David Zibriczky (10)
• Highlights from the 8th ACM Conference on Recommender Systems (RecSys 2014)
• Predictive Solutions and Analytics for TV & Entertainment Businesses
• Improving the TV User Experience by Algorithms: Personalized Content Recommen...
• Recommender Systems meet Finance - A literature review
• Fast ALS-Based Matrix Factorization for Recommender Systems
• EPG content recommendation in large scale: a case study on interactive TV pla...
• Personalized recommendation of linear content on interactive TV platforms
• An introduction to Recommender Systems
• Data Modeling in IPTV and OTT Recommender Systems
• Entropy based asset pricing