Data Science at OLX
Alexey Grigorev
24.03.2020
SPICED Academy
Plan
● What is OLX
● Data Science at OLX
● Way of working
● Expectations from data scientists
● Career progression
What is OLX
olxgroup.com
olx.in
olx.pl
olx.ua
OLX in Berlin
● 4 floors
● 263 people
● 51 nationalities
● 15+ teams (in tech)
● 13 data scientists + 1 student
○ 25 across all EU offices
Our office
Data Science at OLX
Main areas:
● Search
○ Smart ranking
○ Reducing null-searches
○ Query categorization
○ Spell checking
● Recommendations
○ Collaborative filtering
○ Item2Vec
● Trust & Safety
○ NSFW detection
○ Forbidden items
○ Fraud detection
○ Duplicate detection
○ Chat moderation
● Seller Experience
○ Image quality
○ Listing quality
○ Deal prediction
● Monetization
○ User segmentation
○ Bid optimization
Content moderation
Problems:
● Illegal items
● NSFW content
● Duplicates
● Spam
● Fraud
[Diagram: content moderation flow. Components shown: the automatic moderation system (ML), with Accept and Reject outcomes and a moderation queue; the moderation panel (MP), where moderators accept or reject listings. The automatic moderation system comprises duplicate detection, forbidden items, and other ML models.]
[Diagram: duplicate detection system, connected to s3, ES, and hashes. Steps highlighted in turn:]
● Index listings & images
● Detect duplicates
● Moderate duplicates
● Collect feedback
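The moderation flow above has three outcomes for the automatic system: accept, reject, or send to the moderation queue. A minimal sketch of such routing, assuming the model returns a violation score between 0 and 1. The thresholds, the `Decision` type, and the `route` function are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the slides do not state any values.
REJECT_THRESHOLD = 0.9
ACCEPT_THRESHOLD = 0.1


@dataclass
class Decision:
    listing_id: str
    action: str  # "accept", "reject", or "queue"


def route(listing_id: str, violation_score: float) -> Decision:
    """Route a listing based on a model score in [0, 1].

    A high score means the model is confident the listing breaks the rules.
    Confident cases are decided automatically; the rest go to moderators.
    """
    if not 0.0 <= violation_score <= 1.0:
        raise ValueError("violation_score must be in [0, 1]")
    if violation_score >= REJECT_THRESHOLD:
        return Decision(listing_id, "reject")
    if violation_score <= ACCEPT_THRESHOLD:
        return Decision(listing_id, "accept")
    return Decision(listing_id, "queue")


if __name__ == "__main__":
    for listing_id, score in [("a1", 0.02), ("b2", 0.55), ("c3", 0.97)]:
        print(route(listing_id, score))
```

In a setup like this, the two thresholds trade moderator workload against the risk of wrong automatic decisions.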
https://www.slideshare.net/AlexeyGrigorev/fighting-fraud-finding-duplicates-at-scale-highload-2019-191304763
https://tech.olx.com/a-two-step-framework-for-duplicate-detection-fbbe4c905480
https://tech.olx.com/detecting-image-duplicates-at-olx-scale-7f59e4b6aef4
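The duplicate detection diagram mentions hashes without saying which kind. As an illustration only, here is a sketch of one common approach, an average hash compared by Hamming distance. It assumes images are already grayscale and downscaled to the same small size; the function names and the `max_distance` default are hypothetical, and this is not necessarily the method used at OLX.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels (0-255), row-major, same size for all images


def average_hash(image: Image) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the image mean."""
    pixels = [p for row in image for p in row]
    if not pixels:
        raise ValueError("image must contain at least one pixel")
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_duplicate(a: Image, b: Image, max_distance: int = 1) -> bool:
    """Treat two images as duplicates if their hashes differ in few bits."""
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


if __name__ == "__main__":
    original = [[10, 200], [220, 30]]
    brighter = [[20, 210], [230, 40]]   # same picture, slightly brighter
    different = [[200, 10], [30, 220]]  # inverted pattern
    print(is_duplicate(original, brighter))
    print(is_duplicate(original, different))
```

Because the hash depends only on which pixels are above the mean, a uniform brightness change leaves it unchanged, which is why such hashes are useful for near-duplicate images.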
Way of working
A project like this is very complex.
We need a team (or multiple teams) to make it work: it’s a joint effort of many people working together.
Roles in teams
● Product Manager (PM)
● Engineering Manager (EM)
● Software Engineers
○ Backend Engineers (BE)
○ Data Engineers (DE)
○ ML Engineers (MLE)
○ Site Reliability Engineers (SRE)
○ Frontend Engineers (FE)
○ Mobile Engineers
● Product Analysts (PA)
● Data Scientists (DS)
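Matrix structure
[Diagram: Team A, Team B, and Team C as columns, with functions as rows, each with its own head.]
● Product: a PM in each team; Head of Product
● Analytics: PAs; Head of Analytics
● Data science: a DS in each team; Manager Data Tech
● Engineering: an EM in each team, plus BE, DE, FE, and SRE; Head of Engineering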
Feature teams
● A cross-functional team with experts in different areas
● All work together on one feature/product
● All have the same goal!
● Anyone can work on anything, as long as it helps achieve the goal
PA, DS, DE, BE, SRE, EM, PM
Goal setting
● OKRs, set quarterly
● Great alignment tool: other teams know what you’re doing
● Whatever the team is doing should be in line with its OKRs
Example:
● Objective (O)
○ Catch more fraudsters
● Key results (KRs)
○ Precision of model A improves from 30% to 60% while staying at the same recall level
○ Model B is tested in 5 key markets
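To make the first key result concrete, here is a small sketch of what "precision from 30% to 60% at the same recall" means in terms of confusion-matrix counts. The counts are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def precision(tp: int, fp: int) -> float:
    """Share of flagged users who really are fraudsters."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Share of all fraudsters that the model flags."""
    return tp / (tp + fn)


# Hypothetical counts: 100 real fraudsters, 60 of them caught in both cases.
before = {"tp": 60, "fp": 140, "fn": 40}
after = {"tp": 60, "fp": 40, "fn": 40}

if __name__ == "__main__":
    for name, c in [("before", before), ("after", after)]:
        p = precision(c["tp"], c["fp"])
        r = recall(c["tp"], c["fn"])
        print(f"{name}: precision={p:.0%}, recall={r:.0%}")
```

With recall fixed, the number of caught fraudsters stays the same, so the precision gain comes entirely from fewer false alarms (here 140 down to 40).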
Expectations from data scientists
Stages
Product Definition → Data Collection → Data Processing → Modeling → Evaluation → Production → Customers
[Diagram: the stages above, with roles mapped to them and the following notes shown in turn:]
● Focus on modelling and evaluation, a bit on production
● SPICED got it covered
● + Airflow, SQL, Spark
● Learn from your PM!
Career progression
[Diagram: career ladder]
Junior Data Scientist → Data Scientist → Senior Data Scientist
● Expert track: Lead Data Scientist, Principal Data Scientist, Chief Data Scientist
● Managerial track: Data Science Manager, Head of Data Science
● Senior Leadership (VP / Director / C level)
Career progression
● Junior
○ performs an assigned task mostly independently
○ makes product impact — possibly with supervision
● Middle
○ performs a task completely independently
○ takes active part in planning and task prioritization
● Senior
○ works independently on complex tasks and drives projects
○ communicates with stakeholders
○ helps less senior people by onboarding and mentoring them
○ owns end-to-end process for implementing ML solutions
Performance evaluation
360-degree feedback, every 6 months
● Free-form feedback
○ things to continue, start, stop doing
● Structured feedback
○ Following best engineering practices
○ Designing and delivering ML solutions
○ Finding areas where ML can bring value to our customers
○ Dealing with ambiguity
○ Knowledge sharing
○ Business acumen
○ Communicating with stakeholders
○ And 8 other dimensions
Credit
Many things in this presentation are a collective effort of many people:
● Responsibilities
● Career path
● Performance evaluation
Special thanks to:
● Mariana Matsiuk (and others from OLX HR department)
● Andreas Merentitis, Rahul Gupta, Elias Nema
● Liesbeth Dingemans, Paul van der Boor (and others from Prosus AI)
● And many others across OLX Group
Links
OLX:
● OLX tech blog: tech.olx.com
● Twitter: @olxtecheurope
● Jobs at OLX: joinolx.com
Me:
● contact@alexeygrigorev.com
● LinkedIn: agrigorev
● Twitter: @Al_Grigor
