Quasar and ReMix
An introduction to LinkedIn's Ranking and Federation libraries
Andris Birkmanis & Lance Wall
Relevance: Verticals & Infrastructure
[Figure: Relevance organization diagram. Labels: Relevance Verticals; Relevance Infra; isolated ML models; integrated ML models; deployed ML services; ML algos; scoring and ranking tools (Quasar); relevance service platform (ReMix).]
Quasar
Quick Scoring and Ranking
Our mission is making efficient feature transformation, scoring, and ranking simple.
Scoring
• Scoring
• Scorables
• Features
• Feature Transformation
Ranking
• Sorting
• TopK
• Filtering
• Group by
• Distinct
• Union
• Custom
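The ranking operators listed above can be illustrated with a minimal Python sketch. It is not Quasar's API: the function names, the dictionary-based document shape, and the choice to deduplicate by `id` in `union` are all assumptions made for illustration.

```python
import heapq

# Hypothetical scored documents; field names are illustrative only.
docs = [
    {"id": "a", "group": "news", "score": 0.9},
    {"id": "b", "group": "jobs", "score": 0.4},
    {"id": "c", "group": "news", "score": 0.7},
    {"id": "d", "group": "jobs", "score": 0.8},
]

def order_by(items, key, descending=True):
    """Sorting: full ordering by one field."""
    return sorted(items, key=lambda d: d[key], reverse=descending)

def top_k(items, k, key):
    """TopK: the k best items without sorting the whole list."""
    return heapq.nlargest(k, items, key=lambda d: d[key])

def filter_by(items, predicate):
    """Filtering: keep items for which the predicate holds."""
    return [d for d in items if predicate(d)]

def group_by(items, key):
    """Group by: map each key value to its items, preserving input order."""
    groups = {}
    for d in items:
        groups.setdefault(d[key], []).append(d)
    return groups

def distinct(items, key):
    """Distinct: keep the first item seen for each key value."""
    seen, out = set(), []
    for d in items:
        if d[key] not in seen:
            seen.add(d[key])
            out.append(d)
    return out

def union(*lists):
    """Union: concatenate lists, then deduplicate by id (an assumption)."""
    return distinct([d for items in lists for d in items], "id")

top_two = [d["id"] for d in top_k(docs, 2, "score")]
above_half = [d["id"] for d in filter_by(docs, lambda d: d["score"] > 0.5)]
merged = [d["id"] for d in union(docs[:2], docs[1:])]
```

A "Custom" operator would, under the same assumptions, be any further function from a list of documents to a list of documents.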
Relevance Models: DAGs of Computations
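[Figure: a relevance model drawn as a DAG. Operators shown: FILTER BY interest-match > 0.5; FILTER BY skill-match > 0.7; TOP 50 BY content-match, applied to All Documents. Inputs shown: member (interest, skill) and news feed (category, content), producing interest-match-score, skill-match-score, and content-match-score. Document counts shown along the pipeline: 10,000; 500; 500; 50.]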
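One way to read the DAG on this slide is sketched below in Python. The topology is an assumption: the two filters are treated as parallel branches over all documents whose results are merged before the top-50 selection, and the slide's labels do not confirm this. The scores are random placeholders, so the intermediate counts will not match the 500/500 shown on the slide.

```python
import heapq
import random

def relevance_dag(documents, k=50):
    """Filter on two match scores, merge the branches, keep the top k by content match."""
    by_interest = [d for d in documents if d["interest_match"] > 0.5]
    by_skill = [d for d in documents if d["skill_match"] > 0.7]
    # Merge the two branches, deduplicating by id (assumed behaviour).
    merged = {d["id"]: d for d in by_interest + by_skill}
    return heapq.nlargest(k, merged.values(), key=lambda d: d["content_match"])

# 10,000 candidate documents with placeholder scores (not real data).
rng = random.Random(0)
documents = [
    {
        "id": i,
        "interest_match": rng.random(),
        "skill_match": rng.random(),
        "content_match": rng.random(),
    }
    for i in range(10_000)
]

top = relevance_dag(documents)
```

The point of the shape is that cheap filters shrink the candidate set before the final ranking step runs.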
ML and Training
• Tracking training dependencies between ML models
• Integrating with training engines via Training API
• Automatic type conversion for features and model parameters
• Reuse of feature transformations between training and prediction
Quasar Components
• Domain Specific Language (DSL)
▪ Oriented towards scoring and ranking concepts
▪ Supports various machine learning models
▪ Supports various ranking operators
▪ Supports pluggable feature transformers
▪ Supports arithmetical and logical expressions
• Library
▪ Includes out-of-box feature transformers tuned for performance (dense/sparse vectors, bags of words, etc.)
▪ Extensible with custom transformers and ranking operators
• Execution engine
▪ Supports multiple evaluation strategies for different objectives (lazy/eager/batching/etc.)
▪ Debuggability, logging, and other cross-cutting concerns
▪ API for scoring, ranking, read/write access to features, training
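The difference between lazy and eager evaluation strategies can be sketched as follows. This is a toy feature graph, not Quasar's execution engine: the `FeatureGraph` class, its methods, and the example features are all hypothetical.

```python
import math

class FeatureGraph:
    """Toy feature DAG: each feature is a function of other features or raw inputs."""

    def __init__(self):
        self._defs = {}      # feature name -> (function, dependency names)
        self.evaluated = []  # trace of features computed, in order

    def define(self, name, fn, deps=()):
        self._defs[name] = (fn, tuple(deps))

    def _resolve(self, name, cache):
        # Compute a feature once, resolving its dependencies first.
        if name not in cache:
            fn, deps = self._defs[name]
            args = [self._resolve(dep, cache) for dep in deps]
            cache[name] = fn(*args)
            self.evaluated.append(name)
        return cache[name]

    def evaluate_lazy(self, target, inputs):
        """Lazy: compute only what the requested feature depends on."""
        return self._resolve(target, dict(inputs))

    def evaluate_eager(self, inputs):
        """Eager: compute every defined feature up front."""
        cache = dict(inputs)
        for name in self._defs:
            self._resolve(name, cache)
        return cache

graph = FeatureGraph()
graph.define("log_clicks", math.log1p, deps=["clicks"])
graph.define("title_length", len, deps=["title"])
graph.define("score", lambda x: 2.0 * x, deps=["log_clicks"])

inputs = {"clicks": 0, "title": "Quasar"}
lazy_score = graph.evaluate_lazy("score", inputs)
lazy_trace = list(graph.evaluated)   # title_length is never computed
graph.evaluated.clear()
eager_values = graph.evaluate_eager(inputs)
eager_trace = list(graph.evaluated)  # all three features are computed
```

Lazy evaluation skips features a request never needs; eager evaluation pays for everything once, which can suit cases where all features are wanted anyway.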
Project Status
• LinkedIn Relevance Products
▪ Feed, Recommendation, Search
• Adoption
▪ 1000+ Quasar models
Future directions
• Better training support for external models (XGBoost, TensorFlow)
• Making feature transformers and operators more reusable
• Better type information
• Standardized storage formats for features and model parameters
• See the upcoming LinkedIn engineering blog for technical details
Lance Wall
ReMix
Example relevance workflows at LinkedIn
[Figure: two example workflows. Member ID → Fetch Member Profile → Compute Job Recommendations → Format Results; Member ID → Fetch Member Profile → Compute People Recommendations → Format Results.]
Motivation
• Multitenant relevance workflow services with tens of engineers on multiple teams contributing
• Each relevance workflow service has different APIs and conventions
• Lack of abstraction of system-level concerns from application logic
• Diminished productivity, operability, and leverage
ReMix’s Mission
Provide an easy-to-use platform for building relevance services, with a focus on optimizing leverage and automating common operability concerns.
Design Goals
• Consolidation of various relevance service stacks
• Ease of support
• Ease of development
• Ease of operation
Features of ReMix
• Leverages ParSeq for easy asynchronous I/O
• Exposes declarative API for composing workflows
• Provides automated monitoring instrumentation and tooling
• Provides robust, extensible solutions for common workflow functionality
• Provides isolation and robustness to downstream instability
How does ReMix work?
Operator → is assembled into → Workflow → is submitted to → WorkflowEngine
Operator
• Modular functional component of a Workflow
• ReMix provides Operators for common functionality
• ReMix provides decorative interfaces for common optimizations
• ReMix provides generic support for asynchronous execution
Workflow
• Declaration of deferred execution
• Easy to understand declarative language
• Leverages ParSeq and exposes a simpler API
• Abstraction of execution behavior & optimizations
• Independent of environment or service (i.e. portable)
WorkflowEngine
• Executor of Workflows
• Translates Workflows to ParSeq Tasks
• Provides special considerations for async/RPC operations
• Provides common operability functionality
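The Operator → Workflow → WorkflowEngine relationship can be sketched in Python. This is not ReMix's API and does not use ParSeq (a Java library); `asyncio` stands in for asynchronous execution, and every class, method, and operator name below is hypothetical.

```python
import asyncio
import inspect

class Workflow:
    """A declaration of steps; nothing runs until an engine executes it."""

    def __init__(self, name):
        self.name = name
        self.steps = []  # ordered (operator name, callable) pairs

    def then(self, name, operator):
        self.steps.append((name, operator))
        return self  # allow chaining

class WorkflowEngine:
    """Executes workflows and records which operators ran."""

    def __init__(self):
        self.executed = []

    async def run(self, workflow, value):
        for name, operator in workflow.steps:
            result = operator(value)
            # Support both synchronous and asynchronous operators.
            if inspect.isawaitable(result):
                result = await result
            self.executed.append(f"{workflow.name}.{name}")
            value = result
        return value

# Hypothetical operators mirroring the example workflow in the slides.
async def fetch_member_profile(member_id):
    await asyncio.sleep(0)  # stands in for a remote call
    return {"member_id": member_id, "skills": ["java", "scala"]}

def compute_job_recommendations(profile):
    return [f"{skill}-engineer" for skill in profile["skills"]]

def format_results(recommendations):
    return {"count": len(recommendations), "items": recommendations}

job_recs = (
    Workflow("job_recs")
    .then("fetch_member_profile", fetch_member_profile)
    .then("compute_job_recommendations", compute_job_recommendations)
    .then("format_results", format_results)
)

engine = WorkflowEngine()
result = asyncio.run(engine.run(job_recs, 42))
```

Because the workflow is only a declaration, the engine is the single place where cross-cutting behaviour such as monitoring could be added, which is the separation the slides describe.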
Project Status & Planned Work
• ReMix adopters include job recommendations and blended search
• Working on integration with Quasar
▪ Complete solution for model serving from offline to online
• ReMix Cloud
▪ Simple toolkit/UI for creating a Workflow and deploying it to production
▪ Hosts Workflows in a managed service, with little to no operational cost to Workflow developers
▪ Increased leverage due to reuse of common components in multitenant platform
Thanks!
Backup Slides
Quasar Model
MODELID "feed_quasar";
DOCPARAM com.linkedin.feed.FeedItem feedItem;
REQUEST PARAM Profile member;
REQUEST FEATURE VECTOR interests = GetInterests(member);
DOCUMENT FEATURE VECTOR categories = GetCategories(feedItem);
DOCUMENT FEATURE LONG publishedTime = GetPublishedTime(feedItem);
MODEL PARAM timeBuckets = { "1hr" : 60, "3hr" : 180 };
DOCUMENT FEATURE VECTOR normalizedTime = Bucketize(diffTime, timeBuckets);
DOCUMENT FEATURE VECTOR interestMatch = Similarity(interests, categories);
MODEL PARAM MAP<STRING, OBJECT> modelWeights = {
"normalizedTime": { "1hr": 0.234, "3hr": 0.456, "Other": 0.21 }, "interestMatch": 0.823 };
DOCUMENT FEATURE FLOAT score = LinearScore(modelWeights, "sigmoid");
DOCUMENT FEATURE BOOLEAN aboveThreshold = score > 0.5;
filteredFeed = FILTER DOCUMENTS BY aboveThreshold;
rerankedFeed = ORDER filteredFeed BY score WITH DESC;
RETURN rerankedFeed;
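A rough Python reading of what this model computes is sketched below. The semantics are assumptions, not Quasar's definitions: `bucketize` picks the first bucket whose upper bound (in minutes) covers the value and falls back to "Other", `similarity` is a sparse dot product, and `linear_score` is a weighted sum passed through a sigmoid. The slide never defines `diffTime`, so it is taken here as an input (`diff_minutes`).

```python
import math

TIME_BUCKETS = {"1hr": 60, "3hr": 180}
MODEL_WEIGHTS = {
    "normalizedTime": {"1hr": 0.234, "3hr": 0.456, "Other": 0.21},
    "interestMatch": 0.823,
}

def bucketize(diff_minutes, buckets):
    """Return the label of the smallest bucket covering the value, else 'Other'."""
    for label, upper in sorted(buckets.items(), key=lambda kv: kv[1]):
        if diff_minutes <= upper:
            return label
    return "Other"

def similarity(a, b):
    """Sparse dot product over shared keys (assumed meaning of Similarity)."""
    return sum(weight * b.get(key, 0.0) for key, weight in a.items())

def linear_score(time_bucket, interest_match, weights):
    """Weighted sum of the two features, squashed by a sigmoid."""
    z = weights["normalizedTime"][time_bucket] + weights["interestMatch"] * interest_match
    return 1.0 / (1.0 + math.exp(-z))

def rank_feed(interests, feed_items):
    """Score each item, keep scores above 0.5, and order by score descending."""
    scored = []
    for item in feed_items:
        bucket = bucketize(item["diff_minutes"], TIME_BUCKETS)
        match = similarity(interests, item["categories"])
        scored.append((linear_score(bucket, match, MODEL_WEIGHTS), item["id"]))
    kept = [pair for pair in scored if pair[0] > 0.5]
    return sorted(kept, reverse=True)

# Hypothetical request and documents.
interests = {"ml": 1.0}
feed_items = [
    {"id": "B", "diff_minutes": 500, "categories": {"sports": 1.0}},
    {"id": "A", "diff_minutes": 30, "categories": {"ml": 1.0}},
]
ranked = rank_feed(interests, feed_items)
```

Under these assumed semantics every weight is positive, so only an item with negative similarity could fall below the 0.5 threshold.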
[Figure: The multipass ensemble model at runtime. Elements shown: candidate list of documents; request-level getInterests; per-document getCategories, getPublishedTime, getSimilarity, Bucketize, and LinearScore; Filter Documents; Order Documents; Pass 1; Pass 2; getViewTimes, Bucketize, Decision Tree, and LinearScore.]
Vector Math and Expression Support
• Vector as first class citizen in DSL
• State-of-the-art Java Vector implementation
▪ Compact and efficient data structure
▪ Efficient Vector math computation
[Figure: Member/Job Similarity Score = dot product of the sparse vectors member.skill and job.required_skill. Skill labels shown: C++, Java, Network, Linux, Hadoop, Scala, Gradle; weights shown: 1.0, 1.0, 3.0, 1.0 and 2.0, 1.0, 2.0.]
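The member/job similarity score as a sparse dot product can be sketched in a few lines of Python. The vectors below are hypothetical: which skills and weights belong to which vector is not stated here, and dictionaries stand in for the compact Java vector type the slide mentions.

```python
def dot(a, b):
    """Dot product of two sparse vectors stored as {key: weight} dicts."""
    # Iterate over the smaller vector so cost scales with the shorter input.
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[key] for key, weight in a.items() if key in b)

# Hypothetical skill vectors; keys and weights are illustrative only.
member_skill = {"Java": 1.0, "Scala": 3.0, "Linux": 1.0}
job_required_skill = {"Java": 2.0, "Scala": 1.0, "Gradle": 2.0}

similarity_score = dot(member_skill, job_required_skill)
```

Only the shared skills (here Java and Scala) contribute, so the score is 1.0 × 2.0 + 3.0 × 1.0 = 5.0.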

ML Platform Q1 Meetup: An introduction to LinkedIn's Ranking and Federation Libraries
