Copyright © 2015 Criteo
Machine Learning for Performance Advertising
Grenoble Data Science
Eustache Diemert, Staff Research Scientist @ Criteo Research
Grenoble Data Science Meetup – Oct. 2017
Part I : Introduction to Performance Advertising
Performance Advertising?
• Advertisers want sales
  • short-term
  • measurable impact
• Not interested in
  • brand awareness
  • marketing pressure
  • segments (e.g. socio-demo)
Programmatic Advertising Scenario
[Figure sequence, labels only: "56€ Promo!"; "User 123456789"; "For Sale"; bids of 0.10€ and 0.15€; "Advertiser 1", "Advertiser 2"; "~30 ms"; "Winning Advertiser"; "56€ Promo!"]
Performance Advertising Setup
1. User visits a publisher webpage
2. Bidders receive a real-time auction
3. Winner displays an ad for the advertiser
4. User converts on the advertiser website (click / sale / lead)
[Diagram labels: Advertiser, Publisher, Bidder, CPA, CPM]
Performance Advertising Metrics
• Ideally: number of sales "generated" by advertising for a given budget
  • But difficult/costly to measure and optimize
  • E.g. incrementality A/B test
  • (also sales amount, margin, etc.)
• Practically: number of sales attributed to advertising
  • Commonly: last-click attribution
  • (also multi-touch, data-driven, etc.)
Real-time bidding for performance advertising
Key question: how much should we bid in the auction?
A little bit of game theory: 2nd price auctions
Sealed, one-round auction; the winner pays the second-highest bid
  Case   Value   Candidate bids     Competition
  1      1€      0.75€ or 1.1€      0.5€
  2      1€      0.75€ or 1.1€      1.5€
  3      1€      0.75€ or 1.1€      1.05€
  4      1€      0.75€ or 1.1€      0.8€
Auction games
• Second-price auctions
  • Dominant strategy: bid the expected gain ("truthful auction")
  • An overbid means you are losing money
  • An underbid means you are losing potential revenue
• Also: non-second-price auctions
• Floors (hard/soft/dynamic)
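The overbid/underbid argument can be checked numerically. The sketch below is illustrative only: it assumes a single competing bid, no floor price and no ties, and pairs each competing bid with a case number in the order the slide lists them. The function name `second_price_payoff` is made up for this example.

```python
def second_price_payoff(value: float, bid: float, competition: float) -> float:
    """Payoff of one sealed second-price auction against a single competing bid."""
    if bid > competition:
        # We win and pay the second-highest bid, i.e. the competing bid.
        return value - competition
    return 0.0  # We lose: nothing gained, nothing paid.


VALUE = 1.0
COMPETING_BIDS = [0.5, 1.5, 1.05, 0.8]  # cases 1 to 4, in slide order (assumed)

for case, competition in enumerate(COMPETING_BIDS, start=1):
    # Compare the underbid (0.75), the truthful bid (1.0) and the overbid (1.1).
    payoffs = {bid: round(second_price_payoff(VALUE, bid, competition), 2)
               for bid in (0.75, 1.0, 1.1)}
    print(f"case {case}: competition={competition} -> {payoffs}")
```

With competition at 1.05€ the overbid wins at a loss, and with competition at 0.8€ the underbid misses a profitable auction; the truthful bid is never worse than either.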
Baseline Bidding Policy
• Under the 2nd price auction hypothesis, the dominant strategy is to bid the expected value:
  $bid^* = CPA \times pSale$
  • $CPA$: "value of a conversion"
  • $pSale$: "probability of post-click attributed conversion"
• Model quality/calibration impacts revenue
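A small numeric sketch of why calibration matters: since the bid is linear in the predicted probability, a model that over- or under-predicts by some factor scales every bid by that factor. The CPA and probability values below are hypothetical, not figures from the talk.

```python
def baseline_bid(cpa: float, p_sale: float) -> float:
    """Expected-value bid: value of a conversion times its predicted probability."""
    return cpa * p_sale


CPA = 40.0            # hypothetical value of a conversion, in euros
TRUE_P_SALE = 1e-4    # hypothetical true conversion probability for this display

truthful = baseline_bid(CPA, TRUE_P_SALE)
for calibration in (0.5, 1.0, 2.0):   # ratio predicted / true probability
    bid = baseline_bid(CPA, calibration * TRUE_P_SALE)
    print(f"calibration x{calibration}: bid = {bid:.4f} EUR "
          f"(truthful bid = {truthful:.4f} EUR)")
```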
Data features
• What can we use to predict clicks & sales?
• User behavior on the advertiser's website:
  • time since last visit
  • engagement level
  • last product seen, etc.
• User fatigue: number of displays in the last x days
• Publisher:
  • publisher_id
  • url
  • display format
• Campaign:
  • vertical_id: travel, classified, cars, etc.
  • average CTR
Learn on huge volumes of data
10 000 displays lead to 50 clicks, which lead to 1 sale
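The funnel figures translate into rates as follows; this is plain arithmetic on the three numbers quoted on the slide, with variable names chosen for the example.

```python
displays, clicks, sales = 10_000, 50, 1   # figures quoted on the slide

ctr = clicks / displays          # click-through rate
cvr = sales / clicks             # post-click conversion rate
sale_rate = sales / displays     # sales per display

print(f"CTR = {ctr:.2%}, conversion rate after click = {cvr:.2%}, "
      f"sales per display = {sale_rate:.2%}")
```

The click-through rate of 0.5 per 100 displays is the class imbalance a click model has to cope with.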
Sizing of our prediction problems
• Class imbalance: 0.5 / 100
• N samples: $10^9$
• N raw variables: $10^2$
• N encoded features: $10^7$
Which algo to solve our problems?
Structured data
  • Lots of info in the data
  • High predictability
  • Highly structured info
Unstructured data
  • Poor predictability
  • Signal dominated by noise
  • Highly unstructured info
Optimizing for sales
• Predict: $P(Sales) = P(Click) \, P(Sales \mid Click)$
• Sales outcome ~ Bernoulli
• Use (regularized) logistic regression: $P(Y = 1 \mid X = x) = 1 / (1 + e^{-w^\top x})$
• Outputs a score in [0,1], interpreted as a probability
• Negative log-likelihood: $NLLH(y, p) = -y \log p - (1 - y) \log(1 - p)$
• Convex optimization, using (cheap) first-order and quasi-Newton methods (SGD, L-BFGS, SAG, …)
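A minimal sketch of logistic regression trained by SGD on the negative log-likelihood, in pure Python. It is not the production setup described in the talk: the data is synthetic, the learning rate, L2 strength and epoch count are arbitrary, and the helper names (`sigmoid`, `nll`, `sgd_logistic`) are made up for this example.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def nll(y: int, p: float, eps: float = 1e-12) -> float:
    """Negative log-likelihood of a Bernoulli outcome y under probability p."""
    p = min(max(p, eps), 1.0 - eps)
    return -y * math.log(p) - (1 - y) * math.log(1.0 - p)


def sgd_logistic(data, n_features, epochs=20, lr=0.1, l2=1e-4, seed=0):
    """Fit w by SGD on the L2-regularized NLL; `data` is a list of (x, y)."""
    rng = random.Random(seed)
    data = list(data)
    w = [0.0] * n_features
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
            # Gradient of the NLL w.r.t. w is (p - y) * x; add the L2 term.
            for j in range(n_features):
                w[j] -= lr * ((p - y) * x[j] + l2 * w[j])
    return w


# Toy data (synthetic): feature 0 is a bias term, feature 1 drives the label.
rng = random.Random(42)
samples = []
for _ in range(2000):
    x1 = rng.choice([0.0, 1.0])
    p_true = 0.6 if x1 == 1.0 else 0.1
    samples.append(([1.0, x1], 1 if rng.random() < p_true else 0))

w = sgd_logistic(samples, n_features=2)
print("P(Y=1 | x1=0) ~", round(sigmoid(w[0]), 2))
print("P(Y=1 | x1=1) ~", round(sigmoid(w[0] + w[1]), 2))
```

The fitted probabilities should land near the generating rates (0.1 and 0.6), which is the calibration property the bid formula relies on.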
Hashing trick
• Vanilla logistic regression uses binary features only
• Standard representation of categorical features: "one-hot" encoding
  • For instance, the site feature has one column per site (cnn.com, news.yahoo.com, …) and a single 1 in the column of the observed site: 0 0 0 1 0 0 0
• Dimensionality equal to the number of distinct values; it can be very large
• Hashing to reduce dimensionality (made popular by John Langford in Vowpal Wabbit): $h : \text{string} \to [0, \ldots, 2^b - 1]$
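A sketch of the hashing trick for one categorical feature. It uses MD5 from the standard library only because it is stable across runs; this is an assumption for illustration, not the hash function used by Vowpal Wabbit or by the authors. The names `hash_index`, `hashed_one_hot` and the bit width `B` are hypothetical.

```python
import hashlib


def hash_index(value: str, b: int) -> int:
    """Map a string to a bucket in [0, 2**b - 1] with a stable hash."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (2 ** b)


def hashed_one_hot(name: str, value: str, b: int) -> dict:
    """Sparse binary vector {index: 1} for one categorical feature."""
    return {hash_index(f"{name}={value}", b): 1}


B = 24  # hypothetical number of bits, i.e. 2**24 weight slots
for site in ("cnn.com", "news.yahoo.com"):
    print(site, "->", hashed_one_hot("site", site, B))
```

The model size is now fixed at 2**b weights whatever the number of distinct sites, at the price of occasional collisions between values.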
Quadratic features
• Outer product between two features; similar to a polynomial kernel of degree 2
• Large number of values ⟹ hashing trick
• Example: between site and advertiser, the feature is 1 ⟺ site=finance.yahoo.com & advertiser=bank of america
[Diagram labels: Publisher network, Publisher, Site, Url; Advertiser network, Ad, Campaign, Advertiser]
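Quadratic (cross) features can be hashed into the same index space as the single features. The sketch below is an assumption-laden illustration: the token format, the MD5-based `bucket` helper and the choice to cross every pair of fields are made up for the example.

```python
import hashlib
from itertools import combinations


def bucket(token: str, b: int) -> int:
    """Stable hash of a token into [0, 2**b - 1]."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % (2 ** b)


def encode(example: dict, b: int) -> set:
    """Indices of active binary features: singles plus all pairwise crosses."""
    singles = [f"{k}={v}" for k, v in sorted(example.items())]
    crosses = [f"{left}&{right}" for left, right in combinations(singles, 2)]
    return {bucket(token, b) for token in singles + crosses}


example = {"site": "finance.yahoo.com", "advertiser": "bank of america"}
active = encode(example, b=24)
print(len(active), "active features:", sorted(active))
```

The cross token is active only when both values occur together, which is the "feature is 1" condition of the example above.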
Part II : Attribution Model for Bidding Performance
Joint work with Julien Meynet, Pierre Galland, Damien Lefortier
published at AdKDD & TargetAd workshop (KDD 2017)
Outline
• The problem: bidding in display advertising
• Model:
  • Attribution model
  • Attribution-aware bidder
• Impact on offline evaluation metrics
• Experiments & results
"Post-click attributed conversions"?
[Timeline figure: Display ad impression, Paid search click, Display ad click, Email open, $$$]
• Last-click is the de facto attribution model…
  … but advertisers are moving towards "better" attribution models:
  • Rule-based: uniform, linear, etc.
  • Data-driven: regression, Shapley value, etc.
• But what is the impact from a bidder's perspective?
• What is the optimal bidding strategy right after a click?
Attribution-aware bidder
Attribution Probability Through Time Matters
[Plot; axis label: attribution probability given conversion]
Attribution Model
• How can we model the probability of getting the attribution, given that there will be a conversion?
  • $S$: post-click conversion
  • $A$: attributed conversion
  • $X$: contextual features
  • $\Delta$: delay between click and conversion
$\Pr(A = 1 \mid S = 1, X = x, \Delta = \delta) = e^{-\lambda(x)\,\delta}, \quad \lambda(x) \ge 0$
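The exponential decay can be tabulated for a few delays. In this sketch the decay rate is a hypothetical constant per day; in the model above it is a function $\lambda(x)$ of the context, and how it is learned is not covered here.

```python
import math


def attribution_probability(lam: float, delta: float) -> float:
    """Pr(A=1 | S=1, X=x, Delta=delta) = exp(-lambda(x) * delta)."""
    if lam < 0 or delta < 0:
        raise ValueError("lambda(x) and delta must be non-negative")
    return math.exp(-lam * delta)


LAMBDA = 0.1  # hypothetical decay rate per day for some context x
for days in (0, 1, 7, 30):
    print(f"delay = {days:>2} d -> Pr(attributed | conversion) = "
          f"{attribution_probability(LAMBDA, days):.3f}")
```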
Conversion Modeling
• Baseline solution:
  • 0/1 prediction problem ⟹ logistic regression
  • Large scale / latency constraint ⟹ hashing trick
$bid^* = CPA \times pSale$, where $pSale$ is the "probability of post-click attributed conversion"
But what are positives / negatives?
From Attribution Model to an Attribution Aware Bidder
Cast the problem as an internal attribution problem
  Attribution label             Credit per event
  Last click, $A_{LC}$          0      0      1
  Uniform                       1/3    1/3    1/3
  First click, $A_{FC}$         1      0      0
  All, $A_{ALL}$                1      1      1
  Fractional (non-uniform)      0.6    0.1    0.3
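The rule-based rows of the table can be generated for any number of events. This sketch assumes the three columns are events in chronological order, which the extracted slide does not state; the function names are made up.

```python
def last_click(n: int) -> list:
    """All credit to the most recent of n events."""
    return [0.0] * (n - 1) + [1.0]


def first_click(n: int) -> list:
    """All credit to the earliest of n events."""
    return [1.0] + [0.0] * (n - 1)


def uniform(n: int) -> list:
    """Equal credit to each of n events."""
    return [1.0 / n] * n


for scheme in (last_click, uniform, first_click):
    print(f"{scheme.__name__:>11}: {scheme(3)}")
```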
Attribution Aware Bidder: An Intuitive View
• AB: the previous click gives us the attribution, so only bid the "marginal value"
• LCB: the user is engaged, go for the last click
[Plot: bid value against time t, with a "New display opportunity" marker]
Attribution Aware Bidder
• Baseline Last-click Bidder (LCB):
  $bid = CPA \times \Pr(A_{LC} = 1 \mid X = x)$
• Attribution-aware Bidder (AB):
  $bid = CPA \times \Pr(A_{ALL} = 1 \mid X = x)\,\bigl(1 - e^{-\lambda(x)\,\delta}\bigr)$
  • $\delta$: time elapsed since the last click
Bid proportionally to the marginal contribution of the display
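A sketch of how the attribution-aware bid evolves after a click, next to the last-click bid formula. All numbers (CPA, conversion probability, a decay rate expressed per hour) are hypothetical, and the two probabilities are passed in directly instead of coming from trained models.

```python
import math


def lcb_bid(cpa: float, p_last_click_conversion: float) -> float:
    """Last-click bidder: CPA times Pr(A_LC = 1 | X = x)."""
    return cpa * p_last_click_conversion


def ab_bid(cpa: float, p_conversion: float, lam: float, delta: float) -> float:
    """Attribution-aware bidder: CPA * Pr(A_ALL = 1 | x) * (1 - exp(-lam * delta))."""
    return cpa * p_conversion * (1.0 - math.exp(-lam * delta))


CPA, P_CONV, LAMBDA = 40.0, 0.02, 0.5   # hypothetical values, LAMBDA per hour
for hours_since_click in (0.0, 1.0, 6.0, 24.0, 72.0):
    bid = ab_bid(CPA, P_CONV, LAMBDA, hours_since_click)
    print(f"{hours_since_click:>5.1f} h after click: AB bid = {bid:.4f} EUR")
```

Right after a click the attribution-aware bid is close to zero, since the earlier click would already earn the attribution; it then rises towards CPA times the conversion probability as the click ages.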
Impact on the offline evaluation metrics
Offline Evaluation of Bidders
• Utility metric on logged feedback:
  $U(\hat{p}) = \sum_{i \in \text{logs}} (a_i v_i - c_i)\,\mathbb{1}(\hat{p}_i v_i > c_i)$
• Expected Utility: add uncertainty on the cost distribution:
  $c \sim \Gamma(\alpha = \beta c_i + 1,\ \beta)$
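A sketch of both metrics on a tiny made-up log. It reads $a_i$ as the observed outcome, $v_i$ as its value and $c_i$ as the logged cost, following the Chapelle WWW'15 reference; that reading, the rate parameterization of the Gamma distribution (so that its mode is $c_i$) and the Monte Carlo estimate used in place of a closed-form integral are assumptions of this example.

```python
import random


def utility(logs, predict):
    """Sum of (a*v - c) over logged auctions that the bidder would have won."""
    total = 0.0
    for x, a, v, c in logs:
        if predict(x) * v > c:        # bid p_hat * v beats the logged cost
            total += a * v - c
    return total


def expected_utility(logs, predict, beta, n_draws=2000, seed=0):
    """Monte Carlo estimate with cost drawn from Gamma(shape=beta*c + 1, rate=beta)."""
    rng = random.Random(seed)
    total = 0.0
    for x, a, v, c in logs:
        bid = predict(x) * v
        acc = 0.0
        for _ in range(n_draws):
            # random.gammavariate takes (shape, scale); scale = 1 / rate.
            c_draw = rng.gammavariate(beta * c + 1.0, 1.0 / beta)
            if bid > c_draw:
                acc += a * v - c_draw
        total += acc / n_draws
    return total


# Hypothetical log: (features, outcome a, value v, cost c)
logs = [({"p": 0.10}, 1, 10.0, 0.8), ({"p": 0.05}, 0, 10.0, 0.3),
        ({"p": 0.01}, 0, 10.0, 0.5)]
predict = lambda x: x["p"]            # stand-in for a trained model

print("utility          :", round(utility(logs, predict), 3))
print("expected utility :", round(expected_utility(logs, predict, beta=1000.0), 3))
```

A large beta concentrates the sampled cost around the logged one, so the expected utility approaches the plain utility; a small beta smooths the win/lose indicator.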
Attribution Aware Expected Utility*
• Inject the attribution function into the utility:
  $AU(\hat{p}, a) = \sum_{i \in \text{logs}} (a(x_i)\, v_i - c_i)\,\mathbb{1}(\hat{p}_i v_i > c_i)$
Internal attribution function $a$:
  • can be last-click, first-click, etc.
  • can be the proposed attribution model
* Evaluation of the proposed metric would require a proper offline / online correlation analysis
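The attribution function can be passed as a plug-in, so the same log is scored under different internal attributions. The log fields (`attributed`, `converted`, `delay_days`), the decay rate and both attribution functions below are hypothetical stand-ins, and the expectation over costs is left out to keep the sketch short.

```python
import math


def attribution_aware_utility(logs, predict, attribution):
    """Sum of (a(x)*v - c) over logged auctions the bidder would have won."""
    total = 0.0
    for x, v, c in logs:
        if predict(x) * v > c:
            total += attribution(x) * v - c
    return total


# Hypothetical internal attribution functions a(x).
def last_click_label(x):
    return float(x["attributed"])          # logged 0/1 last-click label


def decay_model(x, lam=0.1):
    # Exponential-decay shape: converted * exp(-lam * delay since the click).
    return x["converted"] * math.exp(-lam * x["delay_days"])


logs = [  # (features, value v, cost c); all numbers are made up
    ({"p": 0.10, "attributed": 1, "converted": 1, "delay_days": 1.0}, 10.0, 0.8),
    ({"p": 0.08, "attributed": 0, "converted": 1, "delay_days": 20.0}, 10.0, 0.4),
]
predict = lambda x: x["p"]

print(round(attribution_aware_utility(logs, predict, last_click_label), 3))
print(round(attribution_aware_utility(logs, predict, decay_model), 3))
```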
Experiments & Results
Offline Evaluation - Dataset
Log sampled from 30 days of Criteo traffic
• Anonymized
• Each line is an impression with:
  • Timestamp
  • Price paid
  • Contextual features (user, advertiser, publisher)
  • Click*, click position*, click number*
  • Conversion*, conversion value*
  • Attribution label (conversion was attributed to Criteo)
• 16M displays, 5M clicks, 800k conversions
Will be available at http://research.criteo.com/ soon
Attribution Rates vs Time
[Plot: decay of attribution rate after a click]
• > 40% of conversions have more than one click in the preceding 30 days
Offline Evaluation – Impact on Bid Profiles
Post-click bid profiles for 3 bidders:
  • Last-Click Bidder (LCB)
  • First-Click Bidder (FCB)
  • Attribution Bidder (AB)
All models are learned using regularized logistic regression + hashing trick
Offline Evaluation – Bidders Comparison
Results for 3 bidders on the Attribution Aware Expected Utility
                           LCB          FCB          AB
  Win Rate                 0.94         0.90         0.89
  $U_A^*$, $\beta = 1000$  2852 ± 43    2888 ± 43    3396 ± 53
• We limit user overexposure after a click
• We get closer to lift-based bidding
• We can reinvest budget in more profitable campaigns / more incremental ads
Online result
We tested online a simple modification of the baseline through A/B testing:
$bid_{test} = bid_{ref} \times A\,\bigl(1 - B\,e^{-\lambda\,\delta}\bigr)$
  ΔOEC (long term)    Revenue (short term)    Advertiser ROI    User ad exposure
  +5.5% worldwide     negative                positive          lower
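The tested modification rescales the reference bid with a factor that depends on the time since the last click. The constants A, B and the decay rate below are hypothetical (the slide does not give their values), as is the function name `modified_bid`.

```python
import math


def modified_bid(ref_bid: float, a: float, b: float, lam: float, delta: float) -> float:
    """bid_test = bid_ref * A * (1 - B * exp(-lam * delta))."""
    return ref_bid * a * (1.0 - b * math.exp(-lam * delta))


A, B, LAMBDA = 1.0, 0.8, 0.2   # hypothetical constants, LAMBDA per day
REF_BID = 0.50                 # hypothetical reference bid in euros
for days in (0, 1, 3, 7, 30):
    print(f"{days:>2} d since last click: bid = "
          f"{modified_bid(REF_BID, A, B, LAMBDA, days):.3f} EUR")
```

With B below 1 the bid is reduced but not cancelled right after a click, and it returns to A times the reference bid as the click ages.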
Future Research Directions
Work in progress & Next steps
• Better attribution modeling
  • Exponential decay is naive: build a better model (e.g. travel partners have different attribution schemes)
• Model both conversion lift and attribution lift
  • Delayed feedback in both cases
• Derive a robust (counterfactual) offline metric
Questions?
References
• Simple and Scalable Response Prediction for Display Advertising, O. Chapelle, E. Manavoglu, and R. Rosales, ACM TIST, 2013.
• Offline Evaluation of Response Prediction in Online Advertising Auctions, O. Chapelle, WWW'15.
• Attribution Modeling Increases Efficiency of Bidding in Display Advertising, E. Diemert, J. Meynet, P. Galland, D. Lefortier, KDD'17 TargetAd workshop (best paper finalist).
http://labs.criteo.com
  • Articles on dev & science at Criteo
http://research.criteo.com
  • Conference reports & cutting edge science ;)
e.diemert@criteo.com
