Brains & Brawn: the Logic and Implementation of a Redesigned Advertising Marketplace (Sponsor Talk)

The Brains and the Brawn: The Logic and Implementation of a Redesigned Advertising Marketplace
PyData 2015
Stephanie Tzeng
Salvatore Rinchiera

ONLINE ADVERTISING
•  User: business cat
•  Website (publisher): litterboxes.com
•  Ad: BOGO laser pointers

APPNEXUS
•  Platform that connects advertisers and websites
•  Daily volume:
    •  120+ billion auctions
    •  44 billion ads served
    •  170+ TB of data
        •  Fills human brain capacity in 12 days
        •  700 billion pages of printed information
        •  40,000 miles of printed material
        •  Reaches the moon in 6.5 days
        •  $273,000,000,000 in printing costs

BIG data
“Have you tried Atkins?”
medium data

[Architecture diagram. Labels: impression, IMP BUS, AN Bidder, Other Bidders, Optimization, log, HDFS, Data Pipeline, Vertica, MySQL]

PERFORMANCE BUYING
•  Advertiser only pays the website when someone buys lasers
•  Website accepts CPA (cost per acquisition) payment
•  Ad: BOGO laser pointers

Advertiser (BOGO laser pointers): “I’ll give you $10 if someone clicks on my ad”
litterboxes.com: “k”

Offers arriving at litterboxes.com: $5/click, $50/purchase, $3/click, $7/click, $10/subscription, $9/click, $100/purchase, $10/purchase, $4/click, $0.5/click, $13/click, $8/purchase, $5/click, $3/click
litterboxes.com: “Uhhh…”

OFFERS -> BID VALUES
•  Cost Per Click (CPC) = $10
•  Pr(Click) = 0.1
•  E(Imp) = $10 x 0.1 = $1
Pr(Click) = # of clicks / # of impressions
What happens when we have no data?
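
The arithmetic on this slide can be written as a short Python sketch. The function names are hypothetical, chosen only for illustration; the numbers are the ones from the slide. The second function makes the closing question concrete: with zero impressions there is nothing to divide by, so no empirical estimate exists.

```python
def expected_value_per_impression(payout, event_probability):
    """Expected revenue of one impression: payout times the chance of the paid event."""
    if not 0.0 <= event_probability <= 1.0:
        raise ValueError("event_probability must be between 0 and 1")
    return payout * event_probability


def empirical_rate(events, impressions):
    """Observed event rate, or None when there is no data to estimate from."""
    if impressions == 0:
        return None
    return events / impressions


# Slide example: CPC = $10 and Pr(Click) = 0.1
bid_value = expected_value_per_impression(10.0, 0.1)
no_data = empirical_rate(0, 0)
```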

An Event = Information
•  Pr(Click) is crucial to the expected value of an offer
•  A rare event such as a click holds much more information than impressions with no clicks
•  0 / 10 imps = 0 …
•  0 / 1000 imps = 0 …
•  0 / 10000 imps = 0 … still??
•  1 / 10000 … now we’re getting somewhere!
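
One way to see why a run of click-free impressions says so little is to compute how likely that run is for an offer that does get clicks. The sketch below assumes independent impressions and a hypothetical true click rate of 0.001; neither assumption comes from the talk.

```python
def prob_zero_clicks(true_rate, impressions):
    """Chance of seeing no clicks at all in `impressions` independent impressions."""
    return (1.0 - true_rate) ** impressions


# Hypothetical offer whose true click rate is 1 in 1000.
for n in (10, 1000, 10000):
    print(n, prob_zero_clicks(0.001, n))
```

Under these assumptions, zero clicks in 10 impressions is almost certain (about 99%), zero in 1000 still happens more than a third of the time, and zero in 10000 is very unlikely. Only the last observation is strong evidence against the offer.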

OFFER STATES
Offers are allocated into 2 different auction states per website:
•  Optimized State = Exploit. Capitalizing on known information
•  Learn State = Explore. Acquiring new information
In learn, there is no data. We estimate an initial click or conversion probability to “buy” information.
[Diagram: IMPRESSIONS split between learn auctions and optimized auctions]

OLD WORLD
•  Compute a predicted probability for all offer:website combinations
•  Update this prediction as you gain information
•  Store all prediction combinations in memory
•  Problem: elaborate probability prediction schemes are not accurate enough, and you end up collecting real data very slowly for all offers

A SINGLE BID IN LEARN
WEBSITE (bidder memory)
offer                            learn valuation
instant ramen subscription       3
quest bars                       3.5
bh farms carrot juice            0.8
cholula hot sauce                0.00001
starbucks oprah chai lattes      2
justin bieber fanny pack         0.000005
…                                …
recorder midi file download      1.9

THE HYPOTHESIS
For information gathering, it is more cost effective to test “enough” impressions for a concentrated number of offers than to buy impressions for all offers based on a flawed prediction.
RECALL: when working with rare events, the event holds the key to your information.
You can use this event with limited impressions to classify an offer as good or bad.
CLICK GOOD!
NO CLICK BAD!
QUICK-TEST

WHAT IS “ENOUGH”?
N impressions = the minimum number of impressions you need to deem an offer bad (given no events).
Compute N impressions based on the offer’s payout (Goal), desired conversion rate (p0), and our tolerance for false negatives (λ).
For each website, set
If p represents the “true” click through rate for offer:website, then
N = min(n) such that
QUICK-TEST

THE MATH
Bayes’ theorem tells us
So given an intelligent prior H(p) for p, we can solve for N from
QUICK-TEST
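
The equations on the last two slides did not survive in this transcript, so the sketch below is only one plausible reading, not the speakers' formula. It assumes the stopping rule is "N is the smallest n for which P(p ≥ p0 | no events in n impressions) ≤ λ", and it assumes a Beta(1, b) prior for p purely because that gives a closed form: after n event-free impressions the posterior is Beta(1, b + n), whose upper tail at p0 is (1 − p0)^(b + n). The function names and the `prior_b` parameter are hypothetical.

```python
import math


def posterior_prob_good(p0, n, prior_b=1.0):
    """P(p >= p0 | zero events in n impressions) under an assumed Beta(1, prior_b) prior."""
    return (1.0 - p0) ** (prior_b + n)


def min_impressions(p0, lam, prior_b=1.0):
    """Smallest n for which an event-free offer is 'good' with probability at most lam."""
    if not (0.0 < p0 < 1.0 and 0.0 < lam < 1.0):
        raise ValueError("p0 and lam must lie strictly between 0 and 1")
    # Closed-form solution of (1 - p0) ** (prior_b + n) <= lam for n.
    n = max(0, math.ceil(math.log(lam) / math.log(1.0 - p0) - prior_b))
    # Guard against floating-point rounding right at the boundary.
    while n > 0 and posterior_prob_good(p0, n - 1, prior_b) <= lam:
        n -= 1
    while posterior_prob_good(p0, n, prior_b) > lam:
        n += 1
    return n


# Hypothetical target rate of 1 in 1000, with a strict and a loose tolerance.
n_strict = min_impressions(0.001, 0.05)
n_loose = min_impressions(0.001, 0.5)
```

Under these assumptions a smaller λ always requires a larger N: the less willing you are to discard a good offer, the more event-free impressions you have to buy before giving up on it.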

LAMBDA VS N
QUICK-TEST
[Plot: N versus lambda (false negative rate); lambda axis ticks at 0.5 and 1; annotation: “Stop buying!”]
As you decrease your threshold for false negatives, N grows, and you are more confident that an offer with no events is not good.
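CHOOSING LAMBDA
ROC Curve
•  Sensitivity = TP / (TP + FN) = True Positive Rate
•  1 – Specificity = 1 – True Negative Rate = FP / (FP + TN) = False Positive Rate
•  TP, FP, TN, FN = counts of true positives, false positives, true negatives, false negatives
[Plot: ROC curve of sensitivity against 1 – specificity, with the ideal point marked “ideal”]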
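
Each candidate lambda yields one point on the ROC curve. A minimal sketch of how such a point could be computed from classification counts follows; the function name and the counts are hypothetical, and the talk does not say how its curve was produced.

```python
def roc_point(tp, fp, tn, fn):
    """Return (false positive rate, sensitivity) for one set of classification counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    false_positive_rate = fp / (fp + tn)    # 1 - specificity
    return false_positive_rate, sensitivity


# Hypothetical outcome for one lambda: 100 truly good offers, 100 truly bad ones.
fpr, tpr = roc_point(tp=80, fp=10, tn=90, fn=20)
```

The ideal point is a false positive rate of 0 with a sensitivity of 1, so a natural choice is the lambda whose point lies closest to that corner.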

1. Rank
Pre-rank offers for each seller unit based on predicted revenue
1.  offer ----
2.  offer ----
3.  offer ----
4.  offer ----
5.  offer ----
6.  offer ----
7.  offer ----
8.  offer ----
9.  offer ----
10.  offer ----
….
SELLER UNIT

2. Select Top Offers
Only the top offers are tested at a time in learn auctions
1.  offer ----
2.  offer ----
3.  offer ----
4.  offer ----
5.  offer ----
6.  offer ----
7.  offer ----
8.  offer ----
9.  offer ----
10.  offer ----
….
SELLER UNIT

3. Quick-Test the Offers
Each offer is given a chance to test.
As offers “pass or fail,” they are quickly removed from the testing state to give other offers a chance.
offer ---- passed
offer ---- passed
offer ---- passed
offer ---- failed
offer ---- failed
offer ---- failed
…
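
The three steps (rank, select top offers, quick-test) can be sketched as a small state machine. Everything here is an illustrative assumption: the `Offer` fields, the status names, and the rule that a single event passes an offer while N event-free impressions fail it (following “CLICK GOOD! NO CLICK BAD!”).

```python
from dataclasses import dataclass


@dataclass
class Offer:
    name: str
    predicted_revenue: float   # used only for pre-ranking
    n_required: int            # quick-test impression budget (N)
    impressions: int = 0
    events: int = 0
    status: str = "waiting"    # waiting -> testing -> passed / failed


def refresh_testing_set(offers, max_testing):
    """Steps 1 and 2: rank by predicted revenue, then fill free testing slots from the top."""
    ranked = sorted(offers, key=lambda o: o.predicted_revenue, reverse=True)
    slots = max_testing - sum(o.status == "testing" for o in ranked)
    for offer in ranked:
        if slots <= 0:
            break
        if offer.status == "waiting":
            offer.status = "testing"
            slots -= 1
    return [o for o in ranked if o.status == "testing"]


def record_impression(offer, had_event):
    """Step 3: one event passes the offer; N impressions without an event fail it."""
    offer.impressions += 1
    if had_event:
        offer.events += 1
        offer.status = "passed"
    elif offer.impressions >= offer.n_required:
        offer.status = "failed"


offers = [Offer("a", 3.0, 2), Offer("b", 3.5, 2), Offer("c", 0.8, 2)]
first_round = [o.name for o in refresh_testing_set(offers, max_testing=2)]
record_impression(offers[1], had_event=True)     # "b" gets an event and passes
record_impression(offers[0], had_event=False)
record_impression(offers[0], had_event=False)    # "a" exhausts its budget and fails
second_round = [o.name for o in refresh_testing_set(offers, max_testing=2)]
```

Once an offer leaves the testing state, its slot goes to the next-ranked waiting offer, which is how lower-ranked offers eventually get their turn.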
[Diagram: IMPRESSIONS split between learn auctions, optimized auctions, and “ad hell”]

RESULTS
[Plot: number of offers by bid type (learn vs. optimized)]


Distributed Work Queue
•  Distribute computation across several machines
•  A large unit of work is called a job
•  Jobs are split into smaller units of work called tasks
•  The scheduler is responsible for kicking off jobs and splitting them into tasks
•  Task computation is carried out by worker machines
[Diagram: one job split into four tasks]
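
The job/task/worker vocabulary can be illustrated in a single process. In this sketch, threads stand in for worker machines and an in-memory `queue.Queue` stands in for the message broker. It is not the in-house system, and the function names are hypothetical.

```python
import queue
import threading


def split_job(items, task_size):
    """Scheduler side: cut one job (a list of items) into fixed-size tasks."""
    return [items[i:i + task_size] for i in range(0, len(items), task_size)]


def run_job(items, task_size, handle_task, num_workers=4):
    """Run every task of a job on a pool of worker threads and collect the results."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    for task in split_job(items, task_size):
        tasks.put(task)

    def worker():
        # Each worker keeps pulling tasks until the queue is empty.
        while True:
            try:
                task = tasks.get_nowait()
            except queue.Empty:
                return
            outcome = handle_task(task)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# One job: sum the numbers 0..9, three numbers per task.
partial_sums = run_job(list(range(10)), task_size=3, handle_task=sum)
total = sum(partial_sums)
```

Results arrive in whatever order the workers finish, so the example combines them with an order-independent operation.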

Distributed Work Queue
•  In-house system implemented in Python
•  Scheduler and worker machines listen on RabbitMQ
•  Similar to Celery
•  The scheduler begins a job when it receives a notification on the exchange
•  Job history and current status are stored in MySQL
•  23 worker machines

[Diagram: Data Pipeline, RabbitMQ, Scheduler, and three workers, connected by steps 1 to 5; step 1 is the message “Your data is ready!”]

RabbitMQ
•  Not interested in pushing the broker service to the limit
    •  Tens of messages per second
•  Lots of flexibility with regard to routing rules
•  Fault tolerant
    •  Clustering, federation, mirroring, message durability
•  Fantastic Python support with pika
•  Built-in features that make it ideal for a work queue
    •  Message acknowledgements (acking)
    •  Automatic re-routing on failure

Data flow
[Diagram labels: HDFS (raw log data), Vertica, Redis, MySQL, Bidder Memory; labeled steps: 1. aggregate, 3. transition, 4. sync]

Distributed Filesystem
•  Stores log-level data on every single impression
•  170+ TB / day
•  Great for large data sets
•  Fault tolerant
•  ~1400 machines
•  2 datacenters, NYM & LAX

[Data flow diagram repeated, with a “You are here.” marker]

Aggregate into Vertica
•  HP column store database
•  Fast aggregations!
•  Scales horizontally
•  Supports frequent querying
•  ~150 Vertica nodes
Aggregation (Java)

Vertica
•  331,813,174,897 rows

$$$ing
•  Many processes need to access the same aggregated data
•  Can’t be querying Vertica constantly

Slicing + Dicing with Pandas
•  Represent rows and columns in memory
•  API supports complex row manipulation
•  Let pandas worry about performance for you! (mostly)
•  pyodbc + iopro under the hood
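
A minimal sketch of the kind of slicing this slide describes, using a small hand-built frame in place of rows fetched from Vertica. The column names, the `example.org` site, and the numbers are all hypothetical.

```python
import pandas as pd

# Stand-in for aggregated rows that would normally come from the database.
rows = pd.DataFrame(
    {
        "website": ["litterboxes.com", "litterboxes.com", "litterboxes.com", "example.org"],
        "offer": ["laser pointers", "laser pointers", "carrot juice", "laser pointers"],
        "impressions": [1000, 4000, 2500, 800],
        "clicks": [1, 3, 0, 0],
    }
)

# Roll up to one row per website / offer combination.
totals = rows.groupby(["website", "offer"], as_index=False)[["impressions", "clicks"]].sum()
totals["click_rate"] = totals["clicks"] / totals["impressions"]

# Combinations that have not produced a single click yet.
zero_click = totals[totals["clicks"] == 0]
```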

What do we cache?
•  Websites with new data
    •  If a website has new data, we’ll need to reevaluate which offers it is testing
•  How many offers should be testing on each website at a time
    •  AKA Max Testing Offer
•  Which offers are allowed to serve in which countries / sizes
•  How many impressions / clicks / conversions each website / offer combination has

Redis
•  Key-value store
•  Rich feature set
    •  Sets
    •  Hashes
    •  Sorted sets
•  Enforce best practices
    •  Common layer to handle serialization + deserialization
    •  Defined namespaces
•  redis-py API
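
One way to enforce “a common layer for serialization” and “defined namespaces” is a thin wrapper that owns both concerns. The sketch below is an assumption about what such a layer could look like, not the speakers' code. `NamespacedCache` and `InMemoryClient` are hypothetical names, JSON is an arbitrary choice of format, and the in-memory client exists only so the example runs without a server. A redis-py client could be passed instead, since it also exposes `get` and `set`.

```python
import json


class NamespacedCache:
    """Thin wrapper that owns key naming and (de)serialization for one namespace."""

    def __init__(self, client, namespace):
        self._client = client        # anything with get(key) and set(key, value)
        self._namespace = namespace

    def _key(self, *parts):
        # Every key is prefixed with the namespace, e.g. "counts:litterboxes.com:42".
        return ":".join([self._namespace, *map(str, parts)])

    def put(self, *parts, value):
        self._client.set(self._key(*parts), json.dumps(value))

    def fetch(self, *parts, default=None):
        raw = self._client.get(self._key(*parts))
        return default if raw is None else json.loads(raw)


class InMemoryClient:
    """Stand-in for a Redis client so the sketch runs without a server."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


counts = NamespacedCache(InMemoryClient(), "counts")
counts.put("litterboxes.com", 42, value={"impressions": 10000, "clicks": 1})
cached = counts.fetch("litterboxes.com", 42)
```

Because callers never build keys or encode values themselves, two processes that share the cache cannot disagree about either.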

Beware! Resource starvation
•  Apache ZooKeeper + Kazoo to the rescue!
“pls help, I’m so hungry”

Load to MySQL with SQLAlchemy
•  After the transition computation, we write data to MySQL
•  SQLAlchemy does the writes
