Learn Like a Human: Taking Machine Learning from Batch to Real-Time

By Elad Rosenheim, Senior Software Architect

On the Agenda:
• The Industry’s Problem
• Old School Solutions
• Meet the Contextual Bandits

The Industry’s Problem
• Publishers, retailers, SaaS all share a common problem
• They know their domain but not how to optimize for each user
• Screen real-estate is limited yet everyone sees the same thing

Commonly asked questions
• What articles should be shown on the homepage of an online publisher?
• What titles and images would attract the most clicks?
• What should the order and layout of the articles be?
• Should every user see the “Top Videos” section?
• What is the best layout for the homepage?

• Which product sorting layout would yield the highest revenue?
• Should the layout differ significantly between different user segments?

• What types of products should be shown, and to whom?

In the Beginning:
• First came the educated guess
• Then came the A/B test
• “Data Beats Opinion”
• Freedom to experiment (with nice tools)
• Hopefully: Less fear of change, less politics
• How does it work?
• Split traffic between baseline and alternative variations
• In theory: Sit and wait for significant results
• In practice: Peek at the numbers ‘til “95% confidence” is reached
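The "wait for significant results" step can be sketched as a two-proportion z-test on the baseline and one alternative. This is a minimal illustration, not the tool the talk refers to; the function name and the impression and conversion counts are hypothetical.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variations convert equally
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution, via erfc
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: 10,000 impressions per variation
z, p = two_proportion_z_test(conv_a=170, n_a=10_000, conv_b=240, n_b=10_000)
significant = p < 0.05  # the usual "95% confidence" threshold
print(f"z={z:.2f}, p={p:.5f}, significant={significant}")
```

The threshold assumes a single look at a sample size fixed in advance. Re-running the test every time you peek and stopping at the first p < 0.05 inflates the false-positive rate.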

A/B Tests: Already Old School?
While you wait, you're bleeding clicks!
Clicks = Money
What about the really dynamic stuff?
Campaigns, headlines, products on sale

Enter the Multi-arm Bandits
• A Single-arm Bandit
• Suppose I have multiple arms in front of me, each with its unknown mean reward…
• How do I optimize income from multiple machines?
• Caution or Haste?
• Explore vs. Exploit
• In our context: How do I optimize multiple variations?

Bandits - A Classic Problem
• (Very) Simple Solutions
• ε-greedy, ε-decreasing
• First 100% random explore, then ~90% exploit?
• Magic numbers, built-in revenue loss
• Bayesian-based approaches
• Smoother curve from explore to exploit
• “Winner” is now a less relevant term
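The two families named on this slide can be compared in a small simulation. The sketch below uses ε-greedy as the simple solution and Thompson sampling with Beta posteriors as one common Bayesian approach. The slide does not say which Bayesian method is meant, so that choice is an assumption, as are the function names and the true conversion rates.

```python
import random

def epsilon_greedy(successes, trials, epsilon=0.1, rng=random):
    """With probability epsilon explore at random, otherwise exploit the best observed rate."""
    if rng.random() < epsilon:
        return rng.randrange(len(trials))
    rates = [s / t if t else 0.0 for s, t in zip(successes, trials)]
    return max(range(len(rates)), key=rates.__getitem__)

def thompson_sampling(successes, trials, rng=random):
    """Draw a plausible rate per arm from its Beta posterior and play the highest draw."""
    draws = [rng.betavariate(1 + s, 1 + (t - s)) for s, t in zip(successes, trials)]
    return max(range(len(draws)), key=draws.__getitem__)

def simulate(policy, true_rates, rounds=20_000, seed=0):
    """Run one policy against simulated arms; return per-arm successes and trials."""
    rng = random.Random(seed)
    successes = [0] * len(true_rates)
    trials = [0] * len(true_rates)
    for _ in range(rounds):
        arm = policy(successes, trials, rng=rng)
        reward = 1 if rng.random() < true_rates[arm] else 0
        trials[arm] += 1
        successes[arm] += reward
    return successes, trials

TRUE_RATES = [0.004, 0.017, 0.024]  # hypothetical conversion rates, unknown to the policies
for name, policy in [("epsilon-greedy", epsilon_greedy), ("thompson", thompson_sampling)]:
    successes, trials = simulate(policy, TRUE_RATES)
    print(name, "pulls per arm:", trials, "total conversions:", sum(successes))
```

ε-greedy keeps spending a fixed share of traffic on random arms (the "magic number"), while Thompson sampling explores less as the posteriors narrow, which is the smoother explore-to-exploit curve the slide mentions.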

Bandits work well when…
We want to find the variation “best on average”
…but we’re not improving the conversion rate of any single variation
[Figure: example conversion rates of 0.4%, 1.7%, 2.4%]

Enter Personalization
• Each of us is a beautiful and unique feature vector!
• By showing the right variation to the right people, we can improve conversions per variation and beat the best variation
• Machine learning challenge accepted!

The Usual Suspects
• Collaborative Filtering?
• Very big, very sparse matrix
• Cold start, batch process
• Not suitable in this case
• Classifiers?
• Logistic Regression, Random Forest et al.
• Periodically learn over all converters so far
• More data = more time, bigger model
• “Partial Feedback” problem:
• Users aren’t of class “Variation A” or “Variation B”.
• Rather, for any given user, we’re placing our bets on a variation and don’t know what the reward would be for any other alternative.

What We Need
• Like a bandit, we need to learn as we go (not in batch), but this time with “context” - the user’s data
• Incremental Learning over the stream of impressions & rewards (“Partial Fit”)
• We’re looking to…
• Start learning from the first impression
• Handle the explore-exploit curve
• Run fast (enough)
• Worst case: Converge on the best variation like a bandit
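Incremental learning over a stream can be sketched as a model that is updated one mini-batch at a time and never revisits old data. The class below is a hypothetical, standard-library-only logistic regression trained by stochastic gradient descent. The labeling rule and the batch size are made up for illustration.

```python
import math
import random

class OnlineLogisticRegression:
    """Logistic regression trained by SGD, one mini-batch at a time."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, batch_x, batch_y):
        """Update from one mini-batch; earlier batches are not needed again."""
        for x, y in zip(batch_x, batch_y):
            error = self.predict_proba(x) - y  # gradient of the log loss w.r.t. z
            self.bias -= self.learning_rate * error
            for i, xi in enumerate(x):
                self.weights[i] -= self.learning_rate * error * xi

rng = random.Random(0)
model = OnlineLogisticRegression(n_features=1)
for _ in range(40):  # 40 mini-batches of 50 events each
    xs = [[rng.uniform(-1, 1)] for _ in range(50)]
    ys = [1 if x[0] > 0 else 0 for x in xs]  # hypothetical rule: positive feature converts
    model.partial_fit(xs, ys)
    # The model is usable for prediction after every batch, not only at the end

print(model.predict_proba([1.0]), model.predict_proba([-1.0]))
```

The method name mirrors scikit-learn, where estimators such as `SGDClassifier` expose `partial_fit` for the same purpose.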

Meet the Contextual Bandits
• They “eat” the data stream
• They demand fast access to user data
• Historical or immediate
• Their model is always ready for action
• In the Papers
• Linear Bayes, LinUCB
• What we do: Per-Variation Logistic Regression
• A variant supporting updates in “mini-batches”
• Exploration-on-top
• Worst case: “Garbage In, Multi-Arm Bandit Out”
• Light on memory, compact output
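A per-variation logistic regression with mini-batch updates and exploration on top might look roughly like the sketch below. It is not the production implementation: the class names, the ε-greedy exploration rule, the batch size, and the two user segments are all assumptions made for illustration.

```python
import math
import random

class ArmModel:
    """Tiny online logistic regression for a single variation."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * (n_features + 1)  # last weight acts as the bias
        self.lr = lr

    def predict(self, x):
        z = sum(w * v for w, v in zip(self.w, list(x) + [1.0]))
        return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

    def update(self, batch):
        for x, reward in batch:
            err = self.predict(x) - reward
            for i, v in enumerate(list(x) + [1.0]):
                self.w[i] -= self.lr * err * v

class ContextualBandit:
    def __init__(self, variations, n_features, epsilon=0.1, batch_size=20, seed=0):
        self.models = {v: ArmModel(n_features) for v in variations}
        self.pending = {v: [] for v in variations}  # events waiting for the next mini-batch
        self.epsilon = epsilon
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def choose(self, x):
        # Exploration on top: sometimes ignore the models and pick at random
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.models))
        return max(self.models, key=lambda v: self.models[v].predict(x))

    def record(self, variation, x, reward):
        # Partial feedback: only the variation that was shown learns from this event
        buf = self.pending[variation]
        buf.append((x, reward))
        if len(buf) >= self.batch_size:
            self.models[variation].update(buf)
            buf.clear()

# Hypothetical world: two user segments with opposite preferences
RATES = {("A", 0): 0.30, ("B", 0): 0.05, ("A", 1): 0.05, ("B", 1): 0.30}
world = random.Random(42)
bandit = ContextualBandit(["A", "B"], n_features=2)
for _ in range(20_000):
    segment = world.randrange(2)
    x = [1.0, 0.0] if segment == 0 else [0.0, 1.0]
    shown = bandit.choose(x)
    reward = 1 if world.random() < RATES[(shown, segment)] else 0
    bandit.record(shown, x, reward)

for x in ([1.0, 0.0], [0.0, 1.0]):
    print(x, {v: round(m.predict(x), 3) for v, m in bandit.models.items()})
```

If the features carry no signal, every model collapses toward its variation's average rate and the whole thing behaves like a plain bandit, which is the worst case the slide describes.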

How We Do It: Online & Offline
• Online should be fast & scalable
• Offline: A test-bed for iteratively testing new ideas
• New algorithms
• Tweaked parameters
• Feature transformations - last but not least!

The Online Flow
[Diagram: DY Web Servers (a. get our script; b. log impressions, conversions), Queue Per Test (variations A, B, C), Learn Workers, User DB, Persist Model, Load to Predict Server, Predictions]
The Offline Evaluator
• Test, Improve, Iterate
• Using real-world data
• Using generated data
• From easy to hard
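An offline evaluator over generated data can be sketched with the replay method: log events in which the shown variation was chosen uniformly at random, then score a candidate policy only on the events where it would have made the same choice. The talk does not say which evaluation method is used, so this is an assumed technique. The segments, rates, and policy names are hypothetical.

```python
import random

def generate_log(n_events, seed=0):
    """Generated data: two hypothetical segments, variations shown uniformly at random."""
    rng = random.Random(seed)
    rates = {("mobile", "A"): 0.30, ("mobile", "B"): 0.05,
             ("desktop", "A"): 0.05, ("desktop", "B"): 0.30}
    log = []
    for _ in range(n_events):
        segment = rng.choice(["mobile", "desktop"])
        shown = rng.choice(["A", "B"])
        reward = 1 if rng.random() < rates[(segment, shown)] else 0
        log.append((segment, shown, reward))
    return log

def replay_evaluate(policy, log):
    """Average reward over logged events where the policy agrees with what was shown."""
    matched = [reward for segment, shown, reward in log if policy(segment) == shown]
    return sum(matched) / len(matched) if matched else 0.0

def always_a(segment):
    return "A"

def personalized(segment):
    return "A" if segment == "mobile" else "B"

log = generate_log(20_000)
score_a = replay_evaluate(always_a, log)
score_personalized = replay_evaluate(personalized, log)
print(f"always A: {score_a:.3f}, personalized: {score_personalized:.3f}")
```

Generated data like this is the easy end of the "from easy to hard" scale, because the true rates are known and the expected ranking of policies can be checked. A learning policy would be evaluated the same way, updating itself only on the matched events.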

Going Global
• Learn in the center site, predict quickly in each geo. How?
• Push models via local Redis slaves
• Compressed SSH tunnel
• User data - daily aggregation
• Store into LMDB (simple, fast memory-mapped K/V DB)
• Sync via S3 (LZ4 compressed), read from local SSD
• Learn & Predict services
• Python as ML lingua franca: NumPy, SciPy, scikit-learn

Final Word:
Keep It Simple, Smart!
• Better data beats better algorithms
• Reduce aggressively
* Thanks to Idan Michaeli, Lead Data Scientist
