Taras Matsyk - Event Driven ML

Go && Python
Event driven ML
Lalafo case

How it all started
Part 1. The Phantom Menace

StackOverflow: Ready to rescue

$
Feedback
Processes Monitoring
RAM - expiration?
Relational DB?

Any issues there?
1. Request monitoring (?)
2. Hard to reason about GPU usage (varies from 3 to 7 GB)
3. ~5 GB RAM out of the box (scale?)
4. Redis is in-memory storage (temporary)
5. Threading != Parallel
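
A minimal sketch of point 5, assuming a standard CPython build with the GIL: CPU-bound work spread over threads returns the same results as a plain loop, but it is usually no faster. The function names and job sizes here are illustrative and not taken from the Lalafo code base.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_bound(n: int) -> int:
    """Pure-Python arithmetic loop; it holds the GIL while it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_sequential(jobs):
    """Run every job one after another on the calling thread."""
    return [cpu_bound(n) for n in jobs]


def run_threaded(jobs, workers=4):
    """Run the same jobs on a thread pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(cpu_bound, jobs))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    jobs = [1_000_000] * 4
    _, sequential_time = timed(run_sequential, jobs)
    _, threaded_time = timed(run_threaded, jobs)
    print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

On a GIL build the two timings tend to be close. Separate processes (for example `concurrent.futures.ProcessPoolExecutor`) are the usual way to get real parallelism for CPU-bound Python code.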

http://www.nooooooooooooooo.com/

What if we split
API and ML
Part 2. A new hope

Requirements
Lalafo case
1. Single image/request processing time: < 2 seconds
2. Visibility
3. Persistence
4. Scalability
5. Make it extendable for new features
a. Price prediction
b. Similarity search
c. Segmentation
6. SDK friendly (well documented, tested etc)

After 2 months
API Listeners
Feedback
$

Requirements
Lalafo case
1. Single image/request processing time: < 2 seconds
2. Visibility - Decouple request and prediction
3. Persistence
4. Scalability
5. Make it extendable for new features
a. Price prediction
b. Similarity search
c. Segmentation
6. SDK friendly (well documented, tested etc)
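
Requirement 2 (decouple request and prediction) can be sketched roughly as follows. This is an illustration only: `queue.Queue` stands in for the message broker, a dict stands in for persistent storage, and `predict`, `submit`, `listener` and `run_demo` are hypothetical names.

```python
import queue
import threading
import uuid

events = queue.Queue()          # stand-in for a broker topic
results = {}                    # stand-in for a persistent store
results_lock = threading.Lock()


def predict(image_url: str) -> str:
    """Placeholder for the ML model call."""
    return f"label-for:{image_url}"


def submit(image_url: str) -> str:
    """API side: publish an event and return an id without waiting."""
    request_id = str(uuid.uuid4())
    events.put({"id": request_id, "image_url": image_url})
    return request_id


def listener() -> None:
    """ML side: consume events and store predictions until told to stop."""
    while True:
        event = events.get()
        if event is None:       # sentinel: shut the listener down
            events.task_done()
            break
        label = predict(event["image_url"])
        with results_lock:
            results[event["id"]] = label
        events.task_done()


def run_demo(urls):
    """Start one listener, submit all urls, wait, and collect predictions."""
    worker = threading.Thread(target=listener, daemon=True)
    worker.start()
    ids = [submit(url) for url in urls]
    events.join()               # block until every event is processed
    events.put(None)
    worker.join()
    return {request_id: results[request_id] for request_id in ids}
```

Because the API only publishes an event and returns an id, each request stays visible as a record, and listeners can be scaled independently of the API.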

2 seconds per request
Part 3. The Empire Strikes Back

API Listeners
1 second
0.01 sec

segmentio/kafka-go -> shopify/sarama
API Listeners
0.02 sec
0.01 sec

API Listeners
0.02 sec
0.3 second
0.3 second
Synchronous request

2.4 seconds per request
API Listeners
0.02 sec
Synchronous request

API Listeners
0.02 sec
Synchronous request
Add ThreadPool
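
A rough sketch of the ThreadPool step, assuming the synchronous requests are IO-bound: blocking calls are handed to a thread pool so they overlap instead of running back to back. `blocking_request` is a hypothetical stand-in that only sleeps, and the delay is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_request(item: str, delay: float = 0.05) -> str:
    """Stand-in for a synchronous network call; sleeping releases the GIL."""
    time.sleep(delay)
    return f"done:{item}"


def process_batch(items, workers=8):
    """Issue all blocking calls through a pool; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(blocking_request, items))


if __name__ == "__main__":
    start = time.perf_counter()
    print(process_batch([f"img-{i}" for i in range(8)]))
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

With enough workers the batch takes roughly as long as the slowest single call rather than the sum of all calls. This helps only while the work is waiting on IO.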

Success?
Not yet

API Listeners
0.02 sec

API Listeners
0.02 sec
1 second and growing

API Listeners
0.02 sec

API Listeners
0.02 sec
Scale
Scale
Master -> Slave -> balancer
Topics -> Partitions
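
The topics-to-partitions idea can be illustrated with a small sketch. It assumes key-based routing and a simple round-robin assignment of partitions to consumers. `zlib.crc32` is used here only as a stand-in hash and is not the hash Kafka clients use by default; all names are hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a message key to a partition; the same key always maps the same way."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, consumers: list) -> dict:
    """Spread partitions over consumers round-robin."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    print(partition_for("user-42", 6))
    print(assign_partitions(6, ["listener-a", "listener-b", "listener-c"]))
```

Adding partitions raises the ceiling on how many listeners can consume in parallel, since within a consumer group each partition is read by one consumer at a time.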

< 1 second per request
Success?
Not yet

1 second per request
Success?
Not yet

Back to 2, 5, 10 seconds per request

Troubleshooting
API Listeners
0.02 sec
? ?

Troubleshooting
API Listeners
0.02 sec
?

Troubleshooting
API Listeners
0.02 sec
?
175k records per request

Faust 1.4.6: No latest offset
https://github.com/robinhood/faust

Issues to solve
1. Occasional spikes in performance (GC, network latency)
2. Message broker (Kafka rebalancing, offset etc)
3. How to handle DB migrations
4. Something we are not aware of yet

Lessons learnt
1. CPU bound tasks != IO bound (¯\_(ツ)_/¯)
2. High coupling - low cohesion
3. You need to know how to cook MongoDB
4. Go is not as obvious or as library-rich as Python
5. Simple != Easy
6. Concurrency != Parallelism (obviously)
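
Lessons 1 and 6 can be illustrated with a short sketch, assuming IO-bound work: several waits overlap on a single thread, which is concurrency without parallelism. `fake_io` is a hypothetical stand-in for a network or database call.

```python
import asyncio
import threading


async def fake_io(name: str, delay: float):
    """Simulate an IO wait and report which thread handled it."""
    await asyncio.sleep(delay)
    return name, threading.get_ident()


async def gather_all(names, delay: float = 0.01):
    """Start every wait at once; results come back in input order."""
    return await asyncio.gather(*(fake_io(name, delay) for name in names))


def run(names):
    """Drive the coroutines to completion on a fresh event loop."""
    return asyncio.run(gather_all(names))


if __name__ == "__main__":
    for name, thread_id in run(["db", "cache", "http"]):
        print(name, thread_id)
```

Every task reports the same thread id: the waits overlap, but nothing runs in parallel. A CPU-bound task placed in the same loop would block all the others.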

+ Live statistics from PostgreSQL
API Listeners
