How Credit Karma Makes Real-Time Decisions For 60 Million Users With Akka Streams And Actors

Proprietary & Confidential
Using Akka Streams for Real-Time Decision Making
Dustin Lyons
Engineering Manager, Data Platform

Who I am
● Engineer turned Engineering Manager at Credit Karma
● Data & Analytics on the Platform team
● Build things that make decisions on where data should go
● Lover of science fiction, sushi, and electronic music

Credit Karma is a free financial assistant, helping over
60 million people make progress.

Agenda for today
1. Data Infrastructure at Credit Karma: Past and current
2. Mo’ data, mo’ problems
3. Akka Streams saves the day
4. Results and learnings
5. Q&A

Data scale (MB/min) @ Credit Karma

Credit Karma data platform: PHP days
PHP Scripts

New tools to help with scale

Credit Karma data platform: Scala in 2014
Data Warehouse Import

New tools to help with concurrency

Credit Karma data platform: Akka in 2015
Analytics Export Service + Data Warehouse Import

Analytics export service
Components: Coordinator, Data Transformer, Workers, Kafka Importer, Workers, HTTP Ingest Server

Workers
Kafka Importer
Workers
HTTP Ingest Server

Data warehouse import
Reader, Deduplicator, Processor, Extractors
Data Warehouse Import Service

Marble maze
1. Reading from file
2. Waiting for external service
3. Objects sit in heap
4. Database insert

Backpressure

What is backpressure?
Backpressure refers to the buildup of data at an I/O switch
when buffers are full and not able to receive additional data.
No additional data packets are transferred until the
bottleneck of data has been eliminated or the buffer has been
emptied.
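
The definition above can be illustrated with a bounded buffer between a fast producer and a slow consumer. This is a minimal sketch in Python using only the standard library; it is not Akka code, and the names `run_bounded_pipeline`, `n_items`, and `buffer_size` are made up for illustration. When the buffer is full, the producer blocks until the consumer frees a slot, so the number of objects held in memory stays bounded.

```python
import queue
import threading
import time


def run_bounded_pipeline(n_items, buffer_size):
    """Move n_items through a bounded buffer; return (received, max_depth)."""
    buf = queue.Queue(maxsize=buffer_size)  # a full buffer makes the producer wait
    done = object()                         # sentinel marking the end of input
    received = []
    depths = []

    def producer():
        for i in range(n_items):
            buf.put(i)      # blocks while the buffer is full (backpressure)
        buf.put(done)

    def consumer():
        while True:
            depths.append(buf.qsize())  # record how much data is waiting
            item = buf.get()
            if item is done:
                return
            time.sleep(0.001)           # simulate a slow downstream step
            received.append(item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received, max(depths)


received, max_depth = run_bounded_pipeline(n_items=50, buffer_size=4)
print(len(received), max_depth <= 4)  # prints "50 True"
```

With an unbounded queue the same producer would never wait, and the backlog would grow as large as the gap between the two speeds allows.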

Workers
Kafka Importer
Workers
HTTP Ingest Server

Akka Streams: Backpressure in action
Actor → Actor: data flows downstream
Actor ← Actor: demand flows upstream
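
The data-and-demand exchange on this slide can be sketched as follows. This is a simplified, synchronous Python illustration of demand signaling, not Akka's implementation; the classes `Upstream` and `Downstream` and the method `request` are hypothetical names chosen for this example. The upstream side emits only as many elements as the downstream side has asked for, so the number of elements in flight never exceeds the outstanding demand.

```python
from itertools import islice


class Upstream:
    """Emits elements only in response to explicit demand from downstream."""

    def __init__(self, elements):
        self._it = iter(elements)
        self.emitted = 0

    def request(self, n):
        # Hand over at most n elements, however many are available.
        batch = list(islice(self._it, n))
        self.emitted += len(batch)
        return batch


class Downstream:
    """Signals demand in fixed-size batches and processes what arrives."""

    def __init__(self, upstream, batch_size):
        self.upstream = upstream
        self.batch_size = batch_size
        self.processed = []
        self.max_in_flight = 0

    def run(self):
        while True:
            batch = self.upstream.request(self.batch_size)  # demand goes up
            if not batch:
                break
            in_flight = self.upstream.emitted - len(self.processed)
            self.max_in_flight = max(self.max_in_flight, in_flight)
            self.processed.extend(batch)                    # data comes down


upstream = Upstream(range(10))
downstream = Downstream(upstream, batch_size=3)
downstream.run()
print(downstream.processed == list(range(10)), downstream.max_in_flight)  # prints "True 3"
```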

Akka Streams: Creating a stream
Source → Flow → Sink
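
As a rough analogy for the three stage types, here is a sketch using Python generators rather than the Akka Streams API. The function names `source`, `flow`, and `sink` mirror the slide but are otherwise made up for this example. Composing the stages only describes the pipeline; elements start moving when the sink begins to pull.

```python
def source(n):
    """Source: produces the integers 0..n-1, one at a time, only when pulled."""
    for i in range(n):
        yield i


def flow(upstream):
    """Flow: transforms each element passing through (here, doubles it)."""
    for x in upstream:
        yield x * 2


def sink(upstream):
    """Sink: consumes every element; its pulling drives the whole pipeline."""
    return sum(upstream)


pipeline = flow(source(5))  # nothing has been produced yet
total = sink(pipeline)      # pulling from the sink runs the stages
print(total)                # prints 20 (0 + 2 + 4 + 6 + 8)
```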

Akka Streams: Built in stages
Built In Sources
• actorRef
• actorPublisher
• fromIterator
• fromFile
• apply (from a Seq)
Built In Processing Stages
• map
• filter
• grouped
• drop/take
• dropWhile/takeWhile
• sliding
Built In Sinks
• head
• last
• seq
• foreach
• actorRef
• actorSubscriber
• reduce
• fold
Backpressure Aware Stages
• mapAsync
• batch
• buffer (Backpressure)
• buffer (Drop)
• buffer (Fail)
Reference: http://doc.akka.io/docs/akka/current/scala/stream/stages-overview.html
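
The three buffer variants listed under the backpressure-aware stages differ in what happens when the buffer is full. The sketch below illustrates that difference in plain Python; it is not Akka's `buffer` stage, and the class `BoundedBuffer`, the method `offer`, and the strategy names `"backpressure"`, `"drop_oldest"`, and `"fail"` are assumptions made for this example. In particular, dropping the oldest element is only one of several possible drop policies.

```python
from collections import deque


class BufferOverflow(Exception):
    """Raised by the 'fail' strategy when the buffer is full."""


class BoundedBuffer:
    STRATEGIES = ("backpressure", "drop_oldest", "fail")

    def __init__(self, capacity, strategy):
        if strategy not in self.STRATEGIES:
            raise ValueError(f"unknown strategy: {strategy}")
        self.capacity = capacity
        self.strategy = strategy
        self.items = deque()

    def offer(self, item):
        """Try to add item. Return True if accepted, False if the caller must wait."""
        if len(self.items) < self.capacity:
            self.items.append(item)
            return True
        if self.strategy == "backpressure":
            return False            # refuse the element; upstream retries later
        if self.strategy == "drop_oldest":
            self.items.popleft()    # discard the oldest element to make room
            self.items.append(item)
            return True
        raise BufferOverflow("buffer full")  # 'fail': surface the overload as an error


dropping = BoundedBuffer(capacity=2, strategy="drop_oldest")
for i in range(4):
    dropping.offer(i)
print(list(dropping.items))  # prints [2, 3]
```

Backpressure keeps every element at the cost of slowing the producer, dropping keeps the producer fast at the cost of losing data, and failing makes the overload visible immediately.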

Workers
Kafka Importer
Workers
HTTP Ingest Server

Coordinator
HTTP Ingest Server
Akka Stream

Extractors
Akka Stream

Data warehouse import service

Analytics export service heap (before)
Chart: heap in GiB (y-axis) over time (x-axis), with 28 GiB marked
Legend: red = heap space, blue = used heap space, purple = max heap space

Analytics export service heap (after)
Chart: heap in GiB (y-axis) over time (x-axis), with 28 GiB marked

Final results
• Akka Streams allowed us to move data with increased throughput and optimal performance
• No longer getting paged for JVM out-of-memory errors or spending time tuning our services
• Reduced the SLA for data delivery to our business stakeholders

Lessons learned
• Akka Actors: Great for low latency
• Akka Streams: Optimized for high throughput and solving backpressure
• Built on top of Akka Actors
• Don’t try to build high-throughput systems with an actor system; you’ll just start building Akka Streams

Thank you!
Q&A
Dustin Lyons
Engineering Manager, Data Platform
