Proprietary & Confidential
Using Akka Streams
For Real Time Decision Making
Dustin Lyons
Engineering Manager, Data Platform
Who I am
● Engineer turned Engineering Manager at Credit Karma
● Data & Analytics on the Platform team
● Build things that make decisions on where data should go
● Lover of science fiction, sushi, and electronic music
Credit Karma is a free financial assistant, helping over
60 million people make progress.
Agenda for today
1. Data Infrastructure at Credit Karma: Past and current
2. Mo’ data, mo’ problems
3. Akka Streams saves the day
4. Results and learnings
5. Q&A
Data scale (MB/min) @ Credit Karma
Credit Karma data platform: PHP days
PHP Scripts
New tools to help with scale
Credit Karma data platform: Scala in 2014
Data Warehouse Import
New tools to help with concurrency
Credit Karma data platform: Akka in 2015
Analytics Export Service + Data Warehouse Import
Analytics export service
Components (diagram): HTTP Ingest Server, Coordinator, Data Transformer Workers, Kafka Importer Workers
Data warehouse import
Components (diagram): Reader, Deduplicator, Processor, Extractors
Marble maze
1. Reading from file
2. Waiting for external service
3. Objects sit in heap
4. Database insert
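The marble maze stages can be mimicked with a toy simulation. This is a minimal sketch in plain Python, not the speaker's code and not Akka: the function name and the rates (`read_rate`, `service_rate`) are made up for illustration. It shows how the number of objects waiting in memory keeps growing when the reader outpaces a slow downstream stage and nothing limits the reader.

```python
from collections import deque


def simulate_unbounded(ticks, read_rate, service_rate):
    """Return the queue depth after each tick when nothing limits the reader."""
    pending = deque()  # stands in for objects sitting in heap
    depths = []
    next_id = 0
    for _ in range(ticks):
        # Stage 1: the file reader emits as fast as it can.
        for _ in range(read_rate):
            pending.append(next_id)
            next_id += 1
        # Stage 2: the slow external call drains only a few items per tick.
        for _ in range(min(service_rate, len(pending))):
            pending.popleft()
        depths.append(len(pending))
    return depths


# Hypothetical rates: 10 items read per tick, 3 items serviced per tick.
depths = simulate_unbounded(ticks=5, read_rate=10, service_rate=3)
print(depths)  # [7, 14, 21, 28, 35]
```

The backlog grows by `read_rate - service_rate` on every tick, which is the pattern that eventually exhausts a fixed-size heap.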
Backpressure
What is backpressure?
Backpressure refers to the buildup of data at an I/O switch
when buffers are full and not able to receive additional data.
No additional data packets are transferred until the
bottleneck of data has been eliminated or the buffer has been
emptied.
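The definition above can be illustrated with a bounded buffer. This is a minimal sketch using Python's standard `queue.Queue`, not Akka Streams: the function name, capacity, and delay are assumptions chosen for the example. When the buffer is full, `put()` blocks the producer until the slow consumer frees a slot, so the backlog never exceeds the buffer's capacity.

```python
import queue
import threading
import time


def run_bounded(n_items, capacity, consume_delay=0.001):
    """Push n_items through a bounded buffer; return (received, max_depth)."""
    buf = queue.Queue(maxsize=capacity)  # put() blocks while the buffer is full
    received = []
    max_depth = 0

    def consumer():
        while True:
            item = buf.get()
            if item is None:  # sentinel: the producer has finished
                break
            time.sleep(consume_delay)  # simulate a slow downstream stage
            received.append(item)

    worker = threading.Thread(target=consumer)
    worker.start()
    for i in range(n_items):
        buf.put(i)  # the producer waits here whenever the buffer is full
        max_depth = max(max_depth, buf.qsize())
    buf.put(None)
    worker.join()
    return received, max_depth


received, max_depth = run_bounded(n_items=50, capacity=4)
print(len(received), max_depth <= 4)
```

Every item still arrives in order; the only thing that changes is that the producer is slowed to the consumer's pace instead of filling memory.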
Akka Streams: Backpressure in action
Actor → Actor (diagram): data flows downstream, demand is signaled back upstream
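The data/demand exchange can be sketched as a small pull protocol. This is plain Python with hypothetical `Publisher` and `Subscriber` classes, not the Akka Streams or Reactive Streams API. The point it illustrates: the upstream side emits only as many elements as the downstream side has asked for.

```python
class Publisher:
    """Emits items only when the subscriber has signaled demand."""

    def __init__(self, items):
        self._items = iter(items)
        self.emitted = 0

    def request(self, n, on_next):
        """Deliver at most n items to on_next; return how many were sent."""
        sent = 0
        for _ in range(n):
            try:
                item = next(self._items)
            except StopIteration:
                break  # nothing left to emit
            on_next(item)
            sent += 1
        self.emitted += sent
        return sent


class Subscriber:
    """Pulls data in small batches, so it is never handed more than it asked for."""

    def __init__(self, publisher, batch):
        self.publisher = publisher
        self.batch = batch
        self.seen = []

    def run(self):
        # Signal demand for one batch, process it, then signal demand again.
        while self.publisher.request(self.batch, self.seen.append) > 0:
            pass
        return self.seen


pub = Publisher(range(10))
sub = Subscriber(pub, batch=3)
result = sub.run()
print(result, pub.emitted)
```

Here the subscriber requests three elements at a time; a slow subscriber simply requests less often, and the publisher has nothing to buffer on its behalf.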
Akka Streams: Creating a stream
Source → Flow → Sink
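The three-part shape of a stream can be sketched with Python generators. This is an analogy under stated assumptions, not Akka Streams code: the function names `source`, `flow`, and `sink` are hypothetical stand-ins for the three stage types. Generators are pull-based, so no element is produced until the sink asks for it.

```python
def source(n):
    """Source: lazily yields the integers 0..n-1."""
    for i in range(n):
        yield i


def flow(upstream):
    """Flow: keeps the even numbers and squares them."""
    for x in upstream:
        if x % 2 == 0:
            yield x * x


def sink(upstream):
    """Sink: pulls every element from upstream and folds them into a sum."""
    total = 0
    for x in upstream:
        total += x
    return total


# Wiring the stages together does no work; the sink drives the whole pipeline.
result = sink(flow(source(10)))
print(result)  # 120
```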
Akka Streams: Built-in stages
Built-in sources
• actorRef
• actorPublisher
• fromIterator
• fromFile
• apply (from a Seq)
Built-in processing stages
• map
• filter
• grouped
• drop/take
• dropWhile/takeWhile
• sliding
Built-in sinks
• head
• last
• seq
• foreach
• actorRef
• actorSubscriber
• reduce
• fold
Backpressure-aware stages
• mapAsync
• batch
• buffer (Backpressure)
• buffer (Drop)
• buffer (Fail)
Reference: http://doc.akka.io/docs/akka/current/scala/stream/stages-overview.html
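The three buffer variants listed above differ only in what happens when the buffer is full. This is a minimal sketch in plain Python with a hypothetical `BoundedBuffer` class, not Akka's `buffer` stage. It models "drop" as discarding the newest element, which is an assumption made for the example; Akka offers several drop variants.

```python
from collections import deque


class BufferOverflow(Exception):
    """Raised by the 'fail' strategy when the buffer is full."""


class BoundedBuffer:
    STRATEGIES = ("backpressure", "drop", "fail")

    def __init__(self, size, strategy):
        if strategy not in self.STRATEGIES:
            raise ValueError(f"unknown strategy: {strategy}")
        self.size = size
        self.strategy = strategy
        self.items = deque()

    def offer(self, item):
        """Try to enqueue item; report what happened to it."""
        if len(self.items) < self.size:
            self.items.append(item)
            return "accepted"
        if self.strategy == "fail":
            raise BufferOverflow(f"buffer of size {self.size} is full")
        if self.strategy == "drop":
            return "dropped"  # the new element is discarded
        return "wait"  # backpressure: caller keeps the element and retries later

    def poll(self):
        """Remove and return the oldest buffered element."""
        return self.items.popleft()


drop_buf = BoundedBuffer(2, "drop")
drop_results = [drop_buf.offer(i) for i in range(4)]

bp_buf = BoundedBuffer(1, "backpressure")
bp_results = [bp_buf.offer("a"), bp_buf.offer("b")]
bp_buf.poll()  # downstream consumes one element, freeing a slot
bp_results.append(bp_buf.offer("b"))  # the retried element is now accepted

print(drop_results, bp_results)
```

Backpressure loses nothing but slows the producer, drop keeps the producer fast but loses data, and fail surfaces the overload as an error.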
Analytics export service
Components (diagram): HTTP Ingest Server, Coordinator, Akka Stream
Data warehouse import
Components (diagram): Extractors, Akka Stream
Analytics export service heap (before)
[Chart: heap in GiB over time, 28 GiB marked. Red: heap space; blue: used heap space; purple: max heap space]
Analytics export service heap (after)
[Chart: heap in GiB over time, 28 GiB marked]
Data warehouse import
Final results
• Akka Streams allowed us to move data with increased throughput and optimal performance
• No longer getting paged for JVM out of memory or spending time tuning our services
• Reduced the SLA for data delivery to our business stakeholders
Lessons learned
• Akka Actors: Great for low latency
• Akka Streams: Optimized for high throughput and solving backpressure
• Akka Streams are built on top of Akka Actors
• Don’t try to build high-throughput systems with an actor system; you’ll just start building Akka Streams
Thank you!
Q&A
Dustin Lyons
Engineering Manager, Data Platform


How Credit Karma Makes Real-Time Decisions For 60 Million Users With Akka Streams And Actors
