RISKING EVERYTHING WITH AKKA STREAMS
HOPES, REGRETS AND “BEST PRACTICES”
08-12-2016
JOACHIM HOFER (@JOHOFER)
TABLE OF CONTENTS
0. What We Do
1. First Impressions
2. Lessons Learnt
3. Awesomeness
4. Ops
5. Verdict
WHAT WE DO @ ZALANDO
WHAT WE DO: SOME NUMBERS
net sales 2015: 3 billion euros
several thousand updates per second
latency: ideally seconds
“SOME” RISK INVOLVED!
FIRST IMPRESSIONS: PREPARATION
Akka Streams vs. Futures, RxScala, Actors
Tech Blog “Which shoe fits you?”
https://github.com/zalando/scala-concurrency-playground
Akka Streams fits us best!
Image: Nevit Dilmen (CC BY-SA 3.0)
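FIRST IMPRESSIONS: GRAPH DSL
HUH?
import GraphDSL.Implicits._
// b: presumably the GraphDSL builder of the enclosing GraphDSL.create block (not shown)
val bcSqsIn = b add Broadcast[StreamCreationEvent](2) // duplicate each incoming event
val rules = b add ruleStore.stage.async
val eval = b add evaluator.stage.async
val publish = b add nakadi.stage.async
val ack = b add ackStage.stage.async
// one copy goes through the rule store, the other straight to the evaluator
bcSqsIn ~> rules ~> eval.in0
bcSqsIn ~> eval.in1
eval.out ~> publish ~> ack
FlowShape(bcSqsIn.in, ack.out)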
FIRST IMPRESSIONS: PLAIN SOURCES
AH!
products
  .mapAsync(parallelism = 5)(ruleEvaluatorEvent(flowId))
  .groupedWithin(3, 100.millis) // needs import scala.concurrent.duration._
  .filter(_.nonEmpty)
  .mapAsync(parallelism = 50)(sqsGateway.send)
  .runForeach(result => log.info(result.getFailed.size))
FIRST IMPRESSIONS: OTHER BIG SCARIES
MATERIALISATION, BACKPRESSURE
Images: Jerry Daykin (CC BY 2.0, left), The Present Group (CC BY 3.0 US, right)
TASTY DOCUMENTATION
Image: Wikivisual (CC BY-NC-SA 3.0)
MISTAKES WERE MADE
Image: Hapesoft (public domain)
FAILURES (VS ERRORS)
Caused by onError
Completes the stream
Recover using recover / recoverWithRetries
ERRORS (VS FAILURES)
Caused by exceptions
Get escalated to failures by default
Recover using a Supervision Strategy
Can be ignored easily (“Resume” strategy)
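The distinction between the two recovery mechanisms can be sketched in a few lines. This is a minimal illustration, not the talk's production code: it assumes Akka 2.6 or later (so that an implicit ActorSystem provides the materializer), and the object name and the division example are made up for this sketch.

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorAttributes, Supervision}
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object FailureHandlingExample {
  // Runs both variants and returns (resumed, recovered).
  def run(): (Seq[Int], Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("failure-handling")
    try {
      // "Error": the exception thrown for 0 is handed to the supervision
      // strategy; Resume drops the offending element and keeps going.
      val resumed = Source(List(1, 0, 2))
        .map(10 / _)
        .withAttributes(ActorAttributes.supervisionStrategy(Supervision.resumingDecider))
        .runWith(Sink.seq)

      // "Failure": with the default strategy (Stop) the exception fails the
      // stream; recover emits one last element and then completes.
      val recovered = Source(List(1, 0, 2))
        .map(10 / _)
        .recover { case _: ArithmeticException => -1 }
        .runWith(Sink.seq)

      (Await.result(resumed, 5.seconds), Await.result(recovered, 5.seconds))
    } finally system.terminate()
  }
}
```

With Resume the element 0 silently disappears and 2 is still processed; with recover the stream ends right after the replacement element.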
LESSONS LEARNT: RESUMING
[Diagram sequence, slides 14 to 19: a graph with the stages RETRIEVE EVENTS, RETRIEVE RULES and EVALUATE, connected by EVENTS; the numbered events 1 and 2 move through the graph step by step.]
LESSONS LEARNT: GOING OUT-OF-SYNC
[Diagram, slide 20: the same graph with events 1 and 2.]
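The speaker notes name application-layer “errors” (Either) as the fix for going out of sync. The sketch below shows one way this could look, under the assumption that the problem arises when a resumed exception drops an element on only one of two branches that are later zipped. It assumes Akka 2.6 or later; `retrieveRules`, the Int events and the graph layout are hypothetical stand-ins, not the talk's actual code.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.FlowShape
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Sink, Source, Zip}
import scala.concurrent.Await
import scala.concurrent.duration._

object InSyncExample {
  type Rules = Either[String, String]

  // Hypothetical rule lookup: the problem for event 2 is returned as a
  // Left value instead of being thrown, so no element is ever dropped.
  def retrieveRules(event: Int): Rules =
    if (event == 2) Left(s"no rules for event $event") else Right(s"rules-$event")

  // Broadcast each event, look up rules on one branch, zip both branches.
  val withRules: Flow[Int, (Int, Rules), NotUsed] =
    Flow.fromGraph(GraphDSL.create() { implicit b =>
      import GraphDSL.Implicits._
      val bcast = b.add(Broadcast[Int](2))
      val rules = b.add(Flow[Int].map(retrieveRules))
      val zip   = b.add(Zip[Int, Rules]())
      bcast ~> zip.in0
      bcast ~> rules ~> zip.in1
      FlowShape(bcast.in, zip.out)
    })

  def run(): Seq[(Int, Rules)] = {
    implicit val system: ActorSystem = ActorSystem("in-sync")
    try Await.result(Source(1 to 3).via(withRules).runWith(Sink.seq), 5.seconds)
    finally system.terminate()
  }
}
```

Because both branches emit exactly one element per event, the zip always pairs an event with its own lookup result, and the downstream stage decides what to do with a Left.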
LESSONS LEARNT: AKKA-HTTP CLIENTS (1)
Unconsumed response entities
Solution: discardEntityBytes
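A small sketch of the discardEntityBytes fix, for the case where only the status code of a response matters. It assumes Akka HTTP 10.2 or later together with Akka 2.6 or later; the helper `statusOnly` and the hand-built response are hypothetical and merely stand in for a real client call.

```scala
import akka.actor.ActorSystem
import akka.http.scaladsl.model.{HttpEntity, HttpResponse, StatusCode, StatusCodes}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object DiscardExample {
  // Hypothetical helper: we only care about the status, so the entity is
  // drained explicitly instead of being left unconsumed.
  def statusOnly(response: HttpResponse)(implicit system: ActorSystem): Future[StatusCode] = {
    import system.dispatcher
    response.discardEntityBytes().future.map(_ => response.status)
  }

  def run(): StatusCode = {
    implicit val system: ActorSystem = ActorSystem("discard")
    // Stand-in for a response that would normally come from an HTTP client call.
    val response = HttpResponse(StatusCodes.OK, entity = HttpEntity("ignored body"))
    try Await.result(statusOnly(response), 5.seconds)
    finally system.terminate()
  }
}
```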
LESSONS LEARNT: AKKA-HTTP CLIENTS (2)
No response from the server
“Currently Akka HTTP doesn’t implement client-side request timeout checking itself as this functionality can be regarded as a more general purpose streaming infrastructure feature.” (Akka HTTP docs)
Solution: Use an explicit xxxTimeout stage
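Akka Streams ships several timeout stages (initialTimeout, completionTimeout, idleTimeout, backpressureTimeout). The sketch below uses completionTimeout; which one fits a given client is a judgment call. It assumes Akka 2.6 or later, and `Source.maybe` is only a stand-in for a server that never answers.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.Try

object TimeoutExample {
  def run(): Try[String] = {
    implicit val system: ActorSystem = ActorSystem("timeout")
    // Stand-in for a response stream that never produces anything.
    val neverResponds = Source.maybe[String]
    // Fail the stream if it has not completed within 200 ms.
    val result = neverResponds
      .completionTimeout(200.millis)
      .runWith(Sink.head)
    try Try(Await.result(result, 5.seconds))
    finally system.terminate()
  }
}
```

Without the timeout stage the stream, and the Future materialized from it, would simply hang.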
LESSONS LEARNT: INTERNAL BUFFERS
Default: only up to 16?!
16 (buffer size) x 50 (parallel flows) x 10 (events per batch) = 8000 events
Solution: keep low, use explicit buffer stages
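The arithmetic on this slide, plus one way to keep internal buffers small and make buffering explicit, might look as follows. This is a sketch assuming Akka 2.6 or later; the flow, the buffer size of 4 and all names are illustrative only.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{Attributes, OverflowStrategy}
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object BufferExample {
  // The slide's numbers: events that can sit in internal buffers at once.
  val bufferSize = 16
  val parallelFlows = 50
  val eventsPerBatch = 10
  val worstCaseEventsInFlight: Int = bufferSize * parallelFlows * eventsPerBatch

  // An async stage whose internal input buffer is limited to one element.
  val smallInternalBuffer: Flow[Int, Int, NotUsed] =
    Flow[Int].map(_ * 2).async
      .addAttributes(Attributes.inputBuffer(initial = 1, max = 1))

  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("buffers")
    val result = Source(1 to 20)
      .via(smallInternalBuffer)
      // Buffering that is visible in the code, with a chosen size and strategy.
      .buffer(4, OverflowStrategy.backpressure)
      .runWith(Sink.seq)
    try Await.result(result, 5.seconds)
    finally system.terminate()
  }
}
```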
LESSONS LEARNT: CONFIGURATION AS A SOURCE?
[Diagram: the graph with RETRIEVE EVENTS, EVENTS, RETRIEVE RULES and EVALUATE, plus an additional CONFIGURATION element.]
Image: Alpha (CC BY-SA 2.0)
REACTIVE MANIFESTO
AWESOMENESS: REACTIVE
Backpressure
Automatically asynchronous
AWESOMENESS: TESTABILITY
[Diagram: TEST EVENTS, EVALUATE, OUTPUTS TO CHECK]
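Testing a single stage in isolation can be as simple as feeding it a list and collecting the output. The sketch below assumes Akka 2.6 or later; `Article`, `Verdict` and the "no price means unavailable" rule are hypothetical, loosely inspired by the rules mentioned in the speaker notes.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object TestabilityExample {
  final case class Article(id: String, price: Option[BigDecimal])
  final case class Verdict(id: String, available: Boolean)

  // Hypothetical evaluation stage: an article without a price is unavailable.
  val evaluate: Flow[Article, Verdict, NotUsed] =
    Flow[Article].map(a => Verdict(a.id, a.price.isDefined))

  // Test events in, outputs to check out.
  def run(testEvents: List[Article]): Seq[Verdict] = {
    implicit val system: ActorSystem = ActorSystem("testability")
    try Await.result(Source(testEvents).via(evaluate).runWith(Sink.seq), 5.seconds)
    finally system.terminate()
  }
}
```

The stage under test never needs to know where its events come from, which is what makes partial graphs easy to check.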
AWESOMENESS: BUILT-IN STAGES
groupedWithin
throttle
mapConcat
recoverWithRetries
…
Flow[RuleEvaluationEvent]
  .groupedWithin(batchSize, batchTimeout)
  .mapAsync(parallelism = 1)(provider.deleteFromStreamCreation)
  .mapConcat(identity)
  .via(logAndMonitor(Checkpoint.Acknowledged, "acknowledged msg"))
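The batch-then-flatten pattern from this slide can be tried out with plain integers. The sketch assumes Akka 2.6 or later; `acknowledge` is a hypothetical stand-in for `provider.deleteFromStreamCreation`, and the batch size and timeout are arbitrary.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.collection.immutable
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object BatchingExample {
  // Hypothetical batch call: pretends to acknowledge a whole batch at once
  // and hands the batch back.
  def acknowledge(batch: immutable.Seq[Int]): Future[immutable.Seq[Int]] =
    Future.successful(batch)

  val ackFlow: Flow[Int, Int, NotUsed] =
    Flow[Int]
      .groupedWithin(3, 100.millis)          // batch by size or by time
      .mapAsync(parallelism = 1)(acknowledge) // one batch call at a time
      .mapConcat(identity)                    // flatten back to single elements

  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("batching")
    try Await.result(Source(1 to 7).via(ackFlow).runWith(Sink.seq), 5.seconds)
    finally system.terminate()
  }
}
```

How the seven elements are split into batches depends on timing, but after mapConcat the elements come out one by one in their original order.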
OPS: MONITORING (THE BAD)
no out-of-the-box solution
built-in stage monitor not very helpful
no access to internal buffers
OPS: MONITORING (THE GOOD)
Trace events along streams
Create your own monitoring stage
OPS: MONITORING
EXAMPLE PASS-THROUGH STAGE LOGIC
…
new GraphStageLogic(shape) {
  setHandlers(in, out, new InHandler with OutHandler {
    override def onPush(): Unit = {
      push(out, grab(in))
    }
    override def onPull(): Unit = {
      pull(in)
    }
  })
}
…
OPS: MONITORING
EXAMPLE PASS-THROUGH STAGE LOGIC
…
new GraphStageLogic(shape) {
  setHandlers(in, out, new InHandler with OutHandler {
    override def onPush(): Unit = {
      stats.countPush()
      push(out, grab(in))
    }
    override def onPull(): Unit = {
      stats.countPull()
      pull(in)
    }
  })
}
…
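For reference, the elided parts around the stage logic could be filled in roughly as follows. This is a sketch only, assuming Akka 2.6 or later: the `Stats` class, the `CountingStage` name and the port names are invented here, and the talk's real `stats` object is not shown in the slides.

```scala
import akka.actor.ActorSystem
import akka.stream.{Attributes, FlowShape, Inlet, Outlet}
import akka.stream.scaladsl.{Sink, Source}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}
import java.util.concurrent.atomic.AtomicLong
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical counter holder; a real one might report to a metrics system.
final class Stats {
  val pushes = new AtomicLong()
  val pulls = new AtomicLong()
  def countPush(): Unit = { pushes.incrementAndGet(); () }
  def countPull(): Unit = { pulls.incrementAndGet(); () }
}

// Pass-through stage that counts demand (pulls) and elements (pushes).
final class CountingStage[A](stats: Stats) extends GraphStage[FlowShape[A, A]] {
  val in: Inlet[A] = Inlet("CountingStage.in")
  val out: Outlet[A] = Outlet("CountingStage.out")
  override val shape: FlowShape[A, A] = FlowShape(in, out)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
    new GraphStageLogic(shape) {
      setHandlers(in, out, new InHandler with OutHandler {
        override def onPush(): Unit = {
          stats.countPush()
          push(out, grab(in))
        }
        override def onPull(): Unit = {
          stats.countPull()
          pull(in)
        }
      })
    }
}

object CountingStageExample {
  // Returns the stream output together with the push and pull counts.
  def run(): (Seq[Int], Long, Long) = {
    implicit val system: ActorSystem = ActorSystem("monitoring")
    val stats = new Stats
    val done = Source(1 to 5).via(new CountingStage[Int](stats)).runWith(Sink.seq)
    try (Await.result(done, 5.seconds), stats.pushes.get, stats.pulls.get)
    finally system.terminate()
  }
}
```

Comparing the two counters over time gives a rough picture of where demand outpaces supply, or the other way round.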
OPS: TUNING AND SCALING
easy to tune
very efficient
easy to scale
see also: Gearpump Materializer
Image: Duncan Rawlinson (CC BY 2.0)
OPS: RELIABILITY
everything under control
just keeps on running
VERDICT: IT’S MAGIC!
Takes time to understand
Very potent
Image: Twice25 (CC BY-SA 2.5)
RISK ➟ REWARD
JOACHIM HOFER
Backend Engineer, Availability Engineering
joachim.hofer@zalando.de
@johofer


Editor's Notes

  • #2: Clickbait title. Ask the audience about their experiences.
  • #4: We sell clothes and shoes online. Moving from a monolithic fashion store to a platform for fashion services. Tech: microservices, connected by event streams. We decide whether a product should be available, based on rules (e.g. no price, image quality, missing stuff).
  • #7: BEGIN first impressions.
  • #8: Difficult for Scala beginners.
  • #9: Just using sources is a lot easier and sufficient for most of the code. Best practice: start out with sources, transition to graphs later.
  • #10: Next up, the solution: RTFM.
  • #11: Best practice: RTFM. Pancakes example (go find it!).
  • #12: BEGIN lessons learnt. Next up: failure handling.
  • #13: Tell about failures vs. errors first!
  • #14: Resume: we’re using it a lot, very convenient, but…
  • #21: Out-of-sync deadlock caused by backpressure. Solution: debug on a whiteboard, use application-layer “errors” (Either).
  • #22: See documentation.
  • #24: Premature optimisation: starts up slowly, bugs go unnoticed longer and are harder to find. Keep to the defaults (or adapt carefully). Use explicit buffer stages where necessary.
  • #25: Caching: HoldWithWait. Decoupling. Not as a stream source? Currently a traditional cache / DI, regrets… Maybe just add it to the event metadata.
  • #26: BEGIN awesomeness. Clean and well-thought-out concept.
  • #28: Thinking in streams means thinking “stateless by default”. Easily test individual stages or partial graphs.
  • #29: Very little code (!), very readable (because high-level). throttle: see canarying.
  • #30: BEGIN ops. What about Kamon? Contribute to akka-streams? Create open source?
  • #34: Tuning: dispatchers, number of parallel flows, async boundaries. Efficient: backpressure regulates the rate “magically”. Scale: stream-oriented in general, just add more machines.
  • #35: Threads and memory fully under control. No strange outages as of yet, fingers crossed. NEXT: VERDICT.
  • #37: Fast to develop, fast to execute: ~300 events/sec/instance for a rather complicated use case. Latency down from hours (legacy) to < 1 min (1 s outside peaks, not the full system yet). The perfect abstraction for our use case.
  • #38: No incidents caused by Akka Streams yet. There are always more planets to land on. The risk was worth taking.