@ELMANU 
BACK TO THE FUTURES, ACTORS AND PIPES 
using Akka for large-scale data 
migration 
Manuel Bernhardt
@ELMANU 
AGENDA 
• BACKGROUND STORY 
• FUTURES, PIPES, ACTORS 
• LESSONS LEARNED
@ELMANU 
who is speaking? 
• freelance software consultant 
based in Vienna 
• Vienna Scala User Group 
• web, web, web 
• writing a book on reactive 
web-applications
@ELMANU 
BACKGROUND STORY
@ELMANU 
talenthouse 
• www.talenthouse.com 
• based in Los Angeles 
• connecting brands and artists 
• 3+ million users
Back to the futures, actors and pipes: using Akka for large-scale data migration
@ELMANU 
BACKGROUND STORY 
• old, slow (very slow) platform 
• re-implementation from scratch with Scala & Play 
• tight schedule, a lot of data to migrate
@ELMANU 
SOURCE SYSTEM
@ELMANU 
SOURCE SYSTEM 
DISCLAIMER: 
What follows is not intended as a 
bashing of the source system, but as a 
necessary explanation of its complexity in 
relation to data migration.
@ELMANU 
MIGRATION schedule 
•basically, one weekend 
•big-bang kind of migration 
•if possible, incremental migration beforehand
@ELMANU 
FUTURES, PIPES, ACTORS
@ELMANU 
FUTURES
@ELMANU 
FUTURES 
•scala.concurrent.Future[T] 
•holds a value of type T 
•can either fail or succeed
@ELMANU 
FUTURES: HAPPY 
PATH 
import scala.concurrent._ 
import scala.concurrent.ExecutionContext.Implicits.global 
val futureSum: Future[Int] = Future { 1 + 1 } 
futureSum.map { sum => 
  println("The sum is " + sum) 
}
@ELMANU 
FUTURES: SAD PATH 
import scala.concurrent._ 
import scala.concurrent.ExecutionContext.Implicits.global 
import scala.concurrent.duration._ 
val futureDiv: Future[Int] = Future { 1 / 0 } 
val futurePrint: Future[Unit] = futureDiv.map { div => 
  println("The division result is " + div) 
} 
// Avoid blocking if possible 
Await.result(futurePrint, 1.second)
@ELMANU 
FUTURES: SAD PATH 
import scala.concurrent._ 
import scala.concurrent.ExecutionContext.Implicits.global 
import scala.concurrent.duration._ 
val futureDiv: Future[Int] = Future { 1 / 0 } 
futureDiv.map { div => 
  println("The division result is " + div) 
} 
Await.result(futureDiv, 1.second) 

scala> Await.result(futureDiv, 1.second) 
java.lang.ArithmeticException: / by zero 
  at $anonfun$1.apply$mcI$sp(<console>:11) 
  at $anonfun$1.apply(<console>:11) 
  at $anonfun$1.apply(<console>:11) 
  at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) 
  at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) 
  at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121) 
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) 
  at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) 
  at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
@ELMANU 
FUTURES: SAD PATH 
import scala.concurrent._ 
import scala.concurrent.ExecutionContext.Implicits.global 
import scala.concurrent.duration._ 
val futureDiv: Future[Int] = Future { 1 / 0 } 
val futurePrint: Future[Unit] = futureDiv.map { div => 
println("The division result is " + div) 
}.recover { 
case a: java.lang.ArithmeticException => 
println("What on earth are you trying to do?") 
} 
// Be mindful of failure 
Await.result(futurePrint, 1.second)
@ELMANU 
FUTURES: SAD PATH 
•Exceptions are propagated up the chain 
•Without recover there is no guarantee that 
failure will ever get noticed!
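To make the propagation concrete, here is a minimal sketch using only the standard library. The `pipeline` method and its steps are hypothetical names chosen for illustration; the point is that a failure in the first step skips every later `map`, and a single `recover` at the end of the chain still sees the original exception.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FailurePropagation {
  // Hypothetical three-step pipeline: parse, double, describe.
  def pipeline(input: String): Future[String] = {
    val parsed: Future[Int] = Future { input.toInt }       // may throw NumberFormatException
    val doubled: Future[Int] = parsed.map(_ * 2)           // skipped if parsing failed
    val described: Future[String] = doubled.map(n => s"Result: $n")

    // One recover at the end observes a failure from any earlier step.
    described.recover {
      case e: NumberFormatException => s"Failed: ${e.getMessage}"
    }
  }

  def main(args: Array[String]): Unit = {
    // Blocking here only so the demo prints before the JVM exits.
    println(Await.result(pipeline("21"), 2.seconds))
    println(Await.result(pipeline("not a number"), 2.seconds))
  }
}
```

Dropping the `recover` leaves the returned Future failed, and nothing is reported unless some caller eventually inspects it.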
@ELMANU 
COMPOSING FUTURES 
val futureA: Future[Int] = Future { 1 + 1 } 
val futureB: Future[Int] = Future { 2 + 2 } 
val futureC: Future[Int] = for { 
  a <- futureA 
  b <- futureB 
} yield { 
  a + b 
}
@ELMANU 
COMPOSING FUTURES 
// This runs in sequence: don't do this 
val futureC: Future[Int] = for { 
  a <- Future { 1 + 1 } 
  b <- Future { 2 + 2 } 
} yield { 
  a + b 
}
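The difference between the two compositions can be measured. The sketch below is an illustration under assumptions: `slow` is a hypothetical stand-in for real I/O, and the 200 ms delay is arbitrary. In `sequential` the second Future is only created once the first has completed; in `parallel` both are started before the for-comprehension combines them.

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ComposingFutures {
  // Hypothetical slow operation standing in for real I/O.
  def slow(value: Int): Future[Int] = Future {
    blocking { Thread.sleep(200) }
    value
  }

  // Futures created inside the for-comprehension: the second one
  // starts only after the first has completed.
  def sequential(): Future[Int] =
    for {
      a <- slow(2)
      b <- slow(4)
    } yield a + b

  // Futures created up front: both run concurrently and the
  // for-comprehension only combines their results.
  def parallel(): Future[Int] = {
    val futureA = slow(2)
    val futureB = slow(4)
    for {
      a <- futureA
      b <- futureB
    } yield a + b
  }

  // Returns the result and the elapsed wall-clock time in milliseconds.
  def timed[T](f: => Future[T]): (T, Long) = {
    val start = System.nanoTime()
    val result = Await.result(f, 5.seconds)
    (result, (System.nanoTime() - start) / 1000000)
  }

  def main(args: Array[String]): Unit = {
    println("sequential: " + timed(sequential()))
    println("parallel:   " + timed(parallel()))
  }
}
```

Both variants yield the same sum; only the elapsed time differs, with the sequential version taking roughly the sum of the two delays.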
@ELMANU 
FUTURES: CALLBACKS 
import scala.concurrent._ 
import scala.concurrent.ExecutionContext.Implicits.global 
val futureDiv: Future[Int] = Future { 1 / 0 } 
futureDiv.onSuccess { case result => 
  println("Result: " + result) 
} 
futureDiv.onFailure { case t: Throwable => 
  println("Oh no!") 
}
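`onSuccess` and `onFailure` were deprecated in Scala 2.12 and removed in 2.13, so on current Scala versions the same callbacks are usually written with `onComplete`. A minimal sketch follows; the `message` helper and the `done` promise are hypothetical additions that exist only to make the demo observable.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object Callbacks {
  // Hypothetical helper: builds the text the callback prints.
  def message(outcome: Try[Int]): String = outcome match {
    case Success(result) => "Result: " + result
    case Failure(t)      => "Oh no! " + t.getMessage
  }

  def main(args: Array[String]): Unit = {
    val futureDiv: Future[Int] = Future { 1 / 0 }

    // Completed by the callback so the demo can wait for it to run.
    val done = Promise[Unit]()
    futureDiv.onComplete { outcome =>
      println(message(outcome))
      done.success(())
    }
    Await.ready(done.future, 1.second) // for the demo only
  }
}
```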
@ELMANU 
using FUTURES 
•a Future { … } block that doesn’t do any I/O 
is a code smell 
•use them in combination with the “right” 
ExecutionContext set-up 
•when you have blocking operations, wrap 
them into a blocking block
@ELMANU 
using FUTURES 
import scala.concurrent.blocking 
Future { 
  blocking { 
    DB.withConnection { implicit connection => 
      val query = SQL("select * from bar") 
      query() 
    } 
  } 
}
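One possible reading of the "right" ExecutionContext set-up is a dedicated, bounded thread pool for blocking calls, kept apart from the default pool. This is a sketch under assumptions: the pool size of 8 is arbitrary, and `loadUserName` is a hypothetical stand-in for a blocking database call.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

object BlockingPool {
  // Assumption: 8 threads is arbitrary; it is often matched to the
  // size of the database connection pool.
  val blockingContext: ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(8))

  // Hypothetical stand-in for a blocking JDBC call.
  def loadUserName(id: Long): String = {
    Thread.sleep(50)
    s"user-$id"
  }

  // The blocking call runs on the dedicated pool, so it cannot starve
  // the default pool used for CPU-bound work.
  def loadUserNameAsync(id: Long): Future[String] =
    Future { loadUserName(id) }(blockingContext)

  def main(args: Array[String]): Unit = {
    println(Await.result(loadUserNameAsync(42L), 2.seconds))
    blockingContext.shutdown() // pool threads are non-daemon
  }
}
```

Passing the context explicitly keeps it visible at the call site which pool a blocking operation runs on.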
@ELMANU 
naming FUTURES
@ELMANU 
naming FUTURES 
“Say 
eventuallyMaybe 
one more time!”
@ELMANU 
ACTORS
@ELMANU 
ACTORS 
•lightweight objects 
•send and receive messages (mailbox) 
•can have children (supervision)
@ELMANU 
ACTORS 
(diagram: three actors, each with its own mailbox) 
akka://application/user/georgePeppard 
akka://application/user/audreyHepburn 
akka://application/user/audreyHepburn/cat (child of audreyHepburn) 
georgePeppard to audreyHepburn: "Holly, I'm in love with you." 
audreyHepburn to georgePeppard: "So what?"
@ELMANU 
GETTING AN ACTOR 
import akka.actor._ 
class AudreyHepburn extends Actor { 
  def receive = { ... } 
} 
val system: ActorSystem = ActorSystem() 
val audrey: ActorRef = system.actorOf(Props[AudreyHepburn])
@ELMANU 
SENDING AND 
RECEIVING MESSAGES 
case class Script(text: String) 
class AudreyHepburn extends Actor { 
  def receive = { 
    case Script(text) => 
      read(text) 
  } 
} 
// "tell": fire-and-forget 
audrey ! Script(breakfastAtTiffany)
@ELMANU 
ASK PATTERN 
import akka.pattern.ask 
import scala.concurrent.Future 
import scala.concurrent.ExecutionContext.Implicits.global 
import scala.concurrent.duration._ 
implicit val timeout = akka.util.Timeout(1.second) 
// "ask": returns a Future[Any], narrowed here with mapTo 
val maybeAnswer: Future[String] = 
  (audrey ? "Where should we have breakfast?").mapTo[String]
@ELMANU 
SUPERVISION 
import akka.actor._ 
import akka.actor.SupervisorStrategy.Restart 
import akka.routing.RoundRobinRouter 
class UserMigrator extends Actor with ActorLogging { 
  // context: the actor context; RoundRobinRouter: the router type; 
  // nrOfInstances: many children 
  // (Akka 2.3+: RoundRobinPool(100).props(Props[UserMigrationWorker])) 
  lazy val workers: ActorRef = context.actorOf( 
    Props[UserMigrationWorker] 
      .withRouter(RoundRobinRouter(nrOfInstances = 100))) 
  override def supervisorStrategy: SupervisorStrategy = 
    OneForOneStrategy(maxNrOfRetries = 3) { 
      case t: Throwable => 
        log.error(t, "A child died!") 
        Restart 
    } 
  def receive = { ... } // elided on the slide 
}
@ELMANU 
PIPES
@ELMANU 
CECI EST UNE PIPE 
•Akka pattern to combine Futures and Actors 
•Sends the result of a Future to an Actor 
•Be careful with error handling
@ELMANU 
CECI EST UNE PIPE 
class FileFetcher extends Actor { 
  def receive = { 
    case FetchFile(url) => 
      // Keep a reference to the original sender: what follows is a Future! 
      val originalSender = sender() 
      val download: Future[DownloadedFile] = 
        WS.url(url).get().map { response => 
          // Wrap your result into something you can easily match against 
          DownloadedFile( 
            url, 
            response.ahcResponse.getResponseBodyAsBytes 
          ) 
        } 
      // This is how you pipe 
      import akka.pattern.pipe 
      download pipeTo originalSender 
  } 
}
@ELMANU 
CECI EST UNE PIPE 
class FileFetcher extends Actor { 
  def receive = { 
    case FetchFile(url) => 
      val originalSender = sender 
      val download: Future[Array[Byte]] = 
        WS.url(url).get().map { response => 
          DownloadedFile( 
            url, 
            response.ahcResponse.getResponseBodyAsBytes 
          ) 
        } 
      import akka.pattern.pipe 
      download pipeTo originalSender 
  } 
} 
Will this work?
@ELMANU 
PIPES AND error 
handling 
class FileFetcher extends Actor { 
  def receive = { 
    case FetchFile(url) => 
      val originalSender = sender() 
      val download = 
        WS.url(url).get().map { response => 
          DownloadedFile(...) 
        } recover { case t: Throwable => 
          // Don't forget to recover! 
          DownloadFileFailure(url, t) 
        } 
      import akka.pattern.pipe 
      download pipeTo originalSender 
  } 
}
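The receiving side of a pipe might look like the sketch below, written against the classic Akka actor API and requiring Akka on the classpath. The message classes mirror the names on the slides, but their fields are assumptions, as are the injected `download` function (standing in for the `WS` call) and the `ItemMigrationWorker` handling logic. When a Future is piped without `recover`, the recipient gets an `akka.actor.Status.Failure` instead of a domain message, so that case is matched as a safety net.

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props, Status}
import akka.pattern.pipe
import scala.concurrent.Future

// Hypothetical message types, modelled on the slides.
case class FetchFile(url: String)
case class DownloadedFile(url: String, bytes: Array[Byte])
case class DownloadFileFailure(url: String, cause: Throwable)

// `download` is injected so the sketch does not depend on an HTTP client.
class FileFetcher(download: String => Future[Array[Byte]]) extends Actor {
  import context.dispatcher // ExecutionContext for map, recover and pipeTo

  def receive = {
    case FetchFile(url) =>
      val originalSender = sender() // captured before going asynchronous
      download(url)
        .map(bytes => DownloadedFile(url, bytes))
        .recover { case t: Throwable => DownloadFileFailure(url, t) }
        .pipeTo(originalSender)
  }
}

class ItemMigrationWorker(fetcher: ActorRef) extends Actor {
  def receive = {
    case url: String =>
      fetcher ! FetchFile(url)
    case DownloadedFile(url, bytes) =>
      println(s"$url: ${bytes.length} bytes")
    case DownloadFileFailure(url, cause) =>
      println(s"$url failed: ${cause.getMessage}")
    case Status.Failure(cause) =>
      // Only reached when a Future was piped without recover.
      println(s"unhandled failure: ${cause.getMessage}")
  }
}
```

Because both outcomes arrive as ordinary messages, the worker can decide per URL whether to retry, skip or record the failure.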
@ELMANU 
SUMMARY 
•Futures: manipulate and combine 
asynchronous operation results 
•Actors: organise complex asynchronous flows, 
deal with failure via supervision 
•Pipes: deal with results of asynchronous 
computation inside of actors
@ELMANU 
LESSONS LEARNED
@ELMANU 
design according to 
YOUR DATA 
(diagram: a User migrator with five Workers below it)
@ELMANU 
design according to 
YOUR DATA 
design A 
(diagram: an Item migrator with three User item migrators below it, 
each with its own two Item migration workers) 
Not all users have the same amount of items
@ELMANU 
design according to 
YOUR DATA 
design B 
(diagram: an Item migrator and User item migrators, backed by 
pools of actors: Item migration workers, File fetchers, 
File uploaders and a Soundcloud worker)
@ELMANU 
KNOW THE limits OF 
THY SOURCE SYSTEM
@ELMANU 
DATA MIGRATION 
SHOULD not BE A RACE 
•Your goal is to get 
the data, not to be 
as fast as possible 
•Be gentle to the 
legacy system(s)
@ELMANU 
CLOUD API 
STANDARDS 
•ISO-28601 Data formats in REST APIs 
•ISO-28700 Response times and failure 
communication of REST APIs 
•ISO-28701 Rate limits in REST APIs and HTTP 
error codes 
DREAM ON
@ELMANU 
NO STANDARDS! 
•The cloud is heterogeneous 
•Response times, rate limits, error codes all 
different 
•Don’t even try to treat all systems the same
@ELMANU 
RATE limits
@ELMANU 
RATE limits 
•Read the docs - most cloud API docs will 
warn you about them 
•Design your actor system so that you can 
queue if necessary 
•Keep track of migration status
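As a rough illustration of queueing and status tracking, the sketch below uses plain Futures rather than an actor system, which is a deliberate simplification and not the design from the talk. `migrateAll`, `Migrated` and `Failed` are hypothetical names. Items are processed in batches, so at most `batchSize` requests are in flight at once, and every id ends up with a recorded status.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object GentleMigration {
  sealed trait Status
  case object Migrated extends Status
  case class Failed(reason: String) extends Status

  // A batch starts only after the previous one has finished, so at most
  // `batchSize` calls to `migrateOne` are in flight at any time.
  def migrateAll(ids: Seq[Long], batchSize: Int)(
      migrateOne: Long => Future[Unit]): Future[Map[Long, Status]] =
    ids.grouped(batchSize).foldLeft(Future.successful(Map.empty[Long, Status])) {
      (statusSoFar, batch) =>
        statusSoFar.flatMap { status =>
          Future.traverse(batch) { id =>
            migrateOne(id)
              .map(_ => id -> (Migrated: Status))
              // A failed item is recorded instead of aborting the whole run.
              .recover { case t => id -> (Failed(String.valueOf(t.getMessage)): Status) }
          }.map(results => status ++ results)
        }
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical migration step: id 3 simulates a rate-limit error.
    val run = migrateAll(Seq(1L, 2L, 3L, 4L, 5L), batchSize = 2) { id =>
      if (id == 3L) Future.failed(new RuntimeException("rate limited"))
      else Future.successful(())
    }
    println(Await.result(run, 5.seconds)) // blocking for the demo only
  }
}
```

The resulting status map can be persisted between runs so that a restarted migration only retries the failed ids.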
@ELMANU 
RATE limits 
•Example: Soundcloud API 
•500 Internal Server Error after a seemingly 
random number of requests 
•Magic User-Agent 
WS 
  .url("http://api.soundcloud.com/resolve.json") 
  .withHeaders("User-Agent" -> "FOOBAR") // the magic ingredient that 
                                         // opens the door to Soundcloud
@ELMANU 
BLOCKING
@ELMANU 
seriously, do not 
BLOCK 
•Seems innocent at first to block from time to 
time 
•OutOfMemory after 8 hours of migration run 
is not very funny 
•You will end up rewriting your whole code to 
be async anyway
@ELMANU 
MISC 
•Unstable primary IDs in source system 
•Build a lot of small tools, be pragmatic 
•sbt-tasks (http://yobriefca.se/sbt-tasks/)
@ELMANU 
THE END 
QUESTIONS?

