Programming the cloud with Skywriting
Derek Murray, with Malte Schwarzkopf, Chris Smowton, Anil Madhavapeddy and Steve Hand
Outline
- State of the art
- Skywriting by example
- Iterative algorithms
- Heterogeneous clusters
- Speculative execution
- Performance case studies
- Future directions
Task farming [diagram: a bag of independent tasks]
Task farming [diagram: a master doling out tasks to three workers]
Task farming [diagram: tasks A and B, where A runs before B]
MapReduce [diagram: Input → Map → Shuffle → Reduce → Output]
Dryad
Problem: iterative algorithms [diagram: a task loops while not converged, and exits once converged]
Problem: cluster heterogeneity [diagram: a master with three identical workers]
Problem: cluster heterogeneity [diagram: a master with an ad-hoc mix of machines]
Problem: cluster heterogeneity [diagram: a master with data spread across multiple cloud providers]
Problem: speculative execution
Solution: Skywriting
- Turing-complete coordination language: support for spawning tasks; interface to external code
- Distributed execution engine: executes tasks in parallel on a cluster; handles failure, locality, data motion, etc.
Spawning a Skywriting task

function f(arg1, arg2) { … }
result = spawn(f, [arg1, arg2]);
Building a task graph

function f(x, y) { … }
function g(x, y) { … }
function h(x, y) { … }

a = spawn(f, [7, 8]);
b = spawn(g, [a, 0]);
c = spawn(g, [a, 1]);
d = spawn(h, [b, c]);
return d;

[diagram: the resulting task graph — f feeds both g tasks, which feed h]
Iterative algorithm

current = …;
do {
    prev = current;
    a = spawn(f, [prev, 0]);
    b = spawn(f, [prev, 1]);
    c = spawn(f, [prev, 2]);
    current = spawn(g, [a, b, c]);
    done = spawn(h, [current]);
} while (!*done);
Iterative algorithm [diagram: the f/g/h task graph unrolled across iterations]
Aside: recursive algorithm

function f(x) {
    if (/* x is small enough */) {
        return /* do something with x */;
    } else {
        x_lo = /* bottom half of x */;
        x_hi = /* top half of x */;
        return [spawn(f, [x_lo]),
                spawn(f, [x_hi])];
    }
}
Executing external code

y = exec(executor_name,
         { "inputs" : [x1, x2, x3], … },
         num_outputs);

Runs Java, C, .NET and pipe-based code.
Heterogeneous cluster support
Workers advertise "execution facilities"
Tasks migrate to the necessary facilities

Speculative execution
Speculative execution

x = …;
a = spawn(f, [x]);
b = spawn(f, [x]);
c = spawn(f, [x]);
result = waituntil(any, [a, b, c]);
return result["available"];
Performance case studies
All experiments used Amazon EC2 m1.small instances, running Ubuntu 8.10.
- Microbenchmark
- Smith-Waterman
Job creation overhead
Smith-Waterman data flow
Parallel Smith-Waterman
Parallel Smith-Waterman
Future work
- Distributed data structures: coping when the lists etc. get big
- Better language integration: compile to JVM, CLR, LLVM etc.
- Decentralised master-worker: run on multiple clouds
- Self-scaling clusters: add and remove workers as needed

Editor's Notes

  • #2: Thanks for the introduction, Eva. Well, as Eva said, my name’s Derek Murray, I’m a third-year PhD student at Cambridge, and today I’m going to talk about Skywriting, which is a little bit of work I’ve been doing with these guys: Malte, Chris, Anil and my supervisor Steve Hand. Skywriting is a system for large-scale distributed computation – in this respect it’s similar to things like Google MapReduce and Microsoft’s Dryad – so that’s systems where your data or compute need is so big that you have to use a cluster in parallel to get the job done. It was the success of these systems – in particular Hadoop, the open-source MapReduce – that motivated us to start this work. What I found interesting was that people were using these things in entirely unexpected ways… taking MapReduce, which is excellent for log-processing, and running some big iterative machine learning algorithm on it. We reckoned that people were using MapReduce not because of its programming model, but despite it. So we set out to build something that combines all the advantages of previous systems with a very flexible programming model. The result was Skywriting, so let’s see what you think…
  • #4: All the systems we’ll discuss today use the simple notion of task parallelism. Many algorithms can be divided into tasks, which are just chunks of sequential code. The key observation is that two independent tasks can run in parallel. And when your whole job divides into a fully independent bag of tasks, it’s said to be “embarrassingly parallel”.
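The bag-of-tasks idea described in this note can be sketched in a few lines of Python (illustrative only — Skywriting is not Python, and `square` and `run_bag_of_tasks` are hypothetical names):

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    # A "task": a chunk of sequential code, independent of every other task.
    return n * n

def run_bag_of_tasks(inputs):
    # Because no task depends on another, all of them can be dispatched
    # at once -- the "embarrassingly parallel" case.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(square, inputs))
```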
  • #5: And how do you run these embarrassingly parallel jobs? Well, you give your bag of tasks to a master, which doles them out on demand to a set of workers. This is a very simple architecture to program. And it has a lot of benefits. If one of the workers crashes, fine! The master will notice and give that worker’s current task to someone else. And if a worker is a bit slower than the others, that’s also fine! Each worker pulls a new task when it has completed the last one, so even a heterogeneous pool can do useful work.
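The pull-based master-worker loop this note describes can be sketched in Python, with a shared queue standing in for the master and a thread per worker (all names are illustrative, not part of any real system):

```python
import queue
import threading

def run_task_farm(tasks, num_workers=3):
    # The "master" is a shared queue; each worker pulls a new task only
    # when it has finished the last one, so a slow worker simply
    # completes fewer tasks instead of stalling the whole job.
    todo = queue.Queue()
    for task in tasks:
        todo.put(task)
    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                task = todo.get_nowait()  # pull the next task on demand
            except queue.Empty:
                return                    # no tasks left: worker retires
            result = task()
            with lock:
                results.append(result)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results
```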
  • #6: Embarrassing parallelism is not very interesting: it only lets you do boring things like search for aliens and brute-force people’s passwords.
  • #7: It gets much more interesting – i.e. commercially useful – when the tasks have dependencies between them. So here, we have two tasks A and B, and a relation that says A must run before B. The usual reason for this is that A writes some output, and B wants to read it. Think of this like makefile rules: you can build up graphs out of these dependencies, and resolve them in parallel. In fact, the original name for this project was “Cloud Make”. Fortunately it changed….
  • #8: Are you all familiar with MapReduce? Introduced by Google in 2004, MapReduce used the observation that the map() function from functional programming can run in parallel over large lists. So they broke down their huge data into chunks, and ran each through a “map task”, generating some key-value pairs that are then sorted by key in the shuffle phase, and then the values for each key are folded in parallel using a “reduce task”. This basically uses the same master-worker task farm that I showed on a previous slide, with the single constraint that all the map tasks must finish before the reduce tasks begin. Therefore it had the benefit of working at huge scale, and being very reliable.
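The map/shuffle/reduce pipeline in this note can be sketched as a sequential Python word count (a toy illustration only; real MapReduce runs the map and reduce tasks in parallel on the task farm):

```python
from collections import defaultdict

def map_task(chunk):
    # Map: emit (key, value) pairs from one chunk of the input.
    return [(word, 1) for word in chunk.split()]

def shuffle(pairs):
    # Shuffle: group values by key across all map outputs.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_task(key, values):
    # Reduce: fold the values for one key.
    return key, sum(values)

def mapreduce(chunks):
    # All map tasks must finish before any reduce task begins.
    mapped = [pair for chunk in chunks for pair in map_task(chunk)]
    return dict(reduce_task(k, vs) for k, vs in shuffle(mapped).items())
```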
  • #10: A couple of years later, Microsoft, which also has a search engine, released “Dryad”, which generalises MapReduce by allowing the user to specify a job as any directed acyclic graph. The graph has vertices – which are arbitrary sequential code in your favourite language – and channels, which could be files, in-memory FIFOs, TCP connections or whatever. Clearly you can implement MapReduce in Dryad, since it’s just a DAG. But Dryad makes things like joins much easier, because a task can have multiple inputs.
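A minimal sketch of DAG execution in the Dryad style, in Python: vertices are functions, edges deliver each predecessor's output to its successors. (Here vertices run sequentially in dependency order; Dryad schedules independent vertices in parallel, and the input graph is assumed acyclic.)

```python
def run_dag(vertices, edges):
    # vertices: name -> function; edges: (producer, consumer) pairs.
    # A vertex runs once all of its predecessors have produced their
    # outputs, which are passed to it as arguments (its "channels").
    preds = {v: [] for v in vertices}
    for src, dst in edges:
        preds[dst].append(src)
    outputs, done = {}, set()
    while len(done) < len(vertices):
        for v, fn in vertices.items():
            if v not in done and all(p in done for p in preds[v]):
                outputs[v] = fn(*[outputs[p] for p in preds[v]])
                done.add(v)
    return outputs
```

For example, the diamond-shaped graph from the earlier slide (one source feeding two middle vertices that feed a join) is just four vertices and four edges.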
  • #13: So far, we can run any finite directed acyclic graph using Dryad. As the name suggests, however, Dryad is not terribly good at cyclic data flows. These turn up all the time in fields like machine learning, scientific computing and information retrieval. Take PageRank, for example, which involves repeatedly premultiplying a vector by a large sparse matrix representing the web. You keep doing this until you reach a fixpoint, and the PageRank vector has converged. At present, all you can do is submit one job after another. This is bad for a number of reasons. First of all, it’s very slow: MapReduce and Dryad are designed for batch submission, and so starting an individual job takes on the order of 30 seconds. If your iteration is shorter than that, you’re losing out on parallel speedup. It also introduces a co-dependency between the client and the cluster. Now the client, which is just some simple program that submits jobs to the cluster, has to stay running for the duration of the job, but since it’s outside the cluster, it gets none of the advantages of fault-tolerance, of data locality, of fair scheduling. Since the client now contains critical job state, it’s necessary to add all these features manually.
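The submit-one-job-after-another pattern this note criticises is essentially the following driver loop, sketched in Python with hypothetical step and convergence functions:

```python
def iterate_to_fixpoint(step, initial, converged, max_iters=100):
    # The client resubmits the (internally parallel) step as a fresh job
    # each round -- paying the per-job startup cost every time -- and
    # must itself stay alive to hold the loop state.
    current = initial
    for _ in range(max_iters):
        current = step(current)
        if converged(current):
            break
    return current
```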
  • #14: Remember our master-worker architecture? Well, if you’ve ever tried to set up Hadoop or Dryad, you’ll know that you need to make sure all of the workers are the same: running the same operating system, on the same local network.
  • #15: But what if all you have is a little ad-hoc cluster, with a Windows desktop, a Linux server and a Mac laptop?
  • #16: Or, perhaps less contrived, what if your data are spread between different cloud providers. So you might have some data in Amazon S3, some in Google’s App Engine, and some in Windows Azure. Our mantra is “put the computation near the data”, and it’s not practical to shift all the data to one place.
  • #17: And what about this? Say you have a really important task to complete, but you don’t know how long it’ll take – maybe you’re using some kind of randomised algorithm. So you fire off three copies of the same task… and eventually one finishes. At this point, you can just kill the other two. Although MapReduce and Dryad have limited support for this, it’s not first-class: you can’t do it on demand, only in response to “straggler” nodes that take much longer to complete than others.
  • #18: I’ve spent quite a lot of slides being rather coy about what’s to come, but if you’ve read the abstract, you’ll know that Skywriting is
  • #19: …two things. First, instead of using DAGs to describe a job, we use the most powerful thing available to us: a Turing-complete coordination language. This sounds ominous and theoretical, but actually it’s just a programming language that looks a lot like JavaScript, with all the usual control-flow structures: loops, ifs, functions and so on. Since we want to run things efficiently in parallel, it has support for spawning tasks, and a way to call external code. The other main component is the distributed execution engine, which actually executes Skywriting programs in the cluster. The interesting thing about this is that a “task” is just a Skywriting function – a continuation, to be more precise – which means that tasks can spawn other tasks, and thereby grow the job dynamically.
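The "tasks spawn other tasks" idea can be sketched in Python with futures playing the role of Skywriting references (an illustration, not Skywriting's implementation — and note the caveat in the comments about blocking versus continuations):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(pool, xs):
    # A task that spawns two child tasks, growing the job dynamically --
    # the same shape as the recursive Skywriting example on the slide.
    if len(xs) <= 2:
        return sum(xs)
    mid = len(xs) // 2
    lo = pool.submit(parallel_sum, pool, xs[:mid])  # like spawn(f, [x_lo])
    hi = pool.submit(parallel_sum, pool, xs[mid:])  # like spawn(f, [x_hi])
    # Blocking on .result() ties up a worker thread; Skywriting instead
    # suspends the parent task as a continuation until its children finish.
    return lo.result() + hi.result()

def parallel_sum_top(xs):
    # Enough workers that the blocking parents above cannot deadlock.
    with ThreadPoolExecutor(max_workers=16) as pool:
        return parallel_sum(pool, xs)
```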
  • #28: 1.0 – 1.2 GHz Xeon or Opteron. 1.7GB RAM, 150GB disk.
  • #31: 50 x 50 on 50 workers. Input size is
  • #32: Best score is 15x15 = 225 tasks, at 83 s (2.6x speedup).