Architecture and Performance of Runtime Environments for Data Intensive Scalable Computing

1. Architecture and Performance of Runtime Environments for Data Intensive Scalable Computing. SC09 Doctoral Symposium, Portland, 11/18/2009. Student: Jaliya Ekanayake. Advisor: Prof. Geoffrey Fox. Community Grids Laboratory, Digital Science Center, Pervasive Technology Institute, Indiana University.

2. Cloud Runtimes for Data/Compute Intensive Applications. Cloud runtimes (MapReduce, Dryad/DryadLINQ, Sector/Sphere): moving computation to data; simple communication topologies (MapReduce, Directed Acyclic Graphs (DAGs)); distributed file systems; fault tolerance. Data/compute intensive applications:

3. Represented as filter pipelines

4. Parallelizable filters. Applications using Hadoop and DryadLINQ (1): CAP3 [1], Expressed Sequence Tag assembly to reconstruct full-length mRNA. [Figure: input files (FASTA) → parallel CAP3 instances (DryadLINQ) → output files.] “Map only” operation in Hadoop; single “Select” operation in DryadLINQ. [1] X. Huang, A. Madan, “CAP3: A DNA Sequence Assembly Program,” Genome Research, vol. 9, no. 9, pp. 868-877, 1999.
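The "map only" pattern named on this slide can be sketched as follows. This is a minimal illustration, not the Hadoop or DryadLINQ code from the talk: `fake_assemble` is a hypothetical stand-in for invoking the CAP3 executable on one FASTA file, and `map_only` is an assumed helper name. The point is only that each input is processed independently and no reduce phase follows.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_assemble(fasta_text):
    """Hypothetical stand-in for running CAP3 on one FASTA file.

    It simply counts the sequence records (lines starting with '>').
    """
    return sum(1 for line in fasta_text.splitlines() if line.startswith(">"))


def map_only(func, inputs, workers=4):
    """Apply func to every input independently; there is no reduce phase."""
    names = list(inputs)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(func, (inputs[name] for name in names)))
    return dict(zip(names, results))


# Two tiny in-memory "files" (assumed sample data).
files = {
    "a.fasta": ">s1\nACGT\n>s2\nGGCC\n",
    "b.fasta": ">s3\nTTAA\n",
}
outputs = map_only(fake_assemble, files)
```

Because the tasks share nothing, the same structure maps onto one Hadoop map task per file or one element of a DryadLINQ `Select`.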

5. Applications using Hadoop and DryadLINQ (2): PhyloD [1] project from Microsoft Research. Derive associations between HLA alleles and HIV codons and between codons themselves. DryadLINQ implementation. [1] Microsoft Computational Biology Web Tools, http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/

6. Applications using Hadoop and DryadLINQ (3): Calculate Pairwise Distances (Smith Waterman Gotoh). Calculate pairwise distances for a collection of genes (used for clustering, MDS). Fine grained tasks in MPI; coarse grained tasks in DryadLINQ. Performed on 768 cores (Tempest Cluster). 125 million distances; 4 hours & 46 minutes.
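One way to form coarse grained tasks for a symmetric pairwise-distance computation is to cut the distance matrix into blocks and compute only the blocks on or above the diagonal. The sketch below assumes that decomposition; the slide does not spell it out. `toy_distance` is a hypothetical stand-in for Smith Waterman Gotoh, and the block size and sample sequences are illustrative.

```python
from itertools import combinations_with_replacement


def toy_distance(a, b):
    """Stand-in for Smith Waterman Gotoh: mismatches plus length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def block_ranges(n, block):
    """Split range(n) into consecutive (start, stop) ranges of at most `block`."""
    return [(start, min(start + block, n)) for start in range(0, n, block)]


def compute_block(seqs, rows, cols):
    """One coarse grained task: every distance in a (row range, col range) block."""
    return {(i, j): toy_distance(seqs[i], seqs[j])
            for i in range(*rows) for j in range(*cols)}


def pairwise_distances(seqs, block=2):
    n = len(seqs)
    dist = [[0] * n for _ in range(n)]
    ranges = block_ranges(n, block)
    # Only blocks on or above the diagonal are computed; the rest follow by symmetry.
    for rows, cols in combinations_with_replacement(ranges, 2):
        for (i, j), d in compute_block(seqs, rows, cols).items():
            dist[i][j] = dist[j][i] = d
    return dist


genes = ["ACGT", "ACGA", "TCGA", "ACG", "GGGT"]
matrix = pairwise_distances(genes)
```

Each call to `compute_block` is independent, so blocks could be handed to separate workers; larger blocks mean fewer, coarser tasks.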

7. Applications using Hadoop and DryadLINQ (4): High Energy Physics (HEP)

8. K-Means Clustering

9. Matrix Multiplication

10. Multi-Dimensional Scaling (MDS). MapReduce for Iterative Computations. Classic MapReduce runtimes (Google, Apache Hadoop, Sector/Sphere, DryadLINQ (DAG based)) focus on single step MapReduce computations only. Intermediate data is stored and accessed via file systems: better fault tolerance support, but higher latencies. Iterative MapReduce computations use new maps/reduces in each iteration, so fixed data is loaded again and again. This is inefficient for many iterative computations to which the MapReduce technique could be applied. Solution: i-MapReduce.

11. Applications & Different Interconnection Patterns. [Figure: interconnection patterns; labels: Input, map, reduce, iterations, Output, Pij; regions: "Domain of MapReduce and Iterative Extensions" and "MPI".]

12. i-MapReduce: In-memory MapReduce

13. Distinction between static data and variable data (data flow vs. δ flow)

14. Cacheable map/reduce tasks (long running tasks)

15. Combine operation

16. Support fast intermediate data transfers. [Figure labels: User Program; static data; Configure(); Iterate; δ flow; Map(Key, Value); Reduce(Key, List<Value>); Combine(Key, List<Value>); Close().] Different synchronization and intercommunication mechanisms used by the parallel runtimes.

17. i-MapReduce Programming Model. User program (runs in the user program's process space): configureMaps(); configureReduce(); while(condition){ runMapReduce(); updateCondition(); } //end while; close(). Iterations of Map(), Reduce(), and the Combine() operation run on the worker nodes. Cacheable map/reduce tasks; can send <Key,Value> pairs directly; communications/data transfers via the pub-sub broker network. Two configuration options: using local disks (only for maps); using pub-sub bus.
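The programming model on this slide can be sketched in a few lines. This is a toy, single-process illustration under stated assumptions, not the i-MapReduce implementation: the class `IterativeMapReduce` and its method names only mirror the slide's configureMaps()/runMapReduce()/close() calls, there is no broker network, and one-dimensional K-means is used as an assumed example of an iterative computation. Static data (the points) is configured once and stays cached; only the variable data (the centers) flows through each iteration.

```python
from collections import defaultdict


class IterativeMapReduce:
    """Toy single-process driver; names mirror the slide but are hypothetical."""

    def __init__(self, map_fn, reduce_fn, combine_fn):
        self.map_fn = map_fn
        self.reduce_fn = reduce_fn
        self.combine_fn = combine_fn
        self.partitions = []

    def configure_maps(self, partitions):
        # Static data is loaded once and stays cached across iterations.
        self.partitions = [list(p) for p in partitions]

    def run_map_reduce(self, variable):
        grouped = defaultdict(list)
        for part in self.partitions:  # one cached map task per partition
            for key, value in self.map_fn(part, variable):
                grouped[key].append(value)
        reduced = {k: self.reduce_fn(k, vs) for k, vs in grouped.items()}
        # Combine gathers all reduce outputs into one value for the user program.
        return self.combine_fn(reduced, variable)

    def close(self):
        self.partitions = []


def kmeans_map(points, centers):
    # Emit (index of nearest center, (point, 1)) for every cached point.
    for x in points:
        idx = min(range(len(centers)), key=lambda c: abs(x - centers[c]))
        yield idx, (x, 1)


def kmeans_reduce(key, values):
    # New center = mean of the points assigned to this center.
    return sum(v[0] for v in values) / sum(v[1] for v in values)


def kmeans_combine(reduced, old_centers):
    # Keep the old center if no point was assigned to it.
    return [reduced.get(i, c) for i, c in enumerate(old_centers)]


driver = IterativeMapReduce(kmeans_map, kmeans_reduce, kmeans_combine)
driver.configure_maps([[1.0, 2.0, 9.0], [10.0, 3.0, 11.0]])
centers, iterations = [0.0, 5.0], 0
while True:
    new_centers = driver.run_map_reduce(centers)
    iterations += 1
    if max(abs(a - b) for a, b in zip(new_centers, centers)) < 1e-9:
        break  # centers stopped moving
    centers = new_centers
driver.close()
```

The loop is owned by the user program, which matches the slide's point that the user program composes the MapReduce computations; a classic runtime would instead reload the points in every iteration.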

18. i-MapReduce Architecture. [Figure labels: User Program; MR Driver; Pub/Sub Broker Network; Worker Nodes; Map Worker (M); Reduce Worker (R); MRDaemon (D); File System; legend: Data Read/Write, Communication, Data Split.] Streaming based communication

19. Eliminates file based communication

20. Cacheable map/reduce tasks

21. Static data remains in memory

22. User Program is the composer of MapReduce computations

23. Extends the MapReduce model to iterative computations

24. A limitation:

25. Assume that static data fits into distributed memory. (Jaliya Ekanayake, 12/6/2009)

26. Applications – Pleasingly Parallel. CAP3 - Expressed Sequence Tagging: input files (FASTA) → CAP3 → output files. High Energy Physics (HEP) Data Analysis.

27. Applications - Iterative. [Figures: performance of K-Means Clustering; parallel overhead of matrix multiplication.]
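The slide names "parallel overhead" without defining it. A minimal sketch, assuming the commonly used definition f(P) = (P·T(P) − T(1)) / T(1), where T(1) is the sequential time and T(P) the time on P processes; the function names and sample timings below are illustrative, not measurements from the talk.

```python
def parallel_overhead(p, t_parallel, t_serial):
    """Assumed definition: f = (P * T(P) - T(1)) / T(1)."""
    return (p * t_parallel - t_serial) / t_serial


def efficiency(p, t_parallel, t_serial):
    """Parallel efficiency: T(1) / (P * T(P))."""
    return t_serial / (p * t_parallel)


# Made-up timings: 100 s sequentially, 30 s on 4 processes.
f = parallel_overhead(4, 30.0, 100.0)
e = efficiency(4, 30.0, 100.0)
```

Under this definition, efficiency equals 1 / (1 + f), so an overhead of zero corresponds to perfect speedup.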

28. Current Research. Virtualization overhead: applications more susceptible to latencies (higher communication/computation ratio) => higher overheads under virtualization; Hadoop shows 15% performance degradation on a private cloud; latency effect on i-MapReduce is lower compared to MPI due to the coarse grained tasks? Fault tolerance for i-MapReduce: replicated data; saving state after n iterations.
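"Saving state after n iterations" could look roughly like the sketch below. The slide lists this as current research, so this is an assumed design rather than what i-MapReduce does: `run_with_checkpoints`, the JSON file format, and the `halve` step function are all hypothetical. The loop saves the iteration number and state every `every_n` iterations and resumes from the last saved checkpoint if one exists.

```python
import json
import os
import tempfile


def run_with_checkpoints(step, state, max_iters, every_n, path):
    """Run step() repeatedly, saving state every `every_n` iterations."""
    start = 0
    if os.path.exists(path):  # resume from the last checkpoint after a failure
        with open(path) as f:
            saved = json.load(f)
        start, state = saved["iteration"], saved["state"]
    for i in range(start, max_iters):
        state = step(state)
        if (i + 1) % every_n == 0:
            tmp = path + ".tmp"
            with open(tmp, "w") as f:
                json.dump({"iteration": i + 1, "state": state}, f)
            os.replace(tmp, path)  # swap in the new checkpoint in one step
    return state


def halve(state):
    """Toy iteration step: halve every value."""
    return [x / 2 for x in state]


ckpt = os.path.join(tempfile.mkdtemp(), "state.json")
# First run stops after 5 iterations; checkpoints exist for iterations 2 and 4.
partial = run_with_checkpoints(halve, [64.0], max_iters=5, every_n=2, path=ckpt)
# A restart ignores the initial state and resumes from the iteration-4 checkpoint.
resumed = run_with_checkpoints(halve, [64.0], max_iters=8, every_n=2, path=ckpt)
```

At most `every_n - 1` iterations are recomputed after a failure, which is the trade-off against the cost of writing state more often.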

29. Related Work. General MapReduce references: Google MapReduce; Apache Hadoop; Microsoft DryadLINQ; Pregel: Large-scale graph computing at Google; Sector/Sphere; All-Pairs; SAGA: MapReduce; Disco.

30. Contributions. Programming model for iterative MapReduce computations; i-MapReduce implementation; MapReduce algorithms/implementations for a series of scientific applications; applicability of cloud runtimes to different classes of data/compute intensive applications; comparison of cloud runtimes with MPI; virtualization overhead of HPC applications and cloud runtimes.

31. Publications
Jaliya Ekanayake (Advisor: Geoffrey Fox), Architecture and Performance of Runtime Environments for Data Intensive Scalable Computing, accepted for the Doctoral Showcase, SuperComputing2009.
Xiaohong Qiu, Jaliya Ekanayake, Scott Beason, Thilina Gunarathne, Geoffrey Fox, Roger Barga, Dennis Gannon, Cloud Technologies for Bioinformatics Applications, accepted for publication in 2nd ACM Workshop on Many-Task Computing on Grids and Supercomputers, SuperComputing2009.
Jaliya Ekanayake, Atilla Soner Balkir, Thilina Gunarathne, Geoffrey Fox, Christophe Poulain, Nelson Araujo, Roger Barga, DryadLINQ for Scientific Analyses, accepted for publication in Fifth IEEE International Conference on e-Science (eScience2009), Oxford, UK.
Jaliya Ekanayake and Geoffrey Fox, High Performance Parallel Computing with Clouds and Cloud Technologies, First International Conference on Cloud Computing (CloudComp2009), Munich, Germany. (An extended version of this paper goes to a book chapter.)
Geoffrey Fox, Seung-Hee Bae, Jaliya Ekanayake, Xiaohong Qiu, and Huapeng Yuan, Parallel Data Mining from Multicore to Cloudy Grids, High Performance Computing and Grids workshop, 2008. (An extended version of this paper goes to a book chapter.)
Jaliya Ekanayake, Shrideep Pallickara, Geoffrey Fox, MapReduce for Data Intensive Scientific Analyses, Fourth IEEE International Conference on eScience, 2008, pp. 277-284.
Jaliya Ekanayake, Shrideep Pallickara, and Geoffrey Fox, A collaborative framework for scientific data analysis and visualization, Collaborative Technologies and Systems (CTS08), 2008, pp. 339-346.
Shrideep Pallickara, Jaliya Ekanayake and Geoffrey Fox, A Scalable Approach for the Secure and Authorized Tracking of the Availability of Entities in Distributed Systems, 21st IEEE International Parallel & Distributed Processing Symposium (IPDPS 2007).

32. Acknowledgements. My Ph.D. Committee: Prof. Geoffrey Fox, Prof. Andrew Lumsdaine, Prof. Dennis Gannon, Prof. David Leake. SALSA Team @ IU, especially: Judy Qiu, Scott Beason, Thilina Gunarathne, Hui Li. Microsoft Research: Roger Barga, Christophe Poulain.

33. Questions? Thank you!

34. Parallel Runtimes – DryadLINQ vs. Hadoop

35. Cluster Configurations. [Table columns: DryadLINQ; Hadoop / MPI; DryadLINQ / MPI.]
