Project StarGate: An End-to-End 10Gbps HPC to User Cyberinfrastructure
ANL * Calit2 * LBNL * NICS * ORNL * SDSC
Report to the Dept. of Energy Advanced Scientific Computing Advisory Committee, Oak Ridge, TN, November 3, 2009
Dr. Larry Smarr, Director, California Institute for Telecommunications and Information Technology; Harry E. Gruber Professor, Dept. of Computer Science and Engineering, Jacobs School of Engineering, UCSD
Twitter: lsmarr
Project StarGate ANL  *  Calit2  *  LBNL  *  NICS  *  ORNL  *   SDSC
Credits
Lawrence Berkeley National Laboratory (ESnet): Eli Dart
San Diego Supercomputer Center - Science application: Michael Norman, Rick Wagner (coordinator); Network: Tom Hutton
Oak Ridge National Laboratory: Susan Hicks
National Institute for Computational Sciences: Nathaniel Mendoza
Argonne National Laboratory - Network/Systems: Linda Winkler, Loren Jan Wilson; Visualization: Joseph Insley, Eric Olsen, Mark Hereld, Michael Papka
[email_address]: Larry Smarr (Overall Concept), Brian Dunne (Networking), Joe Keefe (OptIPortal), Kai Doerr, Falko Kuester (CGLX)
Exploring Cosmology With Supercomputers, Supernetworks, and Supervisualization
4096³ particle/cell hydrodynamic cosmology simulation on NICS Kraken (XT5), 16,384 cores
Output: 148 TB movie output (0.25 TB/file); 80 TB diagnostic dumps (8 TB/file)
Science: Norman, Harkness, Paschos (SDSC)
Visualization: Insley (ANL); Wagner (SDSC)
Intergalactic medium on 2 Glyr scale
Project StarGate Goals
Explore Use of OptIPortals as Petascale Supercomputer “Scalable Workstations”
Exploit Dynamic 10 Gbps Circuits on ESnet
Connect Hardware Resources at ORNL, ANL, SDSC
Show that Data Need Not be Trapped by the Network “Event Horizon”
[email_address] Rick Wagner, Mike Norman
Why Supercomputer Centers Shouldn’t Be Data Black Holes or Island Universes
Results are the Intellectual Property of the Investigator, Not the Center Where They Were Computed
Petascale HPC Machines are Not Ideal for Analysis/Viz
Keeping Data at the Center Doesn’t Take Advantage of Local CI Resources on Campuses (e.g., Triton) or at Other National Facilities (e.g., ANL Eureka)
Opening Up a 10Gbps Data Path: ORNL/NICS to ANL to SDSC
Connectivity Provided by the ESnet Science Data Network
End-to-End Coupling of User with DOE/NSF HPC Facilities
StarGate Network & Hardware
NICS (simulation): NSF TeraGrid Kraken, Cray XT5: 8,256 Compute Nodes; 99,072 Compute Cores; 129 TB RAM
ALCF (rendering): DOE Eureka: 100 Dual Quad-Core Xeon Servers; 200 NVIDIA Quadro FX GPUs in 50 Quadro Plex S4 1U Enclosures; 3.2 TB RAM
Calit2/SDSC (visualization): OptIPortal1: 20 30” (2560 x 1600 pixel) LCD Panels; 10 NVIDIA Quadro FX 4600 Graphics Cards; > 80 Megapixels; 10 Gb/s Network Throughout
ESnet Science Data Network (SDN): > 10 Gb/s Fiber Optic Network; Dynamic VLANs Configured Using OSCARS
Challenge: Kraken is not on ESnet
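For context on the display figures above, a minimal Python sketch of the pixel arithmetic for OptIPortal1 (20 panels at 2560 x 1600 each):

```python
# Back-of-the-envelope check of the OptIPortal1 resolution quoted above.
panels = 20
width, height = 2560, 1600

total_pixels = panels * width * height
print(f"Total pixels: {total_pixels:,}")        # 81,920,000
print(f"~{total_pixels / 1e6:.0f} megapixels")  # ~82 megapixels
```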
StarGate Streaming Rendering (ALCF to ESnet to SDSC)
1. The simulation volume is rendered on ALCF Eureka using vl3, a parallel (MPI) volume renderer utilizing Eureka’s GPUs. The rendering changes views steadily to highlight 3D structure.
2. The full image is broken into subsets (tiles); the tiles are continuously encoded as separate movies.
3. A media bridge at the ALCF border provides secure access to the parallel rendering streams coming from the ALCF internal network (gs1.intrepid.alcf.anl.gov), which then cross ESnet to SDSC.
4. At SDSC, flPy, a parallel (MPI) tiled image/movie viewer, composites the individual movies and synchronizes the movie playback across the OptIPortal rendering nodes.
5. Updated instructions are sent back to the renderer to change views, or load a different dataset.
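The slide above describes the streaming pattern only at a high level. As an illustration (not flPy itself), the sketch below shows one way MPI-synchronized tiled playback could be structured; the frame-loading and display calls are hypothetical placeholders:

```python
# Illustrative sketch of lockstep tiled playback across display nodes.
# Each rank plays the movie tile assigned to it; a barrier keeps all
# tiles on the same frame. Not the actual flPy implementation.
from mpi4py import MPI
import time

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

FPS = 24
N_FRAMES = 1000  # assumed length of the encoded tile movies

def load_tile_frame(tile_id, frame_idx):
    """Placeholder: decode frame `frame_idx` of the movie for tile `tile_id`."""
    return None

def display(frame):
    """Placeholder: blit the decoded frame to this node's panel(s)."""
    pass

for frame_idx in range(N_FRAMES):
    frame = load_tile_frame(rank, frame_idx)
    comm.Barrier()            # all tiles advance together
    display(frame)
    time.sleep(1.0 / FPS)     # crude pacing; a real player uses a shared clock
```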
Test Animation of 1/64 of the Data Volume (1024³ Region)
www.mcs.anl.gov/~insley/ENZO/BAO/B4096/enzo-b4096-1024subregion-test.mov
Data Moved: ORNL to ANL Data Transfer Nodes
577 Time Steps, ~148 TB
Peak Bandwidth ~2.4 Gb/s, Disk to Disk
GridFTP, Multiple Simultaneous Transfers, Each with Multiple TCP Connections
Average Aggregate Bandwidth < 800 Mb/s, Using Multiple Transfers
Additionally:
Pre-Transfer: Data was Stored in ORNL HPSS, Had to be Staged to Disk on Data Transfer Nodes; Once Moved to the HPSS Partition, Can’t Move Data Back
Post-Transfer: Each Time Step was a Tar File, Had to Untar
Moving Forward, Will Need a Direct High-Bandwidth Path from Kraken (NICS) to Eureka (ALCF)
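A rough, illustrative calculation from the figures on this slide (148 TB at the quoted peak and average rates), assuming decimal terabytes:

```python
# Rough arithmetic on the ORNL -> ANL transfer figures quoted above.
data_tb = 148
data_bits = data_tb * 1e12 * 8  # decimal TB -> bits

for label, rate_bps in [("peak ~2.4 Gb/s", 2.4e9), ("average ~0.8 Gb/s", 0.8e9)]:
    seconds = data_bits / rate_bps
    print(f"{label}: {seconds / 86400:.1f} days to move {data_tb} TB")
# peak ~2.4 Gb/s: 5.7 days; average ~0.8 Gb/s: 17.1 days
```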
ANL Eureka Graphics Cluster: Data Analytics and Visualization Cluster at ALCF
(2) Head Nodes, (100) Compute Nodes
Per Node: (2) NVIDIA Quadro FX 5600 Graphics Cards; (2) Xeon E5405 2.00 GHz Quad-Core Processors; 32 GB RAM ((8) 4-Rank 4 GB DIMMs); (1) Myricom 10G CX4 NIC; (2) 250 GB Local Disks: (1) System, (1) Minimal Scratch
32 GFLOPS per Server
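Following the per-node specs above (and the speaker notes at the end of this deck), the per-server peak works out as below; the 2 floating-point operations per clock per core is the figure assumed in those notes:

```python
# Peak per-server throughput implied by 2 x quad-core Xeon E5405 @ 2.00 GHz.
ghz, cores, flops_per_clock = 2.0, 8, 2  # flops/clock per the slide notes
peak_gflops = ghz * cores * flops_per_clock
print(f"Peak per server: {peak_gflops:.0f} GFLOPS")  # 32 GFLOPS
```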
Visualization Pipeline
vl3: Hardware-Accelerated Volume Rendering Library
4096³ Volume Rendered on 65 Nodes of Eureka
Enzo Reader can Load from Native HDF5 Format: Uniform Grid and AMR, Resampled to a Uniform Grid
Run Interactively on a Subset of Data Locally: On a Local Workstation, 512³ Subvolume
Batch for Generating Animations on Eureka
Working Toward Remote Display and Control
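As a sketch of the “local workstation, 512³ subvolume” step, the snippet below reads a subvolume from a uniform-grid HDF5 field with h5py; the file path and dataset name are illustrative assumptions, not the actual Enzo reader or vl3 API:

```python
# Minimal sketch: pull a 512^3 subvolume out of a larger uniform-grid
# HDF5 field for local, interactive inspection. "snapshot.h5" and
# "Density" are placeholder names.
import h5py

with h5py.File("snapshot.h5", "r") as f:
    sub = f["Density"][:512, :512, :512]  # read only the requested subvolume

print(sub.shape, sub.dtype, f"{sub.nbytes / 2**20:.0f} MiB in memory")
```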
vl3 Rendering Performance on Eureka
Image Size: 4096 x 4096; Number of Samples: 4096
Note the Data I/O Bottleneck

Data Size      Processors / Graphics Cards   Load Time       Render/Composite Time
2048³          17                            2 min 27 sec     9.22 sec
4096³          129                           5 min 10 sec     4.51 sec
6400³ (AMR)    129                           4 min 17 sec    13.42 sec
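To see why load time dominates, the implied aggregate read bandwidth can be estimated from the uniform-grid rows of the table, assuming 4 bytes per cell (single precision), which the slide does not state:

```python
# Effective read bandwidth implied by the load times above,
# assuming a single-precision (4 bytes/cell) scalar field.
cases = {
    "2048^3": (2048**3, 2 * 60 + 27),   # cells, load time in seconds
    "4096^3": (4096**3, 5 * 60 + 10),
}
for name, (cells, load_s) in cases.items():
    gbytes = cells * 4 / 1e9
    print(f"{name}: {gbytes:.0f} GB in {load_s} s -> ~{gbytes / load_s:.1f} GB/s")
# 2048^3: ~34 GB in 147 s -> ~0.2 GB/s; 4096^3: ~275 GB in 310 s -> ~0.9 GB/s
```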
Next Experiments
SC09: Stream a 4K x 2K Movie from an ANL Storage Device to an OptIPortable on the Show Floor
Mike Norman is a 2009 INCITE Investigator: 6M SU on Jaguar, Supersonic MHD Turbulence Simulations for Star Formation
Use a Similar Data Path for This to Show Replicability
Can DOE Make This New Mode Available to Other Users?

Editor's Notes

  • #13: Eureka is the visualization cluster at ALCF. Each node has 2 graphics cards, 8 processors, 32 GB RAM, a fast interconnect, and local disk. Server FLOPS = 2.0 GHz * 8 cores * 2 flops per clock = 32 GFLOPS.
  • #15: One of vl3’s strengths is its speed and its ability to handle large data sets. The number of processors is a power of 2 for rendering plus 1 for compositing; with 2 graphics cards per node, that is half as many nodes as processors listed here. Data I/O is clearly the bottleneck. When doing an animation of a single time step, the data is only loaded once, so it can be fairly quick.