LEADER IN CLOUD CLUSTER COMPUTING 
November 2014 
Cycle Computing believes access to cloud cluster computing accelerates discovery & invention
HGST, A Western Digital Company 
Transforming design of drives that hold the world’s data 
• The Science
– The Problem: 30 days to finish the run in-house, stopping other work
– Engineering advanced drive heads by doing 1 Million simulations of potential designs
– 1 Million simulations = Sweep of 22 design parameters on three different media types.
• The Business
“At every step, we are innovating with purpose and pace to exceed the expectations of our customers”
– Mike Cordano, President
“Gojira” Run – Facts and Figures
World’s Largest Fortune 500 cloud run
Metric | Count
Compute Hours of Work | 619,748 hours
Compute Years of Work | 70.75 years
Design Count | ~1 Million drive head designs
Run Time | 8 Hours, not 30 days in-house
Application Used | MRM/MatLab, CycleCloud, Chef
Max Scale (cores) | 70,908 AWS cores, 3 regions
Max Scale (instances) | 5,689 Spot Instances at peak
Computing power | 729 TeraFLOPS rPeak, more than #63 on Top 500’s rPeak
Infrastructure costs | AWS Spot Instances: $5,594
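The figures in the table above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the variable names are made up for this example, a year is assumed to be 365 days, and the derived per-unit values (cost per compute hour, GFLOPS per core, cores per instance) are simple ratios of the slide's numbers, not figures stated in the deck.

```python
# Figures taken from the "Facts and Figures" slide.
compute_hours = 619_748      # compute hours of work
peak_cores = 70_908          # max scale, cores
peak_instances = 5_689       # max scale, Spot Instances
peak_tflops = 729            # rPeak in TeraFLOPS
spot_cost_usd = 5_594        # AWS Spot infrastructure cost

HOURS_PER_YEAR = 24 * 365    # assumption: 365-day year

# Compute years of work, as reported on the slide.
compute_years = compute_hours / HOURS_PER_YEAR

# Derived ratios (not stated in the deck).
cost_per_compute_hour = spot_cost_usd / compute_hours
gflops_per_core = peak_tflops * 1_000 / peak_cores
cores_per_instance = peak_cores / peak_instances

print(f"Compute years:         {compute_years:.2f}")
print(f"Cost per compute hour: ${cost_per_compute_hour:.4f}")
print(f"Peak GFLOPS per core:  {gflops_per_core:.2f}")
print(f"Cores per instance:    {cores_per_instance:.1f}")
```

The first line of output reproduces the slide's 70.75 compute years; the remaining values are back-of-the-envelope ratios and should be read as rough averages.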
The value of Timing
Technical computing is the New Enterprise Workload
Technology Timing | Significance
Told about this workload on Wednesday, ran by the weekend | Our software and cloud enable fast turnaround work, at scale
0 to 50,000 cores in 23 minutes | Can tackle problems at a scale that is 100x bigger than in-house, in minutes
8 hours, instead of 30 days | 90x throughput, faster business result
729 TeraFLOPs cluster in 60 minutes | AWS Spot enabled access for $5,593.94
All IvyBridge processors | Moore’s Law helps HGST
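The "90x throughput" figure follows directly from the two run times on this slide. The short sketch below works it out, along with the average ramp-up rate implied by "0 to 50,000 cores in 23 minutes". The variable names are illustrative, and the ramp rate is a simple average that assumes nothing about how cores were actually acquired over time.

```python
# Run times from the slide.
in_house_days = 30
cloud_hours = 8

# Convert the in-house run to hours and compare.
in_house_hours = in_house_days * 24
speedup = in_house_hours / cloud_hours

# Average ramp-up rate implied by "0 to 50,000 cores in 23 minutes".
ramp_cores = 50_000
ramp_minutes = 23
avg_cores_per_minute = ramp_cores / ramp_minutes

print(f"In-house run:  {in_house_hours} hours")
print(f"Speedup:       {speedup:.0f}x")
print(f"Average ramp:  {avg_cores_per_minute:.0f} cores/minute")
```

Thirty days is 720 hours, and 720 / 8 gives the 90x quoted on the slide.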
What’s different about this run?
• New Enterprise Scale: World’s Largest Cloud run for an F500, R&D now asks the right question, there are no scale limits
• New Industry: Leader in Manufacturing, reflects broad enterprise adoption of Cloud Cluster Computing
• New Agility: CycleCloud acquired and vetted 50,000 cores in 23 minutes, and controlled all regions from one instance of the software
• New Processor: Had 50% more FLOPS per Ivy Bridge core than the year-ago total from the MegaRun
70,900 cores, 728.95 TeraFLOPS
0 to 50,000 cores in 20 min
The whole workload ran in 8 hours

Cycle Cloud 70,000 Core AWS Cluster for HGST
