University of Tsukuba | Center for Computational Sciences
https://www.ccs.tsukuba.ac.jp/
Specification of Cygnus
• Peak performance: 2.4 PFLOPS DP (GPU: 2.2 PFLOPS, CPU: 0.2 PFLOPS; FPGA: 0.6 PFLOPS SP)
  ⇨ enhanced by mixed-precision and variable-precision arithmetic on the FPGAs
• Number of nodes: 81 (32 Albireo (GPU+FPGA) nodes, 49 Deneb (GPU-only) nodes)
• Memory: 192 GiB DDR4-2666 per node (256 GB/s); 32 GiB x 4 for the GPUs per node (3.6 TB/s)
• CPU / node: Intel Xeon Gold (Skylake) x2 sockets
• GPU / node: NVIDIA V100 x4 (PCIe)
• FPGA / node: Intel Stratix 10 x2 (each with 100 Gbps x4 links per FPGA, x8 links per node)
• Global file system: Lustre, RAID6, 2.5 PB
• Interconnection network: Mellanox InfiniBand HDR100 x4 (two HDR200 cables per node), 4 TB/s aggregated bandwidth
• Programming languages: CPU: C, C++, Fortran, OpenMP; GPU: OpenACC, CUDA; FPGA: OpenCL, Verilog HDL
• System vendor: NEC
Specification
• Compute nodes (NEC LX 102Bk-6) x 150
• Login nodes (NEC LX 124Rk-2) x 3
• 2 x Intel Xeon (Sapphire Rapids)
• 256 GiB DDR5 Memory, NVMe SSD, InfiniBand x 2, 100GbE
• Parallel File System (DDN ES200NV/ES7990X/SS9012)
• DDN EXAScaler (Lustre)
• MDS/MDT (4.2 billion inodes)
• Active/Standby MDS
• 1.92 TB NVMe SSD x 11 (8D + 2P + 1HS)
• InfiniBand HDR100 x 4
• OSS/OST (7.1 PB available)
• 4 x Active/Active OSSs
• 18 TB 7,200 rpm NL-SAS x 534
((33 drives x 8 pools + 3 HS) x 2)
• 8D + 2P Declustered RAID
• InfiniBand HDR100 x 8
CHFS/Cache – Caching File System
for Node-Local Persistent Memory / Storage
• Designed on top of CHFS without degrading metadata or bandwidth
performance
• Reasonably relaxed consistency with the PFS
• Users' assumptions about the file system do not change
• Demonstrates bandwidth, metadata performance, latency, and
scalability nearly identical to CHFS without caching
Osamu Tatebe, Hiroki Ohtsuji, "Caching Support for CHFS Node-local Persistent Memory File System",
Proceedings of the 3rd Workshop on Extreme-Scale Storage and Analysis (ESSA 2022), pp. 1103-1110,
doi:10.1109/IPDPSW55747.2022.00182, 2022
Please visit our website and YouTube!
The CCS promotes "multidisciplinary computational science" based on the fusion of
computational science and computer science. To this end, the CCS develops high-performance
computing systems through "co-design". Its research areas cover particle physics,
astrophysics, nuclear physics, nano-science, life science, environmental science, and information
science.
The CCS was reorganized in April 2004 from its predecessor, the Center for Computational
Physics, established in 1992. The CCS is a research institute for the fields above and also a
joint-use facility open to outside researchers. Since 2010, the CCS has been approved by the
Ministry of Education, Culture, Sports, Science and Technology (MEXT) as a national core center
under the Advanced Interdisciplinary Computational Science Collaboration Initiative (AISCI).
The CCS aims to play a significant role in the development of multidisciplinary computational science.
Mission of CCS
Cygnus: Multi-Hybrid Accelerated Computing Platform
Pegasus: Big memory supercomputer
Combining the strengths of different types of
accelerators: GPU + FPGA
・GPU remains an essential accelerator for workloads
with simple, large-scale parallelism, providing ~10 TFLOPS
of peak performance
・FPGA is a new type of accelerator offering
application-specific hardware with programmability,
gaining speed from pipelined computation
・FPGA also excels at external communication between
accelerators, with advanced high-speed interconnects of up to
100 Gbps x4
• Intel Xeon Sapphire Rapids CPUs, NVIDIA H100 Tensor Core
GPUs (PCIe, 51 TFLOPS), and 2 TiB of persistent memory
per node strongly drive Big Data and AI
• The world's first system with NVIDIA H100 PCIe GPUs
connected via PCIe Gen5
• The first system announced in Japan to utilize NVIDIA
Quantum-2 InfiniBand networking
FPGA design plan
・Router
- Mandatory for the dedicated inter-FPGA network
- Forwards packets to their destinations
・User Logic
- The OpenCL kernel runs here
- Inter-FPGA communication can be controlled from the
OpenCL kernel
・SL3
- Serial Lite III: an Intel FPGA IP
- Includes the transceiver modules for inter-FPGA
data transfer
- Users do not need to manage this layer
• System name: Pegasus
• Total performance: 8.1 PFLOPS
• Total memory size: 319 TiB (19 TiB DDR5 + 300 TiB persistent memory)
• Number of nodes: 150
• Interconnects: full-bisection fat-tree network interconnected by the NVIDIA Quantum-2 InfiniBand platform
• Parallel file system: 7.1 PB DDN EXAScaler (40 GB/s)
PCCC24 (24th PC Cluster Symposium): University of Tsukuba, Center for Computational Sciences, Theme 2: "Supercomputers Cygnus / Pegasus"