University of Tsukuba | Center for Computational Sciences
https://www.ccs.tsukuba.ac.jp/
Specification of Cygnus
• Peak performance: 2.4 PFLOPS DP (GPU: 2.2 PFLOPS, CPU: 0.2 PFLOPS; FPGA: 0.6 PFLOPS SP)
  ⇨ enhanced by mixed-precision and variable-precision arithmetic on the FPGAs
• Number of nodes: 81 (32 Albireo (GPU+FPGA) nodes, 49 Deneb (GPU-only) nodes)
• Memory: 192 GiB DDR4-2666 per node (256 GB/s); 32 GiB x 4 for the GPUs per node (3.6 TB/s)
• CPU / node: Intel Xeon Gold (Skylake) x2 sockets
• GPU / node: NVIDIA V100 x4 (PCIe)
• FPGA / node: Intel Stratix 10 x2 (each with 100 Gbps x4 links per FPGA, x8 links per node)
• Global file system: Lustre, RAID6, 2.5 PB
• Interconnection network: Mellanox InfiniBand HDR100 x4 (two HDR200 cables per node), 4 TB/s aggregated bandwidth
• Programming languages: CPU: C, C++, Fortran, OpenMP; GPU: OpenACC, CUDA; FPGA: OpenCL, Verilog HDL
• System vendor: NEC
Specification
• Compute nodes (NEC LX 102Bk-6) x 150
• Login nodes (NEC LX 124Rk-2) x 3
• 2 x Intel Xeon (Sapphire Rapids)
• 256 GiB DDR5 Memory, NVMe SSD, InfiniBand x 2, 100GbE
• Parallel File System (DDN ES200NV/ES7990X/SS9012)
• DDN EXAScaler (Lustre)
• MDS/MDT (4.2 billion inodes)
• Active/Standby MDS
• 1.92 TB NVMe SSD x 11 (8D + 2P + 1HS)
• InfiniBand HDR100 x 4
• OSS/OST (7.1 PB available)
• 4 x Active/Active OSSs
• 18 TB 7,200 rpm NL-SAS x 534
((33 drives x 8 pools + 3 HS) x 2)
• 8D + 2P Declustered RAID
• InfiniBand HDR100 x 8
CHFS/Cache – Caching File System
for Node-Local Persistent Memory / Storage
• Designed on top of CHFS without degrading metadata or bandwidth
performance
• Reasonably relaxed consistency with the PFS
• Users' assumptions about the file system do not change
• Demonstrates bandwidth, metadata performance, latency, and
scalability nearly identical to CHFS without caching
Osamu Tatebe, Hiroki Ohtsuji, "Caching Support for CHFS Node-local Persistent Memory File System",
Proceedings of the 3rd Workshop on Extreme-Scale Storage and Analysis (ESSA 2022), pp. 1103-1110,
doi:10.1109/IPDPSW55747.2022.00182, 2022
Please visit our website and YouTube!
The CCS promotes "multidisciplinary computational science" based on the fusion of
computational science and computer science. To this end, the CCS develops high-performance
computing systems through "co-design". Its research areas cover particle physics,
astrophysics, nuclear physics, nano-science, life science, environmental science, and information
science.
The CCS was reorganized in April 2004 from its predecessor, the Center for Computational
Physics, established in 1992. The CCS is a research institute for the fields above and also a
joint-use facility open to outside researchers. Since 2010, the CCS has been approved by the
Ministry of Education, Culture, Sports, Science and Technology (MEXT) as a national core center
under the Advanced Interdisciplinary Computational Science Collaboration Initiative (AISCI).
The CCS aims to play a significant role in the development of multidisciplinary computational science.
Mission of CCS
Cygnus: Multi-Hybrid Accelerated Computing Platform
Pegasus: Big memory supercomputer
Combining the strengths of different types of
accelerators: GPU + FPGA
・GPU remains an essential accelerator for workloads
with simple, large-scale parallelism, providing ~10 TFLOPS
of peak performance
・FPGA is a new type of accelerator offering
application-specific hardware with programmability,
gaining speed from pipelined computation
・FPGA also excels at external communication between
accelerators, with advanced high-speed interconnects of up to
100 Gbps x4
• Intel Xeon Sapphire Rapids CPUs, NVIDIA H100 Tensor Core
GPUs (PCIe, 51 TFLOPS), and 2 TiB of persistent memory
per node strongly drive Big Data and AI
• The world's first system with NVIDIA H100 PCIe GPUs
connected via PCIe Gen5
• The first system announced in Japan to utilize NVIDIA
Quantum-2 InfiniBand networking
FPGA design plan
・Router
- Mandatory for the dedicated inter-FPGA network
- Forwards packets to their destinations
・User Logic
- The OpenCL kernel runs here
- Inter-FPGA communication can be controlled from the
OpenCL kernel
・SL3
- Serial Lite III: an Intel FPGA IP
- Includes the transceiver modules for inter-FPGA
data transfer
- Users do not need to manage this layer
• System name: Pegasus
• Total performance: 8.1 PFLOPS
• Total memory size: 319 TiB (19 TiB DDR5 + 300 TiB persistent memory)
• Number of nodes: 150
• Interconnects: full-bisection fat-tree network interconnected by the NVIDIA Quantum-2 InfiniBand platform
• Parallel file system: 7.1 PB DDN EXAScaler (40 GB/s)
PCCC24 (24th PC Cluster Symposium): University of Tsukuba, Center for Computational Sciences, Theme 2: "Supercomputers Cygnus / Pegasus"