SK hynix CXL Disaggregated Memory Solution
Jungmin Choi, Memory System Architect, SK hynix
Agenda
• Motivation
• Growing Memory Bandwidth and Capacity Gap
• Challenges in Today’s Datacenter
• Solution
• Niagara: CXL Disaggregated Memory Prototype
• Use Cases of CXL Disaggregated Memory
• Other HW-assisted Features of Niagara
• Next Step
Growing Memory Bandwidth and Capacity Gap
• Increase in core counts requires continued increase in memory bandwidth & capacity
• The gap between such requirements and platform provisioning capability is growing
• CXL creates new opportunities beyond physical limitations, and efficient memory disaggregation is possible
[Memory Bandwidth Requirement] [Memory Capacity Requirement]
Challenges in Today’s Datacenter
• Challenge 1 : Memory stranding & data spill
• The memory utilization of each node in a compute cluster varies over time
• Unused memory in one node cannot be used by other nodes, which causes memory stranding on underloaded nodes and data spill on overloaded ones
[Memory Stranding] VMs 1-3 use all cores while part of the node's memory is stranded: memory underutilization & waste of memory costs
[Data Spill] VMs 1-3 need more memory than the node has, so data is spilled: storage swap & performance degradation
Two sides of a coin
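The two effects can be put into numbers with a toy model. The sketch below is illustrative only: the per-node demand values, the 64 GB local size, and the function names are assumptions, not figures from the slides. It counts stranded and spilled memory per node, then compares a fixed per-node CXL expansion with a shared pool of the same total size.

```python
def stranded_and_spilled(demand_gb, per_node_gb):
    """Memory a node cannot lend is stranded; demand it cannot hold is spilled."""
    stranded = sum(max(per_node_gb - d, 0) for d in demand_gb)
    spilled = sum(max(d - per_node_gb, 0) for d in demand_gb)
    return stranded, spilled


def spilled_with_pool(demand_gb, local_gb, pool_gb):
    """Overflow from every node is served from one shared pool before it spills."""
    overflow = sum(max(d - local_gb, 0) for d in demand_gb)
    return max(overflow - pool_gb, 0)


demand = [40, 100, 30, 90]  # hypothetical per-node demand in GB
local = 64                  # hypothetical local DRAM per node in GB

no_cxl = stranded_and_spilled(demand, local)                # (58, 62)
static_cxl_spill = stranded_and_spilled(demand, local + 16)[1]  # 16 GB fixed per node
pooled_cxl_spill = spilled_with_pool(demand, local, 64)     # same 64 GB, but pooled

print(no_cxl, static_cxl_spill, pooled_cxl_spill)  # (58, 62) 30 0
```

With these made-up numbers, the same 64 GB of extra memory leaves 30 GB spilled when split statically and nothing spilled when pooled, which is the argument the slide makes qualitatively.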
Challenges in Today’s Datacenter
• Challenge 2 : Data transfer overhead & data duplication
• In a distributed computing system, there is a network-based data transfer overhead between remote nodes
• Duplication of shared data between nodes increases local memory pressure
[Figure: four nodes, each with its own App and DRAM, connected by a network; labels: Serialization & Deserialization, Data Duplication]
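As a small illustration of where the overhead and the duplication come from, the sketch below round-trips an object through Python's `pickle` module, standing in for what a distributed framework does before and after a network transfer. The payload is a made-up example and no real network is involved.

```python
import pickle

payload = list(range(100_000))        # hypothetical shared dataset on the sender

wire_bytes = pickle.dumps(payload)    # serialization: an extra byte-string copy
received = pickle.loads(wire_bytes)   # deserialization: a full copy on the receiver

same_content = received == payload    # True: the data is equal ...
same_object = received is payload     # False: ... but it now exists twice

print(len(wire_bytes), same_content, same_object)
```

Every receiving node ends up with its own private copy in local DRAM, on top of the CPU time spent encoding and decoding.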
Solution: CXL Disaggregated Memory System
• CXL disaggregated memory system can support memory pooling & sharing
• Memory pooling : Mitigate memory stranding and data spill by sharing memory resources between nodes
• Memory sharing : Remove data transfer overhead and data duplication by sharing data between nodes
[Memory Pooling] Nodes #1-#4 attached to CXL Disaggregated Memory: allocate CXL memory based on memory usage for each node
[Memory Sharing] Nodes #1-#4 attached to CXL Disaggregated Memory holding Shared Objects #1 to #N: share data objects based on zero-copy between nodes
Niagara: CXL Disaggregated Memory Prototype
• Niagara is a 4U FPGA-based multi-port CXL disaggregated memory prototype
• Connects up to 4 CXL host servers and supports up to 4 channels of DDR4 DIMM (1 TB)
• Supports memory pooling, sharing and other HW-assisted features
[Rack-Scale System with Niagara] Servers #1-#4 connected to one Niagara unit
[Niagara Block Diagram] labels: Nodes #1-#4, CXL RP (x4), CXL EP (x4), FPGA, CXL HW-assisted Features, Pooled Memory Manager, Memory (x4)
[Niagara Specification]
CXL Interface: CXL 2.0, Gen4 x8, up to 4 ports
Memory: 4CH DDR4 DIMM, up to 1 TB
Performance: Latency 600 ns, Bandwidth 11 GB/s
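For context on the 11 GB/s figure, the sketch below computes the raw per-direction rate of a Gen4 x8 link from the standard PCIe Gen4 parameters (16 GT/s per lane, 128b/130b encoding). The slides do not say whether 11 GB/s is per port or aggregate, nor how it was measured, so the comparison assumes it is a per-port figure and should be read as a rough sanity check only.

```python
GT_PER_S_PER_LANE = 16.0   # PCIe Gen4 signalling rate per lane
LANES = 8                  # x8 link, as listed in the specification
ENCODING = 128 / 130       # 128b/130b line encoding efficiency

# Raw link rate in GB/s, one direction, before protocol overhead.
raw_gb_per_s = GT_PER_S_PER_LANE * LANES * ENCODING / 8

measured_gb_per_s = 11.0   # from the slide; assumed here to be per port
fraction = measured_gb_per_s / raw_gb_per_s

print(f"{raw_gb_per_s:.2f} GB/s raw, {fraction:.0%} achieved")
# 15.75 GB/s raw, 70% achieved
```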
Use Cases of CXL Disaggregated Memory
• Use case 1 : Memory pooling
• Niagara supports Dynamic Capacity Service (DCS), a HW/SW integrated solution, for memory pooling
• DCS can dynamically allocate/deallocate disaggregated memory resources for each node without RESET
• Improves memory utilization and performance of a system equipped with CXL disaggregated memory
[Figure: DCS software stack] labels: Cluster Management Framework, VM Scheduler, Host VM Manager, OS/Hypervisor, Virtual Machine, Memory Mgmt. Subsystem, Memory Pooling Daemon, Local DRAM Zone (Static Size), CXL Memory Zone (Resizable); Nodes #1-#4, each running VMs on Local DRAM and CXL Memory
[Figure: Niagara Prototype] pool regions: Node 1, Node 2, Node 3, Node 4, Not Used, with Resizing of the Allocated regions; labels: Dynamic Allocation Engine, *PMM, Memory Section Table, Secure Eraser, Mailbox and **MPU #1-#4 (one per port), CXL IP (x4) carrying CXL.io and CXL.mem, Memory (x4)
* PMM : Pooled Memory Manager
** MPU : Memory Protection Unit
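The idea of resizing each node's share at run time can be sketched as a table that maps fixed-size sections of the pool to an owning host. The slides name a Memory Section Table, per-port MPUs, and a Secure Eraser but do not describe how they work, so the class below, its method names, and the roles it assigns to those blocks are assumptions made for illustration, not SK hynix's design.

```python
class PooledMemoryManager:
    """Toy model: a section table maps fixed-size pool sections to owning hosts."""

    def __init__(self, num_sections):
        self.table = [None] * num_sections  # None means "Not Used"

    def allocate(self, host, count):
        """Assign `count` free sections to `host` while the system keeps running."""
        free = [i for i, owner in enumerate(self.table) if owner is None]
        if len(free) < count:
            raise MemoryError("pool exhausted")
        for i in free[:count]:
            self.table[i] = host
        return free[:count]

    def release(self, host, count):
        """Return the last `count` sections owned by `host` to the pool."""
        owned = [i for i, owner in enumerate(self.table) if owner == host]
        to_free = owned[-count:] if count > 0 else []
        for i in to_free:
            # Assumption: a real device would scrub the section here before reuse.
            self.table[i] = None
        return to_free

    def may_access(self, host, section):
        """Assumed role of a per-port protection unit: only the owner gets through."""
        return self.table[section] == host


pmm = PooledMemoryManager(num_sections=8)
pmm.allocate("node1", 3)           # node1 gets sections 0, 1, 2
pmm.allocate("node2", 2)           # node2 gets sections 3, 4
pmm.release("node1", 2)            # node1 shrinks to section 0
grown = pmm.allocate("node2", 4)   # node2 grows into the freed sections
print(grown, pmm.may_access("node1", 1), pmm.may_access("node2", 1))
```

Growing and shrinking are plain table updates in this model, which is the property the slide summarizes as allocation and deallocation without RESET.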
Use Cases of CXL Disaggregated Memory
• Evaluation results of memory pooling
• Reduce execution time by mitigating data spill and improve memory utilization of a system
[Execution Time of CloudSuite Benchmark] axes: Normalized Execution Time vs Baseline [%], Workload Spilled to Medium [%]; series: No Spill (Baseline), Spill to Niagara, Spill to SSD; annotation: 2.5x improvement
Spill to Niagara outperforms NVMe by up to 2.5x (even though Niagara is an FPGA-based prototype)
[Memory Utilization on Kubernetes]
Utilization improved by 35% by applying DCS to Kubernetes
[ Collaboration with MemVerge ]
                          Baseline Kubernetes       DCS-enabled Kubernetes
                          Worker #1    Worker #2    Worker #1    Worker #2
# CPU Cores               112          176          112          176
Local Memory Capacity     62 GB        62 GB        62 GB        62 GB
CXL Memory Capacity       32 GB        32 GB        64 GB (Pooled)
Total Memory Capacity     94 GB        94 GB        62-126 GB    62-126 GB
Avg. Memory Utilization   55.7%                     75.1%
35% improvement
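The table's derived figures can be checked with a few lines of arithmetic. The sketch below uses only numbers from the table; the variable names are chosen here. It shows that the 35% is a relative gain (75.1 / 55.7), not a difference in percentage points.

```python
local_gb = 62
static_cxl_gb = 32    # per worker in the baseline cluster
pooled_cxl_gb = 64    # shared by both workers with DCS

baseline_total = local_gb + static_cxl_gb             # 94 GB per worker
dcs_total_range = (local_gb, local_gb + pooled_cxl_gb)  # 62 to 126 GB per worker

baseline_util = 55.7
dcs_util = 75.1
relative_gain = dcs_util / baseline_util - 1   # relative improvement
point_gain = dcs_util - baseline_util          # absolute difference

print(baseline_total, dcs_total_range)                  # 94 (62, 126)
print(f"{relative_gain:.1%} relative, {point_gain:.1f} points")
# 34.8% relative, 19.4 points
```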
Use Cases of CXL Disaggregated Memory
• Use case 2 : Memory sharing
• Niagara supports firmware that allows all hosts to access shared data objects in Niagara memory
• No more object serialization and transfer over network for remote object access
• No more duplicate object copies on different nodes → Zero copy
[Figure: Nodes #1-#4, each through a CXL RP and CXL EP, access one Shared Object managed by the Pooled Memory Manager; Node #1: Data Write, Nodes #2-#4: Data Read]
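A single-host analogy for this access pattern is Python's `multiprocessing.shared_memory`: a writer places bytes in a named segment and a reader attaches to the same segment by name instead of receiving a serialized copy. This is only an analogy. It runs on one machine, the segment stands in for Niagara memory, and it does not model the cross-host coherence or firmware that real CXL sharing requires.

```python
from multiprocessing import shared_memory

# Writer: place an object in a named shared segment (stands in for pool memory).
writer = shared_memory.SharedMemory(create=True, size=16)
writer.buf[:5] = b"hello"

# Reader: attach to the same segment by name; nothing is serialized or sent.
reader = shared_memory.SharedMemory(name=writer.name)
seen_by_reader = bytes(reader.buf[:5])     # bytes() copies only for inspection

# A later write is visible through the reader's mapping without any transfer.
writer.buf[0:1] = b"j"
seen_after_update = bytes(reader.buf[:5])

print(seen_by_reader, seen_after_update)   # b'hello' b'jello'

# Clean up: detach both handles, then remove the segment.
reader.close()
writer.close()
writer.unlink()
```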
Use Cases of CXL Disaggregated Memory
• Evaluation results of memory sharing
• Reduce execution time by eliminating network-based data transfer overhead
[Execution Time of *Ray Shuffle Benchmark] axes: Partitions, End-to-End Execution Time [sec], Latency [sec]; series include Ray with Shared Memory; annotation: 5.9x improvement
Ray with Niagara outperforms native Ray by up to 5.9x
[Execution Time of Spark Benchmark] Query 1, Query 2, Query 3; axis: End-to-End Execution Time [sec]; series include Merge Hash Join; annotation: 1.8x improvement
Merge Hash Join outperforms Sort Merge Join by up to 1.8x
*Ray is an open-source distributed computing framework for AI/ML
[ Collaboration with MemVerge ]
Other HW-assisted Features of Niagara
• Niagara supports HW-assisted features to improve CXL disaggregated memory efficiency
• Block Data Management : Copy or move data within CXL pooled memory
• Snapshot : Save and restore data in CXL pooled memory to/from a storage device
• Memory Failure Prediction : Predict memory Uncorrectable Errors (UEs) by analyzing memory failure patterns
[Block Data Management] Nodes #1 to #N attached to CXL Disaggregated Memory; the Pooled Memory Manager performs Data Copy/Move inside the pool
[Snapshot] Nodes #1 to #N attached to CXL Disaggregated Memory; the Pooled Memory Manager performs Data Save/Restore between the pool and Storage
[Memory Failure Prediction] Nodes #1 to #N attached to CXL Disaggregated Memory; the Pooled Memory Manager runs UE prediction and notifies the nodes of a predicted UE
Next Step
• Niagara 2.0 will be available by the end of 2023
• Niagara 2.0 is a 2U CXL disaggregated memory prototype which can connect up to 8 CXL host servers
• Support DCD (Dynamic Capacity Device) feature defined in CXL specification 3.1
• We look forward to an open collaboration with industry partners to enable HW/SW ecosystem
[Niagara 2.0 Prototype]
[Server Cluster with Niagara 2.0] labels: Servers, Cloud Orchestrator, Pooled Memory Manager, DCD Engine, CXL Disaggregated Memory, Niagara 2.0
Niagara Live Demo
• Demonstrate the allocation/deallocation of memory space in CXL disaggregated memory based on dynamic changes in memory requirements of VM servers
• Check out SK hynix booth (#A8) for a live demo
[Niagara Live Demo System] Servers #1-#4 connected to Niagara
[Demo GUI]
Thank you!
