The Next Frontier in AI Networking
The rapid arrival of real-time gaming, virtual reality and metaverse applications is changing the
way networks, compute, memory and interconnect I/O interact for the next decade. As metaverse
applications evolve, the network needs to adapt to roughly ten times the growth in
traffic, connecting hundreds of processors with trillions of transactions and gigabits of throughput. AI
is becoming more meaningful as distributed applications push the envelope of predictable scale
and performance of the network. A common characteristic of these AI workloads is that they
are both data- and compute-intensive. A typical AI workload involves a large sparse matrix
computation, distributed across tens or hundreds of processors (CPUs, GPUs, TPUs, etc.), with intense
computation for a period of time. Once data from all peers is received, it is reduced or
merged with the local data, and then another cycle of processing begins.
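As a rough illustration, this reduce-and-merge cycle maps to a collective all-reduce across the workers. The sketch below uses mpi4py and NumPy (assumed to be installed; the random gradient is a stand-in for the real sparse-matrix work and the learning rate is an arbitrary choice), and is only meant to show where the compute-heavy and network-heavy phases fall in each cycle:

```python
# Minimal sketch of the compute / all-reduce cycle described above.
# Assumes mpi4py and NumPy; run with e.g. `mpirun -n 8 python train_step.py`.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

params = np.zeros(1_000_000, dtype=np.float64)   # replicated model shard on every worker

for step in range(10):
    # 1. Local, compute-intensive phase (stand-in for the sparse matrix computation).
    local_grad = np.random.default_rng(rank + step).standard_normal(params.shape)

    # 2. Network-intensive phase: sum the gradients from every peer. Under the hood this
    #    generates the bursty, many-to-many traffic the fabric has to absorb.
    global_grad = np.empty_like(local_grad)
    comm.Allreduce(local_grad, global_grad, op=MPI.SUM)

    # 3. Merge the reduced data locally, then the next cycle begins.
    params -= 0.01 * (global_grad / size)
```

The synchronous exchange in step 2 is what makes the network the gating factor: every worker waits for the slowest transfer before the next compute phase can start.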
Historically, the only option to connect processor cores and memory has been proprietary
interconnects such as InfiniBand and other protocols that connect compute clusters with
offloads. However, the scale and richness of Ethernet radically change the paradigm, with
better distributed options now available. Let's take a look at the alternatives.
1. Ethernet NICs and Switches
Smart or high-performance NICs often interconnect a sea of processor cores in a network
design. This is an emerging trend in which the Network Interface Controller (NIC) not only
provides network connectivity but also drives server offloads. The traditional design philosophy is to
leverage general-purpose GPU or DPU cores and interconnect memory and processors at the right
price/performance, using accelerators such as RDMA (Remote Direct Memory
Access). DMA allows the NIC to access memory directly, without involving the CPU; RDMA extends
this across the network. Today's NICs connect to 10/100/200G Ethernet switches that complement them with a
programmable framework, often based on P4, such as the Arista 7170 series, as well as the 7050
series for expanded memory and feature coverage.
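To make the one-sided nature of RDMA concrete, the following is a toy, in-process model of an RDMA write, not real libibverbs code: the initiator writes straight into a memory region the target has registered, and no application code runs on the target to receive the data. All class and method names here are illustrative assumptions.

```python
# Toy model of one-sided RDMA write semantics (illustrative only; not real verbs code).
import numpy as np

class MemoryRegion:
    """Stand-in for a registered memory region: a buffer plus a key the remote side must present."""
    def __init__(self, size: int, rkey: int):
        self.buf = np.zeros(size, dtype=np.uint8)
        self.rkey = rkey

class ToyRnic:
    """Stand-in for an RDMA-capable NIC: it owns registered regions and serves remote writes."""
    def __init__(self):
        self.regions = {}

    def register(self, size: int, rkey: int) -> MemoryRegion:
        mr = MemoryRegion(size, rkey)
        self.regions[rkey] = mr
        return mr

    def remote_write(self, rkey: int, offset: int, data: bytes) -> None:
        # The "NIC" places the bytes directly into the registered buffer.
        # No application code on the target is invoked -- that is the property
        # that removes the target CPU from the data path.
        mr = self.regions[rkey]
        mr.buf[offset:offset + len(data)] = np.frombuffer(data, dtype=np.uint8)

# Target registers a 4 KiB region and shares its key and address out of band.
target_nic = ToyRnic()
mr = target_nic.register(size=4096, rkey=0x1234)

# Initiator performs a one-sided write into the target's memory.
target_nic.remote_write(rkey=0x1234, offset=0, data=b"gradient shard")
print(bytes(mr.buf[:14]))   # b'gradient shard' -- landed without a recv() on the target
```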
2. InfiniBand
InfiniBand-based switches and HBAs (Host Bus Adapters) combine general-purpose DPUs and
GPUs to deliver consistent performance and can use RDMA offloads. Typical IB networks are
vendor-specific, closed systems used in high-performance computing (HPC). Responder
throughput is limited by the InfiniBand NIC and the PCI bus. InfiniBand's low software dependency
decreases latency relative to TCP/UDP. However, smarter, improved
Ethernet switches and NICs also adopt non-TCP methods such as RDMA over Converged Ethernet (RoCE),
so the delta between IB and Ethernet is narrowing. Historically, InfiniBand was deployed in large
supercomputer clusters, but its high scale-out cost and proprietary nature bring poor
interoperability and limitations for AI and compute-intensive applications.
3. Ethernet-based Spine Fabric
The appetite for lower transfer latency is insatiable, and Ethernet is growing as the preferred fabric
between these processors. AI processing grows exponentially for self-driving cars, interactive
and autonomous gaming, and virtual reality, mandating a scalable and reliable network.
Small packets in large flows make the Arista 7800 with EOS the ideal frontrunner
combination. Designed with a cell-based VOQ (Virtual Output Queuing) and deep-buffer
architecture, the Arista 7800 is a flagship platform delivering high-radix 100/200/400/800G
throughput across all ports, with efficient packet spraying and congestion-control techniques such
as ECN (Explicit Congestion Notification) and PFC (Priority Flow Control).
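As a rough illustration of how ECN-style signaling works (a simplified RED-like model, not how EOS implements it; the thresholds and back-off factor are assumed values), a switch marks rather than drops packets as its queue builds, and the sender reduces its window when the receiver echoes the mark:

```python
# Simplified ECN marking model (illustrative thresholds, not actual switch behavior).
import random

K_MIN, K_MAX = 20, 80        # queue-depth thresholds in packets (assumed values)

def ecn_mark(queue_depth: int) -> bool:
    """Return True if the packet should carry a Congestion Experienced (CE) mark."""
    if queue_depth <= K_MIN:
        return False                       # shallow queue: no signal
    if queue_depth >= K_MAX:
        return True                        # deep queue: always mark
    # In between, mark with a probability that grows with the backlog, giving
    # senders an early, graded signal instead of tail drops.
    p = (queue_depth - K_MIN) / (K_MAX - K_MIN)
    return random.random() < p

# A sender reacting to echoed CE marks (DCTCP-like idea, greatly simplified):
cwnd = 100.0
for depth in [10, 30, 60, 90]:
    if ecn_mark(depth):
        cwnd *= 0.8                        # back off before the buffer overflows
print(f"window after marks: {cwnd:.1f}")
```

PFC works differently: instead of marking packets, it pauses the upstream sender on a per-priority basis before the buffer overflows, preventing loss at the cost of potential head-of-line blocking.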
This new AI Spine delivers a balanced combination of low power, high performance, low latency and
reliability. The combination of high radix and 400G/800G/terabit Ethernet throughput
based on open standards is a winner! The future of AI applications requires more scale, state
and flows in switches, while maintaining simple, standards-based compute for rack-automated
networks. Compute-intensive AI applications need an open, mainstream Ethernet fabric for
improved latency, scale and availability, with predictable behavior for distributed AI processing
and applications. Welcome to Arista's data-driven network era, powered by AI spines for next-generation
cloud networking!