Quantea QAI Cluster:
Own Your Own AI Infrastructure that is Scalable, Affordable and Secure!
©QUANTEA 2024 1
Nan Liu: CEO
Email: nliu@quantea.com
LinkedIn: www.linkedin.com/in/nanliuofficial
April 4th, 2025
Two Key Takeaways
Quantea provides a viable AI infrastructure
alternative to NVIDIA clusters and cloud
solutions with 30%–50% cost savings.
On-prem AI infrastructure is cost effective.
About Me
With over 25 years of experience in IT,
networking, cybersecurity, software and data
storage, I am the CEO of Quantea, a leading
company in network packet data analytics for
AI infrastructure. My mission is
"Democratizing AI Infrastructure for All":
empowering any enterprise to own an AI
training/inferencing cluster that is
affordable, scalable, and tailored to its
needs, because AI infrastructure should be
within reach for every business ready to
harness its power.
About Quantea
Quantea started as a pioneer in network
traffic data analytics, leveraging a unique,
patented technology to deliver high-speed,
high-resolution network data traffic insights.
This core technology, backed by a granted
U.S. patent, evolved to optimize AI workload
performance monitoring and support
advanced intrusion detection systems. Today,
Quantea has grown into a leading AI
infrastructure provider, offering scalable AI
clusters designed for robust performance
and security, tailored to meet the demands of
modern AI workloads.
On-Prem Investment is Cost Effective
[Figure: total cost versus GPU consumption for cloud investment and on-prem investment; past the inflection point, on-prem is more cost effective.]
More Iteration + Data Size Drive Cost:
• Hourly charge for GPUs
• Storage cost rises due to frequency of
access, data transfer, and replication
Early Phase of AI Project:
• Dominated by Experiment
• Sporadic GPU Usage
Estimated Inflection Point (Rule of Thumb)
CLOUD AI COSTS (E.G., AWS,
GCP, AZURE)
• Renting an A100 or H100 GPU can
cost $3–$7/hour per GPU
• Running 8x GPUs 24/7 = ~$17,000–
$40,000/month
• Add data egress, storage,
orchestration, and idle time = Total
often reaches $100K+ monthly for
mid-to-large workloads
ON-PREMISE AI CLUSTER COSTS
• Capital expense for a QAI Cluster with
8x H100 GPUs: ~$500K
• Add power, cooling, support: TCO
spread over 3 years is far less than
ongoing cloud spend
• Break-even often reached in 9–15
months
Inflection point: $100,000–$150,000/month in cloud GPU spend (or about $1.2M–$1.8M/year)
At this point, organizations often begin seriously considering on-prem AI
clusters, especially for training large models, data privacy, or cost control
reasons.
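The break-even rule of thumb can be sketched as a one-line calculation. This is a minimal illustration, not Quantea's pricing model: the function name and the `onprem_monthly_opex` parameter (power, cooling, support) are assumptions introduced here, and the on-prem operating cost defaults to zero because the slide does not quantify it.

```python
def break_even_months(capex, cloud_monthly, onprem_monthly_opex=0.0):
    """Months until cumulative cloud spend equals on-prem capex plus opex.

    onprem_monthly_opex is a hypothetical parameter; the slide gives no figure for it.
    """
    monthly_saving = cloud_monthly - onprem_monthly_opex
    if monthly_saving <= 0:
        raise ValueError("cloud spend must exceed on-prem operating cost")
    return capex / monthly_saving


CAPEX = 500_000  # ~$500K for a QAI Cluster with 8x H100 GPUs (figure from the slide)

for cloud_monthly in (40_000, 100_000):  # monthly cloud spend levels quoted on the slide
    months = break_even_months(CAPEX, cloud_monthly)
    print(f"${cloud_monthly:,}/month cloud spend: break-even after {months:.1f} months")
```

With the slide's figures, $40,000/month of cloud spend pays back the $500K cluster in 12.5 months, which falls inside the quoted 9–15 month range; any non-zero on-prem operating cost lengthens the payback.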
Minimum Entry Point Price Comparison for
• 4 x Nvidia H100 GPUs, 122TB Storage
• Performance Monitoring
• Intrusion Detection System
4 x H100 GPU System or Cloud Service | Price
Quantea QAI Mini Cluster with H100 GPUs | $350,000 one-time fee
Nvidia DGX H100 (Standalone) | $707,440.99 one-time fee
Amazon Web Service with H100 Service | $38,387.80 per month
Google Cloud Platform with H100 Service | $34,373.90 per month
Nvidia DGX H100 (Standalone) notes: No such offer, only has 8 GPUs minimum. NetApp Storage - AFF C-Series: $150K, or Dell EMC Storage - PowerScale: $200K.
AWS Cloud Cost Breakdown for
• 4 x Nvidia H100 GPUs, 122TB Storage
• Performance Monitoring
• Intrusion Detection System
1. Compute Instances with GPUs:
o AWS offers the p5.48xlarge instance type, which includes 8 NVIDIA H100 GPUs.
o The on-demand pricing for this instance is approximately $98.32 per hour.
o Since you require only 4 GPUs, you would utilize half of this instance's capacity.
o Monthly Cost Calculation:
- Hourly rate for 4 GPUs: $98.32 / 2 = $49.16
- Monthly cost: $49.16 * 24 hours/day * 30 days/month = $35,452.80
2. Storage:
o Assuming the use of Amazon S3 Standard storage.
o Pricing is approximately $0.023 per GB per month.
o Monthly Cost Calculation:
- 122 TB = 122,000 GB
- Monthly cost: 122,000 GB * $0.023/GB = $2,806.00
3. Intrusion Detection System (IDS):
o AWS offers Amazon GuardDuty for threat detection.
o Pricing is based on the volume of data analyzed, with a free tier of 30 days.
o After the free tier, pricing is approximately $1.00 per 1 million events analyzed.
o Monthly Cost Calculation:
- Estimating 100 million events per month: 100 million events / 1 million = 100 units
- Monthly cost: 100 units * $1.00 = $100.00
4. Performance Monitoring:
o AWS CloudWatch provides monitoring services.
o Pricing includes charges for metrics, logs, and dashboards.
o Monthly Cost Calculation:
- Assuming 50 custom metrics: 50 metrics * $0.30/metric = $15.00
- Assuming 10 GB of log data: 10 GB * $0.50/GB = $5.00
- Assuming 3 dashboards: 3 dashboards * $3.00/dashboard = $9.00
- Total monthly cost: $15.00 + $5.00 + $9.00 = $29.00
Total Estimated Monthly Cost on AWS:
- Compute: $35,452.80
- Storage: $2,806.00
- Intrusion Detection: $100.00
- Performance Monitoring: $29.00
- Total: $38,387.80
Google Cloud Cost Breakdown for
• 4 x Nvidia H100 GPUs, 122 TB Storage
• Performance Monitoring
• Intrusion Detection System
1. Compute Instances with GPUs:
o GCP offers the a3-highgpu-8g instance type, which includes 8 NVIDIA H100 GPUs.
o Pricing for this instance is approximately $88.49 per hour.
o Since you require only 4 GPUs, you would utilize half of this instance's capacity.
o Monthly Cost Calculation:
- Hourly rate for 4 GPUs: $88.49 / 2 = $44.245
- Monthly cost: $44.245 * 24 hours/day * 30 days/month = $31,856.40
2. Storage:
o Assuming the use of Google Cloud Storage Standard.
o Pricing is approximately $0.020 per GB per month.
o Monthly Cost Calculation:
- 122 TB = 122,000 GB
- Monthly cost: 122,000 GB * $0.020/GB = $2,440.00
3. Intrusion Detection System (IDS):
o GCP offers Security Command Center Premium for threat detection.
o Pricing is based on the number of assets monitored.
o Monthly Cost Calculation:
- Assuming 100 assets: 100 assets * $0.0010/asset/hour * 24 hours/day * 30 days/month = $72.00
4. Performance Monitoring:
o GCP's Cloud Monitoring provides monitoring services.
o Pricing includes charges for metrics, dashboards, and uptime checks.
o Monthly Cost Calculation:
- Assuming 50 custom metrics: 50 metrics * $0.01/metric/month = $0.50
- Assuming 10 GB of log data: 10 GB * $0.50/GB = $5.00
- Assuming 3 dashboards: No additional cost
- Total monthly cost: $0.50 + $5.00 = $5.50
Total Estimated Monthly Cost on GCP:
- Compute: $31,856.40
- Storage: $2,440.00
- Intrusion Detection: $72.00
- Performance Monitoring: $5.50
- Total: $34,373.90
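The Google Cloud line items above can be recomputed with a few lines of Python. This is only a sketch of the slide's own arithmetic: every rate and quantity is copied from the breakdown (and is the deck's assumption, not a quoted price list), while the function and key names are made up for illustration.

```python
HOURS_PER_MONTH = 24 * 30  # the slide uses a 30-day month


def gcp_4gpu_monthly_costs():
    """Return the slide's four monthly line items in USD."""
    return {
        # half of an 8-GPU a3-highgpu-8g instance at ~$88.49/hour
        "compute": (88.49 / 2) * HOURS_PER_MONTH,
        # 122 TB treated as 122,000 GB at $0.020/GB/month
        "storage": 122_000 * 0.020,
        # 100 assets at $0.0010 per asset per hour
        "intrusion_detection": 100 * 0.0010 * HOURS_PER_MONTH,
        # 50 custom metrics at $0.01 plus 10 GB of logs at $0.50/GB
        "performance_monitoring": 50 * 0.01 + 10 * 0.50,
    }


costs = gcp_4gpu_monthly_costs()
total = sum(costs.values())
for name, usd in costs.items():
    print(f"{name:>24}: ${usd:>10,.2f}")
print(f"{'total':>24}: ${total:>10,.2f}")
```

Keeping each line item as a named entry makes it easy to swap in a different month length or a current price before comparing against the one-time purchase.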
Apples-to-Apples Price Comparison for
• 8 x Nvidia H100 GPUs, 500TB Storage
• Performance Monitoring
• Intrusion Detection System
8 x H100 GPU System or Cloud Service | Price
Quantea QAI Cluster with H100 GPUs | $500,000 one-time fee
Nvidia DGX H100 (Standalone) | $707,440.99 one-time fee
Amazon Web Service with H100 Service | $84,023.10 per month
Google Cloud Platform with H100 Service | $47,579.28 per month
Nvidia DGX H100 (Standalone) notes: No Intrusion Detection System. NetApp Storage - AFF C-Series: $150K, or Dell EMC Storage - PowerScale: $200K.
Calculation of AWS Service for
• 8 x Nvidia H100 GPUs, 500TB Storage
• Performance Monitoring
• Intrusion Detection System
1. Compute Resources
• Instance Type: p5.48xlarge
o Specifications:
- 192 vCPUs
- 2,048 GiB RAM
- 8 NVIDIA H100 GPUs (80 GiB each)
- 8 x 3.84 TB NVMe SSD local storage
o Pricing:
- On-Demand: $98.32 per hour
- Monthly Cost (730 hours): $98.32 × 730 ≈ $71,773.60
2. Storage
• Amazon S3 Standard Storage:
o Capacity: 500 TB
o Pricing:
- First 50 TB: $0.023 per GB
- Next 450 TB: $0.022 per GB
o Cost Calculation:
- First 50 TB: 50,000 GB × $0.023 = $1,150
- Next 450 TB: 450,000 GB × $0.022 = $9,900
- Total Storage Cost: $1,150 + $9,900 = $11,050 per month
3. Performance Monitoring
• Amazon CloudWatch:
o Basic Monitoring: Free
o Detailed Monitoring:
- $0.015 per metric per hour
- Assuming 10 custom metrics: 10 × $0.015 × 730 hours = $109.50 per month
o Logs:
- $0.50 per GB ingested
- Assuming 100 GB per month: 100 × $0.50 = $50
o Total CloudWatch Cost: $109.50 + $50 = $159.50 per month
4. Intrusion Detection
• AWS GuardDuty:
o Pricing:
- $1.00 per million events analyzed
- $4.00 per GB of data processed
o Cost Estimation:
- Assuming 100 million events: 100 × $1.00 = $100
- Assuming 10 GB of data: 10 × $4.00 = $40
- Total GuardDuty Cost: $100 + $40 = $140 per month
5. Data Transfer
• Data Egress:
o First 1 GB per month: Free
o Up to 10 TB per month: $0.09 per GB
o Cost Estimation:
- Assuming 10 TB egress: 10,000 GB × $0.09 = $900 per month
Total Estimated Monthly Cost
- Compute: $71,773.60
- Storage: $11,050.00
- Performance Monitoring: $159.50
- Intrusion Detection: $140.00
- Data Transfer: $900.00
Total: Approximately $84,023.10 per month
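The storage line in this estimate uses tiered pricing, where each band of capacity is billed at its own rate. A hedged sketch of that calculation follows; `tiered_cost` and `S3_TIERS` are illustrative names, and the tier sizes and rates are the ones printed on the slide rather than a verified price list.

```python
def tiered_cost(quantity_gb, tiers):
    """Bill quantity_gb across (tier_size_gb, price_per_gb) bands, in order."""
    remaining = quantity_gb
    cost = 0.0
    for tier_size_gb, price_per_gb in tiers:
        used = min(remaining, tier_size_gb)
        cost += used * price_per_gb
        remaining -= used
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("quantity exceeds the listed tiers")
    return cost


# Tiers as listed on the slide: first 50 TB, then the next 450 TB.
S3_TIERS = [(50_000, 0.023), (450_000, 0.022)]

storage = tiered_cost(500_000, S3_TIERS)      # 500 TB treated as 500,000 GB
compute = 98.32 * 730                         # p5.48xlarge on-demand, 730 hours
monitoring = 10 * 0.015 * 730 + 100 * 0.50    # 10 metrics plus 100 GB of logs
intrusion_detection = 100 * 1.00 + 10 * 4.00  # 100M events plus 10 GB processed
egress = 10_000 * 0.09                        # 10 TB of data transfer out

total = compute + storage + monitoring + intrusion_detection + egress
print(f"storage: ${storage:,.2f}  total: ${total:,.2f}")
```

The same `tiered_cost` helper applies to any banded price schedule, such as the egress tiers used in the Google Cloud estimate.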
Calculation of Google Service for
• 8 x Nvidia H100 GPUs, 500TB Storage
• Performance Monitoring
• Intrusion Detection System
1. Compute Resources
Google Cloud offers A3 machine types equipped with NVIDIA H100 GPUs.
• Machine Type: a3-highgpu-8g
• Configuration:
o 8 NVIDIA H100 80GB GPUs
o 208 vCPUs
o 1,872 GB of memory
• Pricing:
o On-Demand Price per Hour: Approximately $13.66
o Monthly Cost (730 hours): $13.66 × 730 ≈ $9,976.60
2. Storage
For 500 TB of storage, Google Cloud offers various options. Assuming the use of Standard Persistent Disks:
• Pricing:
o Standard Persistent Disk: $0.04 per GB per month
o Monthly Cost: 500,000 GB × $0.04 = $20,000.00
3. Performance Monitoring
Google Cloud's operations suite (formerly Stackdriver) provides monitoring and logging services. Costs are based on usage, including data volume and API calls. For a high-usage scenario:
• Estimated Monthly Cost: Approximately $1,000.00
4. Intrusion Detection Systems (IDS)
Google Cloud offers Cloud IDS for network threat detection.
• Pricing:
o Per Endpoint per Hour: $1.50
o Per GB Inspected: $0.07
• Assumptions:
o 1 Endpoint running continuously: $1.50 × 730 hours = $1,095.00
o Data Inspected: 100 TB per month
- 100,000 GB × $0.07 = $7,000.00
• Total Monthly IDS Cost: $1,095.00 + $7,000.00 = $8,095.00
5. Networking Costs
Assuming 100 TB of data egress per month:
• Pricing:
o First 1 TB: $0.12 per GB
o Next 9 TB: $0.11 per GB
o Next 40 TB: $0.08 per GB
o Next 100 TB: $0.07 per GB
• Monthly Cost:
o 1 TB × $0.12 = $122.88
o 9 TB × $0.11 = $1,024.00
o 40 TB × $0.08 = $3,276.80
o 50 TB × $0.07 = $3,584.00
o Total: $122.88 + $1,024.00 + $3,276.80 + $3,584.00 = $8,007.68
6. Additional Services
Additional costs may include services like Cloud Logging, Cloud Functions, and others, depending on your specific requirements. For this estimate, we'll allocate a nominal amount:
• Estimated Monthly Cost: $500.00
Total Estimated Monthly Cost
- Compute Resources: $9,976.60
- Storage: $20,000.00
- Performance Monitoring: $1,000.00
- Intrusion Detection Systems: $8,095.00
- Networking: $8,007.68
- Additional Services: $500.00
Total: Approximately $47,579.28 per month
Cost Savings: $9.03M in 3 years!
$2.91M Quantea vs. $11.94M on AWS
CLOUD AI COSTS (AWS FOR 32 X NVIDIA H100 AND 500TB STORAGE)
• AWS p5.48xlarge Instances (8 x H100 per instance, 4 instances total) → $283,622 per month
• Cloud Storage (500TB S3 Standard) → $11,500 per month
• Cloud Networking & Data Transfer Fees → $35,000 per month
• AI Model Checkpointing & Monitoring → $2,000 per month
• Total Monthly Cloud Cost → $332,122 (~$3.98M per year)
QUANTEA QAI CLUSTER (ONE-TIME INVESTMENT)
• QAI Cluster with 32 x NVIDIA H100 GPUs → $2.55M (one-time cost)
• Annual Power & Maintenance Costs → ~$120,000 per year
• No Cloud Storage Fees → Data stored locally with high-speed NVMe
• No Data Transfer Fees → Eliminate cloud egress fees (~$420K annual savings)
• Total 3-Year TCO → $2.91M
Configuration compared:
• 32 x Nvidia H100 GPUs
• 500TB Data Storage
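The three-year comparison reduces to recurring cloud spend versus a one-time purchase plus annual operating cost. The sketch below uses only the slide's figures; the function and variable names are illustrative. Note that the slide rounds the annual cloud figure to $3.98M before multiplying, so its $11.94M and $9.03M are slightly below the unrounded products computed here.

```python
def cloud_cost(monthly_items, months):
    """Total cloud spend over a number of months."""
    return sum(monthly_items.values()) * months


def onprem_tco(capex, annual_opex, years):
    """One-time purchase plus recurring annual operating cost."""
    return capex + annual_opex * years


# Monthly AWS line items for 32 x H100 and 500 TB, as listed on the slide (USD).
AWS_MONTHLY = {
    "p5.48xlarge x 4": 283_622,
    "s3_storage": 11_500,
    "networking_and_transfer": 35_000,
    "checkpointing_and_monitoring": 2_000,
}
YEARS = 3

cloud_total = cloud_cost(AWS_MONTHLY, 12 * YEARS)
quantea_total = onprem_tco(capex=2_550_000, annual_opex=120_000, years=YEARS)

print(f"AWS over {YEARS} years:     ${cloud_total:,}")
print(f"Quantea over {YEARS} years: ${quantea_total:,}")
print(f"Difference:             ${cloud_total - quantea_total:,}")
```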
Cost Savings: $1.2M One-Time Fee!
$2.63M Quantea vs. $3.83M NVIDIA
NVIDIA DGX H100 CLUSTER COSTS (ONE-TIME FEE, ON-PREM)
• NVIDIA DGX H100 System (32 GPUs) → $3.53M upfront
• Storage (1PB NetApp AFF C-Series)
• Annual Power & Cooling Costs → ~$120K per year
• Software Licensing (NVIDIA AI Enterprise) → $60K per year
• Maintenance & Support → $120K per year
• Total TCO → $3.83M
QUANTEA QAI CLUSTER (ONE-TIME FEE, ON-PREM)
• QAI Cluster with 32 x NVIDIA H100 GPUs → $2.55M (one-time cost)
• 1PB NVMe Storage (High-Speed)
• Annual Power & Maintenance Costs → ~$80K per year
• No NVIDIA Licensing Fees
• No Vendor-Locked Hardware Costs
• Total TCO → $2.63M (vs. $3.83M for DGX), a savings of $1.2M!
Configuration compared:
• 32 x Nvidia H100 GPUs
• 1PB Data Storage
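The two TCO totals on this slide equal the upfront price plus one year of the listed recurring costs. The sketch below reproduces that sum and lets the horizon vary; `tco` and the dictionary keys are hypothetical names, and all dollar figures come from the slide.

```python
def tco(upfront, annual_costs, years=1):
    """Upfront price plus the listed recurring costs over a number of years."""
    return upfront + sum(annual_costs.values()) * years


DGX_ANNUAL = {"power_and_cooling": 120_000, "nvidia_ai_enterprise": 60_000,
              "maintenance_and_support": 120_000}
QAI_ANNUAL = {"power_and_maintenance": 80_000}

for years in (1, 3):
    dgx = tco(3_530_000, DGX_ANNUAL, years)
    qai = tco(2_550_000, QAI_ANNUAL, years)
    print(f"{years}-year TCO: DGX ${dgx:,} vs QAI ${qai:,} (difference ${dgx - qai:,})")
```

Because the recurring costs differ by $220K per year in this listing, the gap widens as the horizon lengthens.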
Power Consumption Comparison for 8 H100 GPU Configuration
NVIDIA DGX H100
1 x standalone GPU server (not including the power consumption of the data storage appliance from NVIDIA data storage partners).
Power consumption: 10.2 kW
Quantea QAI Cluster
1. 4 x 1U systems with 2 x H100 GPUs each: 6,800 W
2. Console server: 25 W
3. Storage server: 500 W
4. QI Analyzer: 650 W
5. Quantea Cognition switch: 500 W
Total power consumption: 8.475 kW
Power Smarter, Save $2,272.95 Per Year:
Quantea QAI Cluster Delivers Green AI Solutions
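The cluster total is a sum of component draws, and the yearly saving follows from the difference in kilowatts. In this sketch the compute figure is taken as 6,800 W, the value implied by the slide's 8.475 kW total; the electricity rate is a placeholder assumption, since the slide does not state the rate behind its dollar figure.

```python
# Component draws for the 8-GPU Quantea QAI Cluster, in watts.
QAI_8GPU_WATTS = {
    "4 x 1U compute (2 x H100 each)": 6_800,  # implied by the 8.475 kW total
    "console server": 25,
    "storage server": 500,
    "QI Analyzer": 650,
    "Cognition switch": 500,
}
DGX_H100_KW = 10.2  # standalone DGX H100, excluding storage (from the slide)

qai_kw = sum(QAI_8GPU_WATTS.values()) / 1000
delta_kw = DGX_H100_KW - qai_kw


def annual_energy_cost(kw, usd_per_kwh, hours_per_year=8760):
    """Cost of drawing kw continuously for a year at a flat rate."""
    return kw * hours_per_year * usd_per_kwh


ASSUMED_RATE = 0.15  # USD per kWh; hypothetical, not taken from the slide
print(f"QAI total: {qai_kw:.3f} kW, difference: {delta_kw:.3f} kW")
print(f"Annual saving at ${ASSUMED_RATE}/kWh: "
      f"${annual_energy_cost(delta_kw, ASSUMED_RATE):,.2f}")
```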
Power Consumption Comparison for 128 H100 GPU Cluster
NVIDIA DGX H100 BasePod
• Compute Systems: 16 x DGX H100 Systems = 163.2 kW
• Storage System: NetApp AFF A90 = 1.95 kW
• Networking Equipment: Assuming an average of 7.5 kW
• Subtotal: 163.2 kW + 1.95 kW + 7.5 kW = 172.65 kW
• Cooling and Ancillary Systems: 172.65 kW × 25% (average) = 43.16 kW
• Total Power Consumption: 172.65 kW + 43.16 kW = 215.81 kW
Total power consumption: 215.81 kW
Quantea QAI Cluster
• 64 x 1U Compute Systems with 2 x H100 GPUs Each: 128 kW
• Or 32 x 2U Compute Systems with 4 x H100 GPUs Each: 128 kW
• Console Server: 585 W
• 2 x Storage Server: 1000 W
• QI Analyzer: 650 W
• 2 x Quantea Cognition Switch: 2600 W
• Subtotal: 128 kW + 585 W + 1 kW + 650 W + 2.6 kW = 132.235 kW
• Cooling and Ancillary Systems: 132.235 kW × 25% (average) = 33.059 kW
• Total Power Consumption: 132.235 kW + 33.059 kW = 165.294 kW
Total power consumption: 165.294 kW
Reduce Costs and Carbon Footprint – Save $6,637 Per Year
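Both totals apply the same 25% allowance for cooling and ancillary systems on top of the equipment subtotal. A minimal sketch of that step, using the subtotals as printed on the slide (the function name and the 8,760-hour year are assumptions introduced here):

```python
def total_with_overhead(subtotal_kw, overhead=0.25):
    """Equipment subtotal plus a fractional cooling/ancillary allowance."""
    return subtotal_kw * (1 + overhead)


dgx_subtotal_kw = 163.2 + 1.95 + 7.5  # compute + storage + networking
qai_subtotal_kw = 132.235             # subtotal as printed on the slide

dgx_total_kw = total_with_overhead(dgx_subtotal_kw)
qai_total_kw = total_with_overhead(qai_subtotal_kw)
saved_kw = dgx_total_kw - qai_total_kw

HOURS_PER_YEAR = 8760  # assumes continuous operation all year
print(f"DGX BasePod: {dgx_total_kw:.2f} kW, QAI Cluster: {qai_total_kw:.3f} kW")
print(f"Difference: {saved_kw:.2f} kW, "
      f"about {saved_kw * HOURS_PER_YEAR:,.0f} kWh per year")
```

Multiplying the yearly kWh difference by a local electricity rate gives the corresponding dollar figure.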
Gain Advantage of Industry
Standard Systems and
Components (Except GPUs)
Compatibility: Seamless integration with a variety of
hardware and software ecosystems.
Reliability: Proven components with extensive
testing and support networks.
Cost Efficiency: Lower costs due to competitive
pricing and multi-vendor sourcing.
Performance Optimization: Configured specifically
for AI and data analytics workloads.
Simplified Maintenance: Widely available
components reduce downtime and repair complexity.
Power Savings: Strategic component selection,
integration expertise, and system customization to
minimize electrical energy use.
Quantea QAI Cluster
AI Network Comparison
Common AI Network: 4 Separate Networks (Storage, Compute, Management, BMC Monitoring)
1. Adds complexity
2. Uses more network equipment
3. Uses more energy
4. Slows down AI job performance
5. Raises the total cost
Quantea QAI Network: 2 Separate Networks (carrying Storage, Compute, Management, and BMC Monitoring traffic)
1. Reduces complexity
2. Uses less network equipment
3. Uses less energy
4. Improves AI job performance
5. Lowers the total cost
An example of why the Quantea QAI Cluster is cost effective and saves energy while improving training/inferencing performance for AI workloads.
QAI Mini Cluster Easy Start with Just 4 GPUs
Storage Server: 1U
16 NVMe Drives
122.88TB
2 x 1U Compute Servers
with 2 H100 GPUs each
Empty slots for adding
nodes to scale in the
future
Quantea Cognition:
400G enhanced switch
for AI Cluster
Quantea QI: Cluster
Manager with
Performance
monitoring and
intrusion detection
system (IDS)
Console Server:
Remote
management of each
node, storage server
and Quantea QI
Analyzer
Operating Systems:
Choose your
preferred Linux OS
(Ubuntu 20.04
LTS/22.04 LTS or
CentOS)
Perfect for Controlling Your AI Destiny at Lowest Cost
QAI Cluster with NVIDIA GPUs in Detail
1U Compute Nodes Configuration
Operating Systems: Choose your preferred Linux OS (Ubuntu 20.04 LTS/22.04 LTS or CentOS)
Data Storage: For implementing a distributed and scalable file system
Storage Server: 1U/2U/3U/4U with SAS/SATA/NVMe SSDs and HDDs
NodeX: 1U computing server with 2 GPUs, selected from a range of NVIDIA A100, H100, V100, or L40S GPUs
Console Server: Remote management of each node, storage server and Quantea QI
Quantea QI: Cluster Manager with performance monitoring and intrusion detection system (IDS)
Quantea Cognition: 400G enhanced switch for AI Cluster
Built-in high-performance network visualization for AI workload and network analysis system
Latency and congestion reporting, workload mapping for GPU/storage nodes, and dedicated ports to offload mirror traffic to Quantea QI
QAI Cluster with NVIDIA GPUs in Detail
2U Compute Nodes Configuration
Operating Systems: Choose your preferred Linux OS (Ubuntu 20.04 LTS/22.04 LTS or CentOS)
Data Storage: For implementing a distributed and scalable file system
Storage Server: 1U/2U/3U/4U with SAS/SATA/NVMe SSDs and HDDs
NodeX: 2U computing server with 4 GPUs, selected from a range of NVIDIA A100, H100, V100, or L40S GPUs
Console Server: Remote management of each node, storage server and Quantea QI
Quantea QI: Cluster Manager with performance monitoring and intrusion detection system (IDS)
Quantea Cognition: 400G enhanced switch for AI Cluster
Built-in high-performance network visualization for AI workload and network analysis system
Latency and congestion reporting, workload mapping for GPU/storage nodes, and dedicated ports to offload mirror traffic to Quantea QI
QAI Cluster with AMD GPUs in Detail
2U Compute Nodes Configuration
Operating Systems: Choose your preferred Linux OS (Ubuntu 20.04 LTS/22.04 LTS or CentOS)
Data Storage: For implementing a distributed and scalable file system
Storage Server: 1U/2U/3U/4U with SAS/SATA/NVMe SSDs and HDDs
NodeX: 2U computing server with 4 GPUs, selected from a range of AMD MI300, MI300A, MI300X, MI250, or MI250X GPUs
Console Server: Remote management of each node, storage server and Quantea QI
Quantea QI: Cluster Manager with performance monitoring and intrusion detection system (IDS)
Quantea Cognition: 400G enhanced switch for AI Cluster
Built-in high-performance network visualization for AI workload and network analysis system
Latency and congestion reporting, workload mapping for GPU/storage nodes, and dedicated ports to offload mirror traffic to Quantea QI
Quantea Cognition and Connection in Detail
• Latency and congestion reporting
• Workload mapping for GPU/storage nodes
• Dedicated ports to offload mirror traffic to Quantea QI Analyzer
Quantea QI in Detail: QAI Cluster Manager
QI-400G SERIES
Fastest Network Analysis Today at 400Gbps
• Performance monitoring
• Intrusion Detection System (IDS)
• RoCEv2 monitoring and analysis
• RDMA monitoring and analysis
• Expandable storage for longer retention period
All-in-One Convenience: Plug-and-Play by Nature
Fully integrated, pre-configured:
• Network configured
• Console server connected
• Linux installed
• Kubernetes master node allocated
• Data storage server provided
• DHCP name space configured
• Set up to filter and direct network traffic data
• Ready to capture and analyze traffic data
• Monitoring between all nodes
• Intrusion detection system enabled
Ready to load your libraries and more to train your models
Quantea QAI
Cluster Hosting
Own your on-prem AI cluster
with ease! We handle everything
for you—completely hassle-free.
• Starting at $3,800 per
month for a single 42U
Rack.
• With pricing adjustments
available for multiple
racks.
Quantea QAI Cluster is
Compatible
with NVIDIA Software Tools
1. NVIDIA AI Enterprise: A cloud-native suite of AI and data analytics software, optimized for NVIDIA
GPUs, facilitating the development and deployment of AI solutions across various industries.
2. NVIDIA CUDA Toolkit: Provides a development environment for creating high-performance GPU-
accelerated applications, offering libraries, debugging, and optimization tools.
3. NVIDIA cuDNN (CUDA Deep Neural Network library): A GPU-accelerated library for deep neural
networks, delivering high-performance building blocks for deep learning frameworks.
4. NVIDIA TensorRT: A high-performance deep learning inference optimizer and runtime library,
enabling low-latency and high-throughput inference for deep learning models.
5. NVIDIA Triton Inference Server: An open-source inference serving software that simplifies the
deployment of AI models at scale in production environments.
6. NVIDIA NGC Catalog: A comprehensive hub of GPU-optimized software, including AI frameworks,
pretrained models, and HPC applications, streamlining the development workflow.
7. NVIDIA vGPU (Virtual GPU) Software: Enables multiple virtual machines to share a single GPU,
providing virtualization capabilities for graphics and compute workloads, enhancing resource
utilization.
8. NVIDIA DeepStream SDK: Facilitates the development of AI-powered video analytics applications,
offering a complete streaming analytics toolkit for real-time insights.
9. NVIDIA Clara: A healthcare application framework for AI-powered imaging, genomics, and the
development of smart sensors, accelerating computational workflows in the medical field.
10. NVIDIA Omniverse: A platform for real-time 3D design collaboration and simulation, enabling
creators, designers, and engineers to collaborate seamlessly in a shared virtual space.
Operate AI Agents locally on QAI including agentic AI and co-pilots
The QAI provides an enterprise ready
platform to utilize powerful AI tools that
are available in the market today.
A Powerful Platform for AI Development Tools
Bringing AI to Every Business
• AI infrastructure has been dominated by tech giants, making it inaccessible to small and mid-sized businesses.
• Quantea's QAI Cluster enables all companies to own their AI infrastructure, avoiding costly NVIDIA clusters and cloud subscriptions while protecting proprietary data.
• Customize the cluster to meet specific industry needs, whether you're in healthcare, finance, or manufacturing.
Features that Matter
• Affordability: Starting at approximately $350K for a 4-GPU cluster or $500K for an 8-GPU cluster, adjustable to meet the customer's budget, making it accessible to a wide range of businesses.
• Customization: Choose the GPUs, operating systems, and storage size that best fit your workload.
• Scalability: Scale from 4 NVIDIA GPUs in a 2 x 1U node cluster, 8 NVIDIA GPUs in a 4 x 1U node cluster, or a 1 x 2U node NVIDIA GPU cluster in a single rack, or from a 1-node or 2-node AMD GPU cluster, to 100+ nodes across multiple racks as your business grows.
• Data Privacy: Maintain full control over your sensitive data without the need for cloud services.
• Built-in Performance Monitoring: Optimize and monitor workloads in real time.
• Security: Integrated Intrusion Detection System (IDS) for robust threat protection.
Tailored to Your Needs: Flexible Hardware and Software Options
• GPUs: Select from a range of NVIDIA A100, H100, V100, L40S and L40 GPUs or AMD MI300, MI300A, MI300X, MI250 and MI250X, depending on your AI processing needs.
• Operating Systems: Choose your preferred Linux OS (Ubuntu 20.04 LTS/22.04 LTS or CentOS).
• Storage Size: From 122TB to multiple petabytes of storage, QAI offers flexibility to handle any dataset size.
• Customize your infrastructure based on workload type, from deep learning to NLP and from computer vision to LLMs.
Start Small, Scale Big
• Start in a single rack to keep initial investment low: 4 NVIDIA GPUs (2 x 1U nodes or 1 x 2U node), 8 NVIDIA GPUs (2-node or 4-node cluster), or a 1-node or 2-node AMD GPU cluster. Then expand to 100+ nodes in multiple racks as your business grows.
• QAI Clusters are designed for seamless scalability, allowing you to add nodes, storage, and compute power as needed.
• Ensure your infrastructure grows in tandem with your AI demands, without compromising performance or security.
Custom Solutions for Any Industry
• Healthcare: AI-driven diagnostics, patient data protection, and medical imaging.
• Finance: Real-time fraud detection, compliance, and high-speed algorithmic trading.
• Telecom: Network optimization, customer service automation, and predictive analytics.
• Retail: Personalized customer recommendations, demand forecasting, and inventory management.
• Manufacturing: Predictive maintenance, automation, and supply chain management.
Ensuring the Security and
Integrity of AI Clusters
• Protecting Your Business from Data Leaks: Detect Unusual
Data Transfer and Protocol Anomaly.
• Safeguarding Against Insider Threats: Identify Anomalous
Behavior, Track Lateral Movement.
• Compliance Made Simple: Monitoring for Policy Violations
and Audit logs.
• Advanced Threat Detection with Deep Packet Inspection:
Malware and Exploits, Exploits Targeting AI Infrastructure.
• Detailed Forensics and Incident Response: Full packet
capture enables detailed incident analysis and
flow-based log analysis.
Monitoring and
Optimizing AI
Cluster Performance
• Understand Traffic Across Your AI Network: Analyze
traffic volume in packets per second and bytes per
second.
• Prevent Data Loss: Inspecting the sequence
numbers in captured TCP/UDP packets or RoCEv2
RDMA traffic.
• Optimize Bandwidth Usage: Track the data rates
(bytes per second) and packet rates across different
segments of the network.
• Detect Congestion Issues: Inspect the ECN bits in the IP
header of captured packets for the ECN-CE
(Congestion Experienced) mark.
• Ensure Balanced Workloads: Track individual traffic
flows and their size (flow bytes).
• Keep Your System Flowing: Monitor queue lengths
and buffer utilization.
• Minimize Delays and Jitters: Capture and compare
latency data (round-trip times) and Monitor for jitter.
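Two of the checks above are simple enough to sketch. The example below is an illustration only, not the Quantea QI implementation: it decodes the ECN field (the two least-significant bits of the IPv4 TOS or IPv6 Traffic Class byte) and counts gaps in a stream of sequence numbers, using a 24-bit modulus to match the packet sequence number carried by RoCEv2. The function names are hypothetical, and the gap counter assumes packets are captured in order without retransmissions.

```python
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def ecn_codepoint(tos_byte):
    """Name the ECN codepoint held in the low two bits of the TOS byte."""
    return ECN_NAMES[tos_byte & 0b11]


def count_missing(seq_numbers, modulus=1 << 24):
    """Count skipped sequence numbers, allowing for wraparound at modulus.

    Assumes in-order capture; duplicates (gap of 0) are ignored.
    """
    missing = 0
    previous = None
    for seq in seq_numbers:
        if previous is not None:
            gap = (seq - previous) % modulus
            if gap > 1:
                missing += gap - 1
        previous = seq
    return missing


# A packet marked Congestion Experienced, and a stream that skips 3 and 4.
print(ecn_codepoint(0x03))
print(count_missing([1, 2, 5, 6]))
```

Counting CE-marked packets per flow over time gives a congestion signal, while a non-zero gap count points at loss on the path between the sender and the capture point.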
Quantea Cluster Enhancement Consulting Service
Cluster Consulting: Orchestration and Management Setup
• Kubernetes-Based Orchestration
• SLURM-Based Management
• Others
Data Storage Consulting: Scalable Cluster File System Setup
• Lustre, BeeGFS, Ceph, HDFS or NFS
Stands Out from Others
AI Infrastructure is Vastly Different than
Traditional IT Infrastructure (Network Example)
Network in Traditional IT Infrastructure
Control / User Access Network (N-S)
Loosely-Coupled Applications
TCP (Low Bandwidth Flows and Utilization)
High Jitter Tolerance
Oversubscribed Topologies
Heterogeneous Traffic, Statistical Multi-Pathing
Network in AI Infrastructure
AI Fabric (E-W)
Tightly-Coupled Processes
RDMA (High Bandwidth Flows and Utilization)
Low Jitter Tolerance
Nonblocking Topologies
Bursty Network Capacity, Predictive Performance
The Key measurement for the AI-optimized network is how long
an AI training job takes from start to finish
Quantea AI
Infrastructure
Consulting Service
• Key Services:
- High-performance storage and server configuration
- Migration from cloud to on-prem solution
- Advanced networking setup for seamless data flow
- Security measures including firewalls and intrusion
detection
- Continuous monitoring and scalability solutions
• Why Choose Quantea?
- Proven expertise in AI infrastructure solutions
- Tailored strategies for your unique business needs
- Focus on maximizing performance and ROI
Quantea offers cutting-edge infrastructure consulting
services tailored for QAI Clusters. Our expertise
ensures optimal deployment, performance, and
scalability for advanced AI workloads.
Customized QAI Cluster architecture design and
implementation just for you!
Let Quantea empower your AI initiatives with robust and
reliable QAI Cluster solutions.
DESIGN & CREATE YOUR FAVORED
AI INFRASTRUCTURE
Proud Recipient of the Business Hall of Fame in 5 Consecutive Years
Recognized for
The Best of Business
The 2024 Best of Santa
Clara Award in the
Category of Software and
Networking Company
By the Santa Clara Business Recognition
Recognized as a leader in delivering
cutting-edge AI infrastructure for
five consecutive years
Your Award-Winning AI Partner
Two Key Takeaways
in Owning Your Own AI
Destiny
Quantea provides a viable AI
infrastructure alternative to
NVIDIA and cloud solutions with
30%–50% cost savings.
On-prem AI infrastructure is cost
effective.
Avoid costly cloud
subscriptions and protect
your proprietary data with
on-premises infrastructure
Quantea QAI Cluster scales
seamlessly from 4 GPUs to
100+ nodes with many GPUs
as your business grows
READY TO OWN YOUR AI CLUSTER?
• Contact us at sales@quantea.com
• Visit www.quantea.com for more details
• Sales: (669)-238-0728 ext. 1111
Contact Quantea
today to explore how
our flexible AI
infrastructure can
empower your
business.
Customizable
solutions to meet
your specific GPU,
storage, and OS
requirements.
Schedule a
consultation to
discover the power of
QAI Cluster.
Affordable, Scalable
AI Cluster with
Customizable Options
for Your Business
Quantea
QAI Cluster:
Democratizing AI
Infrastructure for All
Nan Liu: CEO
Email: nliu@quantea.com
LinkedIn: www.linkedin.com/in/nanliuofficial
Questions?

More Related Content

DOCX
Azure_pricing_calculation_estimation_calculator
PDF
Google cloud platform introduction
PDF
Evolving to serverless
PPTX
Radical Cloud Consolidation on The Ball
PPTX
Stor simple presentation customers
PDF
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
PDF
Should You Move Between AWS, Azure, or Google Clouds? Considerations, Pros an...
PDF
Cloud Native Cost Optimization
Azure_pricing_calculation_estimation_calculator
Google cloud platform introduction
Evolving to serverless
Radical Cloud Consolidation on The Ball
Stor simple presentation customers
Architecture, Products, and Total Cost of Ownership of the Leading Machine Le...
Should You Move Between AWS, Azure, or Google Clouds? Considerations, Pros an...
Cloud Native Cost Optimization

Similar to Own Your Own AI Infrastructure that is Scalable, Affordable, and Secure! (13)

PDF
cloud-training-pricing-billing.pdf
PDF
Strata Conference + Hadoop World NY 2013: Running On-premise Hadoop as a Busi...
PDF
Oracle Cloud Infrastructure Introduction
PPTX
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
PPTX
What Does It Cost for Microsoft Dynamics in the Cloud?
PPTX
Latest Updates to Azure Integration Services
PDF
AWS CZSK Webinar - Migrácia desktopov a aplikácií do AWS cloudu s Amazon Work...
PDF
AWSomeBuilder3-v12-clean.pdf
PDF
Cloud Price Comparison - AWS vs Azure vs Google
PPTX
UMF Cloud Pilot
PDF
12 Ways to Manage Cloud Costs and Optimize Cloud Spend
PPTX
Deploy Microsoft Azure Data Solutions
PPTX
CCI 2019 - Come ottimizzare i propri workload su Azure
cloud-training-pricing-billing.pdf
Strata Conference + Hadoop World NY 2013: Running On-premise Hadoop as a Busi...
Oracle Cloud Infrastructure Introduction
Scylla Summit 2019 Keynote - Dor Laor - Beyond Cassandra
What Does It Cost for Microsoft Dynamics in the Cloud?
Latest Updates to Azure Integration Services
AWS CZSK Webinar - Migrácia desktopov a aplikácií do AWS cloudu s Amazon Work...
AWSomeBuilder3-v12-clean.pdf
Cloud Price Comparison - AWS vs Azure vs Google
UMF Cloud Pilot
12 Ways to Manage Cloud Costs and Optimize Cloud Spend
Deploy Microsoft Azure Data Solutions
CCI 2019 - Come ottimizzare i propri workload su Azure
Ad


Own Your Own AI Infrastructure that is Scalable, Affordable, and Secure!

  • 1. Own Your Own AI Infrastructure that is Scalable, Affordable and Secure! Quantea QAI Cluster: ©QUANTEA 2024 1 Nan Liu: CEO Email: nliu@quantea.com LinkedIn: www.linkedin.com/in/nanliuofficial April 4th, 2025
  • 2. Two Key Takeaway Points: Quantea provides a viable AI infrastructure alternative to NVIDIA clusters and cloud solutions, with 30% to 50% cost savings. On-prem AI infrastructure is cost effective.
  • 3. About Me: With over 25 years of experience in IT, networking, cybersecurity, software and data storage, I am the CEO of Quantea, a leading company in network packet data analytics for AI infrastructure. My mission is "Democratizing AI Infrastructure for All": empowering any enterprise to own an AI training/inferencing cluster that is affordable, scalable, and tailored to its needs, because AI infrastructure should be within reach for every business ready to harness its power.
  • 4. About Quantea: Quantea started as a pioneer in network traffic data analytics, leveraging a unique, patented technology to deliver high-speed, high-resolution network data traffic insights. This core technology, backed by a granted U.S. patent, evolved to optimize AI workload performance monitoring and support advanced intrusion detection systems. Today, Quantea has grown into a leading AI infrastructure provider, offering scalable AI clusters designed for robust performance and security, tailored to meet the demands of modern AI workloads.
  • 5. On-Prem Investment is Cost Effective. [Figure: total cost versus GPU consumption for cloud investment and on-prem investment; past the inflection point, on-prem is more cost effective.] Early Phase of AI Project: • Dominated by Experiment • Sporadic GPU Usage. More Iteration + Data Size Drive Cost: • Hourly charge for GPUs • Storage cost rises due to frequency of access, data transfer, and replication
  • 6. Estimated Inflection Point (Rule of Thumb): $100,000–$150,000/month in cloud GPU spend (or about $1.2M–$1.8M/year). At this point, organizations often begin seriously considering on-prem AI clusters, especially for training large models, data privacy, or cost control reasons. CLOUD AI COSTS (E.G., AWS, GCP, AZURE) • Renting an A100 or H100 GPU can cost $3–$7/hour per GPU • Running 8x GPUs 24/7 = ~$17,000–$40,000/month • Add data egress, storage, orchestration, and idle time = Total often reaches $100K+ monthly for mid-to-large workloads. ON-PREMISE AI CLUSTER COSTS • Capital expense for a QAI Cluster with 8x H100 GPUs: ~$500K • Add power, cooling, support: TCO spread over 3 years is far less than ongoing cloud spend • Break-even often reached in 9–15 months
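The break-even rule of thumb on slide 6 can be sketched as a small calculation. This is an illustration only: the $500K capital expense comes from the slide, while the monthly on-prem running cost (`ONPREM_OPEX`) and the sample cloud bills are hypothetical placeholders, since the slide gives no figure for power, cooling, and support.

```python
def break_even_months(capex, onprem_monthly_opex, cloud_monthly_cost):
    """Return months until cumulative cloud spend matches on-prem capex plus opex.

    Returns None when the cloud bill is not higher than the on-prem running
    cost, in which case the purchase never pays for itself.
    """
    monthly_saving = cloud_monthly_cost - onprem_monthly_opex
    if monthly_saving <= 0:
        return None
    return capex / monthly_saving


CAPEX = 500_000      # ~$500K QAI Cluster with 8x H100 GPUs (from the slide)
ONPREM_OPEX = 5_000  # HYPOTHETICAL monthly power/cooling/support; not stated on the slide

# Sample cloud bills are placeholders chosen to bracket the slide's ranges.
for cloud_monthly in (40_000, 60_000, 100_000):
    months = break_even_months(CAPEX, ONPREM_OPEX, cloud_monthly)
    print(f"cloud ${cloud_monthly:,}/month -> break-even in {months:.1f} months")
```

Under these placeholder inputs, cloud bills of roughly $40K to $60K per month land near the 9 to 15 month window quoted on the slide; a different running-cost assumption shifts the result.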
  • 7. Minimum Entry Point Price Comparison for a 4 x H100 GPU System or Cloud Service (• 4 x Nvidia H100 GPUs, 122TB Storage • Performance Monitoring • Intrusion Detection System). Quantea QAI Mini Cluster with H100 GPUs: $350,000 one-time fee. Nvidia DGX H100 (Standalone): $707,440.99 one-time fee; no such offer, 8 GPUs minimum; storage: NetApp AFF C-Series: $150K or Dell EMC PowerScale: $200K. Amazon Web Services with H100 service: $38,387.80 per month. Google Cloud Platform with H100 service: $34,373.90 per month.
  • 8. AWS Cloud Cost Breakdown for • 4 x Nvidia H100 GPUs, 122TB Storage • Performance Monitoring • Intrusion Detection System. 1. Compute Instances with GPUs: ◦ AWS offers the p5.48xlarge instance type, which includes 8 NVIDIA H100 GPUs. ◦ The on-demand pricing for this instance is approximately $98.32 per hour. ◦ Since you require only 4 GPUs, you would utilize half of this instance's capacity. ◦ Monthly Cost Calculation: ▪ Hourly rate for 4 GPUs: $98.32 / 2 = $49.16 ▪ Monthly cost: $49.16 * 24 hours/day * 30 days/month = $35,452.80 2. Storage: ◦ Assuming the use of Amazon S3 Standard storage. ◦ Pricing is approximately $0.023 per GB per month. ◦ Monthly Cost Calculation: ▪ 122 TB = 122,000 GB ▪ Monthly cost: 122,000 GB * $0.023/GB = $2,806.00 3. Intrusion Detection System (IDS): ◦ AWS offers Amazon GuardDuty for threat detection. ◦ Pricing is based on the volume of data analyzed, with a free tier of 30 days. ◦ After the free tier, pricing is approximately $1.00 per 1 million events analyzed. ◦ Monthly Cost Calculation: ▪ Estimating 100 million events per month: 100 million events / 1 million = 100 units ▪ Monthly cost: 100 units * $1.00 = $100.00 4. Performance Monitoring: ◦ Amazon CloudWatch provides monitoring services. ◦ Pricing includes charges for metrics, logs, and dashboards. ◦ Monthly Cost Calculation: ▪ Assuming 50 custom metrics: 50 metrics * $0.30/metric = $15.00 ▪ Assuming 10 GB of log data: 10 GB * $0.50/GB = $5.00 ▪ Assuming 3 dashboards: 3 dashboards * $3.00/dashboard = $9.00 ▪ Total monthly cost: $15.00 + $5.00 + $9.00 = $29.00 Total Estimated Monthly Cost on AWS: ▪ Compute: $35,452.80 ▪ Storage: $2,806.00 ▪ Intrusion Detection: $100.00 ▪ Performance Monitoring: $29.00 ▪ Total: $38,387.80
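The slide 8 line items can be recomputed from the unit prices and quantities it states. The sketch below uses only those inputs (the dictionary keys are illustrative names). Evaluated this way, the compute product $49.16 × 24 × 30 comes to $35,395.20 rather than the $35,452.80 shown, which puts the recomputed total at $38,330.20.

```python
HOURS_PER_MONTH = 24 * 30  # the slide's 30-day month

# Unit prices and quantities exactly as stated on the slide.
line_items = {
    "compute": (98.32 / 2) * HOURS_PER_MONTH,          # half of a p5.48xlarge
    "storage": 122_000 * 0.023,                        # 122 TB of S3 Standard
    "intrusion_detection": (100_000_000 / 1_000_000) * 1.00,
    "monitoring": 50 * 0.30 + 10 * 0.50 + 3 * 3.00,    # metrics + logs + dashboards
}
total = sum(line_items.values())

for name, cost in line_items.items():
    print(f"{name:>20}: ${cost:,.2f}")
print(f"{'total':>20}: ${total:,.2f}")
```

Keeping each line item as an explicit product makes it easy to swap in current list prices, which change over time.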
  • 9. Google Cloud Cost Breakdown for • 4 x Nvidia H100 GPUs, 122 TB Storage • Performance Monitoring • Intrusion Detection System. 1. Compute Instances with GPUs: ◦ GCP offers the a3-highgpu-8g instance type, which includes 8 NVIDIA H100 GPUs. ◦ Pricing for this instance is approximately $88.49 per hour. ◦ Since you require only 4 GPUs, you would utilize half of this instance's capacity. ◦ Monthly Cost Calculation: ▪ Hourly rate for 4 GPUs: $88.49 / 2 = $44.245 ▪ Monthly cost: $44.245 * 24 hours/day * 30 days/month = $31,856.40 2. Storage: ◦ Assuming the use of Google Cloud Storage Standard. ◦ Pricing is approximately $0.020 per GB per month. ◦ Monthly Cost Calculation: ▪ 122 TB = 122,000 GB ▪ Monthly cost: 122,000 GB * $0.020/GB = $2,440.00 3. Intrusion Detection System (IDS): ◦ GCP offers Security Command Center Premium for threat detection. ◦ Pricing is based on the number of assets monitored. ◦ Monthly Cost Calculation: ▪ Assuming 100 assets: 100 assets * $0.0010/asset/hour * 24 hours/day * 30 days/month = $72.00 4. Performance Monitoring: ◦ GCP's Cloud Monitoring provides monitoring services. ◦ Pricing includes charges for metrics, dashboards, and uptime checks. ◦ Monthly Cost Calculation: ▪ Assuming 50 custom metrics: 50 metrics * $0.01/metric/month = $0.50 ▪ Assuming 10 GB of log data: 10 GB * $0.50/GB = $5.00 ▪ Assuming 3 dashboards: No additional cost ▪ Total monthly cost: $0.50 + $5.00 = $5.50 Total Estimated Monthly Cost on GCP: ▪ Compute: $31,856.40 ▪ Storage: $2,440.00 ▪ Intrusion Detection: $72.00 ▪ Performance Monitoring: $5.50 ▪ Total: $34,373.90
  • 10. Apples-to-Apples Price Comparison for an 8 x H100 GPU System or Cloud Service (• 8 x Nvidia H100 GPUs, 500TB Storage • Performance Monitoring • Intrusion Detection System). Quantea QAI Cluster with H100 GPUs: $500,000 one-time fee. Nvidia DGX H100 (Standalone): $707,440.99 one-time fee; no Intrusion Detection System; storage: NetApp AFF C-Series: $150K or Dell EMC PowerScale: $200K. Amazon Web Services with H100 service: $84,023.10 per month. Google Cloud Platform with H100 service: $47,579.28 per month.
  • 11. Calculation of AWS Service for • 8 x Nvidia H100 GPUs, 500TB Storage • Performance Monitoring • Intrusion Detection System. 1. Compute Resources ▪ Instance Type: p5.48xlarge ◦ Specifications: ▪ 192 vCPUs ▪ 2,048 GiB RAM ▪ 8 NVIDIA H100 GPUs (80 GiB each) ▪ 8 x 3.84 TB NVMe SSD local storage ◦ Pricing: ▪ On-Demand: $98.32 per hour ▪ Monthly Cost (730 hours): $98.32 × 730 ≈ $71,773.60 2. Storage ▪ Amazon S3 Standard Storage: ◦ Capacity: 500 TB ◦ Pricing: ▪ First 50 TB: $0.023 per GB ▪ Next 450 TB: $0.022 per GB ▪ Cost Calculation: ▪ First 50 TB: 50,000 GB × $0.023 = $1,150 ▪ Next 450 TB: 450,000 GB × $0.022 = $9,900 ▪ Total Storage Cost: $1,150 + $9,900 = $11,050 per month 3. Performance Monitoring ▪ Amazon CloudWatch: ◦ Basic Monitoring: Free ◦ Detailed Monitoring: ▪ $0.015 per metric per hour ▪ Assuming 10 custom metrics: 10 × $0.015 × 730 hours = $109.50 per month ◦ Logs: ▪ $0.50 per GB ingested ▪ Assuming 100 GB per month: 100 × $0.50 = $50 ◦ Total CloudWatch Cost: $109.50 + $50 = $159.50 per month 4. Intrusion Detection ▪ Amazon GuardDuty: ◦ Pricing: ▪ $1.00 per million events analyzed ▪ $4.00 per GB of data processed ▪ Cost Estimation: ▪ Assuming 100 million events: 100 × $1.00 = $100 ▪ Assuming 10 GB of data: 10 × $4.00 = $40 ▪ Total GuardDuty Cost: $100 + $40 = $140 per month 5. Data Transfer ▪ Data Egress: ◦ First 1 GB per month: Free ◦ Up to 10 TB per month: $0.09 per GB ◦ Cost Estimation: ▪ Assuming 10 TB egress: 10,000 GB × $0.09 = $900 per month Total Estimated Monthly Cost ▪ Compute: $71,773.60 ▪ Storage: $11,050.00 ▪ Performance Monitoring: $159.50 ▪ Intrusion Detection: $140.00 ▪ Data Transfer: $900.00 Total: Approximately $84,023.10 per month
  • 12. Calculation of Google Service for • 8 x Nvidia H100 GPUs, 500TB Storage • Performance Monitoring • Intrusion Detection System. 1. Compute Resources: Google Cloud offers A3 machine types equipped with NVIDIA H100 GPUs. ▪ Machine Type: a3-highgpu-8g ▪ Configuration: ◦ 8 NVIDIA H100 80GB GPUs ◦ 208 vCPUs ◦ 1,872 GB of memory ▪ Pricing: ◦ On-Demand Price per Hour: Approximately $13.66 ◦ Monthly Cost (730 hours): $13.66 × 730 ≈ $9,976.60 2. Storage: For 500 TB of storage, Google Cloud offers various options. Assuming the use of Standard Persistent Disks: ▪ Pricing: ◦ Standard Persistent Disk: $0.04 per GB per month ◦ Monthly Cost: 500,000 GB × $0.04 = $20,000.00 3. Performance Monitoring: Google Cloud's operations suite (formerly Stackdriver) provides monitoring and logging services. Costs are based on usage, including data volume and API calls. For a high-usage scenario: ▪ Estimated Monthly Cost: Approximately $1,000.00 4. Intrusion Detection Systems (IDS): Google Cloud offers Cloud IDS for network threat detection. ▪ Pricing: ◦ Per Endpoint per Hour: $1.50 ◦ Per GB Inspected: $0.07 ▪ Assumptions: ◦ 1 Endpoint running continuously: $1.50 × 730 hours = $1,095.00 ◦ Data Inspected: 100 TB per month ▪ 100,000 GB × $0.07 = $7,000.00 ▪ Total Monthly IDS Cost: $1,095.00 + $7,000.00 = $8,095.00 5. Networking Costs: Assuming 100 TB of data egress per month: ▪ Pricing: ◦ First 1 TB: $0.12 per GB ◦ Next 9 TB: $0.11 per GB ◦ Next 40 TB: $0.08 per GB ◦ Next 100 TB: $0.07 per GB ▪ Monthly Cost: ◦ 1 TB × $0.12 = $122.88 ◦ 9 TB × $0.11 = $1,024.00 ◦ 40 TB × $0.08 = $3,276.80 ◦ 50 TB × $0.07 = $3,584.00 ◦ Total: $122.88 + $1,024.00 + $3,276.80 + $3,584.00 = $8,007.68 6. Additional Services: Additional costs may include services like Cloud Logging, Cloud Functions, and others, depending on your specific requirements. For this estimate, we'll allocate a nominal amount: ▪ Estimated Monthly Cost: $500.00 Total Estimated Monthly Cost ▪ Compute Resources: $9,976.60 ▪ Storage: $20,000.00 ▪ Performance Monitoring: $1,000.00 ▪ Intrusion Detection Systems: $8,095.00 ▪ Networking: $8,007.68 ▪ Additional Services: $500.00 Total: Approximately $47,579.28 per month
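The tiered egress charge in slide 12 can be sketched as a generic tier walk. The tier sizes and per-GB prices are the slide's; the function name `tiered_cost` and the constant `GB_PER_TB = 1024` are illustrative, with the 1024 factor inferred from the slide's own first line (1 TB × $0.12 = $122.88). Applied consistently, the 9 TB tier evaluates to $1,013.76 rather than $1,024.00, which gives an egress total of $7,997.44.

```python
GB_PER_TB = 1024  # inferred from the slide: 1 TB x $0.12/GB = $122.88

# (tier size in TB, price per GB), as listed on the slide
EGRESS_TIERS = [(1, 0.12), (9, 0.11), (40, 0.08), (100, 0.07)]


def tiered_cost(total_tb, tiers):
    """Charge each successive tier at its own rate until total_tb is consumed."""
    remaining = total_tb
    cost = 0.0
    breakdown = []
    for size_tb, price_per_gb in tiers:
        used = min(remaining, size_tb)
        if used <= 0:
            break
        tier_cost = used * GB_PER_TB * price_per_gb
        breakdown.append((used, price_per_gb, tier_cost))
        cost += tier_cost
        remaining -= used
    if remaining > 0:
        raise ValueError("usage exceeds the listed pricing tiers")
    return cost, breakdown


egress_total, egress_breakdown = tiered_cost(100, EGRESS_TIERS)
for used_tb, price, tier_cost in egress_breakdown:
    print(f"{used_tb:>4} TB at ${price:.2f}/GB = ${tier_cost:,.2f}")
print(f"egress total: ${egress_total:,.2f}")
```

The same function works for any tiered price list, such as the two-tier S3 storage pricing used in the AWS estimate.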
  • 13. Cost Savings: $9.03M in 3 years! $2.91M Quantea vs. $11.94M on AWS (• 32 x Nvidia H100 GPUs • 500TB Data Storage). CLOUD AI COSTS (AWS FOR 32 X NVIDIA H100 AND 500TB STORAGE) ▪ AWS p5.48xlarge Instances (8 x H100 per instance, 4 instances total) → $283,622 per month ▪ Cloud Storage (500TB S3 Standard) → $11,500 per month ▪ Cloud Networking & Data Transfer Fees → $35,000 per month ▪ AI Model Checkpointing & Monitoring → $2,000 per month ▪ Total Monthly Cloud Cost → $332,122 (~$3.98M per year). QUANTEA QAI CLUSTER (ONE-TIME INVESTMENT) ▪ QAI Cluster with 32 x NVIDIA H100 GPUs → $2.55M (one-time cost) ▪ Annual Power & Maintenance Costs → ~$120,000 per year ▪ No Cloud Storage Fees → Data stored locally with high-speed NVMe ▪ No Data Transfer Fees → Eliminate cloud egress fees (~$420K annual savings) ▪ Total 3-Year TCO → $2.91M
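The three-year comparison on slide 13 reduces to a few sums. The sketch below uses only the slide's figures; the variable names are illustrative. Without intermediate rounding, the cloud side comes to about $11.96M and the saving to about $9.05M; the slide's $11.94M and $9.03M follow from rounding the annual cloud cost down to $3.98M before multiplying by three.

```python
YEARS = 3

# Monthly cloud line items from the slide (32 x H100 on AWS).
cloud_monthly = {
    "4 x p5.48xlarge": 283_622,
    "S3 Standard, 500 TB": 11_500,
    "networking and data transfer": 35_000,
    "checkpointing and monitoring": 2_000,
}
cloud_month_total = sum(cloud_monthly.values())
cloud_tco = cloud_month_total * 12 * YEARS

# On-prem: one-time purchase plus annual power and maintenance.
onprem_tco = 2_550_000 + 120_000 * YEARS

savings = cloud_tco - onprem_tco
print(f"cloud 3-year TCO:   ${cloud_tco:,}")
print(f"on-prem 3-year TCO: ${onprem_tco:,}")
print(f"difference:         ${savings:,}")
```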
  • 14. Cost Savings: $1.2M One-Time Fee! $2.63M Quantea vs. $3.83M NVIDIA (• 32 x Nvidia H100 GPUs • 1PB Data Storage). NVIDIA DGX H100 CLUSTER COSTS (ONE-TIME FEE, ON-PREM) ▪ NVIDIA DGX H100 System (32 GPUs) → $3.53M upfront ▪ Storage (1PB NetApp AFF C-Series) ▪ Annual Power & Cooling Costs → ~$120K per year ▪ Software Licensing (NVIDIA AI Enterprise) → $60K per year ▪ Maintenance & Support → $120K per year ▪ Total TCO → $3.83M. QUANTEA QAI CLUSTER (ONE-TIME FEE, ON-PREM) ▪ QAI Cluster with 32 x NVIDIA H100 GPUs → $2.55M (one-time cost) ▪ 1PB NVMe Storage (High-Speed) ▪ Annual Power & Maintenance Costs → ~$80K per year ▪ No NVIDIA Licensing Fees ▪ No Vendor-Locked Hardware Costs ▪ Total TCO → $2.63M (vs. $3.83M for DGX), a savings of $1.2M!
  • 15. Power Consumption Comparison for 8 H100 GPU Configuration. NVIDIA DGX H100: 1 x standalone GPU server (not including the power consumption of the data storage appliance from NVIDIA data storage partners). Power consumption: 10.2 kW. Quantea QAI Cluster: 1. 4 x 1U Systems with 2 x H100 GPUs Each: 6,800 W 2. Console Server: 25 W 3. Storage Server: 500 W 4. QI Analyzer: 650 W 5. Quantea Cognition Switch: 500 W. Total power consumption: 8.475 kW. Power Smarter, Save $2272.95 Per Year: Quantea QAI Cluster Delivers Green AI Solutions
  • 16. Power Consumption Comparison for 128 H100 GPU Cluster. NVIDIA DGX H100 BasePod: ▪ Compute Systems: 16 x DGX H100 Systems = 163.2 kW ▪ Storage System: NetApp AFF A90 = 1.95 kW ▪ Networking Equipment: Assuming an average of 7.5 kW ▪ Subtotal: 163.2 kW + 1.95 kW + 7.5 kW = 172.65 kW ▪ Cooling and Ancillary Systems: 172.65 kW × 25% (average) = 43.16 kW ▪ Total power consumption: 172.65 kW + 43.16 kW = 215.81 kW. Quantea QAI Cluster: ▪ 64 x 1U Compute Systems with 2 x H100 GPUs Each: 128 kW ▪ Or 32 x 2U Compute Systems with 4 x H100 GPUs Each: 128 kW ▪ Console Server: 585 W ▪ 2 x Storage Server: 1000 W ▪ QI Analyzer: 650 W ▪ 2 x Quantea Cognition Switch: 2600 W ▪ Subtotal: 128 kW + 585 W + 1 kW + 650 W + 2.6 kW = 132.235 kW ▪ Cooling and Ancillary Systems: 132.235 kW × 25% (average) = 33.059 kW ▪ Total power consumption: 132.235 kW + 33.059 kW = 165.294 kW. Reduce Costs and Carbon Footprint – Save $6,637 Per Year
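The slide 16 power budget can be recomputed from its component list as a sanity check. The component wattages and the 25% cooling overhead are the slide's; the electricity rate is a hypothetical placeholder, since the slide does not state the rate behind its annual saving. Summed as listed, the Quantea components come to 132.835 kW rather than the 132.235 kW subtotal shown.

```python
COOLING_OVERHEAD = 0.25          # the slide's 25% cooling and ancillary allowance
HOURS_PER_YEAR = 24 * 365
RATE_USD_PER_KWH = 0.15          # HYPOTHETICAL electricity rate; not stated on the slide


def power_budget_kw(components_kw, overhead=COOLING_OVERHEAD):
    """Return (subtotal, total including cooling overhead) in kW."""
    subtotal = sum(components_kw.values())
    return subtotal, subtotal * (1 + overhead)


dgx_basepod = {
    "16 x DGX H100": 16 * 10.2,
    "NetApp AFF A90": 1.95,
    "networking": 7.5,
}
qai_cluster = {
    "compute (128 GPUs)": 128.0,
    "console server": 0.585,
    "2 x storage server": 1.0,
    "QI analyzer": 0.65,
    "2 x Cognition switch": 2.6,
}

dgx_subtotal, dgx_total = power_budget_kw(dgx_basepod)
qai_subtotal, qai_total = power_budget_kw(qai_cluster)
annual_saving = (dgx_total - qai_total) * HOURS_PER_YEAR * RATE_USD_PER_KWH

print(f"DGX BasePod: {dgx_subtotal:.3f} kW subtotal, {dgx_total:.3f} kW total")
print(f"QAI Cluster: {qai_subtotal:.3f} kW subtotal, {qai_total:.3f} kW total")
print(f"annual saving at ${RATE_USD_PER_KWH}/kWh: ${annual_saving:,.0f}")
```

The annual figure scales linearly with the assumed rate and with continuous full-load operation, so it should be read as an upper-bound style estimate rather than a measured saving.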
  • 17. Quantea QAI Cluster: Gain Advantage of Industry Standard Systems and Components (Except GPUs). Compatibility: Seamless integration with a variety of hardware and software ecosystems. Reliability: Proven components with extensive testing and support networks. Cost Efficiency: Lower costs due to competitive pricing and multi-vendor sourcing. Performance Optimization: Configurable specifically for AI and data analytics workloads. Simplified Maintenance: Widely available components reduce downtime and repair complexity. Power Savings: Strategic component selection, integration expertise, and system customization to minimize electrical energy use.
  • 18. AI Network Comparison: an example of why the Quantea QAI Cluster is cost effective and saves energy while improving training/inferencing performance for AI workloads. [Diagram labels: Storage, Management, Compute, BMC Monitoring.] Common AI Network (4 separate networks): 1. Adds complexity 2. Uses more network equipment 3. Uses more energy 4. Slows down AI job performance 5. Raises the total cost. Quantea QAI Network (2 separate networks): 1. Reduces complexity 2. Uses less network equipment 3. Uses less energy 4. Improves AI job performance 5. Lowers the total cost.
  • 19. QAI Mini Cluster: Easy Start with Just 4 GPUs. Perfect for Controlling Your AI Destiny at Lowest Cost. Storage Server: 1U, 16 NVMe Drives, 122.88TB. 2 x 1U Compute Servers, each with H100 GPUs. Empty Slots for adding nodes to scale in the future. Quantea Cognition: 400G enhanced switch for AI Cluster. Quantea QI: Cluster Manager with performance monitoring and intrusion detection system (IDS). Console Server: Remote management of each node, storage server and Quantea QI Analyzer. Operating Systems: Choose your preferred Linux OS (Ubuntu 20.04 LTS/22.04 LTS or CentOS).
  • 20. QAI Cluster with NVIDIA GPUs in Detail: 1U Compute Nodes Configuration. Operating Systems: Choose your preferred Linux OS (Ubuntu 20.04 LTS/22.04 LTS or CentOS). Data Storage for implementing a distributed and scalable file system. Storage Server: 1U/2U/3U/4U with SAS/SATA/NVMe SSDs and HDDs. NodeX: 1U Computing Server with 2 GPUs; select from a range of Nvidia A100, H100, V100, or L40S GPUs. Console Server: Remote management of each node, storage server and Quantea QI. Quantea QI: Cluster Manager with performance monitoring and intrusion detection system (IDS). Quantea Cognition: 400G enhanced switch for AI Cluster, with a built-in high-performance network visualization and network analysis system for AI workloads: latency and congestion reporting, workload mapping for GPU/storage nodes, and dedicated ports to offload mirror traffic to Quantea QI.
  • 21. QAI Cluster with NVIDIA GPUs in Detail: 2U Compute Nodes Configuration. Operating Systems: Choose your preferred Linux OS (Ubuntu 20.04 LTS/22.04 LTS or CentOS). Data Storage for implementing a distributed and scalable file system. Storage Server: 1U/2U/3U/4U with SAS/SATA/NVMe SSDs and HDDs. NodeX: 2U Computing Server with 4 GPUs; select from a range of Nvidia A100, H100, V100, or L40S GPUs. Console Server: Remote management of each node, storage server and Quantea QI. Quantea QI: Cluster Manager with performance monitoring and intrusion detection system (IDS). Quantea Cognition: 400G enhanced switch for AI Cluster, with a built-in high-performance network visualization and network analysis system for AI workloads: latency and congestion reporting, workload mapping for GPU/storage nodes, and dedicated ports to offload mirror traffic to Quantea QI.
  • 22. QAI Cluster with AMD GPUs in Detail: 2U Compute Nodes Configuration. Operating Systems: Choose your preferred Linux OS (Ubuntu 20.04 LTS/22.04 LTS or CentOS). Data Storage for implementing a distributed and scalable file system. Storage Server: 1U/2U/3U/4U with SAS/SATA/NVMe SSDs and HDDs. NodeX: 2U Computing Server with 4 GPUs; select from a range of AMD MI300, MI300A, MI300X, MI250, or MI250X GPUs. Console Server: Remote management of each node, storage server and Quantea QI. Quantea QI: Cluster Manager with performance monitoring and intrusion detection system (IDS). Quantea Cognition: 400G enhanced switch for AI Cluster, with a built-in high-performance network visualization and network analysis system for AI workloads: latency and congestion reporting, workload mapping for GPU/storage nodes, and dedicated ports to offload mirror traffic to Quantea QI.
  • 23. Quantea Cognition and Connection in Detail • Latency and congestion reporting • Workload mapping for GPU/Storage nodes • Dedicated ports to offload mirror traffic to Quantea QI Analyzer. Quantea QI: QAI Cluster Manager
  • 24. Quantea QI in Detail: Fastest Network Analysis Today at 400Gbps • RoCEv2 monitoring and analysis • RDMA monitoring and analysis • Expandable storage for longer retention period. QI-400G SERIES: Performance Monitoring, Intrusion Detection System (IDS), QAI Cluster Manager
  • 25. All-in-One Convenience: Fully Integrated, Pre-configured, Plug-and-play by Nature. Network Configured; Console Server Connected; Linux Installed; Kubernetes Master Node Allocated; Data Storage Server Provided; DHCP Name Space Configured; Setup to Filter and Direct Network Traffic Data; Ready to Capture and Analyze Traffic Data; Monitoring Between All Nodes; Intrusion Detection System Enabled; Ready to Load Your Libraries and More to Train Your Models
  • 26. Quantea QAI Cluster Hosting: Own your on-prem AI cluster with ease! We handle everything for you, completely hassle-free. • Starting at $3,800 per month for a single 42U rack • Pricing adjustments available for multiple racks
  • 27. Quantea QAI Cluster is Compatible with NVIDIA Software Tools 1. NVIDIA AI Enterprise: A cloud-native suite of AI and data analytics software, optimized for NVIDIA GPUs, facilitating the development and deployment of AI solutions across various industries. 2. NVIDIA CUDA Toolkit: Provides a development environment for creating high-performance GPU-accelerated applications, offering libraries, debugging, and optimization tools. 3. NVIDIA cuDNN (CUDA Deep Neural Network library): A GPU-accelerated library for deep neural networks, delivering high-performance building blocks for deep learning frameworks. 4. NVIDIA TensorRT: A high-performance deep learning inference optimizer and runtime library, enabling low-latency and high-throughput inference for deep learning models. 5. NVIDIA Triton Inference Server: An open-source inference serving software that simplifies the deployment of AI models at scale in production environments. 6. NVIDIA NGC Catalog: A comprehensive hub of GPU-optimized software, including AI frameworks, pretrained models, and HPC applications, streamlining the development workflow. 7. NVIDIA vGPU (Virtual GPU) Software: Enables multiple virtual machines to share a single GPU, providing virtualization capabilities for graphics and compute workloads, enhancing resource utilization. 8. NVIDIA DeepStream SDK: Facilitates the development of AI-powered video analytics applications, offering a complete streaming analytics toolkit for real-time insights. 9. NVIDIA Clara: A healthcare application framework for AI-powered imaging, genomics, and the development of smart sensors, accelerating computational workflows in the medical field. 10. NVIDIA Omniverse: A platform for real-time 3D design collaboration and simulation, enabling creators, designers, and engineers to collaborate seamlessly in a shared virtual space.
  • 28. A Powerful Platform for AI Development Tools: Operate AI agents locally on QAI, including agentic AI and co-pilots. The QAI provides an enterprise-ready platform for the powerful AI tools available in the market today.
  • 29. Bringing AI to Every Business ▪ AI infrastructure has been dominated by tech giants, making it inaccessible to small and mid-sized businesses. ▪ Quantea's QAI Cluster enables all companies to own their AI infrastructure, avoiding costly NVIDIA clusters and cloud subscriptions while protecting proprietary data. ▪ Customize the cluster to meet specific industry needs, whether you're in healthcare, finance, or manufacturing.
  • 30. Features that Matter ▪ Affordability: Starting at approximately $350K for a 4-GPU cluster or $500K for an 8-GPU cluster, adjustable to meet the customer's budget, making it accessible to a wide range of businesses. ▪ Customize: Choose the GPUs, operating systems, and storage size that best fit your workload. ▪ Scalability: Scale from a single rack (4 NVIDIA GPUs in 2 x 1U nodes, 8 NVIDIA GPUs in 4 x 1U nodes, 2U NVIDIA GPU nodes, or a 1-node or 2-node AMD GPU cluster) to 100+ nodes across multiple racks as your business grows. ▪ Data Privacy: Maintain full control over your sensitive data without the need for cloud services. ▪ Built-in Performance Monitoring: Optimize and monitor workloads in real time. ▪ Security: Integrated Intrusion Detection System (IDS) for robust threat protection.
  • 31. Tailored to Your Needs: Flexible Hardware and Software Options ▪ GPUs: Select from a range of NVIDIA A100, H100, V100, L40S and L40 GPUs or AMD MI300, MI300A, MI300X, MI250 and MI250X, depending on your AI processing needs. ▪ Operating Systems: Choose your preferred Linux OS (Ubuntu 20.04 LTS/22.04 LTS or CentOS). ▪ Storage Size: From 122TB to multiple petabytes of storage, QAI offers flexibility to handle any dataset size. ▪ Customize your infrastructure based on workload type, from deep learning to NLP, computer vision to LLM.
  • 32. Start Small, Scale Big ▪ Start in a single rack with 4 NVIDIA GPUs (2 x 1U nodes or 1 x 2U node), 8 NVIDIA GPUs (2-node or 4-node cluster), or a 1-node or 2-node AMD GPU cluster to keep initial investment low, then expand to 100+ nodes in multiple racks as your business grows. ▪ QAI Clusters are designed for seamless scalability, allowing you to add nodes, storage, and compute power as needed. ▪ Ensure your infrastructure grows in tandem with your AI demands, without compromising performance or security.
  • 33. Custom Solutions for Any Industry ▪ Healthcare: AI-driven diagnostics, patient data protection, and medical imaging. ▪ Finance: Real-time fraud detection, compliance, and high-speed algorithmic trading. ▪ Telecom: Network optimization, customer service automation, and predictive analytics. ▪ Retail: Personalized customer recommendations, demand forecasting, and inventory management. ▪ Manufacturing: Predictive maintenance, automation, and supply chain management.
  • 34. Ensuring the Security and Integrity of AI Clusters • Protecting Your Business from Data Leaks: Detect unusual data transfers and protocol anomalies. • Safeguarding Against Insider Threats: Identify anomalous behavior and track lateral movement. • Compliance Made Simple: Monitor for policy violations and keep audit logs. • Advanced Threat Detection with Deep Packet Inspection: Detect malware and exploits, including exploits targeting AI infrastructure. • Detailed Forensics and Incident Response: Full packet capture enables detailed incident analysis, alongside flow-based log analysis.
  • 35. Monitoring and Optimizing AI Cluster Performance • Understand Traffic Across Your AI Network: Analyze traffic volume in packets per second and bytes per second. • Prevent Data Loss: Inspect the sequence numbers in captured TCP/UDP packets or RoCEv2 RDMA traffic. • Optimize Bandwidth Usage: Track the data rates (bytes per second) and packet rates across different segments of the network. • Detect Congestion Issues: Inspect the ECN bits in the IP header of captured packets for the ECN-CE (Congestion Experienced) mark. • Ensure Balanced Workloads: Track individual traffic flows and their size (flow bytes). • Keep Your System Flowing: Monitor queue lengths and buffer utilization. • Minimize Delays and Jitter: Capture and compare latency data (round-trip times) and monitor for jitter.
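The ECN check mentioned on slide 35 can be illustrated with a minimal parser. This is a generic sketch of reading the ECN field from a raw IPv4 header, not Quantea's implementation; the function names and the sample headers are made up for illustration. The ECN field is the two least-significant bits of the second header byte, and the value 0b11 is the Congestion Experienced (CE) codepoint.

```python
import struct

# ECN codepoints in the low two bits of the IPv4 TOS / traffic class byte.
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def ecn_codepoint(ipv4_header: bytes) -> str:
    """Return the ECN codepoint name for a raw IPv4 header."""
    if len(ipv4_header) < 20:
        raise ValueError("IPv4 header must be at least 20 bytes")
    if ipv4_header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    return ECN_NAMES[ipv4_header[1] & 0b11]


def ce_fraction(headers) -> float:
    """Fraction of captured headers carrying the CE mark."""
    headers = list(headers)
    if not headers:
        return 0.0
    marked = sum(1 for h in headers if ecn_codepoint(h) == "CE")
    return marked / len(headers)


def make_header(tos: int) -> bytes:
    """Build a minimal 20-byte IPv4/UDP header for testing (checksum left at 0)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, tos, 20, 0, 0, 64, 17, 0,
                       bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))


# Illustrative capture: DSCP 26 with CE, ECT(0), Not-ECT, and CE.
sample = [make_header((26 << 2) | 0b11), make_header(0b10),
          make_header(0b00), make_header(0b11)]
print([ecn_codepoint(h) for h in sample])
print(f"CE-marked fraction: {ce_fraction(sample):.2f}")
```

A rising CE-marked fraction on a link is one signal of congestion; in practice the headers would come from a packet capture rather than from `make_header`.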
  • 36. Quantea Cluster Enhancement Consulting Service. Data Storage Consulting: Scalable Cluster File System Setup (Lustre, BeeGFS, Ceph, HDFS or NFS). Cluster Consulting: Orchestration and Management Setup • Kubernetes-Based Orchestration • SLURM-Based Management • Others
  • 37. Stands Out from Others
  • 38. AI Infrastructure is Vastly Different from Traditional IT Infrastructure (Network Example). Network in Traditional IT Infrastructure: Control / User Access Network (N-S); Loosely-Coupled Applications; TCP (Low Bandwidth Flows and Utilization); High Jitter Tolerance; Oversubscribed Topologies; Heterogeneous Traffic, Statistical Multi-Pathing. Network in AI Infrastructure: AI Fabric (E-W); Tightly-Coupled Processes; RDMA (High Bandwidth Flows and Utilization); Low Jitter Tolerance; Nonblocking Topologies; Bursty Network Capacity, Predictive Performance. The key measurement for the AI-optimized network is how long an AI training job takes from start to finish.
  • 39. Quantea AI Infrastructure Consulting Service: DESIGN & CREATE YOUR FAVORED AI INFRASTRUCTURE. Quantea offers cutting-edge infrastructure consulting services tailored for QAI Clusters. Our expertise ensures optimal deployment, performance, and scalability for advanced AI workloads. Customized QAI Cluster architecture design and implementation just for you! • Key Services: - High-performance storage and server configuration - Migration from cloud to on-prem solution - Advanced networking setup for seamless data flow - Security measures including firewalls and intrusion detection - Continuous monitoring and scalability solutions • Why Choose Quantea? - Proven expertise in AI infrastructure solutions - Tailored strategies for your unique business needs - Focus on maximizing performance and ROI. Let Quantea empower your AI initiatives with robust and reliable QAI Cluster solutions.
  • 40. Your Award-Winning AI Partner: Proud Recipient of the Business Hall of Fame in 5 Consecutive Years. Recognized for The Best of Business: The 2024 Best of Santa Clara Award in the Category of Software and Networking Company, by the Santa Clara Business Recognition. Recognized as a leader in delivering cutting-edge AI infrastructure for five consecutive years.
  • 41. Two Key Takeaway Points in Owning Your Own AI Destiny: Quantea provides a viable AI infrastructure alternative to NVIDIA and cloud solutions, with 30% to 50% cost savings. On-prem AI infrastructure is cost effective. Avoid costly cloud subscriptions and protect your proprietary data with on-premises infrastructure. Quantea QAI Cluster scales seamlessly from 4 GPUs to 100+ nodes with many GPUs as your business grows.
  • 42. READY TO OWN YOUR AI CLUSTER? ▪ Contact us at sales@quantea.com ▪ Visit www.quantea.com for more details ▪ Sales: (669)-238-0728 ext. 1111. Contact Quantea today to explore how our flexible AI infrastructure can empower your business. Customizable solutions to meet your specific GPU, storage, and OS requirements. Schedule a consultation to discover the power of QAI Cluster.
  • 43. Quantea QAI Cluster: Democratizing AI Infrastructure for All. Affordable, Scalable AI Cluster with Customizable Options for Your Business. Nan Liu: CEO Email: nliu@quantea.com LinkedIn: www.linkedin.com/in/nanliuofficial Questions?