Improve deep learning inference
performance with Microsoft Azure
Esv4 VMs with 2nd Gen Intel Xeon
Scalable processors
Newer Esv4 VMs handled more images per
second than Esv3 VMs with older processors
Using a subset of machine learning—deep learning—to classify
images or make predictions from consumer data can help
organizations put their mountains of data to good use. For these types
of deep learning models, Microsoft Azure offers memory-optimized
Esv4-series virtual machines (VMs). Azure Esv4-series VMs are based
on Intel® Xeon® Platinum 8272CL processors, which include Intel Deep
Learning Boost, a feature that Intel designed to accelerate machine
learning workloads.
At Principled Technologies, we used two inference benchmarks from
the Model Zoo for Intel Architecture (ResNet50, which classifies
images, and the Wide & Deep recommendation system, which infers
relationships between data) to compare the inference performance
of older Azure Esv3-series VMs to newer Esv4-series VMs at various
instance sizes. We found that for both deep learning benchmarks,
the upgraded Esv4-series VMs offered significantly better inference
performance, which shows that organizations seeking quick data
insights can benefit from selecting Microsoft Azure Esv4-series VMs
featuring 2nd Generation Intel Xeon Scalable processors.
Classify up to 8.40x as many images per second
Get recommendations from data up to 3.48x as fast
Improve deep learning inference performance with Microsoft Azure Esv4 VMs with 2nd Gen Intel Xeon Scalable processors September 2021
A Principled Technologies report: Hands-on testing. Real-world results.
Figure 1: Key specifications for each VM we tested. Source: Principled Technologies.
How we tested
We purchased three sets of virtual machine instances from two memory-optimized Microsoft Azure VM series:
• Newer Esv4 series featuring Intel Xeon Platinum 8272CL processors (Cascade Lake)
• Older Esv3 series featuring Intel Xeon E5-2673 v4 processors (Broadwell)
We ran each VM in the East US region.
Figure 1 shows the specifications for the virtual machines that we chose. To show how businesses with different
deep learning demands can benefit from choosing Esv4-series VMs, we tested small (8 vCPU), medium (16
vCPU), and large (64 vCPU) VM sizes.
Small (E8s_v4): 8 vCPUs
Medium (E16s_v4): 16 vCPUs
Large (E64s_v4): 64 vCPUs
About 2nd Generation Intel Xeon Scalable processors with Intel Deep Learning Boost
The 2nd Generation Intel Xeon Scalable processor platform—codenamed Cascade Lake—features a wide
range of processor types, including Bronze, Silver, Gold, and Platinum, to support varying workload needs.
To accelerate machine learning inference, 2nd Gen Intel Xeon Scalable processors offer Intel Deep Learning
Boost (DL Boost). Intel DL Boost builds on Intel Advanced Vector Extensions 512 (AVX-512) instructions with
Intel Vector Neural Network Instructions (VNNI), combining multiple processor instructions into one to improve
machine learning inference performance through resource optimization.¹
To learn more about Intel DL Boost built into 2nd Generation Intel Xeon Scalable processors, visit
https://www.intel.com/content/dam/www/public/us/en/documents/product-overviews/dl-boost-product-overview.pdf.
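To make "combining multiple processor instructions into one" more concrete, the following is a minimal Python sketch of the arithmetic a VNNI-style 8-bit dot-product instruction performs in a single 32-bit lane: four unsigned 8-bit by signed 8-bit products are summed into a 32-bit accumulator in one step. The function names are hypothetical, and this is a simplified model of the non-saturating form, not code from the report or from Intel.

```python
from typing import Sequence


def to_int32(x: int) -> int:
    """Wrap a Python integer to the signed 32-bit two's-complement range."""
    x &= 0xFFFFFFFF
    return x - 0x100000000 if x >= 0x80000000 else x


def vnni_lane(acc: int, a_u8: Sequence[int], b_s8: Sequence[int]) -> int:
    """Model one 32-bit lane: acc + sum of four (unsigned 8-bit * signed 8-bit) products."""
    assert len(a_u8) == len(b_s8) == 4
    assert all(0 <= a <= 255 for a in a_u8)
    assert all(-128 <= b <= 127 for b in b_s8)
    return to_int32(acc + sum(a * b for a, b in zip(a_u8, b_s8)))


def dot_u8s8(a_u8: Sequence[int], b_s8: Sequence[int]) -> int:
    """Dot product built from repeated lane operations, four elements at a time."""
    assert len(a_u8) == len(b_s8) and len(a_u8) % 4 == 0
    acc = 0
    for i in range(0, len(a_u8), 4):
        acc = vnni_lane(acc, a_u8[i:i + 4], b_s8[i:i + 4])
    return acc


if __name__ == "__main__":
    # 10 + (1*1 + 2*(-1) + 3*2 + 4*(-2)) = 10 - 3
    print(vnni_lane(10, [1, 2, 3, 4], [1, -1, 2, -2]))
```

A 512-bit vector register holds 16 such 32-bit lanes, so one instruction can accumulate 64 of these 8-bit products at once.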
What Esv4-series VMs can offer your organization
Compared to older Esv3 VMs, Esv4-series VMs offer:
• Up to 506 GB RAM
• Guaranteed Cascade Lake processor with all-core Turbo clock speed of 3.4 GHz and Intel Vector Neural
Network Instructions support (AVX-512 VNNI)
• The ability to support premium storage and premium storage caching
Image classification results – ResNet50
From Model Zoo for Intel Architecture, we chose the popular ResNet50 deep learning benchmark for testing.
ResNet50 is a convolutional neural network that runs 50 layers deep and recognizes and classifies images. Using
deep learning to classify images is useful for real-world applications such as self-driving cars or aiding in medical
diagnoses. The benchmark reported throughput in images per second that the solutions handled using this
model, with higher scores indicating better performance at this type of deep learning.
Small instances
If your deep learning needs are on the smaller side, selecting an Azure VM with 8 vCPUs could meet your
image classification needs. We found that a new Azure Esv4-series VM with 8 vCPUs featuring 2nd Gen Intel
Xeon Scalable processors (with INT8 precision) classified 8.40 times the number of images per second using the
ResNet50 benchmark as the small-sized VM with previous-generation processors (with FP32 precision).
[Chart: 8 vCPU ResNet50 normalized images/sec throughput (higher is better). E8s_v3: 1; E8s_v4: 8.4]
Figure 2: Relative number of images per second that the small-size VMs (8 vCPUs) classified using the ResNet50
benchmark. Higher numbers are better. Source: Principled Technologies.
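The comparisons in this section pair INT8 models on Esv4 with FP32 models on Esv3. As a rough illustration of what running at INT8 precision involves, the sketch below maps FP32 values to 8-bit integers with a single scale factor and maps them back. The report does not describe how its INT8 models were produced, so the symmetric per-tensor scheme and the function names here are assumptions chosen for simplicity.

```python
def quantize_int8(values):
    """Symmetric quantization (an assumed scheme): map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # illustrative values, not from the report
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    for original, approx in zip(weights, restored):
        print(f"{original:+.4f} -> {approx:+.4f}")
```

Each restored value differs from the original by at most half a quantization step, which is the precision an INT8 model gives up in exchange for smaller operands that instructions such as VNNI can process more of per cycle.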
Medium instances
Larger models or datasets may benefit from an increase to 16 vCPUs per virtual machine. We found that a new
Azure Esv4-series VM with 16 vCPUs featuring 2nd Gen Intel Xeon Scalable processors (with INT8 precision)
classified 6.67 times the number of images per second using the ResNet50 benchmark as the medium-sized VM
with previous-generation processors (with FP32 precision).
[Chart: 16 vCPU ResNet50 normalized images/sec throughput (higher is better). E16s_v3: 1; E16s_v4: 6.67]
Figure 3: Relative number of images per second that the medium-size VMs (16 vCPUs) classified using the ResNet50
benchmark. Higher numbers are better. Source: Principled Technologies.
Large instances
If your organization needs to run deep learning workloads to classify even larger datasets, VMs with 64 vCPUs
can better tackle your needs. We found that a new Azure Esv4-series VM with 64 vCPUs featuring 2nd Gen Intel
Xeon Scalable processors (with INT8 precision) classified 5.96 times the number of images per second using the
ResNet50 benchmark as the large-sized VM with previous-generation processors (with FP32 precision).
[Chart: 64 vCPU ResNet50 normalized images/sec throughput (higher is better). E64s_v3: 1; E64s_v4: 5.96]
Figure 4: Relative number of images per second that the large-size VMs (64 vCPUs) classified using the ResNet50
benchmark. Higher numbers are better. Source: Principled Technologies.
Get more value from your cloud VMs
Budget considerations require weighing the cost of any performance improvement. Put simply: is the boost in
performance worth the cost? We found that for deep learning performance on Microsoft Azure Esv4-series VMs,
the answer is yes. Based on our test results, newer Esv4-series VMs can offer up to 8.40 times the deep learning
performance at a lower overall cost (0.94x that of Esv3-series VMs). This means that upgraded Esv4-series VMs
with 2nd Gen Intel Xeon Scalable processors can offer better overall value than older Esv3-series VMs.
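The value claim can be expressed as simple arithmetic: divide the relative performance by the relative cost. The short sketch below does this for the best-case figures quoted above. It assumes the 0.94x cost ratio applies to the same VM pair that produced the 8.40x result, which the report does not state explicitly, and the function name is hypothetical.

```python
def relative_value(perf_ratio: float, cost_ratio: float) -> float:
    """Performance per unit cost of the new VM, relative to the old VM (old VM = 1.0)."""
    if cost_ratio <= 0:
        raise ValueError("cost_ratio must be positive")
    return perf_ratio / cost_ratio


if __name__ == "__main__":
    # 8.40x the throughput at 0.94x the cost (figures from the report)
    print(f"{relative_value(8.40, 0.94):.2f}x the performance per unit cost")
```

Under that assumption, the Esv4-series VM delivers roughly 8.9 times the inference throughput per unit of cost in the best case.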
Making recommendations based on data – Wide & Deep learning recommender
We used Model Zoo for Intel Architecture for Wide & Deep learning recommender testing. Wide & Deep
uses wide linear models and deep neural networks to infer meaningful relationships between data and deliver
recommendations based on that data. The benchmark reports the number of samples per second that the
instance handled, with more samples indicating better performance.
Small instances
Smaller deep learning problems with smaller datasets may require VMs configured with 8 vCPUs. We found that
a new Azure Esv4-series VM with 8 vCPUs featuring 2nd Gen Intel Xeon Scalable processors (with INT8 precision)
handled 3.48 times the number of samples per second using the Wide & Deep benchmark as the small-sized VM
with previous-generation processors (with FP32 precision).
[Chart: 8 vCPU Wide & Deep normalized samples/sec throughput (higher is better). E8s_v3: 1; E8s_v4: 3.48]
Figure 5: Relative number of samples per second that the small-size VMs (8 vCPUs) handled using the Wide & Deep
benchmark. Higher numbers are better. Source: Principled Technologies.
Medium instances
For those seeking to make recommendations based on mid-sized datasets, 16-vCPU VMs may be more
appropriate. We found that a new Azure Esv4-series VM with 16 vCPUs featuring 2nd Gen Intel Xeon Scalable
processors (with INT8 precision) handled 3.23 times the number of samples per second using the Wide & Deep
benchmark as the medium-sized VM with previous-generation processors (with FP32 precision).
[Chart: 16 vCPU Wide & Deep normalized samples/sec throughput (higher is better). E16s_v3: 1; E16s_v4: 3.23]
Figure 6: Relative number of samples per second that the medium-size VMs (16 vCPUs) handled using the Wide & Deep
benchmark. Higher numbers are better. Source: Principled Technologies.
Large instances
VMs aren’t one-size-fits-all, so large models and datasets may require more powerful virtual machines with
64 vCPUs. We found that a new Azure Esv4-series VM with 64 vCPUs featuring 2nd Gen Intel Xeon Scalable
processors (with INT8 precision) handled 2.99 times the number of samples per second using the Wide & Deep
benchmark as the large-sized VM with previous-generation processors (with FP32 precision).
[Chart: 64 vCPU Wide & Deep normalized samples/sec throughput (higher is better). E64s_v3: 1; E64s_v4: 2.99]
Figure 7: Relative number of samples per second that the large-size VMs (64 vCPUs) handled using the Wide & Deep
benchmark. Higher numbers are better. Source: Principled Technologies.
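For readers who want the six reported ratios side by side, the sketch below collects them in one structure and prints a small summary. The numbers are the normalized throughput figures from this report (Esv3 = 1); the variable and function names are hypothetical.

```python
# Esv4 throughput relative to Esv3 (Esv3 = 1), keyed by benchmark and vCPU count.
SPEEDUPS = {
    "ResNet50": {8: 8.40, 16: 6.67, 64: 5.96},
    "Wide & Deep": {8: 3.48, 16: 3.23, 64: 2.99},
}


def summarize(speedups):
    """Flatten the nested mapping into (benchmark, vcpus, ratio) rows sorted by vCPU count."""
    rows = []
    for benchmark, by_vcpus in speedups.items():
        for vcpus, ratio in sorted(by_vcpus.items()):
            rows.append((benchmark, vcpus, ratio))
    return rows


def speedup_range(by_vcpus):
    """Return the smallest and largest ratio reported for one benchmark."""
    return min(by_vcpus.values()), max(by_vcpus.values())


if __name__ == "__main__":
    for benchmark, vcpus, ratio in summarize(SPEEDUPS):
        print(f"{benchmark:<12} {vcpus:>2} vCPUs  {ratio:.2f}x")
```

Laid out this way, the data show that for both benchmarks the relative advantage of Esv4 is largest at 8 vCPUs and smallest at 64 vCPUs.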
Get insights faster with Azure Esv4 VMs featuring 2nd Gen Intel Xeon Scalable processors
While deep learning models and their applications can vary widely, the goal is always to get insights from data
faster, whether to drive innovation or boost consumer sales. In our tests, we found that newer Microsoft Azure
Esv4-series VMs featuring 2nd Gen Intel Xeon Scalable processors, which offer Intel Deep Learning Boost, improved
deep learning inference performance for image classification and recommendations over older Esv3 VMs. And at
just 0.94x the cost, the Esv4 series offers significantly better value per VM, which could mean your organization
needs fewer VMs to support its deep learning workloads.
By choosing Microsoft Azure Esv4-series VMs with 2nd Gen Intel Xeon Scalable processors, your organization
can get deep learning insights from data faster than with older Esv3-series VMs.
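As a back-of-the-envelope illustration of the "fewer VMs" point, the sketch below estimates how many Esv4 VMs would match the aggregate throughput of an existing Esv3 fleet of the same VM size. It assumes throughput scales linearly with VM count, which the report did not test; the fleet size of 10 and the function name are hypothetical.

```python
import math


def esv4_vms_needed(esv3_vm_count: int, speedup: float) -> int:
    """Esv4 VMs needed to match an Esv3 fleet, assuming linear scaling across VMs."""
    if esv3_vm_count < 0 or speedup <= 0:
        raise ValueError("esv3_vm_count must be >= 0 and speedup must be > 0")
    return math.ceil(esv3_vm_count / speedup)


if __name__ == "__main__":
    # Hypothetical fleet of ten 64-vCPU Esv3 VMs, using the reported 64-vCPU ratios.
    print("ResNet50:", esv4_vms_needed(10, 5.96))
    print("Wide & Deep:", esv4_vms_needed(10, 2.99))
```

Real deployments would also need to account for latency targets, redundancy, and the accuracy impact of moving from FP32 to INT8, none of which this estimate covers.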
1 Intel, “Intel Deep Learning Boost,” accessed July 29, 2021, https://www.intel.com/content/dam/www/public/us/en/documents/product-overviews/dl-boost-product-overview.pdf.
Principled Technologies is a registered trademark of Principled Technologies, Inc.
All other product names are the trademarks of their respective owners.
For additional information, review the science behind this report.
Principled
Technologies®
Facts matter.®
This project was commissioned by Intel.
Read the science behind this report at http://facts.pt/lHyrW1n
Improve deep learning inference performance with Microsoft Azure Esv4 VMs with 2nd Gen Intel Xeon Scalable processors September 2021 | 8

More Related Content

PDF
DBaaS with VMware vCAC, EMC XtremIO, and Cisco UCS
PDF
Consolidate SAS 9.4 workloads with Intel Xeon processor E7 v3 and Intel SSD t...
PDF
Component upgrades from Intel and Dell can increase VM density and boost perf...
PDF
Speed up deep learning tasks with Amazon Web Services instances featuring 2nd...
PDF
AWS EC2 M6i instances with 3rd Gen Intel Xeon Scalable processors accelerated...
PDF
Prepare images for machine learning faster with servers powered by AMD EPYC 7...
PDF
Reap better SQL Server OLTP performance with next-generation Dell EMC PowerEd...
PDF
Create useful data center health visualizations with Dell iDRAC Telemetry Ref...
DBaaS with VMware vCAC, EMC XtremIO, and Cisco UCS
Consolidate SAS 9.4 workloads with Intel Xeon processor E7 v3 and Intel SSD t...
Component upgrades from Intel and Dell can increase VM density and boost perf...
Speed up deep learning tasks with Amazon Web Services instances featuring 2nd...
AWS EC2 M6i instances with 3rd Gen Intel Xeon Scalable processors accelerated...
Prepare images for machine learning faster with servers powered by AMD EPYC 7...
Reap better SQL Server OLTP performance with next-generation Dell EMC PowerEd...
Create useful data center health visualizations with Dell iDRAC Telemetry Ref...

What's hot (20)

PDF
Get improved performance and new features from Dell EMC PowerEdge servers wit...
PDF
Upgrade to Dell EMC PowerEdge R940 servers with VMware vSphere 7.0 and gain g...
PDF
A single-socket Dell EMC PowerEdge R7515 solution delivered better value on a...
PDF
Symantec NetBackup 7.6 benchmark comparison: Data protection in a large-scale...
PDF
SQL Server 2016 database performance on the Dell EMC PowerEdge FC630 QLogic 1...
PDF
Ensure greater uptime and boost VMware vSAN cluster performance with the Del...
PDF
Boost your work with hardware from Intel
PDF
Use VMware vSAN HCI Mesh to manage your vSAN storage resources and share them...
PDF
Citrix XenApp hosted shared desktop performance on Cisco UCS: Cisco VM-FEX vs...
PDF
Power edge mx7000_sds_performance_1018
PDF
Business-critical applications on VMware vSphere 6, VMware Virtual SAN, and V...
PDF
3 key wins: Dell EMC PowerEdge MX with OpenManage Enterprise over Cisco UCS a...
PDF
Migrate VMs faster with a new Dell EMC PowerEdge MX solution
PDF
Increase density and performance with upgrades from Intel and Dell
PDF
Keep data available without affecting user response time
PDF
Compared to a similarly sized solution from a scale-out vendor, the Dell EMC ...
PDF
Pod density comparison: VMware vSphere with Tanzu vs. a bare-metal approach ...
PDF
Give DevOps teams self-service resource pools within your private infrastruct...
PDF
Using VMTurbo to boost performance
PDF
Run compute-intensive Apache Hadoop big data workloads faster with Dell EMC P...
Get improved performance and new features from Dell EMC PowerEdge servers wit...
Upgrade to Dell EMC PowerEdge R940 servers with VMware vSphere 7.0 and gain g...
A single-socket Dell EMC PowerEdge R7515 solution delivered better value on a...
Symantec NetBackup 7.6 benchmark comparison: Data protection in a large-scale...
SQL Server 2016 database performance on the Dell EMC PowerEdge FC630 QLogic 1...
Ensure greater uptime and boost VMware vSAN cluster performance with the Del...
Boost your work with hardware from Intel
Use VMware vSAN HCI Mesh to manage your vSAN storage resources and share them...
Citrix XenApp hosted shared desktop performance on Cisco UCS: Cisco VM-FEX vs...
Power edge mx7000_sds_performance_1018
Business-critical applications on VMware vSphere 6, VMware Virtual SAN, and V...
3 key wins: Dell EMC PowerEdge MX with OpenManage Enterprise over Cisco UCS a...
Migrate VMs faster with a new Dell EMC PowerEdge MX solution
Increase density and performance with upgrades from Intel and Dell
Keep data available without affecting user response time
Compared to a similarly sized solution from a scale-out vendor, the Dell EMC ...
Pod density comparison: VMware vSphere with Tanzu vs. a bare-metal approach ...
Give DevOps teams self-service resource pools within your private infrastruct...
Using VMTurbo to boost performance
Run compute-intensive Apache Hadoop big data workloads faster with Dell EMC P...
Ad

Similar to Improve deep learning inference  performance with Microsoft Azure Esv4 VMs with 2nd Gen Intel Xeon Scalable processors (20)

PDF
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
PDF
Complete artificial intelligence workloads faster using Microsoft Azure virtu...
PDF
Complete more Apache Cassandra database work with Microsoft Azure Lsv3-series...
PDF
AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors improv...
PDF
Complete decision support system workloads faster using new Microsoft Azure E...
PDF
Google Cloud N2 VM instances featuring 3rd Gen Intel Xeon Scalable processors...
PDF
MySQL and Spark machine learning performance on Azure VMsbased on 3rd Gen AMD...
PDF
AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offere...
PDF
Complete online analytics processing work faster with Google Cloud Platform N...
PDF
Accelerate your growth potential
PDF
Get higher performance for your MySQL databases with Dell APEX Private Cloud ...
PDF
Combine containerization and GPU acceleration on VMware: Dell PowerEdge R750 ...
PDF
Get a clearer picture of potential cloud performance by looking beyond SPECra...
PDF
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
PDF
Get competitive logistic regression performance with servers with AMD EPYC 75...
PDF
AMD EPYC 7763 processor-based servers can offer a better value for MySQL work...
PDF
Get insight from document-based distributed MongoDB databases sooner and have...
PDF
Consolidate SAS 9.4 workloads with Intel Xeon processor E7 v2 and Intel SSD t...
PDF
Accelerating Real Time Applications on Heterogeneous Platforms
PDF
TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance ...
Process data analytics queries faster with new Microsoft Azure Lsv3-series VM...
Complete artificial intelligence workloads faster using Microsoft Azure virtu...
Complete more Apache Cassandra database work with Microsoft Azure Lsv3-series...
AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors improv...
Complete decision support system workloads faster using new Microsoft Azure E...
Google Cloud N2 VM instances featuring 3rd Gen Intel Xeon Scalable processors...
MySQL and Spark machine learning performance on Azure VMsbased on 3rd Gen AMD...
AWS EC2 M6i instances featuring 3rd Gen Intel Xeon Scalable processors offere...
Complete online analytics processing work faster with Google Cloud Platform N...
Accelerate your growth potential
Get higher performance for your MySQL databases with Dell APEX Private Cloud ...
Combine containerization and GPU acceleration on VMware: Dell PowerEdge R750 ...
Get a clearer picture of potential cloud performance by looking beyond SPECra...
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Get competitive logistic regression performance with servers with AMD EPYC 75...
AMD EPYC 7763 processor-based servers can offer a better value for MySQL work...
Get insight from document-based distributed MongoDB databases sooner and have...
Consolidate SAS 9.4 workloads with Intel Xeon processor E7 v2 and Intel SSD t...
Accelerating Real Time Applications on Heterogeneous Platforms
TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance ...
Ad

More from Principled Technologies (20)

PDF
Modernizing your data center with Dell and AMD
PDF
Dell Pro 14 Plus: Be better prepared for what’s coming
PDF
Security features in Dell, HP, and Lenovo PC systems: A research-based compar...
PDF
Make GenAI investments go further with the Dell AI Factory - Infographic
PDF
Make GenAI investments go further with the Dell AI Factory
PDF
Unlock faster insights with Azure Databricks
PDF
Speed up your transactions and save with new Dell PowerEdge R7725 servers pow...
PDF
The case for on-premises AI
PDF
Dell PowerEdge server cooling: Choose the cooling options that match the need...
PDF
Speed up your transactions and save with new Dell PowerEdge R7725 servers pow...
PDF
Propel your business into the future by refreshing with new one-socket Dell P...
PDF
Propel your business into the future by refreshing with new one-socket Dell P...
PDF
Unlock flexibility, security, and scalability by migrating MySQL databases to...
PDF
Migrate your PostgreSQL databases to Microsoft Azure for plug‑and‑play simpli...
PDF
On-premises AI approaches: The advantages of a turnkey solution, HPE Private ...
PDF
A Dell PowerStore shared storage solution is more cost-effective than an HCI ...
PDF
Gain the flexibility that diverse modern workloads demand with Dell PowerStore
PDF
Save up to $2.8M per new server over five years by consolidating with new Sup...
PDF
Securing Red Hat workloads on Azure - Summary Presentation
PDF
Securing Red Hat workloads on Azure - Infographic
Modernizing your data center with Dell and AMD
Dell Pro 14 Plus: Be better prepared for what’s coming
Security features in Dell, HP, and Lenovo PC systems: A research-based compar...
Make GenAI investments go further with the Dell AI Factory - Infographic
Make GenAI investments go further with the Dell AI Factory
Unlock faster insights with Azure Databricks
Speed up your transactions and save with new Dell PowerEdge R7725 servers pow...
The case for on-premises AI
Dell PowerEdge server cooling: Choose the cooling options that match the need...
Speed up your transactions and save with new Dell PowerEdge R7725 servers pow...
Propel your business into the future by refreshing with new one-socket Dell P...
Propel your business into the future by refreshing with new one-socket Dell P...
Unlock flexibility, security, and scalability by migrating MySQL databases to...
Migrate your PostgreSQL databases to Microsoft Azure for plug‑and‑play simpli...
On-premises AI approaches: The advantages of a turnkey solution, HPE Private ...
A Dell PowerStore shared storage solution is more cost-effective than an HCI ...
Gain the flexibility that diverse modern workloads demand with Dell PowerStore
Save up to $2.8M per new server over five years by consolidating with new Sup...
Securing Red Hat workloads on Azure - Summary Presentation
Securing Red Hat workloads on Azure - Infographic

Recently uploaded (20)

PDF
Electronic commerce courselecture one. Pdf
PDF
KodekX | Application Modernization Development
PDF
cuic standard and advanced reporting.pdf
PDF
NewMind AI Monthly Chronicles - July 2025
PPT
Teaching material agriculture food technology
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PPTX
A Presentation on Artificial Intelligence
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PPTX
MYSQL Presentation for SQL database connectivity
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
DOCX
The AUB Centre for AI in Media Proposal.docx
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
NewMind AI Weekly Chronicles - August'25 Week I
Electronic commerce courselecture one. Pdf
KodekX | Application Modernization Development
cuic standard and advanced reporting.pdf
NewMind AI Monthly Chronicles - July 2025
Teaching material agriculture food technology
Diabetes mellitus diagnosis method based random forest with bat algorithm
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
A Presentation on Artificial Intelligence
Reach Out and Touch Someone: Haptics and Empathic Computing
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
MYSQL Presentation for SQL database connectivity
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
Chapter 3 Spatial Domain Image Processing.pdf
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
The AUB Centre for AI in Media Proposal.docx
“AI and Expert System Decision Support & Business Intelligence Systems”
Unlocking AI with Model Context Protocol (MCP)
NewMind AI Weekly Chronicles - August'25 Week I

Improve deep learning inference  performance with Microsoft Azure Esv4 VMs with 2nd Gen Intel Xeon Scalable processors

  • 1. Improve deep learning inference performance with Microsoft Azure Esv4 VMs with 2nd Gen Intel Xeon Scalable processors Newer Esv4 VMs handled more images per second than Esv3 VMs with older processors Using a subset of machine learning—deep learning—to classify images or make predictions from consumer data can help organizations put their mountains of data to good use. For these types of deep learning models, Microsoft Azure offers memory-optimized Esv4-series virtual machines (VMs). Azure Esv4-series VMs are based on Intel® Xeon® Platinum 8272CL processors, which include a feature, Intel Deep Learning Boost, that Intel designed to improve machine learning workloads. At Principled Technologies, we used two inference benchmarks from the Model Zoo for Intel Architecture—ResNet50, which classifies images, and Wide & Deep recommendation system, which makes relationships between data—to compare the inference performance of older Azure Esv3-series VMs to newer Esv4-series VMs at various instance sizes. We found that for both deep learning benchmarks, the upgraded Esv4-series VMs offered significantly better inference performance, which shows that organizations seeking quick data insights can benefit from selecting Microsoft Azure Esv4-series VMs featuring 2nd Generation Intel Xeon Scalable processors. Classify up to 8.40x more images per second Get recommendations from data up to 3.48x as fast Improve deep learning inference performance with Microsoft Azure Esv4 VMs with 2nd Gen Intel Xeon Scalable processors September 2021 A Principled Technologies report: Hands-on testing. Real-world results.
  • 2. Figure 1: Key specifications for each VM we tested. Source: Principled Technologies. How we tested We purchased three sets of virtual machine instances from two memory-optimized Microsoft Azure VM series: • Newer Esv4 series featuring Intel Xeon Platinum 8272CL processors (Cascade Lake) • Older Esv3 series featuring Intel Xeon E5-2673 v4 processors (Broadwell) We ran each VM in the East US region. Figure 1 shows the specifications for the virtual machines that we chose. To show how businesses with different deep learning demands can benefit from choosing Esv4-series VMs, we tested small (8 vCPU), medium (16 vCPU), and large (64 vCPU) VM sizes. Small (E8s_v4) 8 vCPUs Medium (E16s_v4) 16 vCPUs Large (E64s_v4) 64 vCPUs About 2nd Generation Intel Xeon Scalable processors with Intel Deep Learning Boost The 2nd Generation Intel Xeon Scalable processor platform—codenamed Cascade Lake—features a wide range of processor types, including Bronze, Silver, Gold, and Platinum, to support varying workload needs. To accelerate machine learning inference, 2nd Gen Intel Xeon Scalable processors offer Intel Deep Learning Boost (DL Boost). Intel DL Boost builds on Intel Advanced Vector Extensions 512 (AVX-512) instructions with Intel Vector Neural Network Instructions (VNNI), combining multiple processor instructions into one to improve machine learning inference performance through resource optimization.1 To learn more about Intel DL Boost built into 2nd Generation Intel Xeon Scalable processors, visit https://www. intel.com/content/dam/www/public/us/en/documents/product-overviews/dl-boost-product-overview.pdf. Improve deep learning inference performance with Microsoft Azure Esv4 VMs with 2nd Gen Intel Xeon Scalable processors September 2021 | 2
  • 3. What Esv4-series VMs can offer your organization Compared to older Esv3 VMs, ESv4-series VMs offer: • Up to 506GB RAM • Guaranteed Cascade Lake processor with all-core Turbo clock speed of 3.4 GHz and Intel Vector Neural Network support (AVX-512 VNNI) • The ability to support premium storage and premium storage caching Image classification results – ResNet50 From Model Zoo for Intel Architecture, we chose the popular ResNet50 deep learning benchmark for testing. ResNet50 is a convolutional neural network that runs 50 layers deep and recognizes and classifies images. Using deep learning to classify images is useful for real-world applications such as self-driving cars or aiding in medical diagnoses. The benchmark reported throughput in images per second that the solutions handled using this model, with higher scores indicating better performance at this type of deep learning. Improve deep learning inference performance with Microsoft Azure Esv4 VMs with 2nd Gen Intel Xeon Scalable processors September 2021 | 3
  • 4. Small instances If your deep learning needs are on the smaller side, selecting an Azure VM with 8 vCPUs could meet your image classification needs. We found that a new Azure Esv4-series VM with 8 vCPUs featuring 2nd Gen Intel Xeon Scalable processors (with INT8 precision) classified 8.40 times the number of images per second using the ResNet50 benchmark as the small-sized VM with previous-generation processors (with FP32 precision). 0 1 2 3 4 5 8 6 E8s_v4 E8s_v3 8 vCPU ResNet50 normalized images/sec throughput Images/sec Higher is better 9 1 7 8.4 Figure 2: Relative number of images per second that the small-size VMs (8 vCPUs) classified using the ResNet50 benchmark. Higher numbers are better. Source: Principled Technologies. Medium instances Larger models or datasets may benefit from an increase to 16 vCPUs per virtual machine. We found that a new Azure Esv4-series VM with 16 vCPUs featuring 2nd Gen Intel Xeon Scalable processors (with INT8 precision) classified 6.67 times the number of images per second using the ResNet50 benchmark as the medium-sized VM with previous-generation processors (with FP32 precision). Higher is better Images/sec E16s_v4 E16s_v3 16 vCPU ResNet50 normalized images/sec throughput 0 1 2 3 4 5 8 6 9 1 7 6.67 Figure 3: Relative number of images per second that the medium-size VMs (16 vCPUs) classified using the ResNet50 benchmark. Higher numbers are better. Source: Principled Technologies. 6.67x the images per second 8.40x the images per second Improve deep learning inference performance with Microsoft Azure Esv4 VMs with 2nd Gen Intel Xeon Scalable processors September 2021 | 4
  • 5. Large instances If your organization needs to run deep learning workloads to classify even larger datasets, VMs with 64 vCPUs can better tackle your needs. We found that a new Azure Esv4-series VM with 64 vCPUs featuring 2nd Gen Intel Xeon Scalable processors (with INT8 precision) classified 5.96 times the number of images per second using the ResNet50 benchmark as the large-sized VM with previous-generation processors (with FP32 precision). Higher is better Images/sec E64s_v4 E64s_v3 64 vCPU ResNet50 normalized images/sec throughput 5.96 0 1 2 3 4 5 8 6 9 1 7 Figure 4: Relative number of images per second that the large-size VMs (64 vCPUs) classified using the ResNet50 benchmark. Higher numbers are better. Source: Principled Technologies. Get more value from your cloud VMs Budget considerations require weighing the cost of any performance improvements. Put simply: is the boost in performance worth the additional cost? We found that for deep learning performance on Microsoft Azure Esv4-series VMs, the answer is yes. Based on our test results, newer Esv4-series VMs can offer up to 8.40 times the deep learning performance at a lower (0.94x) overall cost. This means that upgraded Esv4-series VMs with 2nd Gen Intel Xeon Scalable processors can offer better overall value compared to older Esv3-series VMs. 5.96x the images per second Improve deep learning inference performance with Microsoft Azure Esv4 VMs with 2nd Gen Intel Xeon Scalable processors September 2021 | 5
  • 6. Making recommendations based on data: Wide & Deep learning recommender
    We used the Model Zoo for Intel Architecture for Wide & Deep learning recommender testing. Wide & Deep combines wide linear models and deep neural networks to infer meaningful relationships between data and deliver recommendations based on that data. The benchmark reports the number of samples per second that the instance handled; more samples per second indicates better performance.
    Small instances
    Smaller deep learning problems with smaller datasets may need only VMs configured with 8 vCPUs. We found that a new Azure Esv4-series VM with 8 vCPUs featuring 2nd Gen Intel Xeon Scalable processors (with INT8 precision) handled 3.48 times as many samples per second on the Wide & Deep benchmark as the small-sized VM with previous-generation processors (with FP32 precision).
    Figure 5: Relative number of samples per second that the small-size VMs (8 vCPUs) handled using the Wide & Deep benchmark (normalized throughput: E8s_v3 = 1, E8s_v4 = 3.48). Higher numbers are better. Source: Principled Technologies.
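The charts in this report show normalized throughput, not raw samples per second. A minimal sketch of that normalization, assuming it is a plain division of each VM's measured throughput by the older VM's throughput, follows. The raw values below are hypothetical placeholders chosen to reproduce the reported 3.48 ratio; they are not measurements from this report.

```python
def normalize_throughput(measured: float, baseline: float) -> float:
    """Express a measured throughput as a multiple of the baseline throughput."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return measured / baseline


# HYPOTHETICAL placeholder values (samples/sec), NOT measured results.
esv3_samples_per_sec = 1000.0  # baseline VM (older Esv3)
esv4_samples_per_sec = 3480.0  # newer Esv4 VM

baseline_relative = normalize_throughput(esv3_samples_per_sec, esv3_samples_per_sec)
esv4_relative = round(
    normalize_throughput(esv4_samples_per_sec, esv3_samples_per_sec), 2
)

print(f"Esv3: {baseline_relative:.2f}, Esv4: {esv4_relative:.2f}")
```

With this convention the baseline VM always plots at 1, so the height of the Esv4 bar reads directly as "times the samples per second."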
  • 7. Medium instances
    For those seeking to make recommendations based on mid-sized datasets, 16-vCPU VMs may be more appropriate. We found that a new Azure Esv4-series VM with 16 vCPUs featuring 2nd Gen Intel Xeon Scalable processors (with INT8 precision) handled 3.23 times as many samples per second on the Wide & Deep benchmark as the medium-sized VM with previous-generation processors (with FP32 precision).
    Figure 6: Relative number of samples per second that the medium-size VMs (16 vCPUs) handled using the Wide & Deep benchmark (normalized throughput: E16s_v3 = 1, E16s_v4 = 3.23). Higher numbers are better. Source: Principled Technologies.
    Large instances
    VMs aren’t one-size-fits-all, so large models and datasets may require more powerful virtual machines with 64 vCPUs. We found that a new Azure Esv4-series VM with 64 vCPUs featuring 2nd Gen Intel Xeon Scalable processors (with INT8 precision) handled 2.99 times as many samples per second on the Wide & Deep benchmark as the large-sized VM with previous-generation processors (with FP32 precision).
    Figure 7: Relative number of samples per second that the large-size VMs (64 vCPUs) handled using the Wide & Deep benchmark (normalized throughput: E64s_v3 = 1, E64s_v4 = 2.99). Higher numbers are better. Source: Principled Technologies.
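One way to read the Wide & Deep ratios above is as a fleet-sizing estimate: how many Esv4 VMs would match the aggregate throughput of a given number of same-size Esv3 VMs. The sketch below uses the speedups reported here and assumes that throughput scales linearly with VM count, which real deployments may not achieve. The fleet size of 10 and all names are hypothetical.

```python
import math

# Wide & Deep speedups reported in this report, keyed by vCPU count
# (Esv4 with INT8 vs. Esv3 with FP32).
WIDE_DEEP_SPEEDUP = {8: 3.48, 16: 3.23, 64: 2.99}


def esv4_vms_needed(esv3_vm_count: int, speedup: float) -> int:
    """Smallest number of Esv4 VMs whose combined throughput matches the Esv3 fleet.

    ASSUMPTION: aggregate throughput scales linearly with the number of VMs.
    """
    if esv3_vm_count < 0 or speedup <= 0:
        raise ValueError("esv3_vm_count must be >= 0 and speedup must be > 0")
    return math.ceil(esv3_vm_count / speedup)


# HYPOTHETICAL fleet of 10 Esv3 VMs at each size.
ESV3_FLEET_SIZE = 10
replacement_fleet = {
    vcpus: esv4_vms_needed(ESV3_FLEET_SIZE, speedup)
    for vcpus, speedup in WIDE_DEEP_SPEEDUP.items()
}

for vcpus, count in replacement_fleet.items():
    print(f"{vcpus} vCPUs: {ESV3_FLEET_SIZE} Esv3 VMs -> {count} Esv4 VMs")
```

The estimate rounds up because a fractional VM cannot be provisioned, so the realized saving in VM count is somewhat smaller than the raw speedup suggests.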
  • 8. Get insights faster with Azure Esv4 VMs featuring 2nd Gen Intel Xeon Scalable processors
    While deep learning models and their applications vary widely, the goal is always to get insights from data faster, whether to drive innovation or to boost consumer sales. In our tests, newer Microsoft Azure Esv4-series VMs featuring 2nd Gen Intel Xeon Scalable processors, which offer Intel Deep Learning Boost, improved deep learning inference performance for image classification and recommendations over older Esv3 VMs. And at just 0.94x the cost, the Esv4 series offers significantly better value per VM, which could mean your organization needs fewer VMs to support its workloads. By choosing Microsoft Azure Esv4-series VMs with 2nd Gen Intel Xeon Scalable processors, your organization can get deep learning insights from data faster than with older Esv3-series VMs.
    1 Intel, “Intel Deep Learning Boost,” accessed July 29, 2021, https://www.intel.com/content/dam/www/public/us/en/documents/product-overviews/dl-boost-product-overview.pdf.
    Principled Technologies is a registered trademark of Principled Technologies, Inc. All other product names are the trademarks of their respective owners. For additional information, review the science behind this report.
    Principled Technologies® Facts matter.®
    This project was commissioned by Intel. Read the science behind this report at http://facts.pt/lHyrW1n