JEREMY EDER - RED HAT PERFORMANCE ENGINEERING
Spectre/Meltdown Impact on
High Performance Workloads
Red Hat Performance Engineering
NVIDIA GTC, March 2018
AGENDA
● Context
● Performance Data and Capacity Planning
● KBase Overview
● Going Forward
VULNERABILITY OVERVIEW
Meltdown - Rogue data cache load
● Allows reading all memory
Spectre Variant 1 - Bounds check bypass
● Throw-away data left insecure in CPU cache
Spectre Variant 2 - Branch Target Injection
● CPU branch predictor trained by exploit
DETERMINING VULNERABILITY
• OS vendors provide tools to determine vulnerability
• Recent Linux and RHEL kernels include the following new sysfs entries
$ ls -1 /sys/devices/system/cpu/vulnerabilities/*
/sys/devices/system/cpu/vulnerabilities/meltdown
/sys/devices/system/cpu/vulnerabilities/spectre_v1
/sys/devices/system/cpu/vulnerabilities/spectre_v2
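Listing the sysfs entries only shows that they exist; each file also holds a one-line status string. A minimal sketch of reading them follows. The function name `report_vulnerabilities` and its optional directory argument are assumptions made for illustration and are not part of the original deck; the default path is the one shown above.

```shell
#!/bin/sh
# Print "<entry>: <status>" for every file in the vulnerabilities directory.
# The optional first argument is a hypothetical override used for testing;
# by default the real sysfs path from the slide is read.
report_vulnerabilities() {
    dir="${1:-/sys/devices/system/cpu/vulnerabilities}"
    if [ ! -d "$dir" ]; then
        # Kernels without these entries do not have the directory.
        echo "no vulnerabilities directory at $dir"
        return 1
    fi
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

Calling `report_vulnerabilities` with no argument reads the live sysfs directory. The exact status text depends on the kernel version and the mitigations in effect.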
Userspace vs. Kernelspace
Userspace (e.g. /bin/bash)
Operating System (e.g. Linux kernel)
System Call Interface
PERFORMANCE DATA AND CAPACITY PLANNING
TensorFlow on NVIDIA GPUs, CUDA 9.1
% diff vs. Pre-CVE Kernel Baseline
KERNEL BYPASS NETWORKING - Broadwell
CAPACITY PLANNING
HOW TO MANAGE PERFORMANCE IMPACT
● RHEL 6/7 have transparent hugepages enabled by default
○ Reduces the number of TLB entries and thus the total flush impact
● RHEL 7.4.z has PCID support from upstream kernel 4.14
○ Reduces the impact of TLB flushes by tagging/tracking
○ (2010+ CPUs have it; check for the pcid flag in /proc/cpuinfo)
● RHEL has runtime knobs to disable patches (no reboot required)
echo 0 > /sys/kernel/debug/x86/pti_enabled
echo 0 > /sys/kernel/debug/x86/ibrs_enabled
echo 0 > /sys/kernel/debug/x86/retp_enabled
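Before writing to these knobs, it can help to check whether the CPU advertises PCID and what each knob is currently set to. The sketch below is one way to do that. The function names `has_cpu_flag` and `show_knobs` and their optional path arguments are hypothetical helpers introduced for illustration; only the paths and knob names come from the slide. Reading the debugfs files typically requires root and a mounted debugfs.

```shell
#!/bin/sh
# Succeed if the named CPU flag appears in the first "flags" line of cpuinfo.
# The optional second argument is a hypothetical override for testing.
has_cpu_flag() {
    flag="$1"
    cpuinfo="${2:-/proc/cpuinfo}"
    # -w matches whole words only, so "pcid" does not match "invpcid".
    grep -m1 '^flags' "$cpuinfo" 2>/dev/null | grep -qw "$flag"
}

# Print the current value of each runtime knob named on the slide,
# or "unavailable" when the file cannot be read.
show_knobs() {
    dir="${1:-/sys/kernel/debug/x86}"
    for knob in pti_enabled ibrs_enabled retp_enabled; do
        if [ -r "$dir/$knob" ]; then
            printf '%s=%s\n' "$knob" "$(cat "$dir/$knob")"
        else
            printf '%s=unavailable\n' "$knob"
        fi
    done
}
```

A typical check would be `has_cpu_flag pcid && echo "PCID present"` followed by `show_knobs`. A knob reported as unavailable may mean debugfs is not mounted, the caller lacks permission, or the running kernel does not provide that knob.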
IMPACT VARIES BY WORKLOAD
LINKS
● Community landing page https://meltdownattack.com/
● Meltdown and Spectre in 3 Minutes
● FOSDEM Closing Keynote
● Speculative Execution Exploit Performance Impacts
● Controlling the Performance Impact of Microcode and Security Patches
● Intel.com: Advancing Security at the Silicon Level
THANK YOU
plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews
