Accelerate Ceph performance via
SPDK related techniques
Ziye Yang
Intel Corporation
Oct 2015
SG
Storage Group
Legal Disclaimer
Notice: This document contains information on products in the design phase of development. The information here is subject to change without notice. Do not finalize a design
with this information.
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at Intel.com, or from the
OEM or retailer.
No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses.
You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant
Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata
are available on request.
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel
representative to obtain the latest Intel product specifications and roadmaps.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as
well as any warranty arising from course of performance, course of dealing, or usage in trade.
<Use only if applicable> Warning: Altering PC clock or memory frequency and/or voltage may (i) reduce system stability and use life of the system, memory and processor; (ii)
cause the processor and other system components to fail; (iii) cause reductions in system performance; (iv) cause additional heat or other damage; and (v) affect system data
integrity. Intel assumes no responsibility that the memory, included if used with altered clock frequencies and/or voltages, will be fit for any particular purpose. Check with memory
manufacturer for warranty and additional details.
<Use only if applicable> Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual
performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark
results, visit http://www.intel.com/performance.
<Use only if applicable> Cost reduction scenarios described are intended as examples of how a given Intel- based product, in the specified circumstances and configurations, may
affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.
<Use only if applicable> Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational
purposes. Any differences in your system hardware, software or configuration may affect your actual performance.
<Use only if applicable> Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and
confirm whether referenced data are accurate.
<Use only if applicable> Intel® AMT should be used by a knowledgeable IT administrator and requires enabled systems, software, activation, and connection to a corporate
network. Intel AMT functionality on mobile systems may be limited in some situations. Your results will depend on your specific implementation. Learn more by visiting Intel®
Active Management Technology.
<Use only if applicable> Intel is a sponsor and member of the Benchmark XPRT Development Community, and was the major developer of the XPRT family of benchmarks.
Principled Technologies is the publisher of the XPRT family of benchmarks. You should consult other information and performance tests to assist you in fully evaluating your
contemplated purchases.
Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting
www.intel.com/design/literature.htm.
Intel and the Intel logo <Add terms trademarked by Intel and used in this document> are trademarks of Intel Corporation in the U. S. and/or other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 20xx, Intel Corporation. All Rights Reserved.
Agenda
Recent common requirements for Ceph
Middle Cache tiering solution
Building block techniques
I/O optimization technique
• DPDK for storage
Data Processing Acceleration techniques
• ISA-L
Conclusion
Intel and the Intel logo are trademarks of Intel Corporation in the U. S. and/or other countries. *Other
names and brands may be claimed as the property of others. Copyright © 20xx, Intel Corporation.
Recent common requirements for Ceph
• Legacy protocol support
• For example, an iSCSI interface so that applications can migrate transparently from enterprise storage to Ceph-based cloud storage
• High performance requirements for Ceph
• Low latency for front-end applications
Provide a Mid-Tier Cache between applications and Ceph
• Protocol support:
• Target the iSCSI, NFS, and NVMF protocols.
• High performance:
• Provide a local cache in each Mid-Tier node.
• Provide a write log for data consistency.
• HA:
• Replicate to 1+ additional nodes.
• Heartbeat for failed-node detection.
[Figure: iSCSI and iSCSI/NVMF initiators connect to two Mid-Tier Cache Nodes in front of the Ceph cluster. Each Mid-Tier Cache Node contains an iSCSI/NVMF target, a cache, a write log, and Ceph RBD.]
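The interplay of the local cache and the write log described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the design of the Mid-Tier Cache itself: the class `MidTierCacheNode`, its methods, and the plain dictionary standing in for Ceph RBD are all hypothetical names chosen for the example.

```python
class MidTierCacheNode:
    """Toy write-back cache with a write log (hypothetical, not a Ceph or SPDK API)."""

    def __init__(self, backend):
        self.backend = backend   # dict standing in for the Ceph RBD image
        self.cache = {}          # block number -> data held locally
        self.write_log = []      # ordered (block, data) records not yet flushed

    def write(self, block, data):
        # Log first, then update the cache, so replaying the log reproduces the cache.
        self.write_log.append((block, data))
        self.cache[block] = data

    def read(self, block):
        # Serve from the local cache when possible, otherwise fetch and keep a copy.
        if block not in self.cache:
            self.cache[block] = self.backend[block]
        return self.cache[block]

    def flush(self):
        # Apply logged writes to the backend in order, then truncate the log.
        for block, data in self.write_log:
            self.backend[block] = data
        self.write_log.clear()


backend = {0: b"old"}
node = MidTierCacheNode(backend)
node.write(0, b"new")
print(node.read(0), backend[0])   # the cache is ahead of the backend until flush()
node.flush()
print(backend[0])
```

In a real deployment the log would live on persistent media and be replicated to the additional nodes mentioned under HA; the sketch keeps it in memory only to show the ordering.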
DPDK for Storage Overview
Uses Intel® DPDK and UNS technology
- Optimized user-space lockless polling technology in the NIC driver
- Presents lock-light libraries and network TCP/IP services
Provides an enhanced software stack that optimizes iSCSI front-end targets
- Optimizes packet processing in user space using lockless polling mechanisms
- Reference software available for customer application integration with NVMe or other backends
Supports Linux* operating systems
Enables higher system-level performance for iSCSI targets
Currently available as reference software
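The polling approach named above replaces interrupt-driven completion handling with a loop that repeatedly checks a queue. A minimal sketch in Python, assuming a toy in-memory ring: `CompletionRing`, `post`, `poll`, and `run_to_completion` are hypothetical names for illustration and are not DPDK or SPDK APIs.

```python
from collections import deque


class CompletionRing:
    """Toy stand-in for a device completion queue (hypothetical)."""

    def __init__(self):
        self._ring = deque()

    def post(self, request_id):
        # Called by the "device" side when a request finishes.
        self._ring.append(request_id)

    def poll(self, max_events=32):
        # Return up to max_events completions without ever blocking.
        done = []
        while self._ring and len(done) < max_events:
            done.append(self._ring.popleft())
        return done


def run_to_completion(ring, outstanding):
    """Busy-poll the ring until every outstanding request has completed."""
    remaining = set(outstanding)
    completed = []
    while remaining:
        for request_id in ring.poll():
            remaining.discard(request_id)
            completed.append(request_id)
    return completed


ring = CompletionRing()
for request_id in range(5):
    ring.post(request_id)
print(run_to_completion(ring, range(5)))
```

The trade-off is that the polling core stays busy even when idle, which is why the deck reports results per core.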
[Figure: DPDK for Storage software stack for the READ and WRITE paths from the cloud. Components shown: iSCSI target, customer storage app, TCP/IP (UNS), Intel® DPDK NIC driver, NVMe driver, user-space memory driver, CBDMA driver, Intel® DPDK libraries, NIC, DDR, CBDMA, NVMe. Legend: customer SW, existing SW, Linux* kernel, enhanced SW.]
High-Level Block Diagram
[Figure: high-level block diagram. Legend: third-party SW, new Intel SW, Linux* kernel, existing Intel SW.]
Intel DPDK for Storage Benefits
+ Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific
computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you
in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Source: Intel Internal Measurements as of 22 August 2014. See back up slides for configuration details.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors.
Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804.
For more information go to http://www.intel.com/performance
• Up to 7x better performance+
  - vs. Linux*-IO Target (LIO)
  - Or 1/7 CPU overhead at same performance
• Up to 10x fewer cores utilized with the NVMe driver
  - vs. Linux* NVMe driver
• Reduces total cost of ownership
  - Reduce BOM costs between $80-500 by removing the need for a TOE
  - Utilize free CPU cycles for other workloads
• Free source code
  - Customizable source code available as reference
  - Evaluation source available upon request
• User-space implementation
  - Portable/upgradable and permissive licensing
  - Requires Software License Agreement for full product use
Intel DPDK for Storage: Full Packaging and Contents
Library Package Includes:
• Intel DPDK | UNS | Optimized Storage Stacks as reference software
• User space support code (written in C):
  - POSIX compliant
  - Demo/Usage, Unit test (functional correctness), Basic performance
• API manuals (may include links or copies of key papers)
• Release.txt (release notes, version, and library serial IDs)
• Linux* Support
Source Agreement
• Source available under Restricted Use License Agreement Confidential (RULAC)
• Source code available under source license agreements (SLA)
Intel® DPDK for Storage
Case Study 1: Performance Comparison with LIO
Intel® Xeon® Processor E5-2620 v2, iSCSI Read/Write: 4 KB Data (performance and performance per core)
Up to 650% increase in max performance per core+
[Figure: two bar charts with an NVM Express backend. "Performance": IO/s (in thousands, axis 0 to 600) for LIO with 6 cores, DPDK for Storage with 2 cores, and DPDK for Storage with 1 core. "Performance/Core": IO/s (in thousands, axis 0 to 350) for LIO and DPDK for Storage. Workloads: 4 KB random 100% read; 4 KB random 70% read, 30% write; 4 KB random 100% write.]
+ Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Source: Intel Internal Measurements as of 22 August 2014. See back up slide # 10-13 for configuration details.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors.
Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804.
For more information go to http://www.intel.com/performance
Intel® DPDK for Storage
Case Study 2: User-space NVMe driver (SPDK) benefit
4 KB Random Read Performance: 4 x NVMe Drives, Single-Core Intel® Xeon® Processor
DPDK for Storage NVMe driver delivers up to 6x performance improvement vs. the kernel NVMe driver with a single-core Intel® Xeon® processor
[Figure: bar chart of IOps (thousands, axis 0 to 2000) for 1, 2, 4, 8, and 16 partitions, comparing the DPDK for Storage NVMe driver (labeled "WKB NVMe Driver" in the legend) with the kernel NVMe driver.]
For test configuration details see slide # 16 and 18
Disclaimer: Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance.
4 KB Random Read Performance: 1-4 NVMe Drives, Single-Core Intel® Xeon® Processor
DPDK for Storage NVMe driver scales linearly in performance from 1 to 4 NVMe drives with a single-core Intel® Xeon® processor
[Figure: bar chart of IOps (thousands, axis 0 to 2000) for 1, 2, and 4 NVMe drives, comparing the DPDK for Storage NVMe driver with the kernel NVMe driver.]
For test configuration details see slide # 16 and 18
Benefits of using Intel ISA-L
Intel ISA-L enables Storage OEMs to obtain more performance from Intel CPUs and reduce investment in developing their own optimizations
• Faster TTM / less resources than developing optimizations from scratch
• Up to 7X bandwidth for hash functions compared to OpenSSL algorithms
• Up to 4X bandwidth improvement on compression compared to zlib
• Allows Intel to develop optimizations that use new architectural enhancements that are TTM
• Allows maximum utilization of additional cores
Intel® ISA-L Packaging and Contents
• Supports 64-bit Intel® Xeon® and Atom processors: E5-2600/2400 and Atom C2000 product families forward
• Source code library
• Single-core, low-level functions (OS-independent functions)
• Gen-over-gen function updates to take advantage of new processor features
Intel® ISA-L Functions (performance optimizing)
• Data protection: XOR (RAID 5), P+Q (RAID 6), Reed-Solomon erasure code
• Compression ("DEFLATE"): IGZIP, fast compression
• Cryptographic hashing: multi-buffer SHA-1, SHA-256, SHA-512, MD5
• Data integrity: CRC-T10, CRC-IEEE (802.3), CRC32-iSCSI
• Encryption: XTS-AES 128, XTS-AES 256
[Figure: three illustrations. Hashing: the input "Dog" is mapped to a digest. CRC: the sender divides the data (padded with zeros) by an (n+1)-bit divisor and appends the n-bit remainder as the CRC; the receiver divides data plus CRC by the same divisor and accepts on a zero remainder, rejects on a non-zero one. Encryption: the sender turns plaintext into ciphertext with an encryption algorithm and key, and the receiver recovers the plaintext with a decryption algorithm and key.]
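The data-integrity check listed above (sender appends a CRC, receiver verifies it) can be sketched with Python's standard `zlib.crc32`, which computes the CRC-32 used by IEEE 802.3. This is only an illustration of the check itself, not of the ISA-L API; the helper names `frame_with_crc` and `check_frame` and the 4-byte little-endian trailer are assumptions made for the example. The receiver here recomputes the CRC and compares it, which has the same accept/reject effect as the zero-remainder test in the slide's diagram.

```python
import zlib


def frame_with_crc(payload: bytes) -> bytes:
    """Sender side: append the CRC-32 of the payload as a 4-byte trailer."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "little")


def check_frame(frame: bytes) -> bool:
    """Receiver side: recompute the CRC-32 and compare it with the trailer."""
    payload, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(payload) == int.from_bytes(trailer, "little")


frame = frame_with_crc(b"123456789")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]   # flip one bit in transit
print(check_frame(frame), check_frame(corrupted))
```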
Hashing Usage: Data Deduplication Optimizations (Fixed Size)
[Figure: an incoming data stream enters the deduplication engine, which splits it into fixed-size chunks (data chunking), hashes and indexes them (chunk sequence A B C A D B E D), and stores only the unique chunks A B C D E. Key: Intel ISA-L components vs. 3rd-party components.]
Data processing with Intel ISA-L:
• Multi-buffer hashing algorithms: SHA-1, SHA-256, SHA-512, MD5
• Hashing function stitching algorithm: Multi-hash-sha1+murmur3_128
• Up to 7X performance over OpenSSL algorithms
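The fixed-size flow on this slide (chunk, hash, index, store unique chunks) can be sketched as below. The sketch uses Python's standard `hashlib.sha1` as a stand-in for the multi-buffer hashing that ISA-L provides; the function names `dedup_fixed` and `rebuild` and the 4-byte chunk size are assumptions made for the example.

```python
import hashlib


def dedup_fixed(stream: bytes, chunk_size: int):
    """Split the stream into fixed-size chunks and keep one copy per digest."""
    store = {}    # digest -> chunk bytes (unique chunks only)
    index = []    # ordered digests, enough to rebuild the stream
    for offset in range(0, len(stream), chunk_size):
        chunk = stream[offset:offset + chunk_size]
        digest = hashlib.sha1(chunk).hexdigest()
        store.setdefault(digest, chunk)
        index.append(digest)
    return store, index


def rebuild(store, index):
    """Reassemble the original stream from the index and the chunk store."""
    return b"".join(store[digest] for digest in index)


# Same chunk pattern as the slide: A B C A D B E D
stream = b"AAAABBBBCCCCAAAADDDDBBBBEEEEDDDD"
store, index = dedup_fixed(stream, 4)
print(len(index), "chunks referenced,", len(store), "chunks stored")
```

Fixed-size chunking is cheap, but a single inserted byte shifts every later chunk boundary, which is the motivation for the dynamic-size variant on the next slide.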
Hashing Usage: Data Deduplication Optimizations (Dynamic Size)
[Figure: an incoming data stream enters the deduplication engine, which splits it into variable-size chunks (data chunking), hashes and indexes them (chunk sequence A B A A C B D C), and stores only the unique chunks A B C D. Key: Intel ISA-L components vs. 3rd-party components.]
Data processing with Intel ISA-L:
• Rolling hash fingerprinting (used for data chunking)
• Multi-buffer hashing algorithms: SHA-1, SHA-256, SHA-512, MD5
• Hashing function stitching algorithm: Multi-hash-sha1+murmur3_128
• Up to 7X performance over OpenSSL algorithms
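Dynamic-size chunking picks chunk boundaries from the content itself by sliding a rolling hash over the stream. The sketch below is a minimal illustration assuming a simple polynomial rolling hash; the constants `WINDOW`, `BASE`, `MOD`, and `DIVISOR` and the function names are hypothetical choices for the example and do not reflect the ISA-L rolling-hash implementation.

```python
import hashlib

WINDOW = 16            # bytes covered by the rolling hash
BASE = 257
MOD = 1_000_000_007
DIVISOR = 64           # cut when hash % DIVISOR == 0 (roughly one cut per 64 bytes)


def split_chunks(data: bytes):
    """Split data at content-defined boundaries chosen by a rolling hash."""
    top = pow(BASE, WINDOW - 1, MOD)    # weight of the byte leaving the window
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD   # drop the oldest byte
        h = (h * BASE + byte) % MOD                  # add the newest byte
        if i >= WINDOW - 1 and h % DIVISOR == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])                  # trailing partial chunk
    return chunks


def dedup_dynamic(data: bytes):
    """Fingerprint each variable-size chunk and keep one copy per digest."""
    store, index = {}, []
    for chunk in split_chunks(data):
        digest = hashlib.sha1(chunk).hexdigest()
        store.setdefault(digest, chunk)
        index.append(digest)
    return store, index
```

Because each boundary depends only on the last `WINDOW` bytes, inserting a byte near the start of the stream moves the later boundaries along with the data, so the chunks after the first boundary keep the same fingerprints.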
Intel ISA-L provides a solution to deploy Erasure Code (EC) with better performance, so that data can be protected faster and with half the space of other methods.
• Supports any encoding matrix: Vandermonde Reed-Solomon EC, Cauchy Reed-Solomon EC
• Supports different EC strategies: Locally Repairable Code EC, Regenerating Code EC, Hitchhiker Code EC
[Figure: EC(9+3) encode with Intel ISA-L: data blocks D1 to D9 produce parity blocks P1, P2, P3, at ~10X the performance of traditional lookup-table code. EC(9+3) decode with Intel ISA-L: lost blocks D1 and D4 are reconstructed from P1, P2, and the surviving data blocks, at ~10X the performance of traditional lookup-table code. A bar chart compares the storage capacity needed by erasure code with that needed by 3X replication.]
Source: "Erasure Code and Intel® Intelligent Storage Acceleration Library"
http://www.intel.com/content/www/us/en/storage/erasure-code-isa-l-solution-video.html
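The capacity comparison on this slide follows from simple arithmetic: a k+m erasure code stores (k+m)/k bytes per byte of data, while 3X replication stores 3. The sketch below works that out for the EC(9+3) layout shown and then demonstrates encode and reconstruct with single XOR parity, the simplest erasure code (the RAID 5 case), as a stand-in for Reed-Solomon. The function names are hypothetical and this is not the ISA-L API.

```python
from fractions import Fraction


def storage_overhead(data_blocks: int, parity_blocks: int) -> Fraction:
    """Raw bytes stored per byte of user data for a k+m erasure code."""
    return Fraction(data_blocks + parity_blocks, data_blocks)


def xor_parity(blocks):
    """Single-parity encode: XOR equal-length blocks byte by byte."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            parity[i] ^= value
    return bytes(parity)


def reconstruct(surviving_blocks, parity):
    """Recover the one missing block by XOR-ing the survivors with the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


ec = storage_overhead(9, 3)          # EC(9+3) from the slide
replication = Fraction(3, 1)         # three full copies
print(float(ec), float(replication), float(ec / replication))

data = [b"D1D1", b"D2D2", b"D3D3"]
parity = xor_parity(data)
print(reconstruct([data[0], data[2]], parity))   # recovers the lost middle block
```

EC(9+3) therefore needs 4/3 of the user data size and survives the loss of any three blocks, while 3X replication needs three times the user data size and survives the loss of two copies.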
Solving Real-World Problems: Qihoo 360
• Deployed Intel ISA-L-based HDFS RAID for Intel Xeon-based cold storage
• EC encode speeds 45X faster than Java VRS
• EC decode speeds 36X faster than Java VRS
• Reduced costs by 25%~30%
Source: Case Study "Intel and Qihoo 360 Internet Portal Datacenter - Big Data Storage Optimization Case Study"
https://software.intel.com/en-us/articles/intel-and-qihoo-360-internet-portal-datacenter-big-data-storage-optimization-case-study
Solving Real-World Problems: Alibaba
• Deployed Intel ISA-L-based Sheepdog erasure code for Intel Atom C2000-based cold storage
• 5X CPU utilization reduction
• Data recovery speeds 4X faster than Sheepdog ZFEC
Source: Case Study "Lambert: Achieve High Durability, Low Cost & Flexibility at Same Time, Open source storage engine for exabyte data in Alibaba"
http://events.linuxfoundation.org/sites/events/files/slides/LFVault2015_Alibaba.pdf
Conclusion
In this presentation, we introduced the storage optimization techniques provided by Intel for accelerating Ceph performance:
• I/O optimization technique: DPDK for storage (SPDK)
• Data Processing Acceleration: ISA-L
These building block techniques can help customers accelerate Ceph performance on IA platforms.
Q & A
Accelerate Ceph performance via SPDK related techniques

More Related Content

PDF
PDF
Crimson: Ceph for the Age of NVMe and Persistent Memory
PDF
Ceph on arm64 upload
PDF
The Linux Block Layer - Built for Fast Storage
PDF
Introduction to Greenplum
PDF
Spark streaming: Best Practices
PDF
Storage 101: Rook and Ceph - Open Infrastructure Denver 2019
PPTX
Linux Network Stack
Crimson: Ceph for the Age of NVMe and Persistent Memory
Ceph on arm64 upload
The Linux Block Layer - Built for Fast Storage
Introduction to Greenplum
Spark streaming: Best Practices
Storage 101: Rook and Ceph - Open Infrastructure Denver 2019
Linux Network Stack

What's hot (20)

PPTX
Gpdb best practices v a01 20150313
PDF
Physical Memory Management.pdf
PDF
2021.02 new in Ceph Pacific Dashboard
PDF
USENIX ATC 2017: Visualizing Performance with Flame Graphs
PDF
IT Automation with Ansible
PPTX
Kubernetes introduction
PDF
A crash course in CRUSH
PDF
BlueStore, A New Storage Backend for Ceph, One Year In
PPTX
What you need to know about ceph
PDF
DevJam 2019 - Introduction to Kubernetes
PPTX
RocksDB compaction
PDF
Replicação Lógica no PostgreSQL 10
PDF
Best Practices with PostgreSQL on Solaris
PDF
Cassandra overview
PPTX
Apache Flink and what it is used for
PDF
Ansible Introduction
PDF
Apache spark 2.3 and beyond
PDF
2019.06.27 Intro to Ceph
PDF
Alphorm.com Formation Ansible : Le Guide Complet du Débutant
KEY
Introduction to Cassandra: Replication and Consistency
Gpdb best practices v a01 20150313
Physical Memory Management.pdf
2021.02 new in Ceph Pacific Dashboard
USENIX ATC 2017: Visualizing Performance with Flame Graphs
IT Automation with Ansible
Kubernetes introduction
A crash course in CRUSH
BlueStore, A New Storage Backend for Ceph, One Year In
What you need to know about ceph
DevJam 2019 - Introduction to Kubernetes
RocksDB compaction
Replicação Lógica no PostgreSQL 10
Best Practices with PostgreSQL on Solaris
Cassandra overview
Apache Flink and what it is used for
Ansible Introduction
Apache spark 2.3 and beyond
2019.06.27 Intro to Ceph
Alphorm.com Formation Ansible : Le Guide Complet du Débutant
Introduction to Cassandra: Replication and Consistency
Ad

Similar to Accelerate Ceph performance via SPDK related techniques (20)

PDF
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
PPTX
Ceph Day Tokyo - Delivering cost effective, high performance Ceph cluster
PDF
Ceph Day Seoul - Delivering Cost Effective, High Performance Ceph cluster
PPTX
Ceph Day KL - Delivering cost-effective, high performance Ceph cluster
PDF
1 intro to_dpdk_and_hw
PDF
Ceph Day Taipei - Delivering cost-effective, high performance, Ceph cluster
PDF
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
PDF
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
PDF
LF_DPDK_Accelerate storage service via SPDK
PDF
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
PDF
3 additional dpdk_theory(1)
PDF
4 dpdk roadmap(1)
PPTX
Ceph Day Taipei - Accelerate Ceph via SPDK
PDF
Ceph Day Beijing - Storage Modernization with Intel and Ceph
PDF
Ceph Day Beijing - Storage Modernization with Intel & Ceph
PDF
Crooke CWF Keynote FINAL final platinum
PDF
Accelerating Virtual Machine Access with the Storage Performance Development ...
PDF
What are latest new features that DPDK brings into 2018?
PDF
Hw09 Optimizing Hadoop Deployments
PDF
Hw09 Optimizing Hadoop Deployments
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
Ceph Day Tokyo - Delivering cost effective, high performance Ceph cluster
Ceph Day Seoul - Delivering Cost Effective, High Performance Ceph cluster
Ceph Day KL - Delivering cost-effective, high performance Ceph cluster
1 intro to_dpdk_and_hw
Ceph Day Taipei - Delivering cost-effective, high performance, Ceph cluster
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
Ceph on Intel: Intel Storage Components, Benchmarks, and Contributions
LF_DPDK_Accelerate storage service via SPDK
Red Hat Storage Day New York - Intel Unlocking Big Data Infrastructure Effici...
3 additional dpdk_theory(1)
4 dpdk roadmap(1)
Ceph Day Taipei - Accelerate Ceph via SPDK
Ceph Day Beijing - Storage Modernization with Intel and Ceph
Ceph Day Beijing - Storage Modernization with Intel & Ceph
Crooke CWF Keynote FINAL final platinum
Accelerating Virtual Machine Access with the Storage Performance Development ...
What are latest new features that DPDK brings into 2018?
Hw09 Optimizing Hadoop Deployments
Hw09 Optimizing Hadoop Deployments
Ad

Recently uploaded (20)

PPTX
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PDF
Modernizing your data center with Dell and AMD
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
DOCX
The AUB Centre for AI in Media Proposal.docx
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
Encapsulation_ Review paper, used for researhc scholars
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PPTX
Big Data Technologies - Introduction.pptx
PDF
cuic standard and advanced reporting.pdf
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Modernizing your data center with Dell and AMD
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Spectral efficient network and resource selection model in 5G networks
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
NewMind AI Weekly Chronicles - August'25 Week I
Unlocking AI with Model Context Protocol (MCP)
Building Integrated photovoltaic BIPV_UPV.pdf
The AUB Centre for AI in Media Proposal.docx
Reach Out and Touch Someone: Haptics and Empathic Computing
Encapsulation_ Review paper, used for researhc scholars
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
Mobile App Security Testing_ A Comprehensive Guide.pdf
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
“AI and Expert System Decision Support & Business Intelligence Systems”
Big Data Technologies - Introduction.pptx
cuic standard and advanced reporting.pdf
Understanding_Digital_Forensics_Presentation.pptx

Accelerate Ceph performance via SPDK related techniques

  • 1. Accelerate Ceph performance via SPDK related techniques Ziye Yang Intel Corporation Oct 2015
  • 2. SG Storage Group Legal Disclaimer 2 Notice: This document contains information on products in the design phase of development. The information here is subject to change without notice. Do not finalize a design with this information. Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at Intel.com, or from the OEM or retailer. No computer system can be absolutely secure. Intel does not assume any liability for lost or stolen data or systems or any damages resulting from such losses. You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. 
<Use only if applicable> Warning: Altering PC clock or memory frequency and/or voltage may (i) reduce system stability and use life of the system, memory and processor; (ii) cause the processor and other system components to fail; (iii) cause reductions in system performance; (iv) cause additional heat or other damage; and (v) affect system data integrity. Intel assumes no responsibility that the memory, included if used with altered clock frequencies and/or voltages, will be fit for any particular purpose. Check with memory manufacturer for warranty and additional details. <Use only if applicable> Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance. <Use only if applicable> Cost reduction scenarios described are intended as examples of how a given Intel- based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction. <Use only if applicable> Results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance. <Use only if applicable> Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate. <Use only if applicable> Intel® AMT should be used by a knowledgeable IT administrator and requires enabled systems, software, activation, and connection to a corporate network. 
Intel AMT functionality on mobile systems may be limited in some situations. Your results will depend on your specific implementation. Learn more by visiting Intel® Active Management Technology. <Use only if applicable> Intel is a sponsor and member of the Benchmark XPRT Development Community, and was the major developer of the XPRT family of benchmarks. Principled Technologies is the publisher of the XPRT family of benchmarks. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases. Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm. Intel and the Intel logo <Add terms trademarked by Intel and used in this document> are trademarks of Intel Corporation in the U. S. and/or other countries. *Other names and brands may be claimed as the property of others. Copyright © 20xx, Intel Corporation. All Rights Reserved.
  • 3. Agenda
      - Recent common requirements for Ceph
      - Middle Cache tiering solution
      - Building block techniques
          - I/O optimization technique: DPDK for storage
          - Data processing acceleration techniques: ISA-L
      - Conclusion
  • 5. Recent common requirements for Ceph
      - Legacy protocol support
          - For example, iSCSI interface support for transparently migrating applications from enterprise storage to cloud storage (Ceph)
      - High performance requirements for Ceph
          - Low latency for front-end applications
  • 7. Provide a Mid-Tier Cache between applications and Ceph
      • Protocol support:
          • Aims to target the iSCSI, NFS, and NVMF protocols.
      • High performance:
          • Provides a local cache in each Mid-Tier node.
          • Provides a write log for data consistency.
      • HA:
          • Replicates to 1+ additional nodes.
          • Uses a heartbeat for failed node detection.
      [Diagram: iSCSI/NVMF initiators connect to Mid-Tier Cache Nodes, each containing an iSCSI/NVMF target, a cache, a write log, and Ceph RBD; the Mid-Tier Cache Nodes connect to the Ceph cluster.]
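The write path described on this slide (local cache, write log, replication to a peer, heartbeat) can be sketched as follows. This is a minimal illustration under stated assumptions, not the design from the talk: the class `MidTierNode`, the function `failed_nodes`, and the plain dictionary standing in for a Ceph RBD image are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class MidTierNode:
    """Hypothetical mid-tier cache node (illustration only)."""
    name: str
    cache: dict = field(default_factory=dict)      # block offset -> data
    write_log: list = field(default_factory=list)  # ordered, not-yet-flushed writes
    replicas: list = field(default_factory=list)   # peer nodes mirroring the log

    def write(self, offset, data):
        # Log first, mirror the record to every peer, then update the cache.
        record = (offset, data)
        self.write_log.append(record)
        for peer in self.replicas:
            peer.write_log.append(record)
        self.cache[offset] = data

    def read(self, offset, backend):
        # Serve from the local cache; fall back to the backend on a miss.
        if offset in self.cache:
            return self.cache[offset]
        data = backend.get(offset)
        if data is not None:
            self.cache[offset] = data
        return data

    def flush(self, backend):
        # Replay the log into the backend in order, then trim all copies.
        for offset, data in self.write_log:
            backend[offset] = data
        flushed = len(self.write_log)
        self.write_log.clear()
        for peer in self.replicas:
            del peer.write_log[:flushed]
        return flushed


def failed_nodes(last_heartbeat, now, timeout):
    """Return names whose last heartbeat is older than `timeout` seconds."""
    return sorted(n for n, seen in last_heartbeat.items() if now - seen > timeout)
```

If the primary node fails before flushing, a peer still holds the same ordered log and could replay it into the backend, which is the reason for logging and replicating before acknowledging a write.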
  • 9. DPDK for Storage Overview
      - Uses Intel® DPDK and UNS technology
          - Optimized user-space lockless polling technology in the NIC driver
          - Presents lock-light libraries and network TCP/IP services
      - Provides an enhanced software stack that optimizes iSCSI front-end targets
          - Optimizes packets in user space using lockless polling mechanisms
          - Reference software available for customer application integration to NVMe or other backends
      - Supports Linux* operating systems
      - Enables higher system-level performance for iSCSI targets
      - Currently available as reference software
      [Diagram: software stack with read and write paths between the cloud and NVMe. Components: iSCSI Target, Customer Storage App, Intel® DPDK NIC Driver, TCP/IP (UNS), NVMe Driver, Intel® DPDK Libraries, User-space Mem Driver, CBDMA Driver; hardware: NIC, DDR, CBDMA, NVMe. Legend: Customer SW, Existing SW, Enhanced SW, Linux* Kernel.]
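The polling model named on this slide can be illustrated with a small simulation. It is a sketch under stated assumptions, not the DPDK or SPDK API: `FakeQueuePair` and `run_poller` are hypothetical names, and the "device" completes requests in submission order. The point it shows is that a queue owned by a single thread needs no lock, and that the application checks for completions itself instead of sleeping until an interrupt arrives.

```python
from collections import deque


class FakeQueuePair:
    """Hypothetical stand-in for a per-thread submission/completion queue."""

    def __init__(self):
        self._inflight = deque()  # touched by one thread only, so no lock

    def submit(self, tag, callback):
        # Queue a request; the callback runs when the request completes.
        self._inflight.append((tag, callback))

    def process_completions(self, max_completions):
        # Reap up to max_completions finished requests and run their callbacks.
        done = 0
        while self._inflight and done < max_completions:
            tag, callback = self._inflight.popleft()
            callback(tag)
            done += 1
        return done


def run_poller(qpair, total, batch=4):
    """Submit `total` requests, then poll until all of them have completed."""
    completed = []
    polls = 0
    for tag in range(total):
        qpair.submit(tag, completed.append)
    while len(completed) < total:
        qpair.process_completions(batch)
        polls += 1
    return completed, polls
```

In a real poll-mode driver the loop would run on a dedicated core and would interleave submissions with completion polling; the batch limit keeps one busy queue from starving the others handled by the same thread.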
  • 10. High-Level Block Diagram
      [Diagram not recoverable from the extracted text. Legend: Third Party SW, New Intel SW, Existing Intel SW, Linux* Kernel.]
  • 11. Intel DPDK for Storage Benefits
      - Up to 7x better performance+
          - vs. Linux*-IO Target (LIO)
          - Or 1/7 the CPU overhead at the same performance
      - Up to 10x fewer cores utilized with the NVMe driver
          - vs. the Linux* NVMe driver
      - Reduces total cost of ownership
          - Reduces BOM costs by between $80 and $500 by removing the need for a TOE
          - Frees CPU cycles for other workloads
      - Free source code
          - Customizable source code available as reference
          - Evaluation source available upon request
      - User-space implementation
          - Portable/upgradable, with permissive licensing
          - Requires a Software License Agreement for full product use
      + Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Source: Intel Internal Measurements as of 22 August 2014. See back up slides for configuration details. Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804. For more information go to http://www.intel.com/performance
  • 12. Intel DPDK for Storage: Full Packaging and Contents
      Library package includes:
      - Intel DPDK | UNS | Optimized Storage Stacks as reference software
      - User-space support code (written in C):
          - POSIX compliant
          - Demo/usage, unit test (functional correctness), basic performance
      - API manuals (may include links to or copies of key papers)
      - Release.txt (release notes, version, and library serial IDs)
      - Linux* support
      Source agreement:
      - Source available under Restricted Use License Agreement Confidential (RULAC)
      - Source code available under source license agreements (SLA)
  • 13. Intel® DPDK for Storage, Case Study 1: Performance Comparison with LIO
  • 14. Intel® Xeon® Processor E5-2620 v2, iSCSI Read/Write: 4 KB Data (performance per core)
      - NVM Express backend
      - Up to 650% increase in max performance per core+
      [Chart "Performance": IO/s (in thousands) for LIO (6 cores), DPDK for Storage (2 cores), and DPDK for Storage (1 core). Chart "Performance/Core": IO/s (in thousands) for LIO and DPDK for Storage. Workloads: 4 KB random 100% read; 4 KB random 70% read / 30% write; 4 KB random 100% write. Bar values not recoverable.]
      Source: Intel Internal Measurements as of 22 August 2014. See back up slide # 10-13 for configuration details.
  • 15. Intel® DPDK for Storage, Case Study 2: Benefit of the User-Space NVMe Driver (SPDK)
  • 16. 4 KB Random Read Performance: 4 x NVMe Drives, Single-Core Intel® Xeon® Processor
      - The DPDK for Storage NVMe driver delivers up to 6x performance improvement vs. the kernel NVMe driver with a single-core Intel® Xeon® processor
      [Chart: IOps (thousands), axis 0 to 2000, for 1, 2, 4, 8, and 16 partitions. Legend labels: WKB NVMe Driver, Kernel NVMe Driver, DPDK for Storage NVMe Driver. Bar values not recoverable.]
      For test configuration details see slide # 16 and 18.
  • 17. 4 KB Random Read Performance: 1-4 NVMe Drives, Single-Core Intel® Xeon® Processor
      - The DPDK for Storage NVMe driver scales linearly in performance from 1 to 4 NVMe drives with a single-core Intel® Xeon® processor
      [Chart: IOps (thousands), axis 0 to 2000, for 1, 2, and 4 NVMe drives. Series: Kernel NVMe Driver, DPDK for Storage NVMe Driver. Bar values not recoverable.]
      For test configuration details see slide # 16 and 18.
  • 19. Benefits of using Intel ISA-L
      Intel ISA-L enables storage OEMs to obtain more performance from Intel CPUs and reduce investment in developing their own optimizations.
      - Faster time to market (TTM) and fewer resources than developing optimizations from scratch
      - Up to 7x bandwidth for hash functions compared to OpenSSL algorithms
      - Up to 4x bandwidth improvement on compression compared to zlib
      - Allows Intel to develop optimizations that use new architectural enhancements that are TTM
      - Allows maximum utilization of additional cores
  • 20. Intel® ISA-L Packaging and Contents
      - Supports 64-bit Intel® Xeon® and Atom processors, from the E5-2600/2400 and Atom C2000 product families forward
      - Source code library
      - Single-core low-level functions (OS-independent functions)
      - Generation-over-generation function updates to take advantage of new processor features
  • 21. Intel® ISA-L Functions (performance optimizing)
      - Data protection: XOR (RAID 5), P+Q (RAID 6), Reed-Solomon erasure code
      - Compression ("DEFLATE"): IGZIP fast compression
      - Cryptographic hashing: multi-buffer SHA-1, SHA-256, SHA-512, MD5
      - Data integrity: CRC-T10, CRC-IEEE (802.3), CRC32-iSCSI
      - Encryption: XTS-AES 128, XTS-AES 256
      [Diagram, data integrity: the sender divides the data (padded with zeros) by an (n+1)-bit divisor and appends the n-bit remainder as the CRC; the receiver divides data plus CRC by the same divisor and accepts on a zero remainder, rejects on a non-zero remainder.]
      [Diagrams, hashing and encryption: an input word mapped to hash digests; plaintext encrypted by the sender and decrypted by the receiver. Details not recoverable.]
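The sender/receiver CRC check in the data-integrity item can be sketched with Python's standard `zlib.crc32`, which computes the CRC-32 used by IEEE 802.3. This is an illustration of the check itself, not of ISA-L: it does not call ISA-L, the helper names `frame` and `check` are hypothetical, and the receiver recomputes and compares the CRC rather than testing for a zero remainder.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Sender side: append the 32-bit CRC of the payload as a trailer."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def check(framed: bytes) -> bool:
    """Receiver side: recompute the CRC and compare it with the trailer."""
    if len(framed) < 4:
        return False
    payload, trailer = framed[:-4], framed[-4:]
    return zlib.crc32(payload) == struct.unpack("<I", trailer)[0]
```

A storage stack would run this kind of check on every block it moves, which is why a faster CRC routine translates directly into saved CPU cycles.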
  • 22. Hashing Usage: Data Deduplication Optimizations (Fixed Size)
      - Pipeline: incoming data stream, then data chunking, then the deduplication engine (indexing), then store data
      - Example: the stream is cut into fixed-size chunks that hash to A B C A D B E D; only the unique chunks A B C D E are stored
      - Data processing with Intel ISA-L:
          - Multi-buffer hashing algorithms: SHA-1, SHA-256, SHA-512, MD5
          - Hashing function stitching algorithm: Multi-hash-sha1+murmur3_128
      - Up to 7x performance over OpenSSL algorithms
      [Diagram key: Intel ISA-L, 3rd party]
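The fixed-size pipeline on this slide (chunk, hash, index, store unique chunks) can be sketched in a few lines. This is a minimal illustration that uses Python's standard `hashlib` SHA-1 in place of the ISA-L multi-buffer hashes; the function names `dedup_fixed` and `restore` and the 4-byte chunk size are assumptions made for the example.

```python
import hashlib


def dedup_fixed(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and keep one copy per digest."""
    store = {}   # digest -> chunk bytes (the unique chunks that get stored)
    index = []   # ordered digests, enough to rebuild the original stream
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha1(chunk).hexdigest()
        store.setdefault(digest, chunk)
        index.append(digest)
    return store, index


def restore(store, index) -> bytes:
    """Rebuild the original stream from the index and the chunk store."""
    return b"".join(store[digest] for digest in index)
```

Feeding it the slide's chunk sequence (A B C A D B E D, each letter repeated to fill one chunk) leaves five stored chunks for eight indexed ones. Hashing dominates the cost of this loop, which is where a faster hash implementation pays off.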
  • 23. Hashing Usage: Data Deduplication Optimizations (Dynamic Size)
      - Pipeline: incoming data stream, then data chunking, then the deduplication engine (indexing), then store data
      - Example: the stream is cut into variable-size chunks that hash to A B A A C B D C; only the unique chunks A B C D are stored
      - Data processing with Intel ISA-L:
          - Rolling hash fingerprinting (used to choose chunk boundaries)
          - Multi-buffer hashing algorithms: SHA-1, SHA-256, SHA-512, MD5
          - Hashing function stitching algorithm: Multi-hash-sha1+murmur3_128
      - Up to 7x performance over OpenSSL algorithms
      [Diagram key: Intel ISA-L, 3rd party]
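Dynamic-size chunking with a rolling hash can be sketched as below. This is an illustrative polynomial rolling hash, not the ISA-L rolling-hash routine; the window size, base, modulus, mask, and size limits are arbitrary values chosen for the example, and `split_chunks` and `dedup_dynamic` are hypothetical names. A boundary is declared wherever the hash of the last `WINDOW` bytes has its low bits equal to zero, so boundaries follow the content rather than fixed offsets.

```python
import hashlib

WINDOW = 16                        # bytes covered by the rolling hash
BASE = 257
MOD = (1 << 31) - 1
TOP = pow(BASE, WINDOW - 1, MOD)   # weight of the byte leaving the window


def split_chunks(data: bytes, mask: int = 0x3F,
                 min_size: int = 32, max_size: int = 1024):
    """Cut data where the rolling hash matches the mask (content-defined)."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Slide the window: drop the oldest byte's contribution.
            h = (h - data[i - WINDOW] * TOP) % MOD
        h = (h * BASE + byte) % MOD
        size = i + 1 - start
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def dedup_dynamic(data: bytes):
    """Deduplicate variable-size chunks by their SHA-1 digest."""
    store, index = {}, []
    for chunk in split_chunks(data):
        digest = hashlib.sha1(chunk).hexdigest()
        store.setdefault(digest, chunk)
        index.append(digest)
    return store, index
```

Because the hash depends only on the most recent bytes, inserting data near the start of a stream typically shifts just the nearby boundaries, and later chunks keep their digests. Fixed-size chunking loses every match after such an insertion.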
  • 24. Intel ISA-L provides a solution for deploying Erasure Code (EC) with better performance, so that data can be protected faster and with half the space of other methods (the slide compares the storage capacity needed for erasure code against 3x replication).
      • Supports any encoding matrix: Vandermonde Reed-Solomon EC, Cauchy Reed-Solomon EC
      • Supports different EC strategies: Local Reloadable Code EC, Regeneration Code EC, Hitchhiker Code EC
      [Diagram, Intel ISA-L EC(9+3) encode: data blocks D1 to D9 produce parity blocks P1, P2, P3; ~10x performance over traditional lookup table code.]
      [Diagram, Intel ISA-L EC(9+3) decode: lost blocks D1 and D4 are reconstructed from the surviving data blocks and parity blocks P1 and P2; ~10x performance over traditional lookup table code.]
      Source: "Erasure Code and Intel® Intelligent Storage Acceleration Library" http://www.intel.com/content/www/us/en/storage/erasure-code-isa-l-solution-video.html
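The capacity comparison on this slide can be checked with a line of arithmetic, and the idea of rebuilding a lost block can be shown with the simplest parity scheme. The sketch below is an assumption-laden illustration: it uses a single XOR parity block (RAID 5 style), not the Reed-Solomon code behind EC(9+3), and the function names are hypothetical. XOR parity survives one lost block, whereas a 9+3 Reed-Solomon code can survive any three.

```python
from functools import reduce


def storage_overhead(data_blocks: int, parity_blocks: int) -> float:
    """Raw capacity needed per unit of user data for a k+m erasure code."""
    return (data_blocks + parity_blocks) / data_blocks


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(blocks, missing_index, parity):
    """Rebuild one lost data block from the survivors and the parity block."""
    survivors = [b for i, b in enumerate(blocks) if i != missing_index]
    return xor_blocks(survivors + [parity])
```

For EC(9+3) the overhead is 12/9, about 1.33x, against 3x for triple replication, so the erasure-coded layout needs less than half the raw capacity.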
  • 25. Solving Real-World Problems: Qihoo 360
      - Deployed Intel ISA-L-based HDFS RAID for Intel Xeon-based cold storage
      - EC encode speeds 45x faster than Java VRS
      - EC decode speeds 36x faster than Java VRS
      - Reduced costs by 25%~30%
      Source: Case Study "Intel and Qihoo 360 Internet Portal Datacenter - Big Data Storage Optimization Case Study" https://software.intel.com/en-us/articles/intel-and-qihoo-360-internet-portal-datacenter-big-data-storage-optimization-case-study
  • 26. Solving Real-World Problems: Alibaba
      - Deployed Intel ISA-L-based Sheepdog erasure code for Intel Atom C2000-based cold storage
      - 5x CPU utilization reduction
      - Data recovery speeds 4x faster than Sheepdog ZFEC
      Source: Case Study "Lambert: Achieve High Durability, Low Cost & Flexibility at Same Time, Open source storage engine for exabyte data in Alibaba" http://events.linuxfoundation.org/sites/events/files/slides/LFVault2015_Alibaba.pdf
  • 27. SG Storage Group In this presentation, we introduce the storage optimization techniques provided by Intel for accelerating the Ceph performance: • I/O optimization technique: DPDK for storage (SPDK) • Data Processing Acceleration: ISA-L These kinds of building block techniques can help customers to accelerate the Ceph performance on IA platform Conclusion