George Markomanolis
KAUST Supercomputing Laboratory
2017-11-15
The Virtual Institute of I/O and the IO-500 BOF
Denver, Colorado, USA
Experience using the IO-500
Why use/contribute to the IO-500 benchmark?
• It is a community effort and we need your feedback: if something does not work,
report it and we will try to find a solution
• Even a submission of results is a contribution; people can see how your storage
performs
• Present the results to the users of your system and show them how the hard
cases perform, so they can avoid similar approaches
• It is fun:
• Exploring various filesystems
• Deciding on your next procurement
• Keeping track of storage performance as new technologies arrive is quite
interesting
Challenges
• Debugging on two nodes can be a totally different experience than on one
node
• Python and MPI caused us a lot of issues and some burnt core hours
• As a result, some unfinished executions did not erase the data they had created
• The parallel find library expects to find the mpicc command; Cray
systems do not provide mpicc, so we fixed this by manually modifying the configure file
• In some cases configure could fail and we just had to load the latest autotools
module, or fix something else
• Some commands on non-classic filesystems may not report back
what you expect
How to run IO-500
• git clone https://github.com/VI4IO/io-500-dev
• cd io-500-dev
• ./utilities/prepare.sh
• ./io500.sh (submit this script if you use a scheduler)
• email results to submit@io500.org
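Where a batch scheduler is used, io500.sh is typically wrapped in a job script. The sketch below assumes SLURM; the job name, node count, and walltime are illustrative values, not from the slides:

```shell
#!/bin/bash
#SBATCH --job-name=io500
#SBATCH --nodes=2            # illustrative; scale up once the configuration is validated
#SBATCH --time=02:00:00
# Minimal SLURM wrapper around the io500.sh driver script.
cd io-500-dev
./io500.sh
```

Submit it with sbatch from the directory that contains io-500-dev.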
Modify IO-500
• Modify io500.sh accordingly, for example:
io500_mpirun="mpirun"
io500_mpiargs="-np 2"
io500_ior_easy_params="-t 2048k -b 2g -F"
io500_mdtest_easy_files_per_proc=25000
Modify IO-500 II
• Modify io500.sh accordingly, selecting which experiments are executed:
io500_run_ior_easy="True"
io500_run_md_easy="True"
…
io500_run_md_hard_delete="True"
• For a valid submission, you need to execute all the tests, and the write
phases should take at least 5 minutes
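A back-of-envelope way to size the easy write so it meets the 5-minute rule: multiply the target duration by the per-process bandwidth you measured in a short trial run. The bandwidth below is an assumed example value, not from the slides:

```shell
# Per-process block size (-b, in MiB) so the easy write lasts >= 5 minutes.
target_seconds=300
per_proc_bw_mib=20                        # assumed measured MiB/s per process
block_mib=$((target_seconds * per_proc_bw_mib))
echo "io500_ior_easy_params=\"-t 2m -b ${block_mib}m\""
```

With the assumed 20 MiB/s per process this suggests a 6000 MiB block per process.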
Modify IO-500 III
• Modify io500.sh accordingly, uncomment these lines and declare the
path to your pfind wrapper:
#io500_find_mpi="True"
#io500_find_cmd="$PWD/bin/pfind"
Example of an invalid test case
[RESULT] BW phase 1 ior_easy_write 96.133 GB/s : time 187.24 seconds
[RESULT] BW phase 2 ior_hard_write 11.230 GB/s : time 46.79 seconds
[RESULT] BW phase 3 ior_easy_read 109.249 GB/s : time 164.76 seconds
[RESULT] BW phase 4 ior_hard_read 7.871 GB/s : time 66.74 seconds
[RESULT] IOPS phase 1 mdtest_easy_write 49.231 kiops : time 19.61 seconds
[RESULT] IOPS phase 2 mdtest_hard_write 15.444 kiops : time 17.05 seconds
[RESULT] IOPS phase 3 find 8.120 kiops : time 98.45 seconds
[RESULT] IOPS phase 5 mdtest_easy_stat 5.313 kiops : time 127.18 seconds
[RESULT] IOPS phase 6 mdtest_hard_stat 6.772 kiops : time 30.43 seconds
[RESULT] IOPS phase 7 mdtest_easy_delete 14.873 kiops : time 49.98 seconds
[RESULT] IOPS phase 8 mdtest_hard_read 45.599 kiops : time 10.16 seconds
[RESULT] IOPS phase 9 mdtest_hard_delete 30.776 kiops : time 11.84 seconds
[SCORE] Bandwidth 31.04 GB/s : IOPS 16.1537 kiops : TOTAL 501.4108
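The run above is invalid because its write phases finished well under the 5-minute minimum (e.g. ior_easy_write at 187.24 s). A small awk filter, sketched here against the [RESULT] line format shown above, can flag such phases:

```shell
# Print write phases whose runtime is below the 5-minute (300 s) minimum.
flag_short_writes() {
  awk '/_write / && $NF == "seconds" {
    t = $(NF - 1) + 0            # runtime in seconds is the next-to-last field
    if (t < 300) print $5, t     # $5 is the phase name in [RESULT] lines
  }'
}

flag_short_writes <<'EOF'
[RESULT] BW phase 1 ior_easy_write 96.133 GB/s : time 187.24 seconds
[RESULT] BW phase 3 ior_easy_read 109.249 GB/s : time 164.76 seconds
EOF
```

Only the write line is reported; read phases have no minimum duration.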
Experience with IO500 benchmark
• Without proper tuning, the benchmark will finish either too fast or too slow
• Start tuning with small values and increase them until you find the ones that produce
the required outcome
• Be sure that you have enough space for the output data
• Check from the IOR output that it correctly recognizes the number of processes and
how many are used per node
• If the benchmark is too slow for no apparent reason, check whether other users are
executing I/O-intensive applications
• Be sure that you do not harm the system; try to execute the benchmark when the
system is not too busy, or during maintenance
• For IOR hard, you could stripe the corresponding folder
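On Lustre, for example, the IOR hard working directory can be striped across all OSTs before the run. This is a sketch: the directory path is illustrative, and the right stripe settings depend on the filesystem:

```shell
# Stripe the IOR hard directory across all OSTs (-c -1) on Lustre.
# The path below is illustrative; point it at the io500.sh working directory.
lfs setstripe -c -1 datafiles/io500/ior_hard
```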
KAUST – Lustre – IO-500
• 1000 compute nodes, 16000 processes, 144 OSTs
• ior_easy_params="-t 2m -b 5440m"
• ior_hard_writes_per_proc=792
• mdtest_hard_files_per_proc=380
• mdtest_easy_files_per_proc=452
KAUST – Cray DataWarp – IO-500
• 300 compute nodes, 2400 processes, 268 DataWarp nodes
• ior_easy_params="-t 2m -b 192616m"
• ior_hard_writes_per_proc=77872
• mdtest_hard_files_per_proc=1630
• mdtest_easy_files_per_proc=10800
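As a sanity check on the space requirement, the ior_easy write volume implied by the two configurations above is just processes times the per-process block size (-b):

```shell
# Aggregate ior_easy write volume: processes x block size per process (-b, in MiB).
lustre_mib=$((16000 * 5440))        # Lustre configuration above
datawarp_mib=$((2400 * 192616))     # DataWarp configuration above
echo "Lustre:   $((lustre_mib / 1024 / 1024)) TiB"
echo "DataWarp: $((datawarp_mib / 1024 / 1024)) TiB"
```

That is roughly 83 TiB for the Lustre run and 440 TiB for the DataWarp run, which must fit in the target storage.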
Profiling IO500 with Darshan: IOR easy
(Four slides of Darshan profile figures for the IOR easy phase.)
Profiling IO500 with Darshan: IOR hard
(Two slides of Darshan profile figures for the IOR hard phase.)
Profiling IO500 with Darshan: MD hard
(Two slides of Darshan profile figures for the mdtest hard phase.)
Presenting data in a radar chart
(Radar chart comparing the two top-ranked systems, #1 and #2, on axes Score, IO, MD, and total IOPS, normalized 0 to 1.)
The best storage I/O system would be represented by a full diamond graph.
Conclusions
• Tuning the parameters can take some time, depending on the system and on
experience
• The good news is that, as a community, we can solve many issues
• Until now, IOR easy has been considered the normal approach for procurement;
however, it does not correspond to real applications
• We need a better way to approach storage procurement, and IO500
seems to be a step in the right direction
• We plan some future additions, such as mixed workloads
• The more submissions we have, the better we can understand the various
filesystems