ORNL is managed by UT-Battelle, LLC for the US Department of Energy
Introduction to Extrae/Paraver
George S. Markomanolis
7 August 2019
Extrae/Paraver
• Developed by Barcelona Supercomputing Center
• Extrae for instrumentation
• Paraver for visualization and performance analysis
• Installed version on Summit: v3.7.1
• Module: extrae
• Web site: https://tools.bsc.es/extrae
https://tools.bsc.es/paraver
Capability Matrix - Extrae
Capability Profiling Tracing Notes/Limitations
MPI, MPI-IO Yes Yes
OpenMP CPU Yes Yes Only GNU
OpenMP GPU Yes Yes Only with GNU compiler, no OpenACC
OpenACC No No
CUDA Yes Yes Not advanced
POSIX I/O ?? ??
POSIX threads Yes Yes
Memory – app-level Yes Yes Need to use dynamic allocation
Memory – func-level Yes Yes Need to use dynamic allocation
Hotspot Detection Yes Yes
Variance Detection Yes Yes
Hardware Counters Yes Yes
Compilation
• Extrae is a bit harder to start using than many other tools
• Instrumentation can be dynamic or static
– For static instrumentation, the application must be recompiled
– For dynamic instrumentation, compile with -g; it works even without -g, but less
information will be instrumented
How does Extrae work?
• Symbol substitution through LD_PRELOAD
– We need to use specific libraries based on programming language/model
• Dynamic instrumentation (based on DynInst)
• Static link
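The LD_PRELOAD route can be sketched as follows; the Extrae path is the Summit one from the later slides, and nothing here checks that it exists — the snippet only shows how the preload string is assembled so the runtime linker resolves MPI symbols through Extrae's wrappers first:

```shell
# Summit install path taken from the slides (an assumption elsewhere).
EXTRAE_HOME=/sw/summit/extrae/3.7.1/rhel7.5_gnu6.4.0
# Prepend Extrae's MPI wrapper library; keep any pre-existing preloads behind it.
PRELOAD="${EXTRAE_HOME}/lib/libmpitrace.so${LD_PRELOAD:+:$LD_PRELOAD}"
echo "$PRELOAD"
```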
Trace Generation Workflow
Library Selection
• Choose a library depending on the application type
• The suffix “f” is for Fortran codes
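The library names below come from the Extrae documentation (the bracketed "f" marks the Fortran variant); the small helper function is purely illustrative, not part of Extrae:

```shell
# Preload library per programming model (names from the Extrae manual):
#   libseqtrace.so          sequential codes
#   libmpitrace[f].so       MPI
#   libomptrace.so          OpenMP
#   libpttrace.so           pthreads
#   libcudatrace.so         CUDA
#   libompitrace[f].so      MPI + OpenMP
#   libcudampitrace[f].so   MPI + CUDA
choose_extrae_lib() {   # hypothetical helper, for illustration only
  case "$1" in
    mpi)        echo "libmpitrace.so"  ;;
    mpi+openmp) echo "libompitrace.so" ;;
    openmp)     echo "libomptrace.so"  ;;
    cuda)       echo "libcudatrace.so" ;;
    *)          echo "libseqtrace.so"  ;;
  esac
}
choose_extrae_lib mpi+openmp
```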
Extrae XML configuration - MPI
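The slide shows the XML file itself; a minimal sketch of an MPI-only extrae.xml, following the structure of the example configurations shipped with Extrae (the home path, counter names, and output name are assumptions), looks roughly like:

```xml
<?xml version='1.0'?>
<trace enabled="yes" home="/sw/summit/extrae/3.7.1/rhel7.5_gnu6.4.0"
       initial-mode="detail" type="paraver">
  <!-- Intercept MPI calls and record hardware counters at each event -->
  <mpi enabled="yes">
    <counters enabled="yes" />
  </mpi>
  <counters enabled="yes">
    <cpu enabled="yes" starting-set-distribution="1">
      <set enabled="yes" domain="all">PAPI_TOT_INS,PAPI_TOT_CYC</set>
    </cpu>
  </counters>
  <!-- Name of the merged Paraver trace produced by mpi2prv/mpimpi2prv -->
  <merge enabled="yes">miniWeather_mpi.prv</merge>
</trace>
```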
Execution and Merging
• jsrun -n 64 -r 8 -a 1 -c 1 ./trace.sh ./miniWeather_mpi
• trace.sh:
#!/bin/bash
export EXTRAE_HOME=/sw/summit/extrae/3.7.1/rhel7.5_gnu6.4.0
export EXTRAE_CONFIG_FILE=/full_path/extrae.xml
export LD_PRELOAD=${EXTRAE_HOME}/lib/libmpitrace.so:$LD_PRELOAD
$*
• jsrun -n 64 -r 8 -a 1 -c 1 mpimpi2prv -f TRACE.mpits -e miniWeather_mpi
After the execution with merging
• A folder set-X, where X is 0, 1, etc., contains the traces; one folder is
created for every 256 MPI processes
• The merge produces the files *.prv, *.pcf, and *.row; the first is the
merged trace and the other two hold information about the trace and the
events
• Now you need to visualize the trace for performance analysis
• We use the Paraver tool; it is available pre-compiled for Linux, Mac, and
Windows (https://tools.bsc.es/downloads) but is quite difficult
to build on Power processors. It is available on Rhea or on your own computer.
Paraver on Rhea
% ssh -Y username@rhea.ccs.ornl.gov
% module load paraver
% wxparaver
Location for configuration files: /sw/rhea/paraver/cfgs/
ls /sw/rhea/paraver/cfgs/
burst_mode clustering counters_PAPI CUDA folding General Java mpi
OmpSs OpenCL OpenMP pthread sampling+folding scripts
software_counters spectral uninstall.sh
Paraver – Load trace
Load trace
Paraver - Filter trace
Reduce trace size
Click Yes
Paraver - Filter trace
Click Browse and load the filter.xml file
Paraver – Visualize trace
Click Browse and load the filter.xml file
Paraver – Investigating trace
Remove the communication links
Paraver - Zoom
Zoom: left-click and drag the
cursor horizontally to the right
to select the area we want to
study
Paraver – Computation configuration file
• We load h_comp_time.cfg: File -> Load configuration
Click on Open Control Window
Paraver – Computation configuration file
Zoom
• We load h_comp_time.cfg, File -> Load configuration
We can see some
iterations
Paraver – Extract part of the original trace
• Select Filter Trace
• Select for Input the original trace
• Select cut for the execution chain
• Trace options: Use original time to be able
to compare between traces and remove last
state
• Click Select region and mark the area to cut from the original trace
• Click Apply, the trace will be created and loaded
Paraver – MPI Profile
• Select Hints -> MPI -> MPI Profile
Scroll down
Click Hints -> MPI profile -> Histogram
Zoom
The average under the Outside MPI
column represents the parallel
efficiency, the Avg/Max value is the
load balance, and the Max is the
communication efficiency
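These captions correspond to the usual POP efficiency metrics; writing $u_i$ for the fraction of its runtime that process $i$ spends outside MPI (useful computation), the histogram columns read roughly as:

```latex
\mathrm{LB} = \frac{\operatorname{avg}_i u_i}{\max_i u_i}, \qquad
\mathrm{CommE} = \max_i u_i, \qquad
\mathrm{PE} = \operatorname{avg}_i u_i = \mathrm{LB} \times \mathrm{CommE}
```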
Paraver - Analyzing the trace - MPI Profile
Select Window properties ->
Communication for Type and
Maximum bytes sent for Statistic.
Paraver - Computation
• Load the 2dh_useful_duration.cfg for a histogram of the duration of the computation regions
• Many areas are not made up of vertical lines, which indicates load imbalance
• We explore in the next slide what is inside the red circle
• We select Open Filtered Window and zoom in on the area of the red circle
Paraver - Computation
• We zoom in on the first area and compare with the MPI calls and the 2dp_line_call.cfg
• Only processes 2-8 execute this part, and it seems it is not instrumented, so it
could come from an external library
Paraver Useful Instructions
• Load the 2dh_useful_instructions.cfg
Paraver – Extract part from the original trace
After we click “Select Region…”
then select the area from the
already opened filtered trace
Paraver – Profile per calling line
We load the 2dp_line_call.cfg, select Open Control Window, and synchronize the new
window
Paraver – Late receivers
We load the late_receivers.cfg and the 2dp_line_call.cfg
Paraver – Late senders
We load the receiver_from_late_sender.cfg and the 2dp_line_call.cfg
MiniWeather MPI+OpenMP
• jsrun -n 64 -r 8 -a 1 -c 2 ./trace_openmp.sh ./miniWeather_mpi_openmp
• trace_openmp.sh:
#!/bin/bash
export EXTRAE_HOME=/sw/summit/extrae/3.7.1/rhel7.5_gnu6.4.0
export EXTRAE_CONFIG_FILE=/gpfs/alpine/…/c/extrae_openmp.xml
export LD_PRELOAD=${EXTRAE_HOME}/lib/libompitrace.so:$LD_PRELOAD
## Run the desired program
$*
• jsrun -n 64 -r 8 -a 1 -c 2 mpimpi2prv -f TRACE.mpits -e ./miniWeather_mpi_openmp
Parallel Loops
• Create a new chop file as described before
• Load the parallel_loops.cfg and zoom
Parallel Loops
• Create a new chop file as described before
• Load the parallel_loops.cfg and zoom
Load Balance
• Load OpenMP/analysis/load_balance.cfg
• Load the parallel_loops.cfg and zoom
Load Balance
• Load OpenMP/analysis/load_balance.cfg
Load Balance
• Load OpenMP/views/parallel_functions_useful.cfg and zoom
Paraver - Flush
Ctrl + Zoom
Paraver – Chop bigger area
Activate communication lines:
Right Click -> View -> Communication lines
Because of I/O, rank 0 is delayed
in posting the MPI_Irecv and
MPI_Isend
Ctrl + Zoom
on top
processes