ANALYTICAL MODELS OF
PARALLEL PROGRAMS
Prof. Shashikant V. Athawale
Assistant Professor | Computer Engineering
Department | AISSMS College of Engineering,
Kennedy Road, Pune, MH, India - 411001
Contents
❖ Analytical Models: Sources of overhead in Parallel
Programs.
❖ Performance Metrics for Parallel Systems
❖ The effect of Granularity on Performance
❖ Scalability of Parallel Systems
❖ Minimum execution time and minimum cost
❖ Optimal execution time.
❖ Dense Matrix Algorithms: Matrix-Vector Multiplication
❖ Matrix-Matrix Multiplication.
Basics of Analytical Modelling
❖ A sequential algorithm is evaluated by its runtime as a
function of its input size.
➢ O(f(n)), Ω(f(n)), Θ(f(n)).
❖ The asymptotic runtime is independent of the platform:
the analysis holds up to a constant factor.
❖ A parallel algorithm is evaluated by its runtime as a
function of
➢ the input size,
➢ the number of processors.
Sources of overhead in parallel programs
❖ Overheads: wasted computation, communication, idling,
contention.
➢ Inter-process interaction.
➢ Load imbalance.
➢ Dependencies.
Performance metrics for parallel systems
❖ Execution time = time elapsed between
➢ the beginning and the end of execution on a sequential
computer (serial runtime, Ts).
➢ the moment the first processor starts and the moment the
last processor finishes on a parallel computer (parallel
runtime, Tp).
Performance metrics for parallel systems
❖ Total parallel overhead.
➢ Total time collectively spent by all processing
elements = pTp
➢ Time spent doing useful work (serial time) = Ts
➢ Overhead function: To = pTp - Ts.
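The overhead function can be evaluated directly from measured runtimes. The sketch below is only illustrative: the function name `parallel_metrics` and the timings are hypothetical, and the speedup and efficiency formulas are the usual textbook definitions (S = Ts/Tp, E = S/p), which are assumed here rather than taken from these slides.

```python
def parallel_metrics(ts: float, tp: float, p: int) -> dict:
    """Return overhead, speedup, and efficiency for one measured run."""
    total_time = p * tp          # time collectively spent by all processing elements
    overhead = total_time - ts   # To = p*Tp - Ts
    speedup = ts / tp            # assumed standard definition: S = Ts / Tp
    efficiency = speedup / p     # assumed standard definition: E = S / p
    return {"overhead": overhead, "speedup": speedup, "efficiency": efficiency}


# Hypothetical timings: 100 s serially, 30 s on 4 processing elements.
print(parallel_metrics(ts=100.0, tp=30.0, p=4))
```

With these made-up numbers the four processing elements spend 120 s in total, so 20 s of that is overhead.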
Effect of Granularity on Performance
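❖ Scaling down: using fewer processing elements than
the maximum possible.
❖ Naïve way to scale down from n to p processing elements:
➢ Treat the original n processing elements as virtual and
assign the work of n/p of them to every physical
processing element.
■ Computation at each processing element increases
by a factor of n/p.
■ Communication cost grows by a factor of at most n/p.
❖ If a parallel system with n processing elements is cost
optimal, then it is still cost optimal when scaled down to
p < n processing elements in this way.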
Scalability of Parallel Systems
❖ Scalability: the ability to use increasing processing
power efficiently.
❖ A scalable system can keep its efficiency constant when
the number of processors and the size of the problem are
increased together.
❖ In many cases To = f(Ts, p) grows sub-linearly with Ts,
so p and Ts can be increased together in a way that keeps
the efficiency E constant.
❖ Scalability measures the ability to increase speedup in
proportion to the number of processors p.
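A small numerical sketch can make the constant-efficiency idea concrete. It assumes a cost model that is not in these slides: adding n numbers on p processors, with Ts = n and Tp = n/p + 2 log2 p (a common textbook example). The function name `efficiency` and the chosen values of n and p are illustrative only.

```python
import math


def efficiency(n: int, p: int) -> float:
    """Efficiency E = Ts / (p * Tp) under the assumed cost model."""
    ts = n                           # assumed serial time: n unit-cost additions
    tp = n / p + 2 * math.log2(p)    # assumed parallel time
    return ts / (p * tp)


for p in (4, 16, 64):
    fixed_n = efficiency(512, p)                          # problem size held fixed
    scaled_n = efficiency(int(8 * p * math.log2(p)), p)   # problem size grown with p
    print(p, round(fixed_n, 3), round(scaled_n, 3))
```

Under this model the efficiency drops as p grows when n is fixed, but stays at 0.8 when n is grown as 8 p log2 p.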
Minimum execution time and minimum cost
❖ We can determine the minimum parallel runtime Tp^min
for a given problem size W by differentiating the
expression for Tp with respect to p and equating the
derivative to zero.
❖ If p0 is the value of p determined by this equation,
Tp(p0) is the minimum parallel runtime.
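As a hedged illustration of this procedure, the sketch below assumes the runtime model Tp = n/p + 2 log2 p (an example model, not one given in these slides). Setting dTp/dp = -n/p^2 + 2/(p ln 2) to zero gives p0 = n ln 2 / 2, and a brute-force search over integer p is used to cross-check it. The function names are hypothetical.

```python
import math


def parallel_time(n: int, p: int) -> float:
    """Assumed example model: Tp = n/p + 2*log2(p)."""
    return n / p + 2 * math.log2(p)


def best_processor_count(n: int) -> int:
    """Integer p in [1, n] that minimises the assumed Tp, found by exhaustive search."""
    return min(range(1, n + 1), key=lambda p: parallel_time(n, p))


n = 1024
p0_analytic = n * math.log(2) / 2    # root of dTp/dp = -n/p**2 + 2/(p*ln 2)
p0_search = best_processor_count(n)
print(p0_analytic, p0_search, parallel_time(n, p0_search))
```

The searched integer optimum lands next to the analytic p0, and evaluating Tp there gives the minimum parallel runtime for this model.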
Dense Matrix Algorithms: Matrix-Vector
Multiplication
❖ We aim to multiply a dense n × n matrix A with an n × 1
vector x to yield the n × 1 result vector y.
❖ The serial algorithm requires n² multiplications and
additions, so W = n².
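The serial algorithm can be sketched as a double loop. The function name `mat_vec` and the operation counter are illustrative additions, included only to show that the work is n² multiply-add steps.

```python
def mat_vec(a, x):
    """Multiply an n x n matrix (list of rows) by a length-n vector."""
    n = len(x)
    y = [0.0] * n
    ops = 0                          # counts multiply-add steps
    for i in range(n):
        for j in range(n):
            y[i] += a[i][j] * x[j]
            ops += 1
    return y, ops


a = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]
print(mat_vec(a, x))
```

For the 2 × 2 example the counter ends at 4, matching W = n².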
Matrix Vector Multiplication
❖ There are two partitioning schemes:
➢ Rowwise 1-D Partitioning
■ The n × n matrix is partitioned among n
processors, with each processor storing one complete
row of the matrix.
■ The n × 1 vector x is distributed such that each
processor owns one of its elements.
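The rowwise scheme can be simulated sequentially in a few lines. This is a sketch under an assumption not spelled out in the slides: since every row needs the whole vector, the processes first perform an all-to-all broadcast of their x elements, which is the usual textbook approach. The function name `rowwise_mat_vec` is hypothetical, and ordinary lists stand in for per-process memory.

```python
def rowwise_mat_vec(a, x):
    """Simulate rowwise 1-D partitioned matrix-vector multiplication."""
    n = len(x)
    # Process i initially owns row a[i] and the single element x[i].
    local_rows = [a[i] for i in range(n)]
    local_x = [x[i] for i in range(n)]
    # Assumed all-to-all broadcast: every process gathers the full vector.
    full_x = [list(local_x) for _ in range(n)]
    # Each process computes one element of y from its own row.
    return [
        sum(r * v for r, v in zip(local_rows[i], full_x[i]))
        for i in range(n)
    ]


print(rowwise_mat_vec([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
```

In a real message-passing program the gather step would be a collective communication call; here it is just a copy.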
Matrix Vector Multiplication
❖ 2-D partitioning
■ The n × n matrix is partitioned among n²
processors such that each processor owns a single
element.
■ The n × 1 vector x is distributed only in the last
column of n processors.
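The slides give only the data layout for the 2-D case. One common way to complete the algorithm, assumed here and not stated in the slides, is to move each x element from the last column to the diagonal, broadcast it down its column, multiply locally, and reduce along each row. The sketch below simulates those steps with dictionaries keyed by grid position; `grid_mat_vec` is a hypothetical name.

```python
def grid_mat_vec(a, x):
    """Simulate matrix-vector multiplication on an n x n process grid."""
    n = len(x)
    # Process (i, j) owns a[i][j]; x[i] starts at process (i, n-1).
    held = {(i, n - 1): x[i] for i in range(n)}
    # Step 1 (alignment): (i, n-1) sends x[i] to the diagonal process (i, i).
    diag = {(i, i): held[(i, n - 1)] for i in range(n)}
    # Step 2: each diagonal process broadcasts down its column, so (i, j) gets x[j].
    x_at = {(i, j): diag[(j, j)] for i in range(n) for j in range(n)}
    # Step 3: one multiplication per process.
    prod = {(i, j): a[i][j] * x_at[(i, j)] for i in range(n) for j in range(n)}
    # Step 4: all-to-one reduction along each row gives y[i].
    return [sum(prod[(i, j)] for j in range(n)) for i in range(n)]


print(grid_mat_vec([[1.0, 2.0], [3.0, 4.0]], [1.0, 2.0]))
```

Each simulated process performs a single multiplication; the remaining cost is communication, which is why the choice of partitioning matters for the analysis.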
Matrix-Matrix Multiplication
❖ We consider the problem of multiplying two n × n dense,
square matrices A and B to yield the product matrix C = A × B.
❖ The serial complexity is O(n³).
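The cubic serial cost comes from the conventional triple loop, sketched below. The function name `mat_mul` and the operation counter are illustrative additions.

```python
def mat_mul(a, b):
    """Multiply two n x n matrices given as lists of rows."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    ops = 0                          # counts multiply-add steps
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
                ops += 1
    return c, ops


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(mat_mul(a, b))
```

For n = 2 the counter ends at 8, matching n³ multiply-add steps.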
