Parallel Programming
Potential Benefits, Limits and Costs of Parallel Programming
Amdahl's Law
• Amdahl's Law states that the potential program speedup is defined by the fraction of code (P) that can be parallelized:

  speedup = 1 / (1 − P)
[Figure: Speedup when introducing more processors]

[Figure: Amdahl's law]
Amdahl's Law
• If none of the code can be parallelized, P = 0 and the speedup = 1 (no speedup).
• If all of the code is parallelized, P = 1 and the speedup is infinite (in theory).
• If 50% of the code can be parallelized, the maximum speedup = 2, meaning the code will run twice as fast.
• Introducing the number of processors performing the parallel fraction of work, the relationship can be modeled by:

  speedup = 1 / (P/N + S)

• where P = parallel fraction, N = number of processors and S = serial fraction. A short numeric check of this formula follows below.
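To make the formula concrete, here is a small Python sketch (not part of the original slides) that evaluates Amdahl's law for a few illustrative values of P and N; the values are chosen only to reproduce the cases in the bullets above, and the very large N approximates N → infinity.

def amdahl_speedup(p, n):
    """Speedup predicted by Amdahl's law for a parallel fraction p
    (serial fraction s = 1 - p) running on n processors."""
    s = 1.0 - p
    return 1.0 / (s + p / n)

if __name__ == "__main__":
    for p in (0.0, 0.5, 0.95, 1.0):
        for n in (2, 8, 64, 1_000_000):
            print(f"P = {p:.2f}, N = {n:>9}: speedup = {amdahl_speedup(p, n):8.2f}")

With P = 0 the speedup stays at 1 regardless of N, with P = 0.5 it never exceeds 2, and with P = 1 it simply equals N, growing without bound as processors are added.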
Amdahl's Law
• It soon becomes obvious that there are limits to the scalability of parallelism.
• For example:
  – "Famous" quote: you can spend a lifetime getting 95% of your code to be parallel, and never achieve better than a 20x speedup no matter how many processors you use!
  – However, certain problems demonstrate increased performance by increasing the problem size.
Complexity
• In general, parallel applications are more complex than corresponding serial applications.
• Not only do you have multiple instruction streams executing at the same time, but you also have data flowing between them.
• The costs of complexity are measured in programmer time in virtually every aspect of the software development cycle:
  – Design
  – Coding
  – Debugging
  – Tuning
  – Maintenance
• Adhering to "good" software development practices is essential when developing parallel applications.
Portability
• Thanks to standardization in several APIs, such as MPI, OpenMP and POSIX threads, portability issues with parallel programs are not as serious as in years past.
• All the usual portability issues associated with serial programs also apply to parallel programs.
• Even though standards exist for several APIs, implementations differ in a number of details, sometimes to the point of requiring code modifications in order to achieve portability.
• Operating systems can play a key role in code portability issues.
• Hardware architectures are characteristically highly variable and can affect portability.
Resource Requirements
• The primary intent of parallel programming is to decrease execution wall-clock time; however, in order to accomplish this, more CPU time is required.
• For example, a parallel code that runs in 1 hour on 8 processors actually uses 8 hours of CPU time.
• The amount of memory required can be greater for parallel codes than for serial codes, due to the need to replicate data and the overheads associated with parallel support libraries and subsystems.
• For short-running parallel programs, there can be a decrease in performance compared to a similar serial implementation.
• The overhead costs associated with setting up the parallel environment, task creation, communications and task termination can comprise a significant portion of the total execution time for short runs, as the toy model below illustrates.
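The short-run penalty can be illustrated with a deliberately simple, hypothetical cost model: a fixed parallel overhead (setup, task creation, communication, termination) plus the useful work divided evenly across processors. The overhead and work figures below are made-up numbers for illustration, not measurements.

def parallel_runtime(work_seconds, n_procs, overhead_seconds):
    # Toy model: fixed parallel overhead plus evenly divided work.
    return overhead_seconds + work_seconds / n_procs

if __name__ == "__main__":
    overhead = 2.0                      # hypothetical seconds of setup/communication/teardown
    for work in (1.0, 1000.0):          # a very short job vs. a long job
        serial = work                   # the serial version pays no parallel overhead
        parallel = parallel_runtime(work, 8, overhead)
        print(f"work = {work:7.1f}s  serial = {serial:7.1f}s  parallel on 8 procs = {parallel:7.2f}s")

With these assumed numbers the one-second job actually runs slower in parallel (the overhead dominates), while the long job benefits substantially, which is the point of the last bullet above.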
Scalability
• High-performance computing (HPC) clusters are able to solve big problems using a large number of processors.
• This is also known as parallel computing, where many processors work simultaneously to produce exceptional computational power and to significantly reduce the total computational time.
• In such scenarios, scalability (or scaling) is widely used to indicate the ability of hardware and software to deliver greater computational power when the amount of resources is increased.
Scalability
• For HPC clusters, it is important that they are scalable; in other words, the capacity of the whole system can be proportionally increased by adding more hardware.
• For software, scalability is sometimes referred to as parallelization efficiency: the ratio between the actual speedup and the ideal speedup obtained when using a certain number of processors.
Software scalability
• The speedup in parallel computing can be straightforwardly defined as

  speedup = t1 / tN,

  where t1 is the computational time for running the software using one processor, and tN is the computational time running the same software with N processors (see the sketch below).
• Ideally, we would like software to have a linear speedup equal to the number of processors (speedup = N), as that would mean that every processor contributes 100% of its computational power.
• Unfortunately, this is a very challenging goal for real applications to attain.
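As a minimal sketch, assuming some hypothetical measured run times, speedup and parallelization efficiency would be computed like this:

def speedup(t1, tn):
    """speedup = t1 / tN, with t1 the single-processor time and tN the N-processor time."""
    return t1 / tn

def efficiency(t1, tn, n):
    """Parallelization efficiency: actual speedup divided by the ideal speedup N."""
    return speedup(t1, tn) / n

if __name__ == "__main__":
    t1 = 1200.0                                 # hypothetical single-processor time, seconds
    timings = {2: 630.0, 4: 340.0, 8: 200.0}    # hypothetical N-processor times, seconds
    for n, tn in timings.items():
        print(f"N = {n}: speedup = {speedup(t1, tn):4.2f}, efficiency = {efficiency(t1, tn, n):4.0%}")

With these assumed timings the efficiency drops from about 95% on 2 processors to about 75% on 8, showing how real applications fall short of linear speedup.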
Software scalability
• In 1967, Amdahl pointed out that the speedup is limited by the fraction of the serial part of the software that is not amenable to parallelization.
• Amdahl’s law can be formulated as follows:

  speedup = 1 / (s + p / N)

  where s is the serial fraction, p = 1 − s is the parallel fraction and N is the number of processors.
• Amdahl’s law states that, for a fixed problem, the upper limit of speedup is determined by the serial fraction of the code.
• This is called strong scaling and can be explained by the following example.
Strong scalability
• Consider a program that takes 20 hours to run using a single processor core.
• If a particular part of the program, which takes one hour to execute, cannot be parallelized (s = 1/20 = 0.05), and if the code that takes up the remaining 19 hours of execution time can be parallelized (p = 1 − s = 0.95), then speedup = 1 / (0.05 + 0.95 / N).
• Regardless of how many processors are devoted to a parallelized execution of this program, the minimum execution time cannot be less than that critical one hour.
• Hence, the theoretical speedup is limited to at most 20 times (when N = ∞, speedup = 1/s = 20). As such, the parallelization efficiency decreases as the amount of resources increases.
• For this reason, parallel computing with many processors is useful only for highly parallelized programs; the short check below reproduces these numbers.
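A short Python check of this worked example (20-hour job, s = 0.05): the runtime never drops below the one-hour serial part, the speedup approaches 1/s = 20, and the efficiency falls as processors are added.

S = 0.05                # serial fraction: 1 hour out of 20
P = 1.0 - S             # parallel fraction: the remaining 19 hours
TOTAL_HOURS = 20.0

for n in (1, 10, 100, 1000, 1_000_000):
    runtime = TOTAL_HOURS * (S + P / n)   # in hours; bounded below by 1 hour
    speedup = 1.0 / (S + P / n)           # approaches 1/S = 20 as n grows
    efficiency = speedup / n
    print(f"N = {n:>9}: runtime = {runtime:6.2f} h, speedup = {speedup:5.2f}, efficiency = {efficiency:8.4%}")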
Strong scalability
Amdahl’s law gives the upper limit of speedup for a problem of fixed size. This seems to be a bottleneck for parallel computing: if one would like to gain a 500-times speedup on 1000 processors, Amdahl’s law requires that the proportion of the serial part cannot exceed 0.1% (derived in the sketch below).
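The 0.1% figure follows from solving Amdahl's law 1 / (s + (1 − s)/N) = speedup for s; a quick sketch:

def max_serial_fraction(target_speedup, n_procs):
    # Solve 1 / (s + (1 - s) / N) = target_speedup for s,
    # giving s = (N / target_speedup - 1) / (N - 1).
    return (n_procs / target_speedup - 1.0) / (n_procs - 1.0)

if __name__ == "__main__":
    s = max_serial_fraction(target_speedup=500, n_procs=1000)
    print(f"maximum serial fraction = {s:.4%}")   # about 0.1%, as stated above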
Scalability
• The ability of a parallel program's performance to scale is a result of a number of interrelated factors. Simply adding more processors is rarely the answer.
• The algorithm may have inherent limits to scalability. At some point, adding more resources causes performance to decrease. This is a common situation with many parallel applications.
• Hardware factors play a significant role in scalability. Examples:
  – Memory-CPU bus bandwidth on an SMP machine
  – Communications network bandwidth
  – Amount of memory available on any given machine or set of machines
  – Processor clock speed
• Parallel support libraries and subsystems software can also limit scalability.