CS-416 Parallel and Distributed Systems
Jawwad Shamsi
Lecture #2, 19th January 2010
How Much Parallelism?
  • Decomposition: the process of partitioning a computer program into independent pieces that can be run simultaneously (in parallel).
  • Data parallelism
  • Task parallelism
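As a rough illustration of task parallelism, the sketch below decomposes a small job into two different, independent tasks that run concurrently on the same input. The function names (total, largest, run_tasks) are hypothetical and chosen only for this example. In standard CPython, threads show the structure of the decomposition rather than a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def total(values):
    """First task: add up all values."""
    return sum(values)


def largest(values):
    """Second task: find the maximum value."""
    return max(values)


def run_tasks(values):
    """Task parallelism: two *different* functions run as independent tasks."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        total_future = pool.submit(total, values)
        largest_future = pool.submit(largest, values)
        # Neither task needs the other's result, so they may overlap in time.
        return total_future.result(), largest_future.result()
```

Here the decomposition is by function: each worker does a different job, in contrast to data parallelism, where every worker runs the same code on its own part of the data.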
Scalar Instruction
  • Single instruction at a time
  • Example: Load R1 @ 1000
Vector Instruction
  • Vector operations
  • Example: C(1:100) = A(1:100) + B(1:100)
Supercomputer
  • Vector instructions with pipelined floating-point arithmetic
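The contrast between scalar and vector instructions can be sketched as a toy model that counts how many add instructions would be issued for C(1:100) = A(1:100) + B(1:100). This is only an assumption-laden illustration: the function names and the vector_length parameter are hypothetical, and real vector hardware is not modelled.

```python
def scalar_add(a, b):
    """Scalar style: one add instruction per element."""
    c = [0] * len(a)
    issued = 0
    for i in range(len(a)):
        c[i] = a[i] + b[i]
        issued += 1  # one scalar add per loop iteration
    return c, issued


def vector_add(a, b, vector_length=100):
    """Vector style: one instruction covers up to vector_length elements.

    vector_length is an assumed hardware vector size for this sketch.
    """
    c = []
    issued = 0
    for start in range(0, len(a), vector_length):
        chunk_a = a[start:start + vector_length]
        chunk_b = b[start:start + vector_length]
        c.extend(x + y for x, y in zip(chunk_a, chunk_b))
        issued += 1  # one vector add per chunk
    return c, issued
```

For two 100-element arrays, the scalar version counts 100 add instructions while the vector version counts one, with identical results.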
Pipelining
Pipelining overlaps the stages of instruction execution to improve performance. At a high level of abstraction, one instruction can be executed while the next is being decoded and the one after that is being fetched. This is akin to an assembly line for manufacturing cars.
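The benefit of overlapping stages can be estimated with a small calculation. The sketch below assumes an ideal pipeline: every stage takes exactly one cycle and there are no stalls or flushes. The function names are hypothetical.

```python
def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages):
    """Ideal pipeline: the first instruction takes n_stages cycles to complete,
    then one further instruction completes on every following cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def speedup(n_instructions, n_stages):
    """Ratio of unpipelined to pipelined cycle counts."""
    return (cycles_unpipelined(n_instructions, n_stages)
            / cycles_pipelined(n_instructions, n_stages))
```

With the three stages named above (fetch, decode, execute) and 100 instructions, this gives 300 cycles without pipelining and 102 with it. As the instruction count grows, the speedup approaches the number of stages.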
Pipeline Limitations
Pipelining, however, has several limitations. The speed of a pipeline is ultimately limited by its slowest stage. For this reason, conventional processors rely on very deep pipelines, which divide the work into many short stages (20-stage pipelines in state-of-the-art Pentium processors). However, in typical program traces, roughly every fifth or sixth instruction is a conditional jump, so very accurate branch prediction is required. The penalty of a misprediction grows with the depth of the pipeline, since a larger number of instructions have to be flushed.
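The cost of mispredictions can be approximated with a simple cycles-per-instruction (CPI) model. This is a rough sketch, not a model of any real processor: it assumes a base CPI of 1, and that every misprediction costs a fixed number of flush cycles, here taken to be about the pipeline depth. The misprediction rate in the example is a made-up figure.

```python
def effective_cpi(branch_fraction, mispredict_rate, flush_penalty_cycles,
                  base_cpi=1.0):
    """Average cycles per instruction once misprediction flushes are included.

    branch_fraction      -- fraction of instructions that are conditional jumps
    mispredict_rate      -- fraction of those jumps that are mispredicted
    flush_penalty_cycles -- cycles lost per misprediction (assumed here to be
                            roughly the pipeline depth)
    """
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# One conditional jump in every five instructions, 10% mispredicted (assumed).
shallow = effective_cpi(1 / 5, 0.10, flush_penalty_cycles=5)
deep = effective_cpi(1 / 5, 0.10, flush_penalty_cycles=20)
```

Under these assumptions the 20-stage case comes out at roughly 1.4 cycles per instruction against roughly 1.1 for the 5-stage case, which shows why deeper pipelines demand better branch prediction.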
Pipelining and Superscalar Execution
One simple way of alleviating these bottlenecks is to use multiple pipelines. The question then becomes one of selecting which instructions to issue together.
Superscalar Execution
  • Issue multiple instructions during the same CPU cycle
  • How? By simultaneously dispatching multiple instructions to redundant functional units
  • Each unit is not a separate CPU core but an execution resource within a single CPU, e.g. an ALU
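A toy scheduler can show how instructions might be selected for issue in the same cycle. This is a minimal sketch under stated assumptions: issue is in order, every instruction takes one cycle, and an instruction cannot join a cycle if it reads or writes a register written by an instruction already issued in that cycle. The instruction format (dest, src1, src2) and the function name are hypothetical.

```python
def schedule_in_order(instructions, issue_width=2):
    """Group instructions into cycles for an in-order, multi-issue machine.

    Each instruction is a tuple (dest, src1, src2) of register names.
    Returns a list of cycles, each a list of instructions issued together.
    """
    cycles = []
    i = 0
    while i < len(instructions):
        group = [instructions[i]]
        written = {instructions[i][0]}
        i += 1
        while i < len(instructions) and len(group) < issue_width:
            dest, *sources = instructions[i]
            if dest in written or any(s in written for s in sources):
                break  # depends on this cycle's group: wait for the next cycle
            group.append(instructions[i])
            written.add(dest)
            i += 1
        cycles.append(group)
    return cycles


program = [
    ("r1", "a", "b"),    # independent of the next instruction
    ("r2", "c", "d"),
    ("r3", "r1", "r2"),  # needs r1 and r2
    ("r4", "r3", "e"),   # needs r3
]
```

With issue_width=2, the first two instructions share a cycle and the dependent ones issue alone, so the program takes three cycles instead of four. The dependence chain, not the number of execution units, limits the gain.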
Difference Between Vector Instruction and Pipelining
Parallel Processing
  • Computer speed / output is increased through parallel processing
  • Requires more hardware, which increases cost
  • With technological advancements, the increase in cost is minimal
Types of Parallelism - explained
Data Parallelism
  • Same code segment runs concurrently on each processor
  • Each processor is assigned its own part of the data to work on
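The two points above can be sketched as follows: the data is split into chunks, and the same function runs on every chunk concurrently. The names split, sum_of_squares, and parallel_sum_of_squares are hypothetical, and worker threads stand in for processors purely to show the shape of the decomposition.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_parts):
    """Split data into n_parts contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), n_parts)
    chunks, start = [], 0
    for part in range(n_parts):
        end = start + size + (1 if part < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The same code segment that every worker runs on its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, n_workers=4):
    """Data parallelism: one function, many chunks, partial results combined."""
    chunks = split(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

Calling parallel_sum_of_squares(list(range(10)), 4) splits the ten values into chunks of sizes 3, 3, 2, and 2, computes a partial sum on each, and adds the partial sums together.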
