Superscalar and VLIW
Architectures
Dr. Amit Kumar, Dept of CSE, JUET,
Guna
Outline
• Types of architectures
• Superscalar
• Differences between CISC, RISC and VLIW
• VLIW
Parallel processing
Processing instructions in parallel requires
three major tasks:
1. checking dependencies between
instructions to determine which
instructions can be grouped together for
parallel execution;
2. assigning instructions to the functional
units on the hardware;
3. determining when instructions are initiated
(in VLIW, which operations are placed together
into a single word).
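The first task, dependency checking, can be sketched as follows. This is a minimal illustration, assuming a hypothetical three-address instruction format; the names `Instr`, `hazards` and `group_independent` are invented for this example and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical three-address instruction: one destination, some sources."""
    name: str
    dest: str
    srcs: tuple


def hazards(first: Instr, second: Instr) -> set:
    """Return the register hazards that stop `second` issuing alongside `first`."""
    found = set()
    if first.dest in second.srcs:
        found.add("RAW")  # second reads what first writes (true dependence)
    if second.dest in first.srcs:
        found.add("WAR")  # second overwrites a value first still has to read
    if first.dest == second.dest:
        found.add("WAW")  # both write the same register
    return found


def group_independent(program):
    """Greedily pack consecutive instructions into hazard-free groups."""
    groups, current = [], []
    for ins in program:
        if any(hazards(prev, ins) for prev in current):
            groups.append(current)  # close the group at the first conflict
            current = []
        current.append(ins)
    if current:
        groups.append(current)
    return groups


program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("mul", "r4", ("r5", "r6")),  # independent of add
    Instr("sub", "r7", ("r1", "r4")),  # needs the results of add and mul
]
groups = group_independent(program)
print([[i.name for i in g] for g in groups])  # [['add', 'mul'], ['sub']]
```

In a superscalar processor this check is done by hardware at run time; in a VLIW machine the compiler does it ahead of time.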
Major categories
From Mark Smotherman, “Understanding EPIC Architectures and Implementations”
VLIW – Very Long Instruction Word
EPIC – Explicitly Parallel Instruction Computing
Superscalar Processors
• Superscalar processors are designed to
exploit more instruction-level parallelism in
user programs.
• Only independent instructions can be
executed in parallel without causing a wait
state.
• The amount of instruction-level parallelism
varies widely depending on the type of code
being executed.
Pipelining in Superscalar Processors
• In order to fully utilise a superscalar
processor of degree m, m instructions must
be executable in parallel. This situation may
not be true in all clock cycles. In that case,
some of the pipelines may be stalling in a
wait state.
• In a superscalar processor, the simple
operation latency should require only one
cycle, as in the base scalar processor.
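To illustrate why a degree-m machine is not always fully used, here is a small sketch of in-order issue. It assumes every operation has a one-cycle latency and that only writes made earlier in the same cycle block an instruction; the function `issue_cycles` and the sample program are hypothetical.

```python
def issue_cycles(program, m):
    """Simulate in-order issue of up to m instructions per cycle.

    program: list of (dest, srcs) pairs in program order.
    Returns one list of instruction indices per clock cycle.
    """
    cycles = []
    i = 0
    while i < len(program):
        issued = []
        written = set()  # registers written by instructions issued this cycle
        while i < len(program) and len(issued) < m:
            dest, srcs = program[i]
            if written & set(srcs) or dest in written:
                break  # in-order issue: this and all later instructions wait
            issued.append(i)
            written.add(dest)
            i += 1
        cycles.append(issued)
    return cycles


program = [
    ("r1", ("r2", "r3")),
    ("r4", ("r5", "r6")),
    ("r7", ("r1", "r4")),  # depends on instructions 0 and 1
    ("r8", ("r7", "r2")),  # depends on instruction 2
    ("r9", ("r5", "r6")),
]
m = 2
cycles = issue_cycles(program, m)
used = sum(len(c) for c in cycles)
print(cycles)                    # [[0, 1], [2], [3, 4]]
print(used / (len(cycles) * m))  # fraction of issue slots actually used
```

With m = 2 the second cycle issues only one instruction, so one pipeline sits in a wait state for that cycle.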
Superscalar Execution
Superscalar Implementation
• Simultaneously fetch multiple instructions
• Logic to determine true dependencies
involving register values
• Mechanisms to communicate these values
• Mechanisms to initiate multiple instructions
in parallel
• Resources for parallel execution of multiple
instructions
• Mechanisms for committing process state in
correct order
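The last point, committing state in the correct order, is commonly handled with a reorder buffer. The sketch below is a simplified model under that assumption, not the mechanism of any particular processor; the class `ReorderBuffer` and its method names are invented for illustration.

```python
from collections import deque


class ReorderBuffer:
    """Toy reorder buffer: results may arrive out of order, commits do not."""

    def __init__(self):
        self.entries = deque()  # kept in program order
        self.regs = {}          # committed (architectural) register state

    def dispatch(self, tag, dest):
        """Allocate an entry when an instruction enters the machine."""
        self.entries.append({"tag": tag, "dest": dest, "value": None, "done": False})

    def complete(self, tag, value):
        """Record a result; this can happen in any order."""
        for entry in self.entries:
            if entry["tag"] == tag:
                entry["value"], entry["done"] = value, True
                return
        raise KeyError(tag)

    def commit(self):
        """Retire finished instructions from the head only, in program order."""
        committed = []
        while self.entries and self.entries[0]["done"]:
            entry = self.entries.popleft()
            self.regs[entry["dest"]] = entry["value"]
            committed.append(entry["tag"])
        return committed


rob = ReorderBuffer()
for tag, dest in [("i0", "r1"), ("i1", "r2"), ("i2", "r3")]:
    rob.dispatch(tag, dest)

rob.complete("i2", 30)
first = rob.commit()    # [] because i0 and i1 are still pending
rob.complete("i0", 10)
second = rob.commit()   # ['i0']
rob.complete("i1", 20)
third = rob.commit()    # ['i1', 'i2']
print(first, second, third)
```

Because only the head entry may retire, the register state seen after any commit always matches some point in sequential program order.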
Some Architectures
• PowerPC 604
– six independent execution units:
• Branch execution unit
• Load/Store unit
• 3 Integer units
• Floating-point unit
– in-order issue
– register renaming
• PowerPC 620
– adds out-of-order issue to the features of the 604
• Pentium
– three independent execution units:
• 2 Integer units
• Floating point unit
– in-order issue
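Register renaming, listed above for the PowerPC 604, removes false (WAR and WAW) dependencies by giving every write a fresh register. The following is a generic sketch of the idea, not a description of the 604's actual hardware; the function `rename` and the physical register names `p0`, `p1`, ... are assumptions made for this example.

```python
from itertools import count


def rename(program):
    """Map each written architectural register to a fresh physical register.

    program: list of (dest, srcs) pairs in program order.
    """
    fresh = (f"p{n}" for n in count())
    mapping = {}  # architectural register -> latest physical register
    renamed = []
    for dest, srcs in program:
        # Sources are looked up first so they refer to the previous writer.
        new_srcs = tuple(mapping.get(s, s) for s in srcs)
        mapping[dest] = next(fresh)
        renamed.append((mapping[dest], new_srcs))
    return renamed


program = [
    ("r1", ("r2", "r3")),
    ("r4", ("r1",)),       # true dependence on the first write of r1
    ("r1", ("r5", "r6")),  # reuses r1: WAW with instr 0, WAR with instr 1
    ("r7", ("r1",)),
]
renamed = rename(program)
for line in renamed:
    print(line)
```

After renaming, the third instruction writes `p2` instead of reusing the register written by the first, so it no longer has to wait for the first two.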
The VLIW Architecture
• A typical VLIW (very long instruction
word) machine has instruction words
hundreds of bits in length.
• Multiple functional units are used
concurrently in a VLIW processor.
• All functional units share the use of a
common large register file.
Comparison: CISC, RISC, VLIW
Advantages of VLIW
Compiler prepares fixed packets of multiple
operations that give the full "plan of execution"
– dependencies are determined by compiler and
used to schedule according to function unit
latencies
– function units are assigned by compiler and
correspond to the position within the
instruction packet ("slotting")
– compiler produces fully-scheduled, hazard-free
code => hardware doesn't have to
"rediscover" dependencies or schedule
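The compiler's "plan of execution" can be sketched with a very small static scheduler. It assumes a hypothetical machine with two ALU slots and one memory slot, made-up latencies, and a program in which each register is written only once (so only true dependencies matter); none of these details come from the slides.

```python
def schedule(ops, slots, latency):
    """Pack operations into fixed-width bundles, one entry per slot.

    ops: list of (name, unit, dest, srcs) in program order.
    slots: tuple of unit types, one per slot position ("slotting").
    latency: cycles each unit type needs before its result can be read.
    """
    ready_at = {}  # register -> first cycle its value may be read
    bundles = []
    for name, unit, dest, srcs in ops:
        if unit not in slots:
            raise ValueError(f"no slot for unit {unit!r}")
        # Earliest cycle allowed by the producers' latencies.
        cycle = max((ready_at.get(s, 0) for s in srcs), default=0)
        while True:
            while len(bundles) <= cycle:
                bundles.append(["nop"] * len(slots))
            free = [k for k, u in enumerate(slots)
                    if u == unit and bundles[cycle][k] == "nop"]
            if free:
                bundles[cycle][free[0]] = name
                break
            cycle += 1  # slot busy: try the next bundle
        ready_at[dest] = cycle + latency[unit]
    return bundles


slots = ("alu", "alu", "mem")
latency = {"alu": 1, "mem": 2}
ops = [
    ("ld_a", "mem", "r1", ("r10",)),
    ("ld_b", "mem", "r2", ("r11",)),
    ("add", "alu", "r3", ("r1", "r2")),
    ("mul", "alu", "r4", ("r3", "r3")),
    ("inc", "alu", "r5", ("r10",)),
]
bundles = schedule(ops, slots, latency)
for b in bundles:
    print(b)
```

The position of each operation inside a bundle selects its function unit, and the empty bundle in cycle 2 is the compiler covering the load latency with nops rather than relying on hardware interlocks.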
Disadvantages of VLIW
Compatibility across implementations is a major
problem
– VLIW code won't run properly with
different number of function units or
different latencies
– unscheduled events (e.g., cache miss) stall
entire processor
Code density is another problem
– low slot utilization (mostly nops)
– reduce nops by compression ("flexible
VLIW", "variable-length VLIW")
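The code density problem can be made concrete by measuring slot utilization and dropping the nops. The encoding below is purely illustrative and is not the format of any real "flexible" or "variable-length" VLIW design; the functions `utilization`, `compress` and `decompress` and the sample bundles are assumptions for this example.

```python
def utilization(bundles):
    """Fraction of slots that hold a real operation rather than a nop."""
    total = sum(len(b) for b in bundles)
    used = sum(op != "nop" for b in bundles for op in b)
    return used / total


def compress(bundles):
    """Keep only (slot index, operation) pairs; nops are implied."""
    return [[(k, op) for k, op in enumerate(b) if op != "nop"] for b in bundles]


def decompress(packed, width):
    """Rebuild fixed-width bundles, refilling the missing slots with nops."""
    out = []
    for entries in packed:
        bundle = ["nop"] * width
        for k, op in entries:
            bundle[k] = op
        out.append(bundle)
    return out


# Hypothetical 3-slot schedule with many empty slots.
bundles = [
    ["inc", "nop", "ld_a"],
    ["nop", "nop", "ld_b"],
    ["nop", "nop", "nop"],
    ["add", "nop", "nop"],
    ["mul", "nop", "nop"],
]
packed = compress(bundles)
stored_ops = sum(len(entries) for entries in packed)
print(utilization(bundles))  # one third of the slots are useful
print(stored_ops)            # operations stored after compression
```

Here 15 slots shrink to 5 stored operations, at the cost of extra decoding work to expand each bundle back to full width before issue.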