Basic Non-Pipelined CPU Architecture
Contents
• CPU Architecture Types
• Detailed data path of a typical register-based CPU
• Fetch-Decode-Execute Cycle
• Implementation of Control Unit: Hardwired Approach and Microprogrammed Approach
• Calculations of CPI and MIPS parameters
Recall
• A simple CPU consists of a set of registers, Arithmetic Logic Unit
(ALU), and Control Unit (CU).
• Operand: Information involved in any operation performed by the
CPU needs to be addressed. In computer terminology, such
information is called the operand.
• Registers: A processor register (CPU register) is one of a small set of
data holding places that are part of the computer processor.
A register may hold an instruction, a storage address, or any kind of
data (such as a bit sequence or individual characters). Some instructions
specify registers as part of the instruction.
• Accumulator: A one-address instruction takes the form ADD R1. In this
case the instruction implicitly refers to a register, called the accumulator
(Racc): the contents of the accumulator are added to the contents of
register R1, and the result is stored back into the accumulator.
A Simple Machine
• Our simple machine is an accumulator-based processor, which
has five 16-bit registers: Program Counter (PC), Instruction
Register (IR), Address Register (AR), Accumulator (AC), and
Data Register (DR). The PC contains the address of the next
instruction to be executed. The IR contains the operation code
portion of the instruction being executed. The AR contains the
address portion (if any) of the instruction being executed. The
AC serves as the implicit source and destination of data. The
DR is used to hold data. The memory unit is made up of 4096
words of storage. The word size is 16 bits.
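The register description above can be made concrete with a small sketch. It assumes, purely for illustration, that a 16-bit instruction word holds a 4-bit operation code in its upper bits and a 12-bit address in its lower bits (12 bits are enough to address 4096 words). The slides do not state this layout, and the name `split_instruction` is hypothetical.

```python
# Assumed layout (not given in the slides): 4-bit opcode | 12-bit address.
ADDRESS_BITS = 12                      # 4096 words = 2**12 addresses
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def split_instruction(word):
    """Split a 16-bit instruction word into (opcode, address)."""
    opcode = (word >> ADDRESS_BITS) & 0xF   # the part the IR would hold
    address = word & ADDRESS_MASK           # the part the AR would hold
    return opcode, address


opcode, address = split_instruction(0x1ABC)
print(hex(opcode), hex(address))
```

With this layout, the opcode field would go to the IR and the address field to the AR, matching the roles described above.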
CPU Architecture Types
• Accumulator (before 1960, e.g. 68HC11):
  1-address    add A            acc ← acc + mem[A]
• Stack (1960s to 1970s):
  0-address    add              tos ← tos + next
• Register-Memory (1970s to present, e.g. 80x86):
  2-address    add R1, A        R1 ← R1 + mem[A]
               load R1, A       R1 ← mem[A]
• Register-Register (Load/Store) (1960s to present, e.g. MIPS):
  3-address    add R1, R2, R3   R1 ← R2 + R3
               load R1, R2      R1 ← mem[R2]
               store R1, R2     mem[R1] ← R2
Code Sequence C = A + B for Four Instruction Sets

Stack     Accumulator   Register (register-memory)   Register (load-store)
Push A    Load A        Load R1, A                   Load R1, A
Push B    Add B         Add R1, B                    Load R2, B
Add       Store C       Store C, R1                  Add R3, R1, R2
Pop C                                                Store C, R3

Operand model of each register-based style:
  Accumulator (memory operand):       acc = acc + mem[C]
  Register-memory (memory operand):   R1 = R1 + mem[C]
  Load-store:                         R3 = R1 + R2
Stack Architectures
•Instruction set:
add, sub, mult, div, . . .
push A, pop A
•Example: A*B - (A+C*B)
push A
push B
mul
push A
push C
push B
mul
add
sub
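Stack contents after each instruction (top of stack on the right):
push A    A
push B    A, B
mul       A*B
push A    A*B, A
push C    A*B, A, C
push B    A*B, A, C, B
mul       A*B, A, B*C
add       A*B, A+B*C
sub       result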
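The stack example can be checked with a minimal simulator. This is only a sketch of a 0-address machine: the tuple encoding of instructions, the function name `run_stack`, and the values chosen for A, B, and C are assumptions made for illustration.

```python
def run_stack(program, memory):
    """Execute (opcode, operand) pairs on a 0-address stack machine."""
    stack = []
    ops = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mul": lambda x, y: x * y,
    }
    for opcode, operand in program:
        if opcode == "push":
            stack.append(memory[operand])
        elif opcode == "pop":
            memory[operand] = stack.pop()
        else:
            right = stack.pop()   # top of stack
            left = stack.pop()    # entry just below the top
            stack.append(ops[opcode](left, right))
    return stack


memory = {"A": 2, "B": 3, "C": 4}   # arbitrary test values
program = [
    ("push", "A"), ("push", "B"), ("mul", None),
    ("push", "A"), ("push", "C"), ("push", "B"),
    ("mul", None), ("add", None), ("sub", None),
]
result = run_stack(program, memory)[-1]
print(result)   # value of A*B - (A + C*B) for the test values
```

Note that `sub` computes (next) - (top), which is why A*B is pushed first and stays at the bottom of the stack until the final instruction.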
Accumulator Architectures
• Instruction set:
add A, sub A, mult A, div A, . . .
load A, store A
• Example: A*B - (A+C*B)
load B
mul C
add A
store D
load A
mul B
sub D
Accumulator contents after each instruction: B, B*C, A+B*C, A+B*C (stored to D), A, A*B, result
acc = acc +,-,*,/ mem[A]
Register-Memory Architectures
• Instruction set:
add R1, A sub R1, A mul R1, B
load R1, A store R1, A
• Example: A*B - (A+C*B)
load R1, C
mul R1, B /* C*B */
add R1, A /* A + C*B */
store R1, D
load R2, A
mul R2, B /* A*B */
sub R2, D /* A*B - (A + C*B) */
R1 = R1 +,-,*,/ mem[B]
Register (Load-Store) Architectures
• Instruction set:
add R1, R2, R3 sub R1, R2, R3 mul R1, R2, R3
load R1, &A store R1, &A move R1, R2
• Example: A*B - (A+C*B)
load R1, &A
load R2, &B
load R3, &C
mul R7, R3, R2 /* C*B */
add R8, R7, R1 /* A + C*B */
mul R9, R1, R2 /* A*B */
sub R10, R9, R8 /* A*B - (A+C*B) */
R3 = R1 +,-,*,/ R2
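A minimal sketch of the load-store model follows. It runs the seven-instruction sequence above and counts memory accesses, to illustrate that only `load` and `store` touch memory. The tuple encoding, the name `run_load_store`, and the test values are assumptions, not part of the slides.

```python
def run_load_store(program, memory):
    """Run a 3-address load-store program; return (registers, memory accesses)."""
    regs = {}
    accesses = 0
    alu = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mul": lambda x, y: x * y,
    }
    for instr in program:
        op = instr[0]
        if op == "load":                  # load Rd, &X
            _, rd, addr = instr
            regs[rd] = memory[addr]
            accesses += 1
        elif op == "store":               # store Rs, &X
            _, rs, addr = instr
            memory[addr] = regs[rs]
            accesses += 1
        elif op == "move":                # move Rd, Rs
            _, rd, rs = instr
            regs[rd] = regs[rs]
        else:                             # op Rd, Rs1, Rs2 (registers only)
            _, rd, rs1, rs2 = instr
            regs[rd] = alu[op](regs[rs1], regs[rs2])
    return regs, accesses


memory = {"A": 2, "B": 3, "C": 4}   # arbitrary test values
program = [
    ("load", "R1", "A"), ("load", "R2", "B"), ("load", "R3", "C"),
    ("mul", "R7", "R3", "R2"),      # C*B
    ("add", "R8", "R7", "R1"),      # A + C*B
    ("mul", "R9", "R1", "R2"),      # A*B
    ("sub", "R10", "R9", "R8"),     # A*B - (A + C*B)
]
regs, accesses = run_load_store(program, memory)
print(regs["R10"], accesses)
```

Each operand is loaded once and then reused from a register, so the whole expression needs three memory accesses in this sketch.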
Detailed data path of a typical register-based CPU
• DATAPATH:
• The CPU can be divided into a data section (the data path, which contains the registers and the ALU) and a control section (the control unit, which issues control signals to the data path).
• Internal data movement among registers and between the ALU and registers may be carried out using different organizations, including one-bus, two-bus, or three-bus organizations.
One-Bus Data path
• Using one bus, the CPU registers and the ALU use a single bus to
move outgoing and incoming data.
• Since a bus can handle only a single data movement within one clock
cycle, two-operand operations will need two cycles to fetch the
operands for the ALU.
• Additional registers may also be needed to buffer data for the ALU.
This bus organization is the simplest and least expensive, but it limits
the amount of data transfer that can be done in the same clock cycle,
which will slow down the overall performance.
• The figure shows a one-bus data path consisting of a set of general-purpose
registers, a memory address register (MAR), a memory data register (MDR),
an instruction register (IR), a program counter (PC), and an ALU.
[Figure: One-bus data path]
Two-Bus Data path
• Using two buses is a faster solution than the one-bus organization. In
this case, general-purpose registers are connected to both buses. Data
can be transferred from two different registers to the input point of the
ALU at the same time.
• Therefore, a two operand operation can fetch both operands in the same
clock cycle. An additional buffer register may be needed to hold the
output of the ALU when the two buses are busy carrying the two
operands. Figure a shows a two-bus organization.
• In some cases, one of the buses may be dedicated for moving data into
registers (in-bus), while the other is dedicated for transferring data out of
the registers (out-bus).
• In this case, the additional buffer register may be used, as one of the
ALU inputs, to hold one of the operands.
• The ALU output can be connected directly to the in-bus, which will
transfer the result into one of the registers. Figure b shows a two-bus
organization with in-bus and out-bus.
[Figure a: Example of a two-bus data path]
[Figure b: Example of a two-bus data path with in-bus and out-bus]
Three-Bus Data path
• In a three-bus organization, two buses may be used as source buses
while the third is used as destination.
• The source buses move data out of registers (out-bus), and the
destination bus may move data into a register (in-bus).
• Each of the two out-buses is connected to an ALU input point. The
output of the ALU is connected directly to the in-bus.
• As can be expected, the more buses we have, the more data we can
move within a single clock cycle.
• However, increasing the number of buses will also increase the
complexity of the hardware. The figure shows an example of a three-bus
data path.
[Figure: Three-bus data path]
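To compare the three organizations, the sketch below lists one plausible sequence of control steps for the register addition R3 ← R1 + R2 on each data path and counts the steps. The signal names (`R1out`, `Yin`, `Zin`, and so on) and the buffer registers Y and Z are hypothetical; the step counts follow from the descriptions above (operands fetched one at a time on one bus, together on two buses, and the result written directly on a third bus).

```python
# One list of control steps per organization; each inner list is one clock cycle.
# Signal and buffer names are illustrative assumptions, not taken from the figures.
STEPS = {
    "one-bus": [
        ["R1out", "Yin"],                                # operand 1 -> ALU input buffer Y
        ["R2out", "add", "Zin"],                         # operand 2 on the bus, sum -> buffer Z
        ["Zout", "R3in"],                                # result -> destination register
    ],
    "two-bus": [
        ["R1out(busA)", "R2out(busB)", "add", "Zin"],    # both operands in one cycle
        ["Zout", "R3in"],                                # buffered result written afterwards
    ],
    "three-bus": [
        ["R1out(busA)", "R2out(busB)", "add", "R3in(busC)"],  # everything in one cycle
    ],
}

cycles = {name: len(steps) for name, steps in STEPS.items()}
for name, count in cycles.items():
    print(f"{name}: {count} cycle(s)")
```

Under these assumptions the addition takes three, two, and one cycle respectively, which illustrates the trade-off between speed and hardware complexity described above.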
Fetch-Decode-Execute Cycle
Both the data and the program that acts upon that data are loaded
into main memory (RAM) by the operating system. The CPU is
now ready to do some work.
Steps of the Fetch-Decode-Execute Cycle
• Get the next instruction
• Figure out what to do
• Gather the data needed to do it
• Do it
• Save the result, and
• Repeat (billions of times/second)!
Fetch Cycle
• The Program Counter (PC) contains the address of the next
instruction to be fetched
• The address contained in the PC is copied to the Memory
Address Register (MAR).
• The instruction is copied from the memory location contained in
the MAR and placed in the Memory Buffer Register (MBR).
• The entire instruction is copied from the MBR and placed in the
Current Instruction Register (CIR)
• The PC is incremented so that it points to the next
instruction to be fetched
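The fetch steps above map directly onto a few register transfers. The sketch below models the registers as dictionary entries; the memory contents and the instruction strings are made up for illustration.

```python
def fetch(cpu, memory):
    """Perform one fetch cycle and return the fetched instruction."""
    cpu["MAR"] = cpu["PC"]             # address in PC copied to MAR
    cpu["MBR"] = memory[cpu["MAR"]]    # memory word at MAR copied to MBR
    cpu["CIR"] = cpu["MBR"]            # instruction copied from MBR to CIR
    cpu["PC"] += 1                     # PC now points to the next instruction
    return cpu["CIR"]


memory = {100: "LOAD 200", 101: "ADD 201"}   # hypothetical program
cpu = {"PC": 100, "MAR": None, "MBR": None, "CIR": None}
first = fetch(cpu, memory)
print(first, cpu["PC"])
```

Calling `fetch` again would place the instruction at address 101 in the CIR and advance the PC to 102.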
Execute Cycle
• The address part of the instruction is placed in the MAR
• The instruction is decoded and executed
• The processor checks for interrupts (signals from devices or
other sources seeking the attention of the processor) and
either branches to the relevant interrupt service routine or
starts the cycle again.
Example
1. The PC contains the address of location 100.
2. The CU fetches the instruction in location 100.
3. A copy of the instruction is placed in the IR.
4. The PC is incremented by 1, to 101.
5. The right circuits are activated to execute the instruction.
The cycle then repeats with the PC at 101:
1. The PC contains the address of location 101.
2. The CU fetches the instruction in location 101.
3. A copy of the instruction is placed in the IR.
4. The PC is incremented.
5. The right circuits are activated to execute the instruction.
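The repeated cycle in this example can be sketched as a loop for a small accumulator machine. The opcode names (`LOAD`, `ADD`, `STORE`, `HALT`), the tuple encoding, and the memory contents are assumptions for illustration only; interrupt checking is omitted.

```python
def run(memory, pc, max_steps=100):
    """Fetch, decode, and execute until HALT; return (acc, pc, fetch trace)."""
    acc = 0
    trace = []
    for _ in range(max_steps):
        opcode, address = memory[pc]   # fetch the instruction at PC
        trace.append(pc)
        pc += 1                        # increment PC before executing
        if opcode == "HALT":           # decode and execute
            break
        elif opcode == "LOAD":
            acc = memory[address]
        elif opcode == "ADD":
            acc += memory[address]
        elif opcode == "STORE":
            memory[address] = acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return acc, pc, trace


memory = {
    100: ("LOAD", 200), 101: ("ADD", 201),
    102: ("STORE", 202), 103: ("HALT", None),
    200: 7, 201: 5, 202: 0,            # data words
}
acc, pc, trace = run(memory, 100)
print(acc, memory[202], trace)
```

The trace records the addresses 100, 101, 102, 103 in order, which is the same progression of the PC that the example walks through.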
Control Unit
• The CU is the engine that runs the entire computer by means of
control signals.
• It issues the correct control signals in the correct sequence.
• It controls everything with a few control signals directed within the
processor and a few control signals sent to the system bus.
• All micro-operations are controlled by the CU, which performs
two basic tasks:
  • Sequencing: It causes the processor to step through a series
  of micro-operations in the proper sequence, based on the program being
  executed.
  • Execution: It causes each micro-operation to be performed.
Control Signal Sources
• Clock
  • It helps to synchronize the operation. It causes one micro-operation to be performed for each clock pulse
• Instruction Register
  • Op-code for current instruction
  • Determines which micro-instructions are performed
• Flags
  • State of CPU
  • Results of previous operations
• From Control Bus
  • Interrupts / Bus Requests
  • Acknowledgements
Control Signal Outputs
• Within Processor
  • Cause data movement
  • Activate specific functions
• Via Main Bus
  • To memory
  • To I/O modules
Types
• There are two design approaches for the CU:
  • Hardwired approach
  • Microprogrammed approach
Hardwired Approach
• The control signals are generated by hardware.
• It can be designed as a clocked sequential circuit.
• It is implemented with logic gates, flip-flops, decoders,
multiplexers, and other logic building blocks.
Micro programmed Approach
• All controls that can be activated simultaneously are grouped together
to form the control words.
• These words are stored in the control memory.
• The control words are fetched from the control memory and
are routed to various functional units to enable appropriate
processing hardware.
Comparison

Attribute                                Hardwired Control     Microprogrammed Control
Speed                                    Fast                  Slow
Cost of implementation                   More expensive        Cheaper
Flexibility                              Difficult to modify   Flexible
Ability to handle complex instructions   Difficult             Easier
Decoding                                 Complex               Easy
Application                              RISC                  CISC
Instruction set size                     Small                 Large
Control memory                           Absent                Present
Micro programmed Control Unit
Control Unit Function
• The sequencing logic unit issues a read command to the control memory
• The word specified in the control address register is read into the
control buffer register
• The control buffer register contents generate control signals
and next-address information
• The sequencing logic loads a new address into the control address
register, based on the next-address information from the control
buffer register and the ALU flags
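The read-and-sequence loop above can be sketched with a tiny control memory. Each control word here is a pair of a set of control signals and a next address; the signal names and the three-word fetch microprogram are hypothetical, and branching on ALU flags is left out to keep the sketch short.

```python
# Hypothetical control memory: address -> (control signals, next address).
# A next address of None ends the microprogram in this sketch.
CONTROL_MEMORY = {
    0: ({"PCout", "MARin"}, 1),
    1: ({"MemRead", "MBRin"}, 2),
    2: ({"MBRout", "IRin", "PCinc"}, None),
}


def run_microprogram(control_memory, start):
    """Step through control words, returning the signals issued per step."""
    car = start                       # control address register
    issued = []
    while car is not None:
        cbr = control_memory[car]     # word at CAR is read into the control buffer register
        signals, next_address = cbr
        issued.append(sorted(signals))
        car = next_address            # sequencing logic loads the next address
    return issued


issued = run_microprogram(CONTROL_MEMORY, 0)
for step, signals in enumerate(issued):
    print(step, signals)
```

Changing the behaviour of this control unit only requires editing `CONTROL_MEMORY`, which is the flexibility advantage usually attributed to the microprogrammed approach.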
Calculations of CPI and MIPS parameters
We denote the number of CPU clock cycles for executing a job to be the
cycle count (CC), the cycle time by CT, and the clock frequency by
f=1/CT. The time taken by the CPU to execute a job can be expressed as
CPU time = CC x CT = CC / f
It may be easier to count the number of instructions executed in a given
program as compared to counting the number of CPU clock cycles
needed for executing that program. Therefore, the average number of
clock cycles per instruction (CPI) has been used as an alternate
performance measure. The following equation shows how to compute
the CPI.
CPI = CPU clock cycles for the program/Instruction count
CPU time = Instruction count x CPI x Clock cycle time
= (Instruction count x CPI) / Clock rate
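As a quick numeric check of the last equation, the sketch below evaluates the CPU time for a hypothetical program. The instruction count, CPI, and clock rate are invented values, not figures from the slides.

```python
def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = (instruction count x CPI) / clock rate."""
    return instruction_count * cpi / clock_rate_hz


# Hypothetical job: one million instructions, CPI of 2.0, 200 MHz clock.
seconds = cpu_time(1_000_000, 2.0, 200e6)
print(f"{seconds * 1e3:.1f} ms")
```

Halving the CPI or doubling the clock rate would each halve the CPU time for the same instruction count.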
Calculations of CPI and MIPS parameters
(Contd.)
The overall CPI can be computed as

CPI = \frac{\sum_{i=1}^{n} CPI_i \times I_i}{\text{Instruction count}}

where I_i is the number of times an instruction of type i is executed in
the program and CPI_i is the average number of clock cycles needed to
execute such an instruction.
Example 1
• Consider computing the overall CPI for a machine A for which the
following performance measures were recorded when executing a
set of benchmark programs. Assume that the clock rate of the CPU is
200 MHz.

Instruction category   Percentage of occurrence   No. of cycles per instruction
ALU                    38                         1
Load & store           15                         3
Branch                 42                         4
Others                 5                          5

• Assuming the execution of 100 instructions, the overall CPI can be
computed as follows.
Answer
CPI = (38*1+15*3+42*4+5*5)/100
=2.76
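The same weighted average can be computed programmatically. The sketch below takes the instruction mix for machine A from the table in Example 1; the function name `overall_cpi` and the dictionary layout are illustrative choices.

```python
def overall_cpi(mix):
    """Weighted average CPI from {category: (count, cycles per instruction)}."""
    total_instructions = sum(count for count, _ in mix.values())
    total_cycles = sum(count * cycles for count, cycles in mix.values())
    return total_cycles / total_instructions


# Counts per 100 instructions and cycles per instruction for machine A.
machine_a = {
    "ALU": (38, 1),
    "Load & store": (15, 3),
    "Branch": (42, 4),
    "Others": (5, 5),
}
cpi_a = overall_cpi(machine_a)
print(round(cpi_a, 2))
```

Because the function divides by the total count, the mix can be given either as percentages or as raw instruction counts.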
MIPS (million instructions per second)
The rate of instruction execution per unit time:
MIPS = Instruction count / (Execution time x 10^6) = Clock rate / (CPI x 10^6)
What is the MIPS rating for the machine considered in the previous
example?
Answer
MIPS = (200 x 10^6) / (2.76 x 10^6)
= 72.46
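The MIPS rating follows from the clock rate and the CPI alone, as this small sketch shows for a 200 MHz clock and a CPI of 2.76. The function name `mips` is an illustrative choice.

```python
def mips(clock_rate_hz, cpi):
    """MIPS rating = clock rate / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


mips_a = mips(200e6, 2.76)
print(round(mips_a, 2))
```

Evaluating 200 / 2.76 gives approximately 72.46, so a lower CPI at the same clock rate yields a proportionally higher MIPS rating.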
Exercise 1
Suppose that the same set of benchmark programs considered above
was executed on another machine, call it machine B, for which the
following measures were recorded.
What is the MIPS rating for machine B, assuming a clock rate of
200 MHz?

Instruction category   Percentage of occurrence   No. of cycles per instruction
ALU                    35                         1
Load & store           30                         2
Branch                 15                         3
Others                 20                         5
Answer
CPI = (35*1 + 30*2 + 15*3 + 20*5) / 100
= 2.4
MIPS = (200 x 10^6) / (2.4 x 10^6)
= 83.33
Exercise 2
• Write the Code sequence using four types of CPU architecture for the
following,
Reference
• FUNDAMENTALS OF COMPUTER ORGANIZATION AND ARCHITECTURE
• Mostafa Abd-El-Barr, King Fahd University of Petroleum & Minerals (KFUPM)
• Hesham El-Rewini, Southern Methodist University
