MODULE 2
PART-I
BASIC PROCESSING UNIT
TOPICS COVERED
 FUNDAMENTAL CONCEPTS
 INSTRUCTION CYCLE
 EXECUTION OF A COMPLETE INSTRUCTION
 MULTIPLE BUS ORGANIZATION
 SEQUENCING OF CONTROL SIGNALS
INTRODUCTION
 This unit covers the processing unit, which executes machine
instructions and coordinates the activities of the other units.
 It is also called the processor or the instruction set processor (ISP).
 We examine its internal structure and how it performs the tasks of
fetching, decoding, and executing the instructions of a program.
 The processing unit is commonly called the central processing unit (CPU).
 We explore the organization of the hardware that enables a CPU to
perform its main function.
 We then look at how a complete instruction is executed, including
branch instructions.
SOME FUNDAMENTAL CONCEPTS
 A program, a set of instructions, to be executed by a computer is
loaded in sequential locations in the main memory.
 To execute this program, the CPU fetches one instruction at a time and
performs the functions specified.
 Until a branch or a jump instruction is executed, instructions are
fetched from successive memory locations.
 The address of the next instruction to be executed is kept by the CPU
in a dedicated register called program counter (PC).
 The contents of the PC are updated to point to the next instruction in
the sequence.
 Assume that each instruction occupies one memory word.
 Therefore, one instruction execution requires the CPU to perform the
following 3 steps:
1. Fetch the contents of the memory location pointed to by the PC into
the instruction register (IR).
Symbolically, this can be written as:
IR ← [[PC]]
2. Increment the contents of the PC by 1 (assuming a word-addressable
memory): PC ← [PC] + 1
3. Carry out the actions specified by the instruction in the IR.
 Steps 1 and 2 are called the fetch phase and step 3 is called the
execution phase.
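The fetch and execution phases above can be sketched as a small loop. This is only an illustrative model: it assumes a word-addressable memory held in a Python list, and the two-instruction encoding (`("ADDI", n)` and `("HALT",)`), the accumulator `acc`, and the function name `run` are hypothetical and do not come from the slides.

```python
def run(memory):
    """Toy fetch/execute loop over a word-addressable memory (a list)."""
    pc = 0    # program counter: address of the next instruction
    acc = 0   # hypothetical accumulator used by the toy instructions
    while True:
        ir = memory[pc]   # step 1: IR <- [[PC]]
        pc = pc + 1       # step 2: PC <- [PC] + 1
        opcode = ir[0]    # step 3: carry out the action named in IR
        if opcode == "ADDI":
            acc += ir[1]
        elif opcode == "HALT":
            return acc, pc
        else:
            raise ValueError(f"unknown opcode {opcode!r}")


program = [("ADDI", 5), ("ADDI", 7), ("HALT",)]
result, final_pc = run(program)
```

Note that the PC is incremented before the instruction is interpreted, so it already points at the following instruction while step 3 runs.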
Single-bus Organization of the Datapath inside
a processor
 The figure shows an organization in which the arithmetic and logic unit
(ALU) and all the registers are interconnected via a single common
bus.
 This bus is internal to the processor.
 The data and address lines of the external memory bus are connected
to the internal processor bus via the memory data register, MDR, and
the memory address register, MAR, respectively.
 Register MDR has two inputs and two outputs.
 Data may be loaded into MDR either from the memory bus or from
the internal processor bus.
 The data stored in MDR may be placed on either bus. The input of
MAR is connected to the internal bus, and its output is connected to
the external bus.
 The control lines of the memory bus are connected to the instruction
decoder and control logic block.
 This unit is responsible for issuing the signals that control the
operation of all the units inside the processor and for interacting with
the memory bus.
 The number and use of the processor registers R0 through R(n-1) vary
considerably from one processor to another.
 Three registers, Y, Z, and TEMP, are used by the processor for
temporary storage during the execution of some instructions.
 The multiplexer MUX selects either the output of register Y or a
constant value 4 to be provided as input A of the ALU.
 The constant 4 is used to increment the contents of the program
counter.
 We will refer to the two possible values of the MUX control input
Select as Select4 and Select Y for selecting the constant 4 or register
Y, respectively.
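The multiplexer's role can be illustrated with a tiny sketch. The function name `mux` and the string values used for the Select setting are assumptions made for illustration; the real Select input is a hardware control line.

```python
def mux(select, y_value):
    """Return the value presented at ALU input A for a given Select setting."""
    if select == "Select4":
        return 4          # constant used to step the program counter
    if select == "SelectY":
        return y_value    # contents of temporary register Y
    raise ValueError(f"unknown Select value {select!r}")


pc = 100
next_pc = mux("Select4", y_value=0) + pc   # ALU adds input A to the bus value
```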
 As instruction execution progresses, data are transferred from one
register to another, often passing through the ALU to perform some
arithmetic or logic operation.
 The instruction decoder and control logic unit is responsible for
implementing the actions specified by the instruction loaded in the IR
register.
 The decoder generates the control signals needed to select the registers
involved and direct the transfer of data.
 The registers, the ALU, and the interconnecting bus are collectively
referred to as the datapath
 With few exceptions, an instruction can be executed by performing
one or more of the following operations in some specified sequence:
1. Fetch the contents of a given memory location and load them into a
processor register.
2. Store a word of data from a processor register into a given memory
location.
3. Transfer a word of data from one processor register to another or
to the ALU.
4. Perform an arithmetic or logic operation and store the result in a
processor register
1. Fetching a Word from memory
 To fetch a word from memory, the CPU has to specify the address of
the memory location where this information is stored and request a
read operation.
 The CPU transfers the address of the required word of information to
the MAR, which is connected to address lines of the memory bus.
 The CPU uses the control lines of the memory bus to indicate a Read
operation is needed.
 Then the CPU waits for Read operation completion, which is indicated
by the Memory-Function-Completed (MFC) signal being set. When MFC is
set, the information on the data lines is loaded into MDR.
 The connections for register MDR have four control signals: MDRin
and MDRout control the connection to the internal bus, and MDRinE
and MDRoutE control the connection to the external bus.
 Example: fetch a word from the memory location whose address is
specified in R1, and place the fetched word in R2:
1 MAR ← [R1]
2 Start a Read operation on the memory bus
3 Wait for the Memory-Function-Completed (MFC) response from the memory
4 Load MDR from the memory bus
5 R2 ← [MDR]
 The memory read operation requires three control steps, which can be
described by the signals being activated as follows:
1. R1out, MARin, Read
2. MDRinE, WMFC
3. MDRout, R2in
Where WMFC is the control signal that causes the processor’s control
circuitry to wait for the arrival of the MFC signal.
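The three control steps above can be traced with a small simulation. It is a sketch under simplifying assumptions: registers are entries in a dictionary, the memory responds immediately (so WMFC never actually stalls), and the helper name `read_word` is hypothetical.

```python
def read_word(regs, memory):
    """Trace the control steps that load R2 from the address held in R1."""
    # Step 1: R1out, MARin, Read -> address goes over the internal bus to MAR
    bus = regs["R1"]
    regs["MAR"] = bus
    # Step 2: MDRinE, WMFC -> memory drives the data lines; MDR loads from
    # the external bus once MFC is asserted (modelled here as immediate)
    regs["MDR"] = memory[regs["MAR"]]
    # Step 3: MDRout, R2in -> MDR contents go over the internal bus to R2
    bus = regs["MDR"]
    regs["R2"] = bus
    return regs


regs = {"R1": 3, "R2": 0, "MAR": 0, "MDR": 0}
memory = [10, 20, 30, 40]
read_word(regs, memory)
```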
2. Storing a Word in Memory
 After the address is loaded into MAR and the data into MDR, the CPU
uses the control lines of the memory bus to indicate that a Write
operation is needed.
 The example below shows how the machine stores the word in R2 into the
memory location whose address is specified in R1:
1 MAR ← [R1]
2 MDR ←[R2]
3 Request memory write
4 Wait for MFC signal
 Executing the instruction Move R2,(R1) requires the following steps:
1. R1out, MARin
2. R2out, MDRin, Write
3. MDRoutE, WMFC
 As in the case of the Read operation, the Write control signal causes
the memory bus interface hardware to issue a Write command on the
memory bus.
 The processor remains in step 3 until the memory operation is
completed and an MFC response is received
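The write sequence for Move R2,(R1) can be traced in the same spirit. This is a sketch only: registers live in a dictionary, the memory is a Python list that accepts the write immediately (so the wait for MFC is not modelled), and the name `write_word` is hypothetical.

```python
def write_word(regs, memory):
    """Trace the control steps for Move R2,(R1): store R2 at address [R1]."""
    # Step 1: R1out, MARin -> address goes over the internal bus to MAR
    regs["MAR"] = regs["R1"]
    # Step 2: R2out, MDRin, Write -> data goes over the internal bus to MDR
    regs["MDR"] = regs["R2"]
    # Step 3: MDRoutE, WMFC -> MDR drives the external data lines and the
    # processor waits here until the memory signals MFC
    memory[regs["MAR"]] = regs["MDR"]
    return memory


regs = {"R1": 2, "R2": 99, "MAR": 0, "MDR": 0}
memory = [0, 0, 0, 0]
write_word(regs, memory)
```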
3. Register Transfers
 Instruction execution involves a sequence of steps in which data are
transferred from one register to another.
 For each register, two control signals are used: one to place the
contents of that register on the bus and one to load the data on the
bus into the register.
 This is represented symbolically in the figure: the input and output of
register Ri are connected to the bus via switches controlled by the
signals Riin and Riout, respectively.
 When Riin is set to 1, the data on the bus are loaded into Ri.
 Similarly, when Riout is set to 1, the contents of register Ri are placed
on the bus.
 While Riout is equal to 0, the bus can be used for transferring data
from other registers.
Input and output gating for the registers
 Suppose that we wish to transfer the contents of register R1 to register R4.
 This can be accomplished as follows:
 Enable the output of register R1 by setting R1out to 1.
• This places the contents of R1 on the processor bus.
 Enable the input of register R4 by setting R4in to 1.
• This loads data from the processor bus into register R4.
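The gating of R1 onto the bus and into R4 can be sketched as follows. This is a simplified model that treats one clock cycle as a single function call; the function name `clock_cycle` and the list-of-names encoding of the control signals are assumptions made for illustration.

```python
def clock_cycle(regs, out_signals, in_signals):
    """Apply one clock cycle of gating signals on a single shared bus.

    out_signals: names of registers whose Riout is 1 (at most one allowed).
    in_signals:  names of registers whose Riin is 1.
    """
    if len(out_signals) > 1:
        raise ValueError("only one register may drive the bus per cycle")
    if not out_signals:
        return regs   # nothing drives the bus, so no transfer happens
    bus = regs[out_signals[0]]   # Riout = 1 places Ri on the bus
    for name in in_signals:      # Riin = 1 loads the bus into Ri
        regs[name] = bus
    return regs


regs = {"R1": 42, "R4": 0}
clock_cycle(regs, out_signals=["R1"], in_signals=["R4"])   # R1out, R4in
```

The check on `out_signals` reflects the single-bus constraint: two registers cannot place their contents on the same bus in one cycle.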
 All operations and data transfers within the processor take place
within time periods defined by the processor clock.
 The control signals that govern a particular transfer are asserted at the
start of the clock cycle.
 In our example, R1out and R4in are set to 1.
 The registers consist of edge-triggered flip-flops.
 Hence, at the next active edge of the clock, the flip-flops that
constitute R4 will load the data present at their inputs.
 At the same time, the control signals R1out and R4in will return to 0.
4. Performing an Arithmetic or Logic Operation
 The ALU is a combinational circuit that has no internal storage.
 It performs arithmetic and logic operations on the two operands
applied to its A and B inputs.
 To add two numbers, the two operands have to be made available at
the inputs of the ALU simultaneously.
 In the figure, one of the operands is the output of the multiplexer MUX
and the other operand is obtained directly from the bus.
 The result produced by the ALU is stored temporarily in register Z.
 A sequence of operations to add the contents of register R1 to those of
register R2 and store the result in register R3 is:
1. R1out, Yin
2. R2out, Select Y, Add, Zin
3. Zout, R3in
 The signals whose names are given in any step are activated for the
duration of the clock cycle corresponding to that step.
 All other signals are inactive.
 Hence, in step 1, the output of register R1 and the input of register Y
are enabled, causing the contents of R1 to be transferred over the bus
to Y.
 In step 2, the multiplexer's Select signal is set to Select Y, causing the
multiplexer to gate the contents of register Y to input A of the ALU.
 At the same time, the contents of register R2 are gated onto the bus
and, hence, to input B.
 The function performed by the ALU depends on the signals applied to
its control lines.
 In this case, the Add line is set to 1, causing the output of the ALU to
be the sum of the two numbers at inputs A and B.
 This sum is loaded into register Z, because its input control signal is
activated.
 In step 3, the contents of register Z are transferred to the destination
register, R3.
 This last transfer cannot be carried out during step 2, because only one
register output can be connected to the bus during any clock cycle.
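The three-step addition described above can be traced with a short sketch. It assumes registers stored in a dictionary and plain Python integers with no overflow handling; the function name is hypothetical.

```python
def add_r1_r2_into_r3(regs):
    """Trace the three control steps that compute R3 <- [R1] + [R2]."""
    # Step 1: R1out, Yin
    bus = regs["R1"]
    regs["Y"] = bus
    # Step 2: R2out, Select Y, Add, Zin
    bus = regs["R2"]            # input B of the ALU comes from the bus
    a_input = regs["Y"]         # MUX set to Select Y feeds input A
    regs["Z"] = a_input + bus   # Add; the result is latched in Z
    # Step 3: Zout, R3in
    bus = regs["Z"]
    regs["R3"] = bus
    return regs


regs = {"R1": 6, "R2": 9, "R3": 0, "Y": 0, "Z": 0}
add_r1_r2_into_r3(regs)
```

Each step assigns `bus` exactly once, which mirrors the rule that only one register output may drive the single bus in a given clock cycle.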
EXECUTION OF A COMPLETE INSTRUCTION
 Let us now consider the sequence of elementary operations required to
execute one instruction.
 Consider the instruction,
Add (R3), R1 which adds the contents of a memory location
pointed to by R3 to register R1.
Executing this instruction requires the following actions:
1. Fetch the instruction
2. Fetch the first operand (the contents of the memory
location pointed to by R3)
3. Perform the addition
4. Load the result into R1
 Instruction execution proceeds as follows:
 In step 1, the instruction fetch operation is initiated by loading the
contents of the PC into the MAR and sending a Read request to the
memory.
 The Select signal is set to Select1, which causes the multiplexer MUX
to select the constant 1, the PC increment under the word-addressable
assumption made earlier.
 This value is added to the operand at input B, which is the contents of
the PC, and the result is stored in register Z.
 The updated value is moved from register Z back into the PC during step
2, while waiting for the memory to respond.
 In step 3, the word fetched from the memory is loaded into the IR.
Control Sequence for Execution of the
instruction Add (R3), R1
Step Action
1. PCout, MARin, Read, Select1, Add, Zin
2. Zout, PCin, Yin, WMFC
3. MDRout, IRin
4. R3out, MARin, Read
5. R1out, Yin, WMFC
6. MDRout, SelectY, Add, Zin
7. Zout, R1in, End
 Steps 1 through 3 constitute the instruction fetch phase, which is the
same for all instructions.
 The instruction decoding circuit interprets the contents of the IR at the
beginning of step 4.
 This enables the control circuitry to activate the control signals for
steps 4 through 7, which constitute the execution phase.
 The contents of register R3 are transferred to the MAR in step 4 and a
memory Read operation is initiated.
 Then, the contents of R1 are transferred to register Y in step 5, to
prepare for the addition operation.
 When the Read operation is completed, the memory operand is
available in register MDR and the addition operation is performed in
step 6.
 The contents of MDR are gated to the bus and thus also to the B input
of the ALU, and register Y is selected as the second input to the ALU
by choosing Select Y.
 The sum is stored in register Z and then transferred to R1 in step 7.
 The End signal causes a new instruction fetch cycle to begin by
returning to step 1.
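The seven-step control sequence for Add (R3), R1 can be traced end to end with a sketch like the one below. It assumes a word-addressable memory modelled as a Python list, a memory that responds by the time WMFC is reached, and an instruction word stored as a plain string; the function name and the sample values are hypothetical.

```python
def execute_add_indirect(regs, memory):
    """Trace the seven control steps for Add (R3), R1."""
    # Step 1: PCout, MARin, Read, Select1, Add, Zin
    bus = regs["PC"]
    regs["MAR"] = bus
    regs["Z"] = 1 + bus                  # MUX supplies the constant 1
    # Step 2: Zout, PCin, Yin, WMFC
    bus = regs["Z"]
    regs["PC"] = bus
    regs["Y"] = bus
    regs["MDR"] = memory[regs["MAR"]]    # read completes (MFC)
    # Step 3: MDRout, IRin
    regs["IR"] = regs["MDR"]
    # Step 4: R3out, MARin, Read
    regs["MAR"] = regs["R3"]
    # Step 5: R1out, Yin, WMFC
    regs["Y"] = regs["R1"]
    regs["MDR"] = memory[regs["MAR"]]    # read completes (MFC)
    # Step 6: MDRout, SelectY, Add, Zin
    regs["Z"] = regs["Y"] + regs["MDR"]
    # Step 7: Zout, R1in, End
    regs["R1"] = regs["Z"]
    return regs


memory = ["Add (R3), R1", 0, 0, 0, 0, 25]   # instruction at 0, operand at 5
regs = {"PC": 0, "IR": None, "MAR": 0, "MDR": 0, "Y": 0, "Z": 0,
        "R1": 17, "R3": 5}
execute_add_indirect(regs, memory)
```

Steps 1 to 3 are the fetch phase and would be identical for any instruction; only steps 4 to 7 depend on the contents of IR.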
Branch Instructions
 A branch instruction replaces the contents of the PC with the branch
target address.
 This address is usually obtained by adding an offset X which is given
in the branch instruction to the updated value of the PC.
 Control sequence that implements an unconditional branch instruction
is as follows:
Step Action
1. PCout, MARin, Read, Select1, Add, Zin
2. Zout, PCin, Yin, WMFC
3. MDRout, IRin
4. Offset-field-of-IRout, Add, Zin
5. Zout, PCin, End
 Processing starts, as usual with the fetch phase.
 This phase ends when the instruction is loaded into the IR in step 3.
 The offset value is extracted from the IR by the instruction decoding
circuit which will also perform sign extension if required.
 Since the value of the updated PC is already available in register Y,
the offset X is gated onto the bus in step 4 and an addition operation is
performed.
 The result, which is the branch target address, is loaded into the PC in
step 5
 The offset X used in a branch instruction is usually the difference
between the branch target address and the address immediately
following the branch instruction.
 For example, if the branch instruction is at location 2000 and the
branch target address is 2050, the updated PC is 2001 (increment of 1),
so the value of X must be 2050 - 2001 = 49.
 The PC is incremented during the fetch phase, before knowing the
type of instruction being executed.
 Thus, when the branch address is computed in step 4, the PC value
used is the updated value, which points to the instruction following the
branch instruction in the memory.
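The relation between the offset X, the updated PC, and the branch target can be checked with a few lines of arithmetic. The function names and the `increment` parameter are illustrative assumptions; the default of 1 matches the word-addressable example above, while 4 would apply to a byte-addressable machine with 4-byte instructions.

```python
def branch_offset(branch_address, target_address, increment=1):
    """Offset X such that target = (branch_address + increment) + X."""
    updated_pc = branch_address + increment
    return target_address - updated_pc


def branch_target(branch_address, offset, increment=1):
    """Target address computed from the updated PC plus the offset."""
    updated_pc = branch_address + increment
    return updated_pc + offset


x_word = branch_offset(2000, 2050)                # PC incremented by 1
x_byte = branch_offset(2000, 2050, increment=4)   # PC incremented by 4
```

With an increment of 1 the offset is 49; with an increment of 4 the same branch would need an offset of 46.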
 Consider now a conditional branch.
 In this case, we need to check the status of the condition codes before
loading a new value into the PC.
 For example, for a Branch-on negative (Branch<0) instruction, step 4
in Figure 5.5 is replaced with:
Offset-field-of-IRout, Add, Zin, If N=0 then End
 Thus, if N=0, the processor returns to step 1 immediately after step 4.
If N=1, step 5 is performed to load a new value into the PC, thus
performing the branch operation
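The effect of the N flag on a Branch<0 instruction can be sketched as below. The function name and its arguments are hypothetical; the sketch starts after the fetch phase, so it takes the already-updated PC as input.

```python
def branch_if_negative(pc_after_fetch, offset, n_flag):
    """Return the next PC for Branch<0, given the already-updated PC."""
    # Step 4: Offset-field-of-IRout, Add, Zin, If N=0 then End
    z = pc_after_fetch + offset   # Y holds the updated PC from step 2
    if n_flag == 0:
        return pc_after_fetch     # End: PC keeps the fall-through address
    # Step 5: Zout, PCin, End
    return z


taken = branch_if_negative(2001, 49, n_flag=1)
not_taken = branch_if_negative(2001, 49, n_flag=0)
```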
MULTIPLE BUS ORGANIZATION
 A three-bus structure can be used to connect the registers and the
ALU of a processor.
 All general-purpose registers are combined into a single block called
the register file.
 In VLSI technology, the most efficient way to implement a number of
registers is in the form of an array of memory cells similar to those
used in the implementation of random-access memories (RAMs).
 The register file has three ports.
 There are two outputs, allowing the contents of two different registers
to be accessed simultaneously and have their contents placed on buses
A and B.
 The third port allows the data on bus C to be loaded into a third
register during the same clock cycle
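A three-port register file can be modelled roughly as follows. The class name, method name, and register count are assumptions for illustration, and the value on bus C is passed in by the caller rather than produced by a modelled ALU.

```python
class RegisterFile:
    """Toy register file with two read ports (A, B) and one write port (C)."""

    def __init__(self, count):
        self.cells = [0] * count

    def cycle(self, read_a, read_b, write_c=None, bus_c=None):
        """One clock cycle: read two registers, optionally write a third.

        Reads see the values held before the clock edge; the write from
        bus C takes effect at the end of the cycle.
        """
        bus_a = self.cells[read_a]
        bus_b = self.cells[read_b]
        if write_c is not None:
            self.cells[write_c] = bus_c
        return bus_a, bus_b


rf = RegisterFile(8)
rf.cells[4], rf.cells[5] = 10, 32
a, b = rf.cycle(read_a=4, read_b=5, write_c=6, bus_c=10 + 32)
```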
 Buses A and B are used to transfer the source operands to the A and B
inputs of the ALU where an arithmetic or logic operation may be
performed.
 The result is transferred to the destination over bus C.
 If needed, the ALU may simply pass one of its two input operands
unmodified to bus C.
 We will call the ALU control signals for such an operation R=A or
R=B.
 A second feature is the introduction of the Incrementer unit, which is
used to increment the PC by 4.
 Using the Incrementer eliminates the need to add 4 to the PC using the
main ALU.
 The source for the constant 4 at the ALU input multiplexer is still
useful.
 It can be used to increment other addresses, such as the memory
addresses in Load Multiple and Store Multiple instructions.
 The three-bus structure in the figure requires significantly fewer
control steps to execute instructions than the single-bus organization.
 Consider the three-operand instruction of the form
OP Rsrc1, Rsrc2, Rdst
in which an operation is performed on the contents of
two source registers, and the result is placed into a destination register.
 Buses A and B are used to transfer the source operands, and bus C
provides the path to the destination.
 The path from the source buses to the destination bus goes through the
ALU, where the required operation is performed.
 Thus, assuming that the operation to be performed can be completed in
one pass through the ALU, the three-bus structure allows the
execution phase of an instruction to be performed in one cycle.
 Note that if it is merely necessary to copy the contents of one register
into another, then the transfer is also done through the ALU, but no
arithmetic or logic operation is performed
 The temporary storage registers Y and Z are not required.
 Register Y is not needed because both inputs to the ALU are provided
simultaneously via buses A and B.
 Register Z is not needed because the output from the ALU is
transferred to the destination register via the third bus, C.
 In this structure it is essential to ensure that the same register can serve
as both the source and the destination in a given instruction.
 Example:
Add R4, R5, R6
 The control sequence for executing this instruction works as follows.
 In step 1, the contents of the PC are passed through the ALU using the
R=B control signal and loaded into the MAR to start a memory read
operation.
 At the same time, the PC is incremented by 4.
 Note that the value loaded into MAR is the original contents of the
PC.
 The incremented value is loaded into the PC at the end of the clock
cycle and will not affect the contents of MAR.
 In step 2, the processor waits for MFC and loads the data received
into MDR, then transfers them to IR in step 3.
 Finally, the execution phase of the instruction requires only one
control step, step 4.
Control Sequence for the instruction Add R4, R5, R6
Step Action
1. PCout, R=B, MARin, Read, IncPC
2. WMFC
3. MDRoutB, R=B, IRin
4. R4outA, R5outB, SelectA, Add, R6in, End
 By providing more paths for data transfer, a significant reduction in
the number of clock cycles needed to execute an instruction is
achieved.
 The three-bus structure allows execution of register-to-register
operations in a single clock cycle.
 This is particularly well suited to the requirements of RISC processors,
in which most arithmetic and logic instructions have register operands.
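The four-step sequence for Add R4, R5, R6 on the three-bus structure can be traced with the sketch below. It assumes a byte-addressed memory modelled as a dictionary, a memory that has responded by the WMFC step, and an instruction word stored as a string; the function name and sample values are hypothetical.

```python
def execute_add_three_bus(regs, memory):
    """Trace the four control steps for Add R4, R5, R6 on three buses."""
    # Step 1: PCout, R=B, MARin, Read, IncPC
    regs["MAR"] = regs["PC"]            # PC passes through the ALU unchanged
    regs["PC"] = regs["PC"] + 4         # Incrementer updates PC at cycle end
    # Step 2: WMFC
    regs["MDR"] = memory[regs["MAR"]]   # memory responds (MFC)
    # Step 3: MDR onto bus B, R=B, IRin
    regs["IR"] = regs["MDR"]
    # Step 4: R4outA, R5outB, SelectA, Add, R6in, End
    bus_a, bus_b = regs["R4"], regs["R5"]
    regs["R6"] = bus_a + bus_b          # result travels on bus C
    return 4                            # number of control steps used


memory = {0: "Add R4, R5, R6"}          # byte address -> instruction word
regs = {"PC": 0, "IR": None, "MAR": 0, "MDR": 0, "R4": 8, "R5": 34, "R6": 0}
steps = execute_add_three_bus(regs, memory)
```

The whole execution phase is step 4 alone, because both source operands and the destination are moved in the same cycle and no Y or Z register is involved.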
45

More Related Content

PPT
8051 block diagram
PDF
Microcontroller pic 16f877 addressing modes instructions and programming
PPTX
PIC-18 Microcontroller
PPT
Architecture of 8086 Microprocessor
PPTX
PULSE WIDTH MODULATION &DEMODULATION
PPTX
Equalization
PDF
8085 microprocessor ramesh gaonkar
PDF
Pulse modulation, Pulse Amplitude (PAM), Pulse Width (PWM/PLM/PDM), Pulse Pos...
8051 block diagram
Microcontroller pic 16f877 addressing modes instructions and programming
PIC-18 Microcontroller
Architecture of 8086 Microprocessor
PULSE WIDTH MODULATION &DEMODULATION
Equalization
8085 microprocessor ramesh gaonkar
Pulse modulation, Pulse Amplitude (PAM), Pulse Width (PWM/PLM/PDM), Pulse Pos...

What's hot (20)

PPTX
8051 timer counter
PPTX
Ec8491 CT - Unit 1 - Single Sideband Suppressed Carrier (SSB-SC)
PPT
Microinstruction sequencing new
PPT
Data transferschemes
PDF
8086 memory segmentation
PPTX
Interfacing external memory in 8051
DOCX
Pin configuration of 8085
PPTX
PROGRAMMABLE KEYBOARD AND DISPLAY INTERFACE(8279).pptx
PDF
Serial Communication Interfaces
DOC
PIC MICROCONTROLLERS -CLASS NOTES
PPTX
Adaptive delta modulation
PPTX
Design of Alu in computer architecture.pptx
PDF
Unit 2 mpmc
PPT
8237 / 8257 DMA
PPTX
UNIT 2 8086 System Bus Structure.pptx
PPTX
Comparsion of M-Ary psk,fsk,qapsk.pptx
PPT
Arm organization and implementation
PPTX
Lec 1 digital electroinics - memory array, write read operations
PPTX
Msp 430 addressing modes module 2
PPTX
80386 Architecture
8051 timer counter
Ec8491 CT - Unit 1 - Single Sideband Suppressed Carrier (SSB-SC)
Microinstruction sequencing new
Data transferschemes
8086 memory segmentation
Interfacing external memory in 8051
Pin configuration of 8085
PROGRAMMABLE KEYBOARD AND DISPLAY INTERFACE(8279).pptx
Serial Communication Interfaces
PIC MICROCONTROLLERS -CLASS NOTES
Adaptive delta modulation
Design of Alu in computer architecture.pptx
Unit 2 mpmc
8237 / 8257 DMA
UNIT 2 8086 System Bus Structure.pptx
Comparsion of M-Ary psk,fsk,qapsk.pptx
Arm organization and implementation
Lec 1 digital electroinics - memory array, write read operations
Msp 430 addressing modes module 2
80386 Architecture
Ad

Similar to Coa module2 (20)

DOCX
4th sem,(cs is),computer org unit-7
PPTX
COA-UNIT-III-FINAL (1).pptx
PDF
310471266 chapter-7-notes-computer-organization
PDF
COMPUTER ORGANIZATION NOTES Unit 7
PPT
CO By Rakesh Roshan
PPT
basic-processing-unit computer organ.ppt
PPT
Computer Organization for third semester Vtu SyllabusModule 4.ppt
PDF
BCS302-DDCO-basic processing unit-Module 5- VTU 2022 scheme-DDCO-pdf
PDF
Computer Organization
PPT
unit_6-basic-processing-unit.pptt engineering
PPT
Digital-Unit-III.ppt
PPT
Computer Organisation and Architecture
PDF
Central processing unit i
PPT
Computer Organization Unit 4 Processor &Control Unit
PPTX
Precessor organization
PPT
basic structure of computers
PPT
Unit2 control unit
PPT
chapter3 - Basic Processing base Unit.ppt
PPTX
8085microprocessor-functional block diagram, Arithmetic Logic Unit (ALU), Tim...
PPT
Unit 1 basic structure of computers
4th sem,(cs is),computer org unit-7
COA-UNIT-III-FINAL (1).pptx
310471266 chapter-7-notes-computer-organization
COMPUTER ORGANIZATION NOTES Unit 7
CO By Rakesh Roshan
basic-processing-unit computer organ.ppt
Computer Organization for third semester Vtu SyllabusModule 4.ppt
BCS302-DDCO-basic processing unit-Module 5- VTU 2022 scheme-DDCO-pdf
Computer Organization
unit_6-basic-processing-unit.pptt engineering
Digital-Unit-III.ppt
Computer Organisation and Architecture
Central processing unit i
Computer Organization Unit 4 Processor &Control Unit
Precessor organization
basic structure of computers
Unit2 control unit
chapter3 - Basic Processing base Unit.ppt
8085microprocessor-functional block diagram, Arithmetic Logic Unit (ALU), Tim...
Unit 1 basic structure of computers
Ad

More from cs19club (17)

PDF
Podd notes
PDF
Podd note 1
PDF
Podd mod6
DOC
Module 3
PDF
Ch05
PDF
Ch04
DOCX
Oodp extra2
DOCX
Oodp mod4
PPT
Module vi
PPTX
Module5 part2
PPT
Module 5 part1
PPT
Module4
PPTX
Io pro
PDF
Io pro
PPTX
Floating point arithmetic operations (1)
PPT
Addition and subtraction with signed magnitude data (mano
PPT
Coa module1
Podd notes
Podd note 1
Podd mod6
Module 3
Ch05
Ch04
Oodp extra2
Oodp mod4
Module vi
Module5 part2
Module 5 part1
Module4
Io pro
Io pro
Floating point arithmetic operations (1)
Addition and subtraction with signed magnitude data (mano
Coa module1

Recently uploaded (20)

PDF
Module 4: Burden of Disease Tutorial Slides S2 2025
PPTX
Cell Structure & Organelles in detailed.
PPTX
GDM (1) (1).pptx small presentation for students
PDF
OBE - B.A.(HON'S) IN INTERIOR ARCHITECTURE -Ar.MOHIUDDIN.pdf
DOC
Soft-furnishing-By-Architect-A.F.M.Mohiuddin-Akhand.doc
PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PPTX
Orientation - ARALprogram of Deped to the Parents.pptx
PPTX
Pharmacology of Heart Failure /Pharmacotherapy of CHF
PDF
Classroom Observation Tools for Teachers
PDF
GENETICS IN BIOLOGY IN SECONDARY LEVEL FORM 3
PDF
Abdominal Access Techniques with Prof. Dr. R K Mishra
PDF
Trump Administration's workforce development strategy
PDF
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PDF
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
PDF
Weekly quiz Compilation Jan -July 25.pdf
PDF
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
PPTX
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
PDF
RTP_AR_KS1_Tutor's Guide_English [FOR REPRODUCTION].pdf
Module 4: Burden of Disease Tutorial Slides S2 2025
Cell Structure & Organelles in detailed.
GDM (1) (1).pptx small presentation for students
OBE - B.A.(HON'S) IN INTERIOR ARCHITECTURE -Ar.MOHIUDDIN.pdf
Soft-furnishing-By-Architect-A.F.M.Mohiuddin-Akhand.doc
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
Orientation - ARALprogram of Deped to the Parents.pptx
Pharmacology of Heart Failure /Pharmacotherapy of CHF
Classroom Observation Tools for Teachers
GENETICS IN BIOLOGY IN SECONDARY LEVEL FORM 3
Abdominal Access Techniques with Prof. Dr. R K Mishra
Trump Administration's workforce development strategy
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
2.FourierTransform-ShortQuestionswithAnswers.pdf
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
Weekly quiz Compilation Jan -July 25.pdf
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
RTP_AR_KS1_Tutor's Guide_English [FOR REPRODUCTION].pdf

Coa module2

  • 2. TOPICS COVERED  FUNDAMENTAL CONCEPTS  INSTRUCTION CYCLE  EXECUTION OF A COMPLETE INSTRUCTION  MULTIPLE BUS ORGANIZATION  SEQUENCING OF CONTROL SIGNALS 2
  • 3. 3 INTRODUCTION  This unit is about the processing unit, which executes machine instructions and coordinates the activities of other units.  This is also called a processor or instruction set processor (ISP).  We understand its internal structure and how it performs the tasks of fetching, decoding, and executing instructions of a program.  The processing unit is called central processing unit (CPU).  We explore the organization of the hardware that enables a CPU to perform its main function.  We learn how the execution of a complete instruction takes place and we also learn Branch Instructions
  • 4. 4 SOME FUNDAMENTAL CONCEPTS  A program, a set of instructions, to be executed by a computer is loaded in sequential locations in the main memory.  To execute this program, the CPU fetches one instruction at a time and performs the functions specified.  Until a branch or a jump instruction is executed, instructions are fetched from successive memory locations.  The address of the next instruction to be executed is kept by the CPU in a dedicated register called program counter (PC).  The contents of the PC are updated to point to the next instruction in the sequence.
  • 5. 5  Assume that each instruction occupies one memory word.  Therefore, one instruction execution requires the CPU to perform the following 3 steps: 1. Fetch the contents of the memory location by the PC into instruction register (IR). Symbolically, this can be written as: IR ← [[PC]] 2. Increment the contents of the PC by 1, i.e., (assuming word addressable) PC ← [PC]+1 3. Carry out the actions specified by the instruction in the IR.  Steps 1 and 2 are called the fetch phase and step 3 is called the execution phase.
  • 6. 6 Single-bus Organization of the Datapath inside a processor
  • 7. 7  Figure shows an organization in which the arithmetic and logic unit (ALU) and all the registers are interconnected via a single common bus.  This bus is internal to the processor  The address and data lines of the external memory bus are connected to the internal processor bus via the memory data register, MDR, and the memory address register, MAR, respectively.  Register MDR has two inputs and two outputs.  Data may be loaded into MDR either from the memory bus or from the internal processor bus.
  • 8. 8  The data stored in MDR may be placed on either bus. The input of MAR is connected to the internal bus, and its output is connected to the external bus.  The control lines of the memory bus are connected to the instruction decoder and control logic block.  This unit is responsible for issuing the signals that control the operation of all the units inside the processor and for interacting with the memory bus.  The use and number of the processor registers R0 through R(n-1) vary considerably from one processor to another  Three registers, Y, S, and TEMP are used by the processor for temporary storage during execution of some instructions
  • 9. 9  The multiplexer MUX selects either the output of register Y or a constant value 4 to be provided as input A of the ALU.  The constant 4 is used to increment the contents of the program counter.  We will refer to the two possible values of the MUX control input Select as Select4 and Select Y for selecting the constant 4 or register Y, respectively.
  • 10. 10  As instruction execution progresses, data are transferred from one register to another, often passing through the ALU to perform some arithmetic or logic operation.  The instruction decoder and control logic unit is responsible for implementing the actions specified by the instruction loaded in the IR register.  The decoder generates the control signals needed to select the registers involved and direct the transfer of data.  The registers, the ALU, and the interconnecting bus are collectively referred to as the datapath
  • 11. 11  With few exceptions, an instruction can be executed by performing one or more of the following operations in some specified sequence: 1. Fetch the contents of a given memory location and load them into a processor register. 2. Store a word of data from a processor register into a given memory location. 3. Transfer a word of data from one processor register to another or to the ALU. 4. Perform an arithmetic or logic operation and store the result in a processor register
  • 12. 1. Fetching a Word from memory  To fetch a word from memory, the CPU has to specify the address of the memory location where this information is stored and request a read operation.  The CPU transfers the address of the required word of information to the MAR, which is connected to address lines of the memory bus.  The CPU uses the control lines of the memory bus to indicate a Read operation is needed.  Then the CPU waits for Read operation completion, which is indicated by Memory-Function Completed (MFC) signal set. When the MFC is set, the information on the data lines is loaded into MDR
  • 13.  The connections for register MDR has four control signals: MDRin and MDRout control the connection to the internal bus, and MDRin E and MDRout E control the connection to the external bus.  Example :how to fetch a word from memory location, whose address is specified in R1, and place the word fetched in R2 1 MAR ← [R1] 2 Request memory READ and put the data to the address register 3 Wait for the Memory Fetch Cycle (MFC) signal 4 Load MDR from the memory 5 R2 ←[MDR]
  • 14.  The memory read operation requires three steps:  Which can be described by the signals being activated as follows: 1. R1out, MARin, Read 2. MDRinE, WMFC 3. MDRout, R2in Where WMFC is the control signal that causes the processor’s control circuitry to wait for the arrival of the MFC signal. 14
  • 15. 2. Storing a Word in Memory  After the address is loaded into MAR and data into MDR, The CPU uses the control lines of the memory bus to indicate a Write operation is needed.  The example below shows how the machine store a word in R2 into a memory location, whose address is specified in R1 1 MAR ← [R1] 2 MDR ←[R2] 3 Request memory write 4 Wait for MFC signal 15
  • 16.  Executing the instruction Move R2,(R1) requires the following steps: 1. R1out, MARin, 2. R2out, MDRin, Write 3. MDRoutE, WMFC  As in the case of the Read operation, the Write control signal causes the memory bus interface hardware to issue a Write command on the memory bus.  The processor remains in step 3 until the memory operation is completed and an MFC response is received 16
  • 17. 3. Register Transfers  Instruction execution involves a sequence of steps in which data are transferred from one register to another.  For each register, two control signals are used to place the contents of that register on the bus or to load the data on the bus into the register.  This is represented symbolically in Figure the input and output of register Ri are connected to the bus via switches controlled by the signals Riinn and Riout respectively.  When Riinn is set to 1, the data on the bus are loaded into Ri.  Similarly, when Riout is set to 1, the contents of register Ri are placed on the bus.  While Riout is equal to 0, the bus can be used for transferring data 17
  • 18. Input and output gating for the registers 18
  • 19.  Suppose that we wish to transfer the contents of register R1 to register R4.  This can be accomplished as follows:  Enable the output of register R1 by setting R1out to 1. • This places the contents of R1 on the processor bus.  Enable the input of register R4 by setting R4in to 1. • This loads data from the processor bus into register R4.
  • 20.  All operations and data transfers within the processor take place within time periods defined by the processor clock.  The control signals that govern a particular transfer are asserted at the start of the clock cycle.  In our example, R1out and R4in are set to 1.  The registers consist of edge-triggered flip-flops.  Hence, at the next active edge of the clock, the flip-flops that constitute R4 will load the data present at their inputs.  At the same time, the control signals R1out and R4in will return to 0.
  • 21. 4. Performing an Arithmetic or Logic Operation  The ALU is a combinational circuit that has no internal storage.  It performs arithmetic and logic operations on the two operands applied to its A and B inputs.  To add two numbers, the two operands have to be made available at the inputs of the ALU simultaneously.  In the single-bus figure, one of the operands is the output of the multiplexer MUX and the other operand is obtained directly from the bus.  The result produced by the ALU is stored temporarily in register Z.
  • 22.  A sequence of operations to add the contents of register R1 to those of register R2 and store the result in register R3 is: 1. R1out, Yin 2. R2out, SelectY, Add, Zin 3. Zout, R3in  The signals whose names are given in any step are activated for the duration of the clock cycle corresponding to that step.  All other signals are inactive.  Hence, in step 1, the output of register R1 and the input of register Y are enabled, causing the contents of R1 to be transferred over the bus to Y.
  • 23.  In step 2, the multiplexer’s Select signal is set to SelectY, causing the multiplexer to gate the contents of register Y to input A of the ALU.  At the same time, the contents of register R2 are gated onto the bus and, hence, to input B.  The function performed by the ALU depends on the signals applied to its control lines.  In this case, the Add line is set to 1, causing the output of the ALU to be the sum of the two numbers at inputs A and B.  This sum is loaded into register Z, because its input control signal is activated.  In step 3, the contents of register Z are transferred to the destination register, R3.  This last transfer cannot be carried out during step 2, because only one register output can be connected to the bus during any clock cycle.
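The three-step addition and the one-output-per-cycle restriction can be illustrated with a small Python sketch. The function `run_step`, the dictionary `regs`, and the operand values are hypothetical names and data chosen for illustration. The model only understands the signals that appear in this sequence.

```python
def run_step(regs, signals):
    """Apply the control signals of one clock cycle to a single-bus datapath."""
    outs = [s for s in signals if s.endswith("out")]
    # Only one register output may be connected to the bus in a clock cycle.
    if len(outs) > 1:
        raise ValueError("bus conflict: " + ", ".join(outs))
    bus = regs[outs[0][:-3]] if outs else None
    alu = None
    if "Add" in signals:
        # The MUX feeds ALU input A with Y (SelectY) or the constant 1 (Select1);
        # ALU input B comes straight from the bus.
        a = regs["Y"] if "SelectY" in signals else 1
        alu = a + bus
    for s in signals:
        if s.endswith("in"):
            name = s[:-2]
            # Z is loaded from the ALU output; every other register from the bus.
            regs[name] = alu if name == "Z" else bus

regs = {"R1": 5, "R2": 7, "R3": 0, "Y": 0, "Z": 0}   # assumed sample values
for step in (["R1out", "Yin"],
             ["R2out", "SelectY", "Add", "Zin"],
             ["Zout", "R3in"]):
    run_step(regs, step)

# Trying to merge steps 2 and 3 would put two outputs on the bus at once.
try:
    run_step(dict(regs), ["R2out", "SelectY", "Add", "Zin", "Zout", "R3in"])
    merged_ok = True
except ValueError:
    merged_ok = False
```

With these sample values R3 ends up holding 12, and the merged step is rejected, which mirrors the reason the transfer from Z to R3 needs its own clock cycle.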
  • 24. EXECUTION OF A COMPLETE INSTRUCTION  Let us now consider the sequence of elementary operations required to execute one instruction.  Consider the instruction Add (R3), R1, which adds the contents of the memory location pointed to by R3 to register R1.  Executing this instruction requires the following actions: 1. Fetch the instruction 2. Fetch the first operand (the contents of the memory location pointed to by R3) 3. Perform the addition 4. Load the result into R1
  • 25.  Instruction execution proceeds as follows:  In step 1, the instruction fetch operation is initiated by loading the contents of the PC into the MAR and sending a Read request to the memory.  The Select signal is set to Select1, which causes the multiplexer MUX to select the constant 1.  This value is added to the operand at input B, which is the contents of the PC, and the result is stored in register Z.  The updated value is moved from register Z back into the PC during step 2, while waiting for the memory to respond.  In step 3, the word fetched from the memory is loaded into the IR.
  • 26. Control Sequence for Execution of the instruction Add (R3), R1  Step Action 1. PCout, MARin, Read, Select1, Add, Zin 2. Zout, PCin, Yin, WMFC 3. MDRout, IRin 4. R3out, MARin, Read 5. R1out, Yin, WMFC 6. MDRout, SelectY, Add, Zin 7. Zout, R1in, End
  • 27.  Steps 1 through 3 constitute the instruction fetch phase, which is the same for all instructions.  The instruction decoding circuit interprets the contents of the IR at the beginning of step 4.  This enables the control circuitry to activate the control signals for steps 4 through 7, which constitute the execution phase.  The contents of register R3 are transferred to the MAR in step 4 and a memory Read operation is initiated.  Then, the contents of R1 are transferred to register Y in step 5, to prepare for the addition operation. 27
  • 28.  When the Read operation is completed, the memory operand is available in register MDR and the addition operation is performed in step 6.  The contents of MDR are gated to the bus, and thus also to the B input of the ALU, and register Y is selected as the second input to the ALU by choosing SelectY.  The sum is stored in register Z and then transferred to R1 in step 7.  The End signal causes a new instruction fetch cycle to begin by returning to step 1.
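The seven-step control sequence for Add (R3), R1 can be replayed in a short Python sketch. Everything named here (`execute`, `regs`, `memory`, the addresses 10 and 200, and the operand values) is an assumption made for illustration. The memory is treated as ready whenever WMFC appears, and the instruction word is stored as a plain string because decoding is not modeled.

```python
def execute(regs, memory, program):
    """Run a list of control steps on a simplified single-bus datapath."""
    for signals in program:
        outs = [s for s in signals if s.endswith("out")]
        assert len(outs) <= 1, "only one register may drive the bus per cycle"
        bus = regs[outs[0][:-3]] if outs else None
        alu = None
        if "Add" in signals:
            # ALU input A is the constant 1 (Select1) or register Y (SelectY).
            a = 1 if "Select1" in signals else regs["Y"]
            alu = a + bus
        for s in signals:
            if s.endswith("in"):
                name = s[:-2]
                regs[name] = alu if name == "Z" else bus
        if "WMFC" in signals:
            # The Read issued in the previous step completes: MDR gets the word.
            regs["MDR"] = memory[regs["MAR"]]

add_r3_r1 = [
    ["PCout", "MARin", "Read", "Select1", "Add", "Zin"],  # 1
    ["Zout", "PCin", "Yin", "WMFC"],                      # 2
    ["MDRout", "IRin"],                                   # 3
    ["R3out", "MARin", "Read"],                           # 4
    ["R1out", "Yin", "WMFC"],                             # 5
    ["MDRout", "SelectY", "Add", "Zin"],                  # 6
    ["Zout", "R1in", "End"],                              # 7
]

regs = {"PC": 10, "R1": 5, "R3": 200, "MAR": 0, "MDR": 0, "IR": None, "Y": 0, "Z": 0}
memory = {10: "Add (R3), R1", 200: 37}   # assumed sample contents
execute(regs, memory, add_r3_r1)
```

After the run, the PC has advanced by one word, the IR holds the fetched instruction, and R1 holds its old value plus the memory operand.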
  • 29. Branch Instructions  A branch instruction replaces the contents of the PC with the branch target address.  This address is usually obtained by adding an offset X which is given in the branch instruction to the updated value of the PC.  Control sequence that implements an unconditional branch instruction is as follows: 29
  • 30.  Step Action  1. PCout, MARin, Read, Select1, Add, Zin  2. Zout, PCin, Yin, WMFC  3. MDRout, IRin  4. Offset-field-of-IRout, Add, Zin  5. Zout, PCin, End
  • 31.  Processing starts, as usual, with the fetch phase.  This phase ends when the instruction is loaded into the IR in step 3.  The offset value is extracted from the IR by the instruction decoding circuit, which also performs sign extension if required.  Since the value of the updated PC is already available in register Y, the offset X is gated onto the bus in step 4 and an addition operation is performed.  The result, which is the branch target address, is loaded into the PC in step 5.
  • 32.  The offset X used in a branch instruction is usually the difference between the branch target address and the address immediately following the branch instruction.  For example, if the branch instruction is at location 2000 and the branch target address is 2050, the value of X must be 49, because the PC has already been incremented by 1 to 2001 and 2050 - 2001 = 49.  The PC is incremented during the fetch phase, before the type of instruction being executed is known.  Thus, when the branch address is computed in step 4, the PC value used is the updated value, which points to the instruction following the branch instruction in the memory.
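The offset arithmetic can be checked with a few lines of Python. The function name `branch_offset` and its `increment` parameter are hypothetical. The default increment of 1 matches the word-addressable assumption used in the fetch sequence of these slides.

```python
def branch_offset(branch_addr, target_addr, increment=1):
    """Offset X measured from the updated PC (the address after the branch)."""
    updated_pc = branch_addr + increment
    return target_addr - updated_pc

x = branch_offset(2000, 2050)
# Steps 4 and 5 of the branch sequence add X to the updated PC.
target = (2000 + 1) + x
```

If the PC were instead incremented by 4, as with the Incrementer in the three-bus organization, the same call with `increment=4` would give a different offset for the same pair of addresses.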
  • 33.  Consider now a conditional branch.  In this case, we need to check the status of the condition codes before loading a new value into the PC.  For example, for a Branch-on-negative (Branch<0) instruction, step 4 in Figure 5.5 is replaced with: Offset-field-of-IRout, Add, Zin, If N=0 then End  Thus, if N=0, the processor returns to step 1 immediately after step 4.  If N=1, step 5 is performed to load a new value into the PC, thus performing the branch operation.
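The effect of the modified step 4 on the PC can be sketched as follows. The function `next_pc` and the sample addresses are illustrative assumptions; only the N condition code is modeled.

```python
def next_pc(updated_pc, offset, n_flag):
    """Return the PC value after a Branch<0 instruction completes."""
    # Step 4: the target address is computed into Z whatever the condition is.
    z = updated_pc + offset
    # If N = 0, the sequence ends after step 4 and the PC keeps its updated value.
    if n_flag == 0:
        return updated_pc
    # If N = 1, step 5 (Zout, PCin) loads the branch target into the PC.
    return z

taken = next_pc(2001, 49, n_flag=1)
not_taken = next_pc(2001, 49, n_flag=0)
```

In the not-taken case the next fetch simply continues with the instruction that follows the branch.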
  • 34. MULTIPLE BUS ORGANIZATION  A three-bus structure is used to connect the registers and the ALU of a processor.  All general-purpose registers are combined into a single block called the register file.  In VLSI technology, the most efficient way to implement a number of registers is in the form of an array of memory cells similar to those used in the implementation of random-access memories (RAMs).
  • 35.  The register file has three ports.  There are two outputs, allowing the contents of two different registers to be accessed simultaneously and placed on buses A and B.  The third port allows the data on bus C to be loaded into a third register during the same clock cycle.
  • 36. [Figure: three-bus organization of the datapath]
  • 37.  Buses A and B are used to transfer the source operands to the A and B inputs of the ALU where an arithmetic or logic operation may be performed.  The result is transferred to the destination over bus C.  If needed, the ALU may simply pass one of its two input operands unmodified to bus C.  We will call the ALU control signals for such an operation R=A or R=B. 37
  • 38.  A second feature is the introduction of the Incrementer unit, which is used to increment the PC by 4.  Using the Incrementer eliminates the need to add 4 to the PC using the main ALU.  The source for the constant 4 at the ALU input multiplexer is still useful.  It can be used to increment other addresses, such as the memory addresses in Load Multiple and Store Multiple instructions.
  • 39.  The three-bus structure requires significantly fewer control steps to execute instructions than the single-bus structure.  Consider the three-operand instruction of the form OP Rsrc1, Rsrc2, Rdst, in which an operation is performed on the contents of two source registers and the result is placed into a destination register.  Buses A and B are used to transfer the source operands, and bus C provides the path to the destination.  The path from the source buses to the destination bus goes through the ALU, where the required operation is performed.
  • 40.  Thus, assuming that the operation to be performed can be completed in one pass through the ALU, the three-bus structure allows the execution phase of an instruction to be performed in one cycle.  Note that if it is merely necessary to copy the contents of one register into another, then the transfer is also done through the ALU, but no arithmetic or logic operation is performed.
  • 41.  The temporary storage registers Y and Z are not required.  Register Y is not needed because both inputs to the ALU are provided simultaneously via buses A and B.  Register Z is not needed because the output from the ALU is transferred to the destination register via the third bus, C.  In this structure it is essential to ensure that the same register can serve as both the source and the destination in a given instruction.
  • 42.  Example: Add R4, R5, R6  The control sequence for executing this instruction is as follows.  In step 1, the contents of the PC are passed through the ALU using the R=B control signal and loaded into the MAR to start a memory read operation.  At the same time, the PC is incremented by 4.  Note that the value loaded into MAR is the original contents of the PC.
  • 43.  The incremented value is loaded into the PC at the end of the clock cycle and will not affect the contents of MAR.  In step 2, the processor waits for MFC and loads the data received into MDR, then transfers them to IR in step 3.  Finally, the execution phase of the instruction requires only one control step, step 4, to complete.
  • 44. Control Sequence for the instruction Add R4, R5, R6  Step Action 1. PCout, R=B, MARin, Read, IncPC 2. WMFC 3. MDRout, R=B, IRin 4. R4outA, R5outB, SelectA, Add, R6in, End
  • 45.  By providing more paths for data transfer, a significant reduction in the number of clock cycles needed to execute an instruction is achieved.  The three-bus structure allows execution of register-to-register operation in a single clock cycle.  This is particularly well suited to the requirements of RISC processors, in which most arithmetic and logic instructions have register operands. 45
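The single-cycle execution phase on the three-bus structure can be sketched in Python. The function `three_bus_cycle`, the dictionary `reg_file`, and the operand values are hypothetical. The sketch models only the register file ports and the ALU, not the fetch steps.

```python
import operator

def three_bus_cycle(reg_file, src_a, src_b, dst, alu_op):
    """One clock cycle: two register reads, one ALU pass, one register write."""
    # Buses A and B carry both source operands out of the register file at once.
    bus_a, bus_b = reg_file[src_a], reg_file[src_b]
    # The ALU result travels over bus C and is written in the same cycle,
    # so the temporary registers Y and Z are not needed.
    reg_file[dst] = alu_op(bus_a, bus_b)

def pass_b(a, b):
    """Models the R=B control: the ALU passes input B through unchanged."""
    return b

reg_file = {"R4": 3, "R5": 4, "R6": 0}             # assumed sample values
three_bus_cycle(reg_file, "R4", "R5", "R6", operator.add)   # Add R4, R5, R6
three_bus_cycle(reg_file, "R4", "R6", "R4", pass_b)         # copy R6 into R4
```

Because both sources are read before the destination is written, the second call can use R4 as a source and as the destination in the same cycle. The register addition that took three control steps on the single bus takes one call here.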

Editor's Notes

  • #7: Since instructions and data must be fetched from memory in order to perform a task, the time it takes to access and fetch this information is one factor influencing how fast a given task completes. One way to speed up a task is to reduce the time it takes to fetch the data and the instructions. This time is called the “access time”. Suppose we want to fetch the data at the memory location with address 10. With sequential access, we have to access locations 1-9 and then access location 10. Clearly, with sequential access the access time increases as memory locations with higher addresses are accessed. We need a kind of memory that provides a fixed and short access time irrespective of the memory location being accessed. That is, it provides random access. Why is the access time faster for the cache than it is for primary storage? I haven’t yet discussed how the various units communicate with each other. In a few minutes I will discuss that, and it will become clear.