3. Data Dependency
A difficulty that may cause a degradation of performance in an
instruction pipeline is a possible collision of data or
addresses.
Solutions:
– Operand forwarding
– Hardware Interlocks
– Delayed Load
4. Operand forwarding
Instead of only transferring an ALU result into a destination
register, the hardware checks whether the result is needed as a
source in the next instruction. If it is, the hardware passes the
result directly to the ALU input, bypassing the register file.
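The forwarding check described above can be sketched in a few lines of Python. This is a simplified model, not a description of any particular processor: the names `Instr`, `alu`, and `run_with_forwarding` are hypothetical, and the sketch assumes a result reaches the register file only after the next instruction has already read its operands.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str    # "ADD" or "SUB" (illustrative subset)
    dest: str  # destination register name
    src1: str  # first source register name
    src2: str  # second source register name


def alu(op, a, b):
    if op == "ADD":
        return a + b
    if op == "SUB":
        return a - b
    raise ValueError(f"unsupported op: {op}")


def run_with_forwarding(program, regs):
    """Run ALU instructions; forward the previous result when it is a source."""
    regs = dict(regs)
    pending = None   # (dest, value) of the previous instruction, not yet written back
    forwarded = []   # (instruction index, register) pairs served by the bypass path
    for i, ins in enumerate(program):
        operands = []
        for src in (ins.src1, ins.src2):
            if pending is not None and pending[0] == src:
                operands.append(pending[1])   # bypass the register file
                forwarded.append((i, src))
            else:
                operands.append(regs[src])    # normal register read
        if pending is not None:
            regs[pending[0]] = pending[1]     # late write-back of the previous result
        pending = (ins.dest, alu(ins.op, *operands))
    if pending is not None:
        regs[pending[0]] = pending[1]
    return regs, forwarded
```

Without the bypass branch, the second of two dependent instructions would read the stale register value, which is exactly the conflict that forwarding avoids.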
5. Hardware Interlocks
• An interlock is a circuit that detects instructions whose
source operands are not yet available and delays them by enough
clock cycles to resolve the conflict.
6. Delayed Load
• Delay the loading of conflicting data by inserting no-operation
(NOP) instructions.
7. Data Dependency
• Use delayed load to resolve the conflict:
Example:
Load: R1 ← M[Addr1]
Load: R2 ← M[Addr2]
Add: R3 ← R1 + R2
Store: M[Addr3] ← R3
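As a rough illustration of delayed load, the sketch below inserts NOPs into the four-instruction example. It assumes a single load-delay slot, meaning a loaded register cannot be used by the instruction immediately after the load; the tuple encoding and the function name `insert_load_delay_nops` are hypothetical.

```python
NOP = ("NOP", None, ())


def insert_load_delay_nops(program, delay_slots=1):
    """Return a copy of `program` with NOPs inserted after conflicting loads.

    Each instruction is (op, dest_register, source_registers).
    Assumption: a loaded value is unusable for `delay_slots` instructions.
    """
    out = []
    for op, dest, srcs in program:
        needed = 0
        # Look back over the most recently emitted instructions.
        recent = reversed(out[-delay_slots:])
        for distance, (p_op, p_dest, _) in enumerate(recent, start=1):
            if p_op == "LOAD" and p_dest in srcs:
                needed = max(needed, delay_slots - distance + 1)
        out.extend([NOP] * needed)
        out.append((op, dest, srcs))
    return out


example = [
    ("LOAD", "R1", ()),             # R1 <- M[Addr1]
    ("LOAD", "R2", ()),             # R2 <- M[Addr2]
    ("ADD", "R3", ("R1", "R2")),    # R3 <- R1 + R2
    ("STORE", None, ("R3",)),       # M[Addr3] <- R3
]
scheduled = insert_load_delay_nops(example)
```

Under the one-slot assumption, the only conflict is between the second load and the add, so a single NOP is placed between them.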
10. Handling Branching Instructions
• A branch instruction can be conditional or unconditional.
• The branch instruction breaks the normal sequence of the
instruction stream, causing difficulties in the operation of the
instruction pipeline.
12. Prefetch target instruction
• Prefetch the target instruction in addition to the instruction
following the branch.
• If the branch condition is successful, the pipeline continues
from the branch target instruction.
13. Loop Buffer
• Very fast memory
• Maintained by fetch stage of pipeline
• Check buffer before fetching from memory
• When a program loop is detected it is stored in the loop buffer in
its entirety including all branches.
• The loop buffer is similar (in principle) to a cache dedicated to
instructions. The differences are that the loop buffer only retains
instructions in sequence, and is much smaller in size (and lower
in cost).
14. Branch target buffer (BTB)
• BTB is an associative memory.
• Each entry in the BTB consists of the address of a previously
executed branch instruction and the target instruction for the
branch.
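A BTB can be modeled as a small lookup table keyed by branch address. The sketch below is illustrative only: the class name, the fixed capacity, the oldest-entry eviction policy, and the choice to store the target address alongside the target instruction are all assumptions, not details from the slides.

```python
from collections import OrderedDict


class BranchTargetBuffer:
    """Toy associative table: branch address -> (target address, target instruction)."""

    def __init__(self, capacity=4):
        self.capacity = capacity        # assumed size; real BTBs vary
        self.entries = OrderedDict()

    def lookup(self, branch_addr):
        """Return the cached target for a branch, or None on a miss."""
        return self.entries.get(branch_addr)

    def record(self, branch_addr, target_addr, target_instr):
        """Store the target of an executed branch, evicting the oldest entry if full."""
        if branch_addr in self.entries:
            self.entries.pop(branch_addr)
        elif len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)
        self.entries[branch_addr] = (target_addr, target_instr)
```

On a hit, the fetch stage can use the stored target instruction immediately instead of waiting for it to be fetched from memory.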
15. Branch Prediction
• A pipeline with branch prediction uses some additional logic
to guess the outcome of a conditional branch instruction
before it is executed.
16. Branch Prediction
• Various techniques can be used to predict whether a branch will be
taken or not:
– Prediction never taken
– Prediction always taken
– Prediction by opcode
– Branch history table
• The first three approaches are static: they do not depend on the
execution history up to the time of the conditional branch instruction.
The last approach is dynamic: it depends on the execution history.
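To make the static/dynamic distinction concrete, here is a small sketch comparing two static predictors with a branch history table. The slides do not say how the history table is organized; the 2-bit saturating counter per branch address used here is one common choice and is an assumption, as are all names in the code.

```python
def predict_never_taken(addr):
    return False    # static: ignores history


def predict_always_taken(addr):
    return True     # static: ignores history


class BranchHistoryTable:
    """Dynamic predictor: one 2-bit saturating counter per branch address (assumed)."""

    def __init__(self):
        self.counters = {}    # branch address -> counter in 0..3

    def predict(self, addr):
        # Counters 2 and 3 mean "predict taken"; unseen branches start at 1.
        return self.counters.get(addr, 1) >= 2

    def update(self, addr, taken):
        c = self.counters.get(addr, 1)
        self.counters[addr] = min(3, c + 1) if taken else max(0, c - 1)


def accuracy(predict, outcomes, update=None):
    """Fraction of (address, taken) outcomes that `predict` guesses correctly."""
    correct = 0
    for addr, taken in outcomes:
        if predict(addr) == taken:
            correct += 1
        if update is not None:
            update(addr, taken)    # only dynamic predictors learn from the outcome
    return correct / len(outcomes)


# A loop-closing branch at a hypothetical address: taken 9 times, then falls through.
loop_branch = [(100, True)] * 9 + [(100, False)]
bht = BranchHistoryTable()
bht_accuracy = accuracy(bht.predict, loop_branch, bht.update)
```

The static predictors give the same answer every time, while the history table adapts after it has seen the branch taken.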
17. Delayed Branch
• In this procedure, the compiler detects the branch instruction
and rearranges the machine language code sequence by
inserting useful instructions that keep the pipeline operating
without interruptions.
18. Example
• Five instructions need to be carried out:
Load from memory to R1
Increment R2
Add R3 to R4
Subtract R5 from R6
Branch to address X
23. Pipeline Hazards
• Hazards: situations that prevent the next
instruction from executing in the designated
clock cycle.
• 3 classes of hazards:
structural hazard – resource conflicts
data hazard – data dependency
control hazard – PC changes (e.g., branches)
24. Structural Hazard
• A structural hazard is the situation in which two instructions require the
use of a given hardware resource at the same time.
• Arises from resource conflicts when the hardware can’t support all possible
combinations of overlapping instructions.
• Root cause: resource conflicts
e.g., a processor with one register write port but two instructions that need
to write to a register in the same cycle
• Solution
stall one of the instructions until the required unit is available
25. Structural Hazard
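• Example: one memory port, so a data access conflicts with an instruction fetch
[Figure: pipeline timing diagram for Load, Instr i+1, Instr i+2, and Instr i+3;
the MEM stage of Load and the IF stage of Instr i+3 need memory in the same cycle.]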
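The memory-port conflict in this example can be found mechanically. The sketch below assumes a five-stage pipeline (IF, ID, EX, MEM, WB), one instruction issued per cycle starting at cycle 1, and a single shared memory port; the function name and tuple layout are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]    # assumed five-stage pipeline


def memory_conflicts(instrs):
    """Return {cycle: [(name, stage), ...]} for cycles with more than one memory user.

    `instrs` is a list of (name, issue_cycle, accesses_data_memory).
    """
    users = {}
    for name, start, uses_data_mem in instrs:
        # Every instruction uses memory in IF, its first cycle.
        users.setdefault(start, []).append((name, "IF"))
        if uses_data_mem:
            mem_cycle = start + STAGES.index("MEM")
            users.setdefault(mem_cycle, []).append((name, "MEM"))
    return {cycle: u for cycle, u in sorted(users.items()) if len(u) > 1}


sequence = [
    ("Load", 1, True),
    ("Instr i+1", 2, False),
    ("Instr i+2", 3, False),
    ("Instr i+3", 4, False),
]
conflicts = memory_conflicts(sequence)
```

With these assumptions the only contested cycle is cycle 4, where Load is in MEM and Instr i+3 is in IF, which is the case the stall is meant to resolve.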
28. Data Hazards
• A data hazard occurs if either the source or the destination
operands of an instruction are not available at the time
expected in the pipeline. As a result, some operations have to be
delayed and the pipeline stalls.
• A data hazard is a situation in which the pipeline is stalled
because the data to be operated on are delayed for some
reason.
29. Data Hazards Classification
• Depending on the order of read and write accesses in the instructions, data
hazards can be classified into three types.
• Consider two instructions i and j, with i occurring before j. Possible data
hazards:
– RAW (Read After Write)
• j tries to read a source before i writes to it, so j incorrectly gets the old
value
• The most common type of hazard, and the one explained so far
– WAW (Write After Write)
• j tries to write an operand before it is written by i. The writes end up
being performed in the wrong order, so the destination is left holding the
value written by i rather than the one written by j
• Present in pipelines that write in more than one pipe stage
– WAR (Write After Read)
• j tries to write a destination before it is read by i, so instruction i
incorrectly gets the new value
• This doesn’t happen in our example, since all reads are early and all
writes are late
30. Data Hazard Contd.
• Consider two instructions I and J, where J is assumed to
logically follow I. We use the notation D(I) and
R(I) for the domain and range of instruction I.
• The domain contains the input set and the range corresponds
to the output set.
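With the domain and range notation, the three hazard types can be checked with set intersections. The conditions used below (R(I) ∩ D(J) for RAW, D(I) ∩ R(J) for WAR, R(I) ∩ R(J) for WAW) are the usual formulation and are not spelled out on the slides, so treat them as an assumption; the function name is hypothetical. The check reports potential hazards only, since whether a stall actually occurs depends on pipeline timing.

```python
def classify_hazards(domain_i, range_i, domain_j, range_j):
    """Return the set of potential data hazards between I and a later instruction J.

    domain_* are the registers an instruction reads (D), range_* the ones it writes (R).
    """
    d_i, r_i = set(domain_i), set(range_i)
    d_j, r_j = set(domain_j), set(range_j)
    hazards = set()
    if r_i & d_j:
        hazards.add("RAW")    # J reads something I writes
    if d_i & r_j:
        hazards.add("WAR")    # J writes something I reads
    if r_i & r_j:
        hazards.add("WAW")    # I and J write the same location
    return hazards


# I: R3 <- R1 + R2, J: R5 <- R3 - R4 (J reads the register I writes)
raw_case = classify_hazards({"R1", "R2"}, {"R3"}, {"R3", "R4"}, {"R5"})
```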
34. Control Hazards (Effect of Branching)
• A control hazard occurs whenever the pipeline makes an incorrect branch
prediction, resulting in instructions entering the pipeline that
need to be discarded.
• Control hazards result from the pipelining of branches and other
instructions that change the Program Counter.
• The action of fetching a non-sequential or remote instruction after a branch
instruction is called a branch taken.
• The instruction to be executed after a branch taken is called the branch
target.
• When a branch taken occurs, all the instructions following the branch in the
pipeline become useless and are drained from the pipeline.
35. Control Hazard Example
12: BEQ R1, R3, 36
16: AND R2, R3, R5
20: OR R6, R1, R7
24: ADD R8, R1, R9
36: XOR R10, R1, R11
Here 12, 16, 20, 24, and 36 are the addresses of the instructions.
The BEQ (Branch if Equal) instruction executes and, if R1 and R3 are equal, jumps to the instruction at
memory address 36.
Once the branch is taken, the next three in-order instructions that have already entered the pipeline must be
discarded. Hence, clock cycles are wasted and the pipeline stalls.
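The cost of the taken branch in this example can be sketched as follows. The code assumes 4-byte instructions and that the branch outcome becomes known only after three sequential instructions have been fetched, which matches the three discarded instructions stated above; `flushed_on_taken_branch` is a hypothetical helper.

```python
PROGRAM = {
    12: "BEQ R1, R3, 36",
    16: "AND R2, R3, R5",
    20: "OR R6, R1, R7",
    24: "ADD R8, R1, R9",
    36: "XOR R10, R1, R11",
}


def flushed_on_taken_branch(program, branch_addr, resolve_delay=3, instr_size=4):
    """Addresses fetched sequentially after the branch before its outcome is known.

    Assumption: `resolve_delay` instructions enter the pipeline behind the branch.
    """
    flushed = []
    for k in range(1, resolve_delay + 1):
        addr = branch_addr + k * instr_size
        if addr in program:
            flushed.append(addr)
    return flushed


discarded = flushed_on_taken_branch(PROGRAM, branch_addr=12)
wasted_cycles = len(discarded)    # one fetch slot lost per discarded instruction
```

After the flush, fetching resumes at the branch target, address 36 in this example.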
Editor's Notes
#23:Specifically, pipeline hazards are situations that prevent the next instruction from executing in the designated clock cycle.
There are three classes of hazards: structural hazards due to resource conflicts, data hazards due to data dependencies, and control hazards due to PC changes.
#25:Here’s an example of a structural hazard due to a memory conflict.
Assume the processor has only one memory port.
A structural hazard will arise in clock cycle 4, when the load instruction reads data from memory and instruction i+3 is fetched from memory.
#26:The solution to this structural hazard is to stall instruction i+3 for one clock cycle.