Ahmed Abdelazeem
STA Basic Concepts
{ Concepts } + { Technique }
04 Timing Verification
✓ Design Rule vs. Optimization Goal
✓ Prelayout & Postlayout STA
✓ Min and Max Timing Path
✓ Setup/Hold Time & Metastability
✓ Setup/Hold Timing Checks
✓ Clock Reconvergence Pessimism Removal (CRPR)
✓ Removal/Recovery Timing Checks
✓ Multi-Cycle Path
✓ Appendix “Time borrowing”
Design Rule vs. Optimization Goal
Design Rules: for a reliable design
Optimization Constraints: for the PPA target
Design Rule Check
Max Fanout
max_fanout does not limit the number of gates a pin can drive. It limits the sum of the
fanout_load values of the pins it drives.
Max Transition
1) Keeps delay calculation within the library characterization range so that it stays
accurate.
2) Limits input transition time, which reduces short-circuit power.
Max Capacitance
Similar to the maximum transition constraint, but the cost is based on the total
capacitance (pins plus interconnect) that a standard cell output drives in the
ASIC design.
Min Pulse Width
The pulse width must meet a minimum threshold for sequential elements to
function properly.
Topic 19: Minimum Pulse Width
Min Pulse Width
The clock pulse width must meet a minimum threshold, defined either in the .lib or with
set_min_pulse_width.
For a low pulse, the tool uses fall_constraint; for a high pulse, it uses rise_constraint.
What contributes to the min pulse width calculation?
The width of a clock pulse (or any signal pulse) includes the effects of the following:
1) Unequal rise and fall delays on gates along the path: takes away credit
2) Dynamic CRP: takes away credit
3) Static CRP: gives credit
4) Clock uncertainty: takes away credit
Pulse absorption
If a clock signal passes through a series of cells of the same type, its pulse width can keep
shrinking at every stage. Once a buffer delay exceeds the remaining pulse width, the clock pulse
is absorbed.
Topic 19: Minimum Pulse Width (cont’d)
How is the min pulse width calculated?
> For min_pulse_width high:
open edge clock latency = max rise clock arrival
close edge clock latency = min fall clock arrival
actual pulse width (high) = nominal high width + close edge latency - open edge latency + conservative static CRP - worst-case clock uncertainty
> For min_pulse_width low:
open edge clock latency = max fall clock arrival
close edge clock latency = min rise clock arrival
actual pulse width (low) = nominal low width + close edge latency - open edge latency + conservative static CRP - worst-case clock uncertainty
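The pulse width bookkeeping can be sketched in a few lines of Python. This is a minimal illustration, not tool output: it assumes the pulse is measured from the latest arrival of the opening edge to the earliest arrival of the closing edge, and the function names and all numbers (in ns) are hypothetical.

```python
def actual_pulse_width(nominal_width, open_edge_latency, close_edge_latency,
                       static_crp=0.0, uncertainty=0.0):
    """Pulse width seen at a clock pin (assumed convention, see lead-in).

    open_edge_latency  : latest (max) clock arrival of the edge that opens the pulse
    close_edge_latency : earliest (min) clock arrival of the edge that closes it
    static_crp         : credit for pessimism on the shared part of the clock path
    uncertainty        : worst-case clock uncertainty
    """
    return (nominal_width + close_edge_latency - open_edge_latency
            + static_crp - uncertainty)


def pulse_width_slack(actual, required):
    """Positive slack means the min pulse width check is met."""
    return actual - required


# Hypothetical high pulse: 2.0 ns nominal, rise arrives late, fall arrives early.
high = actual_pulse_width(nominal_width=2.0, open_edge_latency=1.10,
                          close_edge_latency=1.00, static_crp=0.04,
                          uncertainty=0.05)
print(round(high, 2))                                    # 1.89
print(round(pulse_width_slack(high, required=0.50), 2))  # 1.39
```

For a low pulse the same function applies with the fall edge as the opening edge and the rise edge as the closing edge.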
Optimization Goal (Power, Performance, Area)
Timing Optimization Goal (Delay Optimization)
- Worst Negative Slack (WNS)
- Total Negative Slack (TNS)
Power Goal
- Minimize both dynamic power and leakage power
- VT class usage: a trade-off between speed and leakage
power. Low-VT cells are faster but consume more leakage
power; high-VT cells are slower but leak less.
Area Goal
- Minimize total area while keeping the design routable and
manufacturable
Topic 20: report_constraint
report_constraint -all_violators -nosplit
Shows a summary of the worst violation per endpoint of each violated design rule constraint in the current design.
Prelayout / Postlayout STA
Post-layout STA
Pre-layout STA
A Basic Flip-Flop structure
Setup Time
Setup Timing Check
Data must be stable before the active edge of the clock.
(Figure: flip-flop with D, Q and CLK pins, and CLK/D waveforms marking the unstable-data region and the setup time.)
Launch and Capture Flip-flops
Setup is checked from the first active clock edge at the launch flip-flop to the closest following active edge at the
capture flip-flop.
(Figure: launch flip-flop driving the capture flip-flop through logic, with waveforms for clock 1, clock 2 and data 2.)
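Data and Clock Signals for Setup Timing Check
Setup condition
Tlaunch + TLFF + Tc < Tcapture + Tcycle - Tsetup
Here Tlaunch and Tcapture are the clock latencies to LFF/CK and CFF/CK, TLFF is the clock-to-Q delay of the launch flip-flop, Tc is the delay of the logic between the flops, and Tcycle is the clock period.
(Figure: launch flip-flop (LFF) and capture flip-flop (CFF) connected through logic, with waveforms at CLK, LFF/CK, CFF/D and CFF/CK showing the launch edge, the capture edge and the setup window.)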
Setup Timing (Max Delay) Analysis
Slack is the difference between the data required time and the data arrival time.
(Figure: path FF1 -> U2 -> U3 -> FF2 clocked by Clk (edges at 0ns and 4ns), with waveforms at FF1/clk, FF2/clk and FF2/D marking the data arrival time, the data required time and the setup time.)
Timing Report for Setup (Max Delay Analysis)
pt_shell> report_timing
Startpoint: FF1 (rising edge-triggered flip-flop clocked by Clk)
Endpoint: FF2 (rising edge-triggered flip-flop clocked by Clk)
Path Group: Clk
Path Type: max
Point                                   Incr       Path
-----------------------------------------------------------
clock Clk (rise edge)                   0.00       0.00
clock network delay (propagated)        1.10 &     1.10
FF1/CLK (fdef1a15)                      0.00       1.10 r
FF1/Q (fdef1a15)                        0.50 &     1.60 r
U2/Y (buf1a27)                          0.11 &     1.71 r
U3/Y (buf1a27)                          0.11 &     1.82 r
FF2/D (fdef1a15)                        0.05 &     1.87 r
data arrival time                                  1.87
clock Clk (rise edge)                   4.00       4.00
clock network delay (propagated)        1.00 &     5.00
FF2/CLK (fdef1a15)                                 5.00 r
library setup time                     -0.21       4.79
data required time                                 4.79
-----------------------------------------------------------
data required time                                 4.79
data arrival time                                 -1.87
-----------------------------------------------------------
slack (MET)                                        2.92
(Slide annotations mark the header, data arrival, data required and slack sections of the report; the clock edges are at 0ns and 4ns.)
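The arithmetic in the setup report can be reproduced with a short Python sketch. The function name and argument layout are illustrative assumptions; the numbers are taken from the report above.

```python
def setup_slack(launch_clock_delay, data_path_incrs, capture_edge,
                capture_clock_delay, library_setup):
    """Return (arrival, required, slack) for a max-delay (setup) check."""
    # Data arrival: launch clock latency plus every increment along the data path.
    arrival = launch_clock_delay + sum(data_path_incrs)
    # Data required: capture edge plus capture clock latency minus the setup time.
    required = capture_edge + capture_clock_delay - library_setup
    return arrival, required, required - arrival


arrival, required, slack = setup_slack(
    launch_clock_delay=1.10,
    data_path_incrs=[0.50, 0.11, 0.11, 0.05],  # FF1/Q, U2/Y, U3/Y, FF2/D
    capture_edge=4.00,
    capture_clock_delay=1.00,
    library_setup=0.21,
)
print(f"{arrival:.2f} {required:.2f} {slack:.2f}")  # 1.87 4.79 2.92
```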
Negative Setup Time
Hold Time
Hold Time Check
Verifies that the data is held stable for a specified amount of time after the active edge of the clock.
(Figure: flip-flop with D, Q and CLK pins, and CLK/D waveforms marking the hold time.)
Data and Clock Signals for Hold Timing Check
Hold condition
Tlaunch + TLFF + Tc > Tcapture + Thold
(Figure: launch flip-flop (LFF) and capture flip-flop (CFF) connected through logic, with waveforms at CLK, LFF/CK, CFF/D and CFF/CK showing the launch edge, the capture edge, Tlaunch, TLFF, Tc, Tcapture and Tsetup.)
Hold Timing (Min Delay) Analysis
Slack is the difference between the data arrival time and the data required time.
(Figure: path FF1 -> U2 -> U3 -> FF2 clocked by Clk (edges at 0ns and 4ns), with waveforms at FF1/clk, FF2/clk and FF2/D marking the data arrival time, the data required time and the hold time.)
Timing Report for Hold (Min Delay Analysis)
pt_shell> report_timing -delay_type min
Startpoint: FF1 (rising edge-triggered flip-flop clocked by Clk)
Endpoint: FF2 (rising edge-triggered flip-flop clocked by Clk)
Path Group: Clk
Path Type: min
Point                                   Incr       Path
----------------------------------------------------------
clock Clk (rise edge)                   0.00       0.00
clock network delay (propagated)        1.10 &     1.10
FF1/CLK (fdef1a15)                      0.00       1.10 r
FF1/Q (fdef1a15)                        0.40 &     1.50 f
U2/Y (buf1a27)                          0.05 &     1.55 f
U3/Y (buf1a27)                          0.05 &     1.60 f
FF2/D (fdef1a15)                        0.01 &     1.61 f
data arrival time                                  1.61
clock Clk (rise edge)                   0.00       0.00
clock network delay (propagated)        1.00 &     1.00
FF2/CLK (fdef1a15)                                 1.00 r
library hold time                       0.10       1.10
data required time                                 1.10
----------------------------------------------------------
data required time                                 1.10
data arrival time                                 -1.61
----------------------------------------------------------
slack (MET)                                        0.51
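For the hold check the same bookkeeping applies with the sign of the slack reversed: the data must arrive after the required time. A minimal Python sketch, with an illustrative function name and the numbers from the report above:

```python
def hold_slack(launch_clock_delay, data_path_incrs, capture_clock_delay,
               library_hold):
    """Return (arrival, required, slack) for a min-delay (hold) check.

    Launch and capture use the same clock edge (time 0), so no period term appears.
    """
    arrival = launch_clock_delay + sum(data_path_incrs)
    required = capture_clock_delay + library_hold
    # Hold is met when the data arrives later than the required time.
    return arrival, required, arrival - required


arrival, required, slack = hold_slack(
    launch_clock_delay=1.10,
    data_path_incrs=[0.40, 0.05, 0.05, 0.01],  # FF1/Q, U2/Y, U3/Y, FF2/D
    capture_clock_delay=1.00,
    library_hold=0.10,
)
print(f"{arrival:.2f} {required:.2f} {slack:.2f}")  # 1.61 1.10 0.51
```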
Negative Hold Time
Hold Checks and Setup Check Cycles
A hold timing check ensures that:
1. Data from the launch edge that follows the setup launch edge is not captured by the setup capture edge.
2. Data from the setup launch edge is not captured by the capture edge that precedes the setup capture edge.
(Figure: launch and capture flip-flop clocks showing launch edges 1 and 2, capture edges 0 and 1, setup check 1, and the two hold checks.)
Clock-to-Q delay
Removal Timing Check
Verifies that there is a required minimum amount of time between an active clock edge and the release of an
asynchronous control signal.
(Figure: flip-flop with an asynchronous set pin, and CLK/set waveforms marking the active clock edge, the removal time, and the earliest point at which set can be removed.)
Recovery Timing Check
Verifies that there is a minimum amount of time between the asynchronous signal becoming
inactive and the next active clock edge.
(Figure: flip-flop with an asynchronous set pin, and CLK/set waveforms marking the recovery time and the latest point at which set can be removed.)
Metastability
Metastability Window
In a synchronous system, the data always has a fixed relationship with respect to the clock. The metastability window is a
specific length of time around the active clock edge during which the data must not change. If the data does change during this window, the
output is unknown, or “metastable”.
Metastability Window = Setup Time + Hold Time
The setup and hold time requirements together determine the width of
the metastability window.
CLK -> D Timing Arc
Clock -> D
The launch path arrives at the D pin of the capture flop; the capture path arrives at the CLK pin of the capture flop.
The setup/hold requirement is therefore an arrival-time requirement between the CLK pin and the D pin.
CLK -> D Timing Arc in library
Index 1 is clock transition
Index 2 is data transition
Setup Timing Check
Interpret Setup Timing Report
Fixing Setup Violation
Cell Adjustment
> VT swap: replace high-VT cells with low-VT cells
> Cell sizing: replace low-drive-strength cells with high-drive-strength cells
> Channel length swap: replace long-channel devices with short-channel
devices
Buffer Chain Adjustment
> Add buffers on an existing route to break long interconnections
> Load isolation: insert a dedicated buffer for the load on the critical path
> Load splitting: share the load of a heavily loaded buffer by inserting a parallel buffer
> Rebuild buffer chain: remove excessive buffers added by the tool
Wire Routing Improvement
> Layer promotion: route the long critical net on a higher metal layer with lower
resistance
> Avoid scenic routes: reroute detours to shorten the route length
> Cell strapping: reduce resistance on wire segments close to the driver
Logic Manipulation
> Bubble pushing: absorb an external inverter into a library cell
> Equivalent input reordering: put the critical signal on the faster input
> Logic replication: clone a high-fanout logic gate to reduce wire
capacitance
Clock Tree Manipulation
> Clock pull-in
> Clock push-out
Architecture Change
> Pipelining the critical path: insert staging flops to segment the
logic path
> Redesign RTL functionality
Hold Timing Check
Topic 21: How is the hold timing edge determined?
The hold relations are derived from the setup relation:
1. Data from the source clock edge that follows the setup launch edge must not be latched by the setup latch
edge.
2. Data from the setup launch edge must not be latched by the destination clock edge that precedes the setup
latch edge.
Interpret Hold Timing Report
Fixing Hold Violation
Find the optimal location for a hold buffer
Important: avoid paths with setup/hold conflicts!
1. Find the timing paths with the worst hold slack across all PVT corners.
2. Choose the pins with the largest number of violating paths passing through them (bottlenecks) as fixing candidates, to minimize the number of buffers
inserted.
3. Exclude pins with poor setup margin or negative setup slack; choose pins with good setup slack to avoid a setup/hold conflict.
4. Other conditions being equal, prefer fixing at load pins rather than driver pins, because the result is more predictable.
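The selection steps above can be sketched as a simple filter-and-rank in Python. This is only an illustration of the heuristic, not how any particular tool implements it; the `Pin` record, its fields, the `min_setup_slack` threshold and the pin data are all hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Pin(NamedTuple):
    name: str
    violating_paths: int   # number of hold-violating paths through this pin
    setup_slack: float     # worst setup slack through this pin (ns)
    is_load_pin: bool      # True for a load (input) pin, False for a driver pin


def pick_hold_fix_pin(pins: Sequence[Pin],
                      min_setup_slack: float = 0.0) -> Optional[Pin]:
    """Pick a hold-buffer location: skip setup-critical pins, then prefer
    bottlenecks, then load pins over driver pins."""
    safe = [p for p in pins if p.setup_slack > min_setup_slack]
    if not safe:
        return None  # every candidate would create a setup/hold conflict
    return max(safe, key=lambda p: (p.violating_paths, p.is_load_pin))


candidates = [
    Pin("U7/A", 12, -0.02, True),   # biggest bottleneck, but setup already fails
    Pin("U9/Z", 9, 0.30, False),    # driver pin
    Pin("U12/A", 9, 0.25, True),    # load pin, same path count
]
print(pick_hold_fix_pin(candidates).name)  # U12/A
```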
Location of Hold Fixing
Find Optimal Hold Buffer Location
Hint: Avoid setup/hold conflicting path!
Delay Calculation for Timing Path
Max Timing Check: slow launch, fast capture
Min Timing Check: fast launch, slow capture
Clock Reconvergence Pessimism
CRP = Latest RISE arrival time at common point - Earliest RISE arrival time at common point
or
CRP = Latest FALL arrival time at common point - Earliest FALL arrival time at common point
(Figure: clock CLK distributed through buffers U1, U2, U3, U4, U5 and U7 to the CP pins of two flip-flops with a setup/hold check between them; the common point of the two clock paths is marked.)
PrimeTime removes clock reconvergence pessimism when
timing_remove_clock_reconvergence_pessimism = "true"
Physically, a clock edge reaches the common point at a single time for both the launch and the capture path, so the true difference there is zero (i.e. CRP == 0).
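As a small worked example of the CRP formula, the Python sketch below computes the pessimism at the common point and gives it back to the slack as a credit. The function names and all numbers (in ns) are hypothetical.

```python
def crp(latest_arrival, earliest_arrival):
    """Clock reconvergence pessimism at the common point, for one edge type
    (both arrivals must be RISE, or both FALL)."""
    return latest_arrival - earliest_arrival


def slack_with_crpr(slack_without_crpr, crp_credit):
    """CRPR returns the pessimism to the path as a slack credit."""
    return slack_without_crpr + crp_credit


# Hypothetical common point: late rise arrival 0.66 ns, early rise arrival 0.54 ns.
credit = crp(latest_arrival=0.66, earliest_arrival=0.54)
print(round(credit, 2))                          # 0.12
print(round(slack_with_crpr(-0.05, credit), 2))  # 0.07
```

In this made-up case a path that appears to fail by 0.05 ns actually passes once the pessimism on the shared clock segment is removed.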
Clock Reconvergence Pessimism Removal
Clock Arrival Variation
Clock reconvergence pessimism removal (CRPR) is the process of removing the static variation between the early and late arrivals of a clock
edge.
Static variation: the variation is constant during the timing check → only this type can be removed by CRPR
Dynamic variation: the variation can change during the timing check
Reasons for early/late variations
1) reconvergent logic cones
2) differences in min/max slews (different slews are propagated for min and max timing)
3) early/late derate modeling for OCV, as specified with set_timing_derate
4) variations in voltage
5) variations in temperature
* In PrimeTime, CRPR is enabled by setting the
timing_remove_clock_reconvergence_pessimism variable to true
CRPR through reconvergent logic cones
CRP to be removed for Max Path
Clock path arrival time difference
CRP to be removed for Min Path
Clock path arrival time difference
CRPR through OCV
CRP to be removed for Max Path
On-chip Variation Difference of Common Path
CRP to be removed for Min Path
On-chip Variation difference of common path
Supply voltage variation
Crosstalk effect induced delay difference
Topic 22: CRPR and Clock Gating
Placing the clock gating cell near the clock source
From a timing perspective, this shortens the common clock path shared by the sinks → less CRPR credit (hostile
to timing).
From a power perspective, a few ICG cells can shut off a large portion of the clock tree.
Topic 22: CRPR and Clock Gating (cont’d)
Placing the clock gating cell near the sinks
From a timing perspective, the sinks share more cells on the common path → good for
timing.
From a power perspective, a larger portion of the clock tree keeps toggling when the downstream ICG is
shut off.
Recovery/Removal Timing Check
Multi-cycle Path
Default Analysis Scenario
By default, the STA tool assumes that data launched from the startpoint is
captured by the endpoint in the next clock cycle.
Multi-cycle Scenario
If a multicycle constraint is specified, the STA tool moves the capture edge
accordingly. The number of clock cycles to move must come from the design
intent.
set_multicycle_path -setup N
set_multicycle_path -hold (N-1)
Topic 23: multi-cycle hold path
Default Behavior
By default, the STA tool assumes that the data path could change during any clock cycle before clock edge number
N.
It implicitly applies set_multicycle_path -hold 0, which places the hold check one cycle before the setup capture edge.
So by default the tool assumes you want the path buffered up so that its minimum delay is > N-1
cycles.
Topic 23: multi-cycle hold path (cont’d)
Determine the hold check edge
The hold relations are derived from the setup relation:
1. Data from the source clock edge that follows the setup launch edge must not be latched by the setup latch
edge.
2. Data from the setup launch edge must not be latched by the destination clock edge that precedes the setup
latch edge.
Topic 23: multi-cycle hold path (cont’d)
set_multicycle_path -setup 3
set_multicycle_path -hold 2
Multi-cycle Path
set_multicycle_path -setup 3
set_multicycle_path -hold 2
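To see where the setup and hold checks land for a given pair of multipliers, here is a minimal Python sketch. It assumes launch and capture use the same clock and that the launch edge is at time 0; the function name is illustrative, and the 4 ns period is simply the clock period used in the timing reports of this chapter.

```python
def multicycle_edges(period, setup_mult=1, hold_mult=0):
    """Return (setup_capture_time, hold_capture_time) for a launch at time 0.

    setup_mult : value given to set_multicycle_path -setup (default 1)
    hold_mult  : value given to set_multicycle_path -hold  (default 0)
    """
    setup_capture = setup_mult * period
    # Default hold check sits one cycle before the setup capture edge;
    # each hold multiplier step moves it one more cycle earlier.
    hold_capture = (setup_mult - 1 - hold_mult) * period
    return setup_capture, hold_capture


print(multicycle_edges(4.0))        # (4.0, 0.0)   default single-cycle path
print(multicycle_edges(4.0, 3))     # (12.0, 8.0)  -setup 3 alone: hold needs > 2 cycles of delay
print(multicycle_edges(4.0, 3, 2))  # (12.0, 0.0)  -setup 3 -hold 2: hold back at the launch edge
```

The middle case shows why `-hold (N-1)` usually accompanies `-setup N`: without it the path would have to be padded with more than N-1 cycles of minimum delay.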
Topic 24: Timing Exceptions
What is a timing exception?
By default, PrimeTime assumes that data launched at a path startpoint is captured at the path endpoint by the
very next occurrence of a clock edge at the endpoint. A timing exception overrides this default for the paths it names.
Types of timing exception
There are three types of timing exception: 1) false paths, 2) min/max delays and 3) multi-cycle paths.
Specify timing exceptions efficiently
When a timing exception is specified on a particular set of pins, the tool has to keep track of exceptions on registers, pins, and nets.
Topic 24: Timing Exceptions (cont’d)
Rule #1 Exception priority
When exceptions conflict on a particular path, the timing exception types have the following order of priority, from highest
to lowest:
1) set_false_path
2) set_max_delay and set_min_delay
3) set_multicycle_path
Rule #2 The more restrictive constraint dominates
For constraints of the same type, the more restrictive one wins and overrides the less restrictive one.
Rule #3 The more specific constraint dominates
If two constraints apply to the same path, the constraint with the more specific condition wins.
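The type-priority rule can be illustrated with a tiny Python lookup. This sketch covers only the ordering between exception types listed above; the table and function name are illustrative, and it does not model the "more restrictive" or "more specific" rules.

```python
# Lower number = higher priority, following the order listed above.
EXCEPTION_PRIORITY = {
    "set_false_path": 0,
    "set_max_delay": 1,
    "set_min_delay": 1,
    "set_multicycle_path": 2,
}


def winning_exception(exception_types):
    """Return the exception type that takes effect when several apply to one path."""
    return min(exception_types, key=EXCEPTION_PRIORITY.__getitem__)


print(winning_exception(["set_multicycle_path", "set_false_path"]))  # set_false_path
print(winning_exception(["set_multicycle_path", "set_max_delay"]))   # set_max_delay
```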
Topic 25: Bottleneck Analysis
report_bottleneck
A bottleneck is a common point in the design that contributes to multiple violations.
Appendix
● What is a latch?
● Latches vs. Flip Flops
● What is time borrowing?
i. Limits of Time Borrowing
ii. Loopcheck
iii. Calculating borrowing quantity “X”
iv. Displaying “X” quantity in timing reports
v. Optimization and time borrowing
● Modes of Latch
Introduction
What is a latch?
It is a level-sensitive register with three ports: input (D), enable (G) and output (Q).
When enable is active, the input drives the output. The latch is “open” or “transparent”.
When enable is inactive, the output keeps its existing value.
Setup and hold checks are made with respect to the asserted edge.
(Figure: latch symbol with D, G, Q and inverted Q pins, and an enable waveform showing the transparent phases and the capture edge.)
Latches vs. Flops
• A latch saves area over a flop in many cases
I. D latch uses 4 NAND gates
II. D flop uses 8 NAND gates
• A properly designed latch-based design can nearly eliminate the clock timing checks from
the critical path
I. The critical path can flow through the latches
• Latches provide a degree of freedom not present in flop designs
I. There is no exact start or stop to a given cycle
• This allows unbalanced logic stages to be timed more optimally
• Transparent for a finite time
What is Time Borrowing?
Time borrowing is the determination, at each latch in the design, of an artificial boundary between the logic
before and after the latch.
It does not mean:
• Any movement of logic to the other side of a latch
• Any change in the clock network (skew)
(Figure: three sequential cells i0, i1 and i2 in series, clocked from clk and clk_bar, with a delay block between each pair.)
Understanding Time Borrowing
Time borrowing (cycle stealing) borrows time from the next stage.
No time borrowing: the input must be stable before the latch opens.
Max time borrowing: the input only has to be stable before the latch closes.
Max time borrowing = pulse width - setup
(Figure: clock waveforms for the no-borrowing and max-borrowing cases.)
Understanding Time Borrowing (Cont.)
Every timing analysis tool makes its own assumption about the time borrowing
• They won’t necessarily match (Very important!)
Some tools borrow slack from a later stage - slacks are not equalized
Some tools might equalize slack independently on both sides of the latch
Time borrowing is performed automatically during timing analysis
• Performed even when current stage has positive slack
What are the Limits of Time Borrowing?
The latest time data can arrive and still “make it into the latch”:
• Latch closing edge minus the setup time
• Data must be stable at the latch input before the latch closes
The earliest time the next stage can use data from a latch:
• Latch opening time (clock arrival) + clock-to-Q arc
(Figure: latch with its setup, G -> Q and D -> Q arcs, and waveforms for G, D and Q.)
Example: Time Borrowing
Borrowing time from the next stage in a level-sensitive design to meet timing constraints; max borrow =
pulse width - setup
For the path from U1 to U2, the signal arrives at 8ns while the clock opens U2 at 5ns. Since U2 is still open, this
path borrows 3ns from the next path, i.e. U2 to U3, and hence meets setup.
(Figure: U1 -> P1 (8ns) -> U2 -> P2 (1ns) -> U3, with U3 a flip-flop clocked at 10ns; clk and clk_bar waveforms with times 0, 5, 8, 10 and 15; 3ns borrowed; the setup check assumes a setup of 0ns.)
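The numbers in this example can be checked with a short Python sketch. It assumes that U2 is transparent from 5ns to 10ns, ignores the latch D -> Q delay, and uses an illustrative function name; only the 8ns, 5ns, 1ns, 10ns and 0ns setup values come from the slide.

```python
def time_borrowed(data_arrival, latch_open, latch_close, setup=0.0):
    """Return (borrowed, max_borrow) for data arriving at a latch D pin."""
    max_borrow = (latch_close - latch_open) - setup   # pulse width - setup
    borrowed = max(0.0, data_arrival - latch_open)    # nothing borrowed if data is early
    if borrowed > max_borrow:
        raise ValueError("setup violation: data arrives after the latch closes")
    return borrowed, max_borrow


borrowed, max_borrow = time_borrowed(data_arrival=8.0, latch_open=5.0,
                                     latch_close=10.0, setup=0.0)
print(borrowed, max_borrow)  # 3.0 5.0

# The next path (U2 to U3, 1ns) now starts 3ns late but still beats the 10ns edge.
arrival_at_u3 = 5.0 + borrowed + 1.0
print(arrival_at_u3)         # 9.0
```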
Example: Time Borrowing (Cont.)
Time borrowing happens even if the current stage has positive slack.
The required time at input D that corresponds to the maximum time borrowing is denoted RTsetup(D) and is
calculated by:
RTsetup(D) = CLK(closing) - latch setup time (CLK -> D)
When the amount of time borrowed is less than its maximum, the required time is denoted RT(D)
• RT(D) is earlier than RTsetup(D)
• The new arrival time at Q is calculated by: AT(Q) = CLK(opening) + CLK -> Q + X
• The new required time at D becomes:
RT(D) = CLK(opening) + loop check delay, or
RT(D) = CLK(opening) + (X + CLK -> Q - D -> Q)
• Loop check delay = X - D -> Q + CLK -> Q
• Max “X” is the value that makes RTsetup(D) and RT(D) equal
• => X = {CLK(closing) - setup} - {CLK(opening) + CLK -> Q - D -> Q}
Example: Time Borrowing (Cont.)
(Figure: three latches i0, i1 and i2 in series, clocked from clk and Clk_bar, with one inverter (IV, 0.11) between i0 and i1 and two inverters (2 x 0.11) between i1 and i2; labels 5-10 and 15-20; output q.)
Graphical Representation of Loopcheck
Loopcheck is the more restrictive of the two checks for input D, and it usually applies unless the maximum is borrowed. Loopcheck
determines how late D can arrive if the arrival at Q is allowed to slip by an amount X. The larger the value of X, the
later the required time at D and the later the arrival at Q.
Negative polarity means time is
borrowed from the next stage and given to the
previous stage.
V4.0 Report Timing output
- Loop Check Delay -1.81
V5.0 Report Timing output
+ Time Borrowed 1.65
+ G -> Q Delay 0.54
- D -> Q Delay 0.38
(Figure: latch timing diagram showing the setup, G -> Q + X, D -> Q and loop check arcs relative to G.)
Calculating Borrowing Quantity “X”
In practice, the time borrow “X” is calculated with a slack balancing method:
the slack is adjusted independently for each latch so that it is equal on both sides.
Definition: AT(Q) = CLK(opening) + (CLK -> Q) + X
Definition: RT(D) = CLK(opening) + (CLK -> Q) - (D -> Q) + X
Definition: RTsetup(D) = CLK(closing) - setup (CLK -> D)
Slack = required time - arrival time; balance the slack on both sides:
slack@D = slack@Q, where the arrival time at D and the required time at Q are denoted at(D) and rt(Q)
Note: the rt(Q) calculation starts from i2 and works to the left; the at(D) calculation starts from i0 and works to the right
(Figure: three sequential cells i0, i1 and i2 in series, clocked from clk and clk_bar, with a delay block between each pair.)
Calculating Borrowing Quantity “X” (Cont.)
slack@D = slack@Q:
RT(D) - at(D) = rt(Q) - AT(Q)
(CLK(open) + X + CLK -> Q - D -> Q) - at(D) = rt(Q) - (CLK(open) + X + CLK -> Q)
2X = rt(Q) + at(D) - 2 CLK(open) - 2 (CLK -> Q) + D -> Q
X = (rt(Q) + at(D))/2 - CLK(open) - CLK -> Q + (D -> Q)/2
Now calculate the slack from this value of X, using the equations on the previous slide for RT(D) and
at(D):
Slack = RT(D) - at(D)
(Figure: three sequential cells i0, i1 and i2 in series, clocked from clk and clk_bar, with a delay block between each pair.)
How is “X” Determined
• “X” is determined through a process of iteration
• All “X” values start at 0, and the design is timed
• Slacks are computed at each latch D and Q pins
• The “X” value is adjusted to equalize those slacks
• Repeated until stable
• Every increase in “X” for a given latch makes the previous cycle timing easier at the expense of making
the next cycle more critical and vice versa
Design Example 1
No external_delay on the q output:
slackD = slackQ
(Figure: latches i0, i1 and i2 in series with one inverter (IV, 0.11) between i0 and i1 and two inverters (2 x 0.11) between i1 and i2; input a, output q; clk and clk_bar waveforms with times 0, 5, 10, 15 and 20 marking the time borrowed (X1) and the max time borrowed (X2); labels 5-10 and 15-20.)
Design Example (Cont.)
CLK -> Q = 0.1
D -> Q = 0.2
IV delay = 0.11
i2 setup = 0.4
Total time = 15ns, or 5-20
at(D) at i1 = 5 + 0.10 + 0.11 = 5.21
rt(Q) at i1 = 19.6 - 0.22 = 19.38 (rt = 19.6 at i2)
Max time borrow at i2: RTsetup(D) = CLK(closing) - setup, which gives X2 = 4.7
X1 = (19.38 + 5.21)/2 - 10 - 0.1 + 0.2/2 = 2.295 ≈ 2.3
Loop check delay1 = 2.3 - 0.2 + 0.1 = 2.2
RT(i1/D) = 10 + 0.1 - 0.2 + 2.3 = 12.2
slack(i1/D) = 12.2 - 5.21 = 6.99
AT(i1/Q) = 10 + 0.1 + 2.3 = 12.4
slack(i1/Q) = 19.38 - 12.4 = 6.98
slack(i2/D) = rt(i2/D) - at(i2/D) = 19.6 - (10 + 0.1 + 2.3 + 0.11 + 0.11) = 6.98 <------ i2 timing report
(The 6.99 vs. 6.98 difference comes from rounding X1 to 2.3.)
(Figure: latches i0, i1 and i2 in series with one inverter (IV, 0.11) between i0 and i1 and two inverters (2 x 0.11) between i1 and i2; output q.)
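The slack-balancing arithmetic of this example can be replayed in Python. The sketch below uses the formulas X = (rt(Q) + at(D))/2 - CLK(open) - CLK -> Q + (D -> Q)/2 and max X = {CLK(closing) - setup} - {CLK(opening) + CLK -> Q - D -> Q}; the function names are illustrative, and it assumes i1 opens at 10ns and i2 is transparent from 15ns to 20ns.

```python
def balanced_borrow(rt_q, at_d, clk_open, clk_to_q, d_to_q):
    """X that makes slack at D equal to slack at Q for one latch."""
    return (rt_q + at_d) / 2 - clk_open - clk_to_q + d_to_q / 2


def max_borrow(clk_close, setup, clk_open, clk_to_q, d_to_q):
    """Largest X, reached when RT(D) equals RTsetup(D)."""
    return (clk_close - setup) - (clk_open + clk_to_q - d_to_q)


CLK_TO_Q, D_TO_Q = 0.1, 0.2
at_d, rt_q = 5.21, 19.38           # values at i1 from the slide

x1 = balanced_borrow(rt_q, at_d, clk_open=10.0, clk_to_q=CLK_TO_Q, d_to_q=D_TO_Q)
rt_d = 10.0 + CLK_TO_Q - D_TO_Q + x1   # required time at i1/D
at_q = 10.0 + CLK_TO_Q + x1            # arrival time at i1/Q
slack_d = rt_d - at_d
slack_q = rt_q - at_q
x2_max = max_borrow(clk_close=20.0, setup=0.4, clk_open=15.0,
                    clk_to_q=CLK_TO_Q, d_to_q=D_TO_Q)

print(f"X1={x1:.3f} slackD={slack_d:.3f} slackQ={slack_q:.3f} X2max={x2_max:.1f}")
# X1=2.295 slackD=6.985 slackQ=6.985 X2max=4.7
```

Without rounding X1 to 2.3, the two slacks come out identical at 6.985, which is the point of the balancing method.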
Design Example (Cont.)
report_timing -to i1/D
Arrival time at i1/D: clk arrival at i0 + ck -> Q + X0 + 0.11
Other end: clk arrival at i1/ck + X1 + ck -> Q - D -> Q
at(D) = 5 + 0.10 + 0.11 = 5.21
rt(Q) = 19.6 - 0.22 = 19.38 (rt = 19.6)
In the latch-based design lab we will focus more on the reports.
(Figure: latches i0, i1 and i2 in series with one inverter (IV, 0.11) between i0 and i1 and two inverters (2 x 0.11) between i1 and i2; output q.)
Design Example (Cont.)
CLK -> Q = 0.1
D -> Q = 0.2
IV delay = 0.11
Total time = 17ns, or 5-22
Adding a 4th latch (set_external_delay -clock CLK -trail 3 q):
at(D) at i1 = 5 + 0.10 + 0.11 = 5.21
Balance slacks at i2: RT(D) - AT(D) = rt(Q) - AT(Q):
(15 + 0.1 + X2 - 0.2) - (10 + X1 + 0.1 + 0.11 + 0.11) = 22 - (15 + X2 + 0.1)
Balance slacks at i1:
(10 + 0.1 - 0.2 + X1) - 5.21 = (15 + 0.1 + X2 - 0.2 - 0.11 - 0.11) - (10 + X1 + 0.1)
Solving the equations gives X1 = 0.7, X2 = 1.51, slack = 5.39
(Figure: latches i0, i1 and i2 in series with one inverter (IV, 0.11) between i0 and i1 and two inverters (2 x 0.11) between i1 and i2; required time 22 at output q.)
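The two balance equations reduce to a 2x2 linear system, 2·X1 - X2 = b1 and -X1 + 2·X2 = b2, which the Python sketch below solves in closed form. The function and parameter names are illustrative; the numeric inputs are the ones used on the slide.

```python
def balance_two_latches(at_d1, open1, open2, rt_q2, cq, dq, delay12):
    """Solve slack@D = slack@Q at two consecutive latches (i1 and i2).

    at_d1   : arrival time at i1/D
    open1/2 : opening edges of i1 and i2
    rt_q2   : required time at i2/Q
    cq, dq  : CLK -> Q and D -> Q delays
    delay12 : logic delay between i1/Q and i2/D
    """
    # i1 balance: (open1 + cq - dq + X1) - at_d1
    #           = (open2 + cq - dq + X2 - delay12) - (open1 + cq + X1)
    b1 = (open2 - dq - delay12 - open1) - (open1 + cq - dq - at_d1)
    # i2 balance: (open2 + cq - dq + X2) - (open1 + cq + X1 + delay12)
    #           = rt_q2 - (open2 + cq + X2)
    b2 = (rt_q2 - open2 - cq) - (open2 - dq - open1 - delay12)
    # Solve 2*X1 - X2 = b1 and -X1 + 2*X2 = b2 (determinant 3).
    x1 = (2 * b1 + b2) / 3
    x2 = (b1 + 2 * b2) / 3
    slack = (open1 + cq - dq + x1) - at_d1
    return x1, x2, slack


x1, x2, slack = balance_two_latches(at_d1=5.21, open1=10.0, open2=15.0,
                                    rt_q2=22.0, cq=0.1, dq=0.2, delay12=0.22)
print(f"X1={x1:.2f} X2={x2:.2f} slack={slack:.2f}")  # X1=0.70 X2=1.51 slack=5.39
```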
How to Display “X” Value in Timing Reports
In addition to the “loop check delay”, you can also analyze “X”.
Use report_timing -trace_latch_borrow | -trace_latch_forward
Or use the get_timing_paths command: get_timing_paths <pin> -trace_latch_borrow | -trace_latch_forward
In either case, the time borrowing quantity “X” appears under the “stolen” column on the Q output
of the latch.
Optimization and Time Borrowing
During optimization, time borrowing is periodically adjusted.
Timing optimization works on combinational blocks only.
“X” is assumed fixed for a period of optimization.
Then every latch’s “X” is revised to balance slack:
• Slack before and after the latch is equalized as much as possible
• Rise and fall “X” values are kept separately
“X” values are not displayed separately during optimization.
Modes of a Latch
Transparent
• Behaves like a buffer
• All signals (including constants) flow through latch
Disabled
• All arcs are disabled. Opaque
Normal
• Usual operation of a latch. Signal propagation is governed by time borrowing concepts
Modes of a Latch (Cont.)
Syntax:
set_case_analysis { 0 | 1 } <list of latch clock pins>
remove_case_analysis <list of latch clock pins>
Extensions:
Transparent Mode:
set_disable_timing -from latch/g -to latch/q
set_disable_timing -from latch/g
Disabled Mode:
set_disable_timing -from latch/g -to latch/q
set_disable_timing -from latch/d -to latch/q
Have to set both
Latch Type     Transparent Value   Disable Value
Active High    1                   0
Active Low     0                   1
Constants arriving on a clk pin are processed to determine whether the latch is transparent or disabled.
Limiting Time Borrowing in Latches
set_max_time_borrow <object list> <limit>
remove_max_time_borrow <object list>
Constant propagation and clock information through latches is not performed incrementally, execute
report_timing to update values
report_exceptions
Chapter Summary
✓ Design Rule vs. Optimization Goal
✓ Prelayout & Postlayout STA
✓ Min and Max Timing Path
✓ Setup/Hold Time & Metastability
✓ Setup/Hold Timing Checks
✓ Clock Reconvergence Pessimism Removal (CRPR)
✓ Removal/Recovery Timing Checks
✓ Multi-Cycle Path
✓ Appendix “Time borrowing”
Thank You ☺

[Back2School] Timing Verification- Chapter 4

  • 1. Ahmed Abdelazeem. STA Basic Concepts: { Concepts } + { Technique }
  • 2. 04 Timing Verification. ✓ Design Rule vs. Optimization Goal ✓ Prelayout & Postlayout STA ✓ Min and Max Timing Path ✓ Setup/Hold Time & Metastability ✓ Setup/Hold Timing Checks ✓ Clock Reconvergence Pessimism Removal (CRPR) ✓ Removal/Recovery Timing Checks ✓ Multi-Cycle Path ✓ Appendix “Time borrowing”
  • 3. Design Rule vs. Optimization Goal. Design rules: for a reliable design. Optimization constraints: for the PPA target.
  • 4. Design Rule Check. Max Fanout: max_fanout does not mean the number of gates a pin can drive; it means the total fanout_load must not exceed a certain limit. Max Transition: 1) makes sure delay calculation falls within the library characterization range so that it stays accurate; 2) reduces input transition time to reduce short-circuit power. Max Capacitance: similar to the maximum transition constraint, but the cost is based on the total capacitance that a particular standard cell can drive on any interconnection in the ASIC design. Min Pulse Width: the pulse width needs to satisfy a certain threshold for the sequential elements to function properly.
  • 5. Topic 19: Minimum Pulse Width. Min Pulse Width: the clock pulse width needs to satisfy a certain threshold, defined either in the .lib or by set_min_pulse_width. For a low pulse, the tool uses fall_constraint; for a high pulse, it uses rise_constraint. What contributes to the min pulse width calculation? The clock pulse width (or any signal pulse) includes the effects of the following: 1) unequal rise and fall delays on gates along the path (takes away credit); 2) dynamic CRP (takes away credit); 3) static CRP (gives credit); 4) clock uncertainty (takes away credit). Pulse absorption: if the same clock signal passes through a series of cells of the same type, the pulse width of the clock signal keeps decreasing. At some point, if the buffer delay exceeds the clock pulse width, the clock pulse is absorbed.
  • 6. Topic 19: Minimum Pulse Width (cont’d). How is min pulse width calculated? > For min_pulse_width high: open edge clock latency = (max_rise clock arrival); close edge clock latency = (min_fall clock arrival); actual pulse width (high) = open edge latency - close edge latency + conservative static CRP - worst case clock uncertainty. > For min_pulse_width low: open edge clock latency = (max_fall clock arrival); close edge clock latency = (min_rise clock arrival); actual pulse width (low) = open edge latency - close edge latency + conservative static CRP - worst case clock uncertainty.
  • 7. Optimization Goal (Power, Performance, Area). Timing Optimization Goal (Delay Optimization): - Worst Negative Slack (WNS) - Total Negative Slack (TNS). Power Goal: - Minimize both dynamic power and leakage power - VT class usage: a trade-off between speed and leakage power. A low-VT cell is faster but consumes more leakage power; a high-VT cell is slower but saves leakage power. Area Goal: - Minimize total area while keeping the design routable and manufacturable.
  • 8. Topic 20: report_constraint. report_constraint -all_violators -nosplit shows a summary of the worst violation per endpoint of each violated design rule constraint in the current design.
  • 9. Prelayout / Postlayout STA. Pre-layout STA. Post-layout STA.
  • 10. A Basic Flip-Flop Structure
  • 12. Setup Timing Check. Data must be stable before the active edge of the clock. [Figure: flip-flop (D, Q, CLK) with CLK and D waveforms marking the unstable data and the setup time.]
  • 13. Launch and Capture Flip-flops. Setup is checked from the first active clock edge at the launch flip-flop to the closest active edge at the capture flip-flop. [Figure: launch flip-flop (clock 1) driving logic into the capture flip-flop (clock 2).]
  • 14. Data and Clock Signals for Setup Timing Check. Setup condition: Tlaunch + TLFF + Tc < Tcapture + Tcycle - Tsetup, where Tlaunch and Tcapture are the clock delays to LFF/CK and CFF/CK, TLFF is the delay through the launch flip-flop, and Tc is the delay through the logic. [Figure: CLK, LFF/CK, CFF/D, and CFF/CK waveforms showing the launch edge, the capture edge one Tcycle later, and the setup window; launch flip-flop (LFF) driving logic into the capture flip-flop (CFF).]
  • 15. Setup Timing (Max Delay) Analysis. Slack is the difference between the data required time and the data arrival time. [Figure: FF1 -> U2 -> U3 -> FF2 on clock Clk with edges at 0 ns and 4 ns; FF1/clk edges at 1.1 ns and 5.1 ns, FF2/clk edges at 1 ns and 5 ns; the data arrival time at FF2/D is compared with the data required time, which sits one setup time before the capture edge.]
  • 16. Timing Report for Setup (Max Delay Analysis), from pt_shell> report_timing. The report has four parts: header, data arrival, data required, and slack.
      Startpoint: FF1 (rising edge-triggered flip-flop clocked by Clk)
      Endpoint: FF2 (rising edge-triggered flip-flop clocked by Clk)
      Path Group: Clk
      Path Type: max
      Point                                   Incr      Path
      ------------------------------------------------------
      clock Clk (rise edge)                   0.00      0.00
      clock network delay (propagated)        1.10 &    1.10
      FF1/CLK (fdef1a15)                      0.00      1.10 r
      FF1/Q (fdef1a15)                        0.50 &    1.60 r
      U2/Y (buf1a27)                          0.11 &    1.71 r
      U3/Y (buf1a27)                          0.11 &    1.82 r
      FF2/D (fdef1a15)                        0.05 &    1.87 r
      data arrival time                                 1.87
      clock Clk (rise edge)                   4.00      4.00
      clock network delay (propagated)        1.00 &    5.00
      FF2/CLK (fdef1a15)                                5.00 r
      library setup time                     -0.21      4.79
      data required time                                4.79
      ------------------------------------------------------
      data required time                                4.79
      data arrival time                                -1.87
      ------------------------------------------------------
      slack (MET)                                       2.92
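The arithmetic behind the setup report on slide 16 can be reproduced by hand. The following is a minimal sketch in Python, not tool output: the numbers are copied from the report, while the function and variable names are hypothetical and chosen only for illustration.

```python
# Reproduce the setup (max delay) slack from the slide 16 report.
# All values are in ns and copied from the report; names are illustrative.

def setup_slack(launch_edge, launch_clock_delay, data_increments,
                capture_edge, capture_clock_delay, library_setup):
    """Return (arrival, required, slack) for a flop-to-flop setup check."""
    arrival = launch_edge + launch_clock_delay + sum(data_increments)
    required = capture_edge + capture_clock_delay - library_setup
    return arrival, required, required - arrival

arrival, required, slack = setup_slack(
    launch_edge=0.00,
    launch_clock_delay=1.10,                   # clock network delay to FF1/CLK
    data_increments=[0.50, 0.11, 0.11, 0.05],  # FF1/Q, U2/Y, U3/Y, FF2/D
    capture_edge=4.00,
    capture_clock_delay=1.00,                  # clock network delay to FF2/CLK
    library_setup=0.21,
)
print(f"arrival={arrival:.2f} required={required:.2f} slack={slack:.2f}")
```

A positive result corresponds to the report's "slack (MET)" line; a negative one would be a setup violation.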
  • 19. Hold Time Check. Verifies that the data is held stable for a specified amount of time after the active edge of the clock. [Figure: flip-flop (D, Q, CLK) with CLK and D waveforms marking the hold time.]
  • 20. Data and Clock Signals for Hold Timing Check. Hold condition: Tlaunch + TLFF + Tc > Tcapture + Thold. [Figure: CLK, LFF/CK, CFF/D, and CFF/CK waveforms showing the launch edge, the capture edge, Tcycle, and the setup window; launch flip-flop (LFF) driving logic into the capture flip-flop (CFF), annotated with Tlaunch, TLFF, Tc, and Tcapture.]
  • 21. Hold Timing (Min Delay) Analysis. Slack is the difference between the data arrival time and the data required time. [Figure: FF1 -> U2 -> U3 -> FF2 on clock Clk with edges at 0 ns and 4 ns; FF1/clk edges at 1.1 ns and 5.1 ns, FF2/clk edges at 1 ns and 5 ns; the data arrival time at FF2/D is compared with the data required time, which sits one hold time after the capture edge.]
  • 22. Timing Report for Hold (Min Delay Analysis), from report_timing -delay min.
      Startpoint: FF1 (rising edge-triggered flip-flop clocked by Clk)
      Endpoint: FF2 (rising edge-triggered flip-flop clocked by Clk)
      Path Group: Clk
      Path Type: min
      Point                                   Incr      Path
      ------------------------------------------------------
      clock Clk (rise edge)                   0.00      0.00
      clock network delay (propagated)        1.10 &    1.10
      FF1/CLK (fdef1a15)                      0.00      1.10 r
      FF1/Q (fdef1a15)                        0.40 &    1.50 f
      U2/Y (buf1a27)                          0.05 &    1.55 f
      U3/Y (buf1a27)                          0.05 &    1.60 f
      FF2/D (fdef1a15)                        0.01 &    1.61 f
      data arrival time                                 1.61
      clock Clk (rise edge)                   0.00      0.00
      clock network delay (propagated)        1.00 &    1.00
      FF2/CLK (fdef1a15)                                1.00 r
      library hold time                       0.10      1.10
      data required time                                1.10
      ------------------------------------------------------
      data required time                                1.10
      data arrival time                                -1.61
      ------------------------------------------------------
      slack (MET)                                       0.51
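The hold report on slide 22 follows the same pattern as the setup report, except that the capture edge is the same edge as the launch edge and the slack is arrival minus required. A minimal Python sketch with the report's numbers (names are hypothetical):

```python
# Reproduce the hold (min delay) slack from the slide 22 report.
# All values are in ns and copied from the report; names are illustrative.

def hold_slack(launch_edge, launch_clock_delay, data_increments,
               capture_edge, capture_clock_delay, library_hold):
    """Return (arrival, required, slack) for a flop-to-flop hold check."""
    arrival = launch_edge + launch_clock_delay + sum(data_increments)
    required = capture_edge + capture_clock_delay + library_hold
    # For hold, data must arrive AFTER the required time, so the sign flips.
    return arrival, required, arrival - required

hold_arrival, hold_required, hold_margin = hold_slack(
    launch_edge=0.00,
    launch_clock_delay=1.10,
    data_increments=[0.40, 0.05, 0.05, 0.01],  # FF1/Q, U2/Y, U3/Y, FF2/D
    capture_edge=0.00,                         # same edge as the launch edge
    capture_clock_delay=1.00,
    library_hold=0.10,
)
print(f"arrival={hold_arrival:.2f} required={hold_required:.2f} "
      f"slack={hold_margin:.2f}")
```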
  • 24. Hold Checks and Setup Check Cycles. A hold timing check ensures that: 1) data from the subsequent launch edge is not captured by the setup receiving edge; 2) data from the setup launch edge is not captured by the preceding receiving edge. [Figure: launch and capture flip-flop clocks showing launch edges 1 and 2, capture edges 0 and 1, setup check 1, and hold checks 0 and 1.]
  • 26. Removal Timing Check. Verifies that there is a required amount of time between an active clock edge and the release of an asynchronous control signal. [Figure: flip-flop with asynchronous set; CLK and set waveforms marking the active clock edge, the removal time, and the earliest point at which set can be removed.]
  • 27. Recovery Timing Check. Verifies that there is a minimum amount of time between the asynchronous signal becoming inactive and the next active clock edge. [Figure: flip-flop with asynchronous set; CLK and set waveforms marking the recovery time and the latest point at which set can be removed.]
  • 28. Metastability. Metastability window: in a synchronous system, the data always has a fixed relationship with respect to the clock. The metastability window is a specific length of time during which the data must not change. If the signal does change during this window, the output is unknown, or “metastable”. Metastability Window = Setup Time + Hold Time. The combination of the setup and hold time requirements determines the width of the metastability window.
  • 29. CLK -> D Timing Arc. The launch path arrives at the D pin of the capture flop, and the capture path arrives at the CLK pin of the capture flop. Thus, the setup/hold requirement is an arrival-time requirement between the CLK pin and the D pin.
  • 30. CLK -> D Timing Arc in the Library. Index 1 is the clock transition; index 2 is the data transition.
  • 33. Fixing Setup Violation. Cell Adjustment: > VT swap: replace a high-VT cell with a low-VT cell > Cell sizing: replace a low drive strength cell with a high drive strength cell > Channel length swap: replace a long-channel device with a short-channel device. Buffer Chain Adjustment: > Add a buffer on an existing route to break a long interconnection > Load isolation: insert a dedicated buffer for the load on the critical path > Load splitting: share the load of a heavily loaded buffer by inserting a parallel buffer > Rebuild buffer chain: remove excessive buffers added by the tool. Wire Routing Improvement: > Layer promotion: route the long critical net on a higher metal layer with less resistance > Avoid scenic routes: reroute detoured nets to shorten the route length > Cell strapping: reduce resistance on wire segments close to the driver. Logic Manipulation: > Bubble pushing: absorb an outside inverter into a library cell > Equivalent input reordering: put the critical signal on the faster input > Logic replication: clone a high-fanout logic gate to reduce wire capacitance. Clock Tree Manipulation: > Clock pull-in > Clock push-out. Architecture Change: > Pipelining the critical path: insert staging flops to segment the logic path > Redesign the RTL functionality.
  • 35. Topic 21: How is the hold timing edge determined? The hold relations are derived from the setup relation. 1. Data from the source clock edge that follows the setup launch edge must not be latched by the setup latch edge. 2. Data from the setup launch edge must not be latched by the destination clock edge that precedes the setup latch edge.
  • 37. Fixing Hold Violation. Find the optimal location for a hold buffer. Important: avoid setup/hold conflicting paths! 1. Find the timing path with the worst hold slack across all PVT corners. 2. Choose the pins with the maximum number of violating paths going through them (bottlenecks) as fixing candidates, to minimize the number of buffers inserted. 3. Exclude pins with poor setup margin or negative setup slack; choose one with good setup slack to avoid a setup/hold conflict. 4. Other conditions being equal, prefer fixing at load pins rather than driver pins, which is more predictable.
  • 38. Location of Hold Fixing. Find the optimal hold buffer location. Hint: avoid setup/hold conflicting paths!
  • 39. Delay Calculation for Timing Path. Max timing check: slow launch, fast capture. Min timing check: fast launch, slow capture.
  • 40. Clock Reconvergence Pessimism. CRP = latest RISE arrival time at the common point - earliest RISE arrival time at the common point, or CRP = latest FALL arrival time at the common point - earliest FALL arrival time at the common point. By default, PrimeTime removes clock reconvergence pessimism: timing_remove_clock_reconvergence_pessimism = "true". The delay to the common point should be the same for the launch and capture paths (i.e. CRP == 0). [Figure: clock CLK fanning out through U1 to U7 to the CP pins of the launch and capture flip-flops, with the common point marked ahead of the setup/hold check.]
  • 41. Clock Reconvergence Pessimism Removal. Clock arrival variation: clock reconvergence pessimism removal is the process by which the static variation between the early and late arrivals of a clock edge is removed. Static variation: the variation is constant during the timing check → only this type can be removed by CRPR. Dynamic variation: the variation can change during the timing check. Reasons for early/late variation: 1) reconvergent logic cones; 2) differences in min/max slews (different slews propagated for min and max timing); 3) early/late derate modeling for OCV, as specified with set_timing_derate; 4) variations in voltage; 5) variations in temperature. * In PrimeTime, CRPR is enabled by setting the timing_remove_clock_reconvergence_pessimism variable to true.
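To make the CRP definition on slide 40 concrete, here is a minimal Python sketch. The arrival times and the uncorrected slack are made-up illustration values, not taken from the slides, and the function names are hypothetical.

```python
# CRP at the common point = latest arrival - earliest arrival (same edge type).
# Removing it adds that amount back as credit to the slack.

def clock_reconvergence_pessimism(latest_arrival, earliest_arrival):
    """Pessimism on the shared clock segment, in the same unit as the inputs."""
    return latest_arrival - earliest_arrival

def slack_with_crpr(raw_slack, crp):
    """Slack after the common-path pessimism has been credited back."""
    return raw_slack + crp

# Hypothetical numbers (ns): the common point is reached at 1.10 on the late
# (launch) analysis and at 1.00 on the early (capture) analysis.
crp = clock_reconvergence_pessimism(latest_arrival=1.10, earliest_arrival=1.00)
corrected = slack_with_crpr(raw_slack=-0.04, crp=crp)
print(f"CRP={crp:.2f} ns, slack after CRPR={corrected:.2f} ns")
```

With these assumed values a path that fails by 0.04 ns before CRPR passes by 0.06 ns afterwards, which is why the amount of shared clock path matters for timing.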
  • 42. CRPR through reconvergent logic cones. CRP to be removed for a max path: the clock path arrival time difference. CRP to be removed for a min path: the clock path arrival time difference.
  • 43. CRPR through OCV. CRP to be removed for a max path: the on-chip variation difference of the common path. CRP to be removed for a min path: the on-chip variation difference of the common path. Sources shown: supply voltage variation; crosstalk-induced delay difference.
  • 44. Topic 22: CRPR and Clock Gating. Placing the clock gating cell near the source: from a timing perspective, this reduces the common clock path of the sinks → less CRPR credit (hostile for timing). From a power perspective, a few ICG cells can shut off a large portion of the clock tree.
  • 45. Topic 22: CRPR and Clock Gating (cont’d). Placing the clock gating cell near the sinks: from a timing perspective, more cells sit on the common path shared among the sinks → good for timing. From a power perspective, a larger portion of the clock tree stays on when the downstream ICG is shut off.
  • 47. Multi-cycle Path. Default analysis scenario: by default, the STA tool assumes that data launched from the startpoint is captured by the endpoint in the next clock cycle. Multi-cycle scenario: if a multicycle constraint is specified, the STA tool moves the capture edge accordingly. The number of clock cycles to move should come from the design intent. set_multicycle_path -setup N; set_multicycle_path -hold (N-1).
  • 48. Topic 23: multi-cycle hold path. Default behavior: by default, the STA tool assumes the data path could change during any clock cycle before clock edge number N; it implicitly applies set_multicycle_path -hold 0. So, with only a setup multiplier of N, the tool assumes you want the path buffered up so that the minimum delay is greater than N-1 cycles.
  • 49. Topic 23: multi-cycle hold path (cont’d). Determining the hold check edge: the hold relations are derived from the setup relation. 1. Data from the source clock edge that follows the setup launch edge must not be latched by the setup latch edge. 2. Data from the setup launch edge must not be latched by the destination clock edge that precedes the setup latch edge.
  • 50. Topic 23: multi-cycle hold path (cont’d). set_multicycle_path -setup 3; set_multicycle_path -hold 2.
  • 51. Multi-cycle Path. set_multicycle_path -setup 3; set_multicycle_path -hold 2.
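The way the setup and hold multipliers move the check edges can be sketched numerically. This is a minimal Python illustration that assumes a single clock, a launch edge at time 0, and the usual SDC convention that the hold multiplier counts cycles back from the default hold edge; the function name and the 4 ns period are hypothetical.

```python
# Edge positions for a same-clock multicycle path, launch edge at t = 0.

def multicycle_edges(period, setup_mult=1, hold_mult=0):
    """Return (setup_capture_edge, hold_check_edge) in the unit of `period`."""
    setup_edge = setup_mult * period
    # Default hold edge is one capture edge before the setup capture edge;
    # each hold multiplier step moves it one more cycle earlier.
    hold_edge = (setup_mult - 1 - hold_mult) * period
    return setup_edge, hold_edge

period = 4.0  # assumed clock period in ns

default_edges = multicycle_edges(period)                    # no exception
setup_only = multicycle_edges(period, setup_mult=3)         # -setup 3 alone
setup_and_hold = multicycle_edges(period, setup_mult=3, hold_mult=2)

print(default_edges, setup_only, setup_and_hold)
```

With -setup 3 alone the hold check lands two cycles after the launch edge, which is the "buffered up" requirement described on slide 48; adding -hold 2 (that is, N-1) brings the hold check back to the launch edge.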
  • 52. Topic 24: Timing Exceptions. What is a timing exception? By default, PrimeTime assumes that data launched at a path startpoint is captured at the path endpoint by the very next occurrence of a clock edge at the endpoint. Types of timing exception: there are three types: 1) false paths, 2) min/max delays, and 3) multi-cycle paths. Specifying timing exceptions efficiently: if a timing exception is specified on a particular set of pins, the tool needs to keep track of exceptions on registers, pins, and nets.
  • 53. Topic 24: Timing Exceptions (cont’d). Rule #1, exception priority: in case of conflicting exceptions for a particular path, the timing exception types have the following order of priority, from highest to lowest: 1) set_false_path 2) set_max_delay and set_min_delay 3) set_multicycle_path. Rule #2, the more restrictive one dominates: for constraints of the same type, the more restrictive one wins and overrides the less restrictive one. Rule #3, the more specific constraint dominates: if two constraints apply to the same path, the constraint with the more specific condition wins.
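Rule #1 on slide 53 amounts to a fixed ranking of exception types. A minimal Python sketch of that ranking follows; it models only Rule #1, ignores Rules #2 and #3, and is not how any STA tool is implemented. The function and table names are hypothetical.

```python
# Rule #1 only: pick the winning exception type among conflicting ones.
# Lower rank number means higher priority, following the order on slide 53.
EXCEPTION_RANK = {
    "set_false_path": 1,
    "set_max_delay": 2,
    "set_min_delay": 2,
    "set_multicycle_path": 3,
}

def winning_exception(exception_types):
    """Return the highest-priority exception type from a non-empty list."""
    return min(exception_types, key=lambda name: EXCEPTION_RANK[name])

winner = winning_exception(["set_multicycle_path", "set_false_path"])
print(winner)  # set_false_path
```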
  • 54. Topic 25: Bottleneck Analysis. report_bottleneck: a bottleneck is a common point in the design that contributes to multiple violations.
  • 55. Appendix. ● What is a latch? ● Latches vs. Flip-Flops ● What is time borrowing? i. Limits of time borrowing ii. Loopcheck iii. Calculating borrowing quantity “X” iv. Displaying the “X” quantity in timing reports v. Optimization and time borrowing ● Modes of a latch
  • 56. Introduction. What is a latch? It is a level-sensitive register with three ports (input D, output Q, enable G). When enable is active, the input drives the output; the latch is “open” or “transparent”. When enable is inactive, the output is kept at its existing value. Setup and hold checks are made with respect to the asserted (capture) edge. [Figure: latch symbol with D, Q, and G; enable waveform marking the transparent phases and the capture edge.]
  • 57. Latches vs. Flops. • There is an area saving over a flop in many cases: I. a D latch uses 4 NAND gates; II. a D flop uses 8 NAND gates. • A properly designed latch-based design can nearly eliminate the clock timing checks from the critical path: I. the critical path can flow through the latches. • Latches provide a degree of freedom not present in flop designs: I. there is no exact start or stop to a given cycle. • This allows unbalanced logic stages to be timed more optimally. • A latch is transparent for a finite time.
  • 58. What is Time Borrowing? Time borrowing is the determination, at each latch in the design, of an artificial boundary between the logic before and after the latch. It does not mean: • any movement of logic to the other side of a latch; • any change in the clock network (skew). [Figure: latch chain i0 -> delay -> i1 -> delay -> i2, clocked by clk, clk_bar, and clk.]
  • 59. Understanding Time Borrowing. Time borrowing (cycle stealing) borrows time from the next stage. No time borrowing: the input must be stable before the latch opens. Max time borrowing: the input has until just before the latch closes to become stable; max time borrowing = pulse width - setup. [Figure: clock waveforms for the no-borrowing and max-borrowing cases.]
  • 60. Understanding Time Borrowing (Cont.). Every timing analysis tool makes its own assumptions about time borrowing. • They won’t necessarily match (very important!). Some tools borrow slack from a later stage, so slacks are not equalized. Some tools equalize slack independently on both sides of the latch. Time borrowing is performed automatically during timing analysis. • It is performed even when the current stage has positive slack.
  • 61. What are the Limits of Time Borrowing? The latest that data can arrive and still “make it into the latch”: • the latch closing edge minus the setup time; • data must be stable at the latch input before the latch closes. The earliest that the next stage can use data from a latch: • the latch opening time (clock arriving) + the clock-to-Q arc. [Figure: latch with the setup, G -> Q, and D -> Q arcs marked.]
  • 62. Example: Time Borrowing. Borrowing time from the next stage in a level-sensitive design to meet timing constraints; Max = pulse width - setup. For path U1 to U2, the signal arrives at 8 ns while the clock arrives at 5 ns. However, since U2 is still open, this path borrows 3 ns from the next path, U2 to U3, and hence meets setup (assume a setup of 0 ns). [Figure: clk and clk_bar waveforms with edges at 0, 5, 10, and 15 ns; U1 -> P1 (8 ns) -> U2 -> P2 (1 ns) -> U3, where U3 is a flip-flop clocked at 10 ns; 3 ns borrowed at U2.]
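The slide 62 example can be checked with a few lines of Python. This is a minimal sketch under the slide's stated assumptions (U2 opens at 5 ns, closes at 10 ns, setup of 0 ns, data arrives at 8 ns); the function name is hypothetical.

```python
# Time borrowed at a latch for a given data arrival time (all values in ns).

def time_borrowed(arrival, latch_open, latch_close, setup=0.0):
    """Return the borrowed time, or raise if data misses the closing edge."""
    latest_allowed = latch_close - setup
    if arrival > latest_allowed:
        raise ValueError("setup violation: data arrives after the latch closes")
    # Data arriving before the latch opens needs no borrowing.
    return max(0.0, arrival - latch_open)

max_borrow = (10.0 - 5.0) - 0.0   # pulse width - setup
borrowed = time_borrowed(arrival=8.0, latch_open=5.0, latch_close=10.0)
print(f"borrowed={borrowed} ns of a possible {max_borrow} ns")
```

The 3 ns borrowed at U2 is taken out of the time available to the next path, U2 to U3.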
  • 63. Example: Time Borrowing (Cont.). Time borrowing happens even if the current stage has positive slack. The required time for input D that corresponds to the max time borrowing is denoted RTsetup(D) and calculated as: RTsetup(D) = CLK(closing) - latch setup time (CLK -> D). When the amount of time borrowed is less than its max, the required time is denoted RT(D). • RT(D) is earlier than RTsetup(D). • The new arrival time at Q is: AT(Q) = CLK(opening) + (CLK -> Q) + X. • The new required time at D becomes: RT(D) = CLK(opening) + loop check delay, or RT(D) = CLK(opening) + (X + (CLK -> Q) - (D -> Q)). • Loop check delay = X - (D -> Q) + (CLK -> Q). • Max “X” is the value that makes RTsetup(D) and RT(D) equal. • => X = { CLK(closing) - setup } - { CLK(opening) + (CLK -> Q) - (D -> Q) }.
  • 64. Example: Time Borrowing (Cont.). [Figure: latch chain i0 -> IV (0.11) -> i1 -> 2 IVs (2 x 0.11) -> i2 -> q, clocked by clk and clk_bar, with clock windows 5-10 and 15-20.]
  • 65. Graphical Representation of Loopcheck. Loopcheck is the more restrictive of the two checks for input D and usually applies unless the max is borrowed. Loopcheck determines how late D can be factored in if the arrival at Q is allowed to slip by an amount X. The larger the value of X, the greater the required time at D and the greater the arrival time at Q. Negative polarity means time is borrowed from the next stage and given to the previous stage. V4.0 Report Timing output: - Loop Check Delay -1.81. V5.0 Report Timing output: + Time Borrowed 1.65, + G -> Q Delay 0.54, - D -> Q Delay 0.38. [Figure: latch with the setup, loop check, D -> Q, and G -> Q + X arcs marked.]
  • 66. Calculating Borrowing Quantity “X”. The slack value is adjusted independently for each latch to be equal on both sides. In reality, the time borrow “X” is calculated using a slack balancing method. Definition: AT(Q) = clk(opening) + (clk -> Q) + “X”. Definition: RT(D) = clk(opening) + (clk -> Q) - (D -> Q) + “X”. Definition: RTsetup(D) = clk(closing) - clk-to-D (setup). Slack = required time - arrival time; balance the slack on both sides: slack@D = slack@Q, where the arrival time at D and the required time at Q are denoted at(D) and rt(Q). Note: the rt(Q) calculation starts from i2 and works to the left; the at(D) calculation starts from i0 and works to the right. [Figure: latch chain i0 -> delay -> i1 -> delay -> i2, clocked by clk, clk_bar, and clk.]
  • 67. Calculating Borrowing Quantity “X” (Cont.). Slack@D = slack@Q: RT(D) - at(D) = rt(Q) - AT(Q). Substituting: (CLK(open) + X + (CLK -> Q) - (D -> Q)) - at(D) = rt(Q) - (CLK(open) + X + (CLK -> Q)). Hence 2X = rt(Q) + at(D) - 2 CLK(open) - 2 (CLK -> Q) + (D -> Q), and X = (rt(Q) + at(D))/2 - CLK(open) - (CLK -> Q) + (D -> Q)/2. Now calculate the slack from this value of X, using the equations on the previous slide for RT(D) and at(D): Slack = RT(D) - at(D). [Figure: latch chain i0 -> delay -> i1 -> delay -> i2, clocked by clk, clk_bar, and clk.]
  • 68. How is “X” Determined? • “X” is determined through a process of iteration. • All “X” values start at 0, and the design is timed. • Slacks are computed at each latch’s D and Q pins. • The “X” value is adjusted to equalize those slacks. • This is repeated until stable. • Every increase in “X” for a given latch makes the previous cycle’s timing easier at the expense of making the next cycle more critical, and vice versa.
  • 69. Design Example 1. No external_delay on the q output: slackD = slackQ. [Figure: clk and clk_bar waveforms with edges at 0, 5, 10, 15, and 20, marking the time borrowed (X1) and the max time borrowed (X2); latch chain a -> i0 -> IV (0.11) -> i1 -> 2 IVs (2 x 0.11) -> i2 -> q, with clock windows 5-10 and 15-20.]
  • 70. Design Example (Cont.). Given: CLK -> Q = 0.1, D -> Q = 0.2, IV delay = 0.11, i2 setup = 0.4, total time = 15 ns (from 5 to 20). at(D) at i1 = 5 + 0.10 + 0.11 = 5.21. Required time at i2/D: rt = 19.6, so rt(Q) at i1 = 19.6 - 0.22 = 19.38. Max time borrow at i2: RTsetup(D) = CLK(closing) - setup = 19.6, giving X2 = 4.7. X1 = (19.38 + 5.21)/2 - 10 - 0.1 + 0.1 = 2.3. Loop check delay1 = 2.3 - 0.2 + 0.1 = 2.2. RT(i1/D) = 10 + 0.1 - 0.2 + 2.3 = 12.2; slack(i1/D) = 12.2 - 5.21 = 6.99. AT(i1/Q) = 10 + 0.1 + 2.3 = 12.4; slack(i1/Q) = 19.38 - 12.4 = 6.98. slack(i2/D) = rt(i2/D) - at(i2/D) = 19.6 - (10 + 0.1 + 2.3 + 0.11 + 0.11) = 6.98 (the value in the i2 timing report). [Figure: latch chain i0 -> IV (0.11) -> i1 -> 2 IVs (2 x 0.11) -> i2 -> q, clocked by clk and clk_bar.]
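The slide 70 numbers follow directly from the slack-balancing formula on slide 67. Below is a minimal Python sketch that recomputes them without intermediate rounding; the variable and function names are hypothetical, and the slide's 2.3 is the rounded form of the 2.295 computed here.

```python
# Slack balancing at latch i1 and max borrow at latch i2 (all values in ns).
CLK_TO_Q = 0.1
D_TO_Q = 0.2

def balanced_borrow(rt_q, at_d, clk_open):
    """X from slide 67: the borrow that equalizes slack at D and at Q."""
    return (rt_q + at_d) / 2 - clk_open - CLK_TO_Q + D_TO_Q / 2

def max_borrow(clk_close, setup, clk_open):
    """Largest X: the value at which RT(D) reaches RTsetup(D)."""
    return (clk_close - setup) - (clk_open + CLK_TO_Q - D_TO_Q)

at_d = 5 + 0.10 + 0.11          # arrival at i1/D
rt_q = (20 - 0.4) - 0.22        # required at i1/Q, from RTsetup at i2/D

x1 = balanced_borrow(rt_q, at_d, clk_open=10)
x2_max = max_borrow(clk_close=20, setup=0.4, clk_open=15)

slack_d = (10 + CLK_TO_Q - D_TO_Q + x1) - at_d   # RT(D) - at(D)
slack_q = rt_q - (10 + CLK_TO_Q + x1)            # rt(Q) - AT(Q)
print(f"X1={x1:.3f} X2max={x2_max:.2f} slackD={slack_d:.3f} slackQ={slack_q:.3f}")
```

Both slacks come out at 6.985 ns; the slide's 6.99 and 6.98 differ only because X1 was rounded to 2.3 before the slacks were computed.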
  • 71. Design Example (Cont.). report_timing -to i1/D shows: arrival time at i1/D = clock arrival at i0 + (ck -> Q) + X0 + 0.11; other-end arrival = clock arrival at i1/ck + X1 + (ck -> Q) - (D -> Q). The latch-based design lab focuses more on the reports. [Figure: latch chain i0 -> IV (0.11) -> i1 -> 2 IVs (2 x 0.11) -> i2 -> q, with at(D) = 5 + 0.10 + 0.11 = 5.21 and rt(Q) = 19.6 - 0.22 = 19.38.]
  • 72. Design Example (Cont.). Given: CLK -> Q = 0.1, D -> Q = 0.2, IV delay = 0.11, total time = 17 ns (from 5 to 22). at(D) at i1 = 5 + 0.10 + 0.11 = 5.21. Adding a 4th latch (set_external_delay -clock CLK -trail 3 q). Balance the slacks at i2, (RT(D) - AT(D)) = (rt(Q) - AT(Q)): (15 + 0.1 + X2 - 0.2) - (10 + X1 + 0.1 + 0.11 + 0.11) = 22 - (15 + X2 + 0.1). Balance the slacks at i1: (10 + 0.1 - 0.2 + X1) - 5.21 = (15 + 0.1 + X2 - 0.2 - 0.11 - 0.11) - (10 + X1 + 0.1). Solving the equations gives X1 = 0.7, X2 = 1.51, slack = 5.39. [Figure: latch chain i0 -> IV (0.11) -> i1 -> 2 IVs (2 x 0.11) -> i2 -> q, with the required time of 22 at q.]
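The two balance equations on slide 72 reduce to a 2x2 linear system: 2·X1 - X2 = -0.11 (from i1) and -X1 + 2·X2 = 2.32 (from i2). The following minimal Python sketch solves it with Cramer's rule; the helper name is hypothetical, and the right-hand sides are collected from the slide's constants.

```python
# Solve the slide 72 slack-balancing equations for X1 and X2 (ns).

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y

# i1: (10 + 0.1 - 0.2 + X1) - 5.21 = (15 + 0.1 + X2 - 0.2 - 0.22) - (10 + X1 + 0.1)
rhs_i1 = (15 + 0.1 - 0.2 - 0.22 - 10 - 0.1) - (10 + 0.1 - 0.2 - 5.21)
# i2: (15 + 0.1 + X2 - 0.2) - (10 + X1 + 0.1 + 0.22) = 22 - (15 + X2 + 0.1)
rhs_i2 = (22 - 15 - 0.1) - (15 + 0.1 - 0.2 - 10 - 0.1 - 0.22)

x1_bal, x2_bal = solve_2x2(2, -1, rhs_i1, -1, 2, rhs_i2)
balanced_slack = (10 + 0.1 - 0.2 + x1_bal) - 5.21   # slack at i1/D
print(f"X1={x1_bal:.2f} X2={x2_bal:.2f} slack={balanced_slack:.2f}")
```

An STA tool reaches the same fixed point by iterating, as slide 68 describes; the direct solve is only practical here because the example has two latches.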
  • 73. How to Display the “X” Value in Timing Reports. In addition to the “loop check delay”, you can also analyze “X”. Use report_timing -trace_latch_borrow | -trace_latch_forward, or use the get_timing_paths command: get_timing_paths <pin> -trace_latch_borrow | -trace_latch_forward. In either case, the time borrowing “X” quantity appears under the “stolen” column on the Q output of the latch.
  • 74. Optimization and Time Borrowing. During optimization, time borrowing is periodically adjusted. Timing optimization works on combinational blocks only. “X” is assumed fixed for a period of optimization. Then every latch’s “X” is revised to balance slack. • Slack before and after the latch is equalized as much as possible. • Rise and fall “X” values are kept separately. There is no separate display of “X” values during optimization.
  • 75. Modes of a Latch. Transparent: • behaves like a buffer; • all signals (including constants) flow through the latch. Disabled: • all arcs are disabled; the latch is opaque. Normal: • the usual operation of a latch; signal propagation is governed by time borrowing concepts.
  • 76. Modes of a Latch (Cont.). Syntax: set_case_analysis { 0 | 1 } <list of latch clock pins>; remove_case_analysis <list of latch clock pins>. Constants arriving on a clk pin are processed to determine whether the latch is transparent or disabled. Extensions (both commands have to be set in each mode). Transparent mode: set_disable_timing -from latch/g -to latch/q; set_disable_timing -from latch/g. Disabled mode: set_disable_timing -from latch/g -to latch/q; set_disable_timing -from latch/d -to latch/q.
      Latch Type     Transparent Value    Disable Value
      Active High    1                    0
      Active Low     0                    1
  • 77. Limiting Time Borrowing in Latches. set_max_time_borrow <object list> <limit>; remove_max_time_borrow <object list>. Constant propagation and clock information through latches is not updated incrementally; execute report_timing to update the values. report_exceptions.
  • 78. Chapter Summary. ✓ Design Rule vs. Optimization Goal ✓ Prelayout & Postlayout STA ✓ Min and Max Timing Path ✓ Setup/Hold Time & Metastability ✓ Setup/Hold Timing Checks ✓ Clock Reconvergence Pessimism Removal (CRPR) ✓ Removal/Recovery Timing Checks ✓ Multi-Cycle Path ✓ Appendix “Time borrowing”
  • 79. Thank You ☺