Pingqiang Zhou
ShanghaiTech University
Timing Analysis
ASIC Timing: Role of CAD Tools
• ASIC timing has deep interactions with logic and layout synthesis.
[Figure: design flow. High-level description + timing specifications → Logic Synthesis → connected cells with delay constraints on signal paths → Layout Synthesis → placed cells with real locations and real connecting wires.]
ASIC Timing: Role of CAD Tools
• Requirements on timing analysis
  - Logic-side tools must estimate delays through unplaced, unrouted logic.
  - Layout tools must estimate delays through placed, routed logic.
[Figure: Logic Synthesis followed by Layout Synthesis.]
Our Topics for ASIC Timing
• Logic side: Static Timing Analysis
  - How do we estimate the worst-case timing through a logic network?
  - This turns out to be a longest-path problem on a graph that properly models the gates and wires.
• Layout side: Interconnect Delay Analysis
  - Once the gates are placed and the wires routed, how do we estimate wire delays?
  - The problem builds on an electrical circuit model. We will show the key results.
Timing Analysis at the Logic Level
• Goal: Verify the timing behavior of our logic design.
• Input:
  - A gate-level netlist.
  - Timing models of the gates and/or wires.
• Output:
  - Signal arrival times at various points in the network.
  - Longest delays through the gate network.
  - Does the netlist satisfy the timing requirement? If not, where are the key problems?
• This is surprisingly complicated in the real world...
Analyzing Design Performance
• Assume the design is synchronous.
  - All storage is in explicit sequential elements, e.g., flip-flops.
  - Consequence: we can focus on delays through the combinational gates.
[Figure: launch flip-flops → combinational logic (no feedback loops) → capture flip-flops, with a clock driving the flip-flops.]
Question: Can’t We Just Simulate Logic?
• What does logic simulation do?
  - Determines how a system will behave by simulating its logical function.
  - Gives the most accurate answer, given good simulation models.
  - ... but it is practically impossible to get a complete answer, especially for timing.
• It requires examining an exponential number of cases:
  - All possible input vectors ...
  - With all possible relative timings ...
  - Under all possible manufacturing variations ...
• We need a different, faster solution...
Timing Analysis: Basic Model
• Assume we know the clock cycle.
  - E.g., 1 GHz clock, cycle = 1 ns.
• For the logic to work correctly, the longest delay through the network must be shorter than the clock cycle.
[Figure: flip-flops → combinational logic → flip-flops, clocked with a 1 ns cycle. Longest delay < clock cycle.]
Timing Analysis: Gate Delay Models
• First: we need a model of the delay through each logic gate.
• Delay of a single gate: ∆
[Figure: a gate with input X and output Y, with the delay ∆ marked between the X and Y waveforms. What is the gate delay ∆?]
[Figures: slides 10 to 13 are image-only. Courtesy: UC Berkeley]
In Reality: Gate Delay is Very Complex
• Gate type affects delay.
• Waveform shape affects delay.
• Gate loading affects delay.
• Transition direction affects delay.
[Figure: for each factor, a pair of cases whose delays ∆ differ.]
In Reality: Gate Delay is Very Complex
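• Gate input pin affects delay.
  - Why? At the transistor level, inputs are not symmetric.
• At nanoscale, delays are even statistical.
  - Why? Delay depends on process, voltage, and temperature (PVT) variations.
[Figure: delays through different input pins of a gate differ; a probability density function (PDF) of ∆ with axis ticks at 200, 240, 280.]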
Our Model: Pin-to-Pin Delay
• In this lecture we keep it simple: a fixed, pin-to-pin delay model.
  - No slopes, transition directions, or distributions. Loading effects are “pushed” into the gate delay itself.
  - Per-pin delays are essential in practice, but we will use just one value per gate, for simplicity.
  - It turns out this is enough to see all the interesting algorithmic ideas.
[Figure: two gates, one labeled ∆=3 on both input pins and one labeled ∆=5 on both input pins.]
Do We Consider Logical Function?
• Does the logic function matter?
  - Try an example where we “erase” the gates.
  - In this example: PI = Primary Input, PO = Primary Output.
[Figure: a gate network with the logic erased, three PIs and one PO, and gate delays ∆ = 8, 2, 1, 8, 1, 2, 1. What is the longest delay? 20]
Now, Suppose We Know Logic Gates
• We cannot sensitize this path: a logic change at this input cannot propagate down this path to change this output.
[Figure: the same network with the logic gates shown (delays ∆ = 8, 1, 8, 1, 2, 2; some inputs labeled 0 and 1). Can the longest path actually be exercised? No!]
Topological vs. Logical Timing Analysis
• When we ignore logic, this is called Topological Analysis.
  - We work only with the graph and the delays; we don’t consider logic.
  - We can get wrong answers: what we found is called a False Path.
• Going forward: we ignore logic (it is too tough to deal with).
  - Assume that all paths are statically sensitizable.
  - This means: we can find a constant pattern of inputs on the other PIs that makes some output sensitive to some input.
  - Reminder: this is exactly the Boolean Difference concept of sensitivity.
• This timing analysis has a name: Static Timing Analysis (STA).
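A minimal sketch of the static sensitization test via the Boolean difference, assuming small functions that can be checked exhaustively. The two example functions (`and_or`, `blocked`) are hypothetical and are not the circuit from the slides.

```python
from itertools import product

def boolean_difference(f, i, n):
    """List the side-input patterns for which f is sensitive to input i.

    f takes a tuple of n bits. The Boolean difference with respect to
    input i is f(x_i = 0) XOR f(x_i = 1), evaluated per side-input pattern.
    """
    patterns = []
    for bits in product((0, 1), repeat=n):
        if bits[i] != 0:
            continue  # visit each side-input pattern once (with x_i = 0)
        lo = f(bits)
        hi = f(bits[:i] + (1,) + bits[i + 1:])
        if lo != hi:
            patterns.append(bits[:i] + bits[i + 1:])
    return patterns

# Hypothetical functions, for illustration only.
and_or = lambda b: (b[0] & b[1]) | b[2]          # f = x0*x1 + x2
blocked = lambda b: (b[0] & b[1]) & (1 - b[1])   # x0 gated by x1 AND (NOT x1)

sensitizing = boolean_difference(and_or, 0, 3)   # side inputs (x1, x2)
never = boolean_difference(blocked, 0, 2)        # side input (x1,)
print(sensitizing, never)
```

An empty result, as for `blocked`, is what a false path looks like: no constant side-input pattern lets a change on that input reach the output.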
STA Representation: Delay Graph
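• From the gate-level network, we build a delay graph.
  - Vertices: wires in the gate network, one per gate output, plus one for each PI and PO.
  - Edges: from each input pin to the output pin of a gate (one edge per input pin). Put the gate delays on the edges.
[Figure: a two-gate network (one gate with ∆=4, one with ∆=3) on wires a, b, c, d, e, with three PIs and one PO, next to its delay graph with vertices a, b, c, d, e and edge weights 4, 4, 3, 3.]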
Delay Graph
• Common convention: add Source/Sink nodes.
  - Add one “source” (src) node that has a 0-weight edge to each PI.
  - Add one “sink” (snk) node that has a 0-weight edge from each PO.
• Why do this?
  - Now the network has exactly one “entry” node and one “exit” node.
  - All longest (or shortest) path questions have the same start and end nodes.
[Figure: the delay graph on a, b, c, d, e (edge weights 4, 4, 3, 3) with added src and snk nodes and four 0-weight edges.]
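A small sketch of the source/sink convention. The edge list is an assumed reading of the figure (a and b feed the ∆=4 gate that drives d; d and c feed the ∆=3 gate that drives e); the function name `add_source_and_sink` is hypothetical.

```python
def add_source_and_sink(edges, primary_inputs, primary_outputs):
    """Return a new edge dict with 0-weight src->PI and PO->snk edges added."""
    graph = dict(edges)
    for pi in primary_inputs:
        graph[("src", pi)] = 0
    for po in primary_outputs:
        graph[(po, "snk")] = 0
    return graph

# Assumed gate edges: (input wire, output wire) -> gate delay.
gate_edges = {("a", "d"): 4, ("b", "d"): 4, ("d", "e"): 3, ("c", "e"): 3}
delay_graph = add_source_and_sink(gate_edges, ["a", "b", "c"], ["e"])

for (p, s), w in delay_graph.items():
    print(f"{p} -> {s}: {w}")
```

With one entry and one exit node, every later routine can take just `"src"` and `"snk"` as its start and end points.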
Representation: Delay Graph
• What about interconnect delay?
  - We can still use the delay graph: model each wire as a “special” gate that just has a delay.
[Figure: the same two-gate network with wire segments q, w, x, y, z carrying delays of 1 or 2, next to the resulting delay graph, which gains vertices x, y, w, z, q between src and snk.]
Operations on Delay Graph
• So how do we use the delay graph to do timing analysis?
  - What we don’t do: try to enumerate all the source-to-sink paths.
  - Why not? The number of paths explodes exponentially, even for a small graph.
• There’s a smarter answer: node-oriented timing analysis.
  - Find, for each node in the delay graph, the worst delay to the node along any path.
[Figure: a chain of nodes 0, 1, 2, ..., n. How many paths from 0 to n? 2^n]
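To make the 2^n count concrete, here is a sketch that counts paths without enumerating them. It assumes each hop from node i to node i+1 can go through either of two branches; that structure (`ladder`) is my assumption about the figure, chosen because it yields exactly 2^n paths.

```python
def count_paths(successors, start, end):
    """Count distinct start-to-end paths in a DAG by memoized recursion."""
    memo = {}

    def visit(node):
        if node == end:
            return 1
        if node not in memo:
            memo[node] = sum(visit(nxt) for nxt in successors.get(node, []))
        return memo[node]

    return visit(start)

def ladder(n):
    """Chain 0..n where each hop i -> i+1 goes through an upper or a lower branch."""
    successors = {}
    for i in range(n):
        upper, lower = ("u", i), ("l", i)
        successors[i] = [upper, lower]
        successors[upper] = [i + 1]
        successors[lower] = [i + 1]
    return successors

path_counts = [count_paths(ladder(n), 0, n) for n in (1, 2, 10, 30)]
print(path_counts)
```

The graph for n = 30 has only 91 nodes but over a billion paths, which is why the analysis is organized per node rather than per path.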
Define Values on Nodes in Delay Graph
• Arrival Time at a node (AT)
  - AT(n) = latest time at which the signal can become stable at node n.
  - Think: longest path from the source.
• Required Arrival Time at a node (RAT)
  - RAT(n) = latest time at which the signal is allowed to become stable at node n.
  - Think: longest path to the sink.
[Figure: node n between src and snk, among other paths; AT is marked on the src side of n and RAT on the snk side.]
Define Values on Nodes in Delay Graph
• Slack at node n: Slack(n) = RAT(n) - AT(n)
  - The amount of timing “margin” for the signal: positive is good, negative is bad.
  - Determined by the longest path through the node.
• Slack is the amount by which a signal can be delayed at a node without increasing the longest path through the network.
• At a node with positive slack, we can increase delay (e.g., to reduce power or circuit area) without degrading overall performance.
[Figure: node n between src and snk, among other paths, with AT and RAT marked. Slack(n) = RAT(n) - AT(n)]
Slack is Hugely Important in Timing Analysis
• About slacks
  - Defined so that negative slack is always bad: it indicates a timing problem.
  - Measures the “sensitivity” of the network to this node’s delay.
• Positive slack
  - Good: we can change something at this node without hurting the network’s overall timing.
  - Example: make this node slower, maybe save some power, and not hurt timing.
• Negative slack
  - Bad: there is a problem at this node; the more negative the slack, the bigger the problem.
  - Looking for a node to “fix” to help timing? These nodes are where to look first. They affect the critical paths the most.
How To Compute ATs? Recursively
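AT(n) = maximum delay to n:

$$\mathrm{AT}(n) = \begin{cases} 0, & \text{if } n \text{ is the source} \\ \max\limits_{p \in \mathrm{pred}(n)} \{\mathrm{AT}(p) + \Delta(p,n)\}, & \text{otherwise} \end{cases}$$

[Figure: node n between src and snk. A predecessor p reaches n over an edge with delay ∆(p,n); predecessor paths lead from src to n, successor paths lead from n through a successor s to snk.]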
How To Compute ATs?
• Big idea
  - If we know the longest path to each predecessor of n, a simple “maximum” operation computes the longest path to n itself.
[Figure: node n with predecessors x, y, z: AT(x)=5 with ∆(x,n)=7, AT(y)=10 with ∆(y,n)=1, AT(z)=5 with ∆(z,n)=5.]

$$\mathrm{AT}(n) = \max_{p \in \{x,y,z\}} \{\mathrm{AT}(p) + \Delta(p,n)\} = \max\{5+7,\ 10+1,\ 5+5\} = 12$$
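The single “maximum” step from this example can be written out as below. The dictionary layout and the names `preds`, `at_n`, and `critical_pred` are illustrative choices, not part of the slides.

```python
# Predecessor -> (AT(p), delta(p, n)), taken from the example above.
preds = {"x": (5, 7), "y": (10, 1), "z": (5, 5)}

# AT(n) is the latest of the predecessor arrival times plus edge delays.
at_n = max(at_p + delay for at_p, delay in preds.values())

# The predecessor that attains the maximum lies on the longest path into n.
critical_pred = max(preds, key=lambda p: sum(preds[p]))

print(at_n, critical_pred)
```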
How To Compute RATs?
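• RAT(n): the latest time in the cycle at which n could change and the signal would still propagate to the sink before the end of the cycle.
• First, what is RAT(snk)?

$$\mathrm{RAT}(\mathit{snk}) = \mathit{CycleTime}$$

• How about an internal node n?

$$\mathrm{RAT}(n) = \min_{s \in \mathrm{succ}(n)} \{\mathrm{RAT}(s) - \Delta(n,s)\}$$

[Figure: node n between src and snk. n reaches a successor s over an edge with delay ∆(n,s); predecessor paths lead from src through a predecessor p to n, successor paths lead from n to snk.]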
How To Compute RATs? Recursively
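$$\mathrm{RAT}(n) = \begin{cases} \mathit{CycleTime}, & \text{if } n \text{ is the sink} \\ \min\limits_{s \in \mathrm{succ}(n)} \{\mathrm{RAT}(s) - \Delta(n,s)\}, & \text{otherwise} \end{cases}$$

[Figure: node n between src and snk, with predecessor p, successor s, and edge delay ∆(n,s).]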
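The two recursive definitions translate almost directly into memoized recursion. This is a sketch under stated assumptions: the edge list is an assumed reading of the small src/a..e/snk graph from the earlier slides, and the cycle time of 10 is hypothetical.

```python
from functools import lru_cache

# Assumed edges: (from, to) -> delay. CYCLE_TIME is a made-up value.
EDGES = {
    ("src", "a"): 0, ("src", "b"): 0, ("src", "c"): 0,
    ("a", "d"): 4, ("b", "d"): 4, ("d", "e"): 3, ("c", "e"): 3,
    ("e", "snk"): 0,
}
CYCLE_TIME = 10

@lru_cache(maxsize=None)
def at(n):
    """Arrival time: 0 at the source, else max over predecessors."""
    preds = [(p, w) for (p, s), w in EDGES.items() if s == n]
    if not preds:
        return 0
    return max(at(p) + w for p, w in preds)

@lru_cache(maxsize=None)
def rat(n):
    """Required arrival time: CycleTime at the sink, else min over successors."""
    succs = [(s, w) for (p, s), w in EDGES.items() if p == n]
    if not succs:
        return CYCLE_TIME
    return min(rat(s) - w for s, w in succs)

for node in ("src", "a", "b", "c", "d", "e", "snk"):
    print(node, at(node), rat(node), rat(node) - at(node))
```

The cache is what keeps this linear in the graph size; without it the recursion would revisit shared sub-paths and inherit the path explosion.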
ATs versus RATs: Look at Clock Cycle
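• Why the differences between the AT and RAT definitions?

$$\mathrm{AT}(n) = \begin{cases} 0, & \text{if } n \text{ is the source} \\ \max\limits_{p \in \mathrm{pred}(n)} \{\mathrm{AT}(p) + \Delta(p,n)\}, & \text{otherwise} \end{cases}$$

$$\mathrm{RAT}(n) = \begin{cases} \mathit{CycleTime}, & \text{if } n \text{ is the sink} \\ \min\limits_{s \in \mathrm{succ}(n)} \{\mathrm{RAT}(s) - \Delta(n,s)\}, & \text{otherwise} \end{cases}$$

[Figure: one clock cycle of length CycleTime between the launch edge and the capture edge. AT(n) is measured forward from the launch edge: the longest logic delay after the launch edge of the clock. RAT(n) is measured back from the capture edge: it leaves room for the longest logic delay to the capture edge of the clock.]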
Negative Slack is BAD!
[Figure: one clock cycle of length CycleTime between the launch and capture edges, with AT(n) falling later than RAT(n).]
• Slack = RAT - AT is negative!
• The signal arrives too late, and there is too much delay from the node to the output.
• The signal does not arrive at the flip-flop input before the capture edge of the clock.
Example
• Suppose the clock cycle is 12.
  - AT = longest path from the source TO the node.
  - RAT = (cycle time 12) - (longest path FROM the node to the sink).
  - Slack = RAT - AT
[Figure: example delay graph with nodes src, a through k, and snk. Edge delays shown: 0, 0, 0, 1, 4, 1, 2, 3, 5, 3, 2, 1, 3, 4, 2, 0, 0, 0, 5.]
Compute ATs
[Figure: the example delay graph (nodes src, a through k, snk), with ATs computed from src to snk. AT values shown, in figure order: 0, 0, 0, 0, 1, 2, 6, 4, 10, 7, 12, 15, 15.]
Compute RATs
• The clock cycle is 12.
[Figure: the example delay graph (nodes src, a through k, snk), with RATs computed from snk to src. RAT values shown, in figure order: -3, -3, -1, 2, -2, 4, 3, 10, 7, 12, 12, 12, 12.]
Compute Slack
• Slack = RAT - AT
[Figure: the example delay graph annotated with RAT, AT, and slack at each node. Values in figure order:
  RAT:    -3  -3  -1   2  -2   4   3  10   7  12  12  12  12
  AT:      0   0   0   0   1   2   6   4  10   7  12  15  15
  Slack:  -3  -3  -1   2  -3   2  -3   6  -3   5   0  -3  -3]
Analyzing the Example
• The worst (most negative) slack is -3.
• Big results:
  - The timing violation at the sink equals the worst slack value.
  - The worst slack appears along the entire worst path.
[Figure: the example delay graph annotated with RAT, AT, and slack at each node; the nodes with slack -3 trace the worst path.]
Analyzing the Example
• Look at those slacks:
  - A negative slack at an output (PO) means a failed timing requirement.
  - A negative slack at an internal node n means there is a path from n to some problem PO.
• So, slacks are hugely useful!
  - Beyond telling us the worst path, slacks identify the problem gates on that path.
The Most Typical STA Problem
• Answer this question: what are all the too-slow paths that violate timing?
• Most useful report:
  - Report paths in order, from slowest to fastest.
  - In other words: enumerate these paths in delay order.
[Figure: flip-flops → logic → flip-flops, with a clock.]
What Do We Need?
• Calculate all the ATs.
• Calculate all the RATs.
• Calculate all the slacks.
• ... do all of this very efficiently: delay graphs are huge!
• ... enumerate the violating paths, in worst-delay order.
[Figure: the example delay graph annotated with RAT, AT, and slack at each node.]
Computational Strategy
• Topologically sort (“topsort”) the delay graph.
  - Sort the vertices of the delay graph into a single ordered list.
  - Essential property: if there is an edge from p to s, then p appears before s in the sorted order.
• Compute ATs by going forward through the sorted list.
• Compute RATs by going backward through the sorted list.
[Figure: delay graph with edges a→b (3), a→c (4), b→d (5), b→e (11), c→e (9), d→f (6), e→f (15).]
Legal topsort orders:
  a, b, c, d, e, f
  a, b, d, c, e, f
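One common way to produce such an order is Kahn's algorithm: repeatedly emit a node with no unprocessed predecessors. The slides do not name a specific topsort algorithm, so treat this as one possible sketch; the edge list is the one implied by the AT values and path delays on the later slides.

```python
from collections import deque

EDGES = {
    ("a", "b"): 3, ("a", "c"): 4, ("b", "d"): 5, ("b", "e"): 11,
    ("c", "e"): 9, ("d", "f"): 6, ("e", "f"): 15,
}

def topsort(edges):
    """Kahn's algorithm: emit nodes whose predecessors have all been emitted."""
    nodes = sorted({n for edge in edges for n in edge})
    indegree = {n: 0 for n in nodes}
    succ = {n: [] for n in nodes}
    for p, s in edges:
        succ[p].append(s)
        indegree[s] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for s in succ[n]:
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    if len(order) != len(nodes):
        raise ValueError("graph has a cycle; no topological order exists")
    return order

order = topsort(EDGES)
print(order)  # ['a', 'b', 'c', 'd', 'e', 'f']
```

Any order that satisfies the edge property is equally good for the AT and RAT passes; this run happens to produce the first of the two legal orders listed above.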
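Assume We Have a Topsort: Compute ATs
computeATs() {
  // The source has no predecessors, so its arrival time is fixed at 0.
  AT(SRC) = 0;
  foreach ( n in topsort order, skipping SRC ) {
    AT(n) = -∞;
    // Keep the latest arrival over all predecessors of n.
    foreach ( node p in pred(n) )
      AT(n) = max( AT(n), AT(p) + ∆(p,n) );
  }
}
[Figure: node n between src and snk, with predecessor p, successor s, and edge delay ∆(p,n).]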
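A runnable counterpart to the pseudocode, as a sketch. It uses Python's standard `graphlib.TopologicalSorter` for the ordering and the a-to-f example graph (edge list inferred from the AT values and path delays on the later slides); `compute_ats` is a hypothetical name.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

EDGES = {
    ("a", "b"): 3, ("a", "c"): 4, ("b", "d"): 5, ("b", "e"): 11,
    ("c", "e"): 9, ("d", "f"): 6, ("e", "f"): 15,
}

def compute_ats(edges, source):
    """Forward pass: AT(n) = max over predecessors of AT(p) + delay(p, n)."""
    preds = {}
    for (p, s), w in edges.items():
        preds.setdefault(p, [])
        preds.setdefault(s, []).append((p, w))
    # TopologicalSorter expects node -> iterable of predecessors.
    sorter = TopologicalSorter({n: [p for p, _ in ps] for n, ps in preds.items()})
    at = {}
    for n in sorter.static_order():
        if n == source:
            at[n] = 0
        else:
            at[n] = max((at[p] + w for p, w in preds[n]), default=float("-inf"))
    return at

ats = compute_ats(EDGES, "a")
print(ats)
```

Each edge is examined once, so the pass is linear in the size of the delay graph.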
Compute RATs
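• Trick: pretend all edges are reversed, so that they point from SNK to SRC, and walk the graph backwards.
computeRATs() {
  // The sink must be stable by the end of the cycle.
  RAT(SNK) = CycleTime;
  foreach ( n in reverse topsort order, skipping SNK ) {
    RAT(n) = ∞;
    // Keep the earliest requirement over all successors of n.
    foreach ( successor s in succ(n) )
      RAT(n) = min( RAT(n), RAT(s) - ∆(n,s) );
  }
}
[Figure: node n between src and snk, with predecessor p, successor s, and edge delay ∆(n,s).]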
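The backward pass can be sketched the same way. Passing successors where `graphlib.TopologicalSorter` expects predecessors yields a reverse topological order, which is the “pretend all edges are reversed” trick. The graph and the cycle time of 29 are those of the a-to-f example; `compute_rats` is a hypothetical name.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

EDGES = {
    ("a", "b"): 3, ("a", "c"): 4, ("b", "d"): 5, ("b", "e"): 11,
    ("c", "e"): 9, ("d", "f"): 6, ("e", "f"): 15,
}

def compute_rats(edges, sink, cycle_time):
    """Backward pass: RAT(n) = min over successors of RAT(s) - delay(n, s)."""
    succs = {}
    for (p, s), w in edges.items():
        succs.setdefault(s, [])
        succs.setdefault(p, []).append((s, w))
    # Successors play the role of predecessors, so the order runs sink-first.
    sorter = TopologicalSorter({n: [s for s, _ in ss] for n, ss in succs.items()})
    rat = {}
    for n in sorter.static_order():
        if n == sink:
            rat[n] = cycle_time
        else:
            rat[n] = min((rat[s] - w for s, w in succs[n]), default=float("inf"))
    return rat

rats = compute_rats(EDGES, "f", 29)
print(rats)
```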
Using Slack For Path Reporting
• Useful slack property: all nodes on the longest path have the same, worst slack value.
• Surprising result: slack lets us find the N worst paths, even though we did not trace them all.
[Figure: the delay graph with edges a→b (3), a→c (4), b→d (5), b→e (11), c→e (9), d→f (6), e→f (15), annotated as follows. Assume clock cycle = 29.
  Node   AT   RAT   Slack
  a       0     0       0
  b       3     3       0
  c       4     5       1
  d       8    23      15
  e      14    14       0
  f      29    29       0]
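As a quick check of the slack property, the sketch below starts from the AT and RAT values annotated on this slide and picks out the nodes that share the worst slack. The variable names are illustrative.

```python
# AT and RAT values as annotated in the figure (clock cycle = 29).
AT = {"a": 0, "b": 3, "c": 4, "d": 8, "e": 14, "f": 29}
RAT = {"a": 0, "b": 3, "c": 5, "d": 23, "e": 14, "f": 29}

slack = {n: RAT[n] - AT[n] for n in AT}
worst = min(slack.values())

# Nodes carrying the worst slack; here they spell out the longest path.
critical_nodes = [n for n in AT if slack[n] == worst]

print(slack)
print(worst, critical_nodes)
```

The nodes with slack 0 are a, b, e, f, which is exactly the longest path a-b-e-f with delay 29.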
N-Worst Path Reporting
• We evolve partial paths; each partial path stores three things: (the path itself, the delay of this path, the slack of the final node on the path).
• We store the partial paths in a min-heap, keyed on the slack value.
• Initially the heap contains only the source node.
• The algorithm is quite simple (and just like maze routing!).
  - Expand: pop a partial path off the heap; it has the smallest (most negative) slack.
  - Reach target? If its end node is the sink, print out the path.
  - Reach: otherwise, extend the path with each successor node to make new partial paths, and push them onto the heap, each labeled with (Path, Delay, Slack).
  - Repeat (pop the next partial path) until N paths are reported or the heap is empty.
Worst Case Path Reporting: Example
• Each min-heap entry has the form (Path, Delay, Slack).
• Initially, the heap contains only the source node.
Min heap before: (a,0,0)
Expand path a, reach b and c.
Min heap after: (a-b,3,0), (a-c,4,1)
[Figure: the a-to-f delay graph, source a and sink f, with node slacks a=0, b=0, c=1, d=15, e=0, f=0.]
Worst Case Path Reporting: Example
Min heap before: (a-b,3,0), (a-c,4,1)
Expand path a-b, reach d and e.
Min heap after: (a-b-e,14,0), (a-c,4,1), (a-b-d,8,15)
Worst Case Path Reporting: Example
Min heap before: (a-b-e,14,0), (a-c,4,1), (a-b-d,8,15)
Expand path a-b-e, reach f.
f is the sink! Report the 1st worst path, a-b-e-f, with delay = 29.
Min heap after: (a-c,4,1), (a-b-d,8,15)
Worst Case Path Reporting: Example
Min heap before: (a-c,4,1), (a-b-d,8,15)
Expand path a-c, reach e.
Min heap after: (a-c-e,13,0), (a-b-d,8,15)
Worst Case Path Reporting: Example
Min heap before: (a-c-e,13,0), (a-b-d,8,15)
Expand path a-c-e, reach f.
f is the sink! Report the 2nd worst path, a-c-e-f, with delay = 28.
Min heap after: (a-b-d,8,15)
Worst Case Path Reporting: Example
Min heap before: (a-b-d,8,15)
Expand path a-b-d, reach f.
f is the sink! Report the 3rd worst path, a-b-d-f, with delay = 14.
Min heap after: (EMPTY)
Done!
Worst Case Path Reporting: Example
[Figure: the a-to-f delay graph, source a and sink f, with node slacks a=0, b=0, c=1, d=15, e=0, f=0.]
• We find three paths:
  - a-b-e-f, delay = 29
  - a-c-e-f, delay = 28
  - a-b-d-f, delay = 14
• Note: there are only 3 possible paths from source to sink in this graph, so we found all of them, correctly, in delay order!
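The walkthrough above can be reproduced with a short sketch built on Python's `heapq`. It follows the procedure as stated on the slides: each heap entry is keyed on the slack of the path's final node. The insertion counter used for tie-breaking and the name `n_worst_paths` are my additions.

```python
import heapq
from itertools import count

EDGES = {
    ("a", "b"): 3, ("a", "c"): 4, ("b", "d"): 5, ("b", "e"): 11,
    ("c", "e"): 9, ("d", "f"): 6, ("e", "f"): 15,
}
# Node slacks from the example (clock cycle = 29).
SLACK = {"a": 0, "b": 0, "c": 1, "d": 15, "e": 0, "f": 0}

def n_worst_paths(edges, slack, source, sink, n):
    """Pop the partial path with the smallest slack key, extend it, repeat."""
    succ = {}
    for (p, s), w in edges.items():
        succ.setdefault(p, []).append((s, w))
    tie = count()  # insertion counter: equal-slack entries pop in FIFO order
    heap = [(slack[source], next(tie), 0, (source,))]
    reported = []
    while heap and len(reported) < n:
        _, _, delay, path = heapq.heappop(heap)
        end = path[-1]
        if end == sink:
            reported.append(("-".join(path), delay))
            continue
        for s, w in succ.get(end, []):
            heapq.heappush(heap, (slack[s], next(tie), delay + w, path + (s,)))
    return reported

paths = n_worst_paths(EDGES, SLACK, "a", "f", 3)
print(paths)
```

One caveat about the design choice: keying on the end node's slack works for this example, but a non-critical partial path that joins a critical node inherits that node's small slack. A variant keyed on RAT(end) minus the partial path's own delay avoids this; that variant is my suggestion, not something the slides state.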
Static Timing Analysis: Summary
• STA is a very important step in the design of complex ASICs.
  - It’s a critical “sign-off” step, which means you don’t get to fabricate unless you pass.
• Several big ideas:
  - Gate-level delay models matter, and can be pretty complex in the real world.
  - Logical ≠ topological path analysis (STA is the topological kind).
  - Build the delay graph; calculate ATs, RATs, and slacks recursively.
  - The concept of slack is big: it lets us locate the worst paths and the problem gates on them.
  - An idea similar to maze routing lets us find the worst paths in delay order.
Static Timing Analysis: Aside
• STA is a huge topic; there are several things we did not cover.
• STA for sequential elements
  - How do we model flip-flops and latches so that we can verify, e.g., that setup and hold times are met? More tricks with the delay graph.
• Early-mode versus late-mode timing
  - Our development covered only so-called late-mode timing, where we care about the longest path. Early mode focuses on shortest paths, and is critical for more advanced timing, e.g., with transparent latches.
• Incremental STA
  - In practice, if you change 10,000 gates out of 1,000,000, you don’t want to redo the whole STA. Advanced methods can update the analysis incrementally.
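To illustrate only the late-mode versus early-mode distinction, the sketch below runs the same forward pass twice on the a-to-f example graph, once with `max` (longest path) and once with `min` (shortest path). It is not a model of setup/hold checking; the function name `arrival_times` and its `pick` parameter are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

EDGES = {
    ("a", "b"): 3, ("a", "c"): 4, ("b", "d"): 5, ("b", "e"): 11,
    ("c", "e"): 9, ("d", "f"): 6, ("e", "f"): 15,
}

def arrival_times(edges, source, pick):
    """pick=max gives late-mode ATs (longest path); pick=min gives early-mode ATs."""
    preds = {}
    for (p, s), w in edges.items():
        preds.setdefault(p, [])
        preds.setdefault(s, []).append((p, w))
    sorter = TopologicalSorter({n: [p for p, _ in ps] for n, ps in preds.items()})
    at = {}
    for n in sorter.static_order():
        # Assumes the source is the only node without predecessors.
        at[n] = 0 if n == source else pick(at[p] + w for p, w in preds[n])
    return at

late = arrival_times(EDGES, "a", max)
early = arrival_times(EDGES, "a", min)
print(late["f"], early["f"])
```

Here the signal at f can settle as late as 29 but can start changing as early as 14, which is the kind of gap early-mode analysis is concerned with.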
STA STATIC TIMING ANALYSIS AND ITS USE HOW IT WORKS

  • 2. ASIC Timing: Role of CAD Tools 2  ASIC timing has deep interactions with logic and layout synthesis. Logic Synthesis Layout Synthesis Connected cells with delay constraints on signal paths Placed cells with real locations, real connecting wires High-level description + Timing Specifications
  • 3. ASIC Timing: Role of CAD Tools 3  Requirement on timing analysis  Logic-side tools must estimate delays through unplaced/unrouted logic.  Layout tools must estimate delays through placed/routed logic. Logic Synthesis Layout Synthesis
  • 4. Our Topics for ASIC Timing 4  Logic-side: StaticTiming Analysis  How do we estimate the worst-case timing through a logic network?  Turns out to be longest paths through a graph, which properly models the gates and wires.  Layout-side: Interconnect Delay Analysis  We place the gates, route the wires.Then, how do we estimate wire delays?  The problem is built up on electrical circuit model.We will show key results.
  • 5. Timing Analysis at the Logic Level 5  Goal:Verify timing behavior of our logic design  Input:  A gate-level netlist.  Timing models of the gates and/or wires.  Output:  Signal arrival time at various points in the network.  Longest delays through gate network.  Does the netlist satisfy the timing requirement? If not, where are key problems?  This is surprisingly complicated in the real world...
  • 6. Analyzing Design Performance 6  Assume design is synchronous.  All storage is in explicit sequential elements, e.g., flip-flop elements.  Consequence: we can just focus on delays through combinational gates. Flip Flops Flip Flops Combinational Logic (No feedback loops) Clock Launch Capture
  • 7. Question: Can’t We Just Simulate Logic? 7  What logic simulation does?  Determines how a system will behave by simulating the logical function.  Gives the most accurate answer with good simulation models.  … but it is (practically) impossible to give a complete answer – especially timing.  Requires examination of an exponential number of cases.  All possible input vectors …  With all possible relative timings …  Under all possible manufacturing variations …  We need a different, faster solution...
  • 8. Timing Analysis: Basic Model 8  Assume we know clock cycle  E.g., 1GHz clock, cycle = 1ns.  For logic to work correctly, longest delay through network must be shorter than the clock cycle. Flip Flops Flip Flops Combinational Logic Clock 1ns Longest delay < Clock cycle
  • 9. Timing Analysis: Gate Delay Models 9  First: we need a model of delay through each logic gate.  Delay of a single gate: ∆ What’s gate delay ∆? ∆ 1 X Y Y X ∆
  • 11. 11 [Courtesy: UC Berkeley]
  • 12. 12 [Courtesy: UC Berkeley]
  • 13. 13 [Courtesy: UC Berkeley]
  • 14. In Reality: Gate Delay is Very Complex 14  Gate type affects delay  Waveform shape affects delay ∆ ∆ ≠  Gate loading affects delay  Transition direction affects delay ∆ ≠ ∆ ∆ ∆ ∆ ∆ ≠ ≠
  • 15. In Reality: Gate Delay is Very Complex 15  Gate input pin affects delay  Why?At transistor level, inputs are not symmetric.  At nanoscale, delays are even statistical  Why? Depends on process, voltage, and thermal (PVT) variations. ∆ ∆ ≠ ∆ ∆ PDF 200 240 280 ∆
  • 16. Our Model: Pin-to-Pin Delay 16  In our lecture, we keep it simple: Fixed, pin-to-pin delay model  No slopes, transition direction, distributions. Loading effects “pushed” into gate delay itself.  Per-pin delays are essential, but we will use just 1 value per gate, for simplicity.  Turns out this is enough to see all the interesting algorithm ideas. ∆=3 ∆=3 ∆=5 ∆=5
  • 17. Do We Consider Logical Function? 17  Does logic function matter?  Try an example, where we “erase” gates.  In this example: PI = Primary Input, PO = Primary Output What is the longest delay? 20 ∆=8 ∆=2 ∆=1 ∆=8 ∆=1 ∆=2 ∆=1 PI PI PI PO
  • 18. Now, Suppose We Know Logic Gates 18  We cannot sensitize this path: cannot make a logic change at this input propagate down this path to change this output. ∆=8 ∆=1 ∆=8 ∆=1 0 1 ∆=2 0 1 ∆=2 PO PI PI PI Can we indeed have the longest path? No!
  • 19. Topological vs. Logical Timing Analysis 19  When we ignore logic, this is called Topological Analysis.  We only work with graph and delays, don’t consider logic.  We can get wrong answers: what we found was called a False Path.  Going forward: we ignore logic (Too tough to deal with)  Assume that all paths are statically sensitizable.  Means: Can find a constant pattern of inputs to other PIs that makes some output sensitive to some input.  Reminder: this is exactly the Boolean Difference concept of sensitivity.  This timing analysis has a name: StaticTiming Analysis (STA).
  • 20. STA Representation: Delay Graph 20  From gate-level network, we build a delay graph.  Vertices: Wires in gate network, one per gate output, also one for each PI and PO.  Edges: Input pin to output pin of gate in network (one edge per input pin). Put gate delays on edges. ∆=4 ∆=4 PI PI c PI PO ∆=3 ∆=3 e a b d a b c d e 4 4 3 3
  • 21. Delay Graph 21  Common convention:Add Source/Sink nodes  Add one “source” (src) node that has a 0-weight edge to each PI.  Add one “sink” (snk) node that has a 0-weight edge from each PO.  Why do this?  Now, the network has exactly 1 “entry” node, and 1 “exit” node.  All the longest (or shortest) path question have same start/end nodes. a b c d e 4 4 3 3 snk src 0 0 0 0
  • 22. Representation: Delay Graph 22  What about interconnect delay?  Can still use delay graph: model each wire as a “special” gate that just has a delay. ∆=4 ∆=4 PI PI c PI PO ∆=3 ∆=3 q a b d e w z x y ∆=1 ∆=2 ∆=2 ∆=1 ∆=2 a b c d e 4 4 3 3 snk src 0 0 0 0 x y w z q 1 2 2 2 1
  • 23. Operations on Delay Graph 23  So how do we use delay graph to do timing analysis?  What we don’t do:Try to enumerate all the source-to-sink paths.  Why not? Exponential explosion in number of paths, even for small graph.  There’s a smarter answer: Node-oriented timing analysis  Find, for each node in delay graph, worst delay to the node along any path. 0 1 2 n … How many paths from 0 to n? 2𝑛
  • 24. Define Values on Nodes in Delay Graph 24  ArrivalTime at a node (AT)  AT(n) = Latest time the signal can become stable node n  Think: Longest path from source  Required ArrivalTime at node (RAT)  RAT(n) =Latest time the signal is allowed to become stable at node n  Think: Longest path to sink snk src n Other paths AT RAT
  • 25. Define Values on Nodes in Delay Graph 25  Slack at node n: Slack(n) = RAT(n) –AT(n)  Amount of timing “margin” for the signal: positive is good, negative is bad.  Determined by longest path through node.  Amount by which a signal can be delayed at node and not increase the longest path through the network  Can increase delay at node (to minimize power, circuit area) with positive slack and not degrade overall performance. snk src n Other paths AT RAT Slack(n) = RAT(n) –AT(n)
  • 26. Slack is Hugely Important in Timing Analysis 26  About slacks  Defined so negative slack always bad: it indicates a timing problem.  Measures “sensitivity” of network to this node’s delay.  Positive slack  Good: can change something at this node, and not hurt network’s overall timing.  Example: make this node slower, maybe save some power, not hurt timing.  Negative slack  Bad: have problem at this node; more negative the slack, bigger the problem.  Looking for a node to “fix” to help timing?These nodes are where to look first.These affect the critical paths the most.
  • 27. How To Compute ATs? Recursively 27 AT(n) = maximum delay to n = 0, if n is source max {AT(p)+∆(p,n)}, else p ∈ prec(n) snk src n * * p * * s … … predecessor paths successor paths predecessor successor ∆(p,n)
  • 28. How To Compute ATs? 28  Big idea  If we know the longest path to each predecessor of n, it’s a simple “Maximum” operation to compute the longest path to n itself. src n x z y ∆=7 ∆=1 ∆=5 AT(x)=5 AT(y)=10 AT(z)=5 AT(n) = max {AT(p)+∆(p,n)} p ∈ {x,y,z} = max {5+7, 10+1, 5+5} =12
  • 29. How To Compute RATs? 29  RAT(n): Latest time in cycle where n could change and signal would still propagate to sink before end of cycle.  First, what is RAT(snk)?  How about internal node n? snk src n * * p * * s … … predecessor paths successor paths predecessor successor ∆(n,s) RAT(n) = min {RAT(s)−∆(n,s)} s ∈ succ(n) RAT(snk) = CycleTime
  • 30. How To Compute RATs? Recursively 30 snk src n * * p * * s … … predecessor paths successor paths predecessor successor ∆(n,s) RAT(n) = CycleTime, if n is sink min {RAT(s)−∆(n,s)}, else s ∈ succ(n)
  • 31. ATs versus RATs: Look at Clock Cycle 31  Why the differences betweenAT and RAT definitions? AT(n) = 0, if n is source max {AT(p)+∆(p,n)}, else p ∈ prec(n) RAT(n) = CycleTime, if n is sink min {RAT(s)−∆(n,s)}, else s ∈ succ(n) AT(n) Launch Capture Clock CycleTime AT: longest logic delay after launch edge of clock. RAT: longest logic delay to the capture edge of clock RAT(n) longest
  • 32. Negative Slack is BAD! 32 AT(n) Launch Capture Clock CycleTime RAT(n) Slack = RAT –AT is Negative! Signal arrives too late, and there is too much delay from node to output. Signal does not arrive at flip flop input before the capture edge of clock.
  • 33. Example 33  Suppose clock cycle is 12.  AT=longest path from source TO node.  RAT=(cycle time 12) – (longest path FROM node to sink).  Slack = RAT – AT src a c b snk d e f g h i j k 0 0 0 1 4 1 2 3 5 3 2 1 3 4 2 0 0 0 5
  • 34. Compute ATs 34 src a c b snk d e f g h i j k 0 0 0 1 4 1 2 3 5 3 2 1 3 4 2 0 0 0 5 ComputeATs from src to snk 0 0 0 0 1 2 6 4 10 7 12 15 15
  • 35. Compute RATs 35  Clock cycle is 12. src a c b snk d e f g h i j k 0 0 0 1 4 1 2 3 5 3 2 1 3 4 2 0 0 0 5 Compute RATs from snk to src -3 -3 -1 2 -2 4 3 10 7 12 12 12 12
  • 36. Compute Slack 36  Slack = RAT - AT src a c b snk d e f g h i j k 0 0 0 1 4 1 2 3 5 3 2 1 3 4 2 0 0 0 5 -3 -3 -1 2 -2 4 3 10 7 12 12 12 12 0 0 0 0 1 2 6 4 10 7 12 15 15 -3 -3 -1 2 -3 2 -3 6 -3 5 0 -3 -3
  • 37. Analyzing the Example 37  Worst (most negative) slack is -3.  Big results:  Your timing violation at sink = the worst slack value.  The worst slack appears along this entire worst path. src a c b snk d e f g h i j k 0 0 0 1 4 1 2 3 5 3 2 1 3 4 2 0 0 0 5 -3 -3 -1 2 -2 4 3 10 7 12 12 12 12 0 0 0 0 1 2 6 4 10 7 12 15 15 -3 -3 -1 2 -3 2 -3 6 -3 5 0 -3 -3
  • 38. Analyzing the Example 38  Look at those slacks  A negative slack at an output (PO) means a failed timing requirement.  A negative slack on internal node n means there is a path from n to some problem PO.  So, slacks are hugely useful!  Beyond just knowing what is the worst path, slacks tell us the problem gates on this path.
  • 39. The Most Typical STA Problem 39  Answer this problem:What are all the too-slow paths that violate timing?  Most useful report:  Report paths in order, from slowest to fastest.  In other words: Enumerate these paths, in delay order. Flip Flops Flip Flops Logic Clock
  • 40. What Do We Need? 40  Calculate all the ATs.  Calculate all the RATs.  Calculate all the Slacks.  … do all of this very efficiently: Delay graphs are huge!  …enumerate the violating paths, in worst delay order. src a c b snk d e f g h i j k 0 0 0 1 4 1 2 3 5 3 2 1 3 4 2 0 0 0 5 -3 -3 -1 2 -2 4 3 10 7 12 12 12 12 0 0 0 0 1 2 6 4 10 7 12 15 15 -3 -3 -1 2 -3 2 -3 6 -3 5 0 -3 -3
  • 41. Computational Strategy
      • Topological sorting (“topsorting”) the delay graph.
          • Sort the vertices in the delay graph into one single ordered list.
          • Essential property: if there is an edge from p to s, then p appears before s in the sorted order.
      • Compute ATs by going forward through the sorted list.
      • Compute RATs by going backward through the sorted list.
      [Figure: delay graph with nodes a to f and edge delays 3, 4, 5, 11, 9, 6, 15.]
      • Legal topsort orders for this graph include: a, b, c, d, e, f and a, b, d, c, e, f.
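The topsort step can be sketched in Python using Kahn's algorithm, which is one common way to produce such an order (the slides do not say which method they assume). The edge list below is an assumption: it is inferred from the path delays and AT values the later slides report for the a-to-f graph. The names `EDGES` and `topsort` are illustrative only.

```python
from collections import deque

# Assumed edges of the a..f example graph (inferred from the reported
# path delays: a-b-e-f = 29, a-c-e-f = 28, a-b-d-f = 14).
EDGES = {
    ("a", "b"): 3, ("a", "c"): 4,
    ("b", "d"): 5, ("b", "e"): 11,
    ("c", "e"): 9,
    ("d", "f"): 6, ("e", "f"): 15,
}

def topsort(edges):
    """Return the vertices in one legal topological order (Kahn's algorithm)."""
    succ, indegree = {}, {}
    for p, s in edges:
        succ.setdefault(p, []).append(s)
        succ.setdefault(s, [])
        indegree[s] = indegree.get(s, 0) + 1
        indegree.setdefault(p, 0)
    # Start from every vertex that has no incoming edge.
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for s in succ[n]:
            indegree[s] -= 1
            if indegree[s] == 0:  # all predecessors of s are already placed
                ready.append(s)
    if len(order) != len(indegree):
        raise ValueError("delay graph contains a cycle")
    return order

order = topsort(EDGES)
print(order)
```

Any order that places every predecessor before its successors works equally well for the AT and RAT passes; this sketch simply returns one of them.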
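  • 42. Assume We Have a Topsort: Compute ATs
      computeATs() {
          AT(SRC) = 0;
          // SRC has no predecessors, so skip it to keep AT(SRC) = 0.
          foreach ( node n in topsort order, n ≠ SRC ) {
              AT(n) = -∞;
              // Longest arrival over all incoming edges.
              foreach ( node p in pred(n) )
                  AT(n) = max( AT(n), AT(p) + ∆(p,n) );
          }
      }
      [Figure: node n between SRC and SNK, with predecessors p and successors s; the edge from p to n has delay ∆(p,n).]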
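As a minimal runnable sketch of the forward pass, the Python below applies the same recurrence to the a-to-f example graph. The edge list is an assumption inferred from the delays reported on the later slides, and `compute_ats`, `EDGES`, and `ORDER` are illustrative names, not part of the original slides.

```python
import math

# Assumed edges of the a..f example graph (inferred from the slides' path delays).
EDGES = {
    ("a", "b"): 3, ("a", "c"): 4,
    ("b", "d"): 5, ("b", "e"): 11,
    ("c", "e"): 9,
    ("d", "f"): 6, ("e", "f"): 15,
}
ORDER = ["a", "b", "c", "d", "e", "f"]  # one legal topsort order

def compute_ats(edges, order, src):
    """Arrival time of each node = longest path delay from src to that node."""
    pred = {n: [] for n in order}
    for (p, s), delay in edges.items():
        pred[s].append((p, delay))
    at = {src: 0}
    for n in order:
        if n == src:
            continue  # the source keeps AT = 0
        at[n] = -math.inf
        for p, delay in pred[n]:
            # p precedes n in topsort order, so at[p] is already final.
            at[n] = max(at[n], at[p] + delay)
    return at

at = compute_ats(EDGES, ORDER, "a")
print(at)
```

Each edge is visited once, so the pass is linear in the size of the delay graph.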
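  • 43. Compute RATs
      • Trick: pretend all edges are reversed, so that they point from SNK to SRC, and walk the graph backwards.
      computeRATs() {
          RAT(SNK) = CycleTime;
          // SNK has no successors, so skip it to keep RAT(SNK) = CycleTime.
          foreach ( node n in reverse topsort order, n ≠ SNK ) {
              RAT(n) = ∞;
              // Tightest requirement over all outgoing edges.
              foreach ( node s in succ(n) )
                  RAT(n) = min( RAT(n), RAT(s) - ∆(n,s) );
          }
      }
      [Figure: node n between SRC and SNK, with predecessors p and successors s; the edge from n to s has delay ∆(n,s).]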
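The backward pass can be sketched the same way. This Python version assumes the a-to-f example graph (edges inferred from the delays reported on the later slides) and a clock cycle of 29; `compute_rats`, `EDGES`, and `ORDER` are illustrative names.

```python
import math

# Assumed edges of the a..f example graph (inferred from the slides' path delays).
EDGES = {
    ("a", "b"): 3, ("a", "c"): 4,
    ("b", "d"): 5, ("b", "e"): 11,
    ("c", "e"): 9,
    ("d", "f"): 6, ("e", "f"): 15,
}
ORDER = ["a", "b", "c", "d", "e", "f"]  # one legal topsort order

def compute_rats(edges, order, snk, cycle_time):
    """Required arrival time = cycle time minus longest path from node to snk."""
    succ = {n: [] for n in order}
    for (p, s), delay in edges.items():
        succ[p].append((s, delay))
    rat = {snk: cycle_time}
    for n in reversed(order):
        if n == snk:
            continue  # the sink keeps RAT = cycle time
        rat[n] = math.inf
        for s, delay in succ[n]:
            # s follows n in topsort order, so rat[s] is already final.
            rat[n] = min(rat[n], rat[s] - delay)
    return rat

rat = compute_rats(EDGES, ORDER, "f", 29)
print(rat)
```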
  • 44. Using Slack For Path Reporting
      • Useful slack property: all nodes on the longest path have the same worst slack value.
      • Surprising result: slack lets us find the N worst paths, even though we did not trace them all.
      [Figure: delay graph with nodes a to f and edge delays 3, 4, 5, 11, 9, 6, 15, assuming clock cycle = 29. Node labels (AT, RAT, Slack): (0, 0, 0), (3, 3, 0), (8, 23, 15), (4, 5, 1), (14, 14, 0), (29, 29, 0).]
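The slack property can be checked on the example numbers with a few lines of Python. The assignment of each (AT, RAT) label to a node is an assumption: it is inferred from the edge delays and the path delays reported on the later slides.

```python
# AT and RAT per node for the a..f example (clock cycle = 29).
# The node-to-label mapping is inferred, not stated in the slide text.
AT = {"a": 0, "b": 3, "c": 4, "d": 8, "e": 14, "f": 29}
RAT = {"a": 0, "b": 3, "c": 5, "d": 23, "e": 14, "f": 29}

# Slack = RAT - AT at every node.
slack = {n: RAT[n] - AT[n] for n in AT}

# Nodes that carry the worst (smallest) slack.
worst = min(slack.values())
critical = [n for n in AT if slack[n] == worst]

print(slack)
print(critical)
```

Under that mapping, the nodes sharing the worst slack are exactly those on a-b-e-f, the path the later slides report as the longest (delay 29).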
  • 45. N-Worst Path Reporting
      • We evolve partial paths; each partial path stores 3 things: (path itself, delay of this path, slack of the final node on the path).
      • We store the partial paths in a min heap, keyed on the slack value.
      • Initially this heap contains only the source node.
      • The algorithm is quite simple (and just like maze routing!):
          • Expand: pop a partial path off the heap; it has the smallest (most negative) slack.
          • Reach target? If its end node is the sink, print out the path.
          • Reach: otherwise, extend the path with each successor node to make new partial paths, and push them onto the heap, each labeled with (path, delay, slack).
          • Repeat, popping the next partial path, until N paths are reported.
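A minimal Python sketch of this heap-driven enumeration follows, using the standard `heapq` module on the a-to-f example graph (edges inferred from the reported path delays, clock cycle 29). One deliberate variation, stated plainly: the heap key here is the slack of the partial path itself, RAT(end) - delay, rather than the stored slack of the end node. The two agree along the worst path, but a node's slack describes the worst path through that node, which need not be the partial path being extended. The function and variable names are illustrative.

```python
import heapq
import math

# Assumed edges of the a..f example graph (inferred from the slides' path delays).
EDGES = {
    ("a", "b"): 3, ("a", "c"): 4,
    ("b", "d"): 5, ("b", "e"): 11,
    ("c", "e"): 9,
    ("d", "f"): 6, ("e", "f"): 15,
}
ORDER = ["a", "b", "c", "d", "e", "f"]  # one legal topsort order

def n_worst_paths(edges, order, src, snk, cycle_time, n_paths):
    """Report up to n_paths source-to-sink paths, longest delay first."""
    succ = {n: [] for n in order}
    for (p, s), delay in edges.items():
        succ[p].append((s, delay))

    # Backward pass: RAT(n) = min over successors of RAT(s) - delay(n, s).
    rat = {snk: cycle_time}
    for n in reversed(order):
        if n != snk:
            rat[n] = min((rat[s] - d for s, d in succ[n]), default=math.inf)

    # Heap entries are (slack of this partial path, delay so far, path).
    heap = [(rat[src], 0, (src,))]
    reported = []
    while heap and len(reported) < n_paths:
        _, delay, path = heapq.heappop(heap)  # most critical partial path
        end = path[-1]
        if end == snk:
            reported.append((path, delay))
            continue
        for s, d in succ[end]:
            new_delay = delay + d
            heapq.heappush(heap, (rat[s] - new_delay, new_delay, path + (s,)))
    return reported

paths = n_worst_paths(EDGES, ORDER, "a", "f", 29, 3)
for path, delay in paths:
    print("-".join(path), delay)
```

Because of the path-slack key, some intermediate heap labels can differ from a node-slack labeling (for example, a-c-e carries slack 1 here), while the reported paths and their order are a-b-e-f, a-c-e-f, a-b-d-f.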
  • 46. Worst Case Path Reporting: Example
      • Min heap entries have the form (Path, Delay, Slack).
      • Initially, the heap contains only the source node.
      • Expand path a, reach b & c.
          • Min heap before: (a,0,0)
          • Min heap after: (a-b,3,0) (a-c,4,1)
      [Figure: delay graph with source a, sink f, edge delays 3, 4, 5, 11, 9, 6, 15, and node slacks 0, 15, 0, 1, 0, 0.]
  • 47. Worst Case Path Reporting: Example
      • Expand path a-b, reach d & e.
          • Min heap before: (a-b,3,0) (a-c,4,1)
          • Min heap after: (a-b-e,14,0) (a-c,4,1) (a-b-d,8,15)
      [Figure: delay graph with source a, sink f, edge delays 3, 4, 5, 11, 9, 6, 15, and node slacks 0, 15, 0, 1, 0, 0.]
  • 48. Worst Case Path Reporting: Example
      • Expand path a-b-e, reach f.
          • Min heap before: (a-b-e,14,0) (a-c,4,1) (a-b-d,8,15)
          • Min heap after: (a-c,4,1) (a-b-d,8,15)
      • f is the sink! Report the 1st worst path a-b-e-f, with delay = 29.
      [Figure: delay graph with source a, sink f, edge delays 3, 4, 5, 11, 9, 6, 15, and node slacks 0, 15, 0, 1, 0, 0.]
  • 49. Worst Case Path Reporting: Example
      • Expand path a-c, reach e.
          • Min heap before: (a-c,4,1) (a-b-d,8,15)
          • Min heap after: (a-c-e,13,0) (a-b-d,8,15)
      [Figure: delay graph with source a, sink f, edge delays 3, 4, 5, 11, 9, 6, 15, and node slacks 0, 15, 0, 1, 0, 0.]
  • 50. Worst Case Path Reporting: Example
      • Expand path a-c-e, reach f.
          • Min heap before: (a-c-e,13,0) (a-b-d,8,15)
          • Min heap after: (a-b-d,8,15)
      • f is the sink! Report the 2nd worst path a-c-e-f, with delay = 28.
      [Figure: delay graph with source a, sink f, edge delays 3, 4, 5, 11, 9, 6, 15, and node slacks 0, 15, 0, 1, 0, 0.]
  • 51. Worst Case Path Reporting: Example
      • Expand path a-b-d, reach f.
          • Min heap before: (a-b-d,8,15)
          • Min heap after: (EMPTY)
      • f is the sink! Report the 3rd worst path a-b-d-f, with delay = 14. Done!
      [Figure: delay graph with source a, sink f, edge delays 3, 4, 5, 11, 9, 6, 15, and node slacks 0, 15, 0, 1, 0, 0.]
  • 52. Worst Case Path Reporting: Example
      • We find three paths:
          • a-b-e-f, delay = 29
          • a-c-e-f, delay = 28
          • a-b-d-f, delay = 14
      • Note: there are only 3 possible paths from source to sink in this graph, so we found all of them, correctly, in delay order!
      [Figure: delay graph with source a, sink f, edge delays 3, 4, 5, 11, 9, 6, 15, and node slacks 0, 15, 0, 1, 0, 0.]
  • 53. Static Timing Analysis: Summary
      • STA is a very important step in the design of complex ASICs.
          • It’s a critical “sign off” step, which means: you don’t get to fabricate unless you pass.
      • Several big ideas:
          • Gate-level delay models matter, and can be pretty complex in the real world.
          • Logical path analysis ≠ topological path analysis (the latter is what STA does).
          • Build the delay graph, then calculate ATs, RATs, and slacks recursively.
          • The concept of slack is big: it lets us locate the worst paths, and the problem gates on each path.
          • An idea similar to maze routing lets us find the worst paths in delay order.
  • 54. Static Timing Analysis: Aside
      • STA is a huge topic; there are several things we did not cover.
      • STA for sequential elements
          • How do we model flip-flops and latches, so we can verify, e.g., that setup and hold times are met? More tricks with the delay graph.
      • Early mode versus late mode timing
          • Our development covered only so-called late mode timing, where we care about the longest path. Early mode focuses on shortest paths, and is critical for more advanced timing, e.g., with transparent latches.
      • Incremental STA
          • In practice, if you change 10,000 gates out of 1,000,000, you don’t want to redo the whole STA. Advanced methods can update incrementally.