Amr Adel Mohammady
/amradelm
Logic Synthesis
Part 3 – ASIC Synthesis
Introduction
• In the previous parts we learned about the FPGA fabric and the FPGA synthesis flow.
• In this part we will discuss the ASIC synthesis flow.
The Inputs and Outputs
• The inputs to ASIC synthesis are:
o HDL: The Verilog or VHDL design files
o Constraints: The timing, power, and area constraints
o Timing Libraries: The standard cell libraries.
o Synthesis Commands [1]:
▪ set target_library <STDCELL_LIBRARY>
set link_library "* $target_library io.db rams.db"
read_verilog <RTL FILES LIST>
current_design <TOP_MODULE_NAME>
link
source <TIMING_CONSTRAINTS>
compile ;# Synthesize the design
• The outputs are:
o Design netlist
o Various reports about the design such as timing, power or area reports
o Synthesis Commands:
▪ write -f verilog -o ./netlist.v
report_timing
report_power
report_area
[1]: The target_library variable specifies the library that Design Compiler uses to select cells for optimization and mapping. The link_library variable specifies every library that has cells referenced by the netlist, such as RAMs. The tool uses the libraries specified in the link_library variable for resolving references (linking).
Target Library
• Both FPGA and ASIC synthesis start by creating a technology-independent (generic) netlist.
• After that, the generic netlist is mapped to the target technology and optimized to meet the constraints.
• The target in ASIC is called a standard cell library:
o It’s a collection of pre-designed and pre-characterized logic gates and other digital functions used for VLSI design.
o For each cell, the library holds information such as timing, power consumption, physical layout, and logic functionality.
o This information is spread across multiple files. For example, the timing and power information is in the timing .lib/.db file, while the physical layout is in LEF/GDS files.
o These files are sometimes called “views” (e.g. timing view) because each represents the cell info from a certain point of view.
[Figure: NAND cell, schematic view and layout view [1]]
Example Timing View [2] (lookup table indexed by load capacitance 𝑪𝑳 and input transition time 𝒕):
        1.1     1.2     1.3     1.4
10      2.10    2.20    2.27    3.00
20      2.50    3.00    3.45    3.96
30      2.90    3.40    3.80    4.15
[1]: Reference: An Exploration of Applying Gate-Length-Biasing Techniques to Deeply-Scaled FinFETs Operating in Multiple Voltage Regimes. IEEE Transactions on Emerging Topics in Computing. PP. 1-1. 10.1109/TETC.2016.2640185.
[2]: These are arbitrary numbers for demonstration only.
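A timing view like the one above is a lookup table, so a value that falls between two table entries has to be interpolated. The sketch below shows one common way to do that, bilinear interpolation between the four nearest entries. It reuses the arbitrary numbers from the example table. Which axis is load capacitance and which is input transition time is an assumption here, since the slide does not say, and the function names are illustrative only.

```python
from bisect import bisect_right

# Values copied from the example timing view (arbitrary demonstration numbers).
# ASSUMPTION: rows are indexed by input transition time, columns by load
# capacitance. The slide does not state which axis is which.
ROW_INDEX = [10.0, 20.0, 30.0]
COL_INDEX = [1.1, 1.2, 1.3, 1.4]
TABLE = [
    [2.10, 2.20, 2.27, 3.00],
    [2.50, 3.00, 3.45, 3.96],
    [2.90, 3.40, 3.80, 4.15],
]


def _bracket(axis, x):
    """Index i of the segment [axis[i], axis[i+1]] used for x.

    The index is clamped to the first/last segment, so points outside the
    table are extrapolated linearly from the edge segment.
    """
    i = bisect_right(axis, x) - 1
    return max(0, min(i, len(axis) - 2))


def lookup(row_value, col_value):
    """Bilinear interpolation between the four surrounding table entries."""
    i = _bracket(ROW_INDEX, row_value)
    j = _bracket(COL_INDEX, col_value)
    r0, r1 = ROW_INDEX[i], ROW_INDEX[i + 1]
    c0, c1 = COL_INDEX[j], COL_INDEX[j + 1]
    u = (row_value - r0) / (r1 - r0)   # fractional position along the rows
    v = (col_value - c0) / (c1 - c0)   # fractional position along the columns
    upper = TABLE[i][j] * (1 - v) + TABLE[i][j + 1] * v
    lower = TABLE[i + 1][j] * (1 - v) + TABLE[i + 1][j + 1] * v
    return upper * (1 - u) + lower * u
```

For example, `lookup(15, 1.15)` lands in the middle of the top-left four entries and returns their average.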
Wire Load Model (WLM)
• For the synthesis tool to compute the cell delay and power, it needs to know the input transition time and the capacitive load.
• Both values depend on the cell type and also on the wires connecting the cells.
• The cell information is known from the standard cell library, so the missing info is the wire values (resistance and capacitance).
• In older technologies, the wire values were estimated using a wire load model (WLM).
• This model estimates the length of a wire (and therefore its resistance and capacitance) based on the number of fanouts and the block size, as shown in the diagrams.
• These estimates are based on results from previous designs.
[Figure labels: Wire Cap, INV Cap, OR Cap. Captions: More Fanouts => More Wire Length; Larger Block Size => More Wire Length]
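The idea behind a wire load model can be sketched in a few lines: a table gives an estimated wire length per fanout count, a larger block uses a table with longer wires, and the length is scaled by per-unit resistance and capacitance. Every number, name, and unit below is hypothetical and chosen only for illustration. Real models are supplied with the library.

```python
# Hypothetical wire load model. All values are invented for illustration.
# Estimated wire length per fanout count, for two block sizes.
FANOUT_LENGTH = {
    "small_block": {1: 10.0, 2: 22.0, 3: 35.0, 4: 50.0},
    "large_block": {1: 20.0, 2: 45.0, 3: 72.0, 4: 100.0},
}
# Extra length added per fanout beyond the last table entry.
SLOPE = {"small_block": 15.0, "large_block": 30.0}
CAP_PER_UNIT_LENGTH = 0.0002  # assumed capacitance per length unit
RES_PER_UNIT_LENGTH = 0.0005  # assumed resistance per length unit


def estimate_wire(fanout, block):
    """Return (length, capacitance, resistance) estimated for a net (fanout >= 1)."""
    table = FANOUT_LENGTH[block]
    max_fanout = max(table)
    if fanout <= max_fanout:
        length = table[fanout]
    else:
        # Extrapolate linearly past the end of the table.
        length = table[max_fanout] + SLOPE[block] * (fanout - max_fanout)
    return length, length * CAP_PER_UNIT_LENGTH, length * RES_PER_UNIT_LENGTH


def total_load(fanout, block, pin_caps):
    """Capacitive load seen by the driver: wire cap plus the driven pins' caps."""
    _, wire_cap, _ = estimate_wire(fanout, block)
    return wire_cap + sum(pin_caps)
```

With these numbers, a net with 3 fanouts is estimated at 35 length units in the small block and 72 in the large one, which matches the two trends in the diagrams: more fanouts and larger blocks both mean more wire.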
Wire Load Model (WLM) – Example
Physical Synthesis
• In newer technology nodes the WLM produced poor estimates, so tool vendors introduced another approach called physical synthesis.
• In this approach the floorplan and physical info (tech file, cell layout, parasitics, etc.) are passed to the synthesis tool.
• This allows the tool to do cell placement along with logic synthesis and optimization.
• Since it knows the distance between the cells, the tool can estimate the expected wire length more accurately.
• Physical synthesis produces much better results than the WLM approach but takes more time.
• Two-pass synthesis: Tool vendors recommend doing physical synthesis in two steps:
1. Synthesize the design with an initial floorplan. The resulting netlist gives info about the cell count, total area, and congestion, which enables us to create a better floorplan.
2. Create a new floorplan, then redo the synthesis with the physical info.
• In the next slides we will see the other inputs needed (along with the floorplan) to do physical synthesis.
[Figure: One-Pass Synthesis (Not Recommended) vs. Two-Pass Synthesis (Recommended)]
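Once cell locations are known, wire length can be estimated from the placement instead of from fanout alone. The slides do not say which estimator the tools use. The sketch below uses half-perimeter wirelength (HPWL), a common placement-stage estimate, purely as an illustration. The coordinates are hypothetical.

```python
def hpwl(pins):
    """Half-perimeter wirelength: half the perimeter of the pins' bounding box."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


# Hypothetical placement in um: one driver pin followed by its three load pins.
net = [(0.0, 0.0), (40.0, 10.0), (15.0, 30.0), (5.0, 5.0)]
estimated_length = hpwl(net)  # (40 - 0) + (30 - 0) = 70 um
```

Unlike a WLM, the estimate changes when the same cells are placed closer together or farther apart, even though the fanout is unchanged.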
Physical Synthesis Inputs – Tech File
• The tech file contains various info about the
technology like:
o The units and precision.
o The coloring of the metals in the GUI.
o The minimum standard cell height and width.
o The design rules such as the layers’ default width
and spacing, etc.
o Via definitions
Example Tech File
Physical Synthesis Inputs – ITF & TLUPLUS
• The ITF (Interconnect Technology File) is a text-based file that contains raw information about each technology layer, such as thickness, resistivity, and dielectric constant.
• These values are further processed to generate the TLU+ file, which contains tables of R and C values as functions of metal layer width and spacing. The processing takes the effects of all adjacent layers into account.
• The TLU+ file is binary apart from a text header that shows which ITF was used to generate it.
• Along with the TLU+ file, we use a layer mapping file that maps the layer names between the tech file and the TLU+.
[Figures: CMOS Cross Section [1]; Example ITF File]
[1]: Reference: Okuno, Hanako & Fournier, Adeline & Quesnel, E. & Muffato, V. & Poche, Hélène & Fayolle, M. & Dijon, J. (2010). CNT integration on different materials suitable for VLSI interconnects. Comptes Rendus Physique - C R PHYS. 11. 381-388. 10.1016/j.crhy.2010.06.008.
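To see why thickness, resistivity, and dielectric constant are enough to derive R and C tables, here is a first-order sketch using the textbook formulas for a rectangular wire and a parallel-plate capacitor. This is not how TLU+ tables are actually generated, since that processing accounts for neighbouring layers, fringing, and coupling. The layer values are hypothetical and not taken from any real ITF.

```python
EPS0 = 8.854e-12  # vacuum permittivity in F/m


def wire_resistance(resistivity, thickness, width, length):
    """R = rho * L / (t * W) for a rectangular wire, all in SI units."""
    return resistivity * length / (thickness * width)


def plate_capacitance(rel_permittivity, width, length, dielectric_height):
    """Parallel-plate capacitance to the layer below (no fringing or coupling)."""
    return EPS0 * rel_permittivity * width * length / dielectric_height


# Hypothetical layer: 100 nm thick metal, 50 nm wide, 10 um long,
# sitting 100 nm above the next conductor.
r = wire_resistance(resistivity=1.7e-8, thickness=100e-9, width=50e-9, length=10e-6)
c = plate_capacitance(rel_permittivity=3.0, width=50e-9, length=10e-6,
                      dielectric_height=100e-9)
```

Doubling the width halves `r` and doubles `c`, which is the kind of width dependence the TLU+ tables capture per layer.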
Physical Synthesis Inputs – LEF (Library Exchange Format)
• The GDS file contains the full data about the design layout and masks, and is sent to the fabrication plant to fabricate the chip.
• From a runtime and memory usage point of view, we don’t need all the GDS info when doing placement. We only care about the cell boundary and the pin shapes and locations.
• The LEF file contains only the info needed to perform placement and routing, and is used during physical synthesis and across the PnR stages.
• Once PnR is finished, the LEF views are replaced by the GDS views to produce the final GDS file that contains all the info needed by the fabrication plant.
[1]: Reference: Automated integrity checks stop out-of-sync data issues in parallel flows (techdesignforums.com)
ASIC Synthesis Options
Critical Range & TNS Optimization
• By default the tool focuses on improving the worst negative slack (WNS), i.e. the most critical path.
• The tool can also work on paths that are close to the worst one. How close is controlled with the critical range.
• A critical range of 0.0 means that only the most critical paths (the ones with the worst violation) are optimized. If you specify a nonzero critical range, near-critical paths within that amount of the worst path will also be optimized, if possible.
• You can also instruct the tool to focus on reducing the total negative slack (TNS), at the cost of additional runtime.
• Synthesis Commands:
o set_critical_range 2.5 top
o set compile_timing_high_effort_tns true
[Figure: paths targeted when optimizing for WNS, with a critical range of 2.5, and for TNS]
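The three targets can be made concrete with a small calculation over a list of path slacks. The slack values below are hypothetical, and the selection rule is a simplified reading of the critical range description above: violating paths within the given amount of the worst path. It is not a model of the tool's internal algorithm.

```python
def wns(slacks):
    """Worst negative slack: the most negative slack, or 0.0 if timing is met."""
    return min(min(slacks), 0.0)


def tns(slacks):
    """Total negative slack: the sum of all negative slacks."""
    return sum(s for s in slacks if s < 0.0)


def paths_in_critical_range(slacks, critical_range):
    """Violating paths whose slack is within critical_range of the worst path."""
    worst = min(slacks)
    return [s for s in slacks if s < 0.0 and s <= worst + critical_range]


# Hypothetical path slacks in ns.
slacks = [-4.0, -3.2, -2.0, -1.0, -0.3, 0.5, 1.2]
```

With a critical range of 0.0 only the -4.0 path is selected. With 2.5 the -3.2 and -2.0 paths are included as well. TNS optimization would look at all five violating paths.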
Arithmetic Blocks Architecture
• Digital blocks trade speed against power and area. The designer might choose an implementation that consumes more power or has a larger area but is faster.
• For example, there are different ways to implement binary adders. One implementation is the ripple-carry adder, which has small area and power consumption but a high 𝑇𝑐𝑜𝑚𝑏, while a carry-look-ahead (CLA) adder has a smaller 𝑇𝑐𝑜𝑚𝑏 but takes a larger area.
• The synthesis tool can automatically choose the best implementation to improve timing or area.
• Synthesis Commands:
o set_dp_smartgen_options -optimize_for [area | speed | area,speed]
[Figure: two adder implementations, one with 𝑇𝑐𝑜𝑚𝑏 = 700 ps and Area = 75 μm², the other with 𝑇𝑐𝑜𝑚𝑏 = 400 ps and Area = 130 μm²]
Reference: Kamanga, Isaack. Design Optimization of the 64-Bit Carry Look-Ahead Adder Based on FPGA and Verilog HDL
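The difference between the two adders can be seen in a bit-level model. In the ripple-carry version each carry waits for the previous one, while the look-ahead version writes every carry directly in terms of generate and propagate signals. That removes the chain but needs more logic per bit. This is a behavioural sketch only. It says nothing about the actual gate counts or delays a synthesis tool would produce, and the function names are illustrative.

```python
def to_bits(value, width):
    """Little-endian list of bits (bit 0 first)."""
    return [(value >> i) & 1 for i in range(width)]


def ripple_carry_add(a_bits, b_bits, carry_in=0):
    """Each stage's carry depends on the previous stage's carry."""
    sum_bits, carry = [], carry_in
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def carry_lookahead_add(a_bits, b_bits, carry_in=0):
    """Every carry is expanded directly from generate/propagate terms."""
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # propagate
    carries = [carry_in]
    for i in range(len(a_bits)):
        # c[i+1] = g[i] | p[i]g[i-1] | p[i]p[i-1]g[i-2] | ... | p[i]..p[0]c[0]
        c, prefix = g[i], p[i]
        for j in range(i - 1, -1, -1):
            c |= prefix & g[j]
            prefix &= p[j]
        c |= prefix & carry_in
        carries.append(c)
    sum_bits = [p[i] ^ carries[i] for i in range(len(a_bits))]
    return sum_bits, carries[-1]
```

Both functions return the same sum and carry-out for the same inputs, for example `ripple_carry_add(to_bits(9, 4), to_bits(7, 4))` and `carry_lookahead_add(to_bits(9, 4), to_bits(7, 4))`. The growing expression for each carry in the second version is where the extra area comes from.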
Register Duplication
• By duplicating registers, the timing paths can be shortened, reducing the wire and cell propagation delays.
• This also reduces the fanout of each register, which may improve the register’s output delay.
• Consider the example on the right:
o By duplicating the green register, we can move each copy near one of the blue registers.
o This first reduces the wire length between the green and blue registers, and second allows us to remove the buffers and inverter pairs on the nets. Both reduce the total combinational delay.
o This shows that the method becomes more useful when the capture registers (the blue ones) are placed far away from each other on the chip.
o However, FF1 now drives double the fanout, so the delay of the timing path between FF1 and FF2 increases. We need to make sure this increase doesn’t cause the path to violate setup timing.
• Duplication can be enabled globally or on a cell-by-cell basis.
• Synthesis Commands:
o set compile_register_replication true
# When this variable is set to true, compile tries to
# identify registers in the current design that can be split
# to balance the loads for better QoR.
o set_register_replication -num_copies 3 <REGISTER>
# Duplicate a certain register 3 times.
[Figure: Before Duplication vs. After Duplication]
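The wire length benefit can be illustrated with a toy placement. The coordinates are hypothetical and the distance metric (Manhattan distance) is an assumption. The point is only that each copy can sit next to the capture register it drives, while the register upstream now sees one load per copy.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def worst_distance(driver, sinks):
    """Longest driver-to-sink distance on a net."""
    return max(manhattan(driver, sink) for sink in sinks)


# Hypothetical placement in um: one launch register midway between
# two capture registers that are far apart.
original = (50.0, 50.0)
captures = [(0.0, 50.0), (100.0, 50.0)]
before = worst_distance(original, captures)  # 50 um to either capture register

# After duplication: each copy is placed next to the capture register it drives.
copies = {
    (5.0, 50.0): [(0.0, 50.0)],
    (95.0, 50.0): [(100.0, 50.0)],
}
after = max(worst_distance(copy, sinks) for copy, sinks in copies.items())  # 5 um

# The register feeding the duplicated one now drives one input per copy.
upstream_fanout_before, upstream_fanout_after = 1, len(copies)
```

The worst wire drops from 50 um to 5 um, while the upstream fanout goes from 1 to 2, which is the setup-timing cost mentioned above.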
Register Merging
• Merging is the opposite of duplication. It reduces the design area but might degrade timing.
• Merging can be enabled globally or on a cell-by-cell basis.
• Synthesis Commands:
o set compile_enable_register_merging true
o set_register_merging <REGISTER_LIST> true ;# Merge certain registers.
Before Merging After Merging
Preferred MUX Implementation
• Standard cell libraries have the basic cells needed to build a MUX (2 AND, 1 OR, 1 inverter) but also have integrated MUX cells.
• It’s generally better to build a MUX from the basic cells because each cell can be placed and optimized individually. This flexibility produces better timing and area results.
• The problem is that this approach increases the number of pins. For example, a 2:1 MUX will have 11 pins (6 pins for the 2 ANDs, 2 for the inverter, 3 for the OR) compared to 4 pins for the integrated MUX (2 inputs, 1 output, 1 select).
• This might create pin congestion and make routing difficult. In such cases, it’s better to use the MUX cells.
• ASIC tools let you tell the synthesis tool which implementation to prefer.
• Synthesis Commands:
o set compile_prefer_mux true
# The default flow typically maps most multiplexers to and-or-invert (AOI)
# logic in order to minimize area, but in some cases this can result in
# congestion hotspots. With compile_prefer_mux enabled, multiplexing logic
# that is likely to cause congestion is converted to MUX trees where possible.
o set hdlin_infer_mux all
set_size_only [get_cells -hier * -filter "@ref_name =~ *MUX_OP*"]
# These commands force the compiler to use MUX cells instead of the basic
# gates. However, this restricts the tool and might degrade QoR.
[Figure: MUX as one integrated standard cell vs. built from several standard cells]
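The pin counts quoted above can be checked with a short calculation, which also shows how the gap grows on a bus. The cell names are generic placeholders rather than names from any particular library.

```python
# Pins per cell, counting inputs plus the output.
# The integrated MUX2 has 2 data inputs, 1 select, and 1 output.
PINS = {"AND2": 3, "OR2": 3, "INV": 2, "MUX2": 4}


def total_pins(cells):
    """Total pin count for a dict of {cell_type: instance_count}."""
    return sum(PINS[cell] * count for cell, count in cells.items())


discrete_mux = {"AND2": 2, "OR2": 1, "INV": 1}
integrated_mux = {"MUX2": 1}

discrete_pins = total_pins(discrete_mux)      # 6 + 3 + 2 = 11
integrated_pins = total_pins(integrated_mux)  # 4

# On a 32-bit bus with one 2:1 MUX per bit, the difference scales linearly.
bus_width = 32
extra_pins_on_bus = bus_width * (discrete_pins - integrated_pins)  # 32 * 7 = 224
```

The count assumes each bit uses its own inverter, as in the single-MUX example. Sharing the select inverter across the bus would reduce the discrete total somewhat.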
Multi-Bit Banking
• ASIC standard cell libraries contain special flip-flops that can store more than one bit. These FFs are called multi-bit registers, and grouping single-bit FFs into them is called multi-bit banking.
• The area of a multi-bit register is less than the total area of the same registers implemented individually.
• Also, the clock tree has fewer buffers (less area and power) when multi-bit banking is enabled.
• One disadvantage is reduced placement flexibility, since all the bits are forced to be placed at the same location.
• The other disadvantage is reduced CTS flexibility: all the bits are forced to have the same clock latency, which limits fixing timing violations using local skew optimizations.
• Synthesis Commands:
o set hdlin_infer_multibit [never | default_all | default_none]
# never: prevents inference of multibit components from HDL regardless of directives (Verilog) or attributes (VHDL).
# default_all: infers multibit components on all bused registers except where directives or attributes indicate otherwise.
# default_none: only attributes or directives are used to infer multibit components. This is the default for the hdlin_infer_multibit variable.
Thank You!