Fusion Compiler Incremental Clock Tree Synthesis Update W-2024.09_FC_ICCII_CTS_CCD_MSCTS_Update_Training.pdf
netoame4
Fusion Compiler incremental training - Clock tree synthesis
1.
Implementation: CTS/CCD/MSCTS
Version W-2024.09
Fusion Compiler / IC Compiler II Update Training
© Synopsys, Inc. All Rights Reserved
2.
2 Synopsys Confidential Information
CONFIDENTIAL INFORMATION
The information contained in this presentation is the confidential and proprietary information of Synopsys. You are not permitted to disseminate or use any of the information provided to you in this presentation outside of Synopsys without prior written authorization.
IMPORTANT NOTICE
In the event information in this presentation reflects Synopsys’ future plans, such plans are as of the date of this presentation and are subject to change. Synopsys is not obligated to update this presentation or develop the products with the features and functionality discussed in this presentation. Additionally, Synopsys’ services and products may only be offered and purchased pursuant to an authorized quote and purchase order or a mutually agreed upon written contract with Synopsys.
3.
Agenda
Clock Tree Synthesis (CTS) Enhancements
• Pre-CTS Latency Bottleneck Reporting (GA)
• Streamlined CTS Phase 2 (GA)
• CTS Debuggability and Log File Enhancements (GA)
• Early CTS Flow (GA)
• Clock Power Recovery (GA)
4.
Pre-CTS Latency Bottleneck Reporting
• Overview
• Solution Description
• User Interface
5.
Pre-CTS Latency Bottleneck Reporting: Overview
Efficient latency debugging is frequently requested by customers. Before this release, the longest clock paths could be identified only after running CTS, using the report_clock_qor command, or during CTS, using the cts.compile.report_latency_bottleneck application option. With that option, the latency bottleneck analysis for the longest path per clock is printed in the CTS log file as a separate CTS step.
Starting with the W-2024.09 release, users can identify the longest paths before running CTS by using the report_estimated_clock_latency command. It is a standalone command that can be used independently of the synthesize_clock_trees command.
The report does not reflect the ICG relocation that the synthesize_clock_trees command performs to improve latency. The report format is similar to that of the existing latency bottleneck analysis step.
6.
Pre-CTS Latency Bottleneck Reporting: Solution Description
Reporting pre-CTS latency bottleneck analysis:
By default, pre-CTS latency bottleneck analysis reports the single longest path per clock for the primary corner. The command also supports the following switches:
-clocks → reports the longest paths for the selected clocks only
-corner → reports the longest path in the selected corner
-longest n → reports the n longest paths
-through <driver> → reports the longest path through a particular driver
-unique_gate_levels k → reports the longest path with k unique driver levels
7.
Pre-CTS Latency Bottleneck Reporting: Solution Description
fc_shell> report_estimated_clock_latency
****************************************
Report : Estimated Clock Latency
****************************************
Latency Bottleneck Paths for clock clk in mode func at root clk:
Longest path 1:
(0) clk [Location: (0.17, 15.86)] [Fanout: 3] [Delay: 0.000000]
(1) U1/GCP [Location: (7.15, 17.75)] [Fanout: 2] [Delay: 0.001116]
(2) U3/GCP [Location: (11.44, 15.15)] [Fanout: 2] [Delay: 0.026665]
(3) U7/CP [Location: (10.99, 13.44)] [SINK PIN] [Delay: 0.051563]

fc_shell> report_estimated_clock_latency -longest 2
****************************************
Report : Estimated Clock Latency
-longest 2
****************************************
Latency Bottleneck Paths for clock clk in mode func at root clk:
Longest path 1:
(0) clk [Location: (0.17, 15.86)] [Fanout: 3] [Delay: 0.000000]
(1) U1/GCP [Location: (7.15, 17.75)] [Fanout: 2] [Delay: 0.001116]
(2) U3/GCP [Location: (11.44, 15.15)] [Fanout: 2] [Delay: 0.026665]
(3) U7/CP [Location: (10.99, 13.44)] [SINK PIN] [Delay: 0.051563]
Longest path 2:
(0) clk [Location: (0.17, 15.86)] [Fanout: 3] [Delay: 0.000000]
(1) U1/GCP [Location: (7.15, 17.75)] [Fanout: 2] [Delay: 0.001116]
(2) U3/GCP [Location: (11.44, 15.15)] [Fanout: 2] [Delay: 0.026665]
(3) U8/CP [Location: (13.08, 15.03)] [SINK PIN] [Delay: 0.051507]
8.
Pre-CTS Latency Bottleneck Reporting: Solution Description
fc_shell> report_estimated_clock_latency -corner C1
:
****************************************
Report : Estimated Clock Latency
-corner C1
****************************************
Latency Bottleneck Paths for clock clk in mode func at root clk:
Longest path 1:
(0) clk [Location: (0.17, 15.86)] [Fanout: 3] [Delay: 0.000000]
(1) U1/GCP [Location: (7.15, 17.75)] [Fanout: 2] [Delay: 0.001116]
(2) U3/GCP [Location: (11.44, 15.15)] [Fanout: 2] [Delay: 0.026665]
(3) U7/CP [Location: (10.99, 13.44)] [SINK PIN] [Delay: 0.051563]

fc_shell> report_estimated_clock_latency -longest 1 -unique_gate_levels 1
:
****************************************
Report : Estimated Clock Latency
-longest 1 -unique_gate_levels 1
****************************************
Latency Bottleneck Paths for clock clk in mode func at root clk:
Longest path 1:
(0) clk [Location: (0.17, 15.86)] [Fanout: 3] [Delay: 0.000000]
(1) U1/GCP [Location: (7.15, 17.75)] [Fanout: 2] [Delay: 0.001116]
(2) U3/GCP [Location: (11.44, 15.15)] [Fanout: 2] [Delay: 0.026665]
(3) U7/CP [Location: (10.99, 13.44)] [SINK PIN] [Delay: 0.051563]
9.
Pre-CTS Latency Bottleneck Reporting: User Interface
fc_shell> report_estimated_clock_latency -help
Usage: report_estimated_clock_latency # Report estimated clock latency
[-clocks object_list] (List of clocks)
[-longest longest] (Number of longest paths)
[-corner corner] (Corner for reporting)
[-through through_pin] (Through pin)
[-unique_gate_levels unique_gate_levels] (Number of unique gate levels)
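The switches in the help text can be combined in a small script. The sketch below assumes an fc_shell session with a placed design and defined clocks; it writes one pre-CTS latency report per clock. The directory name and the path count of 5 are illustrative choices, and foreach_in_collection, get_object_name, and redirect are general shell commands that this training does not cover.

```tcl
# Sketch only: run inside fc_shell on a placed design, before CTS.
# rpt_dir and the "-longest 5" value are hypothetical choices.
set rpt_dir ./pre_cts_latency_reports
file mkdir $rpt_dir

foreach_in_collection clk [get_clocks] {
    set clk_name [get_object_name $clk]
    # Capture the five longest estimated paths of this clock in its own file.
    redirect -file $rpt_dir/${clk_name}.rpt {
        report_estimated_clock_latency -clocks $clk -longest 5
    }
}
```

Clock names that contain a "/" would need to be sanitized before being used as file names.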
10.
Streamlined CTS Phase 2
• Overview
• Solution Description
• User Interface
11.
Streamlined CTS Phase 2: Overview & Solution Description
Phase 1 enhancements were delivered in the V-2023.12-SP3 release: the tool uses fast cells only for the critical sinks associated with the latency-critical driver and adds slow cells for non-critical sinks. Area and power are recovered during the Compile CTS stage using path-based criticality buffering, and latency improves because fast cells are used for the critical sinks.
Starting with the W-2024.09 release, phase 2 adds two enhancements:
• Cell selection matching: Previously, cell selection for clustering differed from cell selection for buffering. In the W-2024.09 release, cell selection is matched between on-route buffering (ORB) and buffering. Better correlation between CTS clustering and buffering reduces clustering imbalance and also helps area, power, and logical DRC.
• Balanced topology enhancement: A new linear-programming-based solution replaces the previous algorithm in the CTS balanced topology generator. Additional postprocessing of the balanced topology further improves wire length and skew.
12.
Streamlined CTS Phase 2: User Interface
Use the following application options to enable the feature in the W-2024.09 release:
set_app_options -name cts.compile.align_clustering_and_buffering -value true
set_app_options -name cts.compile.balanced_topology_enhancements -value true
13.
CTS Debuggability and Log File Enhancements
• Overview
• Solution Description
• User Interface
14.
CTS Debuggability and Log File Enhancements: Overview
The W-2024.09 release adds a couple of enhancements to the CTS reporting commands and log file:
• report_clock_qor prints a total skew metric
• check_clock_trees flags auto exception generation points
• Summary tables are printed before and after each step in MTCTO
• get_clock_tree_pins reports all the pins above the MSCTS driver using the is_mscts_global_driver_object attribute
• The is_on_boundary_register attribute is added to get all CCD boundary pins
15.
CTS Debuggability and Log File Enhancements: Solution Description
Feature #1: report_clock_qor prints total skew metric
• Previously, there was no way to report the total skew value through the report_clock_qor command
• This metric is particularly needed when using the total skew optimization enhancement during MTCTO
• From the W-2024.09 release, the report_clock_qor -type latency command has a new column called Total Skew
• The skew is calculated as {median latency of clock - path delay}, and the total skew is the sum of the skews to the endpoints, provided it is greater than the target skew that is set
Expectation: A total skew section appears after the clock QoR summary is reported
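A minimal usage sketch for this feature, assuming an fc_shell session with clock trees already built. The application option name is the one listed on the User Interface slide of this section; the two-step ordering is an assumption.

```tcl
# Sketch only: enable the Total Skew column, then report latency QoR.
set_app_options -name cts.report.report_clock_qor_total_skew -value true
report_clock_qor -type latency
```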
16.
CTS Debuggability and Log File Enhancements: Solution Description
Feature #2: check_clock_trees flags auto exception generation points
• Currently, auto exceptions are handled in CTS during the Clock Tree Initialization step, and a file named clock_auto_exceptions*.tcl is written to the current working directory
• To make the auto exceptions derived during CTS easier to understand, the check_clock_trees command is enhanced to report them beforehand
• A file named check_clock_trees_clock_auto_exceptions*.tcl is written to the current working directory
• This does not change the current database
• The command reports the following exceptions: internal pins, loop breaking pins, conflict pins, and split pins
Expectation: A check_clock_trees_clock_auto_exceptions*.tcl file appears in the current working directory and lists the above cases, if any
Example:
#conflict pins
set_clock_tree_balance_point -consider_for_balancing false -clock [get_clocks $clock -mode $mode -balance_points $balance_point
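A short sketch of how the generated file might be located after the check, assuming an fc_shell session started in the run directory. Only check_clock_trees and the file naming convention come from the slide; the loop is plain Tcl.

```tcl
# Sketch only: run the check, then list any auto-exception files it wrote.
check_clock_trees

# glob -nocomplain returns an empty list instead of an error when nothing matches.
foreach f [glob -nocomplain check_clock_trees_clock_auto_exceptions*.tcl] {
    puts "Auto-exception file: $f"
}
```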
17.
CTS Debuggability and Log File Enhancements: Solution Description
Feature #3: Print summary tables before and after each step in MTCTO
• Currently, CTS-037 messages are printed for each step in MTCTO, showing the QoR before and after each stage
• From the W-2024.09 release, summary tables are printed before and after each step in MTCTO, giving users a clearer view of clock QoR
• The tables are printed for each clock, corner, and mode
Expectation: QoR summary tables are printed before and after each step in MTCTO
[Figure callouts: Summary table, Total scenarios]
18.
CTS Debuggability and Log File Enhancements: Solution Description
Feature #4: get_clock_tree_pins reports all the pins above the MSCTS driver through an attribute
• Currently, all the MSCTS subtree drivers can be obtained with the is_mscts_subtree_driver_object attribute
• However, to get the H-tree drivers, the user must use a double filter:
get_clock_tree_pins -to [get_clock_tree_pins -filter is_mscts_subtree_driver_object]
• From the W-2024.09 release onwards, the is_mscts_global_driver_object attribute reports all the H-tree drivers
• This feature works for multitap driver setups as well
Expectation: The tap drivers are reported with the attribute
fc_shell> get_clock_tree_pins -clock $clock -filter is_mscts_global_driver_object
{MSCTS_htree_0_0/Y MSCTS_htree_0_1/Y MSCTS_htree_0_2/Y}
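The two query styles can be run side by side as a sanity check. This is a sketch assuming an fc_shell session on a design with MSCTS structures; the variable names are hypothetical, sizeof_collection is a general shell command not shown in this training, and the two counts are not claimed to match.

```tcl
# Sketch only: older double-filter query versus the new attribute.
set subtree_drivers [get_clock_tree_pins -filter is_mscts_subtree_driver_object]
set pins_double_filter [get_clock_tree_pins -to $subtree_drivers]
set pins_attribute [get_clock_tree_pins -filter is_mscts_global_driver_object]

puts "Double filter: [sizeof_collection $pins_double_filter] pins"
puts "Attribute:     [sizeof_collection $pins_attribute] pins"
```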
19.
CTS Debuggability and Log File Enhancements: Solution Description
Feature #5: Attribute is_on_boundary_register to get all boundary pins
• Currently, users get the CCD boundary pins through a manual method, which is not always feasible
• Starting with the W-2024.09 release, the is_on_boundary_register attribute reports all the CCD boundary pins in the design
Expectation: The CCD boundary register pins are reported when the attribute is used
fc_shell> get_clock_tree_pins -filter is_on_boundary_register
{reg_0/CLK reg_34/CLK reg_40/CLK}
20.
CTS Debuggability and Log File Enhancements: User Interface
Feature | UI
report_clock_qor printing the total skew value | set_app_options -name cts.report.report_clock_qor_total_skew -value true
check_clock_trees to flag auto exception generation points | check_clock_trees
Print summary tables before and after each step in MTCTO | set_app_options -name cts.optimize.print_qor_summary_table -value true
Balance point check | check_clock_trees
get_clock_tree_pins to report all the pins above MSCTS driver with an attribute | get_clock_tree_pins -filter is_mscts_global_driver_object
Attribute "is_on_boundary_register" to get all boundary pins | get_clock_tree_pins -filter is_on_boundary_register
21.
Early CTS Flow
• Introduction
• Solution Description
• Expectation
• User Interface
• Setup Instructions
22.
Early CTS Flow: Introduction
The regular clock tree synthesis (CTS) flow builds and optimizes clock trees only during the clock_opt build_clock stage (after compile_fusion or place_opt). In this conventional flow, delay optimization and CCD in the compile stage do not see the actual clock tree effects and optimize based only on the ideal clock tree:
• Actual clock tree skews are visible only after the real clock trees are built and propagated
• OCV analysis comes into effect only when clock trees are built
This leads to a miscorrelation in design timing QoR between the stages before and after the clock tree is built. To overcome this miscorrelation, users tend to use workarounds:
Workaround: Apply a higher clock uncertainty at the ideal clock stage to model the clock skews, and add some margin to model OCV effects at the ideal clock stage
Caveat: This poses two difficulties:
a) The user does not know what exact uncertainty values to apply at the pre-CTS stage
b) The design might be over-constrained, leading to power and area degradation
23.
Early CTS Flow: Introduction
To overcome these issues, starting with the W-2024.09 release, the tool can perform one round of clock tree building during the compile/place stage itself. This is known as the Early CTS flow.
The clock tree effects described above then become visible to data-path optimization and CCD during the compile stage. Essentially, this makes timing optimization in the compile stage more clock tree aware.
24.
Early CTS Flow: Solution Description
With Early CTS, clock tree synthesis happens twice in the flow. The first round is in compile_fusion/place_opt final_place:
o CTS requires the placement to be finalized. Early CTS in compile_fusion/place_opt happens at the end of final_place, after direct timing-driven placement (DTDP).
o Initial clock trees are built and propagated, followed by MTCTO-based global latency and skew optimization.
o Clock trees are global routed as well.
o Any long nets on the data path created during CTS clock cell relocation are addressed during the logical DRC (LDRC) fixing steps in compile_fusion/place_opt final_opto.
o With the propagated clock tree built, global routed, and optimized for skew, compile_fusion/place_opt final_opto sees the propagated timing picture.
o As a result, delay optimization and offset derivation by CUS in compile_fusion/place_opt final_opto are more realistic, leading to better QoR convergence down the flow.
25.
Early CTS Flow: Solution Description
CCD during compile_fusion and place_opt final_opto annotates useful skew offsets and adjustments in the form of set_clock_latency offsets (with the recommended options), even in propagated mode, so that timing and optimization understand the CCD offsets.
The second round of clock tree synthesis is in clock_opt build_clock:
• It completely removes the clock trees built by early CTS and rebuilds the clock tree, guided by the balance points derived by CCD during compile_fusion final_opto
• CCD in clock_opt build_clock incrementally optimizes the timing QoR
26.
Early CTS Flow: Expectation
compile_fusion/place_opt:
• Timing looks worse after early CTS, but this only brings clock realism earlier into the flow
• Similarly, area and power appear worse due to the addition of clock tree buffers
• The compile_fusion/place_opt runtime increases due to early CTS
clock_opt or end of flow:
• With more accurate timing driving compile_fusion/place_opt final_opto, the flow should converge better in terms of timing QoR (WNS/TNS) as it progresses from CTS to clock_opt final_opto
• Skews should be less hurtful and more helpful, given that CUS in compile_fusion/place_opt final_opto sees a better timing context
• Because the propagated timing QoR picture is visible upfront, better power can also be expected by the end of the flow for designs with neutral timing QoR
• The runtime overhead from trial CTS should be partially offset by faster timing convergence through clock_opt
27.
Early CTS Flow: User Interface
• The Early CTS flow can be enabled in the W-2024.09 release during the compile_fusion/place_opt stage using the following application options:
For place_opt flow:
set_app_options -list {place_opt.flow.trial_clock_tree true}
For compile_fusion flow:
set_app_options -list {compile.flow.trial_clock_tree true}
28.
Early CTS Flow: Setup Instructions (1/3)
• Make sure all the necessary clock tree setup and settings are applied before the compile_fusion/place_opt final_place stage so that trial CTS correlates with build_clock:
o CTS and CTO application options and settings, including skew and latency targets, cell spacing rules, skew group settings, clock balance group settings, and so on
o Clock NDRs, constraints such as max_transition/capacitance/fanout/net_length, and the clock lib-cell reference list
o Any custom user proc that runs before CTS and could impact its execution
• Keep the active scenarios the same from the compile_fusion/place_opt final_place stage through clock_opt final_opto
• Reduce clock uncertainty after compile_fusion/place_opt final_place to account for the existence of the trial clock tree. We recommend using the same uncertainty as post-CTS
29.
Early CTS Flow: Setup Instructions (2/3)
Set these application options in compile_fusion/place_opt:
set_app_options -list {place_opt.flow.trial_clock_tree true}
(or)
set_app_options -list {compile.flow.trial_clock_tree true}
set_app_options -list {npo.enable_ects_flow true}; #OBD since D20240830
set_app_options -list {cts.common.disable_ccs_rcv_cap_in_trial false}
set_app_options -list {time.enable_offset_latency_computation_in_propagated_clocks true}
Set these in both compile_fusion/place_opt and clock_opt for both baseline and early CTS flows:
set_app_options -as_user_default -name cts.optimize.improvement_mode_version -value EIM_20240330
set_app_options -name cts.optimize.enable_improvement_mode -value skew
30.
Early CTS Flow: Setup Instructions (3/3)
Set this for clock_opt build_clock in the early CTS flow:
set_app_options -list {cts.compile.enable_cell_relocation none}
Override this setting only during compile_fusion/place_opt final_place, and restore the previous value after early CTS is done:
set_app_options -list {cts.compile.power_opt_mode none}
Additionally, the following is recommended for CCD in clock_opt for both baseline and early CTS flows:
set_app_options -list {ccd.max_prepone_postpone_consider_skew_latency true}
set_app_options -list {ccd.max_prepone_postpone_consider_corner_scaling true}
set_app_options -list {ccd.enable_hyper_ccd true}
Reset the CCD max_pre/postpone limits from CTS so that CCD can use the best possible scope in both baseline and early CTS flows
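The setup instructions can be collected into one script fragment. The sketch below covers the compile_fusion flavor only and assumes an fc_shell session; every application option name comes from the setup slides, while the ordering, the saved_power_opt_mode variable, and the use of get_app_option_value to save the old value are assumptions. The stage invocation itself is left as a comment.

```tcl
# Sketch only: early CTS setup for the compile_fusion flow.

# Enable trial (early) CTS and its companion settings before final_place.
set_app_options -list {compile.flow.trial_clock_tree true}
set_app_options -list {npo.enable_ects_flow true}
set_app_options -list {cts.common.disable_ccs_rcv_cap_in_trial false}
set_app_options -list {time.enable_offset_latency_computation_in_propagated_clocks true}

# Settings recommended for both baseline and early CTS flows.
set_app_options -as_user_default -name cts.optimize.improvement_mode_version -value EIM_20240330
set_app_options -name cts.optimize.enable_improvement_mode -value skew

# Temporarily override power_opt_mode around final_place, then restore it.
set saved_power_opt_mode [get_app_option_value -name cts.compile.power_opt_mode]
set_app_options -list {cts.compile.power_opt_mode none}
# ... run the compile_fusion final_place stage here ...
set_app_options -name cts.compile.power_opt_mode -value $saved_power_opt_mode
```

Saving the previous value in a variable avoids hard-coding the site default when the override is removed.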
31.
Clock Power Recovery
• Overview
• User Interface
32.
Clock Power Recovery: Overview
• The clock tree optimization (CTO) engine currently considers logical DRC, latency, skew, and area as cost functions to improve clock QoR
• There were increasing requests to reduce power at the final CTO step
• A new step, CTS STEP: Power optimization, is introduced after the Area Recovery step and before the Final DRC Fixing step
o Introduces power as a cost function for optimization at this step
o Performs buffer removal and sizing to improve total power over area
Expectation:
• Clock power should improve
• There must be no degradation in skew, latency, or logical DRC
33.
Clock Power Recovery: User Interface
Use the following application option to enable the clock power recovery feature:
set_app_options -list {cts.optimize.enable_cto_power_optimization true}
34.
Agenda
Concurrent Clock and Data Optimization (CCD) Enhancements
• Data Path Aware CUS for Timing (GA)
• Data Path Aware CUS for Power (GA)
• Switching Power-Aware Pre-CTS CCD (GA)
• Fast CUS (GA)
• Clock DRC Improvements (GA)
35.
Data Path Aware CUS for Timing
• Overview
• Solution Description
• User Interface
36.
Data Path Aware CUS for Timing: Overview and Benefits
The pre-CTS CCD engine currently in the flow makes sequential calls of CCD and data optimization to achieve better timing QoR. It derives offsets by considering the available prepone and postpone scope and the scope of implementability. However, the offsets derived by CCD in the compile stage are not data path aware. For better concurrency in the engine, this feature makes the derived CCD offsets and the accepted solutions data path aware.
In the current flow, once a timing path is optimized to zero slack, it is not disturbed further to lend slack to its fanin and fanout paths, even if those paths are timing critical. The extra timing potential of the zero-slack path is left unused, which limits the design frequency.
This feature improves the design frequency by over-optimizing paths that are easily closed or have positive slack, which in turn provides extra borrowable slack for the frequency-limiting paths.
37.
Data Path Aware CUS for Timing: Solution Description
Utilizing the unused data path potential:
• A slack margin is applied on endpoints with zero or positive slack, making the slack more positive
• One round of useful skew computation is run, which borrows slack by taking the extra positive slack into account
• The slack margins applied on the endpoints are removed. This makes the paths that slack was borrowed from violate timing
• Data path optimization is called to recover timing on these violating endpoints
• An extra call of useful skew computation is made at the end to recover clock tree power (cus-dpae2)
This feature impacts the final calls of CUS during the place_opt/compile final_opto stage.
[Figure: three-register chain FF1, FF2, FF3 (D and Q pins) shown at successive steps labeled Path Margin, CUS, and Data Opto]
38.
Data Path Aware CUS for Timing
Solution Description
Delay potential and path margin computation:
• The path margin is not applied to all endpoints, only to endpoints whose paths have available delay potential
• For every endpoint, the initial slack is stored across scenarios
• Based on the delays and the sizeability of the logic cones, incremental arc delays are annotated on the timer, and a timer update is performed
• If the new slack degrades, zero delay potential is assumed
• Otherwise, the difference is applied as the predicted delay potential
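The delay potential rule above can be illustrated with a small Python sketch. This is not the tool's implementation: the function names, the scenario names, and the choice of the worst slack across scenarios as the endpoint's representative slack are all assumptions made for illustration.

```python
from typing import Dict, List


def delay_potential(initial_slack: Dict[str, float],
                    new_slack: Dict[str, float]) -> float:
    """Predicted delay potential for one endpoint.

    Both arguments map scenario name -> endpoint slack. Assumption: the
    worst (minimum) slack across scenarios stands in for the endpoint.
    """
    before = min(initial_slack.values())
    after = min(new_slack[scenario] for scenario in initial_slack)
    if after < before:
        # Slack degraded after the incremental arc delays were annotated.
        return 0.0
    return after - before


def endpoints_for_margin(potentials: Dict[str, float]) -> List[str]:
    """Only endpoints with available delay potential receive a path margin."""
    return [name for name, value in potentials.items() if value > 0.0]
```

For example, an endpoint whose worst slack moves from 0.0 to 0.25 after the timer update gets a predicted potential of 0.25, while an endpoint whose slack degrades gets zero and is skipped when margins are applied.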
39.
Data Path Aware CUS for Timing
User Interface
Data Path Aware CUS for Timing, with dynamic delay potential computation, can be enabled in the place_opt and compile_fusion flows using the following new public application options.
For the compile_fusion flow:
set_app_options -list {compile.flow.enable_dpa_ccd_timing_with_delay_potential true}
For the place_opt flow:
set_app_options -list {place_opt.flow.enable_dpa_ccd_timing_with_delay_potential true}
40.
Data Path Aware CUS for Power
• Overview
• Solution Description
• User Interface
41.
Data Path Aware CUS for Power
Overview & Benefits
The current CCD-based place_opt/compile_fusion flow makes sequential calls to CCD and data optimization, where CCD derives offsets by considering the available scope for preponing and postponing and the scope of implementability. However, the offsets derived by CCD in the compile stage are not data path aware.
In the current flow, endpoints that have zero or slightly positive slack once data power optimization is completed are blocked from further power reduction to avoid creating new timing criticality. Also, positive slack on paths with no further power potential is left unused throughout the flow.
The feature uses this timing potential on neighboring paths to recover power in parts of the design that have power potential, making the flow more concurrent.
42.
Data Path Aware CUS for Power
Solution Description
Data Path Aware CUS flow for power:
• Power recovery is done by borrowing slack from pipeline stages where the power potential is zero and giving it to the stages that have power potential
• This borrowing applies only to logic cones with positive power potential
• The slack transfer is achieved by tuning the clock latencies (preponing and postponing) that control the launch and capture arrival times
• Once the slack is transferred, an extra data path optimization pass recovers power in the paths with power potential
• Finally, the offsets that do not help power optimization are pruned
This feature impacts the final calls of CUS in the place_opt final_opto flow.
[Figure: a three-flop pipeline (FF1, FF2, FF3) with one path that has power potential but no timing scope and one path that has timing scope but no power potential. CUS postponing transfers the slack, and data power optimization then uses the power potential.]
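How tuning a clock latency moves slack between neighboring pipeline stages can be shown with a short Python sketch. It uses the textbook setup relation (slack = period + capture latency - launch latency - data delay) and ignores setup time, uncertainty, and hold checks. The function name and the numbers are illustrative assumptions, not values from the tool.

```python
from typing import List


def pipeline_slacks(period: int, latencies: List[int],
                    delays: List[int]) -> List[int]:
    """Setup slack of each stage in a linear pipeline, in picoseconds.

    latencies[i] is the clock latency of flop i; delays[i] is the data
    delay from flop i to flop i + 1. Simplified model (assumption): no
    setup time, uncertainty, or derating.
    """
    return [period + latencies[i + 1] - latencies[i] - delays[i]
            for i in range(len(delays))]


# FF1 -> FF2 has no slack; FF2 -> FF3 has 300 ps of unused slack.
before = pipeline_slacks(1000, [0, 0, 0], [1000, 700])

# Postponing the FF2 clock by 200 ps moves 200 ps of slack to FF1 -> FF2.
after = pipeline_slacks(1000, [0, 200, 0], [1000, 700])
```

In this model the total slack of the two stages is unchanged (300 ps before and after); the postponed latency only redistributes it, which is what lets data power optimization work on the stage that previously had no timing scope.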
43.
Data Path Aware CUS for Power
User Interface
Data Path Aware CUS for Power can be enabled using the following application options in the W-2024.09 release.
For the compile_fusion flow:
set_app_options -list {compile.flow.enable_dpa_ccd_power true}
For the place_opt flow:
set_app_options -list {place_opt.flow.enable_dpa_ccd_power true}
44.
Switching Power-Aware Pre-CTS CCD
• Overview
• Solution Description
• User Interface
45.
Switching Power-Aware Pre-CTS CCD
Overview & Benefits
Overview
Currently, CUS uses a clock tree model that cannot fully capture the more complex design characteristics, one of which is power. In the W-2024.09 release, this feature introduces a new cost model that optimizes clock tree power during useful skew optimization by leveraging the switching activity of the clock tree. The new formulation adds a power-aware cost function, that is, one that derives higher offset values on loads placed on nets with higher switching activity.
Benefits
• Improved clock tree dynamic power from placement through clock tree synthesis and beyond, without any degradation in runtime
• Improved clock power and timing QoR
46.
Switching Power-Aware Pre-CTS CCD
Solution Description
A new total clock tree power estimation is introduced in CUS to balance the leakage and dynamic power cost. The estimation approximates power in two phases:
• In the first phase, clock gate power is estimated by accumulating the leakage and dynamic power of every clock gate
• In the second phase, the power of repeaters is estimated. A representative repeater power is chosen as the average across all repeater library cells. The repeater contribution is then the representative dynamic and leakage power multiplied by the estimated repeater count per net, which is calculated from the maximum fanout constraint
Once dynamic and leakage power are estimated, the clock tree cost is scaled according to the ratio of the estimated dynamic and leakage power.
Note that for this feature to be effective, at least one power scenario (leakage or dynamic) is needed. Power awareness is considered in every call of CUS in the compile stage, with the same weight.
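The two-phase estimate can be sketched in Python as follows. This is only an illustration of the description above. The data structure, the function name, and in particular the repeater count of ceil(fanout / max_fanout) per net are assumptions; the slides say only that the count is derived from the maximum fanout constraint. The final cost scaling is not modeled.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Power:
    leakage: float
    dynamic: float


def estimate_clock_tree_power(clock_gates: List[Power],
                              repeater_lib_cells: List[Power],
                              net_fanouts: List[int],
                              max_fanout: int) -> Power:
    # Phase 1: accumulate leakage and dynamic power of every clock gate.
    gate_leakage = sum(g.leakage for g in clock_gates)
    gate_dynamic = sum(g.dynamic for g in clock_gates)

    # Phase 2: representative repeater = average over repeater library cells.
    rep_leakage = sum(c.leakage for c in repeater_lib_cells) / len(repeater_lib_cells)
    rep_dynamic = sum(c.dynamic for c in repeater_lib_cells) / len(repeater_lib_cells)

    # Assumption: one repeater per max_fanout loads on each net.
    repeater_count = sum(math.ceil(f / max_fanout) for f in net_fanouts)

    return Power(leakage=gate_leakage + rep_leakage * repeater_count,
                 dynamic=gate_dynamic + rep_dynamic * repeater_count)
```

The ratio of the returned dynamic and leakage values is the quantity that would then drive the cost scaling mentioned above.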
47.
Switching Power-Aware Pre-CTS CCD
User Interface
The Clock Power Aware CUS feature is on by default and can be enabled with the OBD mega option starting from the W-2024.09 release:
set_app_options -list {flow.common.effort 12345}
48.
Fast CUS (W-2024.09, IC Compiler II)
• Overview
• Solution Description
• User Interface
49.
Fast CUS
Overview & Benefits
Overview
Currently, the IC Compiler II place_opt flow makes four main calls to the global-solver-based CUS engine (at place_opt initial_opto and at place_opt final_opto), each of which attempts to adjust clock latencies to improve timing QoR. This takes a significant share of the place_opt runtime (10-25%).
In the W-2024.09 release, the tool improves CUS runtime by reducing the number of calls to the global-solver-based CUS engine and introducing calls to incremental CUS, which picks target endpoints that are critical for the design and works only on them. Fast CUS impacts the initial_opto stage of place_opt; compile_fusion is not affected.
Benefits
Improves runtime in the place_opt flow while preserving QoR. For an average design, a 20-25% speedup for CUS and a 2% speedup for the place_opt flow are expected.
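The quoted figures can be cross-checked with simple arithmetic. The Python sketch below assumes that "speedup" means a fractional reduction in CUS runtime and that the rest of place_opt is unchanged; both are assumptions, and the function name is illustrative.

```python
def flow_runtime_saving(cus_share: float, cus_speedup: float) -> float:
    """Fraction of place_opt runtime saved when only CUS gets faster.

    cus_share: fraction of place_opt runtime spent in CUS.
    cus_speedup: fractional reduction of CUS runtime (assumed meaning).
    """
    return cus_share * cus_speedup


# CUS at 10% of the flow, sped up by 20%, saves about 2% of the flow runtime.
low_end = flow_runtime_saving(0.10, 0.20)
```

Under these assumptions, the expected 2% flow speedup corresponds to the low end of the 10-25% CUS share combined with the low end of the 20-25% CUS speedup.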
50.
Fast CUS
User Interface
The Fast CUS feature is on by default in the place_opt flow and can be enabled with the OBD mega option starting from the W-2024.09 release:
set_app_options -list {flow.common.effort 12345}
51.
Clock DRC Improvements
• Overview
• Solution Description
• User Interface
52.
Clock DRC Improvements
Overview & Benefits
In some designs, max_transition violations on the clock tree remain unfixed after the clock nets are routed, and they have to be handled by CCD DRC fixing in the clock_opt final_opto flow. In some of these designs, maximum transition violations on the clock remain even after the concurrent clock and data (CCD) and multi-objective optimization CCD (MOO-CCD) calls. Such violations happen because of commit failures caused by legalization failures.
As a workaround, over-constraints are applied so that the DRC violations are more visible for CCD to work on. But over-constraining, being a workaround, always limits out-of-the-box (OOTB) DRC fixing and reduces effectiveness because it does not target the root cause. Clock power is also expected to degrade with over-constraining.
This feature provides enhancements so that the leftover maximum transition violations can be effectively reduced with CCD in the clock_opt final_opto flow.
53.
Clock DRC Improvements
Solution Description
The proposed solution consists of two parts:
• Enhance the maximum transition fixing ability of the legacy CCD engine for DRC fixing (DRC-CCD) and of the multi-objective CCD engine for DRC fixing
• Enhance the protection of maximum transition in other calls of CCD down the flow
The feature includes a solution to fix maximum transition violations caused by:
• Commit failures due to legalization failures
• No RC node found to insert buffers
• LEQs having via ladder (VL) candidates
• Maximum transition values that are not accurate in the sub-graph
• Sizing down of a cell by multi-vector CCD from a library cell with a VL constraint to one without a VL constraint
54.
Clock DRC Improvements
User Interface
The clock transition DRC closure feature can be enabled using the following application option in the W-2024.09 release:
set_app_options -name ccd.enable_clock_drc_improvements -value true; # Default: false
55.
Agenda
Multisource Clock Tree Synthesis (MSCTS) Enhancements
• H-tree Improvements (GA)
• Tap Assignment Improvements (GA)
• Automated H-tree Synthesis Enhancements (GA)
• Dynamic Clock Power Improvement for SMSCTS (GA)
• Structural MSCTS Clock QoR Improvements (GA)
• Irregular Global Tree Synthesis (GA)
56.
H-tree Improvements
• Overview
• Solution Description
• User Interface
57.
H-tree Improvements
Overview
During global tree synthesis, we build a centrally symmetric H-tree structure from the input pins of the tap drivers to the output pin of the H-tree root driver. A few customers have a special need for a single-repeater solution at H-tree junctions, instead of two repeaters, to avoid routing and pin congestion issues.
This feature aims to:
• Enhance the existing H-tree synthesis algorithm to improve H-tree QoR (especially latency)
• Resolve the routing and pin congestion issues caused by multiple H-tree repeater cells at junctions
• Improve multiple H-tree stem path skew
58.
H-tree Improvements
Solution Description
The enhancements covered as part of this feature are:
• Support for best reference cell selection for latency improvement
  • Evaluate all repeaters (from the H-tree library cell collection) for each node and generate a solution if the repeater meets the DRC constraint
  • All possible solutions (with all repeaters) are available at the top
  • Then, at the top level, pick the solution that is best in terms of delay
• Preference to single-buffer solution
  • Higher preference is given to the single-buffer solution for each node. The best single-repeater solution is selected for implementation based on latency
  • This helps resolve routing and pin congestion issues caused by two-repeater solutions at H-tree junctions
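The selection logic described above can be sketched in Python. This is an illustrative model only: the Solution record, the library cell names, and the way the single-buffer preference is applied (filter first, then pick the lowest delay) are assumptions, not the tool's algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Solution:
    lib_cell: str          # hypothetical library cell name
    repeater_count: int    # repeaters placed at the junction
    delay: float           # latency contribution of this solution
    meets_drc: bool


def pick_solution(candidates: List[Solution],
                  prefer_single: bool = True) -> Optional[Solution]:
    """Keep DRC-clean candidates, then pick the lowest-delay one.

    With prefer_single, single-repeater candidates win whenever at least
    one of them is DRC clean.
    """
    legal = [c for c in candidates if c.meets_drc]
    if not legal:
        return None
    if prefer_single:
        single = [c for c in legal if c.repeater_count == 1]
        if single:
            legal = single
    return min(legal, key=lambda c: c.delay)


candidates = [
    Solution("BUF_A", 2, 10.0, True),
    Solution("BUF_B", 1, 12.0, True),
    Solution("BUF_C", 1, 11.0, False),  # rejected: violates DRC
]
```

With the single-buffer preference on, BUF_B is chosen even though the two-repeater BUF_A solution is faster, which mirrors the trade of some latency for less routing and pin congestion at junctions.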
59.
H-tree Improvements
Solution Description
Multiple H-tree improvements:
• The ZBUF engine is tuned for global stem path skew improvement
• The routing topology performed post buffering is enhanced for better net lengths
60.
H-tree Improvements
User Interface
Preference to the single-buffer solution at junctions can be enabled using the following application option:
set_app_options -list {cts.multisource.htree_single_repeater_at_node true}
Support for best reference cell selection for latency can be enabled using the following application option:
set_app_options -list {cts.multisource.htree_explore_all_repeater_solutions true}
61.
Tap Assignment Improvements
• Overview
• Solution Description
• User Interface
62.
Tap Assignment Improvements
Overview
During tap assignment, we distribute the clock network below the tap drivers (the local tree) among the tap driver cells to create smaller subtrees for the clock_opt build_clock step to synthesize.
This feature aims to:
• Reduce the cluster size during initial clustering
• Prevent tap assignment for ICGs driving only ignore pins
• Add multi-voltage awareness to tap assignment, which enables cloning of cells across power domains
63.
Tap Assignment Improvements
Solution Description
The enhancements covered as part of this feature are:
• Reduce the cluster size during initial clustering, which prevents sinks from connecting to tap drivers far away
  • There are a few cases where macro sinks are assigned to faraway taps
  • This happens because the initial clusters created during tap assignment were too big, which caused the macro sinks to be clustered with other sinks close to the boundary of two tap regions
  • This issue is resolved by reducing the initial cluster size
• Prevent tap assignment for ICGs driving only ignore pins
  • We now have a check in the tap assignment flow to prevent the creation of single-fanout CGs that drive only ignore sinks, as this could break CUS timing closure
64.
Tap Assignment Improvements
Solution Description
Add MV awareness to tap assignment, which enables cloning of cells across power domains.
[Figure: Fig 1: Without MV aware TA. Fig 2: With MV aware TA. Each panel shows a switch domain and an AON domain.]
65.
Tap Assignment Improvements
User Interface
Use the following application option to control the reduced cluster sizing feature:
set_app_options -list {cts.multisource.limit_cluster_size true}
Use the following application option to control the feature that prevents tap assignment for ICGs driving only ignore pins:
set_app_options -list {cts.multisource.tap_assignment_reassign_ignore_sinks true}
MV awareness, which enables cloning of cells across power domains, can be enabled using the following application option:
set_app_options -list {cts.multisource.enable_mv_aware_tap_assignment true}
66.
Enhancements to Automated H-tree Synthesis
• Overview
• User Interface
67.
Enhancements to Automated H-tree Synthesis
Overview
Starting from the W-2024.09 release, the automated H-tree synthesis command, synthesize_regular_multisource_clock_trees, is easier to use because pin connection using Zroute is now a separate step. An additional sub-step named route_pin_connections is added to the synthesize_regular_multisource_clock_trees -from/-to steps. This new step allows users to return to the shell after H-tree trunk routing using the galaxy custom router (GCR) and run the pin connection as a separate step.
68.
Enhancements to Automated H-tree Synthesis
Solution Description
The enhancements covered as part of this feature are:
• Pin connection using Zroute becomes a separate step of the automated H-tree synthesis command, synthesize_regular_multisource_clock_trees
• The additional sub-step, named route_pin_connections, is added to the synthesize_regular_multisource_clock_trees -from/-to steps
• The new step allows users to return to the shell after H-tree trunk routing using the galaxy custom router (GCR) and run the pin connection as a separate step
69.
Enhancements to Automated H-tree Synthesis
User Interface
The new route_pin_connections sub-step can be used to stop before pin connection and then run it as an atomic step with the -from and -to switches:
set_app_options -list {cts.multisource.tap_assignment_reassign_ignore_sinks true}
set_regular_multisource_clock_tree_options -clock $clk -topology htree_only … -skip_pin_connections
synthesize_regular_multisource_clock_trees -to tap_synthesis
synthesize_regular_multisource_clock_trees -from htree_synthesis -to htree_synthesis
synthesize_regular_multisource_clock_trees -from route_pin_connections
70.
Enhancements to Automated H-tree Synthesis
Things to note
• Make sure to set the -skip_pin_connections switch with the set_regular_multisource_clock_tree_options command when an atomic pin routing step is desired
• When the H-tree options are defined with -skip_pin_connections, the htree_synthesis step returns before "CTS STEP: Zroute for pin connections", and -from route_pin_connections can be used to run this step atomically
• If it is not defined, htree_synthesis runs the pin connection step as well and returns to the shell only after all the steps of H-tree synthesis
• If the -skip_pin_connections option is not used, the new pin connection step cannot be performed, and an error message is generated
• There is no need to alter the set_regular_multisource_clock_tree_options definition before running -from route_pin_connections
71.
Dynamic Clock Power Improvement for SMSCTS
• Overview
• Solution Description
• User Interface
72.
Dynamic Clock Power Improvement for SMSCTS
Overview
Clock dynamic power increases linearly with the switching activity of nets. This feature aims to reduce the wire length of high-switching nets so that overall clock dynamic power improves. With proper SAIF information, the tool accurately estimates the high-switching nets. The cell or cluster of cells with a high input toggle rate is relocated closer to its driver.
In the V-2023.12 release, we relocated all the cells that had a toggle ratio (Output TR and Input TR) less than a threshold value. The relocation percentage also depended on the toggle ratio value. Some gaps were identified in this initial implementation: the relocation of a few candidate nodes increased the overall wire length and degraded power. Starting from the W-2024.09 release, this feature is enhanced to address these gaps.
73.
Dynamic Clock Power Improvement for SMSCTS
Solution Description
Starting from the W-2024.09 release, this feature is enhanced with an accept-and-reject mechanism after each power-aware relocation, so that the tool accepts only moves that improve dynamic power.
The tool no longer relocates all the nodes below a threshold toggle ratio. Instead, it first identifies all high-switching nets and their drivers. For each such driver, all the loads are relocated incrementally towards it. The tool calculates the power and wire length after each relocation and accepts the move only if there is an improvement. Any relocation that degrades power is reverted, and the tool moves to the next driver.
The power-aware relocation steps kick in after major SMSCTS steps such as DRC fixing and latency optimization and are supported in both standalone and integrated flows.
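The accept-and-reject loop can be pictured with the following Python sketch. It is a toy model under stated assumptions: dynamic power is approximated by toggle-rate-weighted Manhattan wire length, each move covers a fixed fraction of the remaining distance to the driver, and all names (Load, switching_cost, relocate_towards) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


def manhattan(a: Point, b: Point) -> float:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


@dataclass
class Load:
    location: Point
    input_toggle_rate: float    # activity on the net from the driver
    output_toggle_rate: float   # activity on the net to its own sinks
    sinks: List[Point] = field(default_factory=list)


def switching_cost(driver: Point, load: Load) -> float:
    """Toggle-rate-weighted wire length, a stand-in for dynamic power."""
    upstream = load.input_toggle_rate * manhattan(driver, load.location)
    downstream = load.output_toggle_rate * sum(
        manhattan(load.location, s) for s in load.sinks)
    return upstream + downstream


def relocate_towards(driver: Point, load: Load,
                     step: float = 0.5, max_moves: int = 4) -> int:
    """Move the load towards its driver; keep a move only if cost improves."""
    accepted = 0
    for _ in range(max_moves):
        old_location = load.location
        old_cost = switching_cost(driver, load)
        x, y = old_location
        load.location = (x + step * (driver[0] - x),
                         y + step * (driver[1] - y))
        if switching_cost(driver, load) < old_cost:
            accepted += 1
        else:
            load.location = old_location  # revert the degrading move
            break
    return accepted
```

A clock gate with a busy input and a mostly idle output keeps moving towards its driver, while a load whose output net toggles more than its input net has its first move rejected and stays in place.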
74.
Dynamic Clock Power Improvement for SMSCTS
User Interface
This feature can be enabled in both standalone and integrated SMSCTS flows using the following application option:
set_app_options -name cts.multisource.enable_activity_aware_relocation -value true
75.
Structural MSCTS Clock QoR Improvements (GA)
• Overview
• Solution Description
• User Interface
76.
Structural MSCTS Clock QoR Improvements
Overview
Multiple enhancements are introduced to improve the overall SMSCTS clock QoR:
• Enhancement to reduce long path violations to improve latency (on-by-default from W-2024.09)
• Enhancement to dynamic sink reassignment for reducing clock wire length
• Enhancement to first-level cell assignment for better clock implementation (on-by-default from W-2024.09)
77.
Structural MSCTS Clock QoR Improvements
Solution Description
Enhancement to reduce long path violations to improve latency. This addresses long path violations during latency optimization:
• SMSCTS downsizes cells for area recovery after DRC fixing. With the new enhancement, the tool does not downsize a cell if it is part of a long path
• The SMSCTS flow does not split cells for latency optimization if the driver has a transition violation. The synthesize_multisource_clock_subtrees command is enhanced to enable splitting of intermediate cells for latency even if they violate transition. This has helped to bring down long path violations
78.
Structural MSCTS Clock QoR Improvements
Solution Description
Enhancement to dynamic sink reassignment for reducing clock wire length. This feature targets the SMSCTS clock wire length. Loads are reassigned to nearby equivalent drivers so that routing overlaps are reduced and total clock wire length improves.
[Figure: layout without reassignment and with reassignment, with overlaps circled. Legend: purple = L2 inverter, yellow = L2 output net, red = L3 ICG.]
79.
Structural MSCTS Clock QoR Improvements
Solution Description
Enhancement to first-level cell assignment for better clock implementation. First-level cells are meant to be assigned to the subtree driver that is closest to their fanout centroid. An issue was observed where, at intermediate stages of SMSCTS, first-level cells were not assigned to the nearest subtree driver. Clock optimizations were then done based on this sub-optimal first-level assignment, and the tool later reassigned the first-level cells to their closest SMSCTS subtree driver or SMSCTS clock tap. This led to a miscorrelation, which this feature addresses.
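The assignment rule, choosing the subtree driver closest to a first-level cell's fanout centroid, can be sketched in Python as follows. The use of Manhattan distance, the function names, and the driver names are assumptions made for illustration; the slides do not state the distance metric.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def centroid(points: List[Point]) -> Point:
    """Arithmetic mean of the fanout locations."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def nearest_subtree_driver(fanout: List[Point],
                           drivers: Dict[str, Point]) -> str:
    """Name of the driver closest (Manhattan, assumed) to the fanout centroid."""
    cx, cy = centroid(fanout)
    return min(drivers,
               key=lambda name: abs(drivers[name][0] - cx)
               + abs(drivers[name][1] - cy))


fanout_pins = [(0.0, 0.0), (4.0, 0.0), (2.0, 6.0)]
subtree_drivers = {"tap_a": (0.0, 0.0), "tap_b": (3.0, 3.0)}
```

Here the centroid is (2, 2), so the cell belongs to tap_b. Running later optimizations against an assignment to tap_a instead is the kind of miscorrelation described above.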
80.
Structural MSCTS Clock QoR Improvements
User Interface
Set the following application option to enable the enhancement to dynamic sink reassignment for reducing clock wire length:
set_app_options -name cts.multisource.subtree_dynamic_sinks_reclustering -value true
81.
Structural MSCTS Clock QoR Improvements
User Interface
The following two features are on by default starting from the W-2024.09 release. For testing purposes in older versions, use the application options below.
Enhancement to reduce long path violations to improve latency:
set_app_options -name cts.multisource.reduce_long_path_violations -value true
Enhancement to first-level cell assignment for better clock implementation:
set_app_options -name cts.multisource.first_level_cell_tree_assignment_fix -value true
82.
Irregular Global Tree Synthesis
• Overview
• User Interface
83.
Irregular Global Tree Synthesis
Overview
Prior to the W-2024.09 release, automated global trees could only be created to symmetrically inserted tap drivers, distributed evenly across the layout.
• This caused some floorplan and global tree complexities that required manual global tree insertion
• It also left scope for a different style of global tree methodology for designs without tight on-chip variation (OCV) issues or common path requirements
Starting from the W-2024.09 release, a new global tree synthesis methodology, called irregular global tree synthesis, is introduced to automatically construct multilevel, DRC-clean global trees to non-symmetric, unevenly inserted tap drivers.
• In irregular global tree synthesis, global trees are created using multilevel buffer trees in a non-H-tree style, ensuring minimum latency for all tap drivers
• This global tree style can be used to construct a global tree for irregularly placed tap drivers at the block level, or even to build a global distribution of the clock to block inputs in top-level channels
84.
Irregular Global Tree Synthesis
User Interface (1/2)
A new set of commands is introduced for irregular global tree synthesis.
# Define settings related to the global tree
set_irregular_multisource_clock_tree_options
  -clock <clock>
  [-net <net_name>]: specifies the net for global tree creation. If not specified, the global tree is built on the net connected to the clock source. This net must drive all of the inserted tap drivers
  -lib_cell {list}: specifies the lib_cell (inverter or buffer) to be used in the global tree
  -routing_rule <rule>
  -routing_layers {list}
  -prefix: names the inserted global tree cells
# Synthesize the global tree
synthesize_irregular_multisource_clock_trees
  [-clocks {list}]: specifies the clocks to synthesize, if multiple irregular global tree options are defined
85.
Irregular Global Tree Synthesis
User Interface (2/2)
Additionally, the irregular global tree options can be reported or removed using the following new commands:
# Report settings related to the irregular global tree
report_irregular_multisource_clock_tree_options [-clock] <clock>
# Remove the settings related to the irregular global tree
remove_irregular_multisource_clock_tree_options [-clock] <clock>
Note: Irregular global tree synthesis performs global tree construction to already inserted tap drivers. Users must insert the irregular tap drivers themselves as required.