This article is licensed under a Creative Commons Attribution 4.0 International License 
Timing derating means adding extra margin to static timing analysis (STA) to accommodate variation in the timing parameters of gates relative to how they were characterized in a timing library. Timing libraries are characterized for a particular operating condition representing a combination of process, voltage and temperature (PVT for short).
However, the real operating condition will certainly differ from the one used for STA (e.g. the supply voltage fluctuates, it drops across the power distribution network and in response to current draw peaks, the manufacturing process fluctuates in the fab, parameters change from wafer to wafer, etc.). The resulting variation of operating conditions has a global and a local component.
The global variation is countered by running STA at the corner conditions that form a bounding "box" around where the real silicon is to operate. Despite accounting for worst-case and best-case shifts in operating conditions, every chip will suffer from local variations (a.k.a. on-chip variation or OCV), meaning that some gates will be a little faster, some a little slower, and some will see a slightly higher temperature or a slightly lower voltage.
To account for the local variation, designers add extra margin to make sure the performance of the chip stays on the safe side. The simplest means is to scale (or de-rate) the timing from a gate library to yield more pessimistic timing. Hence the term timing derating.
Note
The examples in this text were run with PrimeTime S-2021.06-SP5 and Tempus 21.14.
The easiest way to add pessimism to STA timing is to apply a fixed scaling factor to the timing
paths so that slow paths become slower and fast paths become faster. That is what the
set_timing_derate -early|-late SDC command does. By default, all paths
have a scaling factor of 1.0 (shown by PrimeTime report_timing -derate). To make slow, or late,
paths 10% slower, we use set_timing_derate -late 1.1. Note that the late path for the setup
check is the launch clock path plus the data path, and for the hold check it is the capture clock path.
The reports below show the no-derate baseline (left) next to the derated run (right):
pt_shell> | pt_shell> set_timing_derate -late 1.1
pt_shell> report_timing -delay_type max -from FF1 -to FF2 -nosplit \ | pt_shell> report_timing -delay_type max -from FF1 -to FF2 -nosplit \
-derate | -derate
Last common pin: clk | Last common pin: CKBUF2/Q
Path Group: CLK | Path Group: CLK
Path Type: max | Path Type: max
Point Derate Incr Path | Point Derate Incr Path
----------------------------------------------------------- | -----------------------------------------------------------
clock CLK (rise edge) 0.000 0.000 | clock CLK (rise edge) 0.000 0.000
clock network delay (propagated) 5.000 5.000 | clock network delay (propagated) 5.500 5.500
FF1/CK (dffprqx05_d) 0.000 5.000 r | FF1/CK (dffprqx05_d) 0.000 5.500 r
FF1/Q (dffprqx05_d) <- 1.000 3.000 8.000 f | FF1/Q (dffprqx05_d) <- 1.100 3.300 8.800 f
BUF1/A (bufx10_d) 1.000 0.000 8.000 f | BUF1/A (bufx10_d) 1.100 0.000 8.800 f
BUF1/Q (bufx10_d) 1.000 1.000 9.000 f | BUF1/Q (bufx10_d) 1.100 1.100 9.900 f
... | ...
FF2/D (dffprqx05_d) 1.000 0.000 12.000 f | FF2/D (dffprqx05_d) 1.100 0.000 13.200 f
data arrival time 12.000 | data arrival time 13.200
clock CLK (rise edge) 10.000 10.000 | clock CLK (rise edge) 10.000 10.000
clock network delay (propagated) 5.000 15.000 | clock network delay (propagated) 5.000 15.000
clock reconvergence pessimism 0.000 15.000 | clock reconvergence pessimism 0.200 15.200
clock uncertainty -0.500 14.500 | clock uncertainty -0.500 14.700
FF2/CK (dffprqx05_d) 14.500 r | FF2/CK (dffprqx05_d) 14.700 r
library setup time 1.000 -0.400 14.100 | library setup time 1.000 -0.400 14.300
data required time 14.100 | data required time 14.300
----------------------------------------------------------- | -----------------------------------------------------------
data required time 14.100 | data required time 14.300
data arrival time -12.000 | data arrival time -13.200
----------------------------------------------------------- | -----------------------------------------------------------
slack (MET) 2.100 | slack (MET) 1.100
pt_shell> | pt_shell> set_timing_derate -late 1.1
pt_shell> report_timing -delay_type min -from FF1 -to FF2 -nosplit \ | pt_shell> report_timing -delay_type min -from FF1 -to FF2 -nosplit \
-derate -path_type full_clock | -derate -path_type full_clock
Last common pin: clk | Last common pin: CKBUF2/Q
Path Group: CLK | Path Group: CLK
Path Type: min | Path Type: min
Point Derate Incr Path | Point Derate Incr Path
----------------------------------------------------------- | -----------------------------------------------------------
clock CLK (rise edge) 0.000 0.000 | clock CLK (rise edge) 0.000 0.000
clock source latency 0.000 0.000 | clock source latency 0.000 0.000
clk (in) 0.000 0.000 r | clk (in) 0.000 0.000 r
CKBUF1/A (bufx10_d) 1.000 0.000 0.000 r | CKBUF1/A (bufx10_d) 1.000 0.000 0.000 r
CKBUF1/Q (bufx10_d) 1.000 1.000 1.000 r | CKBUF1/Q (bufx10_d) 1.000 1.000 1.000 r
... | ...
FF1/CK (dffprqx05_d) 1.000 0.000 5.000 r | FF1/CK (dffprqx05_d) 1.000 0.000 5.000 r
FF1/Q (dffprqx05_d) <- 1.000 2.000 7.000 r | FF1/Q (dffprqx05_d) <- 1.000 2.000 7.000 r
BUF1/A (bufx10_d) 1.000 0.000 7.000 r | BUF1/A (bufx10_d) 1.000 0.000 7.000 r
BUF1/Q (bufx10_d) 1.000 1.000 8.000 r | BUF1/Q (bufx10_d) 1.000 1.000 8.000 r
... | ...
FF2/D (dffprqx05_d) 1.000 0.000 11.000 r | FF2/D (dffprqx05_d) 1.000 0.000 11.000 r
data arrival time 11.000 | data arrival time 11.000
clock CLK (rise edge) 0.000 0.000 | clock CLK (rise edge) 0.000 0.000
clock source latency 0.000 0.000 | clock source latency 0.000 0.000
clk (in) 0.000 0.000 r | clk (in) 0.000 0.000 r
CKBUF1/A (bufx10_d) 1.000 0.000 0.000 r | CKBUF1/A (bufx10_d) 1.100 0.000 0.000 r
CKBUF1/Q (bufx10_d) 1.000 1.000 1.000 r | CKBUF1/Q (bufx10_d) 1.100 1.100 1.100 r
... | ...
FF2/CK (dffprqx05_d) 1.000 0.000 5.000 r | FF2/CK (dffprqx05_d) 1.100 0.000 5.500 r
clock reconvergence pessimism 0.000 5.000 | clock reconvergence pessimism -0.200 5.300
clock uncertainty 0.500 5.500 | clock uncertainty 0.500 5.800
library hold time 1.000 0.300 5.800 | library hold time 1.000 0.300 6.100
data required time 5.800 | data required time 6.100
----------------------------------------------------------- | -----------------------------------------------------------
data required time 5.800 | data required time 6.100
data arrival time -11.000 | data arrival time -11.000
----------------------------------------------------------- | -----------------------------------------------------------
slack (MET) 5.200 | slack (MET) 4.900
There are a few things to note in the above example:
- Only the late path is scaled (i.e. the launch clock and data path for max timing and the capture clock path for min timing).
- (Clock) cells that appear in both the late and the early path are scaled only in the late path (see the min path timing, which used the -path_type full_clock option). The STA tool removes this clock reconvergence pessimism (CRP): it adjusts the capture clock path by the late/early difference on the common clock path segment, adding it for the setup check and subtracting it for the hold check.
- Slack reduction (due to being 10% more pessimistic) is more pronounced for the setup timing than for the hold timing. As we will see in the next example, the impact reverses for the early derate, so that hold timing is impacted more and setup timing less.
To also scale the early path, we need a second set_timing_derate command.
However, as the early path is the fast one, we need to scale its delays down to make it even faster.
Hence set_timing_derate -early 0.9 gives a 10% margin. The reports below show the early
derate (right) vs. the default no-scaling baseline (left):
pt_shell> | pt_shell> set_timing_derate -early 0.9
pt_shell> report_timing -delay_type max -from FF1 -to FF2 -nosplit \ | pt_shell> report_timing -delay_type max -from FF1 -to FF2 -nosplit \
-derate | -derate
Last common pin: clk | Last common pin: CKBUF2/Q
Path Group: CLK | Path Group: CLK
Path Type: max | Path Type: max
Point Derate Incr Path | Point Derate Incr Path
----------------------------------------------------------- | -----------------------------------------------------------
clock CLK (rise edge) 0.000 0.000 | clock CLK (rise edge) 0.000 0.000
clock network delay (propagated) 5.000 5.000 | clock network delay (propagated) 5.000 5.000
FF1/CK (dffprqx05_d) 0.000 5.000 r | FF1/CK (dffprqx05_d) 0.000 5.000 r
FF1/Q (dffprqx05_d) <- 1.000 3.000 8.000 f | FF1/Q (dffprqx05_d) <- 1.000 3.000 8.000 f
BUF1/A (bufx10_d) 1.000 0.000 8.000 f | BUF1/A (bufx10_d) 1.000 0.000 8.000 f
BUF1/Q (bufx10_d) 1.000 1.000 9.000 f | BUF1/Q (bufx10_d) 1.000 1.000 9.000 f
... | ...
FF2/D (dffprqx05_d) 1.000 0.000 12.000 f | FF2/D (dffprqx05_d) 1.000 0.000 12.000 f
data arrival time 12.000 | data arrival time 12.000
clock CLK (rise edge) 10.000 10.000 | clock CLK (rise edge) 10.000 10.000
clock network delay (propagated) 5.000 15.000 | clock network delay (propagated) 4.500 14.500
clock reconvergence pessimism 0.000 15.000 | clock reconvergence pessimism 0.200 14.700
clock uncertainty -0.500 14.500 | clock uncertainty -0.500 14.200
FF2/CK (dffprqx05_d) 14.500 r | FF2/CK (dffprqx05_d) 14.200 r
library setup time 1.000 -0.400 14.100 | library setup time 1.000 -0.400 13.800
data required time 14.100 | data required time 13.800
----------------------------------------------------------- | -----------------------------------------------------------
data required time 14.100 | data required time 13.800
data arrival time -12.000 | data arrival time -12.000
----------------------------------------------------------- | -----------------------------------------------------------
slack (MET) 2.100 | slack (MET) 1.800
pt_shell> | pt_shell> set_timing_derate -early 0.9
pt_shell> report_timing -delay_type min -from FF1 -to FF2 -nosplit \ | pt_shell> report_timing -delay_type min -from FF1 -to FF2 -nosplit \
-derate -path_type full_clock | -derate -path_type full_clock
Last common pin: clk | Last common pin: CKBUF2/Q
Path Group: CLK | Path Group: CLK
Path Type: min | Path Type: min
Point Derate Incr Path | Point Derate Incr Path
----------------------------------------------------------- | -----------------------------------------------------------
clock CLK (rise edge) 0.000 0.000 | clock CLK (rise edge) 0.000 0.000
clock source latency 0.000 0.000 | clock source latency 0.000 0.000
clk (in) 0.000 0.000 r | clk (in) 0.000 0.000 r
CKBUF1/A (bufx10_d) 1.000 0.000 0.000 r | CKBUF1/A (bufx10_d) 0.900 0.000 0.000 r
CKBUF1/Q (bufx10_d) 1.000 1.000 1.000 r | CKBUF1/Q (bufx10_d) 0.900 0.900 0.900 r
... | ...
FF1/CK (dffprqx05_d) 1.000 0.000 5.000 r | FF1/CK (dffprqx05_d) 0.900 0.000 4.500 r
FF1/Q (dffprqx05_d) <- 1.000 2.000 7.000 r | FF1/Q (dffprqx05_d) <- 0.900 1.800 6.300 r
BUF1/A (bufx10_d) 1.000 0.000 7.000 r | BUF1/A (bufx10_d) 0.900 0.000 6.300 r
BUF1/Q (bufx10_d) 1.000 1.000 8.000 r | BUF1/Q (bufx10_d) 0.900 0.900 7.200 r
... | ...
FF2/D (dffprqx05_d) 1.000 0.000 11.000 r | FF2/D (dffprqx05_d) 0.900 0.000 9.900 r
data arrival time 11.000 | data arrival time 9.900
clock CLK (rise edge) 0.000 0.000 | clock CLK (rise edge) 0.000 0.000
clock source latency 0.000 0.000 | clock source latency 0.000 0.000
clk (in) 0.000 0.000 r | clk (in) 0.000 0.000 r
CKBUF1/A (bufx10_d) 1.000 0.000 0.000 r | CKBUF1/A (bufx10_d) 1.000 0.000 0.000 r
CKBUF1/Q (bufx10_d) 1.000 1.000 1.000 r | CKBUF1/Q (bufx10_d) 1.000 1.000 1.000 r
... | ...
FF2/CK (dffprqx05_d) 1.000 0.000 5.000 r | FF2/CK (dffprqx05_d) 1.000 0.000 5.000 r
clock reconvergence pessimism 0.000 5.000 | clock reconvergence pessimism -0.200 4.800
clock uncertainty 0.500 5.500 | clock uncertainty 0.500 5.300
library hold time 1.000 0.300 5.800 | library hold time 1.000 0.300 5.600
data required time 5.800 | data required time 5.600
----------------------------------------------------------- | -----------------------------------------------------------
data required time 5.800 | data required time 5.600
data arrival time -11.000 | data arrival time -9.900
----------------------------------------------------------- | -----------------------------------------------------------
slack (MET) 5.200 | slack (MET) 4.300
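The slack numbers in the four reports above can be re-derived with a few lines of arithmetic. The following Python sketch is only a toy model of the FF1 to FF2 path, not what the STA tool does internally. The un-derated delays are read off the baseline reports, and the 2.0 delay of the common clock segment is an assumption inferred from the 0.200 CRP credit at a 10% derate. All names are made up for this illustration.

```python
# Toy model of flat derating on the FF1 -> FF2 path from the reports above.
# Un-derated delays are read off the baseline reports. COMMON_CLOCK = 2.0 is
# an assumption inferred from the 0.200 CRP credit seen at a 10% derate.
CLOCK = 5.0         # clock network delay to FF1/CK and to FF2/CK
COMMON_CLOCK = 2.0  # shared clock segment up to CKBUF2/Q (assumed)
DATA_MAX = 7.0      # FF1/CK -> FF2/D delay used by the setup check
DATA_MIN = 6.0      # FF1/CK -> FF2/D delay used by the hold check
PERIOD, UNCERTAINTY, SETUP, HOLD = 10.0, 0.5, 0.4, 0.3


def setup_slack(late=1.0, early=1.0):
    """Setup check: launch clock and data are late, capture clock is early."""
    arrival = (CLOCK + DATA_MAX) * late
    crp = COMMON_CLOCK * (late - early)  # pessimism on the shared segment
    required = PERIOD + CLOCK * early + crp - UNCERTAINTY - SETUP
    return round(required - arrival, 3)


def hold_slack(late=1.0, early=1.0):
    """Hold check: launch clock and data are early, capture clock is late."""
    arrival = (CLOCK + DATA_MIN) * early
    crp = COMMON_CLOCK * (late - early)
    required = CLOCK * late - crp + UNCERTAINTY + HOLD
    return round(arrival - required, 3)


for label, factors in [("baseline", {}),
                       ("-late 1.1", {"late": 1.1}),
                       ("-early 0.9", {"early": 0.9})]:
    print(label, setup_slack(**factors), hold_slack(**factors))
```

Under these assumptions the script gives setup/hold slacks of 2.1/5.2 for the baseline, 1.1/4.9 for the late derate and 1.8/4.3 for the early derate, which matches the reports.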
The idea of timing derating is intuitive, but it gets more complex in real applications. Different IPs may need to be scaled differently, simply because some already have timing margins incorporated. Certain IP instances may need tighter or more relaxed margins. Net delays may derate differently than cell delays. Different cell timing parameters may scale differently (e.g. setup/hold checks vs. cell delay). Some timing derates should be applied incrementally, which is better done as additive margins than as multiplicative scale factors.
STA engines support these cases through various set_timing_derate options. Yet there are slight differences in how
they do so, and in how they prioritize or combine derate factors that target the same instance through derates associated with
different design objects (i.e. instances vs. libraries vs. design).
PrimeTime can set timing derates for different design objects and resolves them with a "more specific derate wins" precedence rule. A derate applied to the design (i.e. without specifying a libcell or an instance) applies to all instances. It is overridden by a more specific derate set on a libcell, which in turn is overridden by a derate set on a particular instance. For example:
pt_shell> set_timing_derate -cell_delay -late 1.1
pt_shell> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
pt_shell> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
pt_shell> report_timing_derate
----- Clock ------ ------ Data ------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
----------------------------------------------------------------------------------------------
design: circ
Net delay static -- -- -- -- -- -- -- --
Net delay dynamic -- -- -- -- -- -- -- --
Cell delay -- 1.100 -- 1.100 -- 1.100 -- 1.100
Cell check -- -- -- -- -- -- -- --
cell (leaf): INV2
Cell delay -- 1.300 -- 1.300 -- 1.300 -- 1.300
lib_cell: testlib01/invx05_d
Cell delay -- 1.200 -- 1.200 -- 1.200 -- 1.200
pt_shell> report_timing -delay_type max -derate -nosplit
Point Derate Incr Path
--------------------------------------------------------------------------
clock CLK (rise edge) 0.000 0.000
clock network delay (propagated) 5.700 5.700
FF1/CK (dffprqx05_d) 0.000 5.700 r
FF1/Q (dffprqx05_d) 1.100 3.300 9.000 f
BUF1/A (bufx10_d) 1.000 0.000 9.000 f
BUF1/Q (bufx10_d) 1.100 1.100 10.100 f <--- all cells are derated by at least the global 1.1 factor
INV2/A (invx05_d) 1.000 0.000 10.100 f
INV2/Q (invx05_d) 1.300 1.300 11.400 r <--- INV2 cell is specifically derated by 1.3 factor
INV3/A (invx05_d) 1.000 0.000 11.400 r
INV3/Q (invx05_d) 1.200 1.200 12.600 f <--- all other invx05_d cells are derated by 1.2 factor
BUF4/A (bufx10_d) 1.000 0.000 12.600 f
BUF4/Q (bufx10_d) 1.100 1.100 13.700 f
FF2/D (dffprqx05_d) 1.000 0.000 13.700 f
data arrival time 13.700
clock CLK (rise edge) 10.000 10.000
clock network delay (propagated) 5.000 15.000
clock reconvergence pessimism 0.200 15.200
clock uncertainty -0.500 14.700
FF2/CK (dffprqx05_d) 14.700 r
library setup time 1.000 -0.400 14.300
data required time 14.300
--------------------------------------------------------------------------
data required time 14.300
data arrival time -13.700
--------------------------------------------------------------------------
slack (MET) 0.600
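The precedence rule can be written down as a simple lookup. The Python sketch below is an illustration of the rule as described in this text, not PrimeTime code. The dictionaries and the function name are hypothetical, and the un-derated cell delays are back-computed from the report (Incr divided by Derate).

```python
# Sketch of "more specific derate wins": instance > library cell > design.
DESIGN_DERATE = 1.1                      # set_timing_derate -cell_delay -late 1.1
LIBCELL_DERATES = {"invx05_d": 1.2}      # ... 1.2 [get_lib_cells */invx05_d]
INSTANCE_DERATES = {"INV2": 1.3}         # ... 1.3 [get_cells INV2]

# (instance, library cell, un-derated late delay) along the FF1 -> FF2 data path.
DATA_PATH = [
    ("FF1", "dffprqx05_d", 3.0),
    ("BUF1", "bufx10_d", 1.0),
    ("INV2", "invx05_d", 1.0),
    ("INV3", "invx05_d", 1.0),
    ("BUF4", "bufx10_d", 1.0),
]


def effective_derate(instance, libcell):
    """Return the derate of the most specific object that has one."""
    if instance in INSTANCE_DERATES:
        return INSTANCE_DERATES[instance]
    if libcell in LIBCELL_DERATES:
        return LIBCELL_DERATES[libcell]
    return DESIGN_DERATE


derates = {inst: effective_derate(inst, lib) for inst, lib, _ in DATA_PATH}
data_delay = round(sum(delay * derates[inst] for inst, _, delay in DATA_PATH), 3)
print(derates)
print(data_delay)  # data path delay, i.e. 13.700 - 5.700 in the report
```

The resulting data path delay of 8.0 agrees with the difference between the data arrival time and the launch clock arrival in the report above.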
In addition to the scale factors, we can define incremental adjustments through set_timing_derate -increment.
Notice that the incremental derates are not reported by a plain report_timing_derate but only by
report_timing_derate -increment:
pt_shell> set_timing_derate -cell_delay -late 0.01 -increment
pt_shell> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */invx05_d]
pt_shell> set_timing_derate -cell_delay -late 0.03 -increment [get_cells INV3]
pt_shell> report_timing_derate
----- Clock ------ ------ Data ------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
----------------------------------------------------------------------------------------------
design: circ
Net delay static -- -- -- -- -- -- -- --
Net delay dynamic -- -- -- -- -- -- -- --
Cell delay -- 1.100 -- 1.100 -- 1.100 -- 1.100
Cell check -- -- -- -- -- -- -- --
cell (leaf): INV2
Cell delay -- 1.300 -- 1.300 -- 1.300 -- 1.300
lib_cell: testlib01/invx05_d
Cell delay -- 1.200 -- 1.200 -- 1.200 -- 1.200
pt_shell> report_timing_derate -increment
----- Clock ------ ------ Data ------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
----------------------------------------------------------------------------------------------
design: circ
Net delay static -- -- -- -- -- -- -- --
Net delay dynamic -- -- -- -- -- -- -- --
Cell delay -- 0.010 -- 0.010 -- 0.010 -- 0.010
Cell check -- -- -- -- -- -- -- --
cell (leaf): INV3
Cell delay -- 0.030 -- 0.030 -- 0.030 -- 0.030
lib_cell: testlib01/invx05_d
Cell delay -- 0.020 -- 0.020 -- 0.020 -- 0.020
pt_shell> report_timing -delay_type max -derate -nosplit
Point Derate Incr Path
--------------------------------------------------------------------------
clock CLK (rise edge) 0.000 0.000
clock network delay (propagated) 5.770 5.770
FF1/CK (dffprqx05_d) 0.000 5.770 r
FF1/Q (dffprqx05_d) 1.110 3.330 9.100 f <--- all cells have scale factor increased by at least 0.01
BUF1/A (bufx10_d) 1.000 0.000 9.100 f
BUF1/Q (bufx10_d) 1.110 1.110 10.210 f
INV2/A (invx05_d) 1.000 0.000 10.210 f
INV2/Q (invx05_d) 1.320 1.320 11.530 r <--- invx05_d cells have scale factor increased by at least 0.02 (note that INV2 has a unique scale factor of 1.3)
INV3/A (invx05_d) 1.000 0.000 11.530 r
INV3/Q (invx05_d) 1.230 1.230 12.760 f <--- INV3 has its scale factor specifically increased by 0.03 (note that INV3 has the scale factor of 1.2, the default factor of invx05_d cells)
BUF4/A (bufx10_d) 1.000 0.000 12.760 f
BUF4/Q (bufx10_d) 1.110 1.110 13.870 f
FF2/D (dffprqx05_d) 1.000 0.000 13.870 f
data arrival time 13.870
clock CLK (rise edge) 10.000 10.000
clock network delay (propagated) 5.000 15.000
clock reconvergence pessimism 0.220 15.220
clock uncertainty -0.500 14.720
FF2/CK (dffprqx05_d) 14.720 r
library setup time 1.000 -0.400 14.320
data required time 14.320
--------------------------------------------------------------------------
data required time 14.320
data arrival time -13.870
--------------------------------------------------------------------------
slack (MET) 0.450
Note that using set_timing_derate -increment again is not incremental to the already existing
increment. It actually overrides the previous increment. The same holds for the scale factor, too;
a new set_timing_derate overrides the last one. See the examples below.
This implies that PrimeTime maintains the timing derate as two separate components, a scale
factor and an incremental margin. The total derate is then total_derate = scale_factor + margin.
The scale factor is set directly through set_timing_derate and the margin through
set_timing_derate -increment. Hence if the total derate is composed of multiple components,
users need to come up with sub-totals for both the scale factor and the margin (or can come up
with the total itself and use only either the scale factor or the margin).
pt_shell> set_timing_derate -cell_delay -late -0.01 -increment
pt_shell> set_timing_derate -cell_delay -late -0.02 -increment [get_lib_cells */invx05_d]
pt_shell> set_timing_derate -cell_delay -late -0.03 -increment [get_cells INV3]
pt_shell> report_timing_derate -increment
----- Clock ------ ------ Data ------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
----------------------------------------------------------------------------------------------
design: circ
Net delay static -- -- -- -- -- -- -- --
Net delay dynamic -- -- -- -- -- -- -- --
Cell delay -- -0.010 -- -0.010 -- -0.010 -- -0.010
Cell check -- -- -- -- -- -- -- --
cell (leaf): INV3
Cell delay -- -0.030 -- -0.030 -- -0.030 -- -0.030
lib_cell: testlib01/invx05_d
Cell delay -- -0.020 -- -0.020 -- -0.020 -- -0.020
In the scale factor overrides below, notice that we have not overridden the INV2 scale factor, which
therefore remains at its last value of 1.3 despite the new global and libcell derates. This is in line
with the expected precedence, where the more specific derate applies:
pt_shell> set_timing_derate -cell_delay -late 0.9
pt_shell> set_timing_derate -cell_delay -late 0.8 [get_lib_cells */invx05_d]
pt_shell> set_timing_derate -cell_delay -late 0.7 [get_cells INV3]
pt_shell> report_timing_derate
----- Clock ------ ------ Data ------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
----------------------------------------------------------------------------------------------
design: circ
Net delay static -- -- -- -- -- -- -- --
Net delay dynamic -- -- -- -- -- -- -- --
Cell delay -- 0.900 -- 0.900 -- 0.900 -- 0.900
Cell check -- -- -- -- -- -- -- --
cell (leaf): INV2
Cell delay -- 1.300 -- 1.300 -- 1.300 -- 1.300
cell (leaf): INV3
Cell delay -- 0.700 -- 0.700 -- 0.700 -- 0.700
lib_cell: testlib01/invx05_d
Cell delay -- 0.800 -- 0.800 -- 0.800 -- 0.800
The timing after the derate updates then looks as follows:
pt_shell> report_timing -delay_type max -derate -nosplit
Point Derate Incr Path
--------------------------------------------------------------------------
clock CLK (rise edge) 0.000 0.000
clock network delay (propagated) 4.230 4.230
FF1/CK (dffprqx05_d) 0.000 4.230 r
FF1/Q (dffprqx05_d) 0.890 2.670 6.900 f
BUF1/A (bufx10_d) 1.000 0.000 6.900 f
BUF1/Q (bufx10_d) 0.890 0.890 7.790 f
INV2/A (invx05_d) 1.000 0.000 7.790 f
INV2/Q (invx05_d) 1.280 1.280 9.070 r
INV3/A (invx05_d) 1.000 0.000 9.070 r
INV3/Q (invx05_d) 0.670 0.670 9.740 f
BUF4/A (bufx10_d) 1.000 0.000 9.740 f
BUF4/Q (bufx10_d) 0.890 0.890 10.630 f
FF2/D (dffprqx05_d) 1.000 0.000 10.630 f
data arrival time 10.630
clock CLK (rise edge) 10.000 10.000
clock network delay (propagated) 5.000 15.000
clock reconvergence pessimism 0.000 15.000
clock uncertainty -0.500 14.500
FF2/CK (dffprqx05_d) 14.500 r
library setup time 1.000 -0.400 14.100
data required time 14.100
--------------------------------------------------------------------------
data required time 14.100
data arrival time -10.630
--------------------------------------------------------------------------
slack (MET) 3.470
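The two-component behaviour described above (total_derate = scale_factor + margin, each resolved by its own precedence, each setting overriding the previous one on the same object) can be mimicked in Python. This is a sketch of the model inferred in this text, not of PrimeTime internals. The DerateTable class and its method names are hypothetical.

```python
# Sketch: scale factor and margin are kept in two independent tables. Each is
# resolved by "more specific wins", and a new value on the same object
# overwrites the old one instead of accumulating.
class DerateTable:
    def __init__(self):
        self.design = None
        self.libcell = {}
        self.instance = {}

    def assign(self, value, libcell=None, instance=None):
        if instance is not None:
            self.instance[instance] = value
        elif libcell is not None:
            self.libcell[libcell] = value
        else:
            self.design = value

    def lookup(self, instance, libcell, default):
        if instance in self.instance:
            return self.instance[instance]
        if libcell in self.libcell:
            return self.libcell[libcell]
        return default if self.design is None else self.design


scale, margin = DerateTable(), DerateTable()


def total_derate(instance, libcell):
    return round(scale.lookup(instance, libcell, 1.0)
                 + margin.lookup(instance, libcell, 0.0), 3)


CELLS = [("BUF1", "bufx10_d"), ("INV2", "invx05_d"), ("INV3", "invx05_d")]

# First round: scale factors 1.1/1.2/1.3 and increments 0.01/0.02/0.03.
scale.assign(1.1)
scale.assign(1.2, libcell="invx05_d")
scale.assign(1.3, instance="INV2")
margin.assign(0.01)
margin.assign(0.02, libcell="invx05_d")
margin.assign(0.03, instance="INV3")
first = [total_derate(inst, lib) for inst, lib in CELLS]

# Second round: new values replace the old ones; INV2 keeps its 1.3 scale.
margin.assign(-0.01)
margin.assign(-0.02, libcell="invx05_d")
margin.assign(-0.03, instance="INV3")
scale.assign(0.9)
scale.assign(0.8, libcell="invx05_d")
scale.assign(0.7, instance="INV3")
second = [total_derate(inst, lib) for inst, lib in CELLS]

print(first)
print(second)
```

For BUF1, INV2 and INV3 the model yields 1.11, 1.32, 1.23 after the first round and 0.89, 1.28, 0.67 after the second, which are the Derate values shown in the corresponding timing reports.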
Tempus takes a slightly different approach. It maintains a single derate scale factor set by set_timing_derate.
Users can incrementally update this scale factor by a multiplicative factor (set_timing_derate -multiply) or
additive increment (set_timing_derate -add). There can be as many incremental updates as one likes:
@tempus 3> set_timing_derate -cell_delay -late 1.1
@tempus 4> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell Delay -- 1.100 -- 1.100 -- 1.100 -- 1.100
Net Delay Static -- -- -- -- -- -- -- --
...
@tempus 5> set_timing_derate -cell_delay -late 1.1 -multiply
@tempus 6> set_timing_derate -cell_delay -late 1.1 -multiply
@tempus 7> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Cell Delay -- 1.331 -- 1.331 -- 1.331 -- 1.331
Net Delay Static -- -- -- -- -- -- -- --
...
@tempus 8> set_timing_derate -cell_delay -late 0.01 -add
@tempus 9> set_timing_derate -cell_delay -late 0.01 -add
@tempus 10> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Cell Delay -- 1.351 -- 1.351 -- 1.351 -- 1.351
Net Delay Static -- -- -- -- -- -- -- --
...
@tempus 11> set_timing_derate -cell_delay -late 1.1 -multiply
@tempus 12> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Cell Delay -- 1.486 -- 1.486 -- 1.486 -- 1.486
Net Delay Static -- -- -- -- -- -- -- --
...
@tempus 13> set_timing_derate -cell_delay -late 1.1
@tempus 14> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Cell Delay -- 1.100 -- 1.100 -- 1.100 -- 1.100
Net Delay Static -- -- -- -- -- -- -- --
...
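The running value in the session above is easy to re-trace by hand. A minimal Python sketch, assuming nothing more than one number that is replaced, multiplied or incremented (the operation names are just labels for the three command forms):

```python
# Re-trace of the Tempus session above on a single running derate value.
derate = 1.1  # set_timing_derate -cell_delay -late 1.1
history = [round(derate, 3)]

operations = [
    ("multiply", 1.1),  # -multiply
    ("multiply", 1.1),  # -multiply
    ("add", 0.01),      # -add
    ("add", 0.01),      # -add
    ("multiply", 1.1),  # -multiply
    ("set", 1.1),       # plain set_timing_derate replaces the running value
]

for op, value in operations:
    if op == "multiply":
        derate *= value
    elif op == "add":
        derate += value
    else:
        derate = value
    history.append(round(derate, 3))

print(history)
```

The reported values 1.331, 1.351, 1.486 and the final 1.1 all appear in the printed history, in that order.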
Tempus honors a derate precedence similar to PrimeTime's, in the sense that the more specific derate for a particular design object applies. However, there are certain quirks that are unexpected from the user's perspective and make the overall derate specification more intricate.
The intuitive derate precedence, with decreasing priority, is as follows:
- Specific instance/cell object.
- Library cell object.
- Design object.
@tempus 15> set_timing_derate -cell_delay -late 1.1
@tempus 16> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
@tempus 17> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
@tempus 18> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell Delay -- 1.100 -- 1.100 -- 1.100 -- 1.100
Net Delay Static -- -- -- -- -- -- -- --
...
Cell (leaf): INV2
Cell Delay -- 1.300 -- 1.300 -- 1.300 -- 1.300
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.200 -- 1.200 -- 1.200 -- 1.200
Input_switching -- -- -- -- -- -- -- --
@tempus 19> report_timing -late
Capture Launch
Clock Edge:+ 10.000 0.000
Src Latency:+ 0.000 0.000
Net Latency:+ 5.000 (P) 5.700 (P)
Arrival:= 15.000 5.700
Setup:- 0.400
Uncertainty:- 0.500
Cppr Adjust:+ 0.200
Required Time:= 14.300
Launch Clock:= 5.700
Data Path:+ 8.000
Slack:= 0.600
#-------------------------------------------------------------------
# Pin Cell Load Trans Incr Total Delay Arrival
# (pf) (ns) Delay Derate (ns) (ns)
#-------------------------------------------------------------------
FF1/CK (arrival) 0.005 0.201 - - - 5.700
FF1/Q dffprqx05_d 0.003 0.201 0.000 1.100 3.300 9.000
BUF1/Q bufx10_d 0.005 0.301 0.000 1.100 1.100 10.100
INV2/Q invx05_d 0.005 0.301 0.000 1.300 1.300 11.400
INV3/Q invx05_d 0.003 0.201 0.000 1.200 1.200 12.600
BUF4/Q bufx10_d 0.005 0.301 0.000 1.100 1.100 13.700
FF2/D dffprqx05_d 0.005 0.301 0.000 1.000 0.000 13.700
#-------------------------------------------------------------------
The quirky behavior is tied to the incremental derate specification, whether through -multiply or -add.
The behavior is similar for both, so only the -add option is shown in the example that follows. The effect
of the incremental derate specs is as follows:
- An incremental derate on cell/instance objects affects only the targeted objects.
- An incremental derate on library cell objects affects the existing derates of those library cells and of any specific cells/instances that use those library cells.
- An incremental derate on the design object affects all existing derates.
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 1.1
@tempus> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell Delay -- 1.100 -- 1.100 -- 1.100 -- 1.100
Cell (leaf): INV2
Cell Delay -- 1.300 -- 1.300 -- 1.300 -- 1.300
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.200 -- 1.200 -- 1.200 -- 1.200
########
######## Notice the following affects *all* objects.
########
@tempus> set_timing_derate -cell_delay -late 0.01 -add
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Cell Delay -- 1.110 -- 1.110 -- 1.110 -- 1.110
Cell (leaf): INV2
Cell Delay -- 1.310 -- 1.310 -- 1.310 -- 1.310
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.210 -- 1.210 -- 1.210 -- 1.210
########
######## Notice the following affects library and instance objects.
########
@tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Cell Delay -- 1.110 -- 1.110 -- 1.110 -- 1.110
Cell (leaf): INV2
Cell Delay -- 1.330 -- 1.330 -- 1.330 -- 1.330
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.230 -- 1.230 -- 1.230 -- 1.230
########
######## Finally the following affects only target instance objects.
########
@tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Cell Delay -- 1.110 -- 1.110 -- 1.110 -- 1.110
Cell (leaf): INV2
Cell Delay -- 1.330 -- 1.330 -- 1.330 -- 1.330 <--- total additive increment of 0.01+0.02
Cell (leaf): INV3
Cell Delay -- 1.260 -- 1.260 -- 1.260 -- 1.260 <--- total additive increment of 0.01+0.02+0.03
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.230 -- 1.230 -- 1.230 -- 1.230 <--- total additive increment of 0.01+0.02
The unfortunate result is that the order of these incremental derate specifications matters;
not just relative to each other (i.e. -multiply vs. -add, which is expected) but
relative to the global ones, too (which may be unexpected). For example:
########
######## Reference case.
########
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 1.1
@tempus> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
@tempus>
@tempus> set_timing_derate -cell_delay -late 0.01 -add
@tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
@tempus>
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell Delay -- 1.110 -- 1.110 -- 1.110 -- 1.110
Cell (leaf): INV2
Cell Delay -- 1.330 -- 1.330 -- 1.330 -- 1.330
Cell (leaf): INV3
Cell Delay -- 1.260 -- 1.260 -- 1.260 -- 1.260
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.230 -- 1.230 -- 1.230 -- 1.230
########
######## Notice the reverse order of incremental derates (which makes no difference)
########
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 1.1
@tempus> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
@tempus>
@tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
@tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.01 -add
@tempus>
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell Delay -- 1.110 -- 1.110 -- 1.110 -- 1.110
Cell (leaf): INV2
Cell Delay -- 1.330 -- 1.330 -- 1.330 -- 1.330
Cell (leaf): INV3
Cell Delay -- 1.260 -- 1.260 -- 1.260 -- 1.260
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.230 -- 1.230 -- 1.230 -- 1.230
########
######## Notice a different order of global and incremental derates (and hence different ending derates).
########
@tempus> reset_timing_derate
@tempus>
@tempus> set_timing_derate -cell_delay -late 1.1
@tempus> set_timing_derate -cell_delay -late 0.01 -add
@tempus>
@tempus> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
@tempus>
@tempus> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
@tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
@tempus>
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell Delay -- 1.110 -- 1.110 -- 1.110 -- 1.110
Cell (leaf): INV2
Cell Delay -- 1.300 -- 1.300 -- 1.300 -- 1.300
Cell (leaf): INV3
Cell Delay -- 1.250 -- 1.250 -- 1.250 -- 1.250
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.220 -- 1.220 -- 1.220 -- 1.220
There is, however, more to that quirkiness to keep in mind: the incremental derates
affect only derates *existing at that moment*. This then yields different results between resetting
derates through reset_timing_derate (which makes all derates undefined) and setting derates to 1.0
(which makes derates defined). Hence for example:
########
######## Reference.
########
@tempus> reset_timing_derate; # removes all existing derates (which is different than setting them to 1.0)
@tempus>
@tempus> set_timing_derate -cell_delay -late 0.01 -add
@tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
@tempus>
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell Delay -- 1.010 -- 1.010 -- 1.010 -- 1.010
Cell (leaf): INV3
Cell Delay -- 1.050 -- 1.050 -- 1.050 -- 1.050
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.020 -- 1.020 -- 1.020 -- 1.020
########
######## Notice the reverse order of incremental derates (which makes a difference now)
########
@tempus> reset_timing_derate; # removes all existing derates (which is different than setting them to 1.0)
@tempus>
@tempus> set_timing_derate -cell_delay -late 0.03 -add [get_cells INV3]
@tempus> set_timing_derate -cell_delay -late 0.02 -add [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.01 -add
@tempus>
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell Delay -- 1.010 -- 1.010 -- 1.010 -- 1.010
Cell (leaf): INV3
Cell Delay -- 1.060 -- 1.060 -- 1.060 -- 1.060
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.030 -- 1.030 -- 1.030 -- 1.030
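The reports above are consistent with a simple bookkeeping model, sketched here in Python. This is a toy reconstruction inferred only from the reported numbers, not a description of Tempus internals. The class AddDerateModel and its method names are hypothetical, only late cell-delay derates are modelled, and two starting-value rules are assumptions: an -add on an undefined library cell starts from 1.0, while an -add on an undefined cell starts from its library cell derate (or from 1.0 if that is undefined too).

```python
class AddDerateModel:
    """Toy model of accumulating `-add` derates (late cell delay only)."""

    def __init__(self, lib_of):
        self.lib_of = dict(lib_of)  # instance name -> library cell name
        self.reset()

    def reset(self):
        """Like reset_timing_derate: every derate becomes undefined."""
        self.design = None
        self.lib = {}
        self.cell = {}

    def set(self, value, lib=None, cell=None):
        """Instant derate: replaces the scale factor of one object."""
        if cell is not None:
            self.cell[cell] = value
        elif lib is not None:
            self.lib[lib] = value
        else:
            self.design = value

    def add(self, margin, lib=None, cell=None):
        """Incremental derate: bumps the target and every already-defined
        derate of a more specific object covered by the target."""
        if cell is not None:
            # Assumption: an undefined cell starts from its library cell derate.
            base = self.cell.get(cell, self.lib.get(self.lib_of[cell], 1.0))
            self.cell[cell] = base + margin
        elif lib is not None:
            for name in self.cell:
                if self.lib_of[name] == lib:
                    self.cell[name] += margin
            # Assumption: an undefined library cell starts from 1.0.
            self.lib[lib] = self.lib.get(lib, 1.0) + margin
        else:
            for name in self.cell:
                self.cell[name] += margin
            for name in self.lib:
                self.lib[name] += margin
            self.design = (1.0 if self.design is None else self.design) + margin

    def report(self):
        """Return all defined derates, rounded like report_timing_derate."""
        out = {}
        if self.design is not None:
            out["design"] = round(self.design, 3)
        out.update({f"lib:{k}": round(v, 3) for k, v in self.lib.items()})
        out.update({f"cell:{k}": round(v, 3) for k, v in self.cell.items()})
        return out


model = AddDerateModel({"INV2": "invx05_d", "INV3": "invx05_d"})
# Most specific first, as in the last transcript above.
model.add(0.03, cell="INV3")
model.add(0.02, lib="invx05_d")
model.add(0.01)
print(model.report())
```

Replaying the other command sequences through the same model is a quick way to see why the order matters only when some of the derates are still undefined at the time an -add is issued.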
In recent versions, Tempus added the -increment option that supposedly brings it closer to how e.g. PrimeTime
keeps derate settings. This is what the help message says:
-increment # incrementally add the derate value to total derate value in all (OCV/AOCV/SOCV) mode
So the expectation is that -increment is like -add but does not accumulate. The following shows what
it looks like in the reference case. The results are almost like in PrimeTime, except that the -increment
option is not supported at the design/global level. Yet it is obvious that Tempus now keeps the deratings
as a separate scaling factor and an additive margin, and that a new set_timing_derate -increment
replaces the existing one (rather than accumulating like in the -add case).
########
######## Reference.
########
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 1.1
@tempus> set_timing_derate -cell_delay -late 1.2 [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 1.3 [get_cells INV2]
@tempus> set_timing_derate -cell_delay -late 0.01 -increment
**ERROR: (TCLCMD-1022): -incremental_adjust/-increment options must be specified to instance or library cell objects.
@tempus> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.03 -increment [get_cells INV3]
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell Delay -- 1.100 -- 1.100 -- 1.100 -- 1.100
Cell (leaf): INV2
Cell Delay -- 1.300 -- 1.300 -- 1.300 -- 1.300
Cell (leaf): INV3
Cell Delay -- 1.200 -- 1.200 -- 1.200 -- 1.200
Incremental_adjust 0.000 0.030 0.000 0.030 0.000 0.030 0.000 0.030
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.200 -- 1.200 -- 1.200 -- 1.200
Incremental_adjust 0.000 0.020 0.000 0.020 0.000 0.020 0.000 0.020
########
######## Another `-increment` replaces the previous one.
########
@tempus> set_timing_derate -cell_delay -late 0.025 -increment [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.035 -increment [get_cells INV3]
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell Delay -- 1.100 -- 1.100 -- 1.100 -- 1.100
Cell (leaf): INV2
Cell Delay -- 1.300 -- 1.300 -- 1.300 -- 1.300
Cell (leaf): INV3
Cell Delay -- 1.200 -- 1.200 -- 1.200 -- 1.200
Incremental_adjust 0.000 0.035 0.000 0.035 0.000 0.035 0.000 0.035
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.200 -- 1.200 -- 1.200 -- 1.200
Incremental_adjust 0.000 0.025 0.000 0.025 0.000 0.025 0.000 0.025
########
######## Also works when scaling factors are undefined.
########
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.03 -increment [get_cells INV3]
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell (leaf): INV3
Cell Delay -- -- -- -- -- -- -- --
Incremental_adjust 0.000 0.030 0.000 0.030 0.000 0.030 0.000 0.030
LibraryCell: testlib01/invx05_d
Cell_delay -- -- -- -- -- -- -- --
Incremental_adjust 0.000 0.020 0.000 0.020 0.000 0.020 0.000 0.020
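The replace-rather-than-accumulate behaviour seen in these reports can be sketched as a per-object record with two independent fields. This is only an illustration of the bookkeeping the reports suggest. The class ObjectDerate is hypothetical, and treating an undefined scale factor as 1.0 when computing the total is an assumption based on the default scaling factor of 1.0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectDerate:
    """Derate record of one design object (late cell delay only)."""
    scale: Optional[float] = None  # "Cell Delay" column; None is shown as "--"
    incr: float = 0.0              # "Incremental_adjust" column

    def set_scale(self, value: float) -> None:
        self.scale = value         # instant setting, replaces the old scale

    def set_increment(self, margin: float) -> None:
        self.incr = margin         # replaces the old margin, no accumulation

    def total(self) -> float:
        # Assumption: an undefined scale factor counts as the default 1.0.
        base = 1.0 if self.scale is None else self.scale
        return round(base + self.incr, 3)


lib = ObjectDerate()
lib.set_scale(1.2)
lib.set_increment(0.02)
lib.set_increment(0.025)   # the second -increment replaces the first one
print(lib.scale, lib.incr, lib.total())
```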
There seem to be some unexpected side effects with -increment (that look more like a bug than a feature).
One problem is the interaction between the separate -increment margin and the Cadence legacy incremental derates (such as -add in the example below). The problem is that a different order of derate application yields a different total derate. This is exposed when -add <libcell> follows the -increment derates. Changing the order of -add vs. -increment, or squeezing a different derate in between, yields the expected results.
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 0.022 -increment [get_cells INV3]
@tempus> set_timing_derate -cell_delay -late 0.033 -increment [get_lib_cells */invx05_d]
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell (leaf): INV3
Cell Delay -- -- -- -- -- -- -- --
Incremental_adjust 0.000 0.022 0.000 0.022 0.000 0.022 0.000 0.022
LibraryCell: testlib01/invx05_d
Cell_delay -- -- -- -- -- -- -- --
Incremental_adjust 0.000 0.033 0.000 0.033 0.000 0.033 0.000 0.033
#######################
######## `-add <libcell>` yields double-counting the new derate into a specific cell derate
#######################
@tempus> set_timing_derate -cell_delay -late 0.015 -add [get_lib_cells */invx05_d]
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Cell (leaf): INV3
Cell Delay -- 1.030 -- 1.030 -- 1.030 -- 1.030
Incremental_adjust 0.000 0.022 0.000 0.022 0.000 0.022 0.000 0.022
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.015 -- 1.015 -- 1.015 -- 1.015
Incremental_adjust 0.000 0.033 0.000 0.033 0.000 0.033 0.000 0.033
######## `-add <cell>` updates only the specific cell as expected
@tempus> set_timing_derate -cell_delay -late 0.01 -add [get_cells INV3]
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Cell (leaf): INV3
Cell Delay -- 1.040 -- 1.040 -- 1.040 -- 1.040
Incremental_adjust 0.000 0.022 0.000 0.022 0.000 0.022 0.000 0.022
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.015 -- 1.015 -- 1.015 -- 1.015
Incremental_adjust 0.000 0.033 0.000 0.033 0.000 0.033 0.000 0.033
The other inconvenience is that the -increment part of the total derate is not reported in the user_derate field of report_timing (nor in a separate field, such as incr_derate).
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 0.01 -add [get_lib_cells */bufx10_d]
@tempus> set_timing_derate -cell_delay -late 0.005 -add [get_cells BUF1]
@tempus> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */bufx10_d]
@tempus> set_timing_derate -cell_delay -late 0.03 -increment [get_cells BUF1]
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
Cell (leaf): BUF1
Cell Delay -- 1.015 -- 1.015 -- 1.015 -- 1.015
Incremental_adjust 0.000 0.030 0.000 0.030 0.000 0.030 0.000 0.030
LibraryCell: testlib01/bufx10_d
Cell_delay -- 1.010 -- 1.010 -- 1.010 -- 1.010
Incremental_adjust 0.000 0.020 0.000 0.020 0.000 0.020 0.000 0.020
@tempus> report_timing
#---------------------------------------------------------------------------
# Pin Cell Load Trans Incr Delay Arrival User Total
# (pf) (ns) Delay (ns) (ns) Derate Derate
#---------------------------------------------------------------------------
FF1/CK (arrival) 0.005 0.201 - - 5.090 - -
FF1/Q dffprqx05_d 0.003 0.201 0.000 3.000 8.090 1.000 1.000
BUF1/Q bufx10_d 0.005 0.301 0.000 1.045 9.135 1.015 1.045 <-- 0.015 from `-add <libcell> + <cell>` + 0.030 from `-increment <cell>`
INV2/Q invx05_d 0.005 0.301 0.000 1.000 10.135 1.000 1.000
INV3/Q invx05_d 0.003 0.201 0.000 1.000 11.135 1.000 1.000
BUF4/Q bufx10_d 0.005 0.301 0.000 1.030 12.165 1.010 1.030 <-- 0.01 from `-add <libcell>` + 0.02 from `-increment <libcell>`
FF2/D dffprqx05_d 0.005 0.301 0.000 0.000 12.165 1.000 1.000
#---------------------------------------------------------------------------
As noted above, flat OCV timing derating can range from a simple, global scale factor to a complex combination of scale factors and incremental margins set on different design objects. When the flat derates are to be used across different tools, users must be very careful about differences in how those tools interpret the derate commands and manage the derate settings internally.
The following list summarizes the key aspects of the major STA tools:
- Synopsys PrimeTime
  - Internally represents the derate settings by two components, a scale factor and an incremental margin. The former is reported by report_timing_derate, the latter by report_timing_derate -increment.
  - Total derate is a scale factor derived by summing the two components together.
  - Generally supports two commands, set_timing_derate <scale_factor> and set_timing_derate -increment <margin>.
  - Both derate commands are instant and do not accumulate. That is, applying the command multiple times makes the latter occurrence replace any former one(s).
  - Derate settings can be applied to different design objects. If a cell instance is subject to multiple derate settings (for different objects such as cell and library cell), the more specific (in terms of the object identification) setting applies.
- Cadence Tempus
  - Internally represents the derate settings as a scale factor. The setting can be done instantly through set_timing_derate <scale_factor>.
  - Settings can be applied to different design objects and, like in PrimeTime, the more specific derate setting applies.
  - Provides additional commands to incrementally alter the scale factor setting, set_timing_derate -multiply <factor> and set_timing_derate -add <margin>.
    - These incremental derates accumulate and hence the order of their application matters.
    - The incremental effect is on every design object that can be affected by the incremental settings and has an existing derate setting already defined. For example, a design-level incremental derate would also alter existing derate settings of a library cell and of a cell/instance; a library cell incremental derate would also affect an existing cell/instance derate (that qualifies for the library cell pattern).
  - Recent tool versions also support set_timing_derate -increment with the semantics of PrimeTime. The only difference is that this command is supported only for library cell and cell/instance objects.
Here is some guidance that would hopefully lead to more consistent results (at least for Synopsys and Cadence tools):
- Mind the differences among tools. Always report the final derate factors and review the report against the intended derate settings.
- Whenever possible, compile the total derate factors for individual design object groups and apply them through the instant set_timing_derate <scale_factor>. This would have a consistent effect in all tools.
- When you need to use incremental margins (e.g. in combination with AOCV, see later), express all derate settings as incremental margins (to the default scale factor of 1.0). This is mainly to avoid exposing differences among tools.
- When using the -add or -multiply options in Cadence tools:
  - Arrange the derate settings so that you do not mix those options. That is, all set_timing_derate commands would use either -add or -multiply.
  - Order the incremental derate commands from the most generic to the most specific. This would help to avoid "double counting" (i.e. a more generic incremental setting affecting a more specific incremental setting).
  - Avoid mixing instant derate commands and incremental derate commands. This will help to avoid "double counting".
- Prefer using -increment to any of -add, -multiply. This is possible in recent Cadence tools. Mind that Cadence tools support only library cell and cell/instance objects with this option.
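One way to follow the second recommendation is to keep the intended scale factors and margins in a small table and generate the instant commands from it. The following Python sketch does that. The helper emit_flat_derates and the layout of its input are hypothetical, and the emitted lines use only the set_timing_derate forms shown earlier in this text.

```python
def emit_flat_derates(groups, delay_opt="-cell_delay", mode="-late"):
    """Turn (selector, scale, margin) triples into instant derate commands.

    An empty selector means the design level. The commands are listed from
    the most generic to the most specific object for readability.
    """
    def rank(selector):
        if not selector:
            return 0  # design level
        if selector.startswith("[get_lib_cells"):
            return 1  # library cell
        return 2      # cell/instance

    commands = []
    for selector, scale, margin in sorted(groups, key=lambda g: rank(g[0])):
        # The total derate is compiled up front: scale factor plus margin.
        parts = ["set_timing_derate", delay_opt, mode, f"{scale + margin:.3f}"]
        if selector:
            parts.append(selector)
        commands.append(" ".join(parts))
    return commands


groups = [
    ("[get_cells INV2]", 1.3, 0.0),
    ("", 1.1, 0.01),
    ("[get_lib_cells */invx05_d]", 1.2, 0.02),
]
print("\n".join(emit_flat_derates(groups)))
```

Because only instant commands are generated, the resulting script should not depend on how a tool accumulates incremental derates.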
One size fits all works neither for humans nor for timing derate. A fixed flat OCV derate turns out either overly pessimistic, or sometimes pessimistic and sometimes optimistic, as the technology node scales down. SPICE statistical simulation shows that the effect of local process variation (per gate) decreases as the length of the timing path increases. That is, the local variation effect for long paths averages out.
Delay averaging comes from the fact that local variations tend to have a normal distribution. The sum of N such independent distributions is again a normal distribution, with the mean scaled N times and the sigma scaled sqrt(N) times.
Calculating a relative 3-sigma spread for an n-cell long path yields the function f(n), a plot of which appears as "AOCV" in the following figure:
f(n) = (3 * sigma_n) / mean_n = ((3 * sigma) / mean) * 1/sqrt(n) = k * 1/sqrt(n)
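The formula can be checked numerically with a few lines of Python. This is a minimal sketch that assumes all stages are identical and independent. The function name relative_spread and the 5% per-stage sigma are made up for illustration.

```python
import math


def relative_spread(sigma, mean, n, n_sigma=3.0):
    """Relative n_sigma spread of a path made of n identical, independent stages."""
    path_mean = n * mean                 # means add up linearly
    path_sigma = math.sqrt(n) * sigma    # sigmas add up in quadrature
    return n_sigma * path_sigma / path_mean


# Hypothetical stage with a 1.0 ns mean delay and a 0.05 ns sigma.
for n in (1, 2, 4, 8, 16):
    print(n, round(relative_spread(sigma=0.05, mean=1.0, n=n), 4))
```

Quadrupling the path length halves the relative spread, which is the 1/sqrt(n) behaviour that the stage-based look-up tables capture.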
Hence the next OCV evolution step is to come up with look-up tables that would identify a derate factor based on the number of cells (a.k.a. stages) in the timing path. This look-up table would be compiled from results of statistical SPICE simulation (a.k.a. Monte Carlo simulation) for cells in a series of increasing length and put into a side file that would be used along with the traditional timing library. This approach is called advanced OCV (AOCV) or stage-based OCV (SBOCV). The side file with scaling factor look-up tables would look as follows:
object_type: lib_cell
delay_type: cell
rf_type: rise
derate_type: late
object_spec: testlib01/bufx10_d
depth: 1 2 3 4 5 6 7 8
distance: 0
// Here the derate is only an example and comes from the `1 + 1/(10*depth)` calculation (`1 - 1/(10*depth)` for early derates).
table: 1.1 1.05 1.033 1.025 1.020 1.017 1.014 1.013
Note
Please note the derate factors for particular stages, as we will be referring to them from timing reports.
STA analysis session with AOCV is not much different; it only needs to load the AOCV side file and turn on the advanced analysis:
# Synopsys PrimeTime
# -------------------------------
# usual session commands
set link_path ...; # Liberty timing library
read_verilog ...; # netlist
link
# enable AOCV
set_app_var timing_aocvm_enable_analysis true;
read_aocvm ...; # AOCV side file
# usual timing analysis
report_timing ...

# Cadence Tempus (stylus/CUI)
# -------------------------------
# enable AOCV
set_db timing_analysis_aocv true;
read_libs -aocv ...; # AOCV side file
# usual session commands
read_libs ...; # Liberty timing library
read_netlist ...; # netlist
init_design
# usual timing analysis
report_timing ...
We can test the AOCV flow with the example data. Notice in the prepared AOCV library that not all cells have AOCV tables defined. This is intentional, to see differences between cells with and without AOCV models when combining AOCV and OCV.
To see that the AOCV side file was used, use report_aocvm, which prints a summary of annotated
AOCV tables. To see more details, use the -list_annotated option. The command is then also used for reporting
derate details for a particular timing path.
pt_shell> report_aocvm
AOCV Table Set : *Default*
******************************************************
| | Fully | Partially | Not |
| Total | annotated | annotated | annotated |
------------+-----------+-----------+-----------+-----------+
Leaf cells | 14 | 8 | 0 | 6 |
Nets | 16 | 0 | 0 | 16 |
------------+-----------+-----------+-----------+-----------+
| 30 | 8 | 0 | 22 |
pt_shell> report_timing -from FF1 -to FF2 -derate -nosplit -path_type full_clock_expanded
Last common pin: CKBUF2/Q
Path Group: CLK
Path Type: max
Point Derate Incr Path
-----------------------------------------------------------
clock CLK (rise edge) 0.000 0.000
clock source latency 0.000 0.000
clk (in) 0.000 0.000 r
CKBUF1/A (bufx10_d) 1.000 0.000 0.000 r
CKBUF1/Q (bufx10_d) 1.017 1.017 1.017 r <--- 1.017 late scaling factor due to 6 cells in the launch clock path
... (CKBUF1-2, CKBUF3a1-3, FF1)
FF1/CK (dffprqx05_d) 1.000 0.000 5.047 r
FF1/Q (dffprqx05_d) <- 1.025 3.075 8.122 f <--- 1.025 as AOCV table for dffprqx05_d ending at 4 stages
...
INV3/A (invx05_d) 1.000 0.000 10.135 r <--- no derate due to no AOCV table for invx05_d
INV3/Q (invx05_d) 1.000 1.000 11.135 f
BUF4/A (bufx10_d) 1.000 0.000 11.135 f
BUF4/Q (bufx10_d) 1.013 1.013 12.148 f <--- 1.013 derate due to 8 cells in data path since the common clock point
FF2/D (dffprqx05_d) 1.000 0.000 12.148 f (CKBUF3a1-3, FF1, BUF1, INV1-2, BUF2)
data arrival time 12.148
clock CLK (rise edge) 10.000 10.000
clock source latency 0.000 10.000
clk (in) 0.000 10.000 r
CKBUF1/A (bufx10_d) 1.000 0.000 10.000 r
CKBUF1/Q (bufx10_d) 0.980 0.980 10.980 r <--- 0.980 early factor as there are 5 cells in the capture clock path
... (CKBUF1-2, CKBUF3b1-3)
CKBUF3b1/A (bufx10_d) 1.000 0.000 11.960 r
CKBUF3b1/Q (bufx10_d) 0.967 0.967 12.927 r <--- 0.967 as there are 3 cells in capture clock from the common clock point
CKINV3b2/A (invx05_d) 1.000 0.000 12.927 r
CKINV3b2/Q (invx05_d) 1.000 1.000 13.927 f
CKINV3b3/A (invx05_d) 1.000 0.000 13.927 f
CKINV3b3/Q (invx05_d) 1.000 1.000 14.927 r
FF2/CK (dffprqx05_d) 1.000 0.000 14.927 r
clock reconvergence pessimism 0.074 15.001
clock uncertainty -0.500 14.501
library setup time 1.000 -0.400 14.101
data required time 14.101
-----------------------------------------------------------
data required time 14.101
data arrival time -12.148
-----------------------------------------------------------
slack (MET) 1.953
Derate Summary Report
-------------------------------------------------------------------
total derate : required time 0.073
total derate : arrival time -0.148
-------------------------------------------------------------------
total derate : slack 0.221
slack (with derating applied) (MET) 1.953
clock reconvergence pessimism (due to derating) -0.074
-------------------------------------------------------------------
slack (with no derating) (MET) 2.100
###################
######## actuals for CKBUF2 stage count
###################
pt_shell> report_aocvm [get_timing_arc -from CKBUF1/A -to CKBUF1/Q]
From pin: CKBUF1/A
To pin: CKBUF1/Q
Arc type: cell (clock network)
AOCVM arc metrics Launch Capture
--------------------------------------------------------
Distance -- --
Depth 6.00 5.00
AOCVM arc derates Launch Capture
--------------------------------------------------------
Early rise 0.9830 0.9800
Early fall 0.9830 0.9800
Late rise 1.0170 1.0200
Late fall 1.0170 1.0200
From pin: CKBUF1/A
To pin: CKBUF1/Q
Arc type: cell (data network)
AOCVM arc metrics Launch
--------------------------------------
Distance --
Depth 5.00
AOCVM arc derates Launch
--------------------------------------
Early rise 0.9800
Early fall 0.9800
Late rise 1.0200
Late fall 1.0200
###################
######## actuals for FF1 stage count
###################
pt_shell> report_aocvm [get_timing_arc -from FF1/CK -to FF1/Q]
From pin: FF1/CK
To pin: FF1/Q
Arc type: cell (data network)
AOCVM arc metrics Launch
--------------------------------------
Distance --
Depth 8.00
AOCVM arc derates Launch
--------------------------------------
Early rise 0.9750
Early fall 0.9750
Late rise 1.0250
Late fall 1.0250
###################
######## actuals for BUF1 derates
###################
pt_shell> report_aocvm [get_timing_arc -from BUF1/A -to BUF1/Q]
From pin: BUF1/A
To pin: BUF1/Q
Arc type: cell (data network)
AOCVM arc metrics Launch
--------------------------------------
Distance --
Depth 8.00
AOCVM arc derates Launch
--------------------------------------
Early rise 0.9880
Early fall 0.9880
Late rise 1.0130
Late fall 1.0130
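The derates reported above can be cross-checked against the example tables with a short Python sketch. It rebuilds the example tables from the 1 +/- 1/(10*depth) rule and looks a derate up by stage count. The helper names are hypothetical. Two things are assumptions: that the dffprqx05_d table follows the same rule, and that a stage count beyond the last table column reuses the last entry, as the 4-stage dffprqx05_d table seems to do for FF1 at depth 8.

```python
from decimal import Decimal, ROUND_HALF_UP


def example_table(max_depth, late=True):
    """Example derates 1 +/- 1/(10*depth), rounded half-up to 3 decimals."""
    sign = 1 if late else -1
    step = Decimal("0.001")
    return [
        (1 + sign * Decimal(1) / (10 * depth)).quantize(step, rounding=ROUND_HALF_UP)
        for depth in range(1, max_depth + 1)
    ]


def lookup(table, depth):
    """Pick the derate for a stage count (>= 1); deeper paths reuse the last entry."""
    index = min(int(depth), len(table)) - 1
    return table[index]


buf_late = example_table(8)               # bufx10_d, late
buf_early = example_table(8, late=False)  # bufx10_d, early
ff_late = example_table(4)                # dffprqx05_d, late, 4 stages only

print("CKBUF1 launch  (depth 6):", lookup(buf_late, 6))
print("CKBUF1 capture (depth 5):", lookup(buf_early, 5))
print("BUF1 data      (depth 8):", lookup(buf_late, 8))
print("FF1 data       (depth 8):", lookup(ff_late, 8))
```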
Tempus results are consistent with those of PrimeTime. The advantage of Tempus report_timing
is a larger set of reported fields, so one can see the number of AOCV stages and the implied derate
directly in a timing report. The reported fields can be set either globally through
set_db timing_report_fields <list> or temporarily through report_timing -fields <list>.
There are slight differences (compared to PrimeTime) in how Tempus counts AOCV stages, such as
for the clock path to FF1/CK. In some cases, the reported stage count seems wrong and
does not correspond to the actual AOCV derate, e.g. CKBUF2 in both the data and clock path.
As the differences (to PrimeTime) are only in the common clock path, they cause no
difference in the final timing, thanks to CPPR removal/adjustment.
Hence in our example Tempus yields the same timing result as PrimeTime (a slack of 1.953).
@tempus> set_db timing_report_fields [concat [get_db timing_report_fields] stage_count aocv_derate]
@tempus> report_timing -late -path_type full_clock -from FF1 -to FF2
Group: CLK
Startpoint: (R) FF1/CK
Clock: (R) CLK
Endpoint: (F) FF2/D
Clock: (R) CLK
Capture Launch
Clock Edge:+ 10.000 0.000
Src Latency:+ 0.000 0.000
Net Latency:+ 4.927 (P) 5.053 (P)
Arrival:= 14.927 5.053
Setup:- 0.400
Uncertainty:- 0.500
Cppr Adjust:+ 0.080
Required Time:= 14.107
Launch Clock:= 5.053
Data Path:+ 7.101
Slack:= 1.953
Timing Path:
#-----------------------------------------------------------------------------------------------
# Pin Cell Load Trans Incr Delay Arrival User Total Aocv Aocv
# (pf) (ns) Delay (ns) (ns) Derate Derate Stage Derate
# Count
#-----------------------------------------------------------------------------------------------
clk (arrival) 0.003 0.003 0.000 0.000 0.000 - - 5.000 -
CKBUF1/Q bufx10_d 0.003 0.003 0.000 1.020 1.020 1.000 1.020 5.000 1.020 <-- Tempus does not count FF1 as a stage, hence 5 (instead of 6 in PrimeTime)
CKBUF2/Q bufx10_d 0.006 0.201 0.000 1.020 2.040 1.000 1.020 8.000 1.020 <-- Tempus reports 8 stages but counts 5 (i.e. as for CKBUF1, which is correct)
CKBUF3a1/Q bufx10_d 0.005 0.201 0.000 1.013 3.053 1.000 1.013 8.000 1.013
CKINV3a2/Q invx05_d 0.005 0.201 0.000 1.000 4.053 1.000 1.000 8.000 1.000
CKINV3a3/Q invx05_d 0.005 0.301 0.000 1.000 5.053 1.000 1.000 8.000 1.000
FF1/Q dffprqx05_d 0.003 0.201 0.000 3.075 8.128 1.000 1.025 8.000 1.025
BUF1/Q bufx10_d 0.005 0.301 0.000 1.013 9.141 1.000 1.013 8.000 1.013 <-- 8 stages as in PrimeTime
INV2/Q invx05_d 0.005 0.301 0.000 1.000 10.141 1.000 1.000 8.000 1.000
INV3/Q invx05_d 0.003 0.201 0.000 1.000 11.141 1.000 1.000 8.000 1.000
BUF4/Q bufx10_d 0.005 0.301 0.000 1.013 12.154 1.000 1.013 8.000 1.013
FF2/D dffprqx05_d 0.005 0.301 0.000 0.000 12.154 1.000 1.000 8.000 1.000
#-----------------------------------------------------------------------------------------------
Other End Path:
#-----------------------------------------------------------------------------------------------
# Pin Cell Load Trans Incr Delay Arrival User Total Aocv Aocv
# (pf) (ns) Delay (ns) (ns) Derate Derate Stage Derate
# Count
#-----------------------------------------------------------------------------------------------
clk (arrival) 0.003 0.003 0.000 0.000 10.000 - - 5.000 -
CKBUF1/Q bufx10_d 0.003 0.003 0.000 0.980 10.980 1.000 0.980 5.000 0.980 <-- 5 stages as in PrimeTime
CKBUF2/Q bufx10_d 0.006 0.201 0.000 0.980 11.960 1.000 0.980 3.000 0.980 <-- Tempus reports 3 stages but counts 5 (i.e. as for CKBUF1, which is correct)
CKBUF3b1/Q bufx10_d 0.005 0.201 0.000 0.967 12.927 1.000 0.967 3.000 0.967 <-- 3 stages as in PrimeTime
CKINV3b2/Q invx05_d 0.005 0.201 0.000 1.000 13.927 1.000 1.000 3.000 1.000
CKINV3b3/Q invx05_d 0.005 0.301 0.000 1.000 14.927 1.000 1.000 3.000 1.000
FF2/CK dffprqx05_d 0.005 0.201 0.000 0.000 14.927 1.000 1.000 3.000 1.000
#-----------------------------------------------------------------------------------------------
###################
######## actuals for CKBUF2 stage count and derates
###################
@tempus> foreach t {early late} { \
foreach a {data launch_clock capture_clock} { \
puts "aocv_stage_count_${a}_${t} = [get_db [get_arcs -from CKBUF2/A -to CKBUF2/Q] .aocv_stage_count_${a}_${t}]"; \
} \
}
aocv_stage_count_data_early = 5.0
aocv_stage_count_launch_clock_early = 5.0
aocv_stage_count_capture_clock_early = 5.0
aocv_stage_count_data_late = no_value
aocv_stage_count_launch_clock_late = 5.0
aocv_stage_count_capture_clock_late = 5.0
@tempus> foreach t {early late} { \
foreach a {data launch_clock capture_clock} { \
foreach e {rise fall} { \
puts "aocv_derate_${a}_${t}_${e} = [get_db [get_arcs -from CKBUF2/A -to CKBUF2/Q] .aocv_derate_${a}_${t}_${e}]"; \
} \
} \
}
aocv_derate_data_early_rise = 0.98
aocv_derate_data_early_fall = 0.98
aocv_derate_launch_clock_early_rise = 0.98
aocv_derate_launch_clock_early_fall = 0.98
aocv_derate_capture_clock_early_rise = 0.98
aocv_derate_capture_clock_early_fall = 0.98
aocv_derate_data_late_rise = 1.02
aocv_derate_data_late_fall = 1.02
aocv_derate_launch_clock_late_rise = 1.02
aocv_derate_launch_clock_late_fall = 1.02
aocv_derate_capture_clock_late_rise = 1.02
aocv_derate_capture_clock_late_fall = 1.02
###################
######## actuals for FF1 stage count and derates
###################
@tempus> foreach t {early late} { \
foreach a {data launch_clock capture_clock} { \
puts "aocv_stage_count_${a}_${t} = [get_db [get_arcs -from FF1/CK -to FF1/Q] .aocv_stage_count_${a}_${t}]"; \
} \
}
aocv_stage_count_data_early = 8.0
aocv_stage_count_launch_clock_early = 8.0
aocv_stage_count_capture_clock_early = no_value
aocv_stage_count_data_late = no_value
aocv_stage_count_launch_clock_late = 8.0
aocv_stage_count_capture_clock_late = no_value
@tempus> foreach t {early late} { \
foreach a {data launch_clock capture_clock} { \
foreach e {rise fall} { \
puts "aocv_derate_${a}_${t}_${e} = [get_db [get_arcs -from FF1/CK -to FF1/Q] .aocv_derate_${a}_${t}_${e}]"; \
} \
} \
}
aocv_derate_data_early_rise = 0.975
aocv_derate_data_early_fall = 0.975
aocv_derate_launch_clock_early_rise = 0.975
aocv_derate_launch_clock_early_fall = 0.975
aocv_derate_capture_clock_early_rise = 1.0
aocv_derate_capture_clock_early_fall = 1.0
aocv_derate_data_late_rise = 1.025
aocv_derate_data_late_fall = 1.025
aocv_derate_launch_clock_late_rise = 1.025
aocv_derate_launch_clock_late_fall = 1.025
aocv_derate_capture_clock_late_rise = 1.0
aocv_derate_capture_clock_late_fall = 1.0
###################
######## actuals for BUF1 derates (early & late reported separately)
###################
@tempus> report_delay_calculation -from BUF1/A -to BUF1/Q -min
From pin : BUF1/A
To Pin : BUF1/Q
Cell : bufx10_d
Library : testlib01
Arc sense : positive unate
Delay type : cell delay
Rise Fall
-------------------------------------------------------------
Input transition time : 0.200600 ns 0.300800 ns
Cell delay : 0.999900 ns 0.999900 ns
Timing Derate : 0.988000 0.988000
Derated Cell delay : 0.987900 ns 0.987900 ns
Output transition time : 0.200600 ns 0.300800 ns
-------------------------------------------------------------
@tempus> report_delay_calculation -from BUF1/A -to BUF1/Q -max
From pin : BUF1/A
To Pin : BUF1/Q
Cell : bufx10_d
Library : testlib01
Arc sense : positive unate
Delay type : cell delay
Rise Fall
-------------------------------------------------------------
Input transition time : 0.200600 ns 0.300800 ns
Cell delay : 1.000000 ns 1.000000 ns
Timing Derate : 1.013000 1.013000
Derated Cell delay : 1.013000 ns 1.013000 ns
Output transition time : 0.200600 ns 0.300800 ns
-------------------------------------------------------------
As AOCV models only process variation, users still need to somehow account for variation components from voltage and temperature. Hence some reduced flat OCV margins are typically used along with AOCV tables.
While the principle is simple and intuitive, the practice is a little harder. The flat OCV
component still comes from set_timing_derate, but STA tools may differ in how they
combine it with AOCV.
PrimeTime follows these rules:
- PT keeps the OCV margin as two components, a multiplicative scale factor (set_timing_derate) plus an additive margin (set_timing_derate -increment). The flat-only total derate formula is then <total derate> = <flat scaling> + <flat add margin>.
- In AOCV flow:
  - The OCV additive margin component is always added.
  - The OCV multiplicative scale component is applied differently to cells with and without AOCV model. Cells with AOCV models are only affected by set_timing_derate -aocvm_guardband. Cells with no AOCV model are only affected by set_timing_derate. This makes sense as a larger scale factor is to be used for cells with no AOCV tables.
  - OCV selection precedence rules apply as in flat-only OCV flow.
- Hence the total OCV derate formula becomes like this:
  - cells w/ AOCV model: <total derate> = <AOCV derate> * <flat aocvm_guardband scaling> + <flat add margin>
  - cells w/o AOCV model: <total derate> = <flat scaling> + <flat add margin>
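The two formulas can be written down as a small Python function, which makes it easy to check the derates in a timing report by hand. This is a sketch of the rules as stated above, not of PrimeTime itself, and the function name pt_total_derate is hypothetical. The sample values are the ones that appear for BUF1, INV2 and FF1 in the example of this section.

```python
def pt_total_derate(flat_scale=1.0, flat_incr=0.0, aocv=None, guardband=1.0):
    """Total late derate of one cell under the combination rules above.

    aocv is the AOCV table derate, or None for a cell with no AOCV model.
    guardband is the scale set through set_timing_derate -aocvm_guardband.
    """
    if aocv is None:
        # No AOCV model: the flat scale factor plus the additive margin.
        return round(flat_scale + flat_incr, 3)
    # AOCV model: the flat scale factor is ignored; only the guardband scales.
    return round(aocv * guardband + flat_incr, 3)


# BUF1: AOCV derate 1.013, flat scale 1.015 (ignored), increment 0.025
print(pt_total_derate(flat_scale=1.015, flat_incr=0.025, aocv=1.013))
# INV2: no AOCV model, flat scale 1.010, increment 0.020
print(pt_total_derate(flat_scale=1.010, flat_incr=0.020))
# FF1: AOCV derate 1.025, no flat derates
print(pt_total_derate(aocv=1.025))
```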
The following example shows combining OCV and AOCV timing without using -aocvm_guardband;
that is, the library AOCV is with no scaling. The example shows various combinations, with BUF1
having AOCV model and OCV scaling and incremental margin, FF1 having AOCV model but no OCV
derates, and INV2 having only OCV scaling and incremental margin.
pt_shell> reset_timing_derate
pt_shell> set_timing_derate -cell_delay -late 1.01 [get_lib_cells */invx05_d]
pt_shell> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */invx05_d]
pt_shell> set_timing_derate -cell_delay -late 1.015 [get_lib_cells */bufx10_d]
pt_shell> set_timing_derate -cell_delay -late 0.025 -increment [get_lib_cells */bufx10_d]
pt_shell> report_timing_derate
----- Clock ------ ------ Data ------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
----------------------------------------------------------------------------------------------
lib_cell: testlib01/bufx10_d
Cell delay -- 1.015 -- 1.015 -- 1.015 -- 1.015
lib_cell: testlib01/invx05_d
Cell delay -- 1.010 -- 1.010 -- 1.010 -- 1.010
pt_shell> report_timing_derate -increment
----------------------------------------------------------------------------------------------
lib_cell: testlib01/bufx10_d
Cell delay -- 0.025 -- 0.025 -- 0.025 -- 0.025
lib_cell: testlib01/invx05_d
Cell delay -- 0.020 -- 0.020 -- 0.020 -- 0.020
pt_shell> report_aocvm [get_timing_arc -from BUF1/A -to BUF1/Q]
...
AOCVM arc derates Launch
---------------------------------------
Early rise 0.9880
Early fall 0.9880
Late rise 1.0130
Late fall 1.0130
pt_shell> report_timing -from FF1 -to FF2 -derate -nosplit -path_type full_clock_expanded
Startpoint: FF1 (rising edge-triggered flip-flop clocked by CLK)
Endpoint: FF2 (rising edge-triggered flip-flop clocked by CLK)
Last common pin: CKBUF2/Q
Path Group: CLK
Path Type: max
Point Derate Incr Path aocv_derate aocv_scaling flat_scaling flat_inc
--------------------------------------------------------------------------------------------------------------
clock CLK (rise edge) 0.000 0.000
clock source latency 0.000 0.000
clk (in) 0.000 0.000 r
CKBUF1/Q (bufx10_d) 1.042 1.042 1.042 r
CKBUF2/Q (bufx10_d) 1.042 1.042 2.084 r 1.017 1.015 0.025 <--- 1.017 + 0.025 = 1.042
...
FF1/CK (dffprqx05_d) 1.000 0.000 5.182 r
FF1/Q (dffprqx05_d) <- 1.025 3.075 8.257 f 1.025
BUF1/Q (bufx10_d) 1.038 1.038 9.295 f 1.013 1.015 0.025 <--- 1.013 + 0.025 = 1.038
INV2/Q (invx05_d) 1.030 1.030 10.325 r n/a 1.010 0.020 <--- 1.010 + 0.020 = 1.030
...
FF2/D (dffprqx05_d) 1.000 0.000 12.393 f
data arrival time 12.393
clock CLK (rise edge) 10.000 10.000
clock source latency 0.000 10.000
clk (in) 0.000 10.000 r
CKBUF1/Q (bufx10_d) 0.980 0.980 10.980 r
CKBUF2/Q (bufx10_d) 0.980 0.980 11.960 r
CKBUF3b1/Q (bufx10_d) 0.967 0.967 12.927 r
CKINV3b2/Q (invx05_d) 1.000 1.000 13.927 f
CKINV3b3/Q (invx05_d) 1.000 1.000 14.927 r
FF2/CK (dffprqx05_d) 1.000 0.000 14.927 r
clock reconvergence pessimism 0.124 15.051
clock uncertainty -0.500 14.551
library setup time 1.000 -0.400 14.151
data required time 14.151
---------------------------------------------------------
data required time 14.151
data arrival time -12.393
---------------------------------------------------------
slack (MET) 1.758
Derate Summary Report
-------------------------------------------------------------------
total derate : required time 0.073
total derate : arrival time -0.393
-------------------------------------------------------------------
total derate : slack 0.466
slack (with derating applied) (MET) 1.758
clock reconvergence pessimism (due to derating) -0.124
-------------------------------------------------------------------
slack (with no derating) (MET) 2.100
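The figures in the derate summary are consistent with one another, which a quick arithmetic check makes visible. The relation between the rows used below is inferred from the printed values, not taken from tool documentation, and the variable names are chosen for this sketch only.

```python
# Values copied from the report above (ns).
data_required = 14.151
data_arrival = 12.393
derate_required = 0.073       # "total derate : required time"
derate_arrival = -0.393       # "total derate : arrival time"
crp_due_to_derating = -0.124  # "clock reconvergence pessimism (due to derating)"

slack = data_required - data_arrival
derate_slack = derate_required - derate_arrival
slack_no_derating = slack + derate_slack + crp_due_to_derating

print(f"{slack:.3f} {derate_slack:.3f} {slack_no_derating:.3f}")
# 1.758 0.466 2.100
```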
When adding -aocvm_guardband, all cells with an AOCV model are scaled accordingly:
pt_shell> set_timing_derate -cell_delay -late 1.02 -aocvm_guardband [get_lib_cells */invx05_d]
pt_shell> set_timing_derate -cell_delay -late 1.025 -aocvm_guardband [get_lib_cells */bufx10_d]
pt_shell> report_timing_derate -aocvm_guardband
----- Clock ------ ------ Data ------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
----------------------------------------------------------------------------------------------
lib_cell: testlib01/bufx10_d
Cell delay -- 1.025 -- 1.025 -- 1.025 -- 1.025
lib_cell: testlib01/invx05_d
Cell delay -- 1.020 -- 1.020 -- 1.020 -- 1.020
# Note that AOCV scaling shows up immediately in the reported aocv derating.
pt_shell> report_aocvm [get_timing_arc -from BUF1/A -to BUF1/Q]
...
AOCVM arc derates Launch
---------------------------------------
Early rise 0.9880
Early fall 0.9880
Late rise 1.0383
Late fall 1.0383
pt_shell> report_timing -from FF1 -to FF2 -derate -nosplit -path_type full_clock_expanded -significant_digits 5
Point Derate Incr Path aocv_derate aocv_scaling flat_scaling flat_inc
--------------------------------------------------------------------------------------------------------------------
...
CKBUF1/Q (bufx10_d) 1.05725 1.05725 1.05725 r ...
CKBUF2/Q (bufx10_d) 1.05725 1.05726 2.11451 r 1.017 1.015 1.015 0.025 <--- 1.017*1.015 + 0.025 = 1.057255
...
FF1/CK (dffprqx05_d) 1.00000 0.00000 5.22771 r
FF1/Q (dffprqx05_d) <- 1.02500 3.07500 8.30270 f 1.025
BUF1/Q (bufx10_d) 1.05319 1.05319 9.35590 f 1.013 1.015 1.015 0.025 <--- 1.013*1.015 + 0.025 = 1.053195
INV2/Q (invx05_d) 1.03000 1.03000 10.38590 r n/a 1.010 1.010 0.020 <--- 1.010 + 0.020 = 1.030
...
FF2/D (dffprqx05_d) 1.00000 0.00000 12.46910 f
data arrival time 12.46910
clock CLK (rise edge) 10.00000 10.00000
clock source latency 0.00000 10.00000
clk (in) 0.00000 10.00000 r
CKBUF1/Q (bufx10_d) 0.98000 0.98000 10.98000 r
CKBUF2/Q (bufx10_d) 0.98000 0.98000 11.96000 r
CKBUF3b1/Q (bufx10_d) 0.96700 0.96700 12.92700 r
CKINV3b2/Q (invx05_d) 1.00000 1.00000 13.92700 f
CKINV3b3/Q (invx05_d) 1.00000 1.00000 14.92700 r
FF2/CK (dffprqx05_d) 1.00000 0.00000 14.92700 r
clock reconvergence pessimism 0.15451 15.08151
clock uncertainty -0.50000 14.58151
library setup time 1.00000 -0.40000 14.18151
data required time 14.18151
---------------------------------------------------------------
data required time 14.18151
data arrival time -12.46910
---------------------------------------------------------------
slack (MET) 1.71242
Tempus follows these rules:
- Tempus keeps a single flat OCV multiplicative scaling factor. This scaling factor can be set instantly by set_timing_derate or incrementally by subsequent calls to set_timing_derate -add|-multiply. The total derate formula is simply <total derate> = <flat scaling>.
- Recent versions of Tempus allow an incremental additive margin that is maintained separately from the scaling factor (see Flat Deratings (Tempus)). This then affects how OCV gets combined with AOCV!
- In AOCV flow:
  - The OCV flat derate scaling factor is applied either multiplicatively (default) or, when the timing_aocv_derate_mode root attribute is set to aocv_additive, as an additive margin.
  - Tempus makes no distinction between cells with and without AOCV models. The same flat OCV scaling factor is used, following the OCV selection precedence rules (like in PrimeTime).
  - If -increment is used (in recent Tempus versions), the incremental additive margin is added to the product of the AOCV derate and the flat OCV scaling factor.
  - Hence the total OCV derate formula (note that <flat add margin> is an additive margin set through set_timing_derate -increment):
    - timing_aocv_derate_mode==aocv_multiplicative: <total derate> = <aocv_derate> * <flat scaling> + <flat add margin>.
    - timing_aocv_derate_mode==aocv_additive: <total derate> = <aocv_derate> + (<flat scaling> - 1.0) + <flat add margin>.
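The Tempus rules above can be sketched as a small Python function. The helper name `tempus_total_derate` and its arguments are hypothetical and only mirror the formulas; the sample values (AOCV derate 1.020, scaling 1.015, margin 0.025 for a clock buffer) are the ones appearing in the example that follows.

```python
def tempus_total_derate(
    aocv_derate: float,
    flat_scaling: float,
    flat_add_margin: float = 0.0,
    mode: str = "aocv_multiplicative",
) -> float:
    """Total derate per the rules above (hypothetical helper, not a Tempus API).

    For a cell without an AOCV model, pass aocv_derate=1.0.
    """
    if mode == "aocv_multiplicative":
        return aocv_derate * flat_scaling + flat_add_margin
    if mode == "aocv_additive":
        return aocv_derate + (flat_scaling - 1.0) + flat_add_margin
    raise ValueError(f"unknown mode: {mode}")


# Clock buffer: AOCV derate 1.020, flat scaling 1.015, margin 0.025
ckbuf = tempus_total_derate(1.020, 1.015, 0.025)
# Inverter with AOCV derate 1.0, flat scaling 1.010, margin 0.020
inv = tempus_total_derate(1.0, 1.010, 0.020)

print(f"{ckbuf:.4f} {inv:.4f}")  # 1.0603 1.0300
```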
The following example shows the timing in aocv_multiplicative mode. Notice that the results are the same as in PrimeTime, except for differences in the AOCV stage count (and hence in the AOCV and total derates) for CKBUF1 and CKBUF2. That difference is compensated, though, through CRPR removal.
@tempus> reset_timing_derate
@tempus> set_timing_derate -cell_delay -late 1.01 [get_lib_cells */invx05_d]
@tempus> set_timing_derate -cell_delay -late 0.02 -increment [get_lib_cells */invx05_d]
# the following yields same as `-cell_delay -late 1.015` without `-add`
@tempus> set_timing_derate -cell_delay -late 0.015 -add [get_lib_cells */bufx10_d]
@tempus> set_timing_derate -cell_delay -late 0.025 -increment [get_lib_cells */bufx10_d]
@tempus> report_timing_derate
-------------Clock----------------- ---------------Data-----------------
Rise Fall Rise Fall
Early Late Early Late Early Late Early Late
---------------------------------------------------------------------------
LibraryCell: testlib01/bufx10_d
Cell_delay -- 1.015 -- 1.015 -- 1.015 -- 1.015
Incremental_adjust 0.000 0.025 0.000 0.025 0.000 0.025 0.000 0.025
Input_switching -- -- -- -- -- -- -- --
LibraryCell: testlib01/invx05_d
Cell_delay -- 1.010 -- 1.010 -- 1.010 -- 1.010
Incremental_adjust 0.000 0.020 0.000 0.020 0.000 0.020 0.000 0.020
@tempus> get_db timing_aocv_derate_mode
aocv_multiplicative
@tempus> report_timing -late -path_type full_clock -from FF1 -to FF2
Path 1: MET (1.71242 ns) Setup Check with Pin FF2/CK->D
Group: CLK
Startpoint: (R) FF1/CK
Clock: (R) CLK
Endpoint: (F) FF2/D
Clock: (R) CLK
Capture Launch
Clock Edge:+ 10.00000 0.00000
Src Latency:+ 0.00000 0.00000
Net Latency:+ 4.92700 (P) 5.23380 (P)
Arrival:= 14.92700 5.23380
Setup:- 0.40000
Uncertainty:- 0.50000
Cppr Adjust:+ 0.16060
Required Time:= 14.18760
Launch Clock:= 5.23380
Data Path:+ 7.24139
Slack:= 1.71242
Timing Path:
#---------------------------------------------------------------------------------------------------------
# Pin Cell Load Trans Incr Delay Arrival User Total Aocv Aocv
# (pf) (ns) Delay (ns) (ns) Derate Derate Stage Derate
# Count
#---------------------------------------------------------------------------------------------------------
clk (arrival) 0.003 0.00300 0.00000 0.00000 0.00000 - - 5.00000 -
CKBUF1/Q bufx10_d 0.003 0.00300 0.00000 1.06030 1.06030 1.015 1.060 5.00000 1.020
CKBUF2/Q bufx10_d 0.006 0.20060 0.00000 1.06030 2.12060 1.015 1.060 8.00000 1.020 <-- 1.02 * 1.015 + 0.025 = 1.06030
CKBUF3a1/Q bufx10_d 0.005 0.20060 0.00000 1.05319 3.17379 1.015 1.053 8.00000 1.013
CKINV3a2/Q invx05_d 0.005 0.20060 0.00000 1.03000 4.20379 1.010 1.030 8.00000 1.000
CKINV3a3/Q invx05_d 0.005 0.30080 0.00000 1.03000 5.23380 1.010 1.030 8.00000 1.000
FF1/Q dffprqx05_d 0.003 0.20060 0.00000 3.07500 8.30879 1.000 1.025 8.00000 1.025
BUF1/Q bufx10_d 0.005 0.30080 0.00000 1.05319 9.36199 1.015 1.053 8.00000 1.013 <-- 1.013 * 1.015 + 0.025 = 1.053195
INV2/Q invx05_d 0.005 0.30080 0.00000 1.03000 10.39199 1.010 1.030 8.00000 1.000
INV3/Q invx05_d 0.003 0.20060 0.00000 1.03000 11.42199 1.010 1.030 8.00000 1.000
BUF4/Q bufx10_d 0.005 0.30080 0.00000 1.05319 12.47518 1.015 1.053 8.00000 1.013
FF2/D dffprqx05_d 0.005 0.30080 0.00000 0.00000 12.47518 1.000 1.000 8.00000 1.000
#---------------------------------------------------------------------------------------------------------
Other End Path:
#---------------------------------------------------------------------------------------------------------
# Pin Cell Load Trans Incr Delay Arrival User Total Aocv Aocv
# (pf) (ns) Delay (ns) (ns) Derate Derate Stage Derate
# Count
#---------------------------------------------------------------------------------------------------------
clk (arrival) 0.003 0.00300 0.00000 0.00000 10.00000 - - 5.00000 -
CKBUF1/Q bufx10_d 0.003 0.00300 0.00000 0.98000 10.98000 1.000 0.980 5.00000 0.980
CKBUF2/Q bufx10_d 0.006 0.20060 0.00000 0.98000 11.96000 1.000 0.980 3.00000 0.980
CKBUF3b1/Q bufx10_d 0.005 0.20060 0.00000 0.96700 12.92700 1.000 0.967 3.00000 0.967
CKINV3b2/Q invx05_d 0.005 0.20060 0.00000 1.00000 13.92700 1.000 1.000 3.00000 1.000
CKINV3b3/Q invx05_d 0.005 0.30080 0.00000 1.00000 14.92700 1.000 1.000 3.00000 1.000
FF2/CK dffprqx05_d 0.005 0.20060 0.00000 0.00000 14.92700 1.000 1.000 3.00000 1.000
#---------------------------------------------------------------------------------------------------------
The difference between the two timing_aocv_derate_mode settings can be fairly small if the derates are reasonably sized. Using the preceding setup, here is an example of aocv_multiplicative and aocv_additive used for calculating the cell delay of BUF1:
@tempus> set_db timing_aocv_derate_mode aocv_multiplicative
@tempus> report_delay_calculation -from BUF1/A -to BUF1/Q
From pin : BUF1/A
To Pin : BUF1/Q
Cell : bufx10_d
Library : testlib01
Arc sense : positive unate
Delay type : cell delay
Rise Fall
-------------------------------------------------------------
Input transition time : 0.200600 ns 0.300800 ns
Cell delay : 0.999900 ns 0.999900 ns
Timing Derate : 1.053195 1.053195
Derated Cell delay : 1.053100 ns 1.053100 ns
Output transition time : 0.200600 ns 0.300800 ns
-------------------------------------------------------------
@tempus> set_db timing_aocv_derate_mode aocv_additive
@tempus> report_delay_calculation -from BUF1/A -to BUF1/Q
...
Rise Fall
-------------------------------------------------------------
Input transition time : 0.200600 ns 0.300800 ns
Cell delay : 0.999900 ns 0.999900 ns
Timing Derate : 1.053000 1.053000
Derated Cell delay : 1.052900 ns 1.052900 ns
Output transition time : 0.200600 ns 0.300800 ns
-------------------------------------------------------------
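The gap between the two modes can be written in closed form: subtracting the additive formula from the multiplicative one leaves (aocv_derate - 1) * (flat_scaling - 1), a product of two small numbers. The sketch below checks this for BUF1 using the values reported above; the variable names are illustrative only.

```python
aocv_derate = 1.013    # BUF1 AOCV derate
flat_scaling = 1.015   # flat OCV scaling for bufx10_d
add_margin = 0.025     # incremental margin for bufx10_d
cell_delay = 0.9999    # BUF1 cell delay before derating, in ns

multiplicative = aocv_derate * flat_scaling + add_margin
additive = aocv_derate + (flat_scaling - 1.0) + add_margin

# multiplicative - additive == (aocv_derate - 1) * (flat_scaling - 1)
gap = (aocv_derate - 1.0) * (flat_scaling - 1.0)

print(f"{multiplicative:.6f} {additive:.6f} {gap:.6f}")
# 1.053195 1.053000 0.000195
print(f"{cell_delay * multiplicative:.4f} {cell_delay * additive:.4f}")
# 1.0531 1.0529
```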
When using aocv_additive, the full path timing looks as follows and, as expected, yields a somewhat different slack:
@tempus> set_db timing_aocv_derate_mode aocv_additive
@tempus> report_timing -late -path_type full_clock -from FF1 -to FF2
Group: CLK
Startpoint: (R) FF1/CK
Clock: (R) CLK
Endpoint: (F) FF2/D
Clock: (R) CLK
Capture Launch
Clock Edge:+ 10.00000 0.00000
Src Latency:+ 0.00000 0.00000
Net Latency:+ 4.92700 (P) 5.23300 (P)
Arrival:= 14.92700 5.23300
Setup:- 0.40000
Uncertainty:- 0.50000
Cppr Adjust:+ 0.16000
Required Time:= 14.18700
Launch Clock:= 5.23300
Data Path:+ 7.24100
Slack:= 1.71300
Timing Path:
#---------------------------------------------------------------------------------------------------------
# Pin Cell Load Trans Incr Delay Arrival User Total Aocv Aocv
# (pf) (ns) Delay (ns) (ns) Derate Derate Stage Derate
# Count
#---------------------------------------------------------------------------------------------------------
clk (arrival) 0.003 0.00300 0.00000 0.00000 0.00000 - - 5.00000 -
CKBUF1/Q bufx10_d 0.003 0.00300 0.00000 1.06000 1.06000 1.015 1.060 5.00000 1.020
CKBUF2/Q bufx10_d 0.006 0.20060 0.00000 1.06000 2.12000 1.015 1.060 8.00000 1.020
CKBUF3a1/Q bufx10_d 0.005 0.20060 0.00000 1.05300 3.17300 1.015 1.053 8.00000 1.013
CKINV3a2/Q invx05_d 0.005 0.20060 0.00000 1.03000 4.20300 1.010 1.030 8.00000 1.000
CKINV3a3/Q invx05_d 0.005 0.30080 0.00000 1.03000 5.23300 1.010 1.030 8.00000 1.000
FF1/Q dffprqx05_d 0.003 0.20060 0.00000 3.07500 8.30800 1.000 1.025 8.00000 1.025
BUF1/Q bufx10_d 0.005 0.30080 0.00000 1.05300 9.36100 1.015 1.053 8.00000 1.013
INV2/Q invx05_d 0.005 0.30080 0.00000 1.03000 10.39100 1.010 1.030 8.00000 1.000
INV3/Q invx05_d 0.003 0.20060 0.00000 1.03000 11.42100 1.010 1.030 8.00000 1.000
BUF4/Q bufx10_d 0.005 0.30080 0.00000 1.05300 12.47400 1.015 1.053 8.00000 1.013
FF2/D dffprqx05_d 0.005 0.30080 0.00000 0.00000 12.47400 1.000 1.000 8.00000 1.000
#---------------------------------------------------------------------------------------------------------
Other End Path:
#---------------------------------------------------------------------------------------------------------
# Pin Cell Load Trans Incr Delay Arrival User Total Aocv Aocv
# (pf) (ns) Delay (ns) (ns) Derate Derate Stage Derate
# Count
#---------------------------------------------------------------------------------------------------------
clk (arrival) 0.003 0.00300 0.00000 0.00000 10.00000 - - 5.00000 -
CKBUF1/Q bufx10_d 0.003 0.00300 0.00000 0.98000 10.98000 1.000 0.980 5.00000 0.980
CKBUF2/Q bufx10_d 0.006 0.20060 0.00000 0.98000 11.96000 1.000 0.980 3.00000 0.980
CKBUF3b1/Q bufx10_d 0.005 0.20060 0.00000 0.96700 12.92700 1.000 0.967 3.00000 0.967
CKINV3b2/Q invx05_d 0.005 0.20060 0.00000 1.00000 13.92700 1.000 1.000 3.00000 1.000
CKINV3b3/Q invx05_d 0.005 0.30080 0.00000 1.00000 14.92700 1.000 1.000 3.00000 1.000
FF2/CK dffprqx05_d 0.005 0.20060 0.00000 0.00000 14.92700 1.000 1.000 3.00000 1.000
#---------------------------------------------------------------------------------------------------------
As AOCV models only the local process variation of transistors/gates, other local variations (V, T, RC) still need to be modeled by flat OCV; hence the two margining methods are combined. In practice this means nothing more than enabling the AOCV flow in the STA tool and specifying flat OCV margins alongside it.
The following lists key aspects of combining AOCV and OCV deratings:
- Synopsys PrimeTime
  - PrimeTime maintains two derating components, a (multiplicative) scaling factor and an additive margin. This is consistent with flat OCV deratings.
  - PrimeTime differentiates between cells with and without AOCV models by separating the multiplicative scaling factor.
    - For cells without AOCV, it uses the flat scaling defined by set_timing_derate <factor>.
    - For cells with AOCV, it uses the aocv scaling defined by set_timing_derate -aocvm_guardband <factor>.
  - The flat add margin is always applied.
  - The total timing derate is then:
    - w/o AOCV: <total derate> = 1.0 * <flat scaling> + <flat add margin>
    - w/- AOCV: <total derate> = <aocv derate> * <aocv scaling> + <flat add margin>
- Cadence Tempus
  - Tempus maintains a single (multiplicative) scaling factor. This scaling factor is used regardless of whether the cell has an AOCV model.
  - Users can select if the scaling factor is to be applied multiplicatively or additively. This setting is global and defaults to multiplicative.
  - Tempus versions that support set_timing_derate -increment also maintain a separate flat add margin.
  - The total timing derate is then (based on timing_aocv_derate_mode; for cells without AOCV, <aocv_derate> = 1.0):
    - aocv_multiplicative scaling: <total derate> = <aocv derate> * <flat scaling> [+ <flat add margin>]
    - aocv_additive scaling: <total derate> = <aocv derate> + (<flat scaling> - 1) [+ <flat add margin>]
Some recommendations for yielding consistent analysis results across tools:
Follow the recommendations for flat OCVs.
Use separate scaling factors for cells with and without AOCV models. Typically only standard cells come with AOCV models. Those cells that do tend to have lower flat OCV margins anyway.
Separating cells into w/- and w/o AOCV is really only needed for PrimeTime, to know when to use the -aocvm_guardband option. A conservative approach may be to always apply both set_timing_derate and set_timing_derate -aocvm_guardband with the same factor, as the two factors do not interfere with one another.
Avoid incremental margins altogether if possible. The only incremental margin that yields consistent results is set_timing_derate -increment.
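The conservative approach mentioned above (the same factor for set_timing_derate and set_timing_derate -aocvm_guardband) can be illustrated with a small sketch. It models the two total-derate formulas from the summary as plain Python functions with hypothetical names, not tool APIs, and compares the tools for a cell with an AOCV model: once with the guardband left at 1.0 and once with the guardband equal to the flat scaling factor.

```python
def primetime_derate(aocv, flat_scaling, guardband, add_margin):
    """PrimeTime formula; aocv is None for a cell without an AOCV model."""
    if aocv is None:
        return flat_scaling + add_margin
    return aocv * guardband + add_margin


def tempus_derate(aocv, flat_scaling, add_margin):
    """Tempus formula in aocv_multiplicative mode; no AOCV model means 1.0."""
    return (1.0 if aocv is None else aocv) * flat_scaling + add_margin


aocv, flat, margin = 1.013, 1.015, 0.025  # BUF1 values from the examples

# Guardband left at 1.0: the tools disagree for a cell with an AOCV model.
pt_default = primetime_derate(aocv, flat, 1.0, margin)
# Guardband set to the same factor as the flat scaling: the tools agree.
pt_matched = primetime_derate(aocv, flat, flat, margin)
tempus = tempus_derate(aocv, flat, margin)

print(f"{pt_default:.6f} {pt_matched:.6f} {tempus:.6f}")
# 1.038000 1.053195 1.053195
```

For a cell without an AOCV model both functions reduce to the flat scaling plus the additive margin, so the guardband setting does not matter there.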
Some more practical considerations:
It may happen that the worst-case AOCV PVT (i.e. the one with the largest margins) does not coincide with the worst-case Liberty/timing PVT. This may shift the combined worst case away from the Liberty-only one, or may call for sign-off at more potentially worst-case corners.
Another inconvenience is that some tools may not honor (A)OCV in all phases, or may optimize at only a single PVT.
AOCV ceases to model OCV effects fairly as technology nodes scale further. First, AOCV derates scale only cell delays and do not address timing constraints (e.g. setup/hold times) in any way. Second, AOCV factors are simple 1D tables and hence do not capture variation with input slew or cell loading.
The next step in OCV evolution is statistical OCV (SOCV), where potentially every timing parameter is represented by its mean value (i.e. the 2D tables from the good old timing library) and statistical distribution parameters (such as sigma) as either 1D or 2D tables. The STA tool then does "simple" statistical calculations to compute the cumulative statistical effect of propagating an input distribution through a series of cells in the timing path. This yields more accurate (and still computationally affordable) OCV modeling than the other OCV methods. This approach also allows modeling distributions that are "skewed" and "tilted" relative to the typical normal one.
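As a rough illustration of what such a calculation can look like, the sketch below propagates a mean and a sigma through a chain of cells. It assumes that every stage delay is an independent normal variable, so means add and sigmas combine as a root sum of squares; real SOCV engines are more elaborate, and the stage values and the 3-sigma sign-off point are made-up numbers.

```python
import math

# (mean delay in ns, sigma in ns) per stage; illustrative values only
stages = [(1.00, 0.03), (1.00, 0.04), (3.00, 0.12)]

path_mean = sum(mean for mean, _ in stages)
# Independent normal delays: variances add, so sigma is a root sum of squares.
path_sigma = math.sqrt(sum(sigma ** 2 for _, sigma in stages))

n_sigma = 3.0  # assumed sign-off point
late_delay = path_mean + n_sigma * path_sigma

print(f"mean={path_mean:.3f} sigma={path_sigma:.3f} late={late_delay:.3f}")
# mean=5.000 sigma=0.130 late=5.390
```

Adding up the per-stage 3-sigma values instead would give 5.57 ns, which hints at why a statistical treatment can be less pessimistic than stacking worst-case margins stage by stage.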
SOCV details are not covered in this text.
While OCV is not a complicated matter, using it across different tools with consistent results may quickly turn into a nightmare in complex derate setups. It is therefore generally recommended to minimize the use of incremental derates.
| [StaBasics] | Brabec, Tomas. Static Timing Analysis Basics. On-line. |




