# Adaptive Techniques for Overcoming Performance Degradation due to Aging in Digital Circuits

Sanjay Kumar, Chris Kim, Sachin Sapatnekar

## **Negative Bias Temperature Instability (NBTI)**



# Impact of BTI

- 25-30% degradation in PMOS V<sub>th</sub> – drain current reduces
- Positive Bias Temperature Instability (PBTI)
  - In NMOS devices when  $V_{gs} = V_{dd}$
  - Lower impact reported as compared with PMOS NBTI
  - Increasing impact with Hf-based high-k dielectrics
- Challenges in nanometer design
  - Quantify the impact of BTI on circuit performance
  - Design robust circuits



[Alam, IRPS05]

# **Overcoming BTI in Digital Circuits**



## Sizing for Reliability [DATE06, ICCD06]



# **BTI-Aware Synthesis** [DAC07]



#### **Limitations of "One-time" Fixes**





*Circuit synthesized s.t.*  $D(t_{life}) \leq D_{spec}$ 

Temporal leakage of BTI-aware synthesized circuit

- Circuit runs faster than spec for  $t < t_{life}$
- Burns additional power in comparison with nominal design due to design-time (one-time) fix
- Leakage decreases below budget for t closer to t<sub>life</sub>
- Potential for leakage-performance tradeoff not utilized

## **Adaptive Techniques**



## **Ideal Case**



Decrease in currents with increase in V<sub>th</sub>

Increase in  $I_{on}$  and  $I_{off}$  (measured at  $t_{life}$ ) with PMOS body bias

- NBTI causes on and subthreshold currents to decrease
- FBB (of around 0.3V) to the PMOS device sets  $\rm V_{th}$  back to  $\rm V_{th0}$
- On and subthreshold currents back to their nominal values
- Effectively, device reset to its original state?

#### UNIVERSITY OF MINNESOTA

## Leakage Components



Delay (Rise + Fall)/2 for an inverter with FBB



Leakage power of an inverter with FBB






Components of leakage power

## **Problem Formulation**

- Exponential increase in junction leakage with FBB
- Complete recovery in performance without leakage overhead not possible with ABB
- Use ASV (Adaptive Supply Voltage) as an additional knob

- ASV (Adaptive Supply Voltage)
  - Better control over performance (delay) with V<sub>dd</sub> than V<sub>bb</sub>
  - Leakage still increases (exponentially) with V<sub>dd</sub>
  - Active power increases quadratically with V<sub>dd</sub>
  - Minimize overall power consumed subject to delay constraints
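
The quadratic dependence of active power on V<sub>dd</sub> can be checked against the active-power column of the "des" lookup table in this deck. The following is a minimal sketch, assuming that active power scales purely as V<sub>dd</sub><sup>2</sup> from the nominal point (641 µW at 1.00 V) at fixed frequency and switched capacitance; the function name `scaled_active_power` is illustrative and not from the original work.

```python
# Nominal operating point for "des", taken from the lookup table in this deck.
V_NOM = 1.00       # V
P_ACT_NOM = 641.0  # uW

def scaled_active_power(vdd, v_nom=V_NOM, p_nom=P_ACT_NOM):
    """Active power assuming P_act ~ V_dd^2 (fixed frequency and capacitance)."""
    return p_nom * (vdd / v_nom) ** 2

# (V_dd in V, reported P_act in uW) pairs copied from the same table.
reported = [(1.03, 680), (1.06, 721), (1.09, 762), (1.12, 804)]

for vdd, p_table in reported:
    p_model = scaled_active_power(vdd)
    print(f"Vdd={vdd:.2f} V  model={p_model:6.1f} uW  table={p_table} uW")
```

Under this assumption the scaled values agree with the tabulated ones to within about 1 µW, which is consistent with the quadratic trend stated above.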

### **Problem Formulation**

Total power: $P = f(P_{active}, P_{leakage})$

- Active power: $P_{active}(t, V_{dd})$
- Leakage power: $P_{leakage}(t, v_{bn}, v_{bp}, V_{dd})$

 $\begin{array}{l} \text{Minimize } P \\ \text{s.t. } D(t, v_{bn}, v_{bp}, V_{dd}) \leq D_{spec} \\ 0 \leq v_{bn}(t) \leq v_{bnmax} \\ 0 \leq v_{bp}(t) \leq v_{bpmax} \\ 0 \leq t \leq t_{life} \end{array}$ 

# **Control System**

- Critical path replica based
  - Large number of critical paths required for an identical f<sub>max</sub> distribution as CUT
  - Aging of critical path replicas depends on signal probabilities and is usage specific; it cannot be predicted *a priori*
  - Critical paths can change temporally based on relative aging of paths



[Intel, JSSC2002]

# **Control System**

- Lookup table based
  - Stores optimal values referenced by time
  - Software routine (assumed) to track time of usage
  - On chip local body bias and  $\mathrm{V}_{\mathrm{dd}}$  generators
  - Optimal (v<sub>bn</sub>, v<sub>bp</sub>, V<sub>dd</sub>) precomputed at design time by estimating degradation in delay considering BTI



| Time (s) | v<sub>bn</sub> (V) | v<sub>bp</sub> (V) | V<sub>dd</sub> (V) |
|----------|---------------------|---------------------|---------------------|
|          |                     |                     |                     |
|          |                     |                     |                     |

# How to precompute delay degradation

- Signal activity based model
  - Cannot predict signal probabilities a priori
  - Circuit must work under all conditions
- Worst-case method
  - Assume maximal degradation of all NMOS and PMOS devices
  - Compute delay of the circuit at different times
  - Upper bound over the temporal delay of the circuit


## **Optimal ABB/ASV Computation**



- Amount of compensation at t<sub>0</sub> depends on degradation in [t<sub>0</sub>,t<sub>1</sub>]
- Compute degradation assuming worst-case aging
- Second order dependence of the extent of trap generation on V<sub>dd</sub>
- Determine optimal (V<sub>dd</sub>, v<sub>bn</sub>, v<sub>bp</sub>) such that delay is met and power is minimized using an enumeration based algorithm [KumarTVLSI08]
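
The search over (v<sub>bn</sub>, v<sub>bp</sub>, V<sub>dd</sub>) can be illustrated with a brute-force enumeration over the discrete voltage grid. This is only a sketch and not the algorithm of [KumarTVLSI08]: the delay and power models, their coefficients, and the limits `vb_max` and `vdd_max` are placeholder assumptions chosen for illustration. The grid steps (50 mV for body bias, 30 mV for V<sub>dd</sub>) and the nominal "des" numbers are taken from this deck.

```python
import math
from itertools import product

D_SPEC = 355.0  # ps, delay spec for "des" from this deck
P_ACT0 = 641.0  # uW, nominal active power for "des"
P_LKG0 = 327.0  # uW, nominal leakage power for "des"

def delay(vbn, vbp, vdd, aging):
    """Toy delay model (assumption): aging slows the circuit,
    forward body bias and a higher V_dd speed it up."""
    speedup = (1.0 - 0.3 * (vbn + vbp)) * (1.0 / vdd) ** 1.3
    return D_SPEC * (1.0 + aging) * speedup

def power(vbn, vbp, vdd):
    """Toy power model (assumption): quadratic active power,
    leakage exponential in body bias and V_dd."""
    active = P_ACT0 * vdd ** 2
    leakage = P_LKG0 * math.exp(4.0 * (vbn + vbp) + 3.0 * (vdd - 1.0))
    return active + leakage

def enumerate_optimum(aging, vb_max=0.30, vdd_max=1.12):
    """Return (power, v_bn, v_bp, V_dd) with minimum power that meets
    D_SPEC for the given fractional aging, or None if no point does."""
    vb_grid = [round(0.05 * i, 2) for i in range(round(vb_max / 0.05) + 1)]
    vdd_grid = [round(1.00 + 0.03 * i, 2)
                for i in range(round((vdd_max - 1.00) / 0.03) + 1)]
    best = None
    for vbn, vbp, vdd in product(vb_grid, vb_grid, vdd_grid):
        if delay(vbn, vbp, vdd, aging) > D_SPEC:
            continue  # violates the delay constraint
        p = power(vbn, vbp, vdd)
        if best is None or p < best[0]:
            best = (p, vbn, vbp, vdd)
    return best

if __name__ == "__main__":
    for aging in (0.0, 0.05, 0.10, 0.15):
        print(aging, enumerate_optimum(aging))
```

With 7 body-bias levels per device type and 5 supply levels, the grid in this sketch has only 245 points, so exhaustive enumeration at each adaptation time is cheap.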


## Lookup Table (LGSYNTH93 Circuit "des")

| Time (×10<sup>8</sup> s) | v<sub>bn</sub> (V) | v<sub>bp</sub> (V) | V<sub>dd</sub> (V) | Delay (ps) | P<sub>act</sub> (µW) | P<sub>lkg</sub> (µW) | % Increase |
|---------------------------|-----------------------|-----------------------|---------------------|------------|------------------|------------------|------------|
| Nominal                   | 0.00                  | 0.00                  | 1.00                | 355        | 641              | 327              |            |
| 0.0000                    | 0.00                  | 0.05                  | 1.03                | 341        | 680              | 416              | 16%        |
| 0.0001                    | 0.00                  | 0.05                  | 1.03                | 351        | 680              | 346              | 6%         |
| 0.0004                    | 0.00                  | 0.10                  | 1.03                | 351        | 680              | 362              | 8%         |
| 0.0016                    | 0.05                  | 0.10                  | 1.03                | 351        | 680              | 369              | 9%         |
| 0.0035                    | 0.00                  | 0.05                  | 1.06                | 352        | 721              | 344              | 9%         |
| 0.0080                    | 0.05                  | 0.05                  | 1.06                | 351        | 721              | 357              | 11%        |
| 0.0180                    | 0.05                  | 0.10                  | 1.06                | 351        | 721              | 368              | 12%        |
| 0.0400                    | 0.10                  | 0.10                  | 1.06                | 352        | 721              | 377              | 14%        |
| 0.0600                    | 0.00                  | 0.10                  | 1.09                | 351        | 762              | 353              | 13%        |
| 0.1100                    | 0.05                  | 0.10                  | 1.09                | 351        | 762              | 360              | 14%        |
| 0.1700                    | 0.10                  | 0.20                  | 1.06                | 352        | 721              | 398              | 17%        |
| 0.2500                    | 0.05                  | 0.15                  | 1.09                | 352        | 762              | 362              | 15%        |
| 0.3600                    | 0.05                  | 0.20                  | 1.09                | 351        | 762              | 388              | 19%        |
| 0.5500                    | 0.10                  | 0.20                  | 1.09                | 351        | 762              | 396              | 20%        |
| 0.7500                    | 0.05                  | 0.15                  | 1.12                | 352        | 804              | 359              | 17%        |
| 1.0000                    |                       |                       |                     | 355        | 804              | 350              | 16%        |
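
A time-indexed lookup of this kind can be sketched in a few lines. The example below copies the voltage columns of the table above and assumes that each entry stays in force from its listed time until the next entry's time; the function name `settings_at` and the use of elapsed stress time in seconds as the index are illustrative assumptions. The final row (1.0 × 10<sup>8</sup> s) lists no voltages in the source and is left out.

```python
from bisect import bisect_right

# (time in s, v_bn in V, v_bp in V, V_dd in V) from the "des" table;
# times are the table values multiplied by 1e8 s.
LUT = [
    (0.0,    0.00, 0.05, 1.03),
    (1.0e4,  0.00, 0.05, 1.03),
    (4.0e4,  0.00, 0.10, 1.03),
    (1.6e5,  0.05, 0.10, 1.03),
    (3.5e5,  0.00, 0.05, 1.06),
    (8.0e5,  0.05, 0.05, 1.06),
    (1.8e6,  0.05, 0.10, 1.06),
    (4.0e6,  0.10, 0.10, 1.06),
    (6.0e6,  0.00, 0.10, 1.09),
    (1.1e7,  0.05, 0.10, 1.09),
    (1.7e7,  0.10, 0.20, 1.06),
    (2.5e7,  0.05, 0.15, 1.09),
    (3.6e7,  0.05, 0.20, 1.09),
    (5.5e7,  0.10, 0.20, 1.09),
    (7.5e7,  0.05, 0.15, 1.12),
]
TIMES = [row[0] for row in LUT]

def settings_at(t_seconds):
    """Return (v_bn, v_bp, V_dd) of the latest entry whose time is <= t_seconds."""
    if t_seconds < 0:
        raise ValueError("stress time must be non-negative")
    index = bisect_right(TIMES, t_seconds) - 1
    return LUT[index][1:]

if __name__ == "__main__":
    for t in (0, 5.0e4, 1.6e5, 9.0e7):
        print(t, settings_at(t))
```

Binary search keeps the lookup cost logarithmic in the number of entries, although for a table of this size a linear scan would serve equally well.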

## **Temporal Delay**



Temporal delay of a benchmark with worst-case synthesis and our ABB/ASV based adaptive approach

## **Temporal Power**





Temporal active power with different approaches

Leakage power versus time using different approaches

| Power   | Synthesis                                                                              | Our (Adaptive) Approach                                                                                                                                           |
|---------|----------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Active  | Constant, large overhead                                                               | Increases in steps temporally with $\mathbf{V}_{dd}$                                                                                                              |
| Leakage | Highest at t=0s when<br>there is no BTI<br>Decreases below nominal<br>value temporally | Varies with time – always greater than<br>nominal value since ABB/ASV is<br>applied to compensate for aging<br>Max leakage (at t=0s) comparable<br>with synthesis |

# **Optimal Adaptation Time Selection**

- Number of points chosen depends on
  - Ability of software routine to track time with desired accuracy
  - Discreteness in generating ABB/ASV voltages (50mV for v<sub>bn</sub>, v<sub>bp</sub>, and 30mV for V<sub>dd</sub> in our case)
  - Minimum change in delay over [t<sub>i</sub>,t<sub>i+1</sub>] subject to modeling errors (assumed to be 1% in our case)
  - Resolution of mapping each delay to a unique optimal  $(v_{bn}, v_{bp}, V_{dd})$  using our enumeration algorithm
  - BTI model accuracy particularly for very small values of t << t<sub>life</sub> (model asymptotically accurate beyond 10<sup>4</sup>s with t<sub>life</sub>=10<sup>8</sup>s)

# **Optimal Adaptation Time Selection**

- Want to compensate as much as is required only – keep delay closest to D<sub>spec</sub>
- Larger number of points leads to
  - Lower degradation in each time interval
  - Minimal ABB/ASV to compensate for increase in delay in each [t<sub>i</sub>,t<sub>i+1</sub>]
  - Less overall temporal power overhead
- Compensating at t=0s only is overkill (as compared with synthesis)
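
One way to picture the selection of adaptation times is to place a point each time the uncompensated delay has grown by the minimum resolvable step. The sketch below does this under a loudly stated assumption: delay degradation follows a power law in time with exponent 1/6, which is not given in this deck. The 15% end-of-life degradation (the "des" benchmark), t<sub>life</sub> = 10<sup>8</sup> s, the 1% minimum step, and the 10<sup>4</sup> s model-validity floor come from the slides; the function names are hypothetical.

```python
def degradation(t, d_life=0.15, t_life=1e8, n=1.0 / 6.0):
    """Fractional delay increase at time t (assumed power law in t)."""
    return d_life * (t / t_life) ** n

def adaptation_times(min_step=0.01, d_life=0.15, t_life=1e8,
                     t_min=1e4, n=1.0 / 6.0):
    """Times at which degradation crosses successive multiples of min_step.

    Crossings earlier than t_min are skipped because the BTI model is not
    trusted there, so the setting applied at t = 0 has to cover them.
    """
    times = [0.0]
    k = 1
    while True:
        # Invert the power law: time at which degradation equals k * min_step.
        t_k = t_life * (k * min_step / d_life) ** (1.0 / n)
        if t_k >= t_life:
            break
        if t_k >= t_min:
            times.append(t_k)
        k += 1
    return times

if __name__ == "__main__":
    for t in adaptation_times():
        print(f"{t:12.4g} s  degradation={degradation(t):.3f}")
```

Because the assumed degradation is steep early and flat late, the resulting points crowd toward small t, which is qualitatively similar to the spacing of the time column in the "des" lookup table.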



#### Active power versus time for different cases



Leakage power versus time for different cases

### **Area and Power Overhead**

| Benchmark | Nominal circuit: D<sub>spec</sub> (ps) | Nominal circuit: increase in delay (BTI for 3 years) | Nominal circuit: area (µm²) | Our approach: maximal increase in active power | Our approach: maximal increase in leakage power | Worst-case synthesis: area overhead | Worst-case synthesis: maximal increase in active power | Worst-case synthesis: maximal increase in leakage power |
|-----------|------|-----|--------|-----|-----|-----|-----|-----|
| b14       | 1078 | 14% | 95626  | 19% | 26% | 16% | 17% | 16% |
| b15       | 902  | 13% | 179096 | 19% | 26% | 16% | 18% | 15% |
| C3540     | 769  | 14% | 18692  | 25% | 30% | 32% | 37% | 38% |
| C5315     | 729  | 15% | 29951  | 19% | 29% | 14% | 18% | 25% |
| C7552     | 616  | 15% | 42261  | 19% | 29% | 18% | 19% | 15% |
| des       | 355  | 15% | 81777  | 25% | 27% | 35% | 38% | 28% |
| i8        | 840  | 17% | 55128  | 25% | 26% | 18% | 71% | 44% |
| i10       | 830  | 14% | 4063   | 25% | 32% | 21% | 26% | 28% |
| Avg       |      | 15% |        | 23% | 27% | 24% | 30% | 26% |

## Summary

- BTI causes delay to increase and leakage to decrease
- Existing "one-time" fix techniques (sizing, synthesis) lead to large area and power overhead
- Attempt to recover performance through available slack in leakage, adaptively
- ABB + ASV to combat increase in delay
- Lookup table based control system indexed by time of stress
- Power overhead similar to that of synthesis, with large area savings