#### MTTF-aware Design Methodology of Error Prediction Based Adaptively Voltage-scaled Circuits

Yutaka Masuda, Masanori Hashimoto

Osaka University {masuda.yutaka,hasimoto}@ist.osaka-u.ac.jp

### Background : Performance Variation

- Circuit speed is more sensitive to PVTA\* variation (\*process, voltage, temperature, and aging)
- Conventional countermeasure : Worst case design (WC)
  - Adds timing margin assuming worst PVTA conditions



# Adaptive Voltage Scaling (AVS)

- Adaptively adjust V<sub>dd</sub> by estimating timing slack
  - Exploit PVTA margin while preventing error occurrence



# Error Prediction based AVS (EP-AVS)

• Estimate slack  $\Rightarrow$  Predict error  $\Rightarrow$  Adjust  $V_{dd}$ 

E.g. TEP-FF (timing error predictive FF)
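The estimate, predict, adjust loop can be sketched as a simple step controller. This is only an illustration: the step-up/step-down policy, the function name `next_vdd`, and the warning trace are assumptions, not the control logic used in this work. The voltage range and step follow the evaluation setup (1.2 V to 0.8 V, 50 mV interval).

```python
# Voltage range and step taken from the evaluation setup in these slides.
V_MIN, V_MAX, V_STEP = 0.80, 1.20, 0.05


def next_vdd(vdd: float, error_predicted: bool) -> float:
    """Hypothetical policy: raise Vdd on a predicted error, otherwise lower it."""
    step = V_STEP if error_predicted else -V_STEP
    return round(min(V_MAX, max(V_MIN, vdd + step)), 2)


# Hypothetical warning trace from a TEP-FF (True = late transition detected).
warnings = [False, False, False, True, False, True, False]

vdd = V_MAX
trace = []
for warning in warnings:
    vdd = next_vdd(vdd, warning)
    trace.append(vdd)

print(trace)
```

With this toy trace, V<sub>dd</sub> descends until the first warning and then oscillates around the lowest level that still raises warnings.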

Voltage-scaled circuit




#### **Our work focuses on EP-AVS design w/ TEP-FF.**

### Conventional Works for EP-AVS

1. Design : Only for voltage control logic
   - Voltage-scaled circuit has many critical paths
     - ✓ Utilizing high-Vth/smaller cells for power/area savings.
     - → We may need to observe many paths w/ EP-AVS
2. Evaluation : Not lifetime-aware
   - Timing error may happen during long operation, e.g., years.



#### Objective : MTTF\*-aware Design of EP-AVS (\*mean time to failure)

1. Design of both voltage-scaled circuit and TEP-FF.
   - Extend MTTF and facilitate TEP-FF insertion.
2. MTTF-aware performance evaluation
   - Consider long MTTF, e.g., years.
   - Use stochastic error rate estimation method [1].

[1] S. Iizuka, Y. Masuda, M. Hashimoto, and T. Onoye,

# Outline

- Background and objective
- Proposed design methodology of EP-AVS
  - Design of voltage-scaled circuit under AVS
  - TEP-FF insertion to voltage-scaled circuit
- Experimental evaluation
- Conclusion

# Design of Voltage-Scaled Circuit

This work applies ASA\* [2] to voltage-scaled circuit. (\*adaptive slack assignment)

[Figure: # of paths vs. slack]

ASA increases setup slack of highly-activated critical paths.

- Extend MTTF
- Facilitate TEP-FF insertion

Two-step implementation of ASA

- Step 1 : Increase setup constraint of FF.
- Step 2 : Perform re-synthesis as ECO and restore setup constraint.



[2] Y. Masuda, M. Hashimoto, and T. Onoye, "Critical path isolation for time-to-failure extension and lower voltage operation," in *Proc. ICCAD*, 2016.

### **TEP-FF** Insertion

#### This work focuses on failure probability.

• Failure prob. = timing violation prob. × activation prob.
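A minimal worked example of this product, using made-up numbers: the path names and probabilities below are hypothetical and only show that a path with a small slack (high violation probability) can still contribute little if it is rarely activated.

```python
def failure_prob(timing_violation_prob: float, activation_prob: float) -> float:
    """Failure prob. = timing violation prob. x activation prob."""
    return timing_violation_prob * activation_prob


# Hypothetical paths: (timing violation prob., activation prob.)
paths = {
    "path_a": (1e-3, 0.5),   # moderate violation prob., frequently activated
    "path_b": (1e-1, 1e-6),  # high violation prob., almost never activated
}

# Rank paths by their contribution to failure, largest first.
ranked = sorted(paths, key=lambda name: failure_prob(*paths[name]), reverse=True)

for name in ranked:
    print(name, failure_prob(*paths[name]))
```

Here `path_a` ranks first even though `path_b` is a hundred times more likely to violate timing when it is exercised.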



- High act. prob. : Need to predict timing errors
- Small slack : Helps to reduce # of buffers in TEP-FF

# FF Selection for TEP-FF Insertion

Proposed : Maximize sum of gate-wise failure prob.

#### E.g. Perform ASA to two FFs

• Proposed selects FF3 and FF1 (not FF3 and FF2)



This work formulates FF selection problem as ILP\*. (\*integer linear programming)

# Outline

- Background and objective
- Proposed design methodology of EP-AVS
- Experimental evaluation
  - Evaluation setup
  - Performance improvement thanks to EP-AVS
  - Discussion : Effectiveness of ASA
- Conclusion

#### **Evaluation Setup**

#### **Experiment: Evaluate average V<sub>dd</sub> and MTTF w/ EP-AVS**

- Target circuit : synthesized w/ 45nm NanGate cell library
  - OpenRISC processor
    - 1.46M gates including 589K latches and 2.5K FFs
    - Workload : SHA1, CRC, Dijkstra
  - AES (Advanced Encryption Standard) circuits
    - 17K gates including 530 FFs
    - Workload : 1000 random test patterns
- $V_{dd}$ w/ EP-AVS : 1.2 V to 0.8 V w/ 50 mV interval
- Delay variation source
  - Supply noise, NBTI aging, and manufacturing variability
- Target MTTF : $10^{17}$ cycles (3.3 years in OpenRISC)
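The conversion from cycles to years can be checked with a short calculation. The sketch below assumes the OpenRISC clock period of 1040 ps quoted in the evaluation results; the function name `mttf_years` and the 365.25-day year are illustrative choices.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def mttf_years(cycles: float, clock_period_s: float) -> float:
    """Convert an MTTF expressed in clock cycles into years."""
    return cycles * clock_period_s / SECONDS_PER_YEAR


# 10^17 cycles at an assumed 1040 ps clock period (OpenRISC).
years = mttf_years(1e17, 1040e-12)
print(f"{years:.1f} years")
```

Under that assumption the result rounds to the 3.3 years stated above.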

# **Evaluation Results (OpenRISC)**

- 34.0% speed up @ supply voltage of 0.8 V
- 20.8% $V_{dd}$ reduction @ clock period of 1040 ps



### Evaluation Results (AES)

• 19.5% speed up, 7.5% V<sub>dd</sub> reduction
 ➤ Effectiveness of ASA is smaller than in OpenRISC.



### **OpenRISC** and **AES**

- AES is a highly activated circuit
  - In AES, intrinsic critical paths w/ high act. prob. exist.
- OpenRISC is more suitable for ASA



# Compatibility of EP-AVS and ASA

Even w/ ASA, EP-AVS exploits PVTA margins similarly.
 → EP-AVS and ASA are highly compatible.



# Comparison of # of Failure FFs

ASA reduces # of failure FFs and failure prob. of FFs.
 → ASA helps to facilitate TEP-FF insertion.



# Outline

- Background and objective
- Proposed design methodology of EP-AVS
- Experimental evaluation
- Conclusion

### Conclusion

- Proposed MTTF-aware design methodology of EP-AVS.
  - ASA to voltage-scaled circuit.
  - Gate-wise failure prob. aware TEP-FF insertion.
  - Consideration of practical long MTTF, e.g., 3 years.
- Performance evaluation results show that
  - 20.8% $V_{dd}$ reduction in OpenRISC.
  - 7.5% $V_{dd}$ reduction in AES.

# Effectiveness of TEP-FF Insertion

**Compare MTTF between proposed and slack-based method** 

⇒ Only proposed insertion methodology achieves target MTTF.



# **MTTF** Estimation

#### MTTF (Mean Time To Failure)

1. Calculate timing violation and activation probability for all paths in isolated circuit.


2. Calculate transition rate and estimate MTTF (Next Slide).



# MTTF Estimation [1] : Markov Chain

From transition rate, we can know Time To Failure (TTF).
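As a generic illustration of how transition probabilities lead to a time to failure, the sketch below computes the expected number of cycles before absorption in a small discrete-time Markov chain. It is a textbook absorbing-chain calculation, not the estimation method of [1]; the two-state model and all numbers are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mean_cycles_to_failure(q, start=0):
    """Expected cycles until failure, from (I - Q) t = 1.

    q holds per-cycle transition probabilities among non-failure states;
    the probability mass missing from each row goes to the failure state.
    """
    n = len(q)
    a = [[(1.0 if i == j else 0.0) - q[i][j] for j in range(n)] for i in range(n)]
    return solve(a, [1.0] * n)[start]


# Hypothetical two-state model: state 0 = high Vdd, state 1 = low Vdd.
q = [
    [0.9, 0.1],         # high Vdd: no failure in this toy model
    [0.2, 0.8 - 1e-6],  # low Vdd: fails with probability 1e-6 per cycle
]

ttf = mean_cycles_to_failure(q)
print(f"{ttf:.3e} cycles")
```

In this toy model the circuit spends about one third of its time at low V<sub>dd</sub>, so the mean time to failure is roughly three times the inverse of the per-cycle failure probability.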




# Markov Model w/ Aging State [1]

- 3-dimensional Markov chain
  - $V_{dd}$, $\Delta V_{dd}$, Aging




# Model of Aging Effect

1. Model aging effect from measured data [3].
   - Average the degradation data of 666 transistors and fit it to an equation representing the NBTI aging effect [4].
     - $\Delta V_{th}(t) = Xe^{V_g} + Ye^{V_g} \log(1 + Zt)$ (T/D model)
       - $V_g$ : Stress voltage, $X$, $Y$, $Z$ : Constants
2. Definition of degradation states
   - 0, 0.5, 1, 5, 10, 15, 20 mV (7 states).
3. Calculate transition rate between each pair of states.

$$p_{trans\_i} = \frac{1}{t_{stay\_i}}$$

- $p_{trans\_i}$ : Transition rate to the $(i+1)$-th state.
- $t_{stay\_i}$ : # of cycles spent in the $i$-th state.
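One way to obtain the staying time of each state is to invert the T/D model and take the difference between the times at which consecutive degradation levels are reached. The sketch below follows that reading; it is an assumption about how $t_{stay\_i}$ might be derived, and the constants `x`, `y`, `z`, the stress voltage, and the clock period are hypothetical placeholders rather than fitted values. The logarithm is assumed to be natural.

```python
import math

# Degradation states in mV, as listed on the slide.
LEVELS_MV = [0, 0.5, 1, 5, 10, 15, 20]


def time_to_reach(dvth_mv, vg, x, y, z):
    """Invert dVth(t) = x*e^vg + y*e^vg*log(1 + z*t) for t (natural log assumed)."""
    scale = math.exp(vg)
    return (math.exp((dvth_mv - x * scale) / (y * scale)) - 1.0) / z


def transition_rates(vg, x, y, z, clock_period_s):
    """Return p_trans_i = 1 / t_stay_i for each state except the last."""
    times = [time_to_reach(level, vg, x, y, z) for level in LEVELS_MV]
    rates = []
    for t_enter, t_leave in zip(times, times[1:]):
        t_stay_cycles = (t_leave - t_enter) / clock_period_s
        rates.append(1.0 / t_stay_cycles)
    return rates


# Hypothetical constants; x = 0 so that the 0 mV state is reached at t = 0.
rates = transition_rates(vg=1.0, x=0.0, y=1.0, z=1e-3, clock_period_s=1e-9)
for i, rate in enumerate(rates):
    print(f"state {i} -> {i + 1}: {rate:.3e} per cycle")
```

With these placeholder constants the transition rate falls from state to state, reflecting the logarithmic slowdown of the modeled degradation.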

[3] J. B. Velamala, K. B. Sutaria, H. Shimizu, H. Awano, T. Sato, G. Wirth, and Y. Cao, "Compact Modeling of Statistical BTI Under Trapping/Detrapping," IEEE Trans. ED, vol.60, no.11, pp.3645-3654, 2013.

[4] K. B. Sutaria, J. B. Velamala, C. H. Kim, T. Sato, and Y. Cao, "Aging Statistics Based on Trapping/Detrapping: Compact Modeling and Silicon Validation," IEEE Trans. Device and Materials Reliability, vol.14, no.2, pp.607-615, 2014.

# Problem Formulation of FF Selection

Maximizing sum of gate-wise failure prob.

- Objective
  - Maximize : $\sum_{k=1}^{N_{inst}} (P_{inst_k\_fail} \times B_{inst_k})$
- Variable
  - $B_{inst_k}$ $(1 \leq k \leq N_{inst})$, $B_{TEP_i}$ $(1 \leq i \leq N_{FF})$
- Constraint
  - $0 \leq B_{inst_k} \leq 1$
  - $0 \leq B_{TEP_i} \leq 1$
  - $\sum_{i=1}^{N_{FF}} B_{TEP_i} \leq N_{TEP}$
  - $B_{inst_k} = \bigvee_{i=1}^{N_{FF}} (B_{TEP_i} \times B_{FF_i\_inst_k}) \leq \sum_{i=1}^{N_{FF}} (B_{TEP_i} \times B_{FF_i\_inst_k})$

| Symbol             | Meaning                                                              |
|--------------------|----------------------------------------------------------------------|
| $B_{inst_k}$       | 1 when paths ending at the target FFs include the $k$-th instance.   |
| $B_{TEP_i}$        | 1 when the $i$-th FF is selected for TEP-FF insertion.               |
| $N_{TEP}$          | Maximum # of TEP-FF insertions.                                      |
| $B_{FF_i\_inst_k}$ | 1 when a path ending at the $i$-th FF includes the $k$-th instance.  |
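For a handful of FFs the same objective can be evaluated by exhaustive search, which makes the effect of the OR constraint visible. The sketch below is a brute-force stand-in for an ILP solver, not the solver flow used in this work; the function `select_ffs`, the gate names, and all probabilities are hypothetical, chosen to echo the earlier example in which FF3 and FF1 are preferred over FF3 and FF2.

```python
from itertools import combinations


def select_ffs(ff_cones, p_fail, n_tep):
    """Pick up to n_tep FFs maximizing the summed failure prob. of covered instances.

    ff_cones maps each FF to the set of instances on paths ending at it
    (B_FF_i_inst_k); p_fail maps each instance to its failure probability.
    """
    best_set, best_score = (), -1.0
    size = min(n_tep, len(ff_cones))
    for chosen in combinations(sorted(ff_cones), size):
        # B_inst_k is the OR over chosen FFs: each instance counts once.
        covered = set().union(*(ff_cones[ff] for ff in chosen))
        score = sum(p_fail[inst] for inst in covered)
        if score > best_score:
            best_set, best_score = chosen, score
    return best_set, best_score


# Hypothetical fan-in cones: FF2 and FF3 share gates g3 and g4.
ff_cones = {
    "FF1": {"g1", "g2"},
    "FF2": {"g3", "g4"},
    "FF3": {"g3", "g4", "g5"},
}
p_fail = {"g1": 0.02, "g2": 0.01, "g3": 0.05, "g4": 0.04, "g5": 0.03}

best_set, best_score = select_ffs(ff_cones, p_fail, n_tep=2)
print(best_set, round(best_score, 2))
```

Ranking FFs individually would pick FF3 and FF2, but their cones overlap, so adding FF1 covers more total failure probability. Exhaustive search scales combinatorially, which is why an ILP formulation is the practical choice for real circuits.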