### On Potential Design Impacts of Electromigration Awareness

#### Andrew B. Kahng, <u>Siddhartha Nath</u> and Tajana S. Rosing VLSI CAD LABORATORY, UC San Diego




### Outline

- Motivation
- Previous Work
- Our Work
- Preliminaries
- Study 1: MTTF vs. F<sub>max</sub>
- Study 2: MTTF vs. Area, Power
- Insights on Conventional EM Fixes
- Conclusions

#### **Electromigration in Interconnects**

Electromigration (EM) is the gradual displacement of metal atoms in an interconnect

- I<sub>avg</sub> causes DC EM and affects power delivery networks
- I<sub>rms</sub> causes AC EM and affects clock and logic signals

#### **EM Lifetime**

- EM degrades interconnect lifetime
- Black's Equation calculates lifetime of interconnect segment due to EM degradation

$$t_{50} = \frac{A^*}{J^n} \cdot e^{E_a / kT}$$

- $t_{50}$  median time to failure (=  $\ln 2 \times MTTF$ )
- A\* geometry-dependent constant
- J current density in interconnect segment
- n constant (= 2)
- E<sub>a</sub> activation energy of metal atoms
- k Boltzmann's constant
- T temperature of the interconnect
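As a minimal sketch of how Black's Equation behaves, the snippet below evaluates $t_{50}$ for relative comparisons only. The values of `a_star` and `e_a` are illustrative placeholders, not numbers from this work; 378 K is borrowed from the temperature upper bound used later in Study 1.

```python
import math

K_BOLTZMANN_EV = 8.617333262e-5  # Boltzmann's constant in eV/K


def t50(a_star, j, e_a, temp_k, n=2.0):
    """Median time to failure per Black's Equation: A* / J^n * exp(Ea / kT)."""
    return a_star / j ** n * math.exp(e_a / (K_BOLTZMANN_EV * temp_k))


def mttf_from_t50(t50_value):
    """Invert the relation t50 = ln(2) * MTTF stated above."""
    return t50_value / math.log(2)


# Placeholder inputs (assumptions): a_star = 1.0, e_a = 0.85 eV.
base = t50(a_star=1.0, j=1.0, e_a=0.85, temp_k=378.0)
doubled_j = t50(a_star=1.0, j=2.0, e_a=0.85, temp_k=378.0)
hotter = t50(a_star=1.0, j=1.0, e_a=0.85, temp_k=388.0)

# With n = 2, doubling the current density cuts lifetime to one quarter.
print("lifetime ratio at 2x current density:", doubled_j / base)
# A hotter interconnect has a shorter lifetime (ratio below 1).
print("lifetime ratio at +10 K:", hotter / base)
```

Only the ratios are meaningful here, because the absolute lifetime depends on the geometry-dependent constant A\*.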

#### **Parameters Affecting EM MTTF**







### Why Is EM Important Now?

ITRS 2011 data shows that EM will be a significant reliability issue



#### **Examples of EM Guardband**



- To meet EM MTTF margin at given wire width upper bound
  - Reduce J<sub>rms</sub> → reduce driver size → slower circuit
- To meet EM MTTF margin at given performance requirement
  - Increase $W_{wire}$ → increase capacitance, dynamic power


## **To Meet EM Lifetime Requirements**

Three major categories of prior work

- EM MTTF modeling
  - Black69 (Black's Equation)
  - Liew89 (AC lifetime models)
  - Lu07 and Wu12 (Joule heating)
- Architecture changes to mitigate EM
  - Srinivasan04 (RAMP)
  - Romanescu08 (core cannibalization)
- Synthesis and physical design (PD) techniques to reduce current density violations
  - Dasgupta96 (limit J<sub>rms</sub> violation at synthesis)
  - Jerke04 (limit J<sub>rms</sub> violation at PD)
  - Lienig03 (post-route J<sub>rms</sub> fixes)


#### **Key Idea**

We quantify impact of EM guardband on performance (F<sub>max</sub>), area and power



# Approach

#### We conduct two studies

1. MTTF vs. $F_{max}$ tradeoffs with fixed resource budget
2. MTTF vs. resources tradeoffs with fixed performance requirement

#### Assumptions

- 10 years = example default EM MTTF
- Six testcases
  - Report three representative (AES, DMA, JPEG)

## **Key Contributions**

- We are the *first* to quantify impacts of EM guardband on performance and resources by using PD flows
- We introduce EM slack as an accurate measure of potential performance improvements in different circuits at reduced MTTF requirements
  - Black's Equation cannot accurately quantify the impacts of EM-awareness in circuits
- We study how the tightness vs. looseness of timing constraints determines area and power trends at reduced MTTF
- Our study flow/methodology can potentially be used by
  - architects and front-end designers to improve performance at no area cost
  - physical designers whose levers are conventional SI and EM fixing methods


#### **EM Slack**

#### When EM violations occur

$$I_{rms,net} = C_{load} \cdot V_{dd} \cdot \sqrt{\alpha \cdot F_{max} \cdot \left(\frac{1}{t_{rise}} + \frac{1}{t_{fall}}\right)} > I_{rms,limit}$$

#### Black's Equation

$$MTTF = \frac{A^*(WH)^2}{I_{rms}^2} \cdot e^{E_a / kT}$$

#### Theoretical limit of I<sub>rms,net</sub>

$$I_{rms,net} \leq I_{rms,limit} \cdot \sqrt{\frac{MTTF_{default}}{MTTF_{reduced}}}$$
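A short sketch of where this bound comes from, assuming the wire geometry (W, H), the temperature T and A\* stay unchanged when the MTTF requirement is relaxed (an assumption made here for illustration). Under that assumption Black's Equation gives $MTTF \propto 1 / I_{rms}^2$, so the current limit at the reduced requirement, written here as $I_{rms,limit}'$ (a symbol introduced only for this sketch), satisfies

$$\frac{MTTF_{default}}{MTTF_{reduced}} = \left(\frac{I_{rms,limit}'}{I_{rms,limit}}\right)^2 \quad\Rightarrow\quad I_{rms,limit}' = I_{rms,limit} \cdot \sqrt{\frac{MTTF_{default}}{MTTF_{reduced}}}$$

For example, relaxing the requirement from 10 years to 5 years would raise the allowed RMS current by a factor of $\sqrt{2} \approx 1.41$.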

#### Basic Concept: EM slack of a net (units: mA)

 $EM_{slack,net} = I_{rms,net} - I_{rms,limit} \le I_{rms,limit}$ 
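The following sketch evaluates the I<sub>rms,net</sub> expression and the EM slack of a single net. The function names and all numeric inputs (load capacitance, supply voltage, activity factor, transition times, current limit) are hypothetical values chosen for illustration and do not come from the testcases.

```python
import math


def irms_net_amps(c_load, vdd, alpha, f_max, t_rise, t_fall):
    """RMS current of a net in amperes (inputs in F, V, Hz and s)."""
    return c_load * vdd * math.sqrt(alpha * f_max * (1.0 / t_rise + 1.0 / t_fall))


def em_slack_ma(i_rms_net_ma, i_rms_limit_ma):
    """EM slack in mA: positive when the net exceeds its RMS current limit."""
    return i_rms_net_ma - i_rms_limit_ma


# Hypothetical net: 20 fF load, 1.0 V supply, activity 0.5, 1 GHz, 50 ps edges.
i_net_ma = 1e3 * irms_net_amps(
    c_load=20e-15, vdd=1.0, alpha=0.5, f_max=1e9, t_rise=50e-12, t_fall=50e-12
)
i_limit_ma = 0.08  # hypothetical per-net limit in mA

slack = em_slack_ma(i_net_ma, i_limit_ma)
print(f"I_rms,net = {i_net_ma:.4f} mA, EM slack = {slack:.4f} mA")
print("EM violation" if slack > 0 else "no EM violation")
```

Because I<sub>rms,net</sub> grows with the square root of F<sub>max</sub>, quadrupling the frequency doubles the RMS current of the net in this model.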



#### **Significance of EM Slack**

- Positive EM slack  $\Rightarrow$  potential for improved  $F_{max}$
- If EM slack > 0, a part of it can be used to
  - increase I<sub>rms,limit</sub> by reducing MTTF (from Black's Equation), and
  - improve F<sub>max</sub> by using <u>SP&R knobs</u> (e.g., gate sizing) without causing EM violations

$$I_{rms,net} = C_{load} \cdot V_{dd} \cdot \sqrt{\alpha \cdot F_{max} \cdot \left(\frac{1}{t_{rise}} + \frac{1}{t_{fall}}\right)} > I_{rms,limit}$$


# Study 1: MTTF vs. F<sub>max</sub>

- Study MTTF vs. F<sub>max</sub> tradeoffs given upper bounds on area, temperature and #EM violations
- Setup
  - Three testcases: AES, DMA and JPEG
  - Two technology libraries: TSMC 45GS and 65GPLUS
  - Upper bounds
    - temperature = 378 K
    - area = 66% utilization
    - #EM violations = 25
  - Synopsys DesignCompiler and Cadence SOC Encounter flows
  - Thermal analysis using Hotspot

#### **Automated Flow to Determine F<sub>max</sub>**






# **Derating LEF**

We derate current density limits in technology Library Exchange Format (LEF) file
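One plausible way to compute such derated limits is sketched below. It assumes the limits scale as the n-th root of the MTTF ratio, following Black's Equation with n = 2; the exact derating rule used in the flow is not stated on this slide. The layer names and limit values are hypothetical, and the sketch does not read or write an actual LEF file.

```python
def derate_factor(mttf_default_years, mttf_reduced_years, n=2.0):
    """Scale factor for current limits when the MTTF requirement is relaxed."""
    return (mttf_default_years / mttf_reduced_years) ** (1.0 / n)


def derate_limits(limits_by_layer, mttf_default_years, mttf_reduced_years):
    """Return a new dict of per-layer current density limits, scaled by the factor."""
    factor = derate_factor(mttf_default_years, mttf_reduced_years)
    return {layer: limit * factor for layer, limit in limits_by_layer.items()}


# Hypothetical per-layer RMS current density limits (arbitrary units).
default_limits = {"M1": 1.0, "M2": 1.2, "M3": 1.5}

for years in range(10, 0, -1):
    scaled = derate_limits(default_limits, 10, years)
    print(years, round(derate_factor(10, years), 3), scaled)
```

Under this assumption a 10-year default requirement reduced to 1 year relaxes every limit by about 3.16x, which is the largest factor in the sweep.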






# **Binary Search for F<sub>max</sub>**

- Increase frequency by ∆step until some constraint is violated
- Perform binary search between the current F and the last feasible F to find F<sub>max</sub>
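The two steps above can be sketched as follows. The `is_feasible` callback is a stand-in for one full run of the flow (SP&R, thermal analysis and EM checks) at a given frequency; it, the `tolerance` and the `max_steps` safety bound are assumptions of this sketch, not details from the slides.

```python
def find_fmax(is_feasible, f_start, delta_step, tolerance, max_steps=1000):
    """Step frequency up until a constraint fails, then binary-search the gap."""
    if not is_feasible(f_start):
        raise ValueError("starting frequency already violates a constraint")

    # Phase 1: increase frequency by delta_step until some constraint is violated.
    last_feasible = f_start
    current = f_start + delta_step
    steps = 0
    while is_feasible(current):
        last_feasible = current
        current += delta_step
        steps += 1
        if steps >= max_steps:
            raise RuntimeError("no violation found within max_steps")

    # Phase 2: binary search between the last feasible F and the violating F.
    low, high = last_feasible, current
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if is_feasible(mid):
            low = mid
        else:
            high = mid
    return low  # highest frequency known to be feasible


# Toy feasibility model (hypothetical): constraints hold up to 1234 MHz.
def toy_check(f_mhz):
    return f_mhz <= 1234.0


fmax = find_fmax(toy_check, f_start=500.0, delta_step=100.0, tolerance=1.0)
print(f"F_max is approximately {fmax} MHz")
```

Returning `low` rather than the midpoint keeps the result on the feasible side, which matters when each feasibility check is an expensive tool run and the answer is used as a design target.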




#### **Flow to Fix EM Violations**



#### **Automated Flow to Determine F<sub>max</sub>**





- F<sub>max</sub> scaling at reduced MTTF is not uniform across designs, unlike the uniform scaling suggested by Black's Equation
- F<sub>max</sub> scaling is determined by the EM slack in each design at each MTTF requirement
  - Large F<sub>max</sub> improvements may be setup artifacts



EM slack (not timing slack) limits performance scaling due to AC EM

- EM slack determines F<sub>max</sub> at fixed resources
- % of positive EM slack is usable to improve F<sub>max</sub> by reducing MTTF requirement
  - EM violations in critical paths lead to positive EM slack



Area and temperature can be dominating constraints at lower MTTF requirements

- Area limits F<sub>max</sub> scaling for MTTF ≤ 7 years (DMA)
- Area upper bounds are violated for MTTF ≤ 6 years; temperature upper bounds are violated for MTTF ≤ 3 years


# Study 2: MTTF vs. Area, Power

Study MTTF vs. area and power tradeoffs at a fixed performance requirement

#### Setup

- DMA at 2000 MHz (2ps slack after SP&R at 45nm)
- AES at 1100 MHz (1.6ps slack after SP&R at 45nm)
- JPEG at 850 MHz (93ps slack after SP&R at 45nm)
- Two technology libraries: TSMC 45GS and 65GPLUS



- Large positive timing slack at MTTF = 10 years can lead to smaller area when the MTTF requirement is reduced
- Large positive timing slack at MTTF = 10 years can lead to lower power when the MTTF requirement is reduced



Area and power can decrease as MTTF requirement is reduced for designs with loose timing constraints

- Small positive timing slack at MTTF = 10 years can lead to an increase in area as the MTTF requirement is reduced
- Small positive timing slack at MTTF = 10 years can lead to an increase in power as the MTTF requirement is reduced


#### **Conventional EM Fixes and MTTF**

- Study how conventional SI and EM fixing methods affect area and performance at reduced MTTF requirements.
- Setup
  - Sweep MTTF from 10 years down to 1 year
  - Apply per-net NDRs, driver downsizing and fanout reduction fixes
  - Study using AES, JPEG and DMA testcases
  - Two technology libraries: TSMC 45GS and 65GPLUS
  - Insights are very instance-, technology/library- and flow-specific



- Fixing EM violations using NDRs can be effective in improving F<sub>max</sub> only until MTTF = 7 years
  - % increase in F<sub>max</sub> is less than 5%
  - % increase in area is ~2%



# NDRs can be more effective knobs to increase F<sub>max</sub> with less increase in area

- Fanout reduction to fix EM can increase F<sub>max</sub> by 3% at the cost of a 1.86% increase in area
- Driver downsizing to fix EM can increase F<sub>max</sub> by 2.5% at the cost of a 2% increase in area


#### Conclusions

We study and quantify potential impacts of improved EM-awareness in designs through two basic studies

#### Our key observations

- Study 1: Available performance scaling (up to 80%) from MTTF reduction is dependent on *EM slack*
- Study 2: Area and power can decrease when MTTF is reduced in designs with loose timing constraints
- Additional studies: NDRs can be more effective in increasing performance (~5%) at the cost of a 2% increase in area for MTTF up to 7 years

#### Ongoing work

- EM reliability requirements in multiple operating modes
- Combined impacts of EM and other back end of the line reliability mechanisms on interconnect lifetime

#### Acknowledgments

Work supported by IMPACT, SRC, NSF, Qualcomm Inc. and NXP Semiconductors

**Thank You!**



#### Hotspot Setup

- We use Hotspot 5.0 calibrated with a thermal package from Qualcomm Inc.
- We perform two kinds of modeling
  - Without heat spreader and heat sink when profiling a single block of AES, JPEG or DMA (area in µm<sup>2</sup>)
  - With heat spreader and heat sink when profiling 50x50 blocks of AES, JPEG or DMA in an area of ~5mm<sup>2</sup>
- We get the same temperature values for a single block from both methods