Technique for Controlling Power-Mode Transition Noise in Distributed Sleep Transistor Network

> Yongho Lee and Taewhan Kim, Seoul National University



- Introduction
- Related Work
- Motivation example
- The proposed algorithm
- Experimental results
- Conclusion

# **Power Gating on Circuits**

### Basic idea

- Reduce leakage power by inserting power gating cell(s) into the power or ground nets



### **Design Issues in Power Gated Logic Circuit**

#### Active mode

- IR drop between source and drain nodes of the sleep transistor
- Sleep transistor overhead

#### Sleep mode

- State retention FFs

#### Mode transition

- Wakeup delay
- Huge discharging current
  - Accumulated charges in '0' state nodes and virtual ground rail
  - Short circuit current

# **Related work**

- Sleep transistor design
  - Module based [5]: centralized sleep transistor design
    - Large interconnect resistance of virtual ground
  - Cluster based [6]
    - Design overhead
  - Distributed sleep transistor network: DSTN [7]
    - Current balancing effect, PVT tolerance
- Sleep transistor sizing
  - Based on MSSC & PL [8]
  - Average current method [11]
  - Path based switching current method [12]
- Mode transition noise
  - Wakeup order scheduling of power gated blocks at system level [13]
  - Incremental turn-on scheme: gradually or sequentially [3]
  - Logic cell clustering method [14]

#### **DSTN: distributed sleep transistor network**





### **Design flow of power gated circuits**



# **Motivation example**

### **Characteristics of sleep transistors**



### **Sleep transistor sizing**

$$\left(\frac{W}{L}\right)_{STr, \ total} = \frac{I_{STr, \ total}(t)}{\delta\mu_n C_{ox}(V_{DD} - V_{tl})(V_{DD} - V_{th})}$$
$$I_{STr, \ total}(t) = \sum_{i \in gates} I_{STr, \ i}(t)$$

$$I_{STr, total}(t) \leq I_{MSSC}$$

\* Source: M. Anis, et al., DAC 2002


#### **Relation between sleep transistor size and switching current**



| Case  | Switching current, original circuit [mA] | Sleep transistor size, power gated circuit [W/L] | Delay increase, power gated circuit [%] |
|-------|------------------------------------------|--------------------------------------------------|-----------------------------------------|
| CASE1 | 1.98                                     | 75                                               | 2.3                                     |
| CASE2 | 0.78                                     | 29                                               | 4.5                                     |

$$\left(\frac{W}{L}\right)_{STr, \ total} = \frac{I_{STr, \ total}(t)}{\delta\mu_n C_{ox}(V_{DD} - V_{tl})(V_{DD} - V_{th})}$$
$$I_{STr, \ total}(t) = \sum_{i \in gates} I_{STr, \ i}(t)$$
$$I_{STr, \ total}(t) \le I_{MSSC}$$
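The sizing equation is linear in the switching current, which is why CASE1 and CASE2 end up with different sleep transistor sizes. The sketch below evaluates it for the two switching currents in the table. The process parameters are hypothetical values chosen only for illustration, and $\delta$ is assumed here to be the allowed fractional performance loss (0.05 mirrors the PL of 5% used in the experiments), so the absolute sizes it produces are not expected to match the table.

```python
def sleep_transistor_size(i_total, delta, mu_n_cox, vdd, vt_low, vt_high):
    """Total sleep transistor W/L from the sizing equation (SI units)."""
    return i_total / (delta * mu_n_cox * (vdd - vt_low) * (vdd - vt_high))


# Hypothetical process parameters, chosen only for illustration.
PARAMS = dict(delta=0.05, mu_n_cox=300e-6, vdd=1.2, vt_low=0.3, vt_high=0.5)

size_case1 = sleep_transistor_size(1.98e-3, **PARAMS)  # CASE1 switching current
size_case2 = sleep_transistor_size(0.78e-3, **PARAMS)  # CASE2 switching current

# The size ratio depends only on the current ratio, not on the parameters.
print(size_case1 / size_case2, 1.98 / 0.78)
```

Whatever parameters are assumed, the size ratio equals the current ratio 1.98 / 0.78 (about 2.54), which is close to the ratio of the reported sizes, 75 / 29 (about 2.59).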

#### **Relation between power-mode transition noise and sleep transistor size**



| Combination of sleep transistors turned on | Peak current, CASE1 [mA] | Peak current, CASE2 [mA] |
|--------------------------------------------|--------------------------|--------------------------|
| STR1 + STR2 + STR3                         | 4.73                     | 2.30                     |
| STR1 + STR2                                | 3.83                     | 1.64                     |
| STR1                                       | 1.96                     | 0.79                     |
| STR2                                       | 2.19                     | 0.89                     |
| STR3                                       | 1.79                     | 0.72                     |

#### **Power-up controlling of sleep transistors**



| Sleep transistor location | Delay [ns] | Ratio |
|---------------------------|------------|-------|
| Loc1                      | 4.17       | 1     |
| Loc2                      | 3.96       | 0.95  |
| Loc3                      | 3.84       | 0.92  |

| S0  | S1   | S2  |      |     |      |     | S7  | S8  | S9         |
|-----|------|-----|------|-----|------|-----|-----|-----|------------|
| S10 | S11  | S12 |      |     |      |     |     | S17 |            |
|     |      |     |      |     |      |     |     |     |            |
|     |      |     |      |     | S35  |     |     |     |            |
|     |      |     | S43  | S44 | S45  | S46 | S47 | S48 |            |
|     |      |     | S53  | S54 | S55  | S56 | S57 | S58 |            |
|     |      |     | S63  | S64 | S65  | S66 | S67 | S68 |            |
|     |      |     | S73  | S74 | S75  | S76 | S77 | S78 |            |
|     |      |     |      |     |      |     |     |     |            |
|     |      |     |      |     |      |     | S97 | S98 | S99        |
|     | Loc1 |     | Loc2 |     | Loc3 |     |     |     |            |

# **Unate Covering Problem (UCP)**

- Method for two-level logic optimization
  - Given a Boolean function f, find a minimum SOP formula
- Let $M_{m \times n}$ be a Boolean matrix. The UCP is to find a minimum set of columns that covers $M$, in the sense that every row with a 1-entry has at least one of its 1-entries in a selected column.

# **UCP example**

f(w, x, y, z) = x'y' + wxy + x'yz' + wy'z

|          | wxy | wxz | wyz' | wy'z | x'y' | x'z' |
|----------|-----|-----|------|------|------|------|
| wx'y'z'  |     |     |      |      | 1    | 1    |
| w'x'y'z  |     |     |      |      | 1    |      |
| w'x'y'z' |     |     |      |      | 1    | 1    |
| wxyz     | 1   | 1   |      |      |      |      |
| wxyz'    | 1   |     | 1    |      |      |      |
| wx'yz'   |     |     | 1    |      |      | 1    |
| w'x'yz'  |     |     |      |      |      | 1    |
| wxy'z    |     | 1   |      | 1    |      |      |
| wx'y'z   |     |     |      | 1    | 1    |      |

Solutions to UCP: {x'y', x'z', wxy, wxz}
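The covering matrix of this example is small enough to solve by exhaustive search. The sketch below encodes each row (minterm) as the set of columns (prime implicants) that cover it and tries column subsets in order of increasing size. It is a brute-force illustration of the covering condition, not the solver used in the proposed algorithm; the names `ROWS`, `COLUMNS`, and `minimum_covers` are introduced here for illustration.

```python
from itertools import combinations

# Covering matrix from the example: each row (minterm) maps to the
# columns (prime implicants) that have a 1-entry in that row.
ROWS = {
    "wx'y'z'":  {"x'y'", "x'z'"},
    "w'x'y'z":  {"x'y'"},
    "w'x'y'z'": {"x'y'", "x'z'"},
    "wxyz":     {"wxy", "wxz"},
    "wxyz'":    {"wxy", "wyz'"},
    "wx'yz'":   {"wyz'", "x'z'"},
    "w'x'yz'":  {"x'z'"},
    "wxy'z":    {"wxz", "wy'z"},
    "wx'y'z":   {"wy'z", "x'y'"},
}
COLUMNS = ["wxy", "wxz", "wyz'", "wy'z", "x'y'", "x'z'"]


def minimum_covers(rows, columns):
    """Return every smallest column subset that hits each row at least once."""
    for size in range(1, len(columns) + 1):
        found = [
            set(subset)
            for subset in combinations(columns, size)
            if all(covering & set(subset) for covering in rows.values())
        ]
        if found:
            return found
    return []


covers = minimum_covers(ROWS, COLUMNS)
for cover in covers:
    print(sorted(cover))
```

For this matrix the search finds three covers of size four. Columns x'y' and x'z' appear in all of them because rows w'x'y'z and w'x'yz' each have a single 1-entry, and {x'y', x'z', wxy, wxz} is one of the three.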

# **UCP formulation**



- UCP solution $d_i$: a disjoint subset of sleep transistors

$$D = \{ d_1, d_2, \ldots \}$$

$$T = \{ t_1, t_2, \ldots \}$$

- Schedule $T$: $t_1 \le t_2 \le \ldots$, with $I_T \le I_{max}$

# **Experimental setup**

- Implemented the proposed algorithm in C++
- Tested on a set of ISCAS benchmark circuits
- Decomposed with INV, NAND2, NOR2, XOR2, XNOR2
- Simulated with 130nm standard cell library
- Controlled input vectors using SAT formulation

# **Experimental results: sleep transistor sizing**

- PL: 5%

| Circuit | #PI | #gate | LONG [7] [W/L] | STSizing [W/L] | Ratio [%] |
|---------|-----|-------|----------------|----------------|-----------|
| c432    | 36  | 176   | 353.58         | 120.17         | 33.99     |
| c880    | 60  | 332   | 536.58         | 278.14         | 51.84     |
| c1355   | 41  | 201   | 347.42         | 164.11         | 47.24     |
| c1908   | 33  | 244   | 373.92         | 112.84         | 30.18     |
| c2670   | 233 | 455   | 600.25         | 398.78         | 66.44     |
| c3540   | 50  | 996   | 883.83         | 378.11         | 42.78     |
| c5315   | 178 | 1,295 | 1,878.33       | 1,060.00       | 56.43     |
| c7552   | 207 | 1,219 | 1,593.83       | 1,048.65       | 65.79     |
| Avg.    |     |       |                |                | 49.34     |
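The Ratio column is consistent with STSizing divided by LONG [7], expressed in percent, and the Avg. row with the arithmetic mean of those ratios. The short check below recomputes both from the table values; the variable names are illustrative only.

```python
# (circuit, LONG [7] size, STSizing size, reported ratio in %) from the table.
ROWS = [
    ("c432", 353.58, 120.17, 33.99),
    ("c880", 536.58, 278.14, 51.84),
    ("c1355", 347.42, 164.11, 47.24),
    ("c1908", 373.92, 112.84, 30.18),
    ("c2670", 600.25, 398.78, 66.44),
    ("c3540", 883.83, 378.11, 42.78),
    ("c5315", 1878.33, 1060.00, 56.43),
    ("c7552", 1593.83, 1048.65, 65.79),
]

# Ratio = STSizing / LONG, in percent.
ratios = [100.0 * stsizing / long_size for _, long_size, stsizing, _ in ROWS]
for (name, _, _, reported), ratio in zip(ROWS, ratios):
    print(f"{name}: computed {ratio:.2f} %, reported {reported:.2f} %")

average = sum(ratios) / len(ratios)
print(f"average: {average:.2f} %")
```

Read this way, STSizing needs on average about half of the total sleep transistor size required by LONG [7] on these circuits.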

# **Experimental results: power-mode transition noise controlling**

**LONG [7]**

| Circuit | $I_{max}$ [mA] (W/L) | CONV $I$ [mA] | CONV $T_{wakeup}$ [ns] | SEQ $I$ [mA] | SEQ $T_{wakeup}$ [ns] | STC $I$ [mA] | STC $T_{wakeup}$ [ns] |
|---------|----------------------|---------------|------------------------|--------------|-----------------------|--------------|-----------------------|
| c432    | 10 (116)             | 28.39         | 1.54                   | 9.75         | 2.24                  | 9.71         | 1.72                  |
| c880    | 14 (165)             | 43.77         | 1.78                   | 13.08        | 2.89                  | 13.96        | 2.07                  |
| c1355   | 9 (105)              | 28.43         | 1.10                   | 8.99         | 2.60                  | 8.94         | 1.45                  |
| c1908   | 10 (116)             | 30.74         | 1.07                   | 8.90         | 2.56                  | 9.91         | 1.56                  |
| c2670   | 15 (175)             | 49.71         | 1.98                   | 14.12        | 3.41                  | 14.43        | 2.37                  |
| c3540   | 20 (232)             | 74.74         | 2.72                   | 19.93        | 5.79                  | 19.95        | 3.61                  |
| c5315   | 40 (457)             | 154.41        | 1.90                   | 39.95        | 3.94                  | 39.91        | 2.46                  |
| c7552   | 34 (417)             | 132.08        | 2.71                   | 33.04        | 5.10                  | 33.74        | 3.39                  |

**STSizing (Section V)**

| Circuit | $I_{max}$ [mA] (W/L) | CONV $I$ [mA] | CONV $T_{wakeup}$ [ns] | SEQ $I$ [mA] | SEQ $T_{wakeup}$ [ns] | STC $I$ [mA] | STC $T_{wakeup}$ [ns] |
|---------|----------------------|---------------|------------------------|--------------|-----------------------|--------------|-----------------------|
| c432    | 10 (116)             | 9.04          | 2.03                   | 9.04         | 2.03                  | 9.04         | 2.03                  |
| c880    | 14 (165)             | 23.53         | 1.99                   | 13.28        | 3.86                  | 13.98        | 2.15                  |
| c1355   | 9 (105)              | 13.96         | 1.29                   | 8.52         | 3.04                  | 8.75         | 1.49                  |
| c1908   | 10 (116)             | 9.91          | 1.74                   | 9.91         | 1.74                  | 9.91         | 1.74                  |
| c2670   | 15 (175)             | 33.62         | 2.05                   | 14.97        | 4.03                  | 14.92        | 2.41                  |
| c3540   | 20 (232)             | 33.90         | 3.22                   | 19.93        | 7.17                  | 19.97        | 3.62                  |
| c5315   | 40 (457)             | 89.42         | 2.13                   | 37.22        | 5.24                  | 39.51        | 2.58                  |
| c7552   | 34 (417)             | 88.59         | 2.89                   | 31.40        | 5.43                  | 33.63        | 3.46                  |

# Conclusion

- Mode transition noise should be limited for a reliable system
- Peak value of discharging current depends on sleep transistor size
- Sleep transistor size can be reduced by using a worst-delay-path-aware approach
- Reduced sleep transistor size reduces the peak value of discharging current
- To meet the mode transition noise constraint, a sleep transistor clustering method based on a UCP formulation is proposed

# Thank you

# **Input Vector Formulation**

The quantity to be minimized

$$Q = \sum_{n_i \in gates} \#fanout(n_i) \cdot V_{DD} \cdot \gamma(n_i)$$

SAT formulation with pseudo-Boolean expression

$$c_1 l_1 + c_2 l_2 + \ldots \le T$$

$l_i$ is a literal of a Boolean decision variable of the SAT solver

#### **CNF** expression

```
(a+d)•(b+d)•(a'+b'+d')•
(a'+e')•(c'+e')•(a+c+e)•
(c+f)•(c'+f')•
(d'+g')•(e'+g')•(d+e+g)•
(e+h)•(f+h)•(e'+f'+h') = 1
```





#### Cost function

$\min \{ d + 2e + f + g + h \}$

#### Solution:

Input vector: a = 0, b = 1, c = 1; # of 1s: 2
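With only three primary inputs, the example can be checked by exhaustive enumeration instead of a SAT solver. The sketch below assumes the gate types implied by the clause shapes of the CNF expression: d = NAND(a, b), e = NOR(a, c), f = NOT(c), g = NOR(d, e), h = NAND(e, f). These types are an inference, since the netlist figure is not reproduced here; the type of g in particular is chosen so that the listed input vector is optimal.

```python
from itertools import product


def gate_values(a, b, c):
    """Evaluate the example netlist; gate types are inferred, not given."""
    d = 1 - (a & b)   # d = NAND(a, b)
    e = 1 - (a | c)   # e = NOR(a, c)
    f = 1 - c         # f = NOT(c)
    g = 1 - (d | e)   # g = NOR(d, e)
    h = 1 - (e & f)   # h = NAND(e, f)
    return d, e, f, g, h


def cost(a, b, c):
    """Cost function from the slide: d + 2e + f + g + h."""
    d, e, f, g, h = gate_values(a, b, c)
    return d + 2 * e + f + g + h


# Enumerate all 2^3 input vectors and keep those with the smallest cost.
costs = {vec: cost(*vec) for vec in product((0, 1), repeat=3)}
best = min(costs.values())
optimal = [vec for vec, value in costs.items() if value == best]
print(best, optimal)
```

Under these assumptions the minimum cost is 2, and (a, b, c) = (0, 1, 1) is one of four input vectors that reach it, which agrees with the solution listed above.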