# Thermal-Aware 3D IC Placement Via Transformation 

Jason Cong, Guojie Luo, Jie Wei and Yan Zhang
Computer Science Department
University of California, Los Angeles

January 24, 2007

Supported by GSRC, NSF


## Outline

- Technology Background
- Problem Formulation \& Existing Works
-Our 3D Placement Framework
- Transformation for 3D Placement
- RCN Graph-based Refinement
- Experiment Results
-Conclusions and Future Work


## Technology Background

## - MITLL .18um 3D SOI Technology

3DOGC Opens to This Metal Level


## Problem Formulation

$\left(x_{i}, y_{i}, z_{i}\right)$ is assigned to every cell $i$ such that
$\bullet$ Wire length $\boldsymbol{\Sigma} \mathrm{I}(\mathrm{e})$ and \#via $\boldsymbol{\Sigma} \mathrm{v}(\mathrm{e})$ are minimized

- $l(e)=\max _{v_{i}, v_{j} \in e}\left|x_{i}-x_{j}\right|+\max _{v_{i}, v_{j} \in e}\left|y_{i}-y_{j}\right|$
- $v(e)=\max _{v_{i}, v_{j} \in e}\left|Z_{i}-Z_{j}\right|$
- I(e) + av(e) is the half perimeter wire length in 3D IC
$\star$ Non-overlap and temperature constraints are met


## Review of Existing 3D placements

$\rightarrow$ Min-cut

- [Das, Chandrakasan, Reif, ASP-DAC 2003]
$\bullet$ Min-cut + Simulated Annealing
- [Balakrishnan, Nanda, Easwar, Lim, ASP-DAC 2005]
- Force Directed
- [Goplen, Sapatnekar, ICCAD 2003]


## 3D Placement Framework: mPL-3D

## -Features

2D Wirelength- and/or Thermal- Driven Placement


- Existing well-performing 2D placers can be reused
- Simple but effective transformation heuristics
- Trade-off between wire length and \#via to adapt different manufacturing ability
- Refinement through RCN graph


## mPL-3D Framework - Thermal Model

Resistive network model

- [Wilkerson, et al. ITherm 2004]
-Resistive chain for fast analysis
- [Cong, Zhang, ICCAD 2005]
- Basic principal is to put high power cells close to heat sink


## Transformation for 3D Placement

-Local Stacking Transformation

- Transformation through Folding
-Window-based Stacking / Folding


## Local Stacking Transformation

$\checkmark$ 2D placement on area KA

- For 3D chip with K device layers and each with area A
$\rightarrow$ Shrink: $\left(x_{i}, y_{i}\right) \rightarrow\left(x_{i} / \sqrt{K}, y_{i} / \sqrt{K}\right)$
-Tetris-style 3D legalization
- Cost R = $\alpha d+\beta v+\gamma t$
- Minimize displacement, \#via and thermal cost


## Transformation through Folding

Layer assignment and location mapping according to the folded order

- Folding-2



## Comparison between Stacking and Folding

-Stacking

- Tend to optimize 2D wire length
- Result in great number of via
$\rightarrow$ Folding
- Tend to avoid over-optimized local nets
- Less optimization in wire length
- Trade-off is needed for different manufacturing ability


## Window-based Stacking / Folding

-Divde 2D placement into NxN windows

- Apply stacking or folding in a window

-Effect of stacking or folding would be spreaded out, and trade-offs are achieved with different N


## RCN Graph-based Refinement

-Construction of Relaxed Conflict-Net (RCN) graph

- Vertex Set consists of all the cells and nets
- Two types of edges
- Conflict edge. Cost is imposed if cells are "overlapped"

- Net edge. Cost relates to the layer assignment



## RCN Graph Example



## RCN Graph-based Refinement

- Layer reassignment
- Objective: minimize total cost of RCN graph
- Variables: layer assignment (z)
- Constants: ( $x, y$ ) location of cells
$\rightarrow$ Algorithm [Chang, Cong, IEEE Trans. CAD, 1999]
- Optimal solution can be achieved if the graph is a tree
- Induced sub-tree is constructed, and it will cover $40 \%-50 \%$ nodes
- Iteratively optimize over these sub-trees to achieve good solution on the whole graph


## Experiment Setup

- IBMv1 placement benchmark
- 2D placer mPL 5.0
- Initial 2D placement
- Final layer-by-layer legalization and detailed placement
- Transformations
- LST (10\%) - Local Stacking with 10\% overlap during refinement
- LST (20\%) - Local Stacking with 20\% overlap during refinement
- Folding-2 - Folding-2 Transformation
- Folding-4 - Folding-4 Transformation
- $8 \times 8$ LST( $10 \%$ ) $-8 \times 8$ windows and apply LST( $10 \%$ ) on each window
- LST (10\%) w/ temp. opt. - LST(10\%) with temp. cost during legalization


## Experiment Results (1/2)

## -Comparison of different transformations

| circuit | 2D mPL5 | LST (10\%) |  | LST (20\%) |  | Folding-2 |  | Folding-4 |  | 8x8 LST (10\%) |  |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
|  |  | WL | via \# | WL | via \# | WL | via \# | WL | via \# | WL | via \# |
| ibm01 | 5.19E+06 | $2.52 \mathrm{E}+06$ | 18519 | $2.68 \mathrm{E}+06$ | 14102 | $4.61 \mathrm{E}+06$ | 1671 | $4.55 \mathrm{E}+06$ | 2476 | $3.53 \mathrm{E}+06$ | 6688 |
| ibm03 | $1.37 \mathrm{E}+07$ | $6.62 \mathrm{E}+06$ | 30434 | $7.29 \mathrm{E}+06$ | 21406 | $1.14 \mathrm{E}+07$ | 4125 | $1.11 \mathrm{E}+07$ | 5909 | $8.36 \mathrm{E}+06$ | 12318 |
| ibm04 | $1.67 \mathrm{E}+07$ | $8.45 \mathrm{E}+06$ | 37414 | $9.20 \mathrm{E}+06$ | 26871 | $1.55 \mathrm{E}+07$ | 2940 | $1.43 \mathrm{E}+07$ | 6388 | $1.10 \mathrm{E}+07$ | 15315 |
| ibm06 | $2.20 \mathrm{E}+07$ | $1.10 \mathrm{E}+07$ | 50139 | $1.52 \mathrm{E}+07$ | 32939 | $2.02 \mathrm{E}+07$ | 4116 | $1.83 \mathrm{E}+07$ | 9077 | $1.44 \mathrm{E}+07$ | 19315 |
| ibm07 | $3.73 \mathrm{E}+07$ | $1.83 \mathrm{E}+07$ | 65093 | $2.07 \mathrm{E}+07$ | 44715 | $3.18 \mathrm{E}+07$ | 5932 | $3.10 \mathrm{E}+07$ | 8755 | $2.37 \mathrm{E}+07$ | 25021 |
| ibm08 | $3.94 \mathrm{E}+07$ | $1.98 \mathrm{E}+07$ | 70317 | $2.13 \mathrm{E}+07$ | 49844 | $3.48 \mathrm{E}+07$ | 5801 | $3.28 \mathrm{E}+07$ | 10181 | $2.56 \mathrm{E}+07$ | 25205 |
| ibm09 | $3.46 \mathrm{E}+07$ | $1.72 \mathrm{E}+07$ | 72787 | $1.95 \mathrm{E}+07$ | 50755 | $3.19 \mathrm{E}+07$ | 4540 | $2.93 \mathrm{E}+07$ | 8257 | $2.34 \mathrm{E}+07$ | 23836 |
| ibm13 | $6.58 \mathrm{E}+07$ | $3.24 \mathrm{E}+07$ | 121135 | $3.60 \mathrm{E}+07$ | 85103 | $6.03 \mathrm{E}+07$ | 7696 | $5.85 \mathrm{E}+07$ | 13071 | $4.50 \mathrm{E}+07$ | 42568 |
| ibm15 | $1.65 \mathrm{E}+08$ | $8.26 \mathrm{E}+07$ | 246509 | $9.11 \mathrm{E}+07$ | 176018 | $1.45 \mathrm{E}+08$ | 15128 | $1.38 \mathrm{E}+08$ | 23662 | $1.14 \mathrm{E}+08$ | 72956 |
| ibm18 | $2.43 \mathrm{E}+08$ | $1.26 \mathrm{E}+08$ | 297771 | $1.34 \mathrm{E}+08$ | 208564 | $2.24 \mathrm{E}+08$ | 12077 | $2.08 \mathrm{E}+08$ | 28287 | $1.74 \mathrm{E}+08$ | 83380 |
| Avg. | 2.00 | 1.00 | 1.00 | 1.12 | 0.71 | 1.78 | 0.08 | 1.7 | 0.14 | 1.34 | 0.36 |

## Experiment Results (2/2)

- Effect of temperature optimization

|  | LST, $\mathrm{r}=10 \%$, | LST, $\mathrm{r}=10 \%, \mathrm{w} /$ temp optimization |  |  |
| :---: | :---: | :---: | :---: | :---: |
| circuit | Temp. $\left({ }^{\circ} \mathrm{C}\right)$ | WL | via \# | Temp. $\left({ }^{\circ} \mathrm{C}\right)$ |
| ibm01 | 276.5 | $2.81 \mathrm{E}+06$ | 19020 | 159.8 |
| ibm03 | 196.7 | $7.13 \mathrm{E}+06$ | 31780 | 121.6 |
| ibm04 | 159.6 | $9.11 \mathrm{E}+06$ | 40219 | 96.0 |
| ibm06 | 160.4 | $1.23 \mathrm{E}+07$ | 50576 | 103.5 |
| ibm07 | 107.5 | $2.01 \mathrm{E}+07$ | 69111 | 66.4 |
| ibm08 | 97.7 | $2.05 \mathrm{E}+07$ | 75397 | 63.2 |
| ibm09 | 96.1 | $1.94 \mathrm{E}+07$ | 78102 | 60.6 |
| ibm13 | 249.3 | $3.47 \mathrm{E}+07$ | 127520 | 156.2 |
| ibm15 | 136.5 | $8.58 \mathrm{E}+07$ | 260681 | 90.1 |
| ibm18 | 89.4 | $1.31 \mathrm{E}+08$ | 332012 | 58.7 |
| Avg. | 1.0 | 1.08 | 1.06 | 0.63 |

## Conclusions and Future Work

Our contribution

- A simple but effective heuristic to reuse existing 2D placers
- Trade-offs between wire length and \#via
- A general refinement method for 3D placement
- Future work
- More folding-like heuristics for arbitrary K layers
- Mix-sized 3D placement
- White space reservation for inter-layer via
- Find out a good measurement for WL \& \#via


## The End

## Thank you!

