



## Thermal-Aware 3D IC Placement Via Transformation

### Jason Cong, Guojie Luo, Jie Wei and Yan Zhang Computer Science Department University of California, Los Angeles

January 24, 2007



Supported by GSRC, NSF

## **Outline**

### Technology Background

### Problem Formulation & Existing Works

### Our 3D Placement Framework

- Transformation for 3D Placement
- RCN Graph-based Refinement

### Experiment Results

### Conclusions and Future Work

## **Technology Background**

### MITLL .18um 3D SOI Technology

**3DOGC Opens to This Metal Level** 



## **Problem Formulation**

(x<sub>i</sub>,y<sub>i</sub>,z<sub>i</sub>) is assigned to every cell i such that

### • Wire length Σ I(e) and #via Σ v(e) are minimized

• 
$$l(e) = \max_{v_i, v_j \in e} |x_i - x_j| + \max_{v_i, v_j \in e} |y_i - y_j|$$

• 
$$v(e) = \max_{v_i, v_j \in e} |z_i - z_j|$$

I(e) + αv(e) is the half perimeter wire length in 3D IC

### Non-overlap and temperature constraints are met

## **Review of Existing 3D placements**

### Min-cut

[Das, Chandrakasan, Reif, ASP-DAC 2003]

### Min-cut + Simulated Annealing

[Balakrishnan, Nanda, Easwar, Lim, ASP-DAC 2005]

### Force Directed

Goplen, Sapatnekar, ICCAD 2003]

## 3D Placement Framework: mPL-3D



#### Features

- Existing well-performing 2D placers can be reused
- Simple but effective transformation heuristics
- Trade-off between wire length and #via to adapt different manufacturing ability
- Refinement through RCN graph

## mPL-3D Framework – Thermal Model



Tiles Stack Array

Single Tile Stack Tile Stack Analysis

#### Resistive network model

- [Wilkerson, et al. ITherm 2004]
- Resistive chain for fast analysis
  - [Cong, Zhang, ICCAD 2005]
  - Basic principal is to put high power cells close to heat sink

## Transformation for 3D Placement

- Local Stacking Transformation
- Transformation through Folding
- Window-based Stacking / Folding

## Local Stacking Transformation



#### ◆2D placement on area KA

 For 3D chip with K device layers and each with area A

•Shrink:  $(x_i, y_i) \rightarrow (x_i / \sqrt{K}, y_i / \sqrt{K})$ 

- Tetris-style 3D legalization
  - Cost R = αd + βv + γt
  - Minimize displacement, #via and thermal cost

## Transformation through Folding

- Layer assignment and location mapping according to the folded order
  - Folding-2



• Folding-4



UCLA VLSICAD LAB

## Comparison between Stacking and Folding

## Stacking

- Tend to optimize 2D wire length
- Result in great number of via

## Folding

- Tend to avoid over-optimized local nets
- Less optimization in wire length

## Trade-off is needed for different manufacturing ability

## Window-based Stacking / Folding

- Divde 2D placement into NxN windows
- Apply stacking or folding in a window

 Effect of stacking or folding would be spreaded out, and trade-offs are achieved with different N



## **RCN Graph-based Refinement**

### Construction of Relaxed Conflict-Net (RCN) graph

- Vertex Set consists of all the cells and nets
- Two types of edges
  - Conflict edge. Cost is imposed if cells are "overlapped"



• Net edge. Cost relates to the layer assignment



## **RCN Graph Example**



## **RCN Graph-based Refinement**

### Layer reassignment

- Objective: minimize total cost of RCN graph
- Variables: layer assignment (z)
- Constants: (x,y) location of cells
- ◆ Algorithm [Chang, Cong, IEEE Trans. CAD, 1999]
  - Optimal solution can be achieved if the graph is a tree
  - Induced sub-tree is constructed, and it will cover 40%-50% nodes
  - Iteratively optimize over these sub-trees to achieve good solution on the whole graph

## Experiment Setup

- IBMv1 placement benchmark
- ◆ 2D placer mPL 5.0
  - Initial 2D placement
  - Final layer-by-layer legalization and detailed placement

#### Transformations

- LST (10%) Local Stacking with 10% overlap during refinement
- LST (20%) Local Stacking with 20% overlap during refinement
- Folding-2 Folding-2 Transformation
- Folding-4 Folding-4 Transformation
- 8x8 LST(10%) 8x8 windows and apply LST(10%) on each window
- LST (10%) w/ temp. opt. LST(10%) with temp. cost during legalization

### Comparison of different transformations

| circuit | 2D mPL5  | LST (10%) |        | LST (20%) |        | Folding-2 |       | Folding-4 |       | 8x8 LST (10%) |       |
|---------|----------|-----------|--------|-----------|--------|-----------|-------|-----------|-------|---------------|-------|
|         |          | WL        | via #  | WL        | via #  | WL        | via # | WL        | via # | WL            | via # |
| ibm01   | 5.19E+06 | 2.52E+06  | 18519  | 2.68E+06  | 14102  | 4.61E+06  | 1671  | 4.55E+06  | 2476  | 3.53E+06      | 6688  |
| ibm03   | 1.37E+07 | 6.62E+06  | 30434  | 7.29E+06  | 21406  | 1.14E+07  | 4125  | 1.11E+07  | 5909  | 8.36E+06      | 12318 |
| ibm04   | 1.67E+07 | 8.45E+06  | 37414  | 9.20E+06  | 26871  | 1.55E+07  | 2940  | 1.43E+07  | 6388  | 1.10E+07      | 15315 |
| ibm06   | 2.20E+07 | 1.10E+07  | 50139  | 1.52E+07  | 32939  | 2.02E+07  | 4116  | 1.83E+07  | 9077  | 1.44E+07      | 19315 |
| ibm07   | 3.73E+07 | 1.83E+07  | 65093  | 2.07E+07  | 44715  | 3.18E+07  | 5932  | 3.10E+07  | 8755  | 2.37E+07      | 25021 |
| ibm08   | 3.94E+07 | 1.98E+07  | 70317  | 2.13E+07  | 49844  | 3.48E+07  | 5801  | 3.28E+07  | 10181 | 2.56E+07      | 25205 |
| ibm09   | 3.46E+07 | 1.72E+07  | 72787  | 1.95E+07  | 50755  | 3.19E+07  | 4540  | 2.93E+07  | 8257  | 2.34E+07      | 23836 |
| ibm13   | 6.58E+07 | 3.24E+07  | 121135 | 3.60E+07  | 85103  | 6.03E+07  | 7696  | 5.85E+07  | 13071 | 4.50E+07      | 42568 |
| ibm15   | 1.65E+08 | 8.26E+07  | 246509 | 9.11E+07  | 176018 | 1.45E+08  | 15128 | 1.38E+08  | 23662 | 1.14E+08      | 72956 |
| ibm18   | 2.43E+08 | 1.26E+08  | 297771 | 1.34E+08  | 208564 | 2.24E+08  | 12077 | 2.08E+08  | 28287 | 1.74E+08      | 83380 |
| Avg.    | 2.00     | 1.00      | 1.00   | 1.12      | 0.71   | 1.78      | 0.08  | 1.7       | 0.14  | 1.34          | 0.36  |

### Effect of temperature optimization

|         | LST, r = 10%, | LST, $r = 10\%$ , w/ temp optimization |        |            |  |  |  |
|---------|---------------|----------------------------------------|--------|------------|--|--|--|
| circuit | Temp. (°C)    | WL                                     | via #  | Temp. (°C) |  |  |  |
| ibm01   | 276.5         | 2.81E+06                               | 19020  | 159.8      |  |  |  |
| ibm03   | 196.7         | 7.13E+06                               | 31780  | 121.6      |  |  |  |
| ibm04   | 159.6         | 9.11E+06                               | 40219  | 96.0       |  |  |  |
| ibm06   | 160.4         | 1.23E+07                               | 50576  | 103.5      |  |  |  |
| ibm07   | 107.5         | 2.01E+07                               | 69111  | 66.4       |  |  |  |
| ibm08   | 97.7          | 2.05E+07                               | 75397  | 63.2       |  |  |  |
| ibm09   | 96.1          | 1.94E+07                               | 78102  | 60.6       |  |  |  |
| ibm13   | 249.3         | 3.47E+07                               | 127520 | 156.2      |  |  |  |
| ibm15   | 136.5         | 8.58E+07                               | 260681 | 90.1       |  |  |  |
| ibm18   | 89.4          | 1.31E+08                               | 332012 | 58.7       |  |  |  |
| Avg.    | 1.0           | 1.08                                   | 1.06   | 0.63       |  |  |  |

## **Conclusions and Future Work**

## Our contribution

- A simple but effective heuristic to reuse existing 2D placers
- Trade-offs between wire length and #via
- A general refinement method for 3D placement

### Future work

- More folding-like heuristics for arbitrary K layers
- Mix-sized 3D placement
- White space reservation for inter-layer via
- Find out a good measurement for WL & #via

# The End Thank you!