An Efficient Algorithm of Adjustable Delay Buffer Insertion for Clock Skew Minimization in Multiple Dynamic Supply Voltage Designs

**Asia and South Pacific Design Automation Conference** 

#### Authors: Kuan-Yu Lin, Hong-Ting Lin, and Tsung-Yi Ho

Presenter: Hong-Ting Lin

chibli@csie.ncku.edu.tw

http://eda.csie.ncku.edu.tw

Electronic Design Automation Laboratory, Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan






#### Outline

- Introduction
  - Multiple Dynamic Supply Voltage (MDSV) Designs
  - The Clock Skew Issue in MDSV Designs
  - A Model of Adjustable Delay Buffer
- Problem Formulation
- Algorithm Flow
- Clock Skew Minimization in MDSV Designs
- Experimental Results
- Conclusions

### Introduction

- Multiple supply voltage designs
  - Pros: Reduce part of the power consumption
  - Cons: Degrade performance
- Multiple dynamic supply voltage designs
  - Pros: Reduce power consumption while maintaining performance
  - Cons: Introduce variability in the clock tree



### Multiple Dynamic Supply Voltage (MDSV) Designs

Power mode concept

Each power mode specifies the power-optimal operating voltage of every voltage island



### The Clock Skew Issue in MDSV Designs

#### The variability of clock skew

In MDSV designs, buffer delays change with the supply voltage, so the clock skew may violate the skew constraint when the design switches between power modes.



### A Model of Adjustable Delay Buffer

#### Adjustable Delay Buffer (ADB) [5]

- Two inverters are connected in parallel, and SELECT pins are added to control the driving mode
- ADBs are used to add delay to the clock signal as it propagates through the clock tree



[5] G. N. Roberts, "Adjustable buffer driver," U.S. Patent, no. 5361003, 1994.

### **Previous Work**

- Su et al. [6] proposed an algorithm that reduces the clock skew by adopting ADBs
- In a single power mode:
  - Insert ADBs randomly, then iteratively improve the result by adding one ADB and removing another
- In multiple power modes



[6] Y. S. Su et al., "Value assignment of adjustable delay buffers for clock skew minimization in multi-voltage mode designs," ICCAD, pp. 535-538, 2009.


## **Problem Formulation**

#### Input

An MDSV design with a buffered clock tree, a skew constraint, the locations of the voltage islands, and the power mode assignments of the design

#### Objective

Insert ADBs and assign their delay values so as to minimize the clock tree skew in the MDSV design

#### Timing model

Timing information is taken from the Synopsys industry cell library, and the clock latency of each branch of the clock tree is the sum of the buffer delays along that branch
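The timing model above can be illustrated with a minimal sketch. The `Node` class, the tree shape, and the delay values are hypothetical and not taken from the slides; the sketch only assumes that latency is the sum of buffer delays from the root to a leaf and that skew is the difference between the largest and smallest latency.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A buffer in the clock tree (hypothetical model); leaves drive flip-flops."""
    name: str
    delay: int  # buffer delay in ps for the current power mode
    children: list["Node"] = field(default_factory=list)


def sink_latencies(node, upstream=0):
    """Return {leaf name: latency}, summing buffer delays from root to leaf."""
    total = upstream + node.delay
    if not node.children:
        return {node.name: total}
    result = {}
    for child in node.children:
        result.update(sink_latencies(child, total))
    return result


def clock_skew(latencies):
    """Skew is the gap between the slowest and the fastest branch."""
    return max(latencies.values()) - min(latencies.values())


# Illustrative tree with made-up delays (ps).
root = Node("root", 100, [
    Node("a", 120, [Node("a1", 80), Node("a2", 90)]),
    Node("b", 300, [Node("b1", 110)]),
])
lat = sink_latencies(root)
print(lat)              # {'a1': 300, 'a2': 310, 'b1': 510}
print(clock_skew(lat))  # 210
```

In an MDSV design the `delay` values would differ per power mode, so the same computation would be repeated once per mode.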


### **Algorithm Flow**




## **Clock Skew Minimization in MDSV Designs**

- Clock skew reduction in a single power mode
  - An efficient two-stage algorithm for ADB insertion
- Clock skew reduction in multiple power modes
  - Combine the results of every single power mode by the union method
- ADB delay value reduction
  - Further reduce the delay values in an ADB

### Top-Down ADB Insertion and Delay Value Assignments

The top-down strategy

Since an ADB inserted at a higher tree level affects a larger part of the clock tree, the top-down strategy gives priority to inserting ADBs at the higher levels of the clock tree




Optimal delay value assignment

Following [6], the algorithm aligns a local clock tree that has a smaller clock latency with the maximum clock latency of the global clock tree

$ADB_d = L_{gt} - L_{lt}$

- $ADB_d$: the additional delay value assigned to the ADB
- $L_{gt}$: the maximum clock latency of the global clock tree
- $L_{lt}$: the maximum clock latency of the local clock tree
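As a rough illustration of the top-down step and the formula $ADB_d = L_{gt} - L_{lt}$, the sketch below walks a clock tree from the root and places an ADB at the highest node whose subtree is internally within the skew constraint but lags the global maximum latency. The dictionary-based tree, the function names, and the stopping rule are assumptions made for this example; the slides do not spell out the exact procedure.

```python
def subtree_latencies(tree, delays, node, upstream=0):
    """Latencies of all sinks below `node`, with buffer delays summed from the root."""
    total = upstream + delays[node]
    kids = tree.get(node, [])
    if not kids:
        return [total]
    out = []
    for kid in kids:
        out.extend(subtree_latencies(tree, delays, kid, total))
    return out


def top_down_insert(tree, delays, root, skew_limit):
    """Return {node: ADB_d} chosen while walking from the root downwards."""
    l_gt = max(subtree_latencies(tree, delays, root))  # global maximum latency
    adbs = {}

    def visit(node, upstream):
        lats = subtree_latencies(tree, delays, node, upstream)
        l_lt = max(lats)  # local maximum latency
        if l_gt - min(lats) <= skew_limit:
            return  # this subtree already meets the constraint
        if l_lt - min(lats) <= skew_limit:
            adbs[node] = l_gt - l_lt  # ADB_d = L_gt - L_lt
            return  # one ADB here fixes the whole subtree
        for kid in tree.get(node, []):
            visit(kid, upstream + delays[node])

    visit(root, 0)
    return adbs


# Illustrative tree with made-up delays (ps).
tree = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1"]}
delays = {"root": 100, "a": 120, "a1": 80, "a2": 90, "b": 300, "b1": 110}
print(top_down_insert(tree, delays, "root", 50))  # {'a': 200}
```

Here the subtree under `a` has a local maximum latency of 310 ps against a global maximum of 510 ps, so a single ADB with 200 ps of extra delay at `a` aligns both of its sinks.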







### Bottom-Up ADB Elimination and Delay Value Extensions

The bottom-up strategy

Because ADBs at upper levels affect, and can fix the skew of, larger parts of the clock tree, this step focuses on eliminating the ADBs that drive smaller local clock trees.



The advantage of the extended delay range

Any delay value chosen within the extended delay range still meets the skew constraint
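One way to picture an extended delay range is as the interval of extra ADB delay that keeps every sink of the local tree inside the window $[L_{gt} - S, L_{gt}]$, where $S$ is the skew constraint. This interval definition and the function name `extended_delay_range` are assumptions for illustration; the slides do not give the exact construction.

```python
def extended_delay_range(l_gt, skew_limit, sub_latencies):
    """Range (lo, hi) of extra ADB delay that keeps all sinks of the local tree
    within [l_gt - skew_limit, l_gt]; returns None if no such delay exists."""
    lo = max(0, l_gt - skew_limit - min(sub_latencies))  # fastest sink must not lag too far
    hi = l_gt - max(sub_latencies)                       # slowest sink must not exceed L_gt
    return (lo, hi) if lo <= hi else None


# Local tree with sink latencies 300 and 310 ps, global maximum 510 ps.
print(extended_delay_range(510, 50, [300, 310]))  # (160, 200)
```

Under this assumption the upper end of the range equals the value $L_{gt} - L_{lt}$ from the top-down step, and any smaller value down to the lower end is equally valid.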




### ADB Insertion with Delay Value Assignments in Multiple Power Modes

- The union method
  - Apply the two-stage algorithm in every power mode and confirm that the skew constraint is not violated in any power mode
  - The results for the individual clock trees can then be merged by the union method
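The union method can be sketched as follows, assuming each power mode has already produced its own map from tree node to ADB delay. The data layout and the choice of zero extra delay for modes that did not need an ADB at a given node are assumptions of this example, not details from the slides.

```python
def union_merge(per_mode_adbs):
    """per_mode_adbs: {mode: {node: delay}} -> {node: {mode: delay}}.

    A node receives an ADB if any power mode needs one there; modes that did
    not need it are given zero additional delay (an assumption of this sketch).
    """
    modes = list(per_mode_adbs)
    nodes = set().union(*(adbs.keys() for adbs in per_mode_adbs.values()))
    return {
        node: {mode: per_mode_adbs[mode].get(node, 0) for mode in modes}
        for node in sorted(nodes)
    }


# Made-up single-mode results (delays in ps).
single_mode = {"mode1": {"a": 200}, "mode2": {"a": 150, "c": 40}}
print(union_merge(single_mode))
# {'a': {'mode1': 200, 'mode2': 150}, 'c': {'mode1': 0, 'mode2': 40}}
```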



#### ADB Delay Value Reduction in Multiple Power Modes

- The application of extended delay ranges
  - By searching for intersections of the extended delay ranges across power modes, the algorithm can merge the delay values whose ranges share a common intersection.
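The merging idea can be sketched for a single ADB as a greedy pass over its per-mode delay ranges: modes whose ranges overlap are grouped and given one shared delay value. The greedy grouping and the choice of the smallest common delay in each group are assumptions for illustration; the slides state only that values sharing an intersection are merged.

```python
def merge_delay_values(ranges):
    """ranges: {mode: (lo, hi)} for one ADB. Returns {mode: delay}, grouping
    modes whose extended delay ranges share a common intersection."""
    assignment = {}
    group, g_lo, g_hi = [], None, None
    for mode, (lo, hi) in sorted(ranges.items(), key=lambda kv: kv[1][0]):
        if group and lo <= g_hi:
            # The range overlaps the running intersection: shrink it and join.
            group.append(mode)
            g_lo, g_hi = max(g_lo, lo), min(g_hi, hi)
        else:
            # No overlap: close the current group and start a new one.
            for m in group:
                assignment[m] = g_lo
            group, g_lo, g_hi = [mode], lo, hi
    for m in group:
        assignment[m] = g_lo  # smallest delay common to the last group
    return assignment


# Made-up ranges (ps) for one ADB in three power modes.
ranges = {"m1": (160, 200), "m2": (180, 220), "m3": (300, 320)}
print(merge_delay_values(ranges))  # {'m1': 180, 'm2': 180, 'm3': 300}
```

In this example the ADB needs to support two distinct delay values instead of three, which is the kind of reduction the delay value step aims for.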



(a) Initial ADB delay values




## **Algorithm Flow Recap**




## **Experimental Results**

| Benchmark | # Flip-flops | # Buffers | Skew constraint (ps) | Worst clock skew (ps): Original | Worst clock skew (ps): [6] | Worst clock skew (ps): Ours | Average clock skew (ps): Original | Average clock skew (ps): [6] | Average clock skew (ps): Ours | Worst clock latency (ps): Original | Worst clock latency (ps): [6] | Worst clock latency (ps): Ours |
|-------------|------|-----------|-----|------|-----|-----|-----|-----|-----|------|------|------|
| design1.def | 384  | 22 or 33  | 200 | 476  | 200 | 200 |     |     |     | 1316 | 1316 | 1316 |
| design1.def | 384  | 22 or 33  | 300 | 476  | 293 | 300 |     |     |     | 1316 | 1316 | 1316 |
| design2.def | 992  | 79        | 200 | 463  | 200 | 200 | 388 | 197 | 198 | 1560 | 1560 | 1560 |
| design2.def | 992  | 79        | 300 | 463  | 300 | 300 | 388 | 272 | 292 | 1560 | 1560 | 1560 |
| design3.def | 1536 | 127       | 200 | 630  | 200 | 200 | 506 | 195 | 200 | 1667 | 1667 | 1667 |
| design3.def | 1536 | 127       | 300 | 630  | 298 | 300 | 506 | 290 | 300 | 1667 | 1667 | 1667 |
| design4.def | 3360 | 337       | 200 | 1018 | 199 | 200 | 785 | 196 | 200 | 2888 | 2888 | 2888 |
| design4.def | 3360 | 337       | 300 | 1018 | 299 | 300 | 785 | 292 | 300 | 2888 | 2888 | 2888 |
| design5.def | 6144 | 519       | 200 | 1167 | 198 | 200 | 796 | 196 | 200 | 3069 | 3069 | 3069 |
| design5.def | 6144 | 519       | 300 |      |     |     |     |     |     |      |      |      |

All results are without skew violation.

| Benchmark | # Flip-flops | # Buffers | Skew constraint (ps) | # ADBs: [6] | # ADBs: Ours | # ADBs: Improvement (%) | # Transistors: [6] | # Transistors: Ours | # Transistors: Improvement (%) | Runtime: [6] | Runtime: Ours | Runtime: [6] / Ours |
|-------------|------|-----------|-----|-----|-----|--------|------|------|--------|--------|--------|--------|
| design1.def | 384  | 22 or 33  | 200 | 19  | 19  | 0.00%  | 336  | 260  | 22.62% | 0.01   | < 0.01 | > 1.00 |
| design1.def | 384  | 22 or 33  | 300 | 8   | 7   | 12.50% | 216  | 168  | 22.22% | < 0.01 | < 0.01 | 1.00   |
| design2.def | 992  | 79        | 200 | 61  | 61  | 0.00%  | 896  | 828  | 7.59%  | 0.17   | 0.01   | 17.00  |
| design2.def | 992  | 79        | 300 | 18  | 17  | 5.55%  | 644  | 516  | 19.88% | 0.09   | 0.01   | 9.00   |
| design3.def | 1536 | 127       | 200 | 100 | 100 | 0.00%  | 2192 | 1724 | 21.35% | 1.05   | 0.04   | 26.25  |
| design3.def | 1536 | 127       | 300 | 37  | 32  | 13.51% | 1764 | 1016 | 42.40% | 0.55   | 0.03   | 18.33  |
| design4.def | 3360 | 337       | 200 | 199 | 196 | 1.51%  | 5172 | 4156 | 19.64% | 12.84  | 0.13   | 98.77  |
| design4.def | 3360 | 337       | 300 | 113 | 99  | 12.38% | 4688 | 2980 | 36.43% | 10.03  | 0.14   | 71.64  |
| design5.def | 6144 | 519       | 200 | 289 | 286 | 1.04%  | 7680 | 6112 | 20.42% | 50.67  | 0.43   | 117.84 |
| design5.def | 6144 | 519       | 300 | 120 | 89  | 25.83% | 6756 | 3996 | 40.85% | 35.50  | 0.43   | 82.56  |

- ADB reduction: 0% to 25.83%
- Transistor reduction: 7.59% to 42.40%
- Runtime speedup: up to 117.84X
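The improvement and speedup columns follow from simple ratios, which the short sketch below reproduces for the design3.def results at the 300 ps constraint (37 vs. 32 ADBs, 1764 vs. 1016 transistors, runtimes 0.55 vs. 0.03). The function names are hypothetical and only illustrate the arithmetic.

```python
def improvement(baseline, ours):
    """Relative reduction over the baseline [6], in percent."""
    return (baseline - ours) / baseline * 100


def speedup(baseline_runtime, our_runtime):
    """Runtime ratio [6] / Ours."""
    return baseline_runtime / our_runtime


print(f"{improvement(37, 32):.2f}%")      # 13.51%
print(f"{improvement(1764, 1016):.2f}%")  # 42.40%
print(f"{speedup(0.55, 0.03):.2f}")       # 18.33
```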


#### Layout result of design3.def



(a) Before ADB insertion




#### **Conclusions**

Novel techniques are proposed to reduce the clock skew in MDSV designs

The proposed delay value reduction algorithms reduce the area overhead of ADBs in MDSV designs

Experimental results show that our algorithms are effective and efficient in terms of clock skew, area, and runtime in MDSV designs

