

# Pre-Routing Path Delay Estimation Based on Transformer and Residual Framework

Tai Yang, Guoqing He, Peng Cao

National ASIC System Engineering Technology Research Center, Southeast University, Nanjing, China

caopeng@seu.edu.cn

ASP-DAC 2022

# OUTLINE



- Background
- Related Work
- Pre-Routing Path Delay Framework
- Results
- Conclusion

## Background





Problem: As the design flow approaches tape-out, the updated circuit timing shows a non-negligible mismatch from one design stage to the next, posing severe challenges for circuit optimization.

This work: pre- and post-routing timing correlation

[DAC'19] E. C. Barboza, N. Shukla, Y. Chen, J. Hu, "Machine learning-based pre-routing timing prediction with reduced pessimism," in 2019 56th ACM/IEEE Design Automation Conference (DAC), pp. 1-6, IEEE, 2019.






## Related Work

- Fast pre-routing timing estimation based on traditional mathematical models
- Timing analysis based on machine learning methods



#### Fast pre-routing timing estimation based on traditional mathematical model

| Ref.     | Affiliation                | Title                                                                         | Focus                                                                                   |
|----------|----------------------------|-------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------|
| 10'SLIP  | Synopsys, Brown University | Fast, accurate a priori routing delay estimation                              | Post-routing delay estimation                                                           |
| 04'DAC   | University of California   | Pre-layout wire length and congestion estimation                              | Wire length and congestion estimation                                                   |
| 06'ICECC | Syracuse University        | Pre-layout estimation of interconnect lengths for digital integrated circuits | Pre-layout interconnect length estimation                                               |
| 00'SLIP  | University of Toronto      | Pre-layout estimation of individual wire lengths                              | Individual wire length estimation during the technology mapping phase of logic synthesis |

Focus: wire length or wire delay estimation






Average increase ratio of net delays between routing and placement stages for all types of cells

The impact of routing on cell delay is much more significant than its impact on wire delay.




Requirements for pre-routing timing estimation: (1) be fast and accurate; (2) pay more attention to cell delay or path delay.





## In recent years, learning-based methods have been extended to timing analysis

- Application: a fast and accurate timing estimator that correlates closely with a sign-off timer to shorten turnaround time
- ML models: RF, Lasso, XGBoost



[DAC'20, H. H. Cheng, Fast and accurate wire timing estimation on tree and non-tree net structures]

- Application: wire delay/slew models for an internal incremental STA, used to reduce the deviation in endpoint slack from an STA tool
- ML models: Least squares regression



[SLIP'13, A. B. Kahng, Learning-based approximation of interconnect delay and slew in signoff timing tools]




- Application: predict path-based slack from graph-based timing analysis
- ML models: RF

Figure: training flow (GBA and PBA timing reports, feature extraction, model training) and testing flow (unseen GBA timing reports, predictive model, predicted PBA timing).

[ICCD'18, A. B. Kahng, Using machine learning to predict path-based slack from graph-based timing analysis]

- Application: MLParest provides an accurate estimate of expected post-layout interconnect parasitics in the pre-layout design phase
- ML models: RF

[DAC'20, B. Shook, MLParest: Machine learning based parasitic estimation for custom circuit design]



#### Problems:

- Neglect of the delay correlation along the path
- Prediction error accumulation and computational complexity increase

## This work proposes an efficient and accurate pre-routing path delay prediction framework that employs a transformer network and a residual model.

- Sequence features at the placement stage
- Transformer network: exploits the correlations along the circuit path
- Residual model: calibrates the mismatch between the pre- and post-routing path delay
- No additional computation required

## Pre-Routing Path Delay Framework





Overview of the prediction

## Framework: feature selection






#### Feature selection and data pre-process

| Features           | Continuous variable                            | Discrete variable |
|--------------------|------------------------------------------------|-------------------|
| Physical sequence  | Pin cap,<br>pin location                       | Cell type         |
| Timing<br>sequence | Input/output transition<br>time,<br>cell delay | Signal polarity   |
| Timing scalar      | Pre-routing path delay                         |                   |

Sequence: the representation of path characteristics

## Framework: data pre-process

| Features          | Continuous variable                      | Discrete variable   |
|-------------------|------------------------------------------|---------------------|
| Physical sequence | Pin cap, pin location                    | Cell type           |
| Timing sequence   | Input/output transition time, cell delay | Signal polarity     |
| Data pre-process  | Binning + padding                        | Tokenizer + padding |
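The pre-processing row above can be illustrated with a minimal sketch: continuous features are binned into integer ids, discrete features are tokenized, and both are padded to a common path length. The equal-width binning scheme, the number of bins, the use of id 0 for padding, and the sample values are all assumptions for illustration; the slides do not specify them.

```python
from bisect import bisect_right

PAD_ID = 0  # assumption: id 0 is reserved for padding


def make_bin_edges(values, num_bins):
    """Equal-width interior bin edges over the observed range (assumed scheme)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1.0  # avoid zero width for constant data
    return [lo + i * width for i in range(1, num_bins)]


def bin_sequence(seq, edges):
    """Map each continuous value to a bin id in 1..num_bins (0 is padding)."""
    return [bisect_right(edges, v) + 1 for v in seq]


def build_vocab(sequences):
    """Assign an integer id (starting at 1) to each distinct token."""
    vocab = {}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def pad(seq, max_len):
    """Right-pad (or truncate) a list of ids to exactly max_len entries."""
    return (seq + [PAD_ID] * max_len)[:max_len]


# Hypothetical example: two paths with per-stage cell delay (ns) and cell type.
cell_delays = [[0.02, 0.05, 0.11], [0.03, 0.08]]
cell_types = [["INV", "NAND2", "DFF"], ["BUF", "INV"]]

edges = make_bin_edges([v for p in cell_delays for v in p], num_bins=4)
vocab = build_vocab(cell_types)
max_len = max(len(p) for p in cell_delays)

delay_ids = [pad(bin_sequence(p, edges), max_len) for p in cell_delays]
type_ids = [pad([vocab[t] for t in p], max_len) for p in cell_types]
```

After this step every path is a fixed-length sequence of integer ids, which is the form an embedding layer in front of a transformer encoder typically expects.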

## Framework: "pre-routing path delay"





## Framework: transformer encoder





## Framework: attention mechanism
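As a rough illustration of the mechanism named in this slide title, the following is a minimal single-head scaled dot-product self-attention in plain Python. It is a generic sketch rather than the network used in this work: the identity projection matrices and the toy 3-stage input are assumptions, and multi-head splitting, positional encoding and padding masks are omitted.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def self_attention(x, w_q, w_k, w_v):
    """Return (output, weights) of softmax(Q K^T / sqrt(d_k)) V."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Hypothetical path with 3 stages, each embedded as a 2-dimensional vector.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # assumed projections, for readability only
out, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` describes how strongly one stage of the path attends to every other stage, which is how correlations along the path can be captured.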






## Framework: data dimension reduction





- Dimension reduction and data concatenation
- Predict the residual value and add it to the pre-routing path delay
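The residual step in the second bullet can be sketched as follows: the model is trained on the difference between post- and pre-routing path delay, and at inference the predicted residual is added back to the pre-routing delay. The function names and delay values are hypothetical, and the residual predictor itself is left out.

```python
def residual_targets(pre_delays, post_delays):
    """Training targets: post-routing delay minus pre-routing delay."""
    return [post - pre for pre, post in zip(pre_delays, post_delays)]


def apply_residual(pre_delays, predicted_residuals):
    """Final estimate: pre-routing delay plus the predicted residual."""
    return [pre + res for pre, res in zip(pre_delays, predicted_residuals)]


# Hypothetical path delays in ns.
pre = [1.20, 0.85, 2.10]
post = [1.32, 0.90, 2.31]

targets = residual_targets(pre, post)
# With a perfect residual predictor the post-routing delay is recovered.
recovered = apply_residual(pre, targets)
```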



## Results

Experiment setup:

- Framework implementation: Python, Keras
- Technology: TSMC 28nm
- Circuits: 5 circuits
  - 3 seen circuits, randomly divided into training and test sets
  - 2 unseen circuits, used entirely as test sets

#### **Circuit Statistics**

| Circuits | # Train | # Test | #cell  | #net   | category |
|----------|---------|--------|--------|--------|----------|
| ckt #1   | 40791   | 17483  | 10154  | 18892  | seen     |
| ckt #2   | 93786   | 40194  | 234391 | 340004 | seen     |
| ckt #3   | 16099   | 6900   | 37958  | 51175  | seen     |
| ckt #4   | 0       | 16998  | 6667   | 9072   | unseen   |
| ckt #5   | 0       | 23785  | 11830  | 15170  | unseen   |
| Total    | 150676  | 105360 | 301000 | 434312 |          |
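The train and test counts of the three seen circuits in the table are consistent with a random split of roughly 70% training and 30% test. That ratio is inferred from the counts and is not stated in the slides. A minimal sketch of such a split, with a hypothetical `split_paths` helper and an arbitrary seed:

```python
import random


def split_paths(path_ids, train_tenths=7, seed=0):
    """Shuffle path ids and split them, keeping train_tenths/10 for training."""
    ids = list(path_ids)
    random.Random(seed).shuffle(ids)
    cut = len(ids) * train_tenths // 10  # integer arithmetic avoids float rounding
    return ids[:cut], ids[cut:]


# Total path counts per seen circuit (train + test from the table).
seen_counts = {
    "ckt #1": 40791 + 17483,
    "ckt #2": 93786 + 40194,
    "ckt #3": 16099 + 6900,
}

splits = {name: split_paths(range(n)) for name, n in seen_counts.items()}
train_sizes = {name: len(train) for name, (train, test) in splits.items()}
test_sizes = {name: len(test) for name, (train, test) in splits.items()}
```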



#### Accuracy Comparison:



Error distribution of different models on seen and unseen designs.

| Metric                         | Seen ckt  | Unseen ckt |
|--------------------------------|-----------|------------|
| rRMSE                          | <1.68%    | <3.12%     |
| Compared with RF, reduced by   | 2.3X~5X   | 2.4X~2.7X  |
| Compared with CNN, reduced by  | 1.7X~2.9X | 1.1X~1.5X  |
| Residual model benefits        | >30%      | >10%       |
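The slides do not define rRMSE. One common definition, assumed here, is the root-mean-square error divided by the mean of the reference values; the authors may use a different normalization. A minimal sketch with hypothetical delay values:

```python
import math


def rrmse(predicted, actual):
    """Relative RMSE: RMSE normalized by the mean of the actual values (assumed definition)."""
    n = len(actual)
    rmse = math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)
    return rmse / (sum(actual) / n)


# Hypothetical post-routing path delays (ns) and model predictions.
actual = [1.0, 2.0, 3.0]
predicted = [1.1, 1.9, 3.1]
value = rrmse(predicted, actual)  # RMSE of 0.1 over a mean of 2.0
```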






#### Runtime Comparison:

Prediction runtime (s):

| Model               | Stage    | ckt #1          | ckt #2            | ckt #3          | ckt #4          | ckt #5          |
|---------------------|----------|-----------------|-------------------|-----------------|-----------------|-----------------|
| Traditional IC flow | CTS      | 1378            | 31896             | 109             | 193             | 620             |
| Traditional IC flow | Routing  | 1818            | 411968            | 143             | 254             | 816             |
| Traditional IC flow | STA (PT) | 655             | 1608              | 276             | 680             | 951             |
| Traditional IC flow | Total    | 3851            | 445472            | 528             | 1127            | 2387            |
| RF                  |          | 11.2            | 26.5              | 10.6            | 15.7            | 28.4            |
| CNN                 |          | 1.28            | 2.68              | 0.57            | 1.22            | 1.66            |
| This work           |          | 1.02 (3775X)    | 2.16 (206237X)    | 0.40 (1320X)    | 1.02 (1105X)    | 1.38 (1730X)    |
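The speedup factors in parentheses appear to be the total runtime of the traditional flow divided by the prediction runtime of this work. The short check below reproduces them from the table values; the variable names are illustrative only.

```python
# Total runtime (s) of the traditional flow (CTS + routing + STA), from the table.
flow_total_s = {"ckt #1": 3851, "ckt #2": 445472, "ckt #3": 528,
                "ckt #4": 1127, "ckt #5": 2387}

# Prediction runtime (s) of this work, from the table.
this_work_s = {"ckt #1": 1.02, "ckt #2": 2.16, "ckt #3": 0.40,
               "ckt #4": 1.02, "ckt #5": 1.38}

# Speedup rounded to the nearest integer, as reported in the table.
speedup = {name: round(flow_total_s[name] / this_work_s[name])
           for name in flow_total_s}
```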

#### Runtime analysis






## Conclusion

This work proposes an efficient and accurate pre-routing path delay prediction framework that employs a transformer network and a residual model.

- Transformer network: exploits the correlations of the timing and physical information along the circuit path through its multi-head self-attention mechanism
- Residual model: calibrates the mismatch between the pre- and post-routing path delay
- Higher accuracy and shorter runtime



# THANKS Q&A
