## Design and Chip Implementation of a Heterogeneous Multi-core DSP

Shuming Chen\*, Xiaowen Chen, Yi Xu, Yongjie Sun, Jianzhuang Lu, Xiangyuan Liu, and Shenggang Chen

\*Corresponding Author, Email: smchen@nudt.edu.cn



Institute of Microelectronics and Microprocessors, School of Computer, National University of Defense Technology, China



#### Trend of System-on-Chips (SoCs)



**Evolution of processor chips** 

#### YHFT-QDSP: A Heterogeneous Multi-core Digital Signal Processor



- A RISC CPU core
- Four enhanced YHFT-DSP/700 cores
- Several peripherals
- Individual memory interfaces
- Three kinds of communication:
  - AHB/MB Bridge
  - Fast Shared Data Pool (FSDP)
  - Qlink-Crossbar-PCIE mechanism

#### **Chip Implementation of YHFT-DSP**



Micrograph of YHFT-QDSP


Micrograph annotations: dimension labels 10.70 mm and 4.19 mm; block labels DSP Core-1 and DSP Core-2.

**Characteristics of the YHFT-QDSP**

| Characteristic    | Value                                       | Applies to |
|-------------------|---------------------------------------------|------------|
| Technology        | SMIC 130 nm LVT CMOS, 1 Poly, 8 Metals (Cu) |            |
| Transistors       | 73.9 million                                | YHFT-QDSP  |
|                   | 17.1 million                                | DSP core   |
|                   | 0.82 million                                | CPU core   |
| Die Area          | 114.49 mm²                                  | YHFT-QDSP  |
|                   | 19.32 mm²                                   | DSP core   |
|                   | 1.29 mm²                                    | CPU core   |
| Clock Frequency   | 350 MHz @ 1.2 V                             | DSP core   |
|                   | 200 MHz @ 1.2 V                             | CPU core   |
| Power Dissipation | 2.99 W @ 1.2 V                              | YHFT-QDSP  |
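The figures in the characteristics table can be cross-checked with a little arithmetic. The sketch below is not from the original poster. It assumes the four DSP cores are identical, that each per-core figure describes a single core, and that the die is square; the variable names are illustrative only.

```python
import math

# Values copied from the characteristics table.
DIE_AREA_MM2 = 114.49
DSP_CORE_AREA_MM2 = 19.32
CPU_CORE_AREA_MM2 = 1.29
TOTAL_TRANSISTORS_M = 73.9
DSP_CORE_TRANSISTORS_M = 17.1
CPU_CORE_TRANSISTORS_M = 0.82
NUM_DSP_CORES = 4  # "Four enhanced YHFT-DSP/700 cores"

# Assumption: a square die, so the side length is the square root of the area.
die_side_mm = math.sqrt(DIE_AREA_MM2)

# Assumption: four identical DSP cores, each matching the per-core figure.
dsp_area_share = NUM_DSP_CORES * DSP_CORE_AREA_MM2 / DIE_AREA_MM2
dsp_transistor_share = NUM_DSP_CORES * DSP_CORE_TRANSISTORS_M / TOTAL_TRANSISTORS_M
cpu_area_share = CPU_CORE_AREA_MM2 / DIE_AREA_MM2

print(f"die side            : {die_side_mm:.2f} mm")
print(f"DSP cores, area     : {dsp_area_share:.1%}")
print(f"DSP cores, devices  : {dsp_transistor_share:.1%}")
print(f"CPU core, area      : {cpu_area_share:.1%}")
```

Under these assumptions the computed side length matches the 10.70 mm annotation on the micrograph, and the four DSP cores account for roughly two thirds of the die area and over nine tenths of the transistors. Whether the per-core figures include local memories is not stated in the source.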

#### **Performance Evaluation of YHFT-DSP**

#### 2D FFT benchmark



When the data size is relatively small, e.g. 32×32 points, data transfer by the FSDP is faster than by the Qlink.
As the data size grows, data transfer by the Qlink becomes faster.
Using the FSDP and the Qlink together yields the best speedups.
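To show why inter-core data transfer matters for this benchmark, here is a minimal sketch of a row-column 2D transform split across several workers. The source does not describe how the benchmark partitions its work, so this decomposition is an assumption. The names `dft`, `transpose`, and `fft2d_partitioned` are hypothetical, and a naive DFT stands in for an optimized FFT kernel. The transpose between the two phases is the step where, on a multi-core chip, each core would have to exchange data with the others (for instance over a shared pool or a point-to-point link).

```python
import cmath

def dft(x):
    """Naive O(n^2) 1D DFT; a stand-in for an optimized FFT kernel."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]

def transpose(m):
    """Swap rows and columns of a rectangular list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def fft2d_partitioned(image, num_cores=4):
    """Row-column 2D DFT with the rows split into one block per core.

    Assumption: cores are modeled as loop iterations; nothing runs in parallel.
    """
    def per_core(rows):
        block = (len(rows) + num_cores - 1) // num_cores
        # Each "core" transforms only its own contiguous block of rows.
        parts = [
            [dft(r) for r in rows[c * block:(c + 1) * block]]
            for c in range(num_cores)
        ]
        return [r for part in parts for r in part]

    # Phase 1: 1D transforms along rows, one block of rows per core.
    rows_done = per_core(image)
    # Exchange: the transpose models the all-to-all transfer between cores.
    cols = transpose(rows_done)
    # Phase 2: 1D transforms along the original columns.
    cols_done = per_core(cols)
    return transpose(cols_done)

# Usage: a unit impulse at the origin transforms to a constant spectrum.
impulse = [[1.0 if (r, c) == (0, 0) else 0.0 for c in range(8)] for r in range(8)]
spectrum = fft2d_partitioned(impulse, num_cores=4)
```

In this model the volume of data crossing the exchange step grows with the image size, which is consistent with the observation that the preferred transfer mechanism changes between small and large inputs. The sketch does not model the timing of either mechanism.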