# ScanPUF: Robust Ultralow-Overhead PUF using Scan Chain

Yu Zheng, Aswin Krishna, & Swarup Bhunia

Department of Electrical Engineering and Computer Science Case Western Reserve University, Cleveland, OH, USA *Email: yu.zheng3@case.edu* 



**ASP-DAC 2013** 



# Outline

## Introduction

- Fundamental of Physically Unclonable Function (PUF)
- Existing PUFs

# Methodology of ScanPUF

- Hardware architecture
- Signature generation procedure
- Design details

#### Simulation and emulation results

- Uniqueness and robustness
- Aging effect
- FPGA validation
- Discussion && Conclusion

# The Fundamental of Physically Unclonable Function (PUF)

• Why PUF?

The storage of digital key in non-volatile memory is vulnerable to invasive attacks.

Highly tampering-resistance environment is expensive.

#### • What is PUF?

 Basically, a challenge-response protocol that exploits the inherent random variations in a manufacturing process to generate unique signatures.

#### The advantages of PUF

- Random and vast challenge-response pairs.
- Only available when the chip is running.
- Cost is lower than key storage of tampering-resistance.

# **Existing PUFs**

Dedicated circuit structure

- Ring-oscillator PUF (RO-PUF)
- Arbiter PUF
- Butterfly PUF
- PE-PUF



# Non-negligible area overhead.

- Utilization the on-chip structure
  - SRAM PUF
  - MECCA
  - Intrinsic PUF



Limited challengeresponse pairs or established on special structure.

# **Illustration of ScanPUF Realization**

#### Signature generation cycle: 1 ns



## **The Architecture of ScanPUF**



# **Signature Generation Illustration**



# **Signature Generation Procedure**

- Select the SFFs in a design to generate the signature.
- In the test mode, scan sequence of alternating '0' and '1' from primary inputs (PIs) into SFFs.
- Produce the signature-generation cycle.
- Shift out the response bits (signature) stored in the SFFs through primary outputs (POs).

# **Hspice Simulation Setup**

- **128-bit signature** is evaluated in Hspice simulation under 45 nm PTM CMOS process.
- Process variation of  $V_{th}$ :  $\sigma_{th,inter} = 15\%$  and  $\sigma_{th0,intra} = 10\%$ .
- Four inverters on each clock delay line of SGCC.
- Four inverters on each scan path.
- Eight clock delay lines are employed in the SGCC and each one produces an 16-bit signature.
- 1000 chips are simulated and 0 -> 1 is generated on the scan path.

# **Scan Path Distribution**

#### Scan path follows the Gaussian distribution.



Median delay of each chip fluctuates largely due to the inter-die variation.

# Signature Generation Cycle Controller (SGCC)

SGCC is comprised of N clock delay lines and a (N+1):1 multiplexor.



## The switch between clock 1 and clock 2 generates a cycle of signature-generation period.

#### Architecture of SGCC



#### Timing of SGCC

# Design keynote of SGCC for the signature of good uniqueness

Signature-generation cycle and scan path delay should have the approximate shift on the inter-die corner.



The upsize of *W* and *L* drops the influence from the intra-die process variation.

#### Intra-die variation on SGCC

$$\sigma_{th,intra} = \sigma_{th0,intra} \cdot \sqrt{\frac{W_{min}L_{min}}{WL}}$$

minimum gate in a process

#### Approximate shift

12

# **Shannon Entropy Estimation**

- The Shannon Entropy is related to: (1) selection of SFFs, (2) signature-generation cycles, (3) prestored value in SFF (0->1, 1->0 transition).
- For only one signature-generation cycle, the maximum entropy for one SFF under (0->1 or 1->0 transition) is one bit.
- For m signature-generation cycles, the maximum entropy for an SFF under (0->1 or 1->0 transition) is larger than one bit, but less than m bits.

# Uniqueness analysis Signature-generation cycle (nominal) Equal Scan path delay (nominal)



Average Inter-die Hamming Distance (HD): 49.9% Probability of zero in the signature: 0.5 14



Probability of zero: 0.4

#### 15

Probability of zero: 0.6

# **Robustness Analysis**



18% bits flip when the supply voltage is reduced to 0.8 Volt from 1 Volt.

92.1% chips has no morethan 10 unstable bits within85 degree Celsius.

# **Aging Effect Analysis**

Unstable-bit percentage under non-identical NBTI on clock delay line and scan paths.



For the identical shift on  $V_{th}$  , nearly 4% bits flip.

# **FPGA** Validation

#### Structure of scan chain mapping into FPGA



Each basic structure is mapped into a LE for onebit signature with the delay as 5.42 ± 0.02 ns.

Totally, five 128-bit signatures are arranged on a FPGA from the top to bottom.

# **FPGA Validation (cont'd)**

#### **Result:**



arge standard deviation of inter-die HD is due to the uncontrollability of intra-die V<sub>th</sub> process variation on clock delay line because of the uniform structure of LE. 19

# **Discussion**

- Because of the unbalanced wire lengths, fanout loads and residual clock skewing, the nominal delays of different scan paths may be far ways from identity in real IC design.
- We can obtain the signature by two steps:
  - Put scan paths with near-identical nominal delays into a set.
  - Each set has a clock delay line to trigger its own signature bits and aggregate them from different sets as the signature.

# **Area Overhead**

- We make the comparison on the area of 128-bit ScanPUF and RO-PUF.
- For the ScanPUF, 8 clock delay lines (four inverters each) are included in the signaturegeneration controller.
- For the RO-PUF, if all ordering of ROs are possible, it requires 35 2:1 MUXs, two 32-bit counters and one 32-bit comparators.

|                        | RO-PUF  |         | ScanPUF |     |              |
|------------------------|---------|---------|---------|-----|--------------|
|                        |         | w/ INVs | w/o INV | s – | inout INVS   |
| Area(um <sup>2</sup> ) | 2721.40 | 303.96  | 62.04   |     | on scan path |
|                        |         |         |         |     | 21           |

# **Influence on Power**

- Some inverters may be inserted into scan path that impacts the power of circuit in the function mode and test mode.
- For the function mode, power-gating can prevent the signal propagation on scan path and therefore incur no extra power overhead.
- For the test mode, when four inverters are on the scan path, the test power grows by 9.65% in Hspice.
- Since 78% test power is consumed by the combinational logic, taking 64-bit Alpha processor with 2408 SFFs for example, the overall power is increased by 0.11%.

# Conclusion

- We have presented a novel PUF (ScanPUF) established on the scan chain, a prevalent on-chip structure.
- Since the on-chip structure is exploited, we show that the area overhead and design effort are reduced significantly.
- The extensive circuit-level Hspice simulation and FPGA validation show excellent signature uniqueness and robustness under temperature and supply voltage fluctuation and aging effect.
- It can provide a large number of challengeresponse pairs.

# Thank My

**Questions ??**