## Adaptive Control-Logic Routing for Fully Programmable Valve Array Biochips Using Deep Reinforcement Learning

Huayang Cai<sup>1</sup>, Genggeng Liu<sup>1,\*</sup>, Wenzhong Guo<sup>1</sup>, Zipeng Li<sup>2</sup>, Tsung-Yi Ho<sup>3</sup>, and Xing Huang<sup>4</sup>

#### <sup>1</sup> Fuzhou University, China;

<sup>2</sup>Apple Inc, Cupertino, USA;

<sup>3</sup>The Chinese University of Hong Kong, Hong Kong;

<sup>4</sup>Northwestern Polytechnical University, China

Speaker: Huayang Cai Email: c\_huayang@163.com

## Background

### Preliminary and Problem Formulation Mechanism of Control Logics Control Logic Considering Length Matching Problem Formulation

# Details of the Proposed Routing Flow

Overall Design Flow

Action Space and State Space

**Reward Function** 

# Simulation Results

## Conclusion

### Background



(c) Fully programmable valve array biochip

### Background

### Control system for fully programmable valve array biochips



(a) Fully programmable valve array biochip

(b) Multiplexing-based control logic

Background

### Preliminary and Problem Formulation

Mechanism of Control Logics Control Logic Considering Length Matching

**Problem Formulation** 

## Details of the Proposed Routing Flow

Overall Design Flow

Action Space and State Space

**Reward Function** 

## Simulation Results

## Conclusion

### **Preliminary and Problem Formulation**



# Mechanism of Control Logics

- Core input is connected to the flow valves through control channels.
- Control ports are deployed to actuate open/closed states of control valves.
- Control patterns can be generated using the combinations of control valves.
- Control pattern is activated to specify which control channel can be connected to the core input.

Schematic of the control logic for driving five flow valves

### **Control Logic Considering Length Matching**



(c) Control logic considering length matching

(b) Valve-grouping result

 $X_1$ 

 $\overline{x}_1$ 

 $x_2$ 

 $\overline{x}_{2}$ 

 $x_3$ 

 $\overline{x}_3$ 

### **Preliminary and Problem Formulation**

### **Problem Formulation**

- Inputs:
- 1) A logic forest indicating the control path information for all flow valves
- 2) The positions of all control valves in a control logic
- 3) The positions of all flow valves in an FPVA biochip
- Output:

An efficient control channel network considering length matching

- Objectives
- 1) Minimize the timing skews among synchronized valves
- 2) Minimize the total length of control channels

- Background
- Preliminary and Problem Formulation Mechanism of Control Logics Control Logic Considering Length Matching

Problem Formulation

Details of the Proposed Routing Flow

Overall Design Flow

Action Space and State Space

**Reward Function** 

# Simulation Results

Conclusion

### **Details of the Proposed Routing Flow**

### **Overall Design Flow of the Proposed DRL Method**



An optimized routing solution is generated to construct a control channel network with minimized timing skews and cost

### Action Space and State Space

Action Space is composed of all possible routing directions for control channels

$$\mathcal{A}_1 = \{a_t \mid a_t \in \{D_1, D_2, D_3, D_4\}, D_i = (w_1, w_2)\}$$

 $D_1(up) = (0,0); D_2(down) = (0,1); D_3(left) = (1,0); D_4(right) = (1,1)$ 

**State Space:** State  $S_t$  consist of four core components (the flow valve location  $F_t$  of the path, the essential point location  $E_t$  of the path, the current location  $C_t$  of the path, and the path properties  $C_t^p$ )

$$S_1 = ig\{ s_t \mid s_t = [F_t, E_t, C_t, C_t^p], F_t = ig( f_t^x, f_t^y ig), E_t = ig( e_t^x, e_t^y ig), C_t = ig( c_t^x, c_t^y ig), C_t^p = ig( c_t^s, c_t^l, c_t^m, c_t^{md} ig) ig\},$$

 $C_t^s$ : Serial number of the control path;

 $C_t^l$ : Current path length;

 $C_t^m$ : Matching path identifier;

 $C_t^{md}$ : Matching degree of the control path 11

### **Details of the Proposed Routing Flow**

### **Reward Function**

**Reward Function** is formulated based on the following five types of points that the control path can pass through: 1) **important point (IP)**, 2) **obstacle point (OP)**, 3) **target point (TP)**, 4) **general point (GP)**, and 5) **shared point (SP)**.

$$r_t = egin{cases} p_r^1, & ext{if } C_{t+1} = ext{IP} \ n_r^1, & ext{if } C_{t+1} = ext{OP} \ R_{TP}(C_{t+1}), & ext{if } C_{t+1} = ext{OP} \ R_{GP}(C_{t+1}), & ext{if } C_{t+1} = ext{OP} \ R_{GP}(C_{t+1}) + p_r^2, & ext{otherwise} \end{cases}$$

$$R_{TP}(C_{t+1}) = egin{cases} p_r^3 + p_r^4, & ext{if } c_t^{md} \leq T_m ext{ and } c_t^m = 1 \ p_r^3, & ext{otherwise} \end{cases}$$

$$R_{GP}(C_{t+1}) = egin{cases} n_r^2, & ext{if } \operatorname{mdis}(E_t,C_{t+1}) < \operatorname{mdis}(E_t,C_t) \ n_r^3, & ext{if } \operatorname{mdis}(E_t,C_{t+1}) = \operatorname{mdis}(E_t,C_t) \ n_r^4, & ext{otherwise} \end{cases}$$



- Background
- Preliminary and Problem Formulation Mechanism of Control Logics Control Logic Considering Length Matching Problem Formulation
- Details of the Proposed Routing Flow
  - Overall Design Flow
  - Action Space and State Space
  - **Reward Function**
- Simulation Results
- Conclusion

Benchmarks and parameters used in experiments.

| Benchmarks (# $C_s$ ,# $N_c$ ,# $N_f$ ,# $N_g$ ) |                 |                                   |                      |  |  |  |  |  |  |  |
|--------------------------------------------------|-----------------|-----------------------------------|----------------------|--|--|--|--|--|--|--|
| R1: $(18 \times 3)$                              | 34, 78, 20, 7)  | R2: $(18 \times 43, 92, 26, 8)$   |                      |  |  |  |  |  |  |  |
| R3: $(18 \times 53)$                             | 3, 130, 31, 10) | $R4:(18 \times 59, 139, 28, 16)$  |                      |  |  |  |  |  |  |  |
| $R5:(21\times82)$                                | 2, 229, 50, 15) | R6: $(21 \times 98, 327, 52, 20)$ |                      |  |  |  |  |  |  |  |
| Parameters                                       |                 |                                   |                      |  |  |  |  |  |  |  |
| $T_m = 3$                                        | $p_r^1 = 20$    | $p_{r}^{2} = 2$                   | $p_r^3 = 12$         |  |  |  |  |  |  |  |
| $p_r^4 = 20$                                     | $n_r^1 = -40$   | $n_r^2 = -2$                      | $n_r^3 = -5$         |  |  |  |  |  |  |  |
| $n_r^4 = -10$                                    | $\eta = 0.99$   | E = 25000                         | $\varepsilon = 0.90$ |  |  |  |  |  |  |  |
| $N_b = 256$                                      | $U_p = 200$     | $U_t = 300$                       | $v_p = 10$           |  |  |  |  |  |  |  |

 $\#C_s$ : The chip area occupied by the control channel network

- $\#N_c$ : The number of control valves
- $#N_f$ : The number of flow valves
- $#N_q$ : The number of synchronized valve groups

 Comparison with a heuristic method called CRA in terms of the total timing skew, total control channel length, and the number of zero skew groups.

| Benchmark  | Total timing skew (s) |      |         | Total channel length (mm) |      |         | 16                     | CRA | Ours | Ours |    |    | 14 |
|------------|-----------------------|------|---------|---------------------------|------|---------|------------------------|-----|------|------|----|----|----|
| Deneminark | CRA                   | Ours | Imp (%) | CRA                       | Ours | Imp (%) | ] sd 14 -              |     |      |      |    | 1  | 3  |
| R1         | 0.8                   | 0.0  | 100.0   | 288                       | 251  | 12.8    | j ≝ <sub>60</sub> 12 - |     |      |      | 10 | )  |    |
| R2         | 0.6                   | 0.0  | 100.0   | 378                       | 311  | 17.7    |                        |     | 8    | 8    |    |    |    |
| R3         | 0.4                   | 0.1  | 75.0    | 481                       | 423  | 12.1    | 8 - 8 -                | 7   |      |      |    | 6  | 7  |
| R4         | 0.8                   | 0.3  | 62.5    | 526                       | 444  | 15.6    | fo fo fo               |     |      | 4    | 4  |    |    |
| R5         | 0.4                   | 0.1  | 75.0    | 757                       | 665  | 12.2    | aqui 4 -               | 2   | 3    |      |    |    |    |
| <b>R</b> 6 | 0.7                   | 0.1  | 85.7    | 976                       | 868  | 11.1    | ] <b>ヹ</b> 2 -         |     |      |      |    |    |    |
| Average    |                       |      | 83.0    |                           |      | 13.6    | ] 0 +                  | R1  | R2   | R3   | R4 | R5 | R6 |

The total timing skew is defined as the maximum timing skew among all synchronized valve groups

 Comparison with three other architectures, including DuelingDQN, DoubleDQN, and DQN.

| Bench   | D3QN    |         | DuelingDQN |         | Imp (%) |         | DoubleDQN |         | Imp (%) |         | DQN     |         | Imp (%) |         |
|---------|---------|---------|------------|---------|---------|---------|-----------|---------|---------|---------|---------|---------|---------|---------|
|         | $\#T_s$ | $\#C_l$ | $\#T_s$    | $\#C_l$ | $\#T_s$ | $\#C_l$ | $\#T_s$   | $\#C_l$ | $\#T_s$ | $\#C_l$ | $\#T_s$ | $\#C_l$ | $\#T_s$ | $\#C_l$ |
| R1      | 0.0     | 251     | 0.1        | 267     | 100.0   | 6.0     | 0.1       | 272     | 100.0   | 7.7     | 0.3     | 285     | 100.0   | 12.0    |
| R2      | 0.0     | 311     | 0.2        | 328     | 100.0   | 5.2     | 0.2       | 334     | 100.0   | 6.8     | 0.4     | 350     | 100.0   | 11.1    |
| R3      | 0.1     | 423     | 0.2        | 441     | 20.0    | 4.1     | 0.1       | 445     | 0.0     | 4.9     | 0.1     | 468     | 0.0     | 9.6     |
| R4      | 0.3     | 444     | 0.5        | 468     | 40.0    | 5.1     | 0.5       | 473     | 40.0    | 6.1     | 0.5     | 495     | 40.0    | 10.3    |
| R5      | 0.1     | 665     | 0.1        | 710     | 0.0     | 6.3     | 0.1       | 705     | 0.0     | 5.7     | 0.1     | 730     | 0.0     | 8.9     |
| R6      | 0.1     | 868     | 0.1        | 912     | 0.0     | 4.8     | 0.3       | 919     | 40.0    | 5.5     | 0.4     | 955     | 75.0    | 9.1     |
| Average |         |         |            |         | 43.2    | 5.3     |           |         | 46.7    | 6.1     |         |         | 52.5    | 10.2    |

 $#T_s$ : The total timing skew;

 $#C_{j}$ : The total length of control channels;

- Background
- Preliminary and Problem Formulation Mechanism of Control Logics Control Logic Considering Length Matching Problem Formulation
- Details of the Proposed Routing Flow
  - Overall Design Flow
  - Action Space and State Space
  - Reward Function
- Simulation Results
- Conclusion

- We formulate the control channel network construction problem considering length matching for the FPVA control logic.
- We develop the DRL routing flow based on D3QN to construct an efficient control channel network, so that the timing skews and the total length of control channels can be minimized simultaneously.
- Simulation results demonstrate that the proposed routing flow leads to a control logic with both excellent timing behavior and low chip cost.

# Thank you for your attention!

Speaker: Huayang Cai Fuzhou University, China Email: c\_huayang@163.com