ASP-DAC 2011 Technical Program

The 16th Asia and South Pacific Design Automation Conference

Session 5C High-Level and Logic Synthesis
Time: 13:40 - 15:40 Thursday, January 27, 2011
Location: Room 414+415
Chairs: Kiyoung Choi (Seoul National University, Republic of Korea), Shigeru Yamashita (Ritsumeikan University, Japan)

5C-1 (Time: 13:40 - 14:10)

Title	An Efficient Hybrid Engine to Perform Range Analysis and Allocate Integer Bit-widths for Arithmetic Circuits
Author	*Yu Pang (Chongqing University of Posts and Telecommunications, China), Katarzyna Radecka, Zeljko Zilic (McGill University, Canada)
Page	pp. 455 - 460
Keyword	arithmetic circuits, range analysis, SMT, arithmetic transform, fixed-point synthesis
Abstract	Range analysis is an important step in obtaining correct, yet fast and inexpensive, arithmetic circuits. Traditional methods, whether simulation-based or static, suffer from low efficiency and coarse bounds, which may lead to unnecessary bits. In this paper, we propose a new method that combines several techniques to perform fixed-point range analysis in a datapath and to obtain much tighter ranges efficiently. We show that ranges and bit-width allocations can be obtained with better results than past methods, and in significantly shorter time.
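To illustrate why static range analysis can yield coarse bounds and unnecessary bits, here is a minimal sketch of plain interval arithmetic, one common static technique. It is not the hybrid engine proposed in the paper; the `Interval` class and the `integer_bits` helper are hypothetical names introduced only for this example, and a signed two's-complement integer format is assumed.

```python
class Interval:
    """Closed integer interval [lo, hi] with basic interval arithmetic."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(min(products), max(products))


def integer_bits(iv):
    """Smallest two's-complement width (sign bit included) that holds iv."""
    n = 1
    while not (-(1 << (n - 1)) <= iv.lo and iv.hi <= (1 << (n - 1)) - 1):
        n += 1
    return n


# y = (x + 1) * x - x * x simplifies to x, but interval arithmetic treats
# every occurrence of x as independent and overestimates the range.
x = Interval(-8, 7)
one = Interval(1, 1)
y = (x + one) * x - x * x

print(y.lo, y.hi, integer_bits(y))  # -128 112 8
print(integer_bits(x))              # 4, the width the true range needs
```

In this example the interval bound costs 8 integer bits where 4 suffice, which is the kind of over-allocation that tighter range analysis aims to avoid.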

5C-2 (Time: 14:10 - 14:40)

Title	Register Pressure Aware Scheduling for High Level Synthesis
Author	*Rami Beidas, Wai Sum Mong, Jianwen Zhu (University of Toronto, Canada)
Page	pp. 461 - 466
Keyword	Phase Coupling, Scheduling, Register Pressure, Area Optimization
Abstract	Variations of list scheduling have become the de facto standard for scheduling straight-line code in software compilers, a trend faithfully inherited by high-level synthesis solutions. By its nature, list scheduling is oblivious to the tightly coupled register pressure, a fundamental open problem that the compiler community has attacked for decades and which, in the case of high-level synthesis, results in excessive instantiation of registers and the accompanying steering logic. To alleviate this problem, we propose a synthesis framework called "soft scheduling", which acts as a resource-unconstrained pre-scheduling stage that restricts subsequent scheduling to minimize register pressure. This optimization objective is formulated as a live range minimization problem, a measure shown to be proportional to register pressure, and is solved optimally in polynomial time using a minimum cost network flow formulation. Unlike past solutions in the compiler community, which try to reduce register pressure by local serialization of subject instructions, the proposed solution operates on the entire basic block or hyperblock and systematically handles instruction chaining subject to the same objective. Applying the proposed solution to a set of real-life benchmarks reduces register pressure by between 11% and 41% on average, depending on the compilation and synthesis configurations, with a minor 2% to 4% increase in schedule latency.
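The link between live ranges and register pressure can be seen on a toy dataflow graph. The sketch below only measures the two quantities for a given schedule; it does not implement soft scheduling or the minimum cost network flow formulation. The function name `register_pressure`, the dictionary layout, and the convention that a value occupies a register from the step that produces it until the step of its last consumer are all assumptions made for illustration.

```python
def register_pressure(schedule, consumers):
    """Return (peak live values, total live range) for a schedule.

    schedule:  op -> control step in which the op executes
    consumers: op -> list of ops that read the op's result
    A value is counted as live across every step boundary between the
    step that produces it and the step of its last consumer.
    """
    n_boundaries = max(schedule.values())
    live = [0] * n_boundaries          # live[b]: values crossing boundary b -> b+1
    total_live_range = 0
    for op, users in consumers.items():
        if not users:
            continue                   # result is never read inside the block
        born = schedule[op]
        dies = max(schedule[u] for u in users)
        total_live_range += dies - born
        for boundary in range(born, dies):
            live[boundary] += 1
    return max(live, default=0), total_live_range


# Toy block: d = a + b; e = d + c, where a, b, c are produced by loads.
consumers = {"a": ["d"], "b": ["d"], "c": ["e"], "d": ["e"], "e": []}

asap = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 2}      # c loaded as early as possible
delayed = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}   # c loaded just before its use

print(register_pressure(asap, consumers))     # (3, 5)
print(register_pressure(delayed, consumers))  # (2, 4)
```

Both schedules have the same latency, but delaying the load of `c` shortens its live range and lowers the peak number of registers from 3 to 2.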

5C-3 (Time: 14:40 - 15:10)

Title	Parallel Cross-Layer Optimization of High-Level Synthesis and Physical Design
Author	*James Williamson (ECEE Dept., University of Colorado at Boulder, U.S.A.), Yinghai Lu (EECS Dept., Northwestern University, U.S.A.), Li Shang (ECEE Dept., University of Colorado at Boulder, U.S.A.), Hai Zhou (EECS Dept., Northwestern University, U.S.A.), Xuan Zeng (State Key Lab of ASIC & System, Microelectronics Dept., Fudan University, China)
Page	pp. 467 - 472
Keyword	cross-layer optimization, parallel CAD, GPGPU, heterogeneous architectures, parallelization
Abstract	Integrated circuit (IC) design automation has traditionally followed a hierarchical approach. The modern IC design flow is divided into sequentially addressed design and optimization layers, each successively finer in design detail and data granularity while increasing in computational complexity. Eventual agreement across the design layers signals design closure. Obtaining design closure is a continual problem, as lack of awareness and interaction between layers often results in multiple design flow iterations. In this work, we propose parallel cross-layer optimization, in which the boundaries between design layers are broken, allowing for a more informed and efficient exploration of the design space. We leverage the heterogeneous parallel computational power of current and upcoming multi-core/many-core computation platforms to suit the heterogeneous characteristics of multiple design layers. Specifically, we unify the high-level and physical synthesis design layers for parallel cross-layer IC design optimization. In addition, we introduce a massively parallel GPU floorplanner with local and global convergence tests as the proposed physical synthesis design layer. Our results show an average speed-up of 11X over the state of the art.

5C-4 (Time: 15:10 - 15:40)

Title	Network Flow-based Simultaneous Retiming and Slack Budgeting for Low Power Design
Author	Bei Yu, Sheqin Dong, *Yuchun Ma, Tao Lin, Yu Wang (Tsinghua University, China), Song Chen, Satoshi GOTO (Waseda University, Japan)
Page	pp. 473 - 478
Keyword	Retiming, Slack Budgeting, Network Flow, Low Power
Abstract	Low power design has become one of the most significant requirements as CMOS technology has entered the nanometer era. Timing budgeting is therefore often performed to slow down as many components as possible, so that timing slack can be used to reduce power consumption while maintaining the performance of the whole design. Retiming is a procedure that relocates flip-flops (FFs) across logic gates to achieve a faster clock speed. In this paper we show that the retiming and slack budgeting problem can be formulated as a convex cost dual network flow problem. Both the theoretical analysis and the experimental results show the efficiency of our approach, which not only reduces power consumption but also runs faster than previous work.
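As background on the retiming step alone, the sketch below applies the classical retiming relation, in which an edge u -> v carrying w registers carries w + r(v) - r(u) registers after retiming with integer lags r. It does not cover slack budgeting or the convex cost dual network flow formulation of the paper. The function names, the graph encoding, and the gate delays are hypothetical and chosen only for illustration; the code assumes every cycle keeps at least one register.

```python
from functools import lru_cache


def retime(edges, lag):
    """Apply retiming lags to a circuit graph.

    edges: {(u, v): number of flip-flops on the connection u -> v}
    lag:   {node: integer retiming lag r(node)}
    """
    retimed = {(u, v): w + lag[v] - lag[u] for (u, v), w in edges.items()}
    if any(w < 0 for w in retimed.values()):
        raise ValueError("illegal retiming: negative flip-flop count")
    return retimed


def clock_period(edges, delay):
    """Longest combinational delay, i.e. longest path over register-free edges.

    Assumes every cycle contains at least one register, so the graph of
    zero-weight edges is acyclic.
    """
    succ = {node: [] for node in delay}
    for (u, v), w in edges.items():
        if w == 0:
            succ[u].append(v)

    @lru_cache(maxsize=None)
    def longest_from(node):
        return delay[node] + max((longest_from(s) for s in succ[node]), default=0)

    return max(longest_from(node) for node in delay)


# Three gates in a loop with two flip-flops on the feedback connection.
delay = {"a": 1, "b": 2, "c": 3}
edges = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 2}

# Move one flip-flop from the output of c to its input.
retimed = retime(edges, {"a": 0, "b": 0, "c": 1})

print(clock_period(edges, delay))    # 6
print(clock_period(retimed, delay))  # 3
print(sum(retimed.values()))         # 2, the loop keeps its flip-flop count
```

Moving a single flip-flop halves the critical path in this toy loop while preserving the number of registers on the cycle.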