The 12th Asia and South Pacific Design Automation Conference Technical Program

Session 5C High-Level Synthesis
Time: 13:30 - 15:35 Thursday, January 25, 2007
Location: Room 414+415
Chairs: Ki-seok Chung (Hanyang Univ., Republic of Korea), Katsuharu Suzuki (NEC, Japan)

5C-1 (Time: 13:30 - 13:55)

Title	Optimization of Arithmetic Datapaths with Finite Word-Length Operands
Author	*Sivaram Gopalakrishnan, Priyank Kalla (University of Utah, United States), Florian Enescu (Georgia State University, United States)
Page	pp. 511 - 516
Keyword	Finite Integer Rings, Modulo Arithmetic
Abstract	This paper presents an approach to area optimization of arithmetic datapaths that perform polynomial computations over bit-vectors with finite widths. Examples of such designs abound in DSP for audio, video and multimedia computations where the input and output bit-vector sizes are dictated by the desired precision. A bit-vector of size m represents integer values reduced modulo 2^m (% 2^m). Therefore, finite word-length bit-vector arithmetic can be modeled as algebra over finite integer rings, where the bit-vector size dictates the ring cardinality. This paper demonstrates how the number-theoretic properties of finite integer rings can be exploited for optimization of bit-vector arithmetic. Along with an analytical model to estimate the implementation cost at RTL, two algorithms are presented to optimize bit-vector arithmetic. Experimental results, conducted within practical CAD settings, demonstrate significant area savings due to our approach.
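
The ring property this abstract relies on can be illustrated with a small sketch. This is not the authors' optimization algorithm, and the polynomials and function names are made up for the example. Over m-bit vectors, the polynomial 2^(m-1)·x·(x+1) evaluates to zero for every x, because x(x+1) is always even. Two polynomials that differ by such a vanishing polynomial therefore compute the same bit-vector function, and the one with the cheaper coefficients can be implemented instead.

```python
def eval_mod(coeffs, x, m):
    """Evaluate a polynomial modulo 2**m.

    coeffs lists non-negative integer coefficients, lowest degree first.
    """
    mask = (1 << m) - 1
    result = 0
    for c in reversed(coeffs):  # Horner's rule, reducing at every step
        result = (result * x + c) & mask
    return result


def equivalent(p, q, m):
    """Check by exhaustive evaluation that p and q agree on all m-bit inputs."""
    return all(eval_mod(p, x, m) == eval_mod(q, x, m) for x in range(1 << m))


# Hypothetical example for 3-bit vectors (arithmetic modulo 8).
M = 3
vanishing = [0, 4, 4]  # 4x + 4x^2 = 4x(x+1), zero for every x modulo 8
original = [1, 5, 7]   # 1 + 5x + 7x^2
reduced = [1, 1, 3]    # 1 + x + 3x^2, i.e. original minus vanishing

print(equivalent(original, reduced, M))      # same function on 3-bit inputs
print(equivalent(original, reduced, M + 1))  # no longer equal on 4-bit inputs
```

The second check shows that the equivalence depends on the word length: the same pair of polynomials differs once the ring is enlarged to modulo 16.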

5C-2 (Time: 13:55 - 14:20)

Title	Exploiting Power-Area Tradeoffs in Behavioural Synthesis through Clock and Operations Throughput Selection
Author	*Marco A. Ochoa-Montiel, Bashir M. Al-Hashimi (University of Southampton, Great Britain), Peter Kollig (Philips Semiconductors, Great Britain)
Page	pp. 517 - 522
Keyword	High Level Synthesis, low power
Abstract	This paper describes a new dynamic-power aware High Level Synthesis (HLS) data path approach that considers the close interrelation between clock choice and operations throughput selection whilst attempting to minimize area, power, or a combination thereof. It is shown that the proposed approach, with its compound cost function and its novel clock and operations throughput selection algorithm, obtains solutions with lower power and area than previous relevant work [11]. Moreover, different power-area tradeoffs can be explored through the appropriate choice of clock period and operations throughput using our novel approach.

5C-3 (Time: 14:20 - 14:45)

Title	A Parameterized Architecture Model in High Level Synthesis for Image Processing Applications
Author	*Yazhuo Dong, Yong Dou (National University of Defense Technology, China)
Page	pp. 523 - 528
Keyword	high level synthesis, image processing, data reuse
Abstract	Most image processing applications are computationally intensive and data intensive. Reconfigurable hardware boards provide a convenient and flexible solution to speed up these algorithms. To get a high performance design without going through the time-consuming hardware design process for each different algorithm, we present a universal parameterized architecture in high level synthesis to generate the hardware frames for all image processing applications automatically. The values of the parameters that determine the target architecture can be obtained from the compiler. The algorithm for obtaining these parameters is also discussed in this paper.

5C-4 (Time: 14:45 - 15:10)

Title	High-Level Power Estimation and Low-Power Design Space Exploration for FPGAs
Author	*Deming Chen (University of Illinois at Urbana-Champaign, United States), Jason Cong, Yiping Fan, Zhiru Zhang (University of California, Los Angeles, United States)
Page	pp. 529 - 534
Keyword	high-level synthesis, low-power, FPGA, power estimation
Abstract	In this paper, we present a simultaneous resource allocation and binding algorithm for FPGA power minimization. To fully validate our methodology and results, our work targets a real FPGA architecture - Altera Stratix FPGA [2], which includes generic logic elements, DSP cores, and memories. We design a high-level power estimator for this architecture and evaluate its estimation accuracy against a commercial gate-level power estimator - Quartus II PowerPlay Analyzer [1]. During the synthesis stage, we pay special attention to interconnects and multiplexers. We concentrate on resource allocation and binding tasks because they are the key steps that determine the interconnections. We use a novel approach to explore the design space during synthesis. It forms, propagates, and prunes synthesis solution points, where each solution point represents one actual implementation of the datapath. Eventually, we generate a design solution curve, which can provide ideal solution points with low power and high performance. Experimental results show that our high-level power estimator is 8.7% away from PowerPlay Analyzer. Meanwhile, we are able to achieve a significant amount of power reduction (32%) with better circuit speed (16%) compared to a traditional resource allocation and binding algorithm.
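
The pruning of solution points into a design solution curve can be pictured as a Pareto filter. The sketch below is a generic illustration under the assumption that each point is a (power, delay) pair and that lower is better for both. It is not the authors' algorithm, and the function name and sample values are hypothetical.

```python
def prune(points):
    """Keep only the (power, delay) points not dominated by another point.

    A point is dominated when some other point is no worse in both
    power and delay. Lower values are assumed to be better for both.
    """
    frontier = []
    best_delay = float("inf")
    # Visit points by increasing power (ties broken by increasing delay).
    # A point survives only if it is strictly faster than every cheaper point.
    for power, delay in sorted(set(points)):
        if delay < best_delay:
            frontier.append((power, delay))
            best_delay = delay
    return frontier


# Hypothetical candidate implementations of one datapath.
candidates = [(10, 5), (8, 7), (12, 4), (9, 7), (11, 6)]
print(prune(candidates))  # [(8, 7), (10, 5), (12, 4)]
```

Sorting first keeps the filter at O(n log n), which matters if pruning is repeated each time points are propagated.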

5C-5 (Time: 15:10 - 15:35)

Title	Numerical Function Generators Using Edge-Valued Binary Decision Diagrams
Author	*Shinobu Nagayama (Hiroshima City University, Japan), Tsutomu Sasao (Kyushu Institute of Technology, Japan), Jon Butler (Naval Postgraduate School, United States)
Page	pp. 535 - 540
Keyword	edge-valued BDD, non-uniform segmentation, piecewise polynomial approximation, numerical function generator, FPGA
Abstract	In this paper, we introduce the edge-valued binary decision diagram (EVBDD) to reduce the memory and delay in numerical function generators (NFGs). An NFG realizes a function, such as a trigonometric, logarithmic, square root, or reciprocal function, in hardware. NFGs are important in, for example, digital signal applications, where high speed and accuracy are necessary. We use the EVBDD to produce a fast and compact segment index encoder (SIE) that is a key component in our NFG. We compare our approach with NFG designs based on multi-terminal BDDs (MTBDDs), and show that the EVBDD produces SIEs that have, on average, only 7% of the memory and 40% of the delay of those designed using MTBDDs. Therefore, our NFGs based on EVBDDs have, on average, only 38% of the memory and 59% of the delay of NFGs based on MTBDDs.
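
The role of the segment index encoder can be shown with a software sketch of an NFG based on non-uniform segmentation. It makes several assumptions: the function is the square root on [0, 1], each segment uses a linear (degree 1) approximation through its endpoints, and the segment boundaries are chosen by hand. A binary search stands in for the SIE; the paper builds the SIE from an EVBDD, which this sketch does not model.

```python
import bisect
import math

# Hypothetical non-uniform segment boundaries for sqrt on [0, 1]:
# segments are narrower near 0, where the curve bends most.
BOUNDS = [0.0, 0.0625, 0.25, 0.5, 1.0]


def fit_linear(f, lo, hi):
    """Return (slope, intercept) of the line through (lo, f(lo)) and (hi, f(hi))."""
    slope = (f(hi) - f(lo)) / (hi - lo)
    return slope, f(lo) - slope * lo


# Coefficient table: one (slope, intercept) pair per segment.
COEFFS = [fit_linear(math.sqrt, lo, hi) for lo, hi in zip(BOUNDS, BOUNDS[1:])]


def segment_index(x):
    """Software stand-in for the SIE: map an input to its segment number."""
    index = bisect.bisect_right(BOUNDS, x) - 1
    return min(index, len(COEFFS) - 1)  # x == 1.0 belongs to the last segment


def nfg(x):
    """Approximate sqrt(x): look up the segment, then evaluate its line."""
    slope, intercept = COEFFS[segment_index(x)]
    return slope * x + intercept


for x in (0.01, 0.1, 0.3, 0.9):
    print(x, segment_index(x), nfg(x), math.sqrt(x))
```

The coefficient table holds one entry per segment, so the segment count drives the memory of this stage, while the index lookup sits on the path from input to output.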