ASP-DAC 2014 Technical Program

The 19th Asia and South Pacific Design Automation Conference

Session 3A Synthesis and Exploration Techniques for Computing Platforms
Time: 15:50 - 17:30 Tuesday, January 21, 2014
Location: Room 300
Chairs: Sri Parameswaran (University of New South Wales, Australia), Kyle Rupnow (Nanyang Technological University, Singapore)

3A-1 (Time: 15:50 - 16:15)

Title	Leveraging the Error Resilience of Machine-Learning Applications for Designing Highly Energy Efficient Accelerators
Author	*Zidong Du (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China), Avinash Lingamneni (Electrical and Computer Engineering, Rice University, U.S.A.), Yunji Chen (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China), Krishna Palem (Rice University, U.S.A.), Olivier Temam (INRIA, France), Chengyong Wu (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China)
Page	pp. 201 - 206
Keyword	Accelerator, Inexact computing, Hardware Neuron Network
Abstract	In recent years, inexact computing has been increasingly regarded as one of the most promising approaches for slashing energy consumption in many applications that can tolerate a certain degree of inaccuracy. Driven by the principle of trading tolerable amounts of application accuracy in return for significant resource savings (in the energy consumed, the critical-path delay, and the silicon area), this approach has been limited to ASICs so far. These ASIC realizations have a narrow application scope and, as currently designed, are often rigid in their tolerance to inaccuracy; this tolerance often determines the extent of the resource savings that can be achieved. In this paper, we propose to improve the application scope, error resilience, and energy savings of inexact computing by combining it with hardware neural networks. These neural networks are fast emerging as popular candidate accelerators for future heterogeneous multi-core platforms and have flexible error tolerance limits owing to their ability to be trained. Our results in 65nm technology demonstrate that the proposed inexact neural network accelerator could achieve 1.78x to 2.67x savings in energy consumption (with corresponding delay and area savings of 1.23x and 1.46x, respectively) when compared to the existing baseline neural network implementation, at the cost of a small accuracy loss (MSE increases from 0.14 to 0.20 on average).

3A-2 (Time: 16:15 - 16:40)

Title	ArISE: Aging-Aware Instruction Set Encoding for Lifetime Improvement
Author	*Fabian Oboril, Mehdi Tahoori (KIT, Germany)
Page	pp. 207 - 212
Keyword	Opcode, Transistor Aging, Reliability, Decoder, Microprocessor Pipeline
Abstract	Microprocessors fabricated at nanoscale nodes are exposed to accelerated transistor aging due to Bias Temperature Instability and Hot Carrier Injection. As a result, device delays increase over time, reducing Mean Time To Failure (MTTF). To address this challenge, many (micro-)architectural techniques target the execution stage of the instruction pipeline. However, the decoding stages can also become aging-critical and limit the microprocessor lifetime. Therefore, we propose a novel aging-aware instruction set encoding methodology, which, in the case of the FabScalar microprocessor, increases the MTTF of these stages by 2x with negligible implementation costs.

3A-3 (Time: 16:40 - 17:05)

Title	DRuiD: Designing Reconfigurable Architectures with Decision-Making Support
Author	*Giovanni Mariani (Universita della Svizzera Italiana - ALaRI, Switzerland/Politecnico di Milano, Italy), Gianluca Palermo (Politecnico di Milano - DEIB, Italy), Roel Meeuws, Vlad-Mihai Sima (Delft Technical University, Netherlands), Cristina Silvano (Politecnico di Milano - DEIB, Italy), Koen Bertels (Delft Technical University, Netherlands)
Page	pp. 213 - 218
Keyword	Reconfigurable architectures, Heterogeneous architectures, Machine learning, Random forest, FPGA
Abstract	The development process for heterogeneous computing platforms requires a clear understanding of both application requirements and heterogeneous computing technologies. To support the development process, we propose a framework called DRuiD, capable of learning the characteristics that make application functionalities suitable for certain computing elements. An expert system supports the designer in the mapping decision and gives hints on possible code modifications that would make the applications more suitable for a given computing element.

3A-4 (Time: 17:05 - 17:30)

Title	Edit Distance Based Instruction Merging Technique to Improve Flexibility of Custom Instructions Toward Flexible Accelerator Design
Author	Hui Huang (University of California, Los Angeles, U.S.A.), *Taemin Kim, Yatin Hoskote (Intel Labs, U.S.A.)
Page	pp. 219 - 224
Keyword	Instruction set extension, Flexibility, Application specific instruction set processor, System-On-a-Chip
Abstract	Due to the ever-shortening time-to-market of a system-on-a-chip (SoC) and the increasing NRE cost of designing accelerators in the SoC, a design methodology for a flexible accelerator is desirable. We propose a novel technique to make the custom instructions (CIs) of an application-specific instruction-set processor (ASIP) flexible. By doing so, CIs can support applications that were not considered at design time of the ASIP, which is difficult to do with a conventional CI design method. We show that custom instructions generated by our technique can support future applications up to 7X better than those from a conventional method.