Title | Leveraging the Error Resilience of Machine-Learning Applications for Designing Highly Energy Efficient Accelerators |
Author | *Zidong Du (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China), Avinash Lingamneni (Electrical and Computer Engineering, Rice University, U.S.A.), Yunji Chen (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China), Krishna Palem (Rice University, U.S.A.), Olivier Temam (INRIA, France), Chengyong Wu (State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China) |
Page | pp. 201 - 206 |
Keyword | Accelerator, Inexact computing, Hardware Neural Network |
Abstract | In recent years, inexact computing has been increasingly regarded as one of the most promising approaches for slashing energy consumption in many applications that can tolerate a certain degree of inaccuracy. Driven by the principle of trading tolerable amounts of application accuracy for significant resource savings (energy consumed, critical-path delay, and silicon area), this approach has been limited to ASICs so far. These ASIC realizations, as currently designed, have a narrow application scope and are often rigid in their tolerance to inaccuracy; that tolerance often determines the extent of the resource savings that can be achieved. In this paper, we propose to improve the application scope, error resilience, and energy savings of inexact computing by combining it with hardware neural networks. These neural networks are fast emerging as popular candidate accelerators for future heterogeneous multi-core platforms and have flexible error tolerance limits owing to their ability to be trained. Our results in 65nm technology demonstrate that the proposed inexact neural network accelerator could achieve 1.78x to 2.67x savings in energy consumption (with corresponding delay and area savings of 1.23x and 1.46x, respectively) compared to the existing baseline neural network implementation, at the cost of a small accuracy loss (MSE increases from 0.14 to 0.20 on average). |
Title | ArISE: Aging-Aware Instruction Set Encoding for Lifetime Improvement |
Author | *Fabian Oboril, Mehdi Tahoori (KIT, Germany) |
Page | pp. 207 - 212 |
Keyword | Opcode, Transistor Aging, Reliability, Decoder, Microprocessor Pipeline |
Abstract | Microprocessors fabricated at nanoscale nodes are exposed to accelerated transistor aging due to Bias Temperature Instability and Hot Carrier Injection. As a result, device delays increase over time, reducing Mean Time To Failure (MTTF). To address this challenge, many (micro)architectural techniques target the execution stage of the instruction pipeline. However, the decoding stages can also become aging-critical and limit the microprocessor lifetime. Therefore, we propose a novel aging-aware instruction set encoding methodology, which, for the FabScalar microprocessor, increases the MTTF of these stages by 2x with negligible implementation costs. |
Title | DRuiD: Designing Reconfigurable Architectures with Decision-Making Support |
Author | *Giovanni Mariani (Universita della Svizzera Italiana - ALaRI, Switzerland/Politecnico di Milano, Italy), Gianluca Palermo (Politecnico di Milano - DEIB, Italy), Roel Meeuws, Vlad-Mihai Sima (Delft Technical University, Netherlands), Cristina Silvano (Politecnico di Milano - DEIB, Italy), Koen Bertels (Delft Technical University, Netherlands) |
Page | pp. 213 - 218 |
Keyword | Reconfigurable architectures, Heterogeneous architectures, Machine learning, Random forest, FPGA |
Abstract | The development process for heterogeneous computing platforms requires a clear understanding of both application requirements and heterogeneous computing technologies. To support the development process, we propose a framework called DRuiD that learns the characteristics that make application functionalities suitable for certain computing elements. An expert system supports the designer in the mapping decision and gives hints on code modifications that could make the applications more suitable for a given computing element. |
Title | Edit Distance Based Instruction Merging Technique to Improve Flexibility of Custom Instructions Toward Flexible Accelerator Design |
Author | Hui Huang (University of California, Los Angeles, U.S.A.), *Taemin Kim, Yatin Hoskote (Intel Labs, U.S.A.) |
Page | pp. 219 - 224 |
Keyword | Instruction set extension, Flexibility, Application specific instruction set processor, System-On-a-Chip |
Abstract | Due to the ever-shortening time-to-market of a system-on-a-chip (SoC) and the increasing NRE cost of designing accelerators in the SoC, a design methodology for a flexible accelerator is desirable. We propose a novel technique to make custom instructions (CIs) of an application-specific instruction-set processor (ASIP) flexible. By doing so, CIs can support applications that were not considered at design time of the ASIP, which is difficult to do with a conventional CI design method. We show that custom instructions generated by our technique can support future applications up to 7x better than those from a conventional method. |