Sponsors:
ACM SIGDA, IEEE CASS, IEEE CEDA, Singapore Chapter of IEEE CASS, SSIA

Special Sessions

Date: January 21-23, 2014
Place: Room 302

No.	Date/Time	Title
1S-1	10:40 - 11:05 Tuesday, January 21, 2014	Normally-Off Computing Project: Challenges and Opportunities
1S-2	11:05 - 11:30 Tuesday, January 21, 2014	Novel Nonvolatile Memory Hierarchies to Realize "Normally-Off Mobile Processors"
1S-3	11:30 - 11:55 Tuesday, January 21, 2014	Normally-Off MCU Architecture for Low-Power Sensor Node
1S-4	11:55 - 12:20 Tuesday, January 21, 2014	Normally-Off Technologies for Healthcare Appliance
2S-1	13:50 - 14:20 Tuesday, January 21, 2014	Applying VLSI EDA to Energy Distribution System Design
2S-2	14:20 - 14:50 Tuesday, January 21, 2014	A Model-Based Design of Cyber-Physical Energy Systems
2S-3	14:50 - 15:20 Tuesday, January 21, 2014	The Data Center as a Grid Load Stabilizer
3S-1	15:50 - 16:20 Tuesday, January 21, 2014	A Silicon Nanodisk Array Structure Realizing Synaptic Response of Spiking Neuron Models with Noise
3S-2	16:20 - 16:50 Tuesday, January 21, 2014	Energy Efficient In-Memory Machine Learning for Data Intensive Image-Processing by Non-Volatile Domain-Wall Memory
3S-3	16:50 - 17:20 Tuesday, January 21, 2014	Lessons from the Neurons Themselves
4S-1	10:10 - 10:40 Wednesday, January 22, 2014	SDG2KPN: System Dependency Graph to Function-Level KPN Generation of Legacy Code for MPSoCs
4S-2	10:40 - 11:10 Wednesday, January 22, 2014	Low Power Design of the Next-Generation High Efficiency Video Coding
4S-3	11:10 - 11:40 Wednesday, January 22, 2014	Mapping Complex Algorithm into FPGA with High Level Synthesis
4S-4	11:40 - 12:10 Wednesday, January 22, 2014	Leveraging Parallelism in the Presence of Control Flow on CGRAs
5S-1	13:50 - 14:20 Wednesday, January 22, 2014	Soft Error Resiliency Characterization on IBM BlueGene/Q Processor
5S-2	14:20 - 14:50 Wednesday, January 22, 2014	Resiliency for Many-Core System on a Chip
5S-3	14:50 - 15:20 Wednesday, January 22, 2014	Rethinking Error Injection for Effective Resilience
6S-1	15:50 - 16:20 Wednesday, January 22, 2014	Accurate and Inexpensive Performance Monitoring for Variability-Aware Systems
6S-2	16:20 - 16:50 Wednesday, January 22, 2014	Quantifying Workload Dependent Reliability in Embedded Processors
6S-3	16:50 - 17:20 Wednesday, January 22, 2014	QED Post-Silicon Validation and Debug: Frequently Asked Questions
7S-1	10:10 - 10:40 Thursday, January 23, 2014	Spiking Brain Models: Computation, Memory and Communication Constraints for Custom Hardware Implementation
7S-2	10:40 - 11:10 Thursday, January 23, 2014	Advanced Technologies for Brain-Inspired Computing
7S-3	11:10 - 11:40 Thursday, January 23, 2014	GPGPU Accelerated Simulation and Parameter Tuning for Neuromorphic Applications
7S-4	11:40 - 12:10 Thursday, January 23, 2014	A Scalable Custom Simulation Machine for the Bayesian Confidence Propagation Neural Network Model of the Brain
8S-1	13:50 - 14:15 Thursday, January 23, 2014	An Overview of Spin-Based Integrated Circuits
8S-2	14:15 - 14:40 Thursday, January 23, 2014	Advances in Spintronics Devices for Microelectronics - from Spin-Transfer Torque to Spin-Orbit Torque
8S-3	14:40 - 15:05 Thursday, January 23, 2014	Hybrid CMOS/Magnetic Process Design Kit and SOT-Based Non-Volatile Standard Cell Architectures
8S-4	15:05 - 15:30 Thursday, January 23, 2014	Architectural Aspects in Design and Analysis of SOT-Based Memories
9S-1	15:50 - 16:30 Thursday, January 23, 2014	The Role of Photons in Cryptanalysis
9S-2	16:30 - 17:10 Thursday, January 23, 2014	SPADs for Quantum Random Number Generators and Beyond
9S-3	17:10 - 17:50 Thursday, January 23, 2014	Quantum Key Distribution with Integrated Optics

Session 1S
Special Session: Normally-Off Computing: Towards Zero Stand-by Power Management
Time: 10:40 - 12:20 Tuesday, January 21, 2014
Organizer: Hiroshi Nakamura (University of Tokyo, Japan)

1S-1 (Time: 10:40 - 11:05)

Title	(Invited Paper) Normally-Off Computing Project: Challenges and Opportunities
Author	*Hiroshi Nakamura, Takashi Nakada, Shinobu Miwa (The University of Tokyo, Japan)
Abstract	Normally-Off is a way of computing which aggressively powers off components of computer systems when they need not operate. Simple power gating cannot fully exploit the opportunities for power reduction because volatile memories lose data when power is turned off. Recently, new non-volatile memories (NVMs) have appeared, and normally-off computing using these NVMs has attracted considerable attention. In this paper, we address its prospects and challenges, with a brief introduction to our project, which started in 2011.

Slides

1S-2 (Time: 11:05 - 11:30)

Title	(Invited Paper) Novel Nonvolatile Memory Hierarchies to Realize "Normally-Off Mobile Processors"
Author	*Shinobu Fujita, Kumiko Nomura, Hiroki Noguchi, Susumu Takeda, Keiko Abe (Toshiba Corporation, Japan)
Abstract	This paper presents a novel processor architecture for high-performance (HP) processors with nonvolatile/volatile hybrid cache memory. Simulations of an HP processor using MTJs show that the total power of an HP processor using perpendicular (p-)STT-MRAM can be reduced by over 90% with little degradation of processor performance. The presented architecture with a nonvolatile memory hierarchy will realize “normally-off computers”.

Slides

1S-3 (Time: 11:30 - 11:55)

Title	(Invited Paper) Normally-Off MCU Architecture for Low-Power Sensor Node
Author	*Masanori Hayashikoshi, Yohei Sato, Hiroshi Ueki, Hiroyuki Kawai, Toru Shimizu (Renesas Electronics Corporation, Japan)
Abstract	The production volume of sensor nodes is increasing greatly with the development of cyber-physical systems. Therefore, reducing the power consumption of huge numbers of sensor nodes becomes important. In this work, a normally-off microcontroller architecture for future low-power sensor nodes is proposed. To realize true low-power effects with normally-off computing technology, co-design of hardware and software technology is very important. In this work, the power consumption of sensor nodes can be reduced by around 70%.

Slides

1S-4 (Time: 11:55 - 12:20)

Title	(Invited Paper) Normally-Off Technologies for Healthcare Appliance
Author	*Shintaro Izumi, Hiroshi Kawaguchi, Masahiko Yoshimoto (Kobe University, Japan), Yoshikazu Fujimori (Rohm, Japan)
Abstract	Battery mass and power consumption of wearable systems must be reduced because the key factors affecting wearable system usability are miniaturization and weight reduction. This report describes a wearable biosignal monitoring system using normally-off technologies to minimize the power consumption. We focus especially on daily-life monitoring and an electrocardiograph processor. Our system employs FeRAM and Near Field Communication (NFC). A robust heart rate monitor and a Cortex M0 core are used for on-node processing to reduce logged data.

Slides

Session 2S
Special Session: EDA for Energy
Time: 13:50 - 15:30 Tuesday, January 21, 2014
Organizer: Fadi Kurdahi (University of California, Irvine, U.S.A.), Sani Nassif (IBM Austin Research Lab, U.S.A.), Mohammad Al Faruque (University of California, Irvine, U.S.A.)

2S-1 (Time: 13:50 - 14:20)

Title	(Invited Paper) Applying VLSI EDA to Energy Distribution System Design
Author	*Sani Nassif, Gi-Joon Nam, Jerry Hayes (IBM Austin Research Laboratory, U.S.A.), Sani Fakhouri (University of California, Irvine, U.S.A.)
Abstract	Energy distribution networks refer to that part of the electricity network that delivers power to homes and businesses. It is reported that significant amounts of energy are being wasted simply due to inefficiencies in this network. Further, this domain is rapidly changing with new types of loads such as electric vehicles and the spread of new types of energy sources such as photo-voltaic and wind. In this paper, we demonstrate a comprehensive design automation capability for energy distribution networks leading to a much more flexible yet effective system. The new system's capabilities include power load distribution and transfers, equipment upgrading, geospatial-aware network optimization, outage identification, contingency planning and loss analysis/reduction. These features are enabled by advanced simulation, analysis and optimization engines that are adapted from those available in the traditional VLSI design automation area. The paper will conclude with potential future research directions that require further innovations in energy distribution networks.

2S-2 (Time: 14:20 - 14:50)

Title	(Invited Paper) A Model-Based Design of Cyber-Physical Energy Systems
Author	Mohammad Abdullah Al Faruque, *Fereidoun Ahourai (University of California, Irvine, U.S.A.)
Abstract	Cyber-Physical Energy Systems (CPES) are an amalgamation of power grid technology and the intelligent communication and co-ordination between the supply and the demand side through distributed embedded computing. Through this combination, CPES are intended to deliver power efficiently, reliably, and economically. The design and development work needed to either implement a new power grid network or upgrade a traditional power grid to a CPES-compliant one is both challenging and time consuming due to the heterogeneous nature of the associated components/subsystems. The Model Based Design (MBD) methodology has been widely seen as a promising solution to address the associated design challenges of creating a CPES. In this paper, we demonstrate an MBD method and its associated tool for the purpose of designing and validating various control algorithms for a residential microgrid. Our presented co-simulation engine GridMat is a MATLAB/Simulink toolbox; its purpose is to co-simulate the power systems modeled in GridLAB-D as well as the control algorithms that are modeled in Simulink. We present various use cases to demonstrate how different levels of control algorithms may be developed, simulated, debugged, and analyzed by using our GridMat toolbox for a residential microgrid.

2S-3 (Time: 14:50 - 15:20)

Title	(Invited Paper) The Data Center as a Grid Load Stabilizer
Author	Hao Chen, Michael C. Caramanis, *Ayse K. Coskun (Boston University, U.S.A.)
Abstract	To accommodate the increasing presence of volatile and intermittent renewable energy sources in power generation, independent system operators (ISOs) offer opportunities for demand side regulation service (RS) so as to stabilize the grid load. These power market features allow the demand side to earn monetary credits by modulating its power consumption dynamically following an RS signal broadcast by the ISO. This paper studies the capacities and benefits of a major potential demand side, the data center, to provide RS. We propose a dynamic control policy that modulates the data center power consumption in response to ISO requests by leveraging server power capping techniques and various server power states. Results demonstrate that using our policy, data centers can provide fast reserves in quantities that are substantial proportions (around 50%) of their average energy consumption, with no major deterioration in quality of service (QoS). By doing so, data centers decrease their energy costs by around 50%, while providing the ISOs and society in general with cost-effective demand side reserves that render massive renewable generation adoption affordable.

Slides

Session 3S
Special Session: Neuron Inspired Computing using Nanotechnology
Time: 15:50 - 17:30 Tuesday, January 21, 2014
Organizer: Kevin Cao (Arizona State University, U.S.A.), Sarma Vrudhula (Arizona State University, U.S.A.)

3S-1 (Time: 15:50 - 16:20)

Title	(Invited Paper) A Silicon Nanodisk Array Structure Realizing Synaptic Response of Spiking Neuron Models with Noise
Author	*Takashi Morie, Haichao Liang, Yilai Sun, Takashi Tohara (Kyushu Institute of Technology, Japan), Makoto Igarashi, Seiji Samukawa (Tohoku University, Japan)
Abstract	In the implementation of spiking neuron models, which can achieve realistic neuron operation, generation of post-synaptic potentials (PSPs) is an essential function. We have already proposed a new nanodisk array structure for generating PSPs using delay in electron hopping among nanodisks. Generated PSPs have fluctuation caused by stochastic electron movement. Noise or fluctuation is effectively used in neural processing. In this paper, we review our proposed structure and show fluctuation controllability based on single-electron circuit simulation.

3S-2 (Time: 16:20 - 16:50)

Title	(Invited Paper) Energy Efficient In-Memory Machine Learning for Data Intensive Image-Processing by Non-Volatile Domain-Wall Memory
Author	*Hao Yu, Yuhao Wang, Shuai Chen, Wei Fei (Nanyang Technological University, Singapore), Chuliang Weng, Junfeng Zhao, Zhulin Wei (Huawei Shannon Laboratory, China)
Abstract	Image processing in conventional logic-memory I/O-integrated systems will incur significant communication congestion at memory I/Os for excessively large image data at exa-scale. This paper explores in-memory machine learning on a neural network architecture utilizing the newly introduced domain-wall nanowire, called DW-NN. We show that all operations involved in machine learning on a neural network can be mapped to a logic-in-memory architecture by non-volatile domain-wall nanowire. Domain-wall nanowire based logic is customized for machine learning within image data storage. As such, both neural network training and processing can be performed locally within the memory. The experimental results show that system throughput in DW-NN is improved by 11.6x and the energy efficiency is improved by 92x when compared to a conventional image processing system.

Slides

3S-3 (Time: 16:50 - 17:20)

Title	(Invited Paper) Lessons from the Neurons Themselves
Author	*Louis Scheffer (Howard Hughes Medical Institute, U.S.A.)
Abstract	Natural neural circuits, optimized by millions of years of evolution, are fast, low power, and robust, all characteristics we would love to have in systems we ourselves design. Recently there have been enormous advances in understanding how neurons implement computations within the brains of living creatures. Can we use this new-found knowledge to create better artificial systems? What lessons can we learn from the neurons themselves that can help us create better neuromorphic circuits?

Slides

Session 4S
Special Session: Design Automation Methods for Highly-Complex Multimedia Systems
Time: 10:10 - 12:15 Wednesday, January 22, 2014
Organizer: Sri Parameswaran (University of New South Wales, Australia)

4S-1 (Time: 10:10 - 10:40)

Title	(Invited Paper) SDG2KPN: System Dependency Graph to Function-Level KPN Generation of Legacy Code for MPSoCs
Author	Jude Angelo Ambrose, Jorgen Peddersen (University of New South Wales, Australia), Alvin Labios, Yusuke Yachide (Canon Information Systems Research Australia (CiSRA), Australia), *Sri Parameswaran (University of New South Wales, Australia)
Abstract	The Multiprocessor System-on-Chip (MPSoC) paradigm as a viable implementation platform for parallel processing has expanded to encompass embedded devices. The ability to execute code in parallel gives MPSoCs the potential to achieve high performance with low power consumption. In order for sequential legacy code to take advantage of the MPSoC design paradigm, it must first be partitioned into data flow graphs (such as Kahn Process Networks --- KPNs) to ensure the data elements can be correctly passed between the separate processing elements that operate on them. Existing techniques are inadequate for use in complex legacy code. This paper proposes SDG2KPN, a System Dependency Graph to KPN conversion methodology targeting the conversion of legacy code. By creating KPNs at the granularity of the function-/procedure-level, SDG2KPN is the first of its kind to support shared and global variables as well as many more program patterns/application types. We also provide a design flow which allows the creation of MPSoC systems utilizing the produced KPNs. We demonstrate the applicability of our approach by retargeting several sequential applications to the Tensilica MPSoC framework. Our system parallelized AES, an application of 950 lines, in 4.8 seconds, while H.264, of 57896 lines, took 164.9 seconds to parallelize.

Slides

4S-2 (Time: 10:40 - 11:10)

Title	(Invited Paper) Low Power Design of the Next-Generation High Efficiency Video Coding
Author	*Muhammad Shafique, Jörg Henkel (Karlsruhe Institute of Technology, Germany)
Abstract	This paper provides a comprehensive analysis of the computational complexity, temperature, and memory access behavior for the next-generation High Efficiency Video Coding (HEVC) standard. We highlight the associated design challenges and present several low-power algorithmic and architectural techniques for developing power-efficient HEVC-based multimedia systems. We explore the interplay between the algorithms and architectures to provide high power efficiency while leveraging the application-specific knowledge and video content characteristics.

Slides

4S-3 (Time: 11:10 - 11:40)

Title	(Invited Paper) Mapping Complex Algorithm into FPGA with High Level Synthesis
Author	*Kazutoshi Wakabayashi, Takashi Takenaka, Hiroaki Inoue (NEC Corp., Japan)
Abstract	This presentation compares “Reconfigurable Chip with High Level Synthesis” and “CPU, GPGPU with compiler such as CUDA” from the compiler perspective. Initially, we introduce several demands for acceleration with FPGA to achieve low latency calculation and control. As an application example, we show High Frequency Trading. We accelerate it with an FPGA NIC using C-based and SQL-based HLS, and show the necessity of a reconfigurable chip customizable with a high level language. Then, we illustrate the difference between FPGA and processor (CPU, GPGPU) with the “FSM+Datapath” model and examine how the architectural difference affects delay and parallelism of operations. Next, we discuss parallelization of operations and threads with High Level Synthesis for FPGA and software compilers for processors. The main advantage of the former method is that it is able to parallelize operations beyond control dependencies, while the latter method has to obey control dependencies. Finally, some experimental results show that “FPGA and HLS” achieves better performance than a processor for control-intensive algorithms.

4S-4 (Time: 11:40 - 12:10)

Title	(Invited Paper) Leveraging Parallelism in the Presence of Control Flow on CGRAs
Author	Jihyun Ryoo, Kyuseung Han, *Kiyoung Choi (Seoul National University, Republic of Korea)
Abstract	Coarse-Grained Reconfigurable Architectures (CGRAs) are suitable for accelerating data-intensive applications in embedded systems due to high performance and power efficiency. However, as application programs become more complex and contain more control flow, it becomes harder to accelerate such programs on CGRAs. Previous research on this issue has focused on correct execution of control flows rather than their acceleration. This paper reveals how control flows degrade the performance of programs and proposes software approaches to accelerating control flows by exploiting parallelism residing in each conditional as well as among conditionals. Experiments show that our proposed techniques improve performance by 2.51 times on average.

Slides

Session 5S
Special Session: Billion Chips of Trillion Transistors
Time: 13:50 - 15:30 Wednesday, January 22, 2014
Organizer: Chen-Yong Cher (IBM TJ Watson Research Center, U.S.A.)

5S-1 (Time: 13:50 - 14:20)

Title	(Invited Paper) Soft Error Resiliency Characterization on IBM BlueGene/Q Processor
Author	*Chen-Yong Cher, K. Paul Muller, Ruud A. Haring, David L. Satterfield, Thomas E. Musta, Thomas M. Gooding, Kristan D. Davis, Marc B. Dombrowa, Gerard V. Kopcsay, Robert M. Senger, Yutaka Sugawara, Krishnan Sugavanam (IBM T. J. Watson Research Center, U.S.A.)
Abstract	Fault injection through accelerated irradiation is an effective way to evaluate the overall soft error resiliency of microprocessors. In this work, we report on irradiation experiments on a Blue Gene/Q (BG/Q) compute processor chip running selected applications. Blue Gene/Q is the third generation of IBM’s massively parallel, energy efficient Blue Gene series of supercomputers. In the experiments, we found 26 code fails that are relevant for the calculation of the mean-time-between-failures (MTBF) for a 20 PetaFLOP, 96 rack system running a comparable workload mix. The expected MTBF for check-stops due to cosmic radiation and alpha particles from chip packaging materials is calculated to be 51 days for sea-level at New York City running the application mix studied. If the most vulnerable application is run exclusively, the projected MTBF is 35 days. These are outstanding results for a machine of this magnitude. The beaming experiment and projected MTBF validate the necessity to include autonomous hardware detection and recovery at the cost of design effort, silicon area and power.

5S-2 (Time: 14:20 - 14:50)

Title	(Invited Paper) Resiliency for Many-Core System on a Chip
Author	*Tanay Karnik, James Tschanz, Nitin Borkar, Jason Howard, Sriram Vangal, Vivek De, Shekhar Borkar (Intel Corporation, U.S.A.)
Abstract	Resilient techniques are commonly employed for dynamic and static variation tolerance. In this paper, we present an adaptive clocking technique that achieves 31% throughput increase with 15% energy reduction, and an adaptive interconnect fabric technique that increases bandwidth by 63% with 14.6% energy reduction. We also discuss variations in many-core microprocessors and some techniques to enable a resilient many-core system on a chip.

5S-3 (Time: 14:50 - 15:20)

Title	(Invited Paper) Rethinking Error Injection for Effective Resilience
Author	Shahrzad Mirkhani (University of Texas, U.S.A.), Hyungmin Cho, Subhasish Mitra (Stanford University, U.S.A.), *Jacob Abraham (University of Texas, U.S.A.)
Abstract	Soft errors, caused by radiation, have become a major challenge in today’s computer systems and networking equipment, making it imperative that systems be designed to be resilient to errors. Error injection is a powerful approach to evaluate system resilience, and current practice is to inject errors in architectural registers of processors, program variables of applications, or storage elements in the hardware model. This paper, using answers to frequently asked questions, discusses the need for rethinking conventional approaches to error injection, showing data from recent research and our simulation results. Approaches to improving current error injections are also suggested.

Slides

Session 6S
Special Session: Overcoming Major Silicon Bottlenecks: Variability, Reliability, Validation and Debug
Time: 15:50 - 17:30 Wednesday, January 22, 2014
Organizer: Subhasish Mitra (Stanford University, U.S.A.)

6S-1 (Time: 15:50 - 16:20)

Title	(Invited Paper) Accurate and Inexpensive Performance Monitoring for Variability-Aware Systems
Author	Liangzhen Lai, *Puneet Gupta (UCLA, U.S.A.)
Abstract	Designing reliable integrated systems has become a major challenge with shrinking geometries, increasing fault rates and devices which age substantially in their usage life. The proposed research is motivated by the observation that many of the in-field failures are delay failures and several variability signatures are also delay-related. The origins of temporal delay fluctuations include manufacturing variability, voltage/temperature changes, negative or positive bias temperature instability-related Vth degradation, etc. Since the actual delay changes depend on process variations as well as workload, on-chip monitoring may be the best way of predicting them. There is a need to monitor circuit performance during manufacturing as well as at runtime to predict achievable performance and warn against impending failures. Adaptive mechanisms in hardware and/or software can optimize the trade-off between errors, energy and performance based on the feedback from runtime circuit performance monitors. This paper presents approaches for automated synthesis of design-dependent performance monitors. These monitors can be used to predict impending delay failures relatively inexpensively. For low-overhead monitoring, we propose multiple design-dependent ring oscillators (DDROs) as smart canary structures which can reliably predict achievable chip frequency but with margins for local variations. Early silicon results indicate that DDROs can reduce delay monitoring error by 35% compared to conventional ring oscillators. To further improve the prediction (albeit at a higher overhead), we propose in-situ slack monitors (SlackProbe) which can match local variations as well, at overheads much smaller than monitoring all sequential elements. SlackProbe reduces the number of monitors required by over 15X with 5% additional delay margin in several commercial processor benchmarks. Finally, we show an example of a software testbed that demonstrates a variability-aware system that utilizes the hardware monitors and operates with both hardware and software adaptation.

6S-2 (Time: 16:20 - 16:50)

Title	(Invited Paper) Quantifying Workload Dependent Reliability in Embedded Processors
Author	*Vikas Chandra (ARM, U.S.A.)
Abstract	With nearly three decades of continued CMOS scaling, the devices have now been pushed to their physical and reliability limits. Scaling to sub-20nm technology nodes changes the nature of reliability effects from abrupt functional problems to progressive degradation of the performance characteristics of devices and system components. The impact of unreliability results in time-dependent variability, directly translating into design uncertainty in manufactured chips. Further, application workloads can significantly affect the overall system reliability. In this work, we have analyzed aging effects on various design hierarchies of an embedded processor in 28nm running real-world applications. We have also quantified the dependencies of aging effects on switching-activity and power-state of workloads. Implementation results show that the processor timing degradation can vary from 2% to 11%, depending on the workload.

6S-3 (Time: 16:50 - 17:20)

Title	(Invited Paper) QED Post-Silicon Validation and Debug: Frequently Asked Questions
Author	David Lin, *Subhasish Mitra (Stanford University, U.S.A.)
Abstract	During post-silicon validation and debug, one or more manufactured integrated circuits (ICs) are tested in actual system environments to detect and fix design flaws (bugs). According to several industrial reports, the costs of post-silicon validation and debug are rising faster than design costs. Hence, new systematic techniques are essential to overcome the rising costs of existing post-silicon validation and debug techniques. QED, an acronym for Quick Error Detection, is such a technique that effectively overcomes several post-silicon validation and debug challenges. QED systematically creates a wide variety of validation tests to quickly detect bugs, not only inside processor cores, but also in uncore components (i.e., components in an SoC that are neither processor cores nor co-processors) of multi-core system-on-chips. In this paper, we present a brief overview of QED through a series of frequently asked questions.

Session 7S
Special Session: Brain Like Computing: Modelling, Technology, and Architecture
Time: 10:10 - 12:15 Thursday, January 23, 2014
Chair: Ahmed Hemani (KTH, Sweden)

7S-1 (Time: 10:10 - 10:40)

Title	(Invited Paper) Spiking Brain Models: Computation, Memory and Communication Constraints for Custom Hardware Implementation
Author	*Anders Lansner, Ahmed Hemani, Nasim Farahini (KTH, Sweden)
Abstract	We estimate the computational capacity required to simulate in real time the neural information processing in the human brain. We show that the computational demands of a detailed implementation are beyond reach of current technology, but that some biologically plausible reductions of problem complexity can give performance gains between two and six orders of magnitude, which put implementations within reach of tomorrow’s technology.

7S-2 (Time: 10:40 - 11:10)

Title	(Invited Paper) Advanced Technologies for Brain-Inspired Computing
Author	*Fabien Clermidy, Rodolphe Heliot, Alexandre Valentian (CEA-LETI, France), Christian Gamrat, Olivier Bichler, Marc Duranton (CEA-LIST, France), Bilel Belhadj, Olivier Temam (INRIA, France)
Abstract	This paper aims at presenting how new technologies can overcome classical implementation issues of Neural Networks. Resistive memories such as Phase Change Memories and Conductive-Bridge RAM can be used for obtaining low-area synapses thanks to programmable resistance also called Memristors. Similarly, the high capacitance of Through Silicon Vias can be used to greatly improve analog neurons and reduce their area. The very same devices can also be used for improving connectivity of Neural Networks as demonstrated by an application. Finally, some perspectives are given on the usage of 3D monolithic integration for better exploiting the third dimension and thus obtaining systems closer to the brain.

Slides

7S-3 (Time: 11:10 - 11:40)

Title	(Invited Paper) GPGPU Accelerated Simulation and Parameter Tuning for Neuromorphic Applications
Author	Kristofor D. Carlson, Michael Beyeler, *Nikil Dutt, Jeffrey L. Krichmar (UC Irvine, U.S.A.)
Abstract	Neuromorphic engineering takes inspiration from biology to design brain-like systems that are extremely low-power, fault-tolerant, and capable of adaptation to complex environments. The design of these artificial nervous systems involves both the development of neuromorphic hardware devices and the development of neuromorphic simulation tools. In this paper, we describe a simulation environment that can be used to design, construct, and run spiking neural networks (SNNs) quickly and efficiently using graphics processing units (GPUs). We then explain how the design of the simulation environment utilizes the parallel processing power of GPUs to simulate large-scale SNNs and describe recent modeling experiments performed using the simulator. Finally, we present an automated parameter tuning framework that utilizes the simulation environment and evolutionary algorithms to tune SNNs. We believe the simulation environment and associated parameter tuning framework presented here can accelerate the development of neuromorphic software and hardware applications by making the design, construction, and tuning of SNNs an easier task.

Slides

7S-4 (Time: 11:40 - 12:10)

Title	(Invited Paper) A Scalable Custom Simulation Machine for the Bayesian Confidence Propagation Neural Network Model of the Brain
Author	Nasim Farahini, *Ahmed Hemani, Anders Lansner (KTH, Sweden), Fabien Clermidy (CEA-LETI, France), Christer Svensson (Linköping University, Sweden)
Abstract	A multi-chip custom digital super-computer called eBrain for simulating the Bayesian Confidence Propagation Neural Network (BCPNN) model of the human brain has been proposed. It uses Hybrid Memory Cube (HMC), a 3D stacked DRAM memory, for storing synaptic weights, integrated with a custom designed logic chip that implements the BCPNN model. In a 22nm node, eBrain executes BCPNN in real time at 740 TFlop/s while accessing 30 TB of synaptic weights with a bandwidth of 112 TB/s and consuming less than 6 kW of power in the typical case. This efficiency is three orders of magnitude better than general purpose supercomputers in the same technology node.

Session 8S
Special Session: Design Flow for Integrated Circuits using Magnetic Tunnel Junction Switched by Spin Orbit Torque
Time: 13:50 - 15:30 Thursday, January 23, 2014
Organizer: Mehdi Tahoori (Karlsruhe Institute of Technology, Germany)

8S-1 (Time: 13:50 - 14:15)

Title	(Invited Paper) An Overview of Spin-Based Integrated Circuits
Author	Wang Kang (University Beihang, China/IEF, Université Paris-Sud, France), *Weisheng Zhao, Zhaohao Wang, Jacques-Olivier Klein, Yue Zhang, Djaafar Chabi (IEF, Université Paris-Sud, France), Youguang Zhang (Univ. Beihang, China), Dafiné Ravelosona, Claude Chappert (IEF, Université Paris-Sud, France)
Abstract	Conventional CMOS integrated circuits suffer from severe power and scalability challenges as technology scales into ultra-deep-submicron nodes. Among the alternative approaches beyond charge-only circuits, spin-based devices and integrated circuits show promise for overcoming these issues by adding the spin degree of freedom of electrons to electronic circuits. Spintronics has now become a hot topic in both academia and industry. This paper reviews the status and prospects of the spin-based integrated circuits under intense investigation and addresses in particular their merits and challenges for practical applications.

8S-2 (Time: 14:15 - 14:40)

Title	(Invited Paper) Advances in Spintronics Devices for Microelectronics - from Spin-Transfer Torque to Spin-Orbit Torque
Author	*Shunsuke Fukami, Hideo Sato, Michihiko Yamanouchi, Shoji Ikeda, Fumihiro Matsukura, Hideo Ohno (Tohoku University, Japan)
Abstract	Recent advances in spintronics devices make it possible to open a new era of microelectronics. In this paper, we review the spintronics devices utilizing spin-transfer torques (STTs) and spin-orbit torques (SOTs) developed in recent years. We describe progress on a two-terminal STT device with a CoFeB-MgO based magnetic tunnel junction (MTJ), a three-terminal magnetic domain wall (DW) motion device with a Co/Ni multilayer, and a three-terminal SOT device with a Cu-based channel. Integrated circuits built with these spintronics devices are also reviewed.

8S-3 (Time: 14:40 - 15:05)

Title	(Invited Paper) Hybrid CMOS/Magnetic Process Design Kit and SOT-Based Non-Volatile Standard Cell Architectures
Author	*Gregory Di Pendina, Kotb Jabeur, Guillaume Prenat (Spintec Laboratory, CEA-INAC/CNRS/UJF/G-INP, France)
Abstract	This paper gives an overview of hybrid CMOS/magnetic logic circuit design. We describe the magnetic devices, the expected advantages of using them alongside CMOS to help circumvent the upcoming limits of VLSI circuits, and the tools required to design such circuits, including the Process Design Kit (PDK) and Standard Cells (SC). As a case study, we focus in particular on a new and promising device technology based on the Spin Orbit Torque (SOT) effect.

8S-4 (Time: 15:05 - 15:30)

Title	(Invited Paper) Architectural Aspects in Design and Analysis of SOT-Based Memories
Author	Rajendra Bishnoi, Mojtaba Ebrahimi, Fabian Oboril, *Mehdi Tahoori (Karlsruhe Institute of Technology, Germany)
Abstract	Magnetic Random Access Memory (MRAM), and in particular SOT-MRAM, is a promising emerging memory technology because of its various advantages. In this work, we provide an analysis of SOT-MRAM at the circuit and architecture levels, and compare SOT-MRAM with several other technologies. Our architecture-level analysis shows that a hybrid combination of SRAM for the L1 cache and SOT-MRAM for the L2 cache can significantly reduce area and energy while slightly increasing performance.

Session 9S
Special Session: The Role of Photons in Harming or Increasing Security
Time: 15:50 - 17:30 Thursday, January 23, 2014
Organizer: Francesco Regazzoni (University of Lugano, Switzerland), Edoardo Charbon (Delft University of Technology, Netherlands)
9S-1 (Time: 15:50 - 16:30)

Title	(Invited Paper) The Role of Photons in Cryptanalysis
Author	*Juliane Krämer (University Berlin, Germany), Michael Kasper (Fraunhofer Institute for Secure Information Technology, Germany), Jean-Pierre Seifert (University Berlin, Germany)
Abstract	Photons can be exploited to reveal secrets of security ICs like smartcards, secure microcontrollers, and cryptographic coprocessors. One such secret is the secret key of cryptographic algorithms. This work gives an overview of current research on revealing these secret keys by exploiting the photonic side channel. Different analysis methods are presented. It is shown that the analysis of photonic emissions also helps to gain knowledge about the attacked device and thus poses a threat to modern security ICs. The presented results illustrate the differences between the photonic side channel and other side channels, which do not provide fine-grained spatial information. It is shown that the photonic side channel has to be addressed both by software engineers and during chip design.

9S-2 (Time: 16:30 - 17:10)

Title	(Invited Paper) SPADs for Quantum Random Number Generators and Beyond
Author	Samuel Burri (EPFL, Switzerland), Damien Stucki (ID Quantique, Switzerland), Yuki Maruyama (Delft University of Technology, Netherlands), Claudio Bruschini (EPFL, Switzerland), Edoardo Charbon (Delft University of Technology, Netherlands), *Francesco Regazzoni (ALaRI - USI, Switzerland)
Abstract	This paper explores the design of a quantum random number generator (QRNG) based on a massively parallel array of single-photon avalanche diodes (SPADs). The matrix comprises 512x128 independent cells that convert photons into a raw stream of random bits. The sequences are read out over a 128-bit parallel bus, concatenated, and pipelined into a de-biasing filter. Reported results, achieved on the manufactured devices, show that the architecture can reach up to 5 Gbit/s while consuming 25 pJ/bit, demonstrating scalability and performance for any RNG based on SPADs.

9S-3 (Time: 17:10 - 17:50)

Title	(Invited Paper) Quantum Key Distribution with Integrated Optics
Author	*Mirko Lobino (Griffith University, Australia), Anthony Laing (University of Bristol, U.K.), Pei Zhang (Xi'an Jiaotong University, China), Kanin Aungskunsiri, Enrique Martin-Lopez (University of Bristol, U.K.), Joachim Wabnig (Nokia Research Centre, U.K.), Richard W. Nock, Jack Munns, Damien Bonneau, Pisu Jiang (University of Bristol, U.K.), Hong Wei Li (Nokia Research Centre, U.K.), John G. Rarity (University of Bristol, U.K.), Antti O. Niskanen (Nokia Research Centre, U.K.), Mark G. Thompson, Jeremy L. O'Brien (University of Bristol, U.K.)
Abstract	We report on a quantum key distribution (QKD) experiment in which a client with an on-chip polarisation rotator can access a server through a telecom-fibre link. Large resources such as the photon source and detectors are situated at the server side. We employ a reference-frame-independent QKD protocol for polarisation qubits and show that it overcomes the detrimental effects of drifting fibre birefringence in a polarisation-maintaining fibre.

19th Asia and South Pacific Design Automation Conference

ASP-DAC 2014

Last Updated on: Mar 13, 2014