Tutorials

ASP-DAC 2017 offers attendees a set of intensive two-hour introductions to specific topics. Each tutorial will be presented twice during the day so that attendees can cover multiple topics. If you register for tutorials, you may select three of the six topics.

  • Date: Monday, January 16, 2017 (9:30 - 17:15)
  • Place: Makuhari Messe, International Conference Hall, 1F
9:30 - 11:30
  • Room 102: Tutorial-1, Silicon Photonics for Computing Systems: Opportunities, Challenges, and Implementations
  • Room 103: Tutorial-2, Towards Energy-Efficient Intelligence in Power-/Area-Constrained Hardware
  • Room 104: Tutorial-3, Post-Silicon Validation and Emulation-Based Validation Using Exercisers
  • Room 105: Tutorial-4, Quick Start Guide of Digital PLL for Digital Designers

12:45 - 14:45
  • Room 102: Tutorial-1, Silicon Photonics for Computing Systems: Opportunities, Challenges, and Implementations
  • Room 103: Tutorial-2, Towards Energy-Efficient Intelligence in Power-/Area-Constrained Hardware
  • Room 104: Tutorial-5, The Emergence of Hardware Oriented Security and Trust
  • Room 105: Tutorial-6, Cross-Layer Reliability Aware Design, Optimization and Dynamic Management

15:15 - 17:15
  • Room 102: Tutorial-3, Post-Silicon Validation and Emulation-Based Validation Using Exercisers
  • Room 103: Tutorial-4, Quick Start Guide of Digital PLL for Digital Designers
  • Room 104: Tutorial-5, The Emergence of Hardware Oriented Security and Trust
  • Room 105: Tutorial-6, Cross-Layer Reliability Aware Design, Optimization and Dynamic Management



(Student Grp: a group of four or more students from the same affiliation)
Category       Advance (JPY)   Late (JPY)
Member         22,000          26,000
Non-Member     26,000          30,000
Student        14,000          16,000
Student Grp    10,000          12,000



Tutorial-1: Monday, January 16, 9:30 - 11:30, 12:45 - 14:45@Room 102

Silicon Photonics for Computing Systems: Opportunities, Challenges, and Implementations

Organizers:
Jiang Xu (Hong Kong University of Science and Technology)
Yuichi Nakamura (NEC)
Speakers:
Jiang Xu (Hong Kong University of Science and Technology)
Shigeru Nakamura (NEC)

Tutorial Outline:

Computing systems, from HPC and data centers to automobiles and cellphones, are integrating growing numbers of processors, accelerators, memories, and peripherals to meet the burgeoning performance requirements of new applications under tight energy and thermal constraints. Recent advances in silicon photonics technologies promise ultra-high bandwidth, low latency, and high energy efficiency to alleviate the inter-rack, intra-rack, intra-board, and intra-chip communication bottlenecks in computing systems. Silicon photonics technologies piggyback onto well-developed silicon fabrication processes to provide viable and cost-effective solutions. Industry and academia have been actively developing silicon photonics technologies for the last decade. A large number of silicon photonics devices and circuits have been demonstrated in CMOS-compatible fabrication processes. Silicon photonics technologies open up new opportunities for architectures, design techniques, and EDA tools to fully explore new approaches and address the challenges of next-generation computing systems. This tutorial reviews the latest progress, provides insights into the challenges and future developments, and covers the following topics:
  • Implementation examples
  • Optical and electrical interconnects and OE interfaces
  • Integrated optical switches
  • Inter/intra-chip optical networks
  • High-radix optical switching fabric
  • Optical thermal effects
  • Optical crosstalk noise
  • Modeling, analysis, and simulation platforms



Tutorial-2: Monday, January 16, 9:30 - 11:30, 12:45 - 14:45@Room 103

Towards Energy-Efficient Intelligence in Power-/Area-Constrained Hardware

Organizer:
Jae-sun Seo (Arizona State Univ.)
Speakers:
Zhengya Zhang (U. Michigan, Ann Arbor)
Mingoo Seok (Columbia Univ.)
Jae-sun Seo (Arizona State Univ.)

Tutorial Outline:

In recent years, machine learning algorithms (e.g., deep neural networks) have become widespread across a broad range of vision, speech, and biomedical applications. For similar cognitive tasks, there has also been a surge of interest in neuromorphic computing (e.g., spiking neural networks), which more closely follows biological nervous systems.

While state-of-the-art deep learning algorithms keep advancing designs of large-scale network models (e.g., 1000-layer networks) for incremental accuracy improvement, many embedded hardware applications face limitations on their scale in terms of cost, power, and area. Several special-purpose hardware solutions (e.g., IBM TrueNorth, DaDianNao, MIT Eyeriss) have been proposed to help bring expensive algorithms to a low-power processor; however, limitations still exist in homogeneous architecture, memory footprint, on-chip communication, and online learning capability. It is a challenging task to enable essential machine learning and neuromorphic processors in mobile, wearable, Internet of Things (IoT), and, in the extreme case, implantable devices, due to their divergent constraints on power and footprint. Efficient hardware implementation on these different platforms thus requires substantial and holistic system enhancements that include computation (e.g., low-precision/approximate computing), memory (e.g., weight/network compression), and communication (e.g., processor-in-memory, spatial architecture).

In this context, this tutorial will present a range of recent algorithm/architecture/circuit/device co-design techniques that can advance the hardware implementation of learning and classification algorithms for various embedded applications, such as computer vision, speech recognition, personal health monitoring, and brain-computer interfaces. We will present results of recent CMOS ASIC prototype designs, as well as technology beyond CMOS, which employ software-hardware co-design to accomplish substantial improvements in performance, energy efficiency, and form factor. The proposed and demonstrated techniques include model/memory compression, architecture optimization, novel circuit design, incorporation of emerging devices, and neuro-inspired learning. These techniques will effectively reduce the computational complexity, memory footprint, and communication energy, thereby improving the overall mapping of machine learning and neuromorphic algorithms to energy- and size-constrained embedded platforms. This tutorial will help shed light on the tremendous potential and research needs towards energy-efficient intelligence in ubiquitous resource-constrained hardware systems.


Tutorial-3: Monday, January 16, 9:30 - 11:30@Room 104, 15:15 - 17:15@Room 102

Post-Silicon Validation and Emulation-Based Validation Using Exercisers

Organizers:
Ronny Morad (IBM Research - Haifa)
Vitali Sokhin (IBM Research - Haifa)
Speakers:
Ronny Morad (IBM Research - Haifa)
Vitali Sokhin (IBM Research - Haifa)

Tutorial Outline:

A study conducted by Wilson Research Group and Mentor Graphics in 2014 reveals that 70% of designs do not achieve first-silicon success and require re-spins. Various chip vendors have reported that more than 1% of their bugs escape to silicon. In addition, 57% of the verification projects that involve big designs (>80M gates) use emulation. More and more companies realize that simulation alone is not enough to ensure that their design is free of bugs, and they turn to a fast platform, whether silicon, emulation, or both. However, the unique characteristics of the fast platforms (speed on the one hand, limited observability and controllability on the other) make it very challenging to use them efficiently. Simply re-using tests from simulation won't give the desired benefit. A bare-metal exerciser is a technology that is increasingly used by various chip vendors for validation on emulation and silicon. An exerciser is essentially a program that generates a test, executes it, and then checks for correctness, all on the Design Under Test itself without any communication with a host machine. This enables high utilization of the fast platform. In addition, lightweight generation gives variability without sacrificing performance. The fact that an exerciser is bare-metal means that it does not rely on an operating system, and therefore it's easier to debug a failure. The collective experience of various chip vendors (IBM and others) shows that bare-metal exercisers are very efficient in finding unique and severe bugs that no other tool or environment was able to find.

In this two-hour tutorial we'll cover the motivation for post-silicon and emulation-based validation. We'll go over existing practices and discuss their advantages as well as limitations. We'll then turn to describing bare-metal exercisers in detail. We'll explain the concept and the requirements for an exerciser, and analyze the software architecture of a state-of-the-art exerciser. We'll also cover a recommended methodology for using exercisers and share the experience of several companies that use exercisers. We'll conclude by presenting open problems and research directions in this domain.

This tutorial is intended for practitioners from industry who would like to learn more about post-silicon validation in general and about the role of exercisers in particular, as well as for academics who are looking to extend their knowledge of post-silicon validation and the research challenges it presents.


Tutorial-4: Monday, January 16, 9:30 - 11:30@Room 105, 15:15 - 17:15@Room 103

Quick Start Guide of Digital PLL for Digital Designers

Organizer:
Kenichi Okada (Tokyo Institute of Technology)
Speakers:
Kenichi Okada (Tokyo Institute of Technology)
Salvatore Levantino (Politecnico di Milano)

Tutorial Outline:

This tutorial will introduce the fundamentals of digital phase-locked loops (PLLs) for clock generation in Systems on Chip (SoCs). Nowadays, tens of PLLs are integrated into a single large SoC, so the jitter, area/power utilization, and design effort of clock generators are key parameters. Though PLLs have typically been designed by analog designers, fully or mostly synthesized digital PLLs in scaled CMOS processes are taking over from traditional designs. They consist only of logic gates taken from the digital standard-cell library, so that their netlist and layout can be automatically generated by commercial digital EDA tools. After reviewing the basic architectures of digital PLLs, the tutorial will discuss the design of the main building blocks, such as the digitally-controlled variable delay line, digitally-controlled oscillator, divider chain, frequency-locked loop, and digital loop filter. The second part of the tutorial will deal with more advanced architectures of synthesizable clock generators, such as bang-bang digital PLLs, injection-locked PLLs, and multiplying delay-locked loops (MDLLs). Finally, we will move to fractional-N clock generation, reviewing the most advanced techniques such as those based on digital-to-time converters (DTCs). By the end of this tutorial, the attendee will be able to start the design of a synthesizable digital PLL.


Tutorial-5: Monday, January 16, 12:45 - 14:45, 15:15 - 17:15@Room 104

The Emergence of Hardware Oriented Security and Trust

Organizer:
Chip-Hong Chang (Nanyang Technological Univ.)
Speakers:
Chip-Hong Chang (Nanyang Technological Univ.)
Yier Jin (Univ. of Central Florida)

Tutorial Outline:

Hardware has long been touted as a more dependable and trustworthy entity than the software running on it. The illusion that attackers cannot easily access the isolated integrated circuit (IC) supply chain has time and again been invalidated by remotely activated hardware Trojans and untraceable break-ins of networking systems running on fake and subverted chips, reported by businesses and military strategists and confirmed by forensic security experts analysing recent incidents. The situation has been aggravated by the geographical dispersion of chip design activities and the heavy reliance on third-party hardware intellectual properties (IPs). Counterfeit chips (such as unauthorized copies, remarked/recycled dice, overproduced and subverted chips, or cloned designs) pose a major threat to all stakeholders in the IC supply chain, from designers, manufacturers, and system integrators to end users, in view of the severe consequences of the potentially degraded quality, reliability, and performance that they cause in electronic equipment and critical infrastructure. Unfortunately, tools that can analyse the circuit netlist for malicious logic detection and full functionality recovery are lacking, so such design backdoors and counterfeit and malicious chips cannot be prevented from infiltrating the IC design and fabrication flow. This tutorial reviews recent developments in preventive countermeasures, post-manufacturing diagnosis techniques, and emerging security-enhanced primitives to avert these hardware security threats. This tutorial will also cover emerging topics where hardware platforms are playing an active role in system protection and intrusion detection. It aims to create an awareness of the ultimate challenges and solutions in addressing hardware security issues in the new age of the Internet of Things (IoT), where the intense interactions among devices, and between devices and humans, have introduced new vulnerabilities in embedded devices and integrated electronic systems.


Tutorial-6: Monday, January 16, 12:45 - 14:45, 15:15 - 17:15@Room 105

Cross-Layer Reliability Aware Design, Optimization and Dynamic Management

Organizer:
Sheldon Tan (UC Riverside)
Speakers:
Sheldon Tan (UC Riverside)
Mehdi Tahoori (Karlsruhe Inst. Tech.)
Hai-Bao Chen (Shanghai Jiao Tong Univ.)

Tutorial Outline:

Reliability has become a significant challenge for the design of current nanometer integrated circuits (ICs). It is expected that future chips will show signs of reliability-induced aging much faster than previous generations. It is predicted that the mean time between failures (MTBF) for future exascale computing will be around 30 minutes using today's devices and computing platforms. Furthermore, long-term reliability degradation caused by aging effects is becoming a limiting constraint in 3D ICs and emerging FinFET devices due to increased failure rates. The semiconductor industry faces new challenges in maintaining reliability given the ever-continuing increase in die size and number of transistors, accompanied by the performance-driven down-scaling of transistor size. This has led the International Technology Roadmap for Semiconductors (ITRS) to predict the onset of significant reliability problems in the future, and at a pace that has not been seen in the past. Existing technologies for ensuring reliability will not be able to satisfy the competing requirements for future ICs, as they typically operate in one layer under worst-case assumptions about other layers in the design stack. This potentially leads to inefficiencies that will make these techniques impractical in future fabrication processes.

The first part of the tutorial will describe novel approaches and techniques for recently proposed physics-based electromigration (EM) modeling and assessment methods. We will present a newly proposed physics-based model for both the void nucleation phase and the void growth phase, and show how this new model can be applied to full-chip power grid EM analysis to account for the essential redundancy of on-chip power grid networks. We then present a novel method to calculate the hydrostatic stress evolution inside a multi-branch interconnect tree, which avoids the over-optimistic prediction of the time to failure (TTF) made with the Blech-Black analysis of individual branches of the interconnect tree. We further show how to extend the new physics-based model to consider time-varying current and temperature stressing conditions, which are common working conditions for chips. We finally show a recently proposed voltage-based EM immortality check algorithm for general interconnect trees and a new EM-signoff flow, which consists of a fast EM immortality check and detailed numerical TTF analysis, and its potential engagement with EM-aware physical design.

The second part of this tutorial targets device aging, in particular bias temperature instability (BTI) and hot carrier injection (HCI), which affect transistor performance over time. These aging mechanisms degrade the threshold voltage of transistors and lead to timing failures in logic paths and reduced static noise margin (SNM) in memories. We will cover various static (design time) as well as dynamic (based on runtime monitoring) techniques to analyze, monitor, and mitigate aging effects in both logic and memory components. The effective solutions are cross-layer, meaning that they cover a wide range of abstraction levels across the design stack to be able to tackle this problem at minimum cost. The design-time solutions prolong the lifetime of the circuit by minimizing the amount of aging stress on timing-critical components of the design and maximizing the duration of aging relaxation for better aging recovery. Moreover, we show how aging-awareness can be integrated during logic synthesis. The runtime solutions monitor the amount of critical stress on the circuit and use proactive fine-grain aging mitigation to balance performance, power, and aging during system operation.

In the third part of this tutorial, we will focus on dynamic reliability management (DRM) techniques based on the newly proposed electromigration models. First, we will present a resource-based EM model, which models the reliability or TTF of a wire due to EM effects as a resource at the system level. Then we present a novel task migration method to explicitly balance the consumption of EM resources across all the cores. The new method aims at an equal chance of failure for these cores, which will maximize the lifetime of the whole multi-/many-core system. Second, we present new DRM techniques for emerging many-core dark silicon processors. We employ dynamic voltage and frequency scaling (DVFS) and dark silicon core on/off status as the control knobs. We show how the energy or lifetime of cores can be optimized subject to performance, temperature, and lifetime constraints. On top of this, we show that if soft errors and hard reliability effects like EM are considered at the same time, reliability-constrained energy or performance optimization becomes more challenging due to the conflicting impacts of power on those reliability effects. Finally, we show how reliability modeling and management can be done at the datacenter level. We will present a new combined datacenter power and reliability compact model using a learning-based approach in which a feed-forward neural network (FNN) is trained to predict energy and long-term reliability for each processor under datacenter scheduling and workloads.
Last Updated on: October 19, 2016