
The 16th Asia and South Pacific Design Automation Conference

Session 1D: University LSI Design Contest
Time: 10:20 - 12:20 Wednesday, January 26, 2011
Location: Room 416+417
Organizers: Masanori Hariyama (Tohoku University, Japan), Hiroshi Kawaguchi (Kobe University, Japan)

1D-1 (Time: 10:20 - 10:24)
Title: A H.264/MPEG-2 Dual Mode Video Decoder Chip Supporting Temporal/Spatial Scalable Video
Author: *Cheng-An Chien, Yao-Chang Yang, Hsiu-Cheng Chang, Jia-Wei Chen, Cheng-Yen Chang, Jiun-In Guo, Jinn-Shyan Wang (National Chung Cheng University, Taiwan), Ching-Hwa Cheng (Feng Chia University, Taiwan)
Page: pp. 73 - 74
Keyword: H.264, MPEG2, SVC
Abstract: This paper proposes the first dual mode video decoder with 4-level temporal/spatial scalability and 32/64-bit adjustable memory bus width. A design automation environment for simulation and verification is established to automatically verify the correctness and completeness of the proposed design. Using a 0.13 um CMOS technology, it comprises 439Kgates/10.9KB SRAM and consumes 2~328mW in decoding CIF~HD1080 videos at 3.75~30fps when operating at 1~150MHz, respectively.

1D-2 (Time: 10:24 - 10:28)
Title: A Gate-level Pipelined 2.97GHz Self Synchronous FPGA in 65nm CMOS
Author: *Benjamin Devlin, Makoto Ikeda, Kunihiro Asada (University of Tokyo, Japan)
Page: pp. 75 - 76
Keyword: self-synchronous, fpga, high-throughput, reliable, pipeline
Abstract: We have designed a Self Synchronous FPGA (SSFPGA) in 65nm CMOS, which achieves 2.97GHz throughput at 1.2V, and measured its performance against power supply bounce and aging. The proposed SSFPGA employs a 38x38 array of 4-input, 3-stage Self Synchronous Configurable Logic Blocks (SSCLB), with the introduction of a new dual tree-divider 4-input LUT to achieve a 4.5x throughput improvement over our previous model. Energy was measured at 3.23 pJ/block/cycle using a custom-built board. We measured the SSFPGA for aging with accelerated degradation, and the results show that the SSFPGA has an 8% longer time margin before the chip malfunctions compared to a Synchronous FPGA.

1D-3 (Time: 10:28 - 10:32)
Title: A 4.32 mm² 170mW LDPC Decoder in 0.13µm CMOS for WiMax/Wi-Fi Applications
Author: *Dan Bao, Chuan Wu, Yan Ying, Yun Chen, Xiao Yang Zeng (Fudan University, China)
Page: pp. 77 - 78
Keyword: LDPC Decoder
Abstract: An energy-efficient programmable LDPC decoder is proposed for WiMax and Wi-Fi applications. The proposed decoder is designed with scalable processing units, a flexible message-passing network, and medium-grain partitioned memories to achieve programmability, area reduction, and energy efficiency. The decoder can be programmed by a host processor with several special-purpose micro-instructions, so various operation modes can be configured. Fabricated in a SMIC 0.13µm 1P8M CMOS process, the chip occupies 4.32 mm² with a core area of 2.97 mm², and consumes 170mW with a throughput of 302Mb/s when operating at 145MHz and 1.2V.

1D-4 (Time: 10:32 - 10:36)
Title: All-Digital PMOS and NMOS Process Variability Monitor Utilizing Buffer Ring with Pulse Counter
Author: *Jaehyun Jeong, Tetsuya Iizuka, Toru Nakura, Makoto Ikeda, Kunihiro Asada (University of Tokyo, Japan)
Page: pp. 79 - 80
Keyword: process variability, process monitor, buffer ring, all digital
Abstract: This paper presents an all-digital PMOS and NMOS process variability monitor which utilizes a simple buffer ring with a pulse counter. The proposed circuit monitors the process variability according to the count of a single pulse which propagates on the buffer ring and the fixed logic level after the pulse vanishes. The proposed circuit has been fabricated in a 65nm CMOS process, and the measurement results demonstrate that the PMOS and NMOS variabilities can be monitored independently using the proposed monitoring circuit.

1D-5 (Time: 10:36 - 10:40)
Title: Jitter Amplifier for Oscillator-Based True Random Number Generator
Author: *Takehiko Amaki, Masanori Hashimoto, Takao Onoye (Osaka University, Japan)
Page: pp. 81 - 82
Keyword: true random number generator, jitter
Abstract: This paper presents a jitter amplifier for an oscillator-based TRNG (true random number generator). The proposed jitter amplifier, fabricated in a 65nm CMOS process and occupying an area of 3,300 µm², achieves 8.4x gain at 25 degrees Celsius and improves the entropy enough to pass randomness tests.

1D-6 (Time: 10:40 - 10:44)
Title: A 65nm Flip-Flop Array to Measure Soft Error Resiliency against High-Energy Neutron and Alpha Particles
Author: *Jun Furuta (Kyoto University, Japan), Chikara Hamanaka, Kazutoshi Kobayashi (Kyoto Institute of Technology, Japan), Hidetoshi Onodera (Kyoto University, Japan)
Page: pp. 83 - 84
Keyword: Soft Error
Abstract: We fabricated a 65nm LSI including a flip-flop array to measure soft error resiliency against high-energy neutron and alpha particles. It consists of two FF arrays. One is an array composed of redundant FFs to confirm the radiation hardness of the proposed and conventional redundant FFs. The other is an array composed of ordinary D-FFs to measure SEU (Single Event Upset) and MCU (Multiple Cell Upset) as a function of the distance from tap cells.

1D-7 (Time: 10:44 - 10:48)
Title: Dual-Phase Pipeline Circuit Design Automation with a Built-in Performance Adjusting Mechanism
Author: Yu-Tzu Tsai, Cheng-Chih Tsai (Dept. of Electronic Engineering, Feng Chia University, Taiwan), *Cheng-An Chien (Dept. of CSIE, National Chung Cheng University, Taiwan), Ching-Hwa Cheng (Dept. of Electronic Engineering, Feng Chia University, Taiwan), Jiun-In Guo (Dept. of CSIE, National Chung Cheng University, Taiwan)
Page: pp. 85 - 86
Keyword: pipeline, domino circuit
Abstract: A high-speed dual-phase domino circuit that is both high-performance and reliable is proposed, and the circuit design technique is presented with a practical implementation. The cell-based automatic synthesis flow supports the quick design of high-performance chips. A test chip of a dual-phase 64-bit high-speed multiplier with a built-in performance adjustment mechanism is successfully validated using TSMC 0.18 technology. The test chip shows a 2.7x performance improvement compared to a conventional static CMOS logic design.

1D-8 (Time: 10:48 - 10:52)
Title: Geyser-2: The Second Prototype CPU with Fine-grained Run-time Power Gating
Author: *Lei Zhao, Daisuke Ikebuchi, Yoshiki Saito, Masahiro Kamata, Naomi Seki, Yu Kojima, Hideharu Amano (Keio University, Japan), Satoshi Koyama, Tatsunori Hashida, Yusuke Umahashi, Daiki Masuda, Kimiyoshi Usami (Shibaura Institute of Technology, Japan), Kazuki Kimura, Mitaro Namiki (Tokyo University of Agriculture and Technology, Japan), Seidai Takeda, Hiroshi Nakamura (University of Tokyo, Japan), Masaaki Kondo (The University of Electro-Communications, Japan)
Page: pp. 87 - 88
Keyword: Power Gating, MIPS CPU, Low Power Design
Abstract: Geyser-2 is the second prototype MIPS CPU that provides fine-grained run-time power gating controlled by instructions. It runs at a 210MHz clock and reduces leakage power by 60% at normal temperature.

1D-9 (Time: 10:52 - 10:56)
Title: An Implementation of an Asychronous FPGA Based on LEDR/Four-Phase-Dual-Rail Hybrid Architecture
Author: *Yoshiya Komatsu, Shota Ishihara, Masanori Hariyama, Michitaka Kameyama (Tohoku University, Japan)
Page: pp. 89 - 90
Keyword: FPGA, Asynchronous architecture, LEDR, 4-phase dual-rail
Abstract: This paper presents an asynchronous FPGA that combines four-phase dual-rail encoding and LEDR (Level-Encoded Dual-Rail) encoding. Four-phase dual-rail encoding is used for small area and low power in the function units, while LEDR encoding is used for high throughput and low power in data transfer. The proposed FPGA is fabricated in the e-Shuttle 65nm CMOS process and operates at 870 MHz. Compared to a synchronous FPGA, power consumption is reduced by 38% at a workload of 15%.

1D-10 (Time: 10:56 - 11:00)
Title: Design and Chip Implementation of a Heterogeneous Multi-core DSP
Author: *Shuming Chen, Xiaowen Chen, Yi Xu, Jianghua Wan, Jianzhuang Lu, Xiangyuan Liu, Shenggang Chen (National University of Defense Technology, China)
Page: pp. 91 - 92
Keyword: multi-core processor, Digital Signal Processor, heterogeneous
Abstract: This paper presents a novel heterogeneous multi-core Digital Signal Processor, named YHFT-QDSP, hosting one RISC CPU core and four VLIW DSP cores. The CPU core is responsible for task scheduling and management, while the DSP cores take charge of speeding up data processing. The YHFT-QDSP provides three kinds of interconnection communication. One is for inner-chip communication between the CPU core and the four DSP cores; the other two are for both inner-chip and inter-chip communication among DSP cores. The YHFT-QDSP is implemented in SMIC 130nm LVT CMOS technology and runs at 350MHz at 1.2V with a 114.49 mm² die area.

1D-11 (Time: 11:00 - 11:04)
Title: A Low-Power Management Technique for High-Performance Domino Circuits
Author: Yu-Tzu Tsai, Cheng-Chih Tsai (Dept. of Electronic Engineering, Feng Chia University, Taiwan), *Cheng-An Chien (Dept. of CSIE, National Chung Cheng University, Taiwan), Ching-Hwa Cheng (Dept. of Electronic Engineering, Feng Chia University, Taiwan), Jiun-In Guo (Dept. of CSIE, National Chung Cheng University, Taiwan)
Page: pp. 93 - 94
Keyword: power management, domino circuit
Abstract: A charge-sharing method enables a power management design for high-performance domino circuits. The resulting domino circuits have both high performance and low power consumption. A test chip has been successfully validated using TSMC 0.13um CMOS technology. Reductions in dynamic power consumption of 68% and static power consumption of 15% are achieved.

1D-12 (Time: 11:04 - 11:08)
Title: Design and Evaluation of Variable Stages Pipeline Processor Chip
Author: *Tomoyuki Nakabayashi, Takahiro Sasaki, Kazuhiko Ohno, Toshio Kondo (Mie University, Japan)
Page: pp. 95 - 96
Keyword: VLSI, Low energy processor, Variable stages pipeline, Glitch
Abstract: In order to reduce the energy consumption in high performance computing, a variable stages pipeline processor (VSP) is proposed, which improves execution time by dynamically unifying the pipeline stages. The VSP adopts a special pipeline register called an LDS-cell that unifies the pipeline stages and prevents glitch propagation. We fabricate the VSP chip in a Rohm 0.18um CMOS process and evaluate the energy consumption. The results indicate that the VSP can achieve 13% less energy consumption than the conventional approach.

1D-13 (Time: 11:08 - 11:12)
Title: TurboVG: A HW/SW Co-Designed Multi-Core OpenVG Accelerator for Vector Graphics Applications with Embedded Power Profiler
Author: *Shuo-Hung Chen, Hsiao-Mei Lin, Ching-Chou Hsieh, Chih-Tsun Huang, Jing-Jia Liou, Yeh-Ching Chung (National Tsing Hua University, Taiwan)
Page: pp. 97 - 98
Keyword: HW/SW Co-Design, Embedded System, Vector Graphics, OpenVG
Abstract: TurboVG is a hardware accelerator for the OpenVG 1.1 library that operates sixteen times faster than an optimized software implementation. This improved efficiency stems from a well-designed hardware-software interaction capable of handling massive data transfers across hierarchical layers without performance loss. By combining multiple TurboVG cores, the library can support screen resolutions of up to Full-HD 1080p.

1D-14 (Time: 11:12 - 11:16)
Title: Design and Implementation of a High Performance Closed-Loop MIMO Communications with Ultra Low Complexity Handset
Author: *Yu-Han Yuan, Wei-Ming Chen, Hsi-Pin Ma (National Tsing Hua University, Taiwan)
Page: pp. 99 - 100
Keyword: MIMO, GMD, THP
Abstract: A MIMO transceiver is implemented in which transmitter antenna selection is applied to geometric mean decomposition (GMD), combined with a Tomlinson-Harashima Precoder (THP), in a TDD system. We take the decoder quantization into consideration so that the handset can be kept simple. The proposed work can save more than 60% of the computational complexity at the handset compared with the GMD scheme. According to the simulation results, the proposed transceiver achieves about 7 dB SNR improvement over its open-loop VBLAST counterparts and is even about 2 dB better in SNR than ML at BER=10⁻² under an i.i.d. channel. Finally, the proposed HW/SW co-verification strategy provides an efficient way to perform verification.

1D-15 (Time: 11:16 - 11:20)
Title: A 58-63.6GHz Quadrature PLL Frequency Synthesizer Using Dual-Injection Technique
Author: *Ahmed Musa, Rui Murakami, Takahiro Sato, Win Chiavipas, Kenichi Okada, Akira Matsuzawa (Tokyo Institute of Technology, Japan)
Page: pp. 101 - 102
Keyword: 60GHz, ILO, Injection locking
Abstract: This paper proposes a 60GHz quadrature PLL frequency synthesizer that has a tuning range capable of covering the whole band specified by the IEEE802.15.3c with exceptional phase noise. The synthesizer is constructed using a 20GHz PLL that is coupled with a frequency tripler to generate the 60GHz signal. Both the 20GHz PLL and the ILO were fabricated using a 65nm CMOS process and measurement results show a phase noise of -96dBc/Hz at 60GHz while consuming 77.5mW from a 1.2V supply.

1D-16 (Time: 11:20 - 11:24)
Title: An Ultra-low-voltage LC-VCO with a Frequency Extension Circuit for Future 0.5-V Clock Generation
Author: *Wei Deng, Kenichi Okada, Akira Matsuzawa (Tokyo Institute of Technology, Japan)
Page: pp. 103 - 104
Keyword: clock generator, 0.5-V, LC-VCO
Abstract: This paper proposes a 0.5-V LC-VCO with a frequency extension circuit to replace ring oscillators for ultra-low-voltage sub-1ps-jitter clock generation. The measured performance, namely 0.6-ps jitter, a 50MHz-to-6.4GHz frequency tuning range with 2 bands, and sub-1mW PDC, indicates that the circuit can successfully replace ring VCOs in future 0.5-V LSIs and power-aware LSIs.

1D-17 (Time: 11:24 - 11:28)
Title: A 32Gbps Low Propagation Delay 4x4 Switch IC for Feedback-Based System in 0.13µm CMOS Technology
Author: Yu-Hao Hsu, Yang-Syu Lin, Ching-Te Chiu, Jen-Ming Wu, Shuo-Hung Hsu, Fan-Ta Chen, Min-Sheng Kao, *Wei-Chih Lai, YarSun Hsu (National Tsing Hua University, Taiwan)
Page: pp. 105 - 106
Keyword: low propagation delay, load-balanced switch
Abstract: In this paper, a low propagation delay, low power, and area-efficient 4x4 load-balanced switch circuit for feedback-based systems is presented. In this periodic and deterministic switch, only two DFFs are used to implement the pattern generator, which has O(N³) hardware complexity in a traditional matching-algorithm-based NxN switch. For packet reordering, a feedback path is established in a series of symmetric patterns. In contrast to commercial switch systems, we implement the 4x4 switch IC directly in the high-speed domain without SERDES interfaces to achieve low propagation delay and high scalability. In the CML output buffer, a PMOS active load and active back-end termination are introduced. A stacked current source and a symmetric topology are adopted in the CML-DFF. Our results show that this work eliminates the 28ns propagation delay, 80% of the area, and 80% of the power introduced by the SERDES interface. The throughput is up to 32Gbps (8Gbps/Ch).

1D-18 (Time: 11:28 - 11:32)
Title: A Fully Integrated Shock Wave Transmitter with an On-Chip Dipole Antenna for Pulse Beam-Formability in 0.18-μm CMOS
Author: *Nguyen Ngoc Mai Khanh (The University of Tokyo, Japan), Masahiro Sasaki, Kunihiro Asada (VLSI Design and Education Center (VDEC), the University of Tokyo, Japan)
Page: pp. 107 - 108
Keyword: shock wave, CMOS, on-chip antenna, beam-forming, transmitter
Abstract: This paper presents a fully integrated 9-11-GHz shock wave transmitter with an on-chip antenna and a digitally programmable delay circuit (DPDC) for pulse beam-formability in short-range microwave active imaging applications. The resistorless shock wave generator (SWG) produces a 0.4-V peak-to-peak (p-p) shock wave output in HSPICE simulation. The DPDC is designed to adjust the delays of the shock-wave outputs for beam-forming. The SWG's output is sent to an integrated meandering dipole antenna through an on-chip transformer. The measured return loss, S11, of a stand-alone integrated meandering dipole ranges from -26 dB to -10 dB over a frequency range of 7.5-12 GHz. A 1.1-mV (p-p) shock wave output is received by a 20-dB standard gain horn antenna located 38 mm from the chip. The frequency response and delay resolution of the measured shock wave output are 9-11 GHz and 3 ps, respectively. These characteristics are suitable for fully integrated pulse beam-forming array antenna systems.

1D-19 (Time: 11:32 - 11:36)
Title: An On-Chip Characterizing System for Within-Die Delay Variation Measurement of Individual Standard Cells in 65-nm CMOS
Author: *Xin Zhang, Koichi Ishida, Makoto Takamiya, Takayasu Sakurai (University of Tokyo, Japan)
Page: pp. 109 - 110
Keyword: within-die delay variation, design for manufacturing, on-chip oscilloscope
Abstract: A new characterizing system for within-die delay variations of individual standard cells is presented. The proposed system measures rising and falling delay variations separately by directly measuring the input and output waveforms of individual gates using an on-chip sampling oscilloscope in a 65nm CMOS process. Seven types of standard cells are measured, with 60 DUTs for each type. Using the proposed system, a relationship between the rising and falling delay variations and the active area of the standard cells is experimentally shown for the first time.

1D-21 (Time: 11:36 - 11:40)
Title: Robust and Efficient Baseband Receiver Design for MB-OFDM UWB System
Author: Wen Fan, *Chiu-Sing Choy (The Chinese University of Hong Kong, Hong Kong)
Page: pp. 111 - 112
Keyword: MB-OFDM UWB, baseband, receiver
Abstract: Robust, efficient, and low-complexity design methodologies for high-speed multi-band orthogonal frequency division multiplexing ultra-wideband (MB-OFDM UWB) are presented. The proposed design is implemented in 0.13μm CMOS technology with a core area of 2.66mm×0.94mm. Operating at a 132MHz clock frequency, the estimated power consumption is 170mW.

1D-22 (Time: 11:40 - 11:44)
Title: A 95-nA, 523ppm/°C, 0.6-µW CMOS Current Reference Circuit with Subthreshold MOS Resistor Ladder
Author: *Yuji Osaki, Tetsuya Hirose, Nobutaka Kuroki, Masahiro Numa (Kobe University, Japan)
Page: pp. 113 - 114
Keyword: current reference, low power, temperature stability
Abstract: A low-power current reference circuit was developed in a 0.35-um standard CMOS process. The proposed circuit utilizes an offset-voltage generation subcircuit consisting of a subthreshold MOS resistor ladder and generates a temperature-compensated reference current. Experimental results demonstrated that the proposed circuit generates a 95-nA reference current and that the total power dissipation is 586 nW. The temperature coefficient of the reference current is kept within 523ppm/°C over a temperature range from -20 to 100°C.

1D-23 (Time: 11:44 - 11:48)
Title: A 80-400 MHz 74 dB-DR Gm-C Low-Pass Filter With a Unique Auto-Tuning System
Author: *Ting Gao, Wei Li, Ning Li, Junyan Ren (Fudan University, China)
Page: pp. 115 - 116
Keyword: Gm-C, filter, auto tuning
Abstract: An 80-400 MHz 5th-order Chebyshev Gm-C low-pass filter with a unique auto-tuning system is presented. The filter was fabricated in a TSMC 0.13-µm CMOS process. Experimental results show that the cut-off frequency of the filter can be tuned between 80 and 400 MHz, with an average tuning error of 3.6%. The filter also realizes a gain of 0-30 dB, an IIP3 of 16.5 dBm, an NF of 14-18 dB, and a DR of 74 dB. Power dissipation is only 9 mW with a 1.2 V supply voltage.

1D-24 (Time: 11:48 - 11:52)
Title: An Adaptively Biased Low-Dropout Regulator with Transient Enhancement
Author: *Chenchang Zhan, Wing-Hung Ki (Hong Kong University of Science and Technology, Hong Kong)
Page: pp. 117 - 118
Keyword: low-dropout regulator, adaptive biasing, output-capacitor-free, transient enhancement
Abstract: An output-capacitor-free adaptively biased low-dropout regulator with transient enhancement (ABTE LDR) is proposed. Techniques of Q-reduction compensation, adaptive biasing, and transient enhancement achieve low-voltage high-precision regulation with low quiescent current consumption while significantly improving the line and load transient responses and power supply rejections. The features of the ABTE LDR are experimentally verified by a 0.35-um CMOS prototype.

1D-25 (Time: 11:52 - 11:56)
Title: A Low-Power Triple-Mode Sigma-Delta DAC for Reconfigurable (WCDMA/TD-SCDMA/GSM) Transmitters
Author: *Dong Qiu, Ting Yi, Zhiliang Hong (Fudan University, China)
Page: pp. 119 - 120
Keyword: Digital-to-Analog Converter, Reconfigurable, Low-power
Abstract: This paper presents a sigma-delta DAC with channel filtering for multi-standard wireless transmitters. It can be digitally programmed to satisfy the specifications of the WCDMA, TD-SCDMA, and GSM standards. The measured SFDR is 62.8/60.1/75.5 dB in WCDMA/TD-SCDMA/GSM mode, respectively. The sigma-delta DAC, manufactured in a SMIC 0.13-µm CMOS process, occupies a 0.72 mm² die area while drawing 5.52/4.82/3.04 mW in WCDMA/TD-SCDMA/GSM mode from a single 1.2-V supply voltage.

1D-26 (Time: 11:56 - 12:00)
Title: A Simple Non-coherent Solution to the UWB-IR Communication
Author: *Mohiuddin Hafiz, Nobuo Sasaki, Kentaro Kimoto, Takamaro Kikkawa (Hiroshima University, Japan)
Page: pp. 121 - 122
Keyword: non-coherent, BPSK, CMOS, Transceiver
Abstract: A simple non-coherent solution to UWB-IR communication is presented. An all-digital differential transmitter, developed in a 65 nm CMOS technology, and a simple receiver for detecting the received differential signal, developed in a 180 nm CMOS technology, are demonstrated. Although the transmitter and the receiver have been developed in two different technologies, the main objective of this paper is to show the effectiveness of such a non-coherent solution for BPSK-modulated UWB-IR communication.