ASP-DAC 2009 Technical Program

The 14th Asia and South Pacific Design Automation Conference

Session 2D Special Session: EDA Acceleration Using New Architectures
Time: 13:30 - 15:35 Tuesday, January 20, 2009
Location: Room 416+417
Organizer: Damir A. Jamsek (IBM Corp., United States)

2D-1 (Time: 13:35 - 14:15)

Title	(Invited Paper) Aspects of GPU for General Purpose High Performance Computing
Author	*Reiji Suda (The University of Tokyo/JST CREST, Japan), Takayuki Aoki (Tokyo Institute of Technology/JST CREST, Japan), Shoichi Hirasawa (University of Electro-Communications/JST CREST, Japan), Akira Nukada (Tokyo Institute of Technology/JST CREST, Japan), Hiroki Honda (University of Electro-Communications/JST CREST, Japan), Satoshi Matsuoka (Tokyo Institute of Technology/JST CREST/NII, Japan)
Page	pp. 216 - 223
Keyword	GPU computing, performance evaluation, scheduling algorithm, task parallel paradigm
Abstract	We discuss hardware and software aspects of GPGPU, focusing specifically on NVIDIA cards and CUDA, from the viewpoint of parallel computing. The major weak points of GPUs compared with the newest supercomputers are identified and summarized in just four points: large SIMD vector length, small memory, absence of a fast L2 cache, and high register spill penalty. On the software side, we derive an optimal scheduling algorithm for latency hiding of host-device data transfer, and discuss SPMD parallelism on GPUs.

2D-2 (Time: 14:15 - 14:55)

Title	(Invited Paper) Designing and Optimizing Compute Kernels on Nvidia GPUs
Author	*Damir A. Jamsek (IBM Research Division, United States)
Page	pp. 224 - 229
Keyword	GPU, NVIDIA
Abstract	The availability of high performance compute capability in NVIDIA GPUs has expanded their use in CAD environments. We will describe the basic compute models, including host/device programming models and device multi-thread programming models, as well as optimization and performance tuning techniques.

2D-3 (Time: 14:55 - 15:35)

Title	(Invited Paper) Parallelizing Fundamental Algorithms such as Sorting on Multi-core Processors for EDA Acceleration
Author	*Masato Edahiro (System IP Core Research Laboratories, NEC Corporation/Department of Computer Science, University of Tokyo, Japan)
Page	pp. 230 - 233
Keyword	multi-core, many-core, parallel algorithm, sorting
Abstract	Fundamental algorithms should be parallelized to accelerate EDA software on multi-core architectures. In this paper, we introduce algorithms that scale on multi-core processors. As an example, a sorting algorithm called Map Sort is presented. This algorithm uses a map from subsets of the input data to intervals on the data range. Experimental results show that, in comparison with quick sort on a single CPU, the processing time of Map Sort is comparable on one CPU and three times faster on four CPUs.