ASP-DAC 2012 Technical Program

The 17th Asia and South Pacific Design Automation Conference

Session 5C Parallelizing System-Level Simulation
Time: 16:10 - 17:25 Wednesday, February 1, 2012
Location: Room 202
Chairs: Chia-Lin Yang (National Taiwan University, Taiwan), Derek Chiou (University of Texas at Austin, U.S.A.)

5C-1 (Time: 16:10 - 16:35)

Title	Relaxed Synchronization Technique for Speeding-up the Parallel Simulation of Multiprocessor Systems
Author	Dukyoung Yun (Seoul National University, Republic of Korea), Sungchan Kim (Chonbuk National University, Republic of Korea), *Soonhoi Ha (Seoul National University, Republic of Korea)
Page	pp. 449 - 454
Keyword	multiprocessor, parallel simulation, time synchronization, simulation cache, relaxed memory model
Abstract	For design verification of an MPSoC, virtual prototyping is widely used as a cheap and fast method that requires no hardware prototype. A virtual prototyping system usually consists of component simulators working together on a single simulation host. As the number of component simulators increases, simulation performance degrades significantly because of frequent inter-simulator communication. In this paper, to boost simulation speed further, we propose a novel technique, called relaxed synchronization, which places a simulation cache at each component simulator. Like an architectural cache that reduces the frequency of main memory accesses, a simulation cache reduces the number of synchronous communications between the corresponding component simulator and the simulation backplane. When a read or write request to shared memory is made, a whole cache line, not a single element, is transferred in order to exploit spatial and temporal locality in the simulation. The proposed technique assumes that the application program uses a relaxed memory model. Experiments with real-life applications show that the proposed approach improves simulation performance by up to 330%.
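
The simulation-cache idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names Backplane and SimulationCache, the line size of 8, and the policy of writing dirty lines back only at an explicit synchronize() call are all assumptions made for illustration, and the memory size is assumed to be a multiple of the line size. The sketch only shows how transferring whole lines reduces the number of synchronous backplane transactions.

```python
class Backplane:
    """Hypothetical simulation backplane that owns the shared memory."""

    def __init__(self, size):
        self.memory = [0] * size
        self.transactions = 0  # number of synchronous communications

    def read_line(self, base, length):
        self.transactions += 1
        return self.memory[base:base + length]  # returns a copy of the line

    def write_line(self, base, values):
        self.transactions += 1
        self.memory[base:base + len(values)] = values


class SimulationCache:
    """Hypothetical per-simulator cache in front of the backplane."""

    def __init__(self, backplane, line_size=8):
        self.backplane = backplane
        self.line_size = line_size
        self.lines = {}     # line base address -> local copy of the line
        self.dirty = set()  # base addresses of locally modified lines

    def _line(self, addr):
        base = addr - addr % self.line_size
        if base not in self.lines:
            # Miss: fetch the whole line, not just one element.
            self.lines[base] = self.backplane.read_line(base, self.line_size)
        return base, self.lines[base]

    def read(self, addr):
        base, line = self._line(addr)
        return line[addr - base]

    def write(self, addr, value):
        base, line = self._line(addr)
        line[addr - base] = value
        self.dirty.add(base)

    def synchronize(self):
        # Assumed policy: under a relaxed memory model, writes only need to
        # become visible at explicit synchronization points.
        for base in sorted(self.dirty):
            self.backplane.write_line(base, self.lines[base])
        self.dirty.clear()
        self.lines.clear()  # drop local copies so later reads refetch


def run_demo():
    backplane = Backplane(64)
    cache = SimulationCache(backplane, line_size=8)
    for addr in range(16):
        cache.write(addr, addr * 2)
    total = sum(cache.read(addr) for addr in range(16))
    cache.synchronize()
    return backplane.transactions, total, backplane.memory[:16]
```

In run_demo, 16 writes and 16 reads touch only two cache lines, so the backplane sees two line fetches and two write-backs instead of 32 individual accesses.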

5C-2 (Time: 16:35 - 17:00)

Title	Parallel Simulation of Mixed-abstraction SystemC Models on GPUs and Multicore CPUs
Author	*Rohit Sinha, Aayush Prakash, Hiren D. Patel (University of Waterloo, Canada)
Page	pp. 455 - 460
Keyword	GPU, SystemC, Parallel Simulation
Abstract	This work presents a methodology that parallelizes the simulation of mixed-abstraction-level SystemC models across multicore CPUs and graphics processing units (GPUs) for improved simulation performance. Given a SystemC model, we partition it into processes suitable for GPU execution and processes suitable for CPU execution. We convert the processes identified for GPU execution into GPU kernels, with additional SystemC wrapper processes that invoke these kernels. The wrappers enable seamless communication of events in all directions between the GPUs and CPUs. We alter the OSCI SystemC simulation kernel to allow parallel execution of processes. Hence, we co-simulate in parallel the SystemC processes on multiple CPUs and the GPU kernels on the GPUs, exploiting both CPUs and GPUs for faster simulation. We experiment with synthetic benchmarks and a set-top box case study.

5C-3 (Time: 17:00 - 17:25)

Title	An Optimizing Compiler for Out-of-Order Parallel ESL Simulation Exploiting Instance Isolation
Author	*Weiwei Chen, Rainer Doemer (Center for Embedded Computer Systems, University of California, Irvine, U.S.A.)
Page	pp. 461 - 466
Keyword	Parallel Discrete Event Simulation, system-level description languages, Optimizing compiler
Abstract	Electronic system-level (ESL) design relies on fast discrete event (DE) simulation for the validation of design models written in system-level description languages (SLDLs). An advanced technique to speed up ESL validation is out-of-order parallel DE simulation, which allows multiple threads to run early and in parallel on multi-core hosts. To avoid data hazards and ensure timing accuracy, this technique requires the compiler to statically analyze the design model for potential data access conflicts. In this paper, we propose a compiler optimization that improves the data conflict analysis by exploiting instance isolation. The reduction in the number of conflicts increases the available parallelism and results in significantly reduced simulation time. Our experimental results show up to 90% gain in simulation speed for less than 6% increase in compilation time.