The 17th Asia and South Pacific Design Automation Conference

Session 5C  Parallelizing System-Level Simulation
Time: 16:10 - 17:25 Wednesday, February 1, 2012
Location: Room 202
Chairs: Chia-Lin Yang (National Taiwan University, Taiwan), Derek Chiou (University of Texas at Austin, U.S.A.)

5C-1 (Time: 16:10 - 16:35)
Title: Relaxed Synchronization Technique for Speeding-up the Parallel Simulation of Multiprocessor Systems
Author: Dukyoung Yun (Seoul National University, Republic of Korea), Sungchan Kim (Chonbuk National University, Republic of Korea), *Soonhoi Ha (Seoul National University, Republic of Korea)
Page: pp. 449 - 454
Keyword: multiprocessor, parallel simulation, time synchronization, simulation cache, relaxed memory model
Abstract: For design verification of an MPSoC, a virtual prototyping system is widely used as a cheap and fast alternative to a hardware prototype. It usually consists of component simulators working together on a single simulation host. As the number of component simulators increases, simulation performance degrades significantly because of frequent inter-simulator communication. In this paper, to boost simulation speed further, we propose a novel technique, called relaxed synchronization, which places a simulation cache at each component simulator. Like an architectural cache that reduces the main memory access frequency, a simulation cache effectively reduces the number of synchronous communications between the corresponding component simulator and the simulation backplane. When a read or write request to shared memory is made, a whole cache line, not a single element, is transferred to exploit spatial and temporal locality in the simulation. The proposed technique assumes that the application program uses a relaxed memory model. Experiments with real-life applications show that the proposed approach improves simulation performance by up to 330%.
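
The line-granularity transfer described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the 8-word line size, and the policy of writing dirty lines back at an explicit flush (standing in for a synchronization point under a relaxed memory model) are all assumptions made for illustration.

```python
LINE_WORDS = 8  # hypothetical cache-line size, in words


class Backplane:
    """Stands in for the simulation backplane that owns shared memory."""

    def __init__(self, size):
        self.memory = [0] * size
        self.transactions = 0  # number of synchronous transfers so far

    def read_line(self, base):
        self.transactions += 1
        return self.memory[base:base + LINE_WORDS]  # returns a copy

    def write_line(self, base, words):
        self.transactions += 1
        self.memory[base:base + LINE_WORDS] = words


class SimulationCache:
    """Per-simulator cache that talks to the backplane one line at a time."""

    def __init__(self, backplane):
        self.backplane = backplane
        self.lines = {}     # line base address -> list of words
        self.dirty = set()  # base addresses of locally modified lines

    def _line(self, addr):
        base = addr - addr % LINE_WORDS
        if base not in self.lines:  # miss: fetch the whole line once
            self.lines[base] = self.backplane.read_line(base)
        return base, self.lines[base]

    def read(self, addr):
        base, words = self._line(addr)
        return words[addr - base]

    def write(self, addr, value):
        base, words = self._line(addr)
        words[addr - base] = value
        self.dirty.add(base)

    def flush(self):
        """Write back dirty lines (assumed here to happen at a sync point)."""
        for base in sorted(self.dirty):
            self.backplane.write_line(base, self.lines[base])
        self.dirty.clear()
        self.lines.clear()


def demo():
    backplane = Backplane(64)
    cache = SimulationCache(backplane)
    for addr in range(16):
        cache.write(addr, addr * 2)
    total = sum(cache.read(addr) for addr in range(16))
    cache.flush()
    return backplane.transactions, total, backplane.memory[:4]
```

In this toy run, 16 writes and 16 reads cost four backplane transfers (two line fetches and two write-backs) instead of one per access. Other simulators see the new values only after `flush`, which is why such a scheme depends on the application tolerating a relaxed memory model.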

5C-2 (Time: 16:35 - 17:00)
Title: Parallel Simulation of Mixed-abstraction SystemC Models on GPUs and Multicore CPUs
Author: *Rohit Sinha, Aayush Prakash, Hiren D. Patel (University of Waterloo, Canada)
Page: pp. 455 - 460
Keyword: GPU, SystemC, Parallel Simulation
Abstract: This work presents a methodology that parallelizes the simulation of mixed-abstraction level SystemC models across multicore CPUs and graphics processing units (GPUs) for improved simulation performance. Given a SystemC model, we partition it into processes suitable for GPU execution and processes suitable for CPU execution. We convert the processes identified for GPU execution into GPU kernels, with additional SystemC wrapper processes that invoke these kernels. The wrappers enable seamless communication of events in all directions between the GPUs and CPUs. We alter the OSCI SystemC simulation kernel to allow parallel execution of processes. Hence, we co-simulate in parallel the SystemC processes on multiple CPUs and the GPU kernels on the GPUs, exploiting both CPUs and GPUs for faster simulation. We experiment with synthetic benchmarks and a set-top box case study.

5C-3 (Time: 17:00 - 17:25)
Title: An Optimizing Compiler for Out-of-Order Parallel ESL Simulation Exploiting Instance Isolation
Author: *Weiwei Chen, Rainer Doemer (Center for Embedded Computer Systems, University of California, Irvine, U.S.A.)
Page: pp. 461 - 466
Keyword: Parallel Discrete Event Simulation, system-level description languages, Optimizing compiler
Abstract: Electronic system-level (ESL) design relies on fast discrete event (DE) simulation for the validation of design models written in system-level description languages (SLDLs). An advanced technique to speed up ESL validation is out-of-order parallel DE simulation, which allows multiple threads to run early and in parallel on multi-core hosts. To avoid data hazards and ensure timing accuracy, this technique requires the compiler to statically analyze the design model for potential data access conflicts. In this paper, we propose a compiler optimization that improves the data conflict analysis by exploiting instance isolation. The reduction in the number of conflicts increases the available parallelism and results in significantly reduced simulation time. Our experimental results show up to 90% gain in simulation speed for less than 6% increase in compilation time.
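
The sketch below is a toy illustration of why isolating instances can reduce the number of reported conflicts. It is not the analysis from the paper: the segment table, the instance and class names, and the conflict rule (two segments conflict when one writes a variable the other reads or writes) are hypothetical.

```python
from itertools import combinations

# Hypothetical model: each segment runs inside one instance of a class and
# accesses some of that class's member variables.
# (segment name, instance, class, variables read, variables written)
SEGMENTS = [
    ("s1", "enc0", "Encoder", {"buf"}, {"buf"}),
    ("s2", "enc1", "Encoder", {"buf"}, {"buf"}),
    ("s3", "enc0", "Encoder", {"buf"}, set()),
    ("s4", "mon", "Monitor", {"count"}, {"count"}),
]


def accesses(segment, isolate_instances):
    """Return (reads, writes), qualified per instance or per class."""
    _name, instance, cls, reads, writes = segment
    owner = instance if isolate_instances else cls
    return ({(owner, v) for v in reads}, {(owner, v) for v in writes})


def conflicts(segments, isolate_instances):
    """Pairs of segments where one writes what the other reads or writes."""
    found = set()
    for a, b in combinations(segments, 2):
        reads_a, writes_a = accesses(a, isolate_instances)
        reads_b, writes_b = accesses(b, isolate_instances)
        if writes_a & (reads_b | writes_b) or writes_b & reads_a:
            found.add((a[0], b[0]))
    return found


coarse = conflicts(SEGMENTS, isolate_instances=False)
isolated = conflicts(SEGMENTS, isolate_instances=True)
```

With class-level qualification, every pair of `Encoder` segments is flagged, giving three conflicts. With instance-level qualification, only `s1` and `s3`, which share the instance `enc0`, remain in conflict, so a scheduler would be free to run more segments in parallel.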