ASP-DAC 2015 Technical Program

The 20th Asia and South Pacific Design Automation Conference

Session 8A Exploring Better Architecture of Your Systems
Time: 13:50 - 15:30 Thursday, January 22, 2015
Location: Room 102
Chairs: Rainer Doemer (University of California, Irvine, U.S.A.), Hoeseok Yang (Ajou University, Republic of Korea)

8A-1 (Time: 13:50 - 14:15)

Title	An Accurate ACOSSO Metamodeling Technique for Processor Architecture Design Space Exploration
Author	*Hongwei Wang (Beijing Key Laboratory of Mobile Computing and Pervasive Device/Institute of Computing Technology, Chinese Academy of Sciences/University of Chinese Academy of Sciences, China), Ziyuan Zhu, Jinglin Shi, Yongtao Su (Beijing Key Laboratory of Mobile Computing and Pervasive Device/Institute of Computing Technology, Chinese Academy of Sciences, China)
Page	pp. 689 - 694
Keyword	design space exploration, metamodeling, ACOSSO
Abstract	Processor architects usually design uniprocessors or chip multiprocessors (CMPs) using a platform-based approach. One of the major challenges in this approach is exploring the exponentially large design space composed of many tunable and interacting architectural parameters. An exhaustive search of the design space is prohibitive because of the long run-time of simulations. An efficient design space exploration (DSE) strategy is therefore needed that can quickly find multi-objective architectural configurations (points in the design space) in terms of system metrics such as performance and energy. In this paper, we propose an accurate and efficient adaptive component selection and smoothing operator (ACOSSO) metamodel-assisted NSGA-II (MA-NSGA-II) multi-objective optimization (MOO) technique for processor DSE. We show the effectiveness of our methodology by comparing it with linear regression (LR), restricted cubic splines (RCS), natural cubic splines (NCS) and artificial neural network (ANN) metamodeling techniques for processor design metrics prediction and architecture optimization. The experimental results show that the proposed methodology achieves higher prediction accuracy and better architecture optimization results.
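
As a rough illustration of the multi-objective selection that NSGA-II builds on, the sketch below filters a set of candidate configurations down to its non-dominated (Pareto) front. It is not the authors' MA-NSGA-II and does not model ACOSSO; the function names and the (delay, energy) values are hypothetical, and both metrics are assumed to be minimized.

```python
def dominates(a, b):
    """Return True if point a is no worse than b in every metric
    and strictly better in at least one (all metrics minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up (delay, energy) predictions for four hypothetical configurations.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = pareto_front(candidates)
print(front)
```

In a metamodel-assisted flow, the metric tuples would come from the fitted predictor rather than from full simulation, which is where the time saving described in the abstract would arise.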

8A-2 (Time: 14:15 - 14:40)

Title	Speeding Up Single Pass Simulation of PLRUt Caches
Author	*Josef Schneider, Jorgen Peddersen, Sri Parameswaran (The University of New South Wales, Australia)
Page	pp. 695 - 700
Keyword	Cache Simulation, PLRUt
Abstract	CPU caches have become an essential component in many computer systems, as they can significantly increase system performance by alleviating the effects of memory latency. For many designers, part of the system design flow is the selection of an appropriately configured cache, a task which can be performed using cache simulators. Exploring the entire design space through precise cache simulation is a lengthy process, and while certain cache replacement policies (such as LRU and FIFO) have been optimised for fast simulation execution, no effective optimisations have been proposed for an extremely effective replacement policy: Pseudo Least Recently Used tree-based, also known as PLRUt. In this paper we are the first to present a number of characteristics of the PLRUt replacement policy that lend themselves to the design of an optimised hash table-based cache simulator. We demonstrate that our optimised simulator is up to 1.93x faster than an un-optimised implementation.
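
For readers unfamiliar with the policy, the following is a minimal sketch of a single tree-based pseudo-LRU cache set, assuming a power-of-two number of ways. It shows only the textbook replacement behaviour, not the optimised hash table-based simulator from the paper; the class and method names are hypothetical.

```python
class TreePLRUSet:
    """One cache set with tree-based pseudo-LRU replacement."""

    def __init__(self, ways):
        if ways < 2 or ways & (ways - 1):
            raise ValueError("ways must be a power of two >= 2")
        self.ways = ways
        # One bit per internal tree node, stored in heap order.
        # 0 means the next victim lies in the left subtree, 1 in the right.
        self.bits = [0] * (ways - 1)
        self.tags = [None] * ways

    def _touch(self, way):
        """Make every node on the path to `way` point away from it."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if way < mid:
                self.bits[node] = 1          # accessed left, victim is right
                node, hi = 2 * node + 1, mid
            else:
                self.bits[node] = 0          # accessed right, victim is left
                node, lo = 2 * node + 2, mid

    def _victim(self):
        """Follow the node bits from the root down to the way to evict."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.bits[node] == 0:
                node, hi = 2 * node + 1, mid
            else:
                node, lo = 2 * node + 2, mid
        return lo

    def access(self, tag):
        """Access a tag; return True on a hit, False on a miss."""
        hit = tag in self.tags
        if hit:
            way = self.tags.index(tag)
        else:
            # Fill an empty way first, otherwise evict the PLRUt victim.
            way = self.tags.index(None) if None in self.tags else self._victim()
            self.tags[way] = tag
        self._touch(way)
        return hit


cache_set = TreePLRUSet(4)
results = [cache_set.access(t) for t in ["A", "B", "C", "D", "A", "E"]]
print(results, cache_set.tags)
```

In this trace the miss on "E" evicts "C", whereas true LRU would evict "B". That gap is why the policy is called pseudo-LRU: a set with N ways needs only N - 1 state bits.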

8A-3 (Time: 14:40 - 15:05)

Title	ADAPT: An ADAptive Manycore Methodology for Software Pipelined ApplicaTions
Author	*Xi Zhang, Haris Javaid (University of New South Wales, Australia), Muhammad Shafique (Karlsruhe Institute of Technology, Germany), Jude Angelo Ambrose (University of New South Wales, Australia), Jörg Henkel (Karlsruhe Institute of Technology, Germany), Sri Parameswaran (University of New South Wales, Australia)
Page	pp. 701 - 706
Keyword	Manycore System, MPSoC, Streaming Applications, Software pipeline, Run-time adaptation
Abstract	Future multiprocessor architectures are expected to have hundreds of processors on a chip. To amortize the cost of such systems, they are expected to be used in a variety of situations and for a number of applications. In this paper, we examine how software pipelines, which are useful for streaming/multimedia applications, can be efficiently implemented in such multiprocessor systems. The goal is to balance the stages of the pipeline in the presence of workload variations. This paper presents a method that detects bottleneck stages and adds processors to those stages at run-time. Further, if there are no free processors, a shuffling of processors is performed. Our methodology (which was simulated on a commercial simulation system) adapts in less than two thousand cycles, and for a variety of benchmarks achieves up to 2.1× the throughput of the state-of-the-art technology (modified and implemented on the same platform for purposes of comparison).

8A-4 (Time: 15:05 - 15:30)

Title	A Trace-Driven Approach for Fast and Accurate Simulation of Manycore Architectures
Author	*Anastasiia Butko, Rafael Garibotti, Luciano Ost, Vianney Lapotre, Abdoulaye Gamatie, Gilles Sassatelli (LIRMM/CNRS/University of Montpellier II, France), Chris Adeniyi-Jones (ARM, Ltd., U.K.)
Page	pp. 707 - 712
Keyword	manycore architecture, modeling, trace-driven simulation, gem5 simulator, multi-threading
Abstract	The evolution of manycore systems, forecasted to feature hundreds of cores by the end of the decade, calls for efficient solutions for design space exploration and debugging. Among the relevant existing solutions, the well-known gem5 simulator provides a rich architecture description framework. However, these features come at the price of prohibitive simulation time that limits the scope of possible explorations to configurations made of tens of cores. To address this limitation, this paper proposes a novel trace-driven simulation approach for efficient exploration of manycore architectures.