The 18th Asia and South Pacific Design Automation Conference

Session 7A  Special Session: Many-Core Architecture and Software Technology
Time: 10:20 - 12:20 Friday, January 25, 2013
Chairs: Masato Edahiro (Nagoya University, Japan), Hiroyuki Tomiyama (Ritsumeikan University, Japan)

7A-1 (Time: 10:20 - 10:40)
Title: (Invited Paper) SMYLE Project: Toward High-Performance, Low-Power Computing on Manycore-Processor SoCs
Author: *Koji Inoue (Kyushu University, Japan)
Page: pp. 558 - 560
Keyword: manycore, SoC, low power, high performance, processor
Abstract: This paper introduces a manycore research project called SMYLE (Scalable ManYcore for Low Energy computing). The aims of this project are: 1) proposing a manycore SoC architecture and developing a suitable programming and execution environment, 2) designing a domain specific manycore system for emerging video mining applications, and 3) releasing developed software tools and FPGA emulation environments to accelerate manycore research and development in the community. The project started in December 2010 with full support from the New Energy and Industrial Technology Development Organization (NEDO).

7A-2 (Time: 10:40 - 11:05)
Title: (Invited Paper) SMYLEref: A Reference Architecture for Manycore-Processor SoCs
Author: *Masaaki Kondo, Son Truong Nguyen (The University of Electro-Communications, Japan), Tomoya Hirao, Takeshi Soga, Hiroshi Sasaki, Koji Inoue (Kyushu University, Japan)
Page: pp. 561 - 564
Keyword: Manycore Processor, Prototyping, FPGA
Abstract: The trend toward microprocessors with tens of cores brings a promising prospect for embedded systems. Realizing a high-performance, low-power many-core processor is becoming a primary technical challenge. We are currently developing a many-core processor architecture for embedded systems as part of a NEDO project. This paper introduces the many-core architecture called SMYLEref along with the concept of Virtual Accelerator on Many-core, in which many cores on a chip are utilized as a hardware platform for realizing multiple virtual accelerators. We are developing its prototype system with off-the-shelf FPGA evaluation boards. In this paper, we introduce the architecture of SMYLEref and the details of the prototype system. In addition, several initial experiments with the prototype system are also presented.

7A-3 (Time: 11:05 - 11:30)
Title: (Invited Paper) SMYLE OpenCL: A Programming Framework for Embedded Many-core SoCs
Author: *Hiroyuki Tomiyama, Takuji Hieda, Naoki Nishiyama, Noriko Etani, Ittetsu Taniguchi (Ritsumeikan University, Japan)
Page: pp. 565 - 567
Keyword: manycore SoCs, OpenCL, embedded systems
Abstract: Embedded SoC architecture has shifted from the single-core to the multi/many-core paradigm because of better power/performance efficiency. In order to exploit the potential power/performance efficiency of the many-core architecture, a parallel computing framework is necessary. OpenCL is one of the most popular parallel computing frameworks in the field of general-purpose computing on GPUs and multicore servers. However, the existing OpenCL implementations are not suitable for embedded real-time systems because of their large runtime overhead. In this paper, we describe a lightweight OpenCL framework for embedded multi/many-core SoCs. Our OpenCL framework minimizes the runtime overhead by statically creating threads and mapping them onto cores. Preliminary experiments on an FPGA prototype board with a five-core architecture show a significant reduction in runtime overhead compared with an existing OpenCL framework.

7A-4 (Time: 11:30 - 11:55)
Title: (Invited Paper) Support Tools for Porting Legacy Applications to Multicore
Author: Yuri Ardila, *Natsuki Kawai, Takashi Nakamura, Yosuke Tamura (Fixstars Corporation, Japan)
Page: pp. 568 - 573
Keyword: auto-parallelizer, performance estimation, benchmark, parallel computing
Abstract: This paper presents PEMAP, an automated performance estimation tool that projects the performance of hand-parallelized programs from sequential programs, and BEMAP, a benchmark suite for measuring the performance of an auto-parallelizer or even a machine. BEMAP is an open-source project, and documentation covering code explanations and experimental results is also provided. Our experiments on PEMAP show that we can estimate the performance of hand-parallelized programs with an error of 0.44% of the sequential program's performance on average, while our experiments with BEMAP show that the ability of an auto-parallelizer can be measured by comparing the compiled code to the hand-tuned parallelized OpenCL code, thereby assisting the development of the auto-parallelizer tool.

7A-5 (Time: 11:55 - 12:20)
Title: (Invited Paper) Manycore Processor for Video Mining Applications
Author: *Yukoh Matsumoto, Hiroyuki Uchida, Michiya Hagimoto, Yasumori Hibi, Sunao Torii, Masamichi Izumida (TOPS Systems Corporation, Japan)
Page: pp. 574 - 575
Abstract: Through Architecture-Algorithm co-design for Video Mining Applications, we designed a scalable Manycore processor consisting of clustered heterogeneous cores with stream processing capabilities and zero-overhead inter-process communication through FIFO with a hardware-software mechanism. To achieve high performance and low power consumption, and especially to reduce the memory access required for Video Mining Applications, each application is partitioned to exploit both task and data parallelism, and programmed as distributed stream processing with a relatively large local register file, based on the Kahn Process Network model.