Title | An Efficient STT-RAM-Based Register File in GPU Architectures |
Author | Xiaoxiao Liu, Mengjie Mao, Xiuyuan Bi, Hai Li, *Yiran Chen (University of Pittsburgh, U.S.A.) |
Page | pp. 490 - 495 |
Keyword | STT-RAM, MLC, Register file, GPU |
Abstract | Modern GPGPUs employ a large register file (RF) to efficiently process heavily parallel threads in single instruction multiple thread (SIMT) fashion. The up-scaling of RF capacity, however, is greatly constrained by the large cell area and high leakage power consumption of the SRAM implementation. In this work, we propose a novel GPU RF design based on the emerging multi-level cell (MLC) spin-transfer torque RAM (STT-RAM) technology. Compared to SRAM, MLC STT-RAM (or MLC-STT) has much smaller cell area and almost zero standby power due to its nonvolatility. Moreover, by leveraging the asymmetric performance of the soft and the hard bits of an MLC-STT cell, we propose a remapping strategy to perform a flexible tradeoff between the access time and the capacity of the RF based on run-time access patterns. A novel rescheduling scheme is also developed to minimize the waiting time of the issued warps to access register banks. Experimental results over ISPASS2009 and CUDA benchmarks show that on average, our proposed MLC-STT RF can achieve 3.28% performance improvement, 9.48% energy reduction, and 38.9% energy efficiency enhancement compared to a conventional SRAM-based design. |
Slides |
Title | A Bit-Write Reduction Method based on Error-Correcting Codes for Non-Volatile Memories |
Author | *Masashi Tawada, Shinji Kimura, Masao Yanagisawa, Nozomu Togawa (Waseda University, Japan) |
Page | pp. 496 - 501 |
Keyword | Non-volatile memory, Bit-write reduction, Error-correcting codes, Encode/decode, Write-Reduction Code |
Abstract | Non-volatile memory has many advantages over SRAM. However, one of its largest problems is that it consumes a large amount of energy in writing. In this paper, we propose a bit-write reduction method based on error-correcting codes for non-volatile memories. When data is written into memory cells, we do not write it directly but encode it into a codeword. We focus on error-correcting codes and generate new codes called write-reduction codes. In our write-reduction codes, each data value corresponds to an information vector in an error-correcting code, and an information vector corresponds not to a single codeword but to a set of write-reduction codewords. Given the data to be written and the current memory bits, we can deterministically select a particular write-reduction codeword corresponding to that data such that the maximum number of flipped bits is theoretically minimized. Then the number of bits written into memory cells will also be minimized. We perform several experimental evaluations and demonstrate up to 72% energy reduction. |
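The codeword-selection idea in the abstract above can be illustrated with a toy coset code. This is only a sketch under stated assumptions, not the authors' write-reduction code construction: the parity-check matrix `H`, the 3-cell word size, and the helper names `syndrome`, `encode`, and `decode` are all hypothetical choices made for illustration. Each 2-bit data value is represented by every 3-bit word whose syndrome equals that value, and the writer picks the candidate closest in Hamming distance to the bits already stored.

```python
from itertools import product

# Parity-check matrix of the (3,1) repetition code.
# Illustrative assumption only; not taken from the paper.
H = [(1, 1, 0), (0, 1, 1)]


def syndrome(word):
    """Return H * word (mod 2); the syndrome is the stored 2-bit data value."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)


def build_codeword_sets(n=3):
    """Group all n-bit words by syndrome: one set of candidates per data value."""
    sets = {}
    for word in product((0, 1), repeat=n):
        sets.setdefault(syndrome(word), []).append(word)
    return sets


CODEWORD_SETS = build_codeword_sets()


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))


def encode(data, current):
    """Pick the candidate for `data` that flips the fewest currently stored bits."""
    return min(CODEWORD_SETS[data], key=lambda w: hamming(w, current))


def decode(word):
    """Recover the data value from the stored word."""
    return syndrome(word)


if __name__ == "__main__":
    current = (0, 0, 0)
    for data in [(1, 0), (1, 1), (0, 1), (0, 0)]:
        new = encode(data, current)
        print(data, current, "->", new, "flips:", hamming(new, current))
        current = new
```

In this toy code the two candidates for each data value are bitwise complements, so at most one of the three cells flips per write, whereas storing the two data bits directly in two cells can flip both. The price is one extra cell per two data bits.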
Title | Minimizing MLC PCM Write Energy for Free through Profiling-Based State Remapping |
Author | *Mengying Zhao (City University of Hong Kong, Hong Kong), Yuan Xue, Chengmo Yang (University of Delaware, U.S.A.), Chun Jason Xue (City University of Hong Kong, Hong Kong) |
Page | pp. 502 - 507 |
Keyword | PCM, energy, state remapping |
Abstract | Phase change memory is becoming one of the most promising candidates to replace DRAM as main memory in the deep sub-micron regime. Multi-level cell (MLC) PCM outperforms single-level cell (SLC) PCM in terms of storage capacity but requires an iterative programming-and-verifying scheme to program cells to different resistance levels. The energy consumed in programming different MLC states varies significantly, thus motivating a state remapping technique to minimize the overall write energy. In this paper, we first compare dynamic and static state remapping strategies in terms of their efficacy in reducing energy, and then propose an effective and low-cost static state remapping algorithm. The experimental studies show 10.6% average (up to 16.9%) reduction in MLC PCM write energy, achieved with negligible hardware and performance overhead. Compared with the most related work, the proposed scheme saves more write energy on average, with near-zero performance, area and energy overhead. |
Slides |
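The profiling-based static remapping described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the per-state energies in `STATE_ENERGY` are made-up values in arbitrary units, the function names are hypothetical, and the model assumes each state has a fixed programming cost that does not depend on the cell's previous contents. The sketch counts how often each 2-bit logical state is written in a profiling trace, then assigns the most frequent logical states to the cheapest physical states.

```python
from collections import Counter

# Hypothetical programming energy per 2-bit physical state (arbitrary units).
# These numbers are assumptions for illustration, not measured PCM data.
STATE_ENERGY = {0b00: 1.0, 0b01: 6.0, 0b10: 4.0, 0b11: 1.5}


def profile(trace):
    """Count how often each logical state appears in a profiling write trace."""
    counts = Counter(trace)
    return {state: counts.get(state, 0) for state in STATE_ENERGY}


def build_mapping(freq):
    """Map the most frequently written logical states to the cheapest physical states."""
    logical = sorted(freq, key=lambda s: -freq[s])
    physical = sorted(STATE_ENERGY, key=lambda s: STATE_ENERGY[s])
    return dict(zip(logical, physical))


def write_energy(trace, mapping):
    """Total energy of writing the trace under a logical-to-physical mapping."""
    return sum(STATE_ENERGY[mapping[s]] for s in trace)


if __name__ == "__main__":
    trace = [0b01] * 50 + [0b10] * 30 + [0b00] * 15 + [0b11] * 5
    identity = {s: s for s in STATE_ENERGY}
    remapped = build_mapping(profile(trace))
    print("identity mapping:", write_energy(trace, identity))
    print("remapped:        ", write_energy(trace, remapped))
```

Under this simplified cost model, pairing frequencies and energies in opposite sorted order minimizes the total by the rearrangement inequality. Because the mapping is fixed after profiling, reads and writes only need a constant 4-entry translation, which is consistent with the low overhead a static scheme aims for.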
Title | Improving Performance and Lifetime of DRAM-PCM Hybrid Main Memory through a Proactive Page Allocation Strategy |
Author | Hoda Aghaei Khouzani, *Chengmo Yang (University of Delaware, U.S.A.), Jingtong Hu (Oklahoma State University, U.S.A.) |
Page | pp. 508 - 513 |
Keyword | Phase Change Memory, Hybrid Main Memory, Memory Management, Page Allocation |
Abstract | This paper aims to reduce both DRAM misses and PCM writes in a DRAM-PCM hybrid memory architecture. We propose a proactive page allocation approach, exploiting the flexibility of mapping virtual pages to physical pages. By taking into consideration both the segment information and the number of conflict misses in DRAM, the proposed algorithm distributes heavily written pages across different DRAM sets. Trace-driven experiments show that the proposed technique is able to improve performance and lifetime of DRAM-PCM hybrid memory simultaneously. |