Paper Title
FPGA-based Hybrid Memory Emulation System
Authors
Abstract
Hybrid memory systems, composed of emerging non-volatile memory (NVM) and DRAM, have been proposed to address the growing memory demand of applications. Emerging NVM technologies, such as phase-change memory (PCM), memristors, and 3D XPoint, offer higher capacity density, minimal static power consumption, and lower cost per GB. However, NVM has longer access latency and more limited write endurance than DRAM. The differing characteristics of the two memory classes motivate the design of hybrid memory systems that contain multiple classes of main memory. In the iterative and incremental development of new architectures, the timely completion of simulations is critical to project progress. Hence, a highly efficient simulation method is needed to evaluate the performance of different hybrid memory system designs. Design exploration for hybrid memory systems is challenging because it requires emulating the full system stack, including the OS, memory controller, and interconnect. Moreover, benchmark applications for memory performance testing typically have much larger working sets and therefore require even longer simulation warm-up periods. In this paper, we propose an FPGA-based hybrid memory system emulation platform. We target mobile computing systems, which are sensitive to energy consumption and are likely to adopt NVM for its power efficiency. Because the focus of our platform is the design of the hybrid memory system, we leverage the on-board hard IP ARM processors to improve both simulation performance and the accuracy of the results. Users can thus implement their data placement and migration policies in the FPGA logic elements and evaluate new designs quickly and effectively. Results show that our emulation platform provides a 9280x speedup in simulation time compared to its software counterpart, gem5.