Paper Title
AritPIM: High-Throughput In-Memory Arithmetic
Paper Authors
Paper Abstract
Digital processing-in-memory (PIM) architectures are rapidly emerging to overcome the memory-wall bottleneck by integrating logic within memory elements. Such architectures provide vast computational power within the memory itself in the form of parallel bitwise logic operations. We develop novel algorithmic techniques for PIM that, combined with new perspectives on computer arithmetic, extend this bitwise parallelism to the four fundamental arithmetic operations (addition, subtraction, multiplication, and division), for both fixed-point and floating-point numbers, and using both bit-serial and bit-parallel approaches. We propose a state-of-the-art suite of arithmetic algorithms, demonstrating the first algorithm in the literature of digital PIM for a majority of cases - including cases previously considered impossible for digital PIM, such as floating-point addition. Through a case study on memristive PIM, we compare the proposed algorithms to an NVIDIA RTX 3070 GPU and demonstrate significant throughput and energy improvements.
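To make the idea of building arithmetic from parallel bitwise logic concrete, here is a minimal software sketch of bit-serial, element-parallel addition. It is not the algorithm proposed in the paper: it is a textbook ripple-carry adder, and the memory array is only simulated with Python integers, where each integer stands for one column of bits and each bit position within it stands for one row. The function names (`to_bit_planes`, `from_bit_planes`, `bit_serial_add`, `elementwise_add`) are hypothetical and chosen for illustration.

```python
def to_bit_planes(values, width):
    """Transpose a list of unsigned ints into `width` bit-planes.

    Bit r of planes[i] holds bit i of values[r], so a single bitwise
    operation on a plane acts on every row (element) at once.
    """
    planes = []
    for i in range(width):
        plane = 0
        for r, v in enumerate(values):
            plane |= ((v >> i) & 1) << r
        planes.append(plane)
    return planes


def from_bit_planes(planes, rows):
    """Inverse of to_bit_planes: recover one unsigned int per row."""
    return [
        sum(((plane >> r) & 1) << i for i, plane in enumerate(planes))
        for r in range(rows)
    ]


def bit_serial_add(a_planes, b_planes):
    """Ripple-carry addition, one bit position per step, all rows in parallel.

    Only AND/OR/XOR on whole planes are used. The result is truncated to
    the input width, so sums wrap modulo 2**width.
    """
    carry = 0  # carry plane: one carry bit per row, initially all zero
    out = []
    for a, b in zip(a_planes, b_planes):  # least significant bit first
        a_xor_b = a ^ b
        out.append(a_xor_b ^ carry)           # sum bit for every row
        carry = (a & b) | (a_xor_b & carry)   # carry-out for every row
    return out


def elementwise_add(xs, ys, width):
    """Add two equal-length vectors of `width`-bit unsigned ints."""
    planes = bit_serial_add(to_bit_planes(xs, width), to_bit_planes(ys, width))
    return from_bit_planes(planes, len(xs))
```

In this sketch the number of bitwise steps grows with the bit width but not with the number of rows, which is the property that lets such an approach scale throughput with memory size. For example, `elementwise_add([3, 200, 255], [5, 100, 1], 8)` adds all three pairs with the same eight loop iterations.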