Monte Carlo neutron transport using low power mobile GPU devices

Title

Monte Carlo neutron transport using low power mobile GPU devices

Authors

Liu, Changyuan

Abstract

The use of GPUs for Monte Carlo particle transport lacks fair comparisons. This work performs simulations on both the CPU and the GPU of low-power mobile devices, where the two processors sit in the same package and are built with the same manufacturing process. Experiments with simple pincell benchmark problems with fresh fuel give consistent results between CPU and GPU. They also show that the Apple M1 GPU is twice as capable as the M1 CPU, with a 5 times advantage in power consumption. A particle sorting algorithm optimized for the GPU improves computing efficiency by 28% while markedly reducing GPU power consumption. This advantage of the sorting algorithm is expected to be greater for depleted fuel problems than for fresh fuel problems. The kernel reconstruction Doppler broadening algorithm, designed for continuously varying materials, is demonstrated to produce Doppler coefficients consistent with the reference code, and it can be implemented efficiently on the GPU. Compared with the reference code, which uses double-precision floating-point numbers, the test codes with single-precision floating-point numbers could underestimate the k-effective values by about 500 pcm, although the Doppler coefficients of the fuel are still well reproduced. The conclusions may strengthen the argument that adopting GPUs helps high-performance computers reduce gross power consumption.
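To make the scale of the reported single-precision bias concrete, the sketch below converts a difference in k-effective into pcm, using the standard definition 1 pcm = 1e-5. The function name and the two k-effective values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
PCM_PER_UNIT = 1.0e5  # 1 pcm (per cent mille) corresponds to 1e-5 in k-effective


def k_difference_pcm(k_test: float, k_reference: float) -> float:
    """Return the difference k_test - k_reference expressed in pcm."""
    return (k_test - k_reference) * PCM_PER_UNIT


if __name__ == "__main__":
    # Hypothetical values for illustration only (not results from the paper).
    k_reference = 1.30000  # assumed double-precision reference value
    k_test = 1.29500       # assumed single-precision test value
    diff = k_difference_pcm(k_test, k_reference)
    print(f"k-effective difference: {diff:.0f} pcm")
```

A negative result means the test code underestimates the reference; a bias of about 500 pcm corresponds to a difference of roughly 0.005 in k-effective.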
