Paper title

Hashing algorithms, optimized mappings and massive parallelization of multiconfigurational methods for bosons

Authors

Alex Andriati, Arnaldo Gammal

Abstract

Numerical routines are developed for indexing Fock states and for handling creation and annihilation operators in the spanned multiconfigurational space. Starting from the combinatorial problem of fitting particles in a truncated basis of individual particle states, which defines the spanned multiconfigurational space, a hashing function is provided based on a metric that sorts all possible configurations, that is, the sets of occupation numbers required in the definition of Fock states. Although the hashing function unambiguously relates each configuration to the coefficient index of the many-particle state expansion in the Fock basis, computing averages of creation and annihilation operators can be highly demanding, especially when they are embedded in a time-dependent problem. Therefore, improvements in the conversion between configurations after the action of creation and annihilation operators are thoroughly inspected, highlighting both the advantages and the additional memory consumption. We also exploit the massively parallel processors of graphics processing units with CUDA to improve a routine that applies the many-body Hamiltonian matrix on the spanned multiconfigurational space, which quantitatively demonstrates the scalability of the problem. The improvements shown here seem promising, especially for calculations involving a large number of particles, in which case the optimized CUDA code runs roughly fifty times faster than on a single-core processor. The codes were consistently tested with an application to the Lieb-Liniger gas, evaluating the ground state and comparing it with the analytical solution.
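
To make the idea of a hashing function for Fock states concrete, the following is a minimal Python sketch of one standard combinatorial ranking. It is not the authors' routine. It assumes N bosons in M single-particle states, so the space has C(N+M-1, M-1) configurations, and it orders them so that the state with all particles in the first orbital has index 0. The names n_configs, fock_to_index, index_to_fock and apply_hop are hypothetical and chosen only for illustration.

```python
from math import comb, sqrt


def n_configs(npar, norb):
    """Number of ways to place npar bosons in norb orbitals."""
    return comb(npar + norb - 1, norb - 1)


def fock_to_index(occ):
    """Map an occupation-number list to its coefficient index (assumed ordering)."""
    remaining = sum(occ)
    index = 0
    # Walk from the last orbital down to the second one.
    for m in range(len(occ) - 1, 0, -1):
        # Skip every configuration that has fewer particles in orbital m.
        for j in range(occ[m]):
            index += n_configs(remaining - j, m)
        remaining -= occ[m]
    return index


def index_to_fock(index, npar, norb):
    """Inverse of fock_to_index: rebuild the occupation numbers from an index."""
    occ = [0] * norb
    remaining = npar
    for m in range(norb - 1, 0, -1):
        while index >= n_configs(remaining, m):
            index -= n_configs(remaining, m)
            remaining -= 1
            occ[m] += 1
    occ[0] = remaining
    return occ


def apply_hop(occ, k, l):
    """Act with a^dagger_k a_l on |occ>.

    Returns (factor, new_index), or None if the state is annihilated.
    """
    if occ[l] == 0:
        return None
    new = list(occ)
    factor = sqrt(new[l])   # a_l |n_l> = sqrt(n_l) |n_l - 1>
    new[l] -= 1
    new[k] += 1
    factor *= sqrt(new[k])  # a^dagger_k |n_k> = sqrt(n_k + 1) |n_k + 1>
    return factor, fock_to_index(new)


if __name__ == "__main__":
    # Enumerate the full space for 3 bosons in 3 orbitals.
    for i in range(n_configs(3, 3)):
        print(i, index_to_fock(i, 3, 3))
```

In this sketch every call to apply_hop recomputes the index of the resulting configuration. A time-dependent calculation repeats that work at every step, which is why, under the same assumptions, one could instead precompute a table of target indices for each (configuration, k, l) triple. That trades extra memory for speed, the kind of trade-off the abstract refers to.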
