Paper Title

Direct Numerical Simulations of turbulent flows using high-order Asynchrony-Tolerant schemes: accuracy and performance

Authors

Komal Kumari, Diego A. Donzis

Abstract

Direct numerical simulations (DNS) are an indispensable tool for understanding the fundamental physics of turbulent flows. Because of their steep increase in computational cost with Reynolds number ($R_λ$), well-resolved DNS are realizable only on massively parallel supercomputers, even at moderate $R_λ$. However, at extreme scales, the communications and synchronizations between processing elements (PEs) involved in current approaches become exceedingly expensive and are expected to be a major bottleneck to scalability. In order to overcome this challenge, we developed algorithms using the so-called Asynchrony-Tolerant (AT) schemes that relax communication and synchronization constraints at a mathematical level, to perform DNS of decaying and solenoidally forced compressible turbulence. Asynchrony is introduced using two approaches, one that avoids synchronizations and the other that avoids communications. These result in periodic and random delays, respectively, at PE boundaries. We show that both asynchronous algorithms accurately resolve the large-scale and small-scale motions of turbulence, including instantaneous and intermittent fields. We also show that in asynchronous simulations the communication time is a relatively smaller fraction of the total computation time, especially at large processor count, compared to standard synchronous simulations. As a consequence, we observe improved parallel scalability up to $262144$ processors for both asynchronous algorithms.
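
The idea of delayed data at PE boundaries can be illustrated with a toy problem. The sketch below is not the authors' solver and does not implement AT schemes: it assumes a 1D periodic diffusion equation, a standard second-order central difference with forward Euler time stepping, and a hypothetical function name `diffusion_error`. The domain is split across a few emulated PEs, and each halo value is read from a time level that is a random number of steps old (up to `max_delay`), which mimics the random delays mentioned in the abstract.

```python
import numpy as np

def diffusion_error(n=64, n_pe=4, steps=200, max_delay=0, seed=0):
    """Solve u_t = u_xx on a periodic domain split across n_pe emulated PEs.

    Halo values at PE boundaries are taken from a time level that is a
    random 0..max_delay steps old (illustrative assumption). Returns the
    max-norm error against the exact solution exp(-t) * sin(x).
    """
    rng = np.random.default_rng(seed)
    dx = 2.0 * np.pi / n
    dt = 0.2 * dx**2              # r = dt/dx^2 = 0.2, inside the stability limit
    r = dt / dx**2
    x = dx * np.arange(n)
    u = np.sin(x)
    history = [u.copy()]          # history[-1] is current, history[-1-k] is k steps old
    m = n // n_pe                 # grid points per PE (n assumed divisible by n_pe)

    for _ in range(steps):
        new = np.empty_like(u)
        for p in range(n_pe):
            lo, hi = p * m, (p + 1) * m
            # Independent random delay for each halo, capped by available history.
            k_left = min(int(rng.integers(0, max_delay + 1)), len(history) - 1)
            k_right = min(int(rng.integers(0, max_delay + 1)), len(history) - 1)
            left_halo = history[-1 - k_left][(lo - 1) % n]
            right_halo = history[-1 - k_right][hi % n]
            local = np.concatenate(([left_halo], u[lo:hi], [right_halo]))
            # Standard central difference: interior points always use current data.
            new[lo:hi] = local[1:-1] + r * (local[2:] - 2.0 * local[1:-1] + local[:-2])
        u = new
        history.append(u.copy())
        history = history[-(max_delay + 1):]   # keep only the levels that can be read

    t = steps * dt
    return float(np.max(np.abs(u - np.exp(-t) * np.sin(x))))

if __name__ == "__main__":
    print("synchronous error :", diffusion_error(max_delay=0))
    print("asynchronous error:", diffusion_error(max_delay=3))
```

In this toy setup the stale halo values increase the error relative to the synchronous run. That is the kind of degradation that schemes designed to tolerate asynchrony aim to avoid while still relaxing communication and synchronization.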
