Linking Scientific Instruments and HPC: Patterns, Technologies, Experiences

Paper Title

Linking Scientific Instruments and HPC: Patterns, Technologies, Experiences

Authors

Rafael Vescovi, Ryan Chard, Nickolaus Saint, Ben Blaiszik, Jim Pruyne, Tekin Bicer, Alex Lavens, Zhengchun Liu, Michael E. Papka, Suresh Narayanan, Nicholas Schwarz, Kyle Chard, Ian Foster

Abstract

Powerful detectors at modern experimental facilities routinely collect data at multiple GB/s. Online analysis methods are needed to enable the collection of only interesting subsets of such massive data streams, such as by explicitly discarding some data elements or by directing instruments to relevant areas of experimental space. Such online analyses require methods for configuring and running high-performance distributed computing pipelines--what we call flows--linking instruments, HPC (e.g., for analysis, simulation, AI model training), edge computing (for analysis), data stores, metadata catalogs, and high-speed networks. In this article, we review common patterns associated with such flows and describe methods for instantiating those patterns. We also present experiences with the application of these methods to the processing of data from five different scientific instruments, each of which engages HPC resources for data inversion, machine learning model training, or other purposes. We also discuss implications of these new methods for operators and users of scientific facilities.
