Paper Title
PILArNet: Public Dataset for Particle Imaging Liquid Argon Detectors in High Energy Physics
Paper Authors
Abstract
Rapid advancement of machine learning solutions has often coincided with the production of a public test dataset. Such datasets reduce the largest barrier to entry for tackling a problem, procuring data, while also providing a benchmark for comparing different solutions. Furthermore, large datasets have been used to train high-performing feature finders, which are then used in new approaches to problems beyond those initially defined. To encourage rapid development in the analysis of data collected using liquid argon time projection chambers, a class of particle detectors used in high energy physics experiments, we have produced PILArNet, the first 2D and 3D open dataset to be used for a couple of key analysis tasks. The initial dataset presented in this paper contains 300,000 samples simulated and recorded in three different volume sizes. The dataset is stored efficiently in sparse 2D and 3D matrix format with auxiliary information about the simulated particles in the volume, and is made available for public research use. In this paper we describe the dataset, the tasks, and the method used to procure the samples.