Paper title
Particle Transformer for Jet Tagging
Paper authors
Paper abstract
Jet tagging is a critical yet challenging classification task in particle physics. While deep learning has transformed jet tagging and significantly improved performance, the lack of a large-scale public dataset impedes further enhancement. In this work, we present JetClass, a new comprehensive dataset for jet tagging. The JetClass dataset consists of 100 M jets, about two orders of magnitude larger than existing public datasets. A total of 10 types of jets are simulated, including several types unexplored for tagging so far. Based on the large dataset, we propose a new Transformer-based architecture for jet tagging, called Particle Transformer (ParT). By incorporating pairwise particle interactions in the attention mechanism, ParT achieves higher tagging performance than a plain Transformer and surpasses the previous state-of-the-art, ParticleNet, by a large margin. The pre-trained ParT models, once fine-tuned, also substantially enhance the performance on two widely adopted jet tagging benchmarks. The dataset, code and models are publicly available at https://github.com/jet-universe/particle_transformer.
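The abstract says ParT incorporates pairwise particle interactions in the attention mechanism. One common way to do this, shown here only as a minimal sketch and not as the authors' implementation, is to add a pairwise bias term to the attention logits before the softmax. The function names and the bias matrix `u` are hypothetical; in a real model `u` would be derived from features of each particle pair and the computation would be batched and multi-headed.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_with_pair_bias(q, k, v, u):
    """Single-head scaled dot-product attention with an additive pairwise bias.

    q, k, v: lists of per-particle vectors (n rows each).
    u: n x n matrix; u[i][j] is a hypothetical interaction score for the
       particle pair (i, j), added to the logit before the softmax.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        # Logit for pair (i, j) = scaled dot product + pairwise bias.
        logits = [
            scale * sum(a * b for a, b in zip(qi, kj)) + u[i][j]
            for j, kj in enumerate(k)
        ]
        weights = softmax(logits)
        # Weighted sum of value vectors, one output vector per particle.
        out.append([
            sum(w * vj[c] for w, vj in zip(weights, v))
            for c in range(len(v[0]))
        ])
    return out
```

With an all-zero `u` this reduces to plain scaled dot-product attention, so the bias is the only place where pair-level information enters the computation.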