A comparative study of source-finding techniques in HI emission line cubes using SoFiA, MTObjects, and supervised deep learning

Paper title

A comparative study of source-finding techniques in HI emission line cubes using SoFiA, MTObjects, and supervised deep learning

Authors

Barkai, J. A., Verheijen, M. A. W., Martínez, E. T., Wilkinson, M. H. F.

Abstract

The 21 cm spectral line emission of atomic neutral hydrogen (HI) is one of the primary spectral lines observed in radio astronomy. However, the signal is intrinsically faint and the HI content of galaxies depends on the cosmic environment, so large survey volumes and survey depths are required to investigate the HI Universe. As the amount of data coming from these surveys continues to increase with technological improvements, so does the need for automatic techniques for identifying and characterising HI sources while considering the tradeoff between completeness and purity. This study aimed to find the optimal pipeline for finding and masking the most sources with the best mask quality and the fewest artefacts in 3D neutral hydrogen 21 cm spectral line data cubes, building on various existing methods. Two traditional source-finding methods were tested, SoFiA and MTObjects, as well as a new supervised deep learning approach, in which a 3D convolutional neural network architecture, known as V-Net, was used. These three source-finding methods were further improved by adding a classical machine learning classifier as a post-processing step to remove false positive detections. The pipelines were tested on HI data cubes from the Westerbork Synthesis Radio Telescope with additional inserted mock galaxies. SoFiA combined with a random forest classifier provided the best results, with the V-Net-random forest combination a close second. We suspect this is because there are many more mock sources in the training set than real sources. There is, therefore, room to improve the quality of the V-Net network with better-labelled data such that it can potentially outperform SoFiA.
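
The completeness and purity tradeoff mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual definitions (completeness is the fraction of true sources recovered, purity is the fraction of detections that are real); the function name and all numbers below are hypothetical and are not taken from the paper.

```python
def completeness_and_purity(n_true_sources, n_detections, n_matched):
    """Return (completeness, purity) for one source-finding run.

    n_true_sources: number of real or inserted mock sources in the cube
    n_detections:   number of objects reported by the source finder
    n_matched:      number of detections that correspond to a true source
    """
    if n_matched > min(n_true_sources, n_detections):
        raise ValueError("matched count cannot exceed sources or detections")
    completeness = n_matched / n_true_sources if n_true_sources else 0.0
    purity = n_matched / n_detections if n_detections else 0.0
    return completeness, purity


# Hypothetical run: 100 true sources, 150 detections, 80 of them real.
before = completeness_and_purity(100, 150, 80)

# Hypothetical post-processing classifier that rejects 62 detections:
# 60 false positives and, as a side effect, 2 real sources.
after = completeness_and_purity(100, 150 - 62, 80 - 2)
```

In this made-up case the filter gives up a little completeness in exchange for a large gain in purity, which is the kind of trade a false-positive classifier used as a post-processing step is meant to make.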

Paper title