Paper title
The scatter in the galaxy-halo connection: a machine learning analysis
Paper authors
Paper abstract
We apply machine learning, a powerful method for uncovering complex correlations in high-dimensional data, to the galaxy-halo connection of cosmological hydrodynamical simulations. The mapping between galaxy and halo variables is stochastic in the absence of perfect information, but conventional machine learning models are deterministic and hence cannot capture its intrinsic scatter. To overcome this limitation, we design an ensemble of neural networks with a Gaussian loss function that predict probability distributions, allowing us to model statistical uncertainties in the galaxy-halo connection as well as its best-fit trends. We extract a number of galaxy and halo variables from the Horizon-AGN and IllustrisTNG100-1 simulations and quantify the extent to which knowledge of some subset of one enables prediction of the other. This allows us to identify the key features of the galaxy-halo connection and investigate the origin of its scatter in various projections. We find that while halo properties beyond mass account for up to 50 per cent of the scatter in the halo-to-stellar mass relation, the prediction of stellar half-mass radius or total gas mass is not substantially improved by adding further halo properties. We also use these results to investigate semi-analytic models for galaxy size in the two simulations, finding that assumptions relating galaxy size to halo size or spin are not successful.
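As an illustration of the idea of a Gaussian loss function and an ensemble that predicts probability distributions, the following is a minimal sketch and not the authors' implementation. It assumes each network outputs a mean and a log-variance per sample, and that the ensemble is combined as an equally weighted Gaussian mixture; the function names `gaussian_nll` and `combine_ensemble` are hypothetical.

```python
import numpy as np

def gaussian_nll(y, mu, log_var):
    """Mean negative log-likelihood of targets y under N(mu, exp(log_var)).

    Assumption: the network predicts log-variance, which keeps the
    variance positive without a constrained output layer.
    """
    y, mu, log_var = map(np.asarray, (y, mu, log_var))
    per_sample = 0.5 * (np.log(2.0 * np.pi) + log_var
                        + (y - mu) ** 2 / np.exp(log_var))
    return per_sample.mean()

def combine_ensemble(mus, variances):
    """Mean and variance of an equally weighted mixture of Gaussians.

    mus, variances: arrays of shape (n_members, n_samples).
    The mixture variance is the average predicted variance plus the
    spread of the member means (law of total variance).
    """
    mus = np.asarray(mus, dtype=float)
    variances = np.asarray(variances, dtype=float)
    mean = mus.mean(axis=0)
    var = variances.mean(axis=0) + mus.var(axis=0)
    return mean, var

# Two hypothetical ensemble members predicting one sample.
mean, var = combine_ensemble([[1.0], [3.0]], [[1.0], [1.0]])
```

In this sketch the first variance term plays the role of the predicted intrinsic scatter, while the second reflects disagreement between ensemble members.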