Paper title

The emptiness inside: Finding gaps, valleys, and lacunae with geometric data analysis

Authors

Contardo, Gabriella; Hogg, David W.; Hunt, Jason A. S.; Peek, Joshua E. G.; Chen, Yen-Chi

Abstract

Discoveries of gaps in data have been important in astrophysics. For example, there are kinematic gaps opened by resonances in dynamical systems, or exoplanets of a certain radius that are empirically rare. A gap in a data set is a kind of anomaly, but in an unusual sense: instead of being a single outlier data point, situated far from other data points, it is a region of the space, or a set of points, that is anomalous compared to its surroundings. Gaps are both interesting and hard to find and characterize, especially when they have non-trivial shapes. We present in this paper a statistic that can be used to estimate the (local) "gappiness" of a point in the data space. It uses the gradient and Hessian of the density estimate (and thus requires a twice-differentiable density estimator). This statistic can be computed at (almost) any point in the space and does not rely on optimization; it can highlight under-dense regions of any dimensionality and shape in a general and efficient way. We illustrate our method on the velocity distribution of nearby stars in the Milky Way disk plane, which exhibits gaps that could originate from different processes. Identifying and characterizing those gaps could help determine their origins. In an appendix, we provide implementation notes and additional considerations for finding under-densities in data using critical points and the properties of the Hessian of the density.
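
The abstract says only that the statistic combines the gradient and Hessian of a twice-differentiable density estimate; it does not give the formula. The sketch below is therefore an illustrative stand-in, not the authors' definition. It assumes a 2D isotropic Gaussian kernel density estimate with a hand-picked bandwidth `h`, and it scores a point by the curvature of the density along the direction perpendicular to the gradient. The function names `kde_derivatives` and `gap_score`, the zero-gradient fallback, and the toy data are all hypothetical.

```python
import math

def kde_derivatives(x, y, data, h):
    """Density, gradient and Hessian of an isotropic 2D Gaussian KDE at (x, y).

    Returns (p, (gx, gy), (hxx, hxy, hyy)). The kernel and the bandwidth h
    are assumptions made for this sketch.
    """
    norm = 1.0 / (len(data) * 2.0 * math.pi * h * h)
    p = gx = gy = hxx = hxy = hyy = 0.0
    for px, py in data:
        dx = (x - px) / h
        dy = (y - py) / h
        k = math.exp(-0.5 * (dx * dx + dy * dy))
        p += k
        # First derivatives of the Gaussian kernel.
        gx += -dx / h * k
        gy += -dy / h * k
        # Second derivatives of the Gaussian kernel.
        hxx += (dx * dx - 1.0) / (h * h) * k
        hxy += (dx * dy) / (h * h) * k
        hyy += (dy * dy - 1.0) / (h * h) * k
    return norm * p, (norm * gx, norm * gy), (norm * hxx, norm * hxy, norm * hyy)

def gap_score(x, y, data, h):
    """Illustrative gap score: density curvature perpendicular to the gradient.

    Positive values suggest a valley-like point (the density curves upward
    across the gradient direction); negative values suggest a ridge.
    """
    _, (gx, gy), (hxx, hxy, hyy) = kde_derivatives(x, y, data, h)
    norm2 = gx * gx + gy * gy
    if norm2 == 0.0:
        # Assumption: where the gradient vanishes the perpendicular direction
        # is undefined, so fall back to the largest Hessian eigenvalue.
        mean = 0.5 * (hxx + hyy)
        diff = 0.5 * (hxx - hyy)
        return mean + math.sqrt(diff * diff + hxy * hxy)
    # Unnormalized direction perpendicular to the gradient.
    tx, ty = -gy, gx
    return (tx * tx * hxx + 2.0 * tx * ty * hxy + ty * ty * hyy) / norm2

# Toy data (made up for illustration): two parallel stripes of points
# at x = -1.5 and x = +1.5, leaving an empty band around x = 0.
ys = [-3.0 + 0.5 * i for i in range(13)]
data = [(-1.5, y) for y in ys] + [(1.5, y) for y in ys]

valley = gap_score(0.0, 1.0, data, h=0.7)  # point inside the empty band
ridge = gap_score(1.5, 1.0, data, h=0.7)   # point on one of the stripes
print("valley score:", valley)
print("ridge score:", ridge)
```

On this toy data the score is positive inside the empty band and negative on a stripe, so thresholding it on a grid of evaluation points would outline the gap. Each evaluation is one pass over the data with no optimization step, which matches the abstract's description only in spirit.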
