Paper Title

The Classification of Optical Galaxy Morphology Using Unsupervised Learning Techniques

Authors

Ezra Fielding, Clement N. Nyirenda, Mattia Vaccari

Abstract

In recent years, large-scale, data-intensive astronomical surveys have produced more detailed images than scientists can manually classify. Even attempts to crowd-source this work will soon be outpaced by the volume of data generated by modern surveys. This has brought into question the viability of human-based methods for classifying galaxy morphology. While supervised learning methods require datasets with existing labels, unsupervised learning techniques do not. Therefore, this paper implements unsupervised learning techniques to classify the Galaxy Zoo DECaLS dataset. A convolutional autoencoder feature extractor was trained and implemented. The resulting features were then clustered via k-means, fuzzy c-means, and agglomerative clustering. These clusters were compared against the true volunteer classifications provided by the Galaxy Zoo DECaLS project. The best results, in general, were produced by the agglomerative clustering method. However, the improvement over k-means clustering was not significant considering the increase in clustering time. After the appropriate clustering algorithm optimizations, this approach could prove useful for classifying the better-performing questions and could serve as the basis for a novel approach to generating more "human-like" galaxy morphology classifications from unsupervised techniques.
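
The cluster-then-compare step described in the abstract can be illustrated with a toy sketch. This is not the authors' code: it uses only the Python standard library, synthetic 2-D points stand in for the autoencoder features, and the helper names (make_toy_features, kmeans, purity) are hypothetical. Cluster purity is an assumed comparison metric chosen for illustration; the abstract does not state which metric the paper uses.

```python
import random
from collections import Counter


def make_toy_features(n_per_class=50, seed=1):
    """Synthetic stand-in for autoencoder features: two Gaussian blobs in 2-D."""
    rng = random.Random(seed)
    feats, labels = [], []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (10.0, 10.0)]):
        for _ in range(n_per_class):
            feats.append((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)))
            labels.append(label)
    return feats, labels


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length float tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from the data
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def purity(cluster_ids, true_labels):
    """Fraction of points that carry the majority label of their cluster."""
    by_cluster = {}
    for c, t in zip(cluster_ids, true_labels):
        by_cluster.setdefault(c, Counter())[t] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(true_labels)


if __name__ == "__main__":
    feats, volunteer_labels = make_toy_features()
    clusters = kmeans(feats, k=2)
    print("purity:", purity(clusters, volunteer_labels))
```

In a real pipeline, the toy features would be replaced by the encoder's output for each galaxy image, and the labels by the volunteer responses to a single Galaxy Zoo question; fuzzy c-means or agglomerative clustering could be swapped in for the k-means step.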
