Paper title
Robust Classification of High-Dimensional Spectroscopy Data Using Deep Learning and Data Synthesis
Paper authors
Paper abstract
This paper presents a new approach to the classification of high-dimensional spectroscopy data and demonstrates that it outperforms current state-of-the-art approaches. The specific task we consider is identifying whether or not samples contain chlorinated solvents, based on their Raman spectra. We also examine robustness when classifying outlier samples that are not represented in the training set (negative outliers). A novel application of a locally connected neural network (NN) to the binary classification of spectroscopy data is proposed and demonstrated to yield improved accuracy over traditionally popular algorithms. Additionally, we show that the accuracy of the locally connected NN algorithm can be increased further through the use of synthetic training spectra, and we investigate the use of autoencoder-based one-class classifiers and outlier detectors. Finally, a two-step classification process is presented as an alternative to the binary and one-class classification paradigms. This process combines the locally connected NN classifier, the use of synthetic training data, and an autoencoder-based outlier detector to yield a model that is shown to both achieve high classification accuracy and remain robust to the presence of negative outliers.
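The two-step process described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the outlier detector is assumed to run before the binary classifier, a PCA-style linear autoencoder stands in for the neural autoencoder, and the binary classifier is passed in as an arbitrary callable rather than a locally connected NN. The function names, the reconstruction-error threshold, and the label -1 for flagged outliers are all hypothetical.

```python
import numpy as np


def fit_linear_autoencoder(X, n_components):
    """Fit a PCA-style linear autoencoder (a stand-in for a neural autoencoder)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered training spectra.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(X, mean, components):
    """Mean squared error between each spectrum and its reconstruction."""
    centered = X - mean
    # Encode onto the components, then decode back to the spectral space.
    reconstructed = centered @ components.T @ components
    return np.mean((centered - reconstructed) ** 2, axis=1)


def two_step_classify(X, mean, components, threshold, binary_classifier):
    """Step 1: flag outliers by reconstruction error. Step 2: classify the rest.

    Returns -1 for flagged outliers (hypothetical convention), otherwise the
    0/1 label produced by binary_classifier.
    """
    errors = reconstruction_error(X, mean, components)
    labels = np.full(X.shape[0], -1, dtype=int)
    inliers = errors <= threshold
    if inliers.any():
        labels[inliers] = binary_classifier(X[inliers])
    return labels
```

In this sketch, a spectrum that the autoencoder reconstructs poorly is never shown to the binary classifier, which is one way to obtain robustness to samples unlike anything in the training set. The threshold would have to be chosen separately, for example from the distribution of reconstruction errors on the training spectra; that choice is an assumption and is not taken from the paper.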