Paper Title
A Neurochaos Learning Architecture for Genome Classification
Paper Authors
Paper Abstract
There is empirical evidence of non-linearity and chaos at the level of single neurons in biological neural networks. The properties of chaotic neurons inspire us to employ them in artificial learning systems. Here, we propose a Neurochaos Learning (NL) architecture, in which the neurons used to extract features from data are 1D chaotic maps. ChaosFEX+SVM, an instance of this NL architecture, is proposed as a hybrid combination of chaos and a classical machine learning algorithm. We formally prove that a single layer of NL with a finite number of 1D chaotic neurons satisfies the Universal Approximation Theorem, with an exact value for the number of chaotic neurons needed to approximate a discrete real-valued function with finite support. This is made possible by the topological transitivity property of chaos and the existence of an uncountably infinite number of dense orbits for the chosen 1D chaotic map. The chaotic neurons in NL are activated in the presence of an input stimulus (data) and output a chaotic firing trajectory. From the chaotic firing trajectories of individual NL neurons, we extract Firing Time, Firing Rate, Energy and Entropy, which constitute the ChaosFEX features. These ChaosFEX features are then fed to a Support Vector Machine with a linear kernel for classification. The effectiveness of the chaotic feature engineering performed by NL (ChaosFEX+SVM) is demonstrated on synthetic and real-world datasets in both the low and high training sample regimes. Specifically, we consider the problem of classifying genome sequences of SARS-CoV-2 from those of other coronaviruses (SARS-CoV-1, MERS-CoV and others). With just one training sample per class, over 1000 random trials of training, we report an average macro F1-score > 0.99 for the classification of SARS-CoV-2 from SARS-CoV-1 genome sequences. Robustness of ChaosFEX features to additive noise is also demonstrated.
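The feature extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the 1D chaotic map, the stopping rule, or the exact feature definitions, so everything specific here is an assumption made for illustration: a skew tent map as the neuron, a neuron that stops firing once its state comes within `eps` of the (normalized) stimulus, and hypothetical parameter values `q`, `b` and `eps`. The function names are likewise hypothetical and are not taken from the paper.

```python
import math


def skew_tent(x, b):
    """Skew tent map on [0, 1] with skew parameter b in (0, 1) (assumed map)."""
    return x / b if x < b else (1.0 - x) / (1.0 - b)


def chaosfex_features(stimulus, q=0.34, b=0.499, eps=0.01, max_iter=10000):
    """Return (firing_time, firing_rate, energy, entropy) for one neuron.

    Assumptions (not stated in the abstract):
    - the neuron starts at initial activity q and iterates the skew tent map;
    - it stops once its state is within eps of the stimulus, or after
      max_iter steps as a safeguard against non-terminating orbits;
    - firing rate is the fraction of the trajectory above b;
    - energy is the sum of squared trajectory values;
    - entropy is the Shannon entropy (bits) of the above/below-b symbols.
    """
    x = q
    trajectory = []
    while abs(x - stimulus) >= eps and len(trajectory) < max_iter:
        trajectory.append(x)
        x = skew_tent(x, b)

    firing_time = len(trajectory)
    if firing_time == 0:
        # The stimulus already lies within eps of q: no firing occurs.
        return 0, 0.0, 0.0, 0.0

    firing_rate = sum(1 for v in trajectory if v > b) / firing_time
    energy = sum(v * v for v in trajectory)

    entropy = 0.0
    for p in (firing_rate, 1.0 - firing_rate):
        if p > 0.0:
            entropy -= p * math.log2(p)

    return firing_time, firing_rate, energy, entropy


def extract(sample, **params):
    """Concatenate the four features of every input value (one neuron each)."""
    features = []
    for value in sample:
        features.extend(chaosfex_features(value, **params))
    return features
```

Under these assumptions, a sample with n values normalized to [0, 1] maps to a feature vector of length 4n. Such vectors could then be passed to any linear-kernel SVM implementation (for example, scikit-learn's `SVC(kernel="linear")`) for the classification step.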