Paper Title
Evolving Character-level Convolutional Neural Networks for Text Classification
Authors
Abstract
Character-level convolutional neural networks (char-CNNs) require no knowledge of the semantic or syntactic structure of the language they classify. This property simplifies their implementation but reduces their classification accuracy. Increasing the depth of char-CNN architectures does not result in breakthrough accuracy improvements. Research has not established which char-CNN architectures are optimal for text classification tasks. Manually designing and training char-CNNs is an iterative and time-consuming process that requires expert domain knowledge. Evolutionary deep learning (EDL) techniques, including surrogate-based versions, have demonstrated success in automatically searching for performant CNN architectures for image analysis tasks. Researchers have not applied EDL techniques to search the architecture space of char-CNNs for text classification tasks. This article presents the first work on evolving char-CNN architectures, using a novel EDL algorithm based on genetic programming, an indirect encoding, and surrogate models to search automatically for performant char-CNN architectures. The algorithm is evaluated on eight text classification datasets and benchmarked against five manually designed CNN architectures and one long short-term memory (LSTM) architecture. Experimental results indicate that the algorithm can evolve architectures that outperform the LSTM in terms of classification accuracy, and five of the manually designed CNN architectures in terms of classification accuracy and parameter count.
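The claim that a char-CNN needs no linguistic knowledge can be illustrated with a minimal sketch of character-level input encoding. The alphabet, the `encode_text` helper, and the `max_length` value below are assumptions chosen for illustration; they are not taken from the paper, which may use a different character set and sequence length.

```python
import string

# Hypothetical alphabet for illustration; the paper's actual character set
# is not stated in the abstract.
ALPHABET = string.ascii_lowercase + string.digits + string.punctuation + " "

# Index 0 is reserved for padding and for characters outside the alphabet.
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode_text(text, max_length=16):
    """Map a string to a fixed-length list of character indices.

    No tokenizer, vocabulary of words, or grammar is involved: each
    character is looked up independently.
    """
    # Lowercase, truncate to max_length, and look up each character.
    indices = [CHAR_TO_INDEX.get(ch, 0) for ch in text.lower()[:max_length]]
    # Right-pad with zeros so every sample has the same length.
    return indices + [0] * (max_length - len(indices))
```

A sequence of indices like this would typically be embedded or one-hot encoded before being passed to the convolutional layers; that step, and the evolved architecture itself, are outside the scope of this sketch.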