Detecting In-vehicle Intrusion via Semi-supervised Learning-based Convolutional Adversarial Autoencoders

Paper Title

Detecting In-vehicle Intrusion via Semi-supervised Learning-based Convolutional Adversarial Autoencoders

Authors

Thien-Nu Hoang, Daehee Kim

Abstract

With the development of autonomous vehicle technology, the controller area network (CAN) bus has become the de facto standard for in-vehicle communication systems because of its simplicity and efficiency. However, because it lacks encryption and authentication mechanisms, an in-vehicle network using the CAN protocol is susceptible to a wide range of attacks. Many studies, mostly based on machine learning, have proposed installing an intrusion detection system (IDS) for anomaly detection on the CAN bus. Although machine learning methods have many advantages for IDS, previous models usually require a large amount of labeled data, which results in high time and labor costs. To handle this problem, we propose a novel semi-supervised learning-based convolutional adversarial autoencoder model in this paper. The proposed model combines two popular deep learning models: the autoencoder and the generative adversarial network. First, the model is trained with unlabeled data to learn the manifolds of normal and attack patterns. Then, only a small number of labeled samples are used in supervised training. The proposed model can detect various kinds of message injection attacks, such as DoS, fuzzy, and spoofing, as well as unknown attacks. The experimental results show that, with limited labeled data, the proposed model achieves the highest F1 score of 0.99 and a low error rate of 0.1% compared to other supervised methods. In addition, we show that the model can meet the real-time requirement by analyzing the model complexity in terms of the number of trainable parameters and inference time. Compared to a state-of-the-art model, this study reduced the number of model parameters by a factor of five and the inference time by a factor of eight.
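
The abstract reports an F1 score and an error rate without defining them. As a rough illustration only, the sketch below computes both from confusion-matrix counts using the standard F1 definition and, as an assumption, an error rate defined as the fraction of misclassified samples; the paper itself may define the error rate differently. The function name `f1_and_error_rate` and the counts in the usage note are hypothetical and are not taken from the paper.

```python
def f1_and_error_rate(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (F1 score, error rate) from binary confusion-matrix counts.

    Assumption: error rate = (FP + FN) / total samples. This is a common
    definition, not one confirmed by the abstract.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one sample is required")

    # Precision and recall fall back to 0.0 when their denominators are empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0.0:
        f1 = 0.0
    else:
        f1 = 2.0 * precision * recall / (precision + recall)

    error_rate = (fp + fn) / total
    return f1, error_rate
```

For example, calling `f1_and_error_rate(tp=990, fp=10, fn=10, tn=8990)` on these made-up counts gives a precision and recall of 0.99 each and 20 misclassified frames out of 10,000.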
