Paper Title
The Cross-evaluation of Machine Learning-based Network Intrusion Detection Systems
Paper Authors
Paper Abstract
Enhancing Network Intrusion Detection Systems (NIDS) with supervised Machine Learning (ML) is tough. ML-NIDS must be trained and evaluated, operations requiring data in which benign and malicious samples are clearly labelled. Such labels demand costly expert knowledge, resulting in a lack of real deployments, as well as in papers relying on the same outdated data. The situation has improved recently, as some efforts have disclosed their labelled datasets. However, most past works used such datasets merely as 'yet another' testbed, overlooking the added potential provided by such availability. In contrast, we promote using this existing labelled data to cross-evaluate ML-NIDS. Such an approach has received only limited attention and, due to its complexity, requires a dedicated treatment. We hence propose the first cross-evaluation model. Our model highlights the broader range of realistic use-cases that can be assessed via cross-evaluations, allowing the discovery of still-unknown qualities of state-of-the-art ML-NIDS. For instance, their detection surface can be extended, at no additional labelling cost. However, conducting such cross-evaluations is challenging. Hence, we propose the first framework, XeNIDS, for reliable cross-evaluations based on Network Flows. By using XeNIDS on six well-known datasets, we demonstrate the concealed potential, but also the risks, of cross-evaluations of ML-NIDS.
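To make the core idea concrete, the sketch below illustrates cross-evaluation in its simplest form: a detector is trained on one labelled flow dataset and then scored on flows from a different dataset. Everything here is hypothetical (synthetic "flows", a toy threshold model, an assumed distribution shift between collections); it is not the XeNIDS framework, only the evaluation pattern the abstract describes.

```python
# Minimal sketch of ML-NIDS cross-evaluation (hypothetical data and model):
# train on labelled flows from dataset A, evaluate on flows from dataset B.
import random

random.seed(0)

def make_flows(n, benign_rate=0.8, shift=0.0):
    """Synthetic 'network flows' as (bytes_per_packet, label) pairs.
    `shift` crudely models distribution drift between independently
    collected datasets (different networks, capture periods, tools)."""
    flows = []
    for _ in range(n):
        malicious = random.random() > benign_rate
        base = 900 if malicious else 300  # assumed feature separation
        flows.append((random.gauss(base + shift, 100), int(malicious)))
    return flows

def train_threshold(flows):
    """Toy 'model': the byte threshold that best separates the labels."""
    best_t, best_acc = 0, 0.0
    for t in range(0, 1500, 10):
        acc = sum((x > t) == y for x, y in flows) / len(flows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def evaluate(threshold, flows):
    return sum((x > threshold) == y for x, y in flows) / len(flows)

dataset_a = make_flows(1000)              # e.g. flows from origin network A
dataset_b = make_flows(1000, shift=150)   # flows from a different network B

model = train_threshold(dataset_a)
self_acc = evaluate(model, dataset_a)     # conventional self-evaluation
cross_acc = evaluate(model, dataset_b)    # cross-evaluation on foreign data
print(f"self-evaluation : {self_acc:.2f}")
print(f"cross-evaluation: {cross_acc:.2f}")
```

The gap between the two scores is exactly what a cross-evaluation exposes: reusing existing labels costs nothing, but distribution differences between datasets can silently inflate or deflate the measured detection quality.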