Paper Title

Data Analytics-enabled Intrusion Detection: Evaluations of ToN_IoT Linux Datasets

Authors

Nour Moustafa, Mohiuddin Ahmed, Sherif Ahmed

Abstract

With the widespread adoption of Artificial Intelligence (AI)-enabled security applications, there is a need to collect heterogeneous and scalable data sources to evaluate the performance of these applications effectively. This paper describes new datasets, named the ToN_IoT datasets, which include distributed data sources collected from telemetry datasets of Internet of Things (IoT) services, operating system datasets of Windows and Linux, and network traffic datasets. The paper aims to describe the new testbed architecture used to collect the Linux datasets from audit traces of hard disk, memory, and process activity. The architecture was designed in three distributed layers: edge, fog, and cloud. The edge layer comprises IoT and network systems, the fog layer includes virtual machines and gateways, and the cloud layer includes data analytics and visualization tools connected to the other two layers. The layers were programmatically controlled using Software-Defined Networking (SDN) and Network Function Virtualization (NFV) through the VMware NSX and vCloud NFV platforms. The Linux ToN_IoT datasets can be used to train and validate new federated and distributed AI-enabled security solutions such as intrusion detection, threat intelligence, privacy preservation, and digital forensics. Various data analytics and machine learning methods are employed to assess the fidelity of the datasets in terms of feature engineering, statistics of legitimate and security events, and reliability of security events. The datasets can be publicly accessed from [1].
