Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View

Paper Title

Malicious Network Traffic Detection via Deep Learning: An Information Theoretic View

Author

Galinkin, Erick

Abstract

The attention that deep learning has garnered from the academic community and industry continues to grow year over year, and it has been said that we are in a new golden age of artificial intelligence research. However, neural networks are still often seen as a "black box" where learning occurs but cannot be understood in a human-interpretable way. Since these machine learning systems are increasingly being adopted in security contexts, it is important to explore these interpretations. We consider an Android malware traffic dataset for approaching this problem. Then, using the information plane, we explore how homeomorphism affects learned representation of the data and the invariance of the mutual information captured by the parameters on that data. We empirically validate these results, using accuracy as a second measure of similarity of learned representations. Our results suggest that although the details of learned representations and the specific coordinate system defined over the manifold of all parameters differ slightly, the functional approximations are the same. Furthermore, our results show that since mutual information remains invariant under homeomorphism, only feature engineering methods that alter the entropy of the dataset will change the outcome of the neural network. This means that for some datasets and tasks, neural networks require meaningful, human-driven feature engineering or changes in architecture to provide enough information for the neural network to generate a sufficient statistic. Applying our results can serve to guide analysis methods for machine learning engineers and suggests that neural networks that can exploit the convolution theorem are equally accurate as standard convolutional neural networks, and can be more computationally efficient.
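The abstract's claim that mutual information is invariant under homeomorphism, so that only transformations changing the information content of the data can change the outcome, can be illustrated with a small discrete analogue. The sketch below is not the paper's experiment: the toy data, the helper `mutual_information`, and the use of a label permutation as a stand-in for a homeomorphism are all assumptions made for illustration. It compares a plug-in mutual information estimate before and after an invertible relabeling and after a lossy, non-invertible mapping.

```python
import numpy as np


def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in bits for two discrete 1-D arrays.

    Hypothetical helper for illustration; not taken from the paper.
    """
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    # Build the empirical joint distribution from co-occurrence counts.
    joint = np.zeros((x_vals.size, y_vals.size))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of X, shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal of Y, shape (1, ny)
    mask = joint > 0                       # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (px @ py)[mask])))


rng = np.random.default_rng(0)

# Toy data (assumption): an 8-valued feature and a noisy binary label.
x = rng.integers(0, 8, size=5000)
y = ((x >= 4) ^ (rng.random(5000) < 0.1)).astype(int)

# Invertible re-encoding: a permutation of the feature's values.
perm = rng.permutation(8)
x_bijective = perm[x]

# Lossy re-encoding: keeping only the parity discards what predicts y.
x_lossy = x % 2

mi_original = mutual_information(x, y)
mi_bijective = mutual_information(x_bijective, y)
mi_lossy = mutual_information(x_lossy, y)

print(f"original:  {mi_original:.4f} bits")
print(f"bijective: {mi_bijective:.4f} bits")
print(f"lossy:     {mi_lossy:.4f} bits")
```

Under these assumptions, the invertible relabeling leaves the estimate unchanged up to floating-point rounding, while the lossy mapping can only lower it. This mirrors, in a much simpler setting, the abstract's point that re-encodings preserving the information in the data should not change what a network can learn from it.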
