A Survey of Machine Learning Methods and Challenges for Windows Malware Classification

Paper Title

A Survey of Machine Learning Methods and Challenges for Windows Malware Classification

Authors

Edward Raff, Charles Nicholas

Abstract

Malware classification is a difficult problem, to which machine learning methods have been applied for decades. Yet progress has often been slow, in part due to a number of unique difficulties with the task that occur through all stages of developing a machine learning system: data collection, labeling, feature creation and selection, model selection, and evaluation. In this survey we will review a number of the current methods and challenges related to malware classification, including data collection, feature extraction, model construction, and evaluation. Our discussion will include thoughts on the constraints that must be considered for machine-learning-based solutions in this domain, and on yet-to-be-tackled problems for which machine learning could also provide a solution. This survey aims to be useful both to cybersecurity practitioners who wish to learn more about how machine learning can be applied to the malware problem, and to data scientists who need the necessary background on the challenges in this uniquely complicated space.

Paper Title