DistAD: Software Anomaly Detection Based on Execution Trace Distribution

Paper Title

DistAD: Software Anomaly Detection Based on Execution Trace Distribution

Authors

Shiyi Kong, Jun Ai, Minyan Lu, Shuguang Wang, W. Eric Wong

Abstract

Modern software systems have become increasingly complex, which makes them difficult to test and validate. Detecting partial software anomalies in complex systems at runtime can help handle unintended software behaviors, avoid catastrophic software failures, and improve runtime availability. These detection techniques aim to identify the manifestations of faults (anomalies) before they lead to unavoidable failures, thereby supporting subsequent runtime fault-tolerance techniques. In this work, we propose a novel anomaly detection method named DistAD, which is based on the distribution of dynamic execution traces collected at software runtime. Unlike existing work that uses key performance indicators, DistAD collects execution traces at runtime via intrusive instrumentation. The instrumentation is controlled by a sampling mechanism to avoid excessive overhead. Bi-directional Long Short-Term Memory (Bi-LSTM), a Recurrent Neural Network (RNN) architecture, is used to perform the anomaly detection. The whole framework is built under a One-Class Neural Network (OCNN) learning mode, which helps overcome the lack of sufficient labeled samples and the data imbalance problem. A series of controlled experiments is conducted on Cassandra, a widely used database system, to demonstrate the validity and feasibility of the proposed method. The overhead introduced by the intrusive probing is also evaluated. The results show that DistAD can achieve more than 70% accuracy and 90% recall (in normal states) with no more than 2 times the overhead of unmonitored execution.
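The abstract does not say how the sampling mechanism or the trace distribution is implemented, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a probe records a function-entry event for a random fraction of calls, and that a "distribution" is the relative frequency of each event over a fixed vocabulary. The names `SampledTracer`, `probe`, and `trace_distribution` are hypothetical.

```python
import random
from collections import Counter
from functools import wraps


class SampledTracer:
    """Hypothetical probe manager: records function-entry events,
    but only for a randomly sampled fraction of calls."""

    def __init__(self, sample_rate, seed=None):
        if not 0.0 <= sample_rate <= 1.0:
            raise ValueError("sample_rate must be in [0, 1]")
        self.sample_rate = sample_rate
        self._rng = random.Random(seed)
        self.events = []  # names of the sampled calls, in call order

    def probe(self, func):
        """Decorator that instruments `func` with a sampled probe."""
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Record the event only when the sample is drawn; skipping
            # most calls is what keeps the probing overhead bounded.
            if self._rng.random() < self.sample_rate:
                self.events.append(func.__name__)
            return func(*args, **kwargs)
        return wrapper


def trace_distribution(events, vocabulary):
    """Relative frequency of each event name in `vocabulary`
    (assumed feature form; the paper's actual features may differ)."""
    counts = Counter(events)
    total = sum(counts[name] for name in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[name] / total for name in vocabulary]
```

Under these assumptions, a vector such as `trace_distribution(tracer.events, ["read", "write"])`, computed per time window, would be the kind of input a sequence model like the Bi-LSTM could consume. A lower `sample_rate` trades trace fidelity for lower overhead.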

