Automatic Detection and Analysis of Technical Debts in Peer-Review Documentation of R Packages

Paper title

Automatic Detection and Analysis of Technical Debts in Peer-Review Documentation of R Packages

Authors

Junaed Younus Khan, Gias Uddin

Abstract

Technical debt (TD) is a metaphor for code-related problems that arise from prioritizing speedy delivery over perfect code. Because reducing TD can have a long-term positive impact on the software engineering life-cycle (SDLC), TD is studied extensively in the literature. However, very little existing research has focused on technical debt in the R programming language, despite its popularity and usage. Recent research by Codabux et al. [21] analyzed peer-review documentation and found that R packages can have 10 diverse TD types. However, those findings are based on the manual analysis of a small sample of R package review comments. In this paper, we develop a suite of Machine Learning (ML) classifiers to detect the 10 TD types automatically. The best-performing classifier is based on the deep ML model BERT, which achieves F1-scores of 0.71 to 0.91. We then apply the trained BERT models to all available peer-review issue comments from two platforms, rOpenSci and Bioconductor (13.5K review comments from a total of 1297 R packages). We conduct an empirical study on the prevalence and evolution of the 10 TD types in the two R platforms. We find that documentation debt is the most prevalent among all types of TD, and that it is also expanding rapidly. We also find that R packages on the generic platform (i.e., rOpenSci) are more prone to TD than those on the domain-specific platform (i.e., Bioconductor). Our empirical findings can guide future improvement opportunities in R package documentation. Our ML models can be used to automatically monitor the prevalence and evolution of TD in R package documentation.

Paper title