Paper title
Reliability in Software Engineering Qualitative Research through Inter-Coder Agreement: A guide using Krippendorff's $α$ & Atlas.ti
Paper authors
Paper abstract
In recent years, empirical software engineering research that uses qualitative data analysis (e.g., case studies, interview surveys, and grounded theory studies) has been increasing. However, most of this research does not delve into the reliability and validity of its findings, specifically the reliability of the coding on which these methodologies rely, even though a variety of statistical techniques known as Inter-Coder Agreement (ICA) exist for analyzing consensus in team coding. This paper aims to establish a novel theoretical framework that enables a methodological approach for conducting this validity analysis. The framework is based on a set of coefficients that measure the degree of agreement different coders achieve when judging a common matter. We analyze different reliability coefficients and provide detailed examples of their calculation, with special attention to Krippendorff's $α$ coefficients. We systematically review several variants of Krippendorff's $α$ reported in the literature and provide a novel common mathematical framework in which all of them are unified through a universal $α$ coefficient. Finally, this paper provides a detailed guide to using this theoretical framework in a large case study on DevOps culture. We explain how $α$ coefficients are computed and interpreted using Atlas.ti, a widely used software tool for qualitative analysis. We expect that this work will help empirical researchers, particularly in software engineering, to improve the quality and trustworthiness of their studies.
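To make the idea of an agreement coefficient concrete, the following is a minimal sketch of the standard Krippendorff's $α$ for nominal data, $α = 1 - D_o / D_e$, where $D_o$ is the observed disagreement and $D_e$ the disagreement expected by chance. It is not the universal $α$ coefficient proposed in the paper, nor the computation performed by Atlas.ti; the function name, the input layout (one list of labels per coded unit), and the sample labels are assumptions made for illustration only.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels (illustrative sketch).

    `units` is a list of lists (a hypothetical layout): each inner list
    holds the labels that the coders assigned to one unit. Coders who
    skipped a unit are simply absent from that unit's list.
    """
    # Coincidence matrix: every ordered pair of labels given to the same
    # unit by two different coders contributes 1 / (m - 1), where m is
    # the number of labels the unit received.
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue  # a unit coded only once cannot be paired
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals n_c per label and the grand total n.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("not enough pairable labels")

    # Both disagreements are left unnormalised by n; the factor cancels
    # in the ratio D_o / D_e.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("only one label in use; alpha is undefined")
    return 1.0 - observed / expected


# Hypothetical data: two coders assigning a binary code to ten units.
coder_a = [0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
coder_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
alpha = krippendorff_alpha_nominal([list(pair) for pair in zip(coder_a, coder_b)])
print(round(alpha, 3))
```

In this sketch the two coders agree on six of the ten units, yet $α$ stays close to zero because much of that agreement is what chance alone would produce given how often each label is used. This is the reason chance-corrected coefficients are preferred over raw percent agreement.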