A Deep Learning Algorithm for High-Dimensional Exploratory Item Factor Analysis

Paper title

A Deep Learning Algorithm for High-Dimensional Exploratory Item Factor Analysis

Authors

Urban, Christopher J., Bauer, Daniel J.

Abstract

Marginal maximum likelihood (MML) estimation is the preferred approach to fitting item response theory models in psychometrics due to the MML estimator's consistency, normality, and efficiency as the sample size tends to infinity. However, state-of-the-art MML estimation procedures such as the Metropolis-Hastings Robbins-Monro (MH-RM) algorithm as well as approximate MML estimation procedures such as variational inference (VI) are computationally time-consuming when the sample size and the number of latent factors are very large. In this work, we investigate a deep learning-based VI algorithm for exploratory item factor analysis (IFA) that is computationally fast even in large data sets with many latent factors. The proposed approach applies a deep artificial neural network model called an importance-weighted autoencoder (IWAE) for exploratory IFA. The IWAE approximates the MML estimator using an importance sampling technique wherein increasing the number of importance-weighted (IW) samples drawn during fitting improves the approximation, typically at the cost of decreased computational efficiency. We provide a real data application that recovers results aligning with psychological theory across random starts. Via simulation studies, we show that the IWAE yields more accurate estimates as either the sample size or the number of IW samples increases (although factor correlation and intercepts estimates exhibit some bias) and obtains similar results to MH-RM in less time. Our simulations also suggest that the proposed approach performs similarly to and is potentially faster than constrained joint maximum likelihood estimation, a fast procedure that is consistent when the sample size and the number of items simultaneously tend to infinity.
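
The trade-off described in the abstract, where drawing more importance-weighted samples tightens the approximation to the marginal log-likelihood at extra computational cost, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy one-factor logistic model, replaces the IWAE's encoder network with a fixed Gaussian proposal, and uses hypothetical loadings, intercepts, and function names (`iw_bound`, `compare_bounds`) chosen only for illustration.

```python
import numpy as np


def log_sigmoid(t):
    # Numerically stable log(1 / (1 + exp(-t))).
    return -np.logaddexp(0.0, -t)


def log_lik(x, z, a, b):
    # log p(x | z) for binary items under a one-factor logistic model.
    # z has shape (K,); the result has shape (K,).
    logits = np.outer(z, a) + b
    return (x * log_sigmoid(logits) + (1 - x) * log_sigmoid(-logits)).sum(axis=1)


def log_normal_pdf(z, mu, sigma):
    # Log density of N(mu, sigma^2) evaluated at z.
    return -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)


def iw_bound(x, a, b, mu, sigma, K, rng):
    # One Monte Carlo estimate of the K-sample importance-weighted bound on log p(x).
    z = mu + sigma * rng.standard_normal(K)  # z_k ~ q(z | x) = N(mu, sigma^2)
    log_w = (log_lik(x, z, a, b)
             + log_normal_pdf(z, 0.0, 1.0)      # standard normal prior on the factor
             - log_normal_pdf(z, mu, sigma))    # proposal density
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))  # log of the average weight


def exact_log_marginal(x, a, b):
    # Riemann-sum quadrature over the single factor (feasible only in low dimensions).
    grid = np.linspace(-8.0, 8.0, 4001)
    dz = grid[1] - grid[0]
    integrand = np.exp(log_lik(x, grid, a, b) + log_normal_pdf(grid, 0.0, 1.0))
    return np.log(integrand.sum() * dz)


def compare_bounds(seed=0, reps=2000):
    rng = np.random.default_rng(seed)
    a = np.array([1.0, 1.5, 0.8, 1.2, 2.0])    # hypothetical loadings
    b = np.array([0.0, -0.5, 0.5, 0.2, -0.2])  # hypothetical intercepts
    x = np.ones(5)                              # one respondent endorsing every item
    mu, sigma = 0.0, 1.0                        # deliberately crude proposal (equals the prior)
    mean_k1 = np.mean([iw_bound(x, a, b, mu, sigma, 1, rng) for _ in range(reps)])
    mean_k50 = np.mean([iw_bound(x, a, b, mu, sigma, 50, rng) for _ in range(reps)])
    return mean_k1, mean_k50, exact_log_marginal(x, a, b)
```

Calling `compare_bounds()` returns the averaged bound with 1 sample, the averaged bound with 50 samples, and the quadrature value of the marginal log-likelihood. With 1 sample the bound reduces to the ordinary variational lower bound; with 50 samples the average should sit closer to the quadrature value, at roughly fifty times the sampling cost per estimate.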
