Paper title
Monotonicity in practice of adaptive testing
Paper authors
Abstract
In our previous work we showed how Bayesian networks can be used for adaptive testing of student skills. Later, we took advantage of monotonicity restrictions in order to learn models that fit the data better. This article brings these two lines of work together, as it evaluates Bayesian network models used for computerized adaptive testing and learned with a recently proposed monotonicity gradient algorithm. This learning method is compared with another monotone method, the isotonic regression EM algorithm. The quality of the methods is empirically evaluated on a large data set from the Czech National Mathematics Exam. Besides the advantages of the adaptive testing approach, we also observed advantageous behavior of the monotone methods, especially for small learning data set sizes. Another novelty of this work is the use of the reliability interval of the score distribution, which is used to predict a student's final score and grade. In the experiments we clearly show that the test can be shortened while keeping its reliability. We also show that monotonicity increases the prediction quality when the training data set is limited. The monotone model learned by the gradient method has a lower question prediction quality than unrestricted models, but it is better at the main target of this application, which is student score prediction. An important observation is that mere optimization of the model likelihood or the prediction accuracy does not necessarily lead to the model that best describes the student.