Paper Title
Measuring the Credibility of Student Attendance Data in Higher Education for Data Mining
Paper Authors
Abstract
Educational Data Mining (EDM) is a developing discipline concerned with extending classical Data Mining (DM) methods and developing new methods for exploring the data that originate from educational systems. Student attendance in higher education has always been handled in a classical way: educators count occurrences of attendance or absence and build their knowledge about students, as well as modules, on this count. This method is not credible, nor does it necessarily provide a real indication of a student's performance. This study tries to formulate the extracted knowledge in a way that guarantees accurate and credible results. Student attendance data, gathered from the educational system, were first cleaned to remove randomness and noise; various attributes were then studied to highlight the most significant ones affecting the real attendance of students. The next step was to derive an equation that measures Student Attendance Credibility (SAC) from the attributes chosen in the previous step. The reliability of the newly developed measure was then evaluated to examine its consistency. Finally, the J48 DM classification technique was used to classify modules based on the strength of their SAC values. The results were promising: credibility values obtained with the newly derived formula gave accurate, credible, and real indicators of student attendance, as well as an accurate classification of modules based on the credibility of student attendance on those modules.