Paper Title

Unsupervised Learning Algorithms for Keyword Extraction in an Undergraduate Thesis

Authors

Torres-Cruz, Fred, Flores, Edelfre, Arcaya, William E., Chagua, Irenio L., Ingaluque, Marga I.

Abstract

The amount of data managed by many academic institutions has increased in recent years, particularly in the research work done by undergraduate students, where keyword selection relies on empirical techniques and existing technical methods that could assist students in this process are overlooked. Information and communication technologies, such as the Platform for Integrated Research and Academic Work with Responsibility (PILAR), which records information about research projects in their various modalities, such as titles, summaries, and keywords, have gained relevance and importance in managing this data. In this study we tested nine unsupervised machine learning algorithms on these research project records, making predictions with each of the nine models for each of the 7430 records in the dataset. The most efficient keyword extraction method for this dataset was TF-IDF, which obtained 72% accuracy and an average extraction time of 0.4786 (SD 0.0501) for each thesis file processed by this model.
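To illustrate the kind of TF-IDF keyword extraction the abstract refers to, here is a minimal sketch using only the Python standard library. It is not the authors' implementation: the tokenizer, the plain `log(N / df)` weighting, the function names, and the toy records are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphabetic tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(documents, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        # Assumed weighting: relative term frequency times log(N / df).
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by score (descending), then alphabetically for a stable order.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        results.append([term for term, _ in ranked[:top_k]])
    return results


# Hypothetical toy records standing in for thesis abstracts.
records = [
    "keyword extraction with unsupervised learning for thesis abstracts",
    "soil analysis of quinoa crops in the highlands",
    "unsupervised clustering of thesis records",
]
keywords = tfidf_keywords(records, top_k=3)
```

Terms that occur in several records, such as "thesis" here, receive a lower weight than terms unique to one record, which is why TF-IDF tends to surface document-specific vocabulary as keyword candidates.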
