Paper Title

Dataset vs Reality: Understanding Model Performance from the Perspective of Information Need

Paper Authors

Mengying Yu, Aixin Sun

Paper Abstract

Deep learning technologies have brought us many models that outperform human beings on a few benchmarks. An interesting question is: can these models solve real-world problems whose settings resemble the benchmark datasets (e.g., identical input/output)? We argue that a model is trained to answer the same information need for which the training dataset is created. Although some datasets may share high structural similarities, e.g., question-answer pairs for the question answering (QA) task and image-caption pairs for the image captioning (IC) task, they may represent different research tasks that aim to answer different information needs. To support our argument, we use the QA task and IC task as two case studies and compare their widely used benchmark datasets. From the perspective of information need in the context of information retrieval, we show the differences in the dataset creation processes and the differences in morphosyntactic properties between datasets. The differences in these datasets can be attributed to the different information needs of the specific research tasks. We encourage all researchers to consider the information need perspective of a research task before utilizing a dataset to train a model. Likewise, while creating a dataset, researchers may also incorporate the information need perspective as a factor to determine the degree to which the dataset accurately reflects the research task they intend to tackle.
