Paper Title

Investigating the detection of Tortured Phrases in Scientific Literature

Authors

Lay, Puthineath, Lentschat, Martin, Labbé, Cyril

Abstract

With the help of online tools, unscrupulous authors can now generate a pseudo-scientific article and attempt to publish it. Some of these tools work by replacing or paraphrasing existing text to produce new content, but they tend to generate nonsensical expressions. A recent study introduced the concept of a 'tortured phrase': an unexpected, odd phrase that appears in place of an established expression, e.g. 'counterfeit consciousness' instead of 'artificial intelligence'. The present study investigates how tortured phrases that are not yet listed can be detected automatically. We conducted several experiments, including non-neural binary classification, neural binary classification, and cosine similarity comparison of the phrase tokens, yielding noticeable results.
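
To make the last of these experiments more concrete, the following is a minimal sketch of a token-by-token cosine similarity comparison between a candidate phrase and an expected expression. It is not the authors' implementation: the function names, the assumption that both phrases have the same number of tokens, and the 3-dimensional toy vectors are all hypothetical and chosen only for illustration. In practice the vectors would come from some word embedding model, which the abstract does not specify.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Toy vectors invented for this sketch only; they are NOT real embeddings.
TOY_VECTORS = {
    "artificial": [0.9, 0.1, 0.0],
    "counterfeit": [0.7, 0.6, 0.1],
    "intelligence": [0.1, 0.9, 0.2],
    "consciousness": [0.2, 0.8, 0.4],
}


def tokenwise_similarity(candidate, expected, vectors=TOY_VECTORS):
    """Compare two phrases position by position (assumes equal token counts)."""
    cand_tokens = candidate.lower().split()
    exp_tokens = expected.lower().split()
    if len(cand_tokens) != len(exp_tokens):
        raise ValueError("this sketch only handles phrases of equal length")
    return [
        cosine_similarity(vectors[c], vectors[e])
        for c, e in zip(cand_tokens, exp_tokens)
    ]


# One similarity score per token position.
scores = tokenwise_similarity("counterfeit consciousness", "artificial intelligence")
for position, score in enumerate(scores):
    print(position, round(score, 3))
```

A phrase whose tokens are each close to the tokens of a known expression, without being identical to it, would be one possible signal of a tortured phrase. How the scores are thresholded or combined is left open here, since the abstract does not describe it.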
