Paper Title

I still have Time(s): Extending HeidelTime for German Texts

Authors

Andy Lücking, Manuel Stoeckel, Giuseppe Abrami, Alexander Mehler

Abstract

HeidelTime is one of the most widespread and successful tools for detecting temporal expressions in texts. Since HeidelTime's pattern matching system is based on regular expressions, it can be extended in a convenient way. We present such an extension for the German resources of HeidelTime: HeidelTime-EXT. The extension was motivated by false negatives observed in real-world texts and various time banks. The gain in coverage is 2.7% or 8.5%, depending on the admitted degree of potential overgeneralization. We describe the development of HeidelTime-EXT and its evaluation on text samples from various genres, and we share some linguistic observations. HeidelTime-EXT can be obtained from https://github.com/texttechnologylab/heideltime.
