Paper Title

QueryForm: A Simple Zero-shot Form Entity Query Framework

Authors

Zifeng Wang, Zizhao Zhang, Jacob Devlin, Chen-Yu Lee, Guolong Su, Hao Zhang, Jennifer Dy, Vincent Perot, Tomas Pfister

Abstract

Zero-shot transfer learning for document understanding is a crucial yet under-investigated scenario to help reduce the high cost involved in annotating document entities. We present a novel query-based framework, QueryForm, that extracts entity values from form-like documents in a zero-shot fashion. QueryForm contains a dual prompting mechanism that composes both the document schema and a specific entity type into a query, which is used to prompt a Transformer model to perform a single entity extraction task. Furthermore, we propose to leverage large-scale query-entity pairs generated from form-like webpages with weak HTML annotations to pre-train QueryForm. By unifying pre-training and fine-tuning into the same query-based framework, QueryForm enables models to learn from structured documents containing various entities and layouts, leading to better generalization to target document types without the need for target-specific training data. QueryForm sets a new state-of-the-art average F1 score on both the XFUND (+4.6% to 10.1%) and the Payment (+3.2% to 9.5%) zero-shot benchmarks, with a smaller model size and no additional image input.
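
The dual prompting idea described in the abstract (a document schema prompt combined with an entity type prompt to form one query per entity) can be illustrated with a minimal Python sketch. The abstract does not say how the two prompts are joined or how the query is attached to the document, so the `FormQuery` class, the `" | "` joiner, the `[SEP]` token, and both helper functions below are hypothetical choices for illustration only, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class FormQuery:
    """One query = schema prompt + entity prompt (hypothetical structure)."""
    schema_prompt: str   # describes the document type, e.g. "invoice"
    entity_prompt: str   # describes the single entity to extract

    def as_text(self, joiner: str = " | ") -> str:
        # Assumption: the two prompts are simply concatenated with a joiner.
        return f"{self.schema_prompt}{joiner}{self.entity_prompt}"


def build_queries(schema_prompt: str, entity_types: Sequence[str]) -> List[FormQuery]:
    """Create one query per entity type, so each model call targets one entity."""
    return [FormQuery(schema_prompt, entity) for entity in entity_types]


def build_model_input(query: FormQuery, document_tokens: Sequence[str],
                      sep_token: str = "[SEP]") -> List[str]:
    """Prepend the query text to the document tokens (assumed input layout)."""
    return [query.as_text(), sep_token, *document_tokens]


queries = build_queries("invoice", ["total amount", "due date"])
example_input = build_model_input(queries[1], ["Due", "2024-01-01"])
```

Because each query names exactly one entity type, extracting all entities from a document in this sketch means running the model once per query; a new document type would only need a new schema prompt and entity list rather than new labeled training data.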
