QueryForm: A Simple Zero-shot Form Entity Query Framework

Paper Title

QueryForm: A Simple Zero-shot Form Entity Query Framework

Authors

Zifeng Wang, Zizhao Zhang, Jacob Devlin, Chen-Yu Lee, Guolong Su, Hao Zhang, Jennifer Dy, Vincent Perot, Tomas Pfister

Abstract

Zero-shot transfer learning for document understanding is a crucial yet under-investigated scenario to help reduce the high cost involved in annotating document entities. We present a novel query-based framework, QueryForm, that extracts entity values from form-like documents in a zero-shot fashion. QueryForm contains a dual prompting mechanism that composes both the document schema and a specific entity type into a query, which is used to prompt a Transformer model to perform a single entity extraction task. Furthermore, we propose to leverage large-scale query-entity pairs generated from form-like webpages with weak HTML annotations to pre-train QueryForm. By unifying pre-training and fine-tuning into the same query-based framework, QueryForm enables models to learn from structured documents containing various entities and layouts, leading to better generalization to target document types without the need for target-specific training data. QueryForm sets new state-of-the-art average F1 score on both the XFUND (+4.6%~10.1%) and the Payment (+3.2%~9.5%) zero-shot benchmark, with a smaller model size and no additional image input.
