Paper Title
XFBoost: Improving Text Generation with Controllable Decoders
Paper Authors
Paper Abstract
Multimodal conditionality in transformer-based natural language models has demonstrated state-of-the-art performance in the task of product description generation. Recent approaches condition a language model on one or more images and other textual metadata to achieve near-human performance in describing products from e-commerce stores. However, generated descriptions may exhibit degrees of inaccuracy or even contradictory claims relative to the inputs of a given product. In this paper, we propose a controllable language generation framework called Extract-Finetune-Boost (XFBoost), which addresses the problem of inaccurate, low-quality inference. By using visual semantic attributes as constraints at the decoding stage of the generation process and finetuning the language model with policy gradient techniques, the XFBoost framework is found to produce significantly more descriptive text with higher image relevancy, outperforming baselines and lowering the frequency of factually inaccurate descriptions. We further demonstrate the application of XFBoost to online learning, wherein human-in-the-loop critics improve language models with active feedback.