Paper Title

A White Box Analysis of ColBERT

Authors

Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant

Abstract

Transformer-based models are nowadays state-of-the-art in ad-hoc Information Retrieval, but their behavior is far from being understood. Recent work has claimed that BERT does not satisfy the classical IR axioms. However, we propose to dissect the matching process of ColBERT, through the analysis of term importance and exact/soft matching patterns. Even if the traditional axioms are not formally verified, our analysis reveals that ColBERT: (i) is able to capture a notion of term importance; (ii) relies on exact matches for important terms.
