Predicting Clinical Trial Results by Implicit Evidence Integration

Paper Title

Predicting Clinical Trial Results by Implicit Evidence Integration

Authors

Qiao Jin, Chuanqi Tan, Mosha Chen, Xiaozhong Liu, Songfang Huang

Abstract

Clinical trials provide essential guidance for practicing Evidence-Based Medicine, though they are often accompanied by unendurable costs and risks. To optimize the design of clinical trials, we introduce a novel Clinical Trial Result Prediction (CTRP) task. In the CTRP framework, a model takes a PICO-formatted clinical trial proposal with its background as input and predicts the result, i.e., how the Intervention group compares with the Comparison group in terms of the measured Outcome in the studied Population. While structured clinical evidence is prohibitively expensive to collect manually, we exploit large-scale unstructured sentences from medical literature that implicitly contain PICOs and results as evidence. Specifically, we pre-train a model to predict the disentangled results from such implicit evidence and fine-tune the model with limited data on the downstream datasets. Experiments on the benchmark Evidence Integration dataset show that the proposed model outperforms the baselines by large margins, e.g., with a 10.7% relative gain over BioBERT in macro-F1. Moreover, the performance improvement is also validated on another dataset composed of clinical trials related to COVID-19.

Paper Title