Paper Title

Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets

Authors

Vishnu Suresh Lokhande, Rudrasis Chakraborty, Sathya N. Ravi, Vikas Singh

Abstract

Pooling multiple neuroimaging datasets across institutions often enables improvements in statistical power when evaluating associations (e.g., between risk factors and disease outcomes) that may otherwise be too weak to detect. When there is only a single source of variability (e.g., different scanners), domain adaptation and matching the distributions of representations may suffice in many scenarios. But when more than one nuisance variable concurrently influences the measurements, pooling datasets poses unique challenges; e.g., variations in the data can come from both the acquisition method and the demographics of participants (gender, age). Invariant representation learning, by itself, is ill-suited to fully model the data generation process. In this paper, we show how bringing recent results on equivariant representation learning (for studying symmetries in neural networks), instantiated on structured spaces, together with a simple use of classical results on causal inference provides an effective practical solution. In particular, we demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
