Paper title
Searching for subgroup-specific associations while controlling the false discovery rate
Paper authors
Paper abstract
This paper introduces a method for conditional independence testing in high-dimensional data that enables the automated discovery of significant associations within distinct subgroups of a population while controlling the false discovery rate. This is achieved by extending the model-X knockoff filter to provide more informative inferences. These enhanced inferences can help explain sample heterogeneity and uncover interactions, making better use of the capabilities offered by modern machine learning models. Specifically, our method can leverage any model to identify data-driven hypotheses pertaining to interesting population subgroups. It then rigorously tests these hypotheses without succumbing to selection bias. Importantly, our approach is efficient and does not require sample splitting. We demonstrate the effectiveness of our method through simulations and numerical experiments, using data derived from a randomized experiment featuring multiple treatment variables.
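For readers unfamiliar with the knockoff filter that the abstract builds on, the following is a minimal sketch of the standard knockoff+ selection step only, not of the subgroup-specific extension the paper proposes. It assumes that a vector of knockoff statistics `W` has already been computed (large positive values suggesting a real association, with the signs of null statistics being symmetric) and that `q` is the target false discovery rate. The function names are hypothetical and chosen for illustration.

```python
import numpy as np


def knockoff_plus_threshold(W, q):
    """Return the knockoff+ threshold for statistics W at nominal FDR level q."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of the statistics.
    candidates = np.sort(np.unique(np.abs(W[W != 0])))
    for t in candidates:
        # Conservative estimate of the false discovery proportion at threshold t:
        # negative statistics below -t stand in for false positives above t.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    # No threshold reaches the target level, so nothing is selected.
    return np.inf


def knockoff_select(W, q):
    """Return the indices of the variables selected at nominal FDR level q."""
    W = np.asarray(W, dtype=float)
    t = knockoff_plus_threshold(W, q)
    return np.flatnonzero(W >= t)


if __name__ == "__main__":
    # Illustrative statistics only; these are not results from the paper.
    W_example = [5, 4, 3, 2.5, -1, 2, 0.5, -0.2, 6, 7]
    print(knockoff_select(W_example, q=0.2))
```

The `+1` in the numerator is what distinguishes knockoff+ from the plain knockoff threshold. It makes the estimate slightly conservative, so a few strong statistics alone are not enough to yield any selection at small `q`.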