Paper Title

Probabilistic Model Incorporating Auxiliary Covariates to Control FDR

Authors

Qiu, Lin; Murrugarra-Llerena, Nils; Silva, Vítor; Lin, Lin; Chinchilli, Vernon M.

Abstract

Controlling the false discovery rate (FDR) while leveraging side information in multiple hypothesis testing is an emerging research topic in modern data science. Existing methods rely on test-level covariates while ignoring metrics about those covariates. This strategy may not be optimal for complex large-scale problems, where indirect relations often exist among test-level covariates and auxiliary metrics or covariates. We incorporate auxiliary covariates among test-level covariates in a deep black-box framework, named NeurT-FDR, which boosts statistical power while controlling FDR in multiple hypothesis testing. Our method parametrizes the test-level covariates as a neural network and adjusts the auxiliary covariates through a regression framework, which enables flexible handling of high-dimensional features as well as efficient end-to-end optimization. We show that NeurT-FDR makes substantially more discoveries on three real datasets than competitive baselines.
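
For readers unfamiliar with what "controlling FDR" means in practice, the following is a minimal sketch of the classical Benjamini-Hochberg procedure, which uses p-values alone and no covariates. It is not the NeurT-FDR method and is not taken from the paper; the function name `benjamini_hochberg`, the `alpha` default, and the sample p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.1):
    """Return a list of booleans marking which hypotheses are rejected.

    Illustrative covariate-free baseline (Benjamini-Hochberg step-up rule),
    NOT the NeurT-FDR method described in the abstract.
    """
    m = len(p_values)
    # Indices of the hypotheses, sorted by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= alpha * k / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


# Hypothetical p-values, chosen only for illustration.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
mask = benjamini_hochberg(p_values, alpha=0.05)
print(sum(mask), "discoveries")
```

Covariate-aware methods such as the one described in the abstract aim to make more discoveries than a rule of this kind at the same target FDR level by using side information about each test.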
