Addressing Detection Limits with Semiparametric Cumulative Probability Models

Paper Title

Addressing Detection Limits with Semiparametric Cumulative Probability Models

Authors

Yuqi Tian, Chun Li, Shengxin Tu, Nathan T. James, Frank E. Harrell, Bryan E. Shepherd

Abstract

Detection limits (DLs), where a variable is unable to be measured outside of a certain range, are common in research. Most approaches to handle DLs in the response variable implicitly make parametric assumptions on the distribution of data outside DLs. We propose a new approach to deal with DLs based on a widely used ordinal regression model, the cumulative probability model (CPM). The CPM is a type of semiparametric linear transformation model. CPMs are rank-based and can handle mixed distributions of continuous and discrete outcome variables. These features are key for analyzing data with DLs because while observations inside DLs are typically continuous, those outside DLs are censored and generally put into discrete categories. With a single lower DL, the CPM assigns values below the DL as having the lowest rank. When there are multiple DLs, the CPM likelihood can be modified to appropriately distribute probability mass. We demonstrate the use of CPMs with simulations and two HIV data examples. The first example models a biomarker in which 15% of observations are below a DL. The second uses multi-cohort data to model viral load, where approximately 55% of observations are outside DLs which vary across sites and over time.
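
The single-lower-DL case described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `ordinal_codes_with_lower_dl` and the sample values are hypothetical, and the sketch assumes censored measurements are stored as arbitrary placeholder numbers below the DL. It only shows how outcomes could be mapped to ordered categories so that everything below the DL shares the lowest rank, which is the input a rank-based model such as a CPM works from.

```python
def ordinal_codes_with_lower_dl(values, lower_dl):
    """Map measurements to ordered category codes (hypothetical helper).

    Every value below lower_dl is treated as censored and placed in a
    single lowest category (code 0). Values at or above the DL keep
    their relative order; ties share a code.
    """
    # Distinct observed (uncensored) values, in increasing order.
    observed = sorted({v for v in values if v >= lower_dl})
    # Reserve code 0 for the censored category only if it is present.
    offset = 1 if any(v < lower_dl for v in values) else 0
    code_of = {v: i + offset for i, v in enumerate(observed)}
    return [0 if v < lower_dl else code_of[v] for v in values]


# Illustrative data: 0.3 and 0.1 stand in for "below the DL of 0.5".
values = [0.3, 1.2, 0.1, 2.5, 1.2, 4.0]
codes = ordinal_codes_with_lower_dl(values, lower_dl=0.5)
print(codes)  # [0, 1, 0, 2, 1, 3]

# Because only ranks matter, the placeholder chosen for censored
# values does not change the result.
same = ordinal_codes_with_lower_dl([0.0, 1.2, 0.49, 2.5, 1.2, 4.0], 0.5)
print(same == codes)  # True
```

The second call shows why no parametric assumption about the distribution below the DL is needed in this setting: any placeholder under the DL yields the same ordered categories.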
