TY - JOUR
T1 - Variable selection for random effects two-part models
AU - Han, Dongxiao
AU - Liu, Lei
AU - Su, Xiaogang
AU - Johnson, Bankole
AU - Sun, Liuquan
N1 - Publisher Copyright: © The Author(s) 2018.
PY - 2019/9/1
Y1 - 2019/9/1
AB - Random effects two-part models have been applied to longitudinal studies for zero-inflated (or semi-continuous) data, characterized by a large portion of zero values and continuous non-zero (positive) values. Examples include monthly medical costs, daily alcohol drinks, relative abundance of microbiome, etc. With the advance of information technology for data collection and storage, the number of variables available to researchers can be rather large in such studies. To avoid the curse of dimensionality and facilitate decision making, it is critically important to select covariates that are truly related to the outcome. However, owing to the intricate nature of these models, there is not yet a satisfactory variable selection method available for them. In this paper, we seek a feasible way of conducting variable selection for random effects two-part models on the basis of the recently proposed “minimum information criterion” (MIC) method. We demonstrate that the MIC approach leads to a reasonable formulation of sparse estimation, which can be conveniently solved with SAS Proc NLMIXED. The performance of our approach is evaluated through simulation, and an application to a longitudinal alcohol dependence study is provided.
KW - High dimensional
KW - mixed effects
KW - pharmacogenetics
KW - precision medicine
KW - tuning parameter
KW - variable selection
UR - http://www.scopus.com/inward/record.url?scp=85049877908&partnerID=8YFLogxK
U2 - 10.1177/0962280218784712
DO - 10.1177/0962280218784712
M3 - Article
C2 - 30001684
AN - SCOPUS:85049877908
SN - 0962-2802
VL - 28
SP - 2697
EP - 2709
JO - Statistical Methods in Medical Research
JF - Statistical Methods in Medical Research
IS - 9
ER -