TY - JOUR
T1 - Scalable Gaussian-process regression and variable selection using Vecchia approximations
AU - Cao, Jian
AU - Guinness, Joseph
AU - Genton, Marc G.
AU - Katzfuss, Matthias
N1 - Publisher Copyright:
©2022 Jian Cao, Joseph Guinness, Marc Genton, Matthias Katzfuss.
PY - 2022/9/1
Y1 - 2022/9/1
N2 - Gaussian process (GP) regression is a flexible, nonparametric approach to regression that naturally quantifies uncertainty. In many applications, the number of responses and covariates are both large, and a goal is to select covariates that are related to the response. For this setting, we propose a novel, scalable algorithm, coined VGPR, which optimizes a penalized GP log-likelihood based on the Vecchia GP approximation, an ordered conditional approximation from spatial statistics that implies a sparse Cholesky factor of the precision matrix. We traverse the regularization path from strong to weak penalization, sequentially adding candidate covariates based on the gradient of the log-likelihood and deselecting irrelevant covariates via a new quadratic constrained coordinate descent algorithm. We propose Vecchia-based mini-batch subsampling, which provides unbiased gradient estimators. The resulting procedure is scalable to millions of responses and thousands of covariates. Theoretical analysis and numerical studies demonstrate the improved scalability and accuracy relative to existing methods.
AB - Gaussian process (GP) regression is a flexible, nonparametric approach to regression that naturally quantifies uncertainty. In many applications, the number of responses and covariates are both large, and a goal is to select covariates that are related to the response. For this setting, we propose a novel, scalable algorithm, coined VGPR, which optimizes a penalized GP log-likelihood based on the Vecchia GP approximation, an ordered conditional approximation from spatial statistics that implies a sparse Cholesky factor of the precision matrix. We traverse the regularization path from strong to weak penalization, sequentially adding candidate covariates based on the gradient of the log-likelihood and deselecting irrelevant covariates via a new quadratic constrained coordinate descent algorithm. We propose Vecchia-based mini-batch subsampling, which provides unbiased gradient estimators. The resulting procedure is scalable to millions of responses and thousands of covariates. Theoretical analysis and numerical studies demonstrate the improved scalability and accuracy relative to existing methods.
KW - adaptive bridge penalty
KW - gradient-based variable selection
KW - mini-batch subsampling
KW - ordered conditional approximation
KW - penalized Gaussian regression
UR - https://www.scopus.com/pages/publications/85148065344
M3 - Article
AN - SCOPUS:85148065344
SN - 1532-4435
VL - 23
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
M1 - 348
ER -