Abstract
Dense kernel matrices resulting from pairwise evaluations of a kernel function arise naturally in machine learning and statistics. Previous work on constructing sparse approximate inverse Cholesky factors of such matrices by minimizing Kullback-Leibler divergence recovers the Vecchia approximation for Gaussian processes. These methods rely only on the geometry of the evaluation points to construct the sparsity pattern. In this work, we instead construct the sparsity pattern with a greedy selection algorithm that maximizes mutual information with target points, conditional on all points previously selected. For selecting $k$ points out of $N$, the naive time complexity is $\mathcal{O}(Nk^4)$, but by maintaining a partial Cholesky factor we reduce this to $\mathcal{O}(Nk^2)$. Furthermore, for $m$ targets we achieve a time complexity of $\mathcal{O}(Nk^2 + Nm^2 + m^3)$, which is maintained in the setting of aggregated Cholesky factorization, where a selected point need not condition every target. We apply the selection algorithm to image classification and to the recovery of sparse Cholesky factors. By minimizing Kullback-Leibler divergence, we apply the algorithm to Cholesky factorization, Gaussian process regression, and preconditioning the conjugate gradient method, improving over k-nearest-neighbors selection.
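As a concrete illustration of the single-target case, the following is a minimal NumPy sketch of greedy conditional selection. The function name, the mutual-information scoring identity, and the rank-one downdates driven by a maintained partial Cholesky column are our assumptions reconstructing the $\mathcal{O}(Nk^2)$ scheme the abstract describes; this is not the authors' reference implementation.

```python
import numpy as np

def greedy_conditional_selection(K, target, k):
    """Greedily select k points maximizing mutual information with the
    target point, conditional on the points already selected.

    For jointly Gaussian variables, I(x_i; x_t | S) = -1/2 log(1 - rho^2)
    with rho^2 = Cov[x_i, x_t | S]^2 / (Var[x_i | S] Var[x_t | S]).
    Var[x_t | S] is the same for every candidate i, so maximizing
    Cov[x_i, x_t | S]^2 / Var[x_i | S] selects the same point.

    A partial Cholesky factor L of K (restricted to the selected pivots)
    is maintained so that each step costs O(N * step) instead of
    refactoring from scratch, giving O(N k^2) total.
    """
    N = K.shape[0]
    L = np.zeros((N, k))                      # partial Cholesky columns
    cond_var = K.diagonal().astype(float).copy()   # Var[x_i | selected]
    cond_cov = K[:, target].astype(float).copy()   # Cov[x_i, x_t | selected]
    selected = []
    mask = np.ones(N, dtype=bool)
    mask[target] = False                      # never select the target itself
    for step in range(k):
        # score candidates; clamp the denominator to dodge the zeroed
        # conditional variances of already-selected points
        score = np.where(mask, cond_cov**2 / np.maximum(cond_var, 1e-12), -np.inf)
        j = int(np.argmax(score))
        selected.append(j)
        mask[j] = False
        vj = cond_var[j]
        # new Cholesky column: Cov[x_i, x_j | S] = K[i, j] - L[i, :step] @ L[j, :step]
        col = (K[:, j] - L[:, :step] @ L[j, :step]) / np.sqrt(vj)
        L[:, step] = col
        # rank-one downdates of the conditional quantities
        cond_cov -= col * (cond_cov[j] / np.sqrt(vj))
        cond_var = np.maximum(cond_var - col**2, 0.0)
    return selected

# toy usage: RBF kernel on random 2-D points, pick the 5 points most
# informative about point 0
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))
D2 = ((X[:, None, :] - X[None, :, :])**2).sum(-1)
K = np.exp(-D2 / 2)
print(greedy_conditional_selection(K, target=0, k=5))
```

The naive $\mathcal{O}(Nk^4)$ cost arises from refactoring the conditional covariances at every step; the sketch instead reuses the previously computed columns of $L$, so step $s$ costs $\mathcal{O}(Ns)$ and the whole selection $\mathcal{O}(Nk^2)$.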
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 1649-1679 |
| Number of pages | 31 |
| Journal | SIAM/ASA Journal on Uncertainty Quantification |
| Volume | 13 |
| Issue number | 3 |
| DOIs | |
| State | Published - 2025 |
Keywords
- Cholesky factorization
- Gaussian process
- Vecchia approximation
- factorized sparse approximate inverse
- optimal experimental design
- orthogonal matching pursuit