TY - JOUR
T1 - Adaptive Testing for High-Dimensional Data
AU - Zhang, Yangfan
AU - Wang, Runmin
AU - Shao, Xiaofeng
N1 - Publisher Copyright: © 2025 American Statistical Association.
PY - 2025
Y1 - 2025
N2 - In this article, we propose a class of L_q-norm based U-statistics for a family of global testing problems related to high-dimensional data. This includes testing of mean vector and its spatial sign, simultaneous testing of linear model coefficients, and testing of component-wise independence for high-dimensional observations, among others. Under the null hypothesis, we derive asymptotic normality and independence between L_q-norm based U-statistics for several qs under mild moment and cumulant conditions. A simple combination of two studentized L_q-based test statistics via their p-values is proposed and is shown to attain great power against alternatives of different sparsity. Our work is a substantial extension of He et al., which is mostly focused on mean and covariance testing, and we manage to provide a general treatment of asymptotic independence of L_q-norm based U-statistics for a wide class of kernels. To alleviate the computation burden, we introduce a variant of the proposed U-statistics by using the monotone indices in the summation, resulting in a U-statistic with asymmetric kernel. A dynamic programming method is introduced to reduce the computational cost from O(n^{qr}), which is required for the calculation of the full U-statistic, to O(n^r) where r is the order of the kernel. Numerical results further corroborate the advantage of the proposed adaptive test as compared to some existing competitors. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
AB - In this article, we propose a class of L_q-norm based U-statistics for a family of global testing problems related to high-dimensional data. This includes testing of mean vector and its spatial sign, simultaneous testing of linear model coefficients, and testing of component-wise independence for high-dimensional observations, among others. Under the null hypothesis, we derive asymptotic normality and independence between L_q-norm based U-statistics for several qs under mild moment and cumulant conditions. A simple combination of two studentized L_q-based test statistics via their p-values is proposed and is shown to attain great power against alternatives of different sparsity. Our work is a substantial extension of He et al., which is mostly focused on mean and covariance testing, and we manage to provide a general treatment of asymptotic independence of L_q-norm based U-statistics for a wide class of kernels. To alleviate the computation burden, we introduce a variant of the proposed U-statistics by using the monotone indices in the summation, resulting in a U-statistic with asymmetric kernel. A dynamic programming method is introduced to reduce the computational cost from O(n^{qr}), which is required for the calculation of the full U-statistic, to O(n^r) where r is the order of the kernel. Numerical results further corroborate the advantage of the proposed adaptive test as compared to some existing competitors. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
KW - Independence testing
KW - Simultaneous testing
KW - Spatial sign
KW - U-statistics
UR - https://www.scopus.com/pages/publications/86000015724
U2 - 10.1080/01621459.2024.2439617
DO - 10.1080/01621459.2024.2439617
M3 - Article
AN - SCOPUS:86000015724
SN - 0162-1459
VL - 120
SP - 1893
EP - 1905
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 551
ER -