TY - JOUR
T1 - Robust two-sample test of high-dimensional mean vectors under dependence
AU - Wang, Wei
AU - Lin, Nan
AU - Tang, Xiang
N1 - Publisher Copyright: © 2018 Elsevier Inc.
PY - 2019/1
Y1 - 2019/1
N2 - A basic problem in modern multivariate analysis is testing the equality of two mean vectors in settings where the dimension p increases with the sample size n. This paper proposes a robust two-sample test for high-dimensional data against sparse and strong alternatives, in which the mean vectors of the populations differ in only a few dimensions, but the magnitude of the differences is large. The test is based on trimmed means and robust precision matrix estimators. The asymptotic joint distribution of the trimmed means is established, and the proposed test statistic is shown to have a Gumbel distribution in the limit. Simulation studies suggest that the numerical performance of the proposed test is comparable to that of non-robust tests for uncontaminated data. For cell-wise contaminated data, it outperforms non-robust tests. An illustration involves biomarker identification in an Alzheimer's disease dataset.
AB - A basic problem in modern multivariate analysis is testing the equality of two mean vectors in settings where the dimension p increases with the sample size n. This paper proposes a robust two-sample test for high-dimensional data against sparse and strong alternatives, in which the mean vectors of the populations differ in only a few dimensions, but the magnitude of the differences is large. The test is based on trimmed means and robust precision matrix estimators. The asymptotic joint distribution of the trimmed means is established, and the proposed test statistic is shown to have a Gumbel distribution in the limit. Simulation studies suggest that the numerical performance of the proposed test is comparable to that of non-robust tests for uncontaminated data. For cell-wise contaminated data, it outperforms non-robust tests. An illustration involves biomarker identification in an Alzheimer's disease dataset.
KW - Cell-wise contamination
KW - Robust precision matrix estimation
KW - Sparse and strong alternatives
KW - Trimmed mean
KW - Two-sample mean test
UR - https://www.scopus.com/pages/publications/85055114635
U2 - 10.1016/j.jmva.2018.09.013
DO - 10.1016/j.jmva.2018.09.013
M3 - Article
AN - SCOPUS:85055114635
SN - 0047-259X
VL - 169
SP - 312
EP - 329
JO - Journal of Multivariate Analysis
JF - Journal of Multivariate Analysis
ER -