TY - JOUR
T1 - Identification of genetic association of multiple rare variants using collapsing methods
AU - Sun, Yan V.
AU - Sung, Yun Ju
AU - Tintle, Nathan
AU - Ziegler, Andreas
PY - 2011
Y1 - 2011
N2 - Next-generation sequencing technology allows investigation of both common and rare variants in humans. Exomes are sequenced on the population level or in families to further study the genetics of human diseases. Genetic Analysis Workshop 17 (GAW17) provided exomic data from the 1000 Genomes Project and simulated phenotypes. These data enabled evaluations of existing and newly developed statistical methods for rare variant sequence analysis for which standard statistical methods fail because of the rareness of the alleles. Various alternative approaches have been proposed that overcome the rareness problem by combining multiple rare variants within a gene. These approaches are termed collapsing methods, and our GAW17 group focused on studying the performance of existing and novel collapsing methods using rare variants. All tested methods performed similarly, as measured by type I error and power. Inflated type I error fractions were consistently observed and might be caused by gametic phase disequilibrium between causal and noncausal rare variants in this relatively small sample as well as by population stratification. Incorporating prior knowledge, such as appropriate covariates and information on functionality of SNPs, increased the power of detecting associated genes. Overall, collapsing rare variants can increase the power of identifying disease-associated genes. However, studying genetic associations of rare variants remains a challenging task that requires further development and improvement in data collection, management, analysis, and computation.
KW - 1000 Genomes Project
KW - Association
KW - Collapsing methods
KW - Next-generation sequencing
UR - http://www.scopus.com/inward/record.url?scp=82455219107&partnerID=8YFLogxK
U2 - 10.1002/gepi.20658
DO - 10.1002/gepi.20658
M3 - Article
C2 - 22128049
AN - SCOPUS:82455219107
SN - 0741-0395
VL - 35
SP - S101-S106
JO - Genetic Epidemiology
JF - Genetic Epidemiology
IS - SUPPL. 1
ER -