Genotypic discrepancies arising from imputation

Anthony L. Hinrichs, Robert C. Culverhouse, Brian K. Suarez

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

The ideal genetic analysis of family data would include whole genome sequence on all family members. A strategy of combining sequence data from a subset of key individuals with inexpensive genome-wide association study (GWAS) chip genotypes on all individuals, in order to infer sequence-level genotypes throughout the families, has been suggested as a highly accurate alternative. This strategy was followed by the Genetic Analysis Workshop 18 data providers. To identify potential consequences of this strategy, we examined the quality of the imputation by comparing GWAS genotype calls with imputed calls for the same variants and counting the discrepancies. Overall, the inference and imputation process worked very well. However, we found that discrepancies occurred at an increased rate when imputation was used to infer missing data in sequenced individuals. Although this may be an artifact of this particular instantiation of these analytic methods, there may be general genetic or algorithmic reasons to avoid trying to fill in missing sequence data. This is especially true given the risk of false positives and reduction in power for family-based transmission tests when founders are incorrectly imputed as heterozygotes. Finally, we note a higher rate of discrepancies when unsequenced individuals are inferred using sequenced individuals from other pedigrees drawn from the same admixed population.
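The comparison described in the abstract, chip calls versus imputed calls for the same variants, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the 0/1/2 alternate-allele coding, the use of None for missing calls, and the group labels are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional, Sequence, Tuple

Genotype = Optional[int]  # assumed coding: 0, 1, 2 alternate alleles; None = missing


def discordance_rate(chip_calls: Sequence[Genotype],
                     imputed_calls: Sequence[Genotype]) -> Optional[float]:
    """Fraction of comparable genotype pairs where chip and imputed calls differ."""
    compared = 0
    discordant = 0
    for chip, imputed in zip(chip_calls, imputed_calls):
        # Skip pairs where either call is missing; they cannot be compared.
        if chip is None or imputed is None:
            continue
        compared += 1
        if chip != imputed:
            discordant += 1
    # Return None rather than dividing by zero when nothing was comparable.
    return discordant / compared if compared else None


def discordance_by_group(
    records: Iterable[Tuple[str, Genotype, Genotype]]
) -> Dict[str, Optional[float]]:
    """Discordance rate per group, given (group, chip_call, imputed_call) records."""
    grouped: Dict[str, Tuple[List[Genotype], List[Genotype]]] = {}
    for group, chip, imputed in records:
        chips, imputes = grouped.setdefault(group, ([], []))
        chips.append(chip)
        imputes.append(imputed)
    return {g: discordance_rate(c, i) for g, (c, i) in grouped.items()}


# Hypothetical toy data, not taken from the Genetic Analysis Workshop 18 dataset.
toy_records = [
    ("sequenced", 0, 0),
    ("sequenced", 1, 2),
    ("unsequenced", 2, 2),
    ("unsequenced", 1, 1),
]
rates = discordance_by_group(toy_records)
```

Grouping by how each individual's genotype was obtained (for example, sequenced with missing data filled in versus unsequenced) is one way to check whether discrepancy rates differ between those categories, which is the kind of contrast the abstract reports.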

Original language: English
Article number: S17
Journal: BMC Proceedings
Volume: 8
State: Published - Jun 17 2014
