TY - JOUR
T1 - A deep population reference panel of tandem repeat variation
AU - Ziaei Jam, Helyaneh
AU - Li, Yang
AU - DeVito, Ross
AU - Mousavi, Nima
AU - Ma, Nichole
AU - Lujumba, Ibra
AU - Adam, Yagoub
AU - Maksimov, Mikhail
AU - Huang, Bonnie
AU - Dolzhenko, Egor
AU - Qiu, Yunjiang
AU - Kakembo, Fredrick Elishama
AU - Joseph, Habi
AU - Onyido, Blessing
AU - Adeyemi, Jumoke
AU - Bakhtiari, Mehrdad
AU - Park, Jonghun
AU - Javadzadeh, Sara
AU - Jjingo, Daudi
AU - Adebiyi, Ezekiel
AU - Bafna, Vineet
AU - Gymrek, Melissa
N1 - Publisher Copyright: © 2023, Springer Nature Limited.
PY - 2023/12
Y1 - 2023/12
AB - Tandem repeats (TRs) represent one of the largest sources of genetic variation in humans and are implicated in a range of phenotypes. Here we present a deep characterization of TR variation based on high coverage whole genome sequencing from 3550 diverse individuals from the 1000 Genomes Project and H3Africa cohorts. We develop a method, EnsembleTR, to integrate genotypes from four separate methods resulting in high-quality genotypes at more than 1.7 million TR loci. Our catalog reveals novel sequence features influencing TR heterozygosity, identifies population-specific trinucleotide expansions, and finds hundreds of novel eQTL signals. Finally, we generate a phased haplotype panel which can be used to impute most TRs from nearby single nucleotide polymorphisms (SNPs) with high accuracy. Overall, the TR genotypes and reference haplotype panel generated here will serve as valuable resources for future genome-wide and population-wide studies of TRs and their role in human phenotypes.
UR - http://www.scopus.com/inward/record.url?scp=85174696758&partnerID=8YFLogxK
U2 - 10.1038/s41467-023-42278-3
DO - 10.1038/s41467-023-42278-3
M3 - Article
C2 - 37872149
AN - SCOPUS:85174696758
SN - 2041-1723
VL - 14
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 6711
ER -