TY - JOUR
T1 - Retrotransposition of gene transcripts leads to structural variation in mammalian genomes
AU - Ewing, Adam D.
AU - Ballinger, Tracy J.
AU - Earl, Dent
AU - Harris, Christopher C.
AU - Ding, Li
AU - Wilson, Richard K.
AU - Haussler, David
N1 - Funding Information:
We thank members of the genome reconstruction and cancer groups at the UCSC Center for Biomolecular Science and Engineering, members of the TCGA publication committee, and others for reviewing our work. We also thank TCGA genome sequencing centers for data generation and sequence alignment. We acknowledge funding from the Howard Hughes Medical Institute to DH and the following National Institutes of Health grants to DH: 5P41HG002371-12 (NHGRI) and 5U24CA143858-03 (NCI).
PY - 2013/3/13
Y1 - 2013/3/13
N2 - Background: Retroposed processed gene transcripts are an important source of material for new gene formation on evolutionary timescales. Most prior work on gene retrocopy discovery compared copies in reference genome assemblies to their source genes. Here, we explore gene retrocopy insertion polymorphisms (GRIPs) that are present in the germlines of individual humans, mice, and chimpanzees, and we identify novel gene retrocopy insertions in cancerous somatic tissues that are absent from patient-matched non-cancer genomes. Results: Through analysis of whole-genome sequence data, we found evidence for 48 GRIPs in the genomes of one or more humans sequenced as part of the 1,000 Genomes Project and The Cancer Genome Atlas, but which were not in the human reference assembly. Similarly, we found evidence for 755 GRIPs at distinct locations in one or more of 17 inbred mouse strains but which were not in the mouse reference assembly, and 19 GRIPs across a cohort of 10 chimpanzee genomes, which were not in the chimpanzee reference genome assembly. Many of these insertions are new members of existing gene families whose source genes are highly and widely expressed, and the majority have detectable hallmarks of processed gene retrocopy formation. We estimate the rate of novel gene retrocopy insertions in humans and chimps at roughly one new gene retrocopy insertion for every 6,000 individuals. Conclusions: We find that gene retrocopy polymorphisms are a widespread phenomenon, present a multi-species analysis of these events, and provide a method for their ascertainment.
AB - Background: Retroposed processed gene transcripts are an important source of material for new gene formation on evolutionary timescales. Most prior work on gene retrocopy discovery compared copies in reference genome assemblies to their source genes. Here, we explore gene retrocopy insertion polymorphisms (GRIPs) that are present in the germlines of individual humans, mice, and chimpanzees, and we identify novel gene retrocopy insertions in cancerous somatic tissues that are absent from patient-matched non-cancer genomes. Results: Through analysis of whole-genome sequence data, we found evidence for 48 GRIPs in the genomes of one or more humans sequenced as part of the 1,000 Genomes Project and The Cancer Genome Atlas, but which were not in the human reference assembly. Similarly, we found evidence for 755 GRIPs at distinct locations in one or more of 17 inbred mouse strains but which were not in the mouse reference assembly, and 19 GRIPs across a cohort of 10 chimpanzee genomes, which were not in the chimpanzee reference genome assembly. Many of these insertions are new members of existing gene families whose source genes are highly and widely expressed, and the majority have detectable hallmarks of processed gene retrocopy formation. We estimate the rate of novel gene retrocopy insertions in humans and chimps at roughly one new gene retrocopy insertion for every 6,000 individuals. Conclusions: We find that gene retrocopy polymorphisms are a widespread phenomenon, present a multi-species analysis of these events, and provide a method for their ascertainment.
UR - http://www.scopus.com/inward/record.url?scp=84879028815&partnerID=8YFLogxK
U2 - 10.1186/gb-2013-14-3-r22
DO - 10.1186/gb-2013-14-3-r22
M3 - Article
C2 - 23497673
AN - SCOPUS:84879028815
SN - 1474-7596
VL - 14
JO - Genome Biology
JF - Genome Biology
IS - 3
M1 - R22
ER -