TY - JOUR
T1 - A human genome structural variation sequencing resource reveals insights into mutational mechanisms
AU - Kidd, Jeffrey M.
AU - Graves, Tina
AU - Newman, Tera L.
AU - Fulton, Robert
AU - Hayden, Hillary S.
AU - Malig, Maika
AU - Kallicki, Joelle
AU - Kaul, Rajinder
AU - Wilson, Richard K.
AU - Eichler, Evan E.
N1 - Funding Information:
We thank D. Smith and the staff at Agencourt Biosciences for library production, E. Kirkness and staff of the J. Craig Venter Institute for end-sequence data from the JCVI library, and L. Chen for computational assistance in the mapping of end-sequence data. We thank S. Girirajan, J. Moran, and C. Payen for thoughtful discussion; T. Brown for manuscript preparation assistance; and members of the University of Washington and Washington University Genome Centers for assistance with data generation. J.M.K. is supported by a National Science Foundation Graduate Research Fellowship. This work was supported by the National Institutes of Health Grant HG004120 to E.E.E., who is an investigator of the Howard Hughes Medical Institute. E.E.E. is on the scientific advisory board for Pacific Biosciences. T.L.N. is an employee and founder of iGenix Inc.
PY - 2010/11/24
Y1 - 2010/11/24
AB - Understanding the prevailing mutational mechanisms responsible for human genome structural variation requires uniformity in the discovery of allelic variants and precision in terms of breakpoint delineation. We develop a resource based on capillary end sequencing of 13.8 million fosmid clones from 17 human genomes and characterize the complete sequence of 1054 large structural variants corresponding to 589 deletions, 384 insertions, and 81 inversions. We analyze the 2081 breakpoint junctions and infer potential mechanism of origin. Three mechanisms account for the bulk of germline structural variation: microhomology-mediated processes involving short (2-20 bp) stretches of sequence (28%), nonallelic homologous recombination (22%), and L1 retrotransposition (19%). The high quality and long-range continuity of the sequence reveals more complex mutational mechanisms, including repeat-mediated inversions and gene conversion, that are most often missed by other methods, such as comparative genomic hybridization, single nucleotide polymorphism microarrays, and next-generation sequencing.
UR - http://www.scopus.com/inward/record.url?scp=79251493015&partnerID=8YFLogxK
U2 - 10.1016/j.cell.2010.10.027
DO - 10.1016/j.cell.2010.10.027
M3 - Article
C2 - 21111241
AN - SCOPUS:79251493015
SN - 0092-8674
VL - 143
SP - 837
EP - 847
JO - Cell
JF - Cell
IS - 5
ER -