TY - JOUR
T1 - The structure, function and evolution of a complete human chromosome 8
AU - Logsdon, Glennis A.
AU - Vollger, Mitchell R.
AU - Hsieh, Ping Hsun
AU - Mao, Yafei
AU - Liskovykh, Mikhail A.
AU - Koren, Sergey
AU - Nurk, Sergey
AU - Mercuri, Ludovica
AU - Dishuck, Philip C.
AU - Rhie, Arang
AU - de Lima, Leonardo G.
AU - Dvorkina, Tatiana
AU - Porubsky, David
AU - Harvey, William T.
AU - Mikheenko, Alla
AU - Bzikadze, Andrey V.
AU - Kremitzki, Milinn
AU - Graves-Lindsay, Tina A.
AU - Jain, Chirag
AU - Hoekzema, Kendra
AU - Murali, Shwetha C.
AU - Munson, Katherine M.
AU - Baker, Carl
AU - Sorensen, Melanie
AU - Lewis, Alexandra M.
AU - Surti, Urvashi
AU - Gerton, Jennifer L.
AU - Larionov, Vladimir
AU - Ventura, Mario
AU - Miga, Karen H.
AU - Phillippy, Adam M.
AU - Eichler, Evan E.
N1 - Funding Information:
Acknowledgements: We thank S. Goodwin for sequence data generation; M. Jain and D. Miller for re-base-calling sequence data; R. Tindell, H. Visse, A. Tornabene, and G. Ellis for technical assistance; Z. Zhao for computational assistance; F. F. Dastvan for instrumentation; D. Gordon for accessioning BACs; G. Bouffard for accessioning ONT FAST5 data; J. G. Underwood for discussions; and T. Brown for assistance in editing this manuscript. We acknowledge experimental support from the W. M. Keck Microscopy Center (UW) and the computational resources of the NIH HPC Biowulf cluster (https://hpc.nih.gov). This research was supported, in part, by funding from the National Institutes of Health (NIH), HG002385 and HG010169 (E.E.E.); National Institute of General Medical Sciences (NIGMS), F32 GM134558 (G.A.L.); Intramural Research Program of the National Human Genome Research Institute at NIH (S.K., A.M.P., A.R.); National Library of Medicine Big Data Training Grant for Genomics and Neuroscience 5T32LM012419-04 (M.R.V.); NIH/NHGRI Pathway to Independence Award K99 HG011041 (P.H.); NIH/NHGRI R21 1R21HG010548-01 and NIH/NHGRI U01 1U01HG010971 (K.H.M.); and the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research, USA (V.L.). E.E.E. is an investigator of the Howard Hughes Medical Institute.
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/5/6
Y1 - 2021/5/6
AB - The complete assembly of each human chromosome is essential for understanding human biology and evolution (refs. 1,2). Here we use complementary long-read sequencing technologies to complete the linear assembly of human chromosome 8. Our assembly resolves the sequence of five previously long-standing gaps, including a 2.08-Mb centromeric α-satellite array, a 644-kb copy number polymorphism in the β-defensin gene cluster that is important for disease risk, and an 863-kb variable number tandem repeat at chromosome 8q21.2 that can function as a neocentromere. We show that the centromeric α-satellite array is generally methylated except for a 73-kb hypomethylated region of diverse higher-order α-satellites enriched with CENP-A nucleosomes, consistent with the location of the kinetochore. In addition, we confirm the overall organization and methylation pattern of the centromere in a diploid human genome. Using a dual long-read sequencing approach, we complete high-quality draft assemblies of the orthologous centromere from chromosome 8 in chimpanzee, orangutan and macaque to reconstruct its evolutionary history. Comparative and phylogenetic analyses show that the higher-order α-satellite structure evolved in the great ape ancestor with a layered symmetry, in which more ancient higher-order repeats locate peripherally to monomeric α-satellites. We estimate that the mutation rate of centromeric satellite DNA is accelerated by more than 2.2-fold compared to the unique portions of the genome, and this acceleration extends into the flanking sequence.
UR - http://www.scopus.com/inward/record.url?scp=85103900292&partnerID=8YFLogxK
U2 - 10.1038/s41586-021-03420-7
DO - 10.1038/s41586-021-03420-7
M3 - Article
C2 - 33828295
AN - SCOPUS:85103900292
VL - 593
SP - 101
EP - 107
JO - Nature
JF - Nature
SN - 0028-0836
IS - 7857
ER -