TY - JOUR
T1 - Pan-conserved segment tags identify ultra-conserved sequences across assemblies in the human pangenome
AU - Human Pangenome Reference Consortium
AU - Lee, Ho Joon
AU - Greer, Stephanie U.
AU - Pavlichin, Dmitri S.
AU - Zhou, Bo
AU - Urban, Alexander E.
AU - Weissman, Tsachy
AU - Liao, Wen Wei
AU - Asri, Mobin
AU - Ebler, Jana
AU - Doerr, Daniel
AU - Haukness, Marina
AU - Hickey, Glenn
AU - Lu, Shuangjia
AU - Lucas, Julian K.
AU - Monlong, Jean
AU - Abel, Haley J.
AU - Buonaiuto, Silvia
AU - Chang, Xian H.
AU - Cheng, Haoyu
AU - Chu, Justin
AU - Colonna, Vincenza
AU - Eizenga, Jordan M.
AU - Feng, Xiaowen
AU - Fischer, Christian
AU - Fulton, Robert S.
AU - Garg, Shilpa
AU - Groza, Cristian
AU - Guarracino, Andrea
AU - Harvey, William T.
AU - Heumos, Simon
AU - Howe, Kerstin
AU - Jain, Miten
AU - Lu, Tsung Yu
AU - Markello, Charles
AU - Martin, Fergal J.
AU - Mitchell, Matthew W.
AU - Munson, Katherine M.
AU - Mwaniki, Moses Njagi
AU - Novak, Adam M.
AU - Olsen, Hugh E.
AU - Pesout, Trevor
AU - Porubsky, David
AU - Prins, Pjotr
AU - Sibbesen, Jonas A.
AU - Tomlinson, Chad
AU - Villani, Flavia
AU - Vollger, Mitchell R.
AU - Antonacci-Fulton, Lucinda L.
AU - Baid, Gunjan
AU - Baker, Carl A.
AU - Belyaeva, Anastasiya
AU - Billis, Konstantinos
AU - Carroll, Andrew
AU - Chang, Pi Chuan
AU - Cody, Sarah
AU - Cook, Daniel E.
AU - Cornejo, Omar E.
AU - Diekhans, Mark
AU - Ebert, Peter
AU - Fairley, Susan
AU - Fedrigo, Olivier
AU - Felsenfeld, Adam L.
AU - Formenti, Giulio
AU - Frankish, Adam
AU - Gao, Yan
AU - Giron, Carlos Garcia
AU - Green, Richard E.
AU - Haggerty, Leanne
AU - Hoekzema, Kendra
AU - Hourlier, Thibaut
AU - Ji, Hanlee P.
AU - Kolesnikov, Alexey
AU - Korbel, Jan O.
AU - Kordosky, Jennifer
AU - Lewis, Alexandra P.
AU - Magalhães, Hugo
AU - Marco-Sola, Santiago
AU - Marijon, Pierre
AU - McDaniel, Jennifer
AU - Mountcastle, Jacquelyn
AU - Nattestad, Maria
AU - Olson, Nathan D.
AU - Puiu, Daniela
AU - Regier, Allison A.
AU - Rhie, Arang
AU - Sacco, Samuel
AU - Sanders, Ashley D.
AU - Schneider, Valerie A.
AU - Schultz, Baergen I.
AU - Shafin, Kishwar
AU - Sirén, Jouni
AU - Smith, Michael W.
AU - Sofia, Heidi J.
AU - Abou Tayoun, Ahmad N.
AU - Thibaud-Nissen, Françoise
AU - Tricomi, Francesca Floriana
AU - Wagner, Justin
AU - Wood, Jonathan M.D.
AU - Zimin, Aleksey V.
AU - Popejoy, Alice B.
AU - Bourque, Guillaume
AU - Chaisson, Mark J.P.
AU - Flicek, Paul
AU - Phillippy, Adam M.
AU - Zook, Justin M.
AU - Eichler, Evan E.
AU - Haussler, David
AU - Jarvis, Erich D.
AU - Miga, Karen H.
AU - Wang, Ting
AU - Garrison, Erik
AU - Marschall, Tobias
AU - Hall, Ira
AU - Li, Heng
AU - Paten, Benedict
N1 - Publisher Copyright: © 2023 The Authors
PY - 2023/8/28
Y1 - 2023/8/28
AB - The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemblies, including the pangenome reference, GRCh38, and CHM13, a telomere-to-telomere reference assembly. Our k-mer indexing approach enabled us to identify a valuable collection of universally conserved sequences across all assemblies, referred to as “pan-conserved segment tags” (PSTs). By examining intervals between these segments, we discerned highly conserved genomic segments and those with structurally related polymorphisms. We found 60,764 polymorphic intervals with unique geo-ethnic features in the pangenome reference. In this study, we utilized ultra-conserved sequences (PSTs) to forge a link between human pangenome assemblies and reference genomes. This methodology enables the examination of any sequence of interest within the pangenome, using the reference genome as a comparative framework.
KW - CP: Genetics
KW - k-mer
KW - pan-conserved segment
KW - pangenome
KW - reference genome
KW - structural polymorphism
KW - structural variations
UR - http://www.scopus.com/inward/record.url?scp=85169761251&partnerID=8YFLogxK
U2 - 10.1016/j.crmeth.2023.100543
DO - 10.1016/j.crmeth.2023.100543
M3 - Article
C2 - 37671027
AN - SCOPUS:85169761251
SN - 2667-2375
VL - 3
JO - Cell Reports Methods
JF - Cell Reports Methods
IS - 8
M1 - 100543
ER -