TY - JOUR
T1 - Long-read genome sequencing for the molecular diagnosis of neurodevelopmental disorders
AU - Hiatt, Susan M.
AU - Lawlor, James M.J.
AU - Handley, Lori H.
AU - Ramaker, Ryne C.
AU - Rogers, Brianne B.
AU - Partridge, E. Christopher
AU - Boston, Lori Beth
AU - Williams, Melissa
AU - Plott, Christopher B.
AU - Jenkins, Jerry
AU - Gray, David E.
AU - Holt, James M.
AU - Bowling, Kevin M.
AU - Bebin, E. Martina
AU - Grimwood, Jane
AU - Schmutz, Jeremy
AU - Cooper, Gregory M.
N1 - Funding Information:
This work was supported by a grant from the National Human Genome Research Institute (UM1HG007301). Some reagents were provided by PacBio as part of an early-access testing program. We thank our colleagues at HudsonAlpha who provided advice and general support, including Amy Nesmith Cox, Greg Barsh, Kelly East, Whitley Kelley, David Bick, and Elaine Lyon, in addition to the HudsonAlpha Genomic Services Laboratory and Clinical Services Laboratory. We also thank the clinical team at North Alabama Children’s Specialists. Finally, we are grateful to the families who participated in this study.
Publisher Copyright:
© 2021 The Author(s)
PY - 2021/4/8
Y1 - 2021/4/8
N2 - Exome and genome sequencing have proven to be effective tools for the diagnosis of neurodevelopmental disorders (NDDs), but large fractions of NDDs cannot be attributed to currently detectable genetic variation. This is likely, at least in part, a result of the fact that many genetic variants are difficult or impossible to detect through typical short-read sequencing approaches. Here, we describe a genomic analysis using Pacific Biosciences circular consensus sequencing (CCS) reads, which are both long (>10 kb) and accurate (>99% bp accuracy). We used CCS on six proband-parent trios with NDDs that were unexplained despite extensive testing, including genome sequencing with short reads. We identified variants and created de novo assemblies in each trio, with global metrics indicating these datasets are more accurate and comprehensive than those provided by short-read data. In one proband, we identified a likely pathogenic (LP), de novo L1-mediated insertion in CDKL5 that results in duplication of exon 3, leading to a frameshift. In a second proband, we identified multiple large de novo structural variants, including insertion-translocations affecting DGKB and MLLT3, which we show disrupt MLLT3 transcript levels. We consider this extensive structural variation likely pathogenic. The breadth and quality of variant detection, coupled to finding variants of clinical and research interest in two of six probands with unexplained NDDs, support the hypothesis that long-read genome sequencing can substantially improve rare disease genetic discovery rates.
KW - clinical sequencing
KW - long read sequencing
KW - mobile element insertion
KW - neurodevelopmental disorder
KW - structural variation
UR - http://www.scopus.com/inward/record.url?scp=85101974887&partnerID=8YFLogxK
U2 - 10.1016/j.xhgg.2021.100023
DO - 10.1016/j.xhgg.2021.100023
M3 - Article
C2 - 33937879
AN - SCOPUS:85101974887
SN - 2666-2477
VL - 2
JO - Human Genetics and Genomics Advances
JF - Human Genetics and Genomics Advances
IS - 2
M1 - 100023
ER -