TY - JOUR
T1 - Discovery of clinically relevant fusions in pediatric cancer
AU - LaHaye, Stephanie
AU - Fitch, James R.
AU - Voytovich, Kyle J.
AU - Herman, Adam C.
AU - Kelly, Benjamin J.
AU - Lammi, Grant E.
AU - Arbesfeld, Jeremy A.
AU - Wijeratne, Saranga
AU - Franklin, Samuel J.
AU - Schieffer, Kathleen M.
AU - Bir, Natalie
AU - McGrath, Sean D.
AU - Miller, Anthony R.
AU - Wetzel, Amy
AU - Miller, Katherine E.
AU - Bedrosian, Tracy A.
AU - Leraas, Kristen
AU - Varga, Elizabeth A.
AU - Lee, Kristy
AU - Gupta, Ajay
AU - Setty, Bhuvana
AU - Boué, Daniel R.
AU - Leonard, Jeffrey R.
AU - Finlay, Jonathan L.
AU - Abdelbaki, Mohamed S.
AU - Osorio, Diana S.
AU - Koo, Selene C.
AU - Koboldt, Daniel C.
AU - Wagner, Alex H.
AU - Eisfeld, Ann Kathrin
AU - Mrózek, Krzysztof
AU - Magrini, Vincent
AU - Cottrell, Catherine E.
AU - Mardis, Elaine R.
AU - Wilson, Richard K.
AU - White, Peter
N1 - Funding Information:
We thank the patients and their families for participating in our translational research protocol.
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/12
Y1 - 2021/12
AB - Background: Pediatric cancers typically have a distinct genomic landscape compared to adult cancers and frequently carry somatic gene fusion events that alter gene expression and drive tumorigenesis. Sensitive and specific detection of gene fusions from next-generation RNA sequencing (RNA-Seq) data is computationally challenging and may be confounded by low tumor cellularity or underlying genomic complexity. Furthermore, numerous computational tools are available to identify fusions from supporting RNA-Seq reads, yet each algorithm varies in sensitivity and precision, and no clearly superior approach currently exists. To overcome these challenges, we have developed an ensemble fusion-calling approach to improve the accuracy of fusion identification. Results: Our Ensemble Fusion (EnFusion) approach utilizes seven fusion-calling algorithms: Arriba, CICERO, FusionMap, FusionCatcher, JAFFA, MapSplice, and STAR-Fusion, which are packaged as a fully automated pipeline using Docker and Amazon Web Services (AWS) serverless technology. This method uses paired-end RNA-Seq reads as input, and the output from each algorithm is examined to identify fusions detected by a consensus of at least three algorithms. These consensus fusion results are filtered by comparison to an internal database to remove likely artifactual fusions occurring at high frequencies in our internal cohort, while a “known fusion list” prevents failure to report known pathogenic events. We have employed the EnFusion pipeline on RNA-Seq data from 229 patients with pediatric cancer or blood disorders studied under an IRB-approved protocol. The samples consist of 138 central nervous system tumors, 73 solid tumors, and 18 hematologic malignancies or disorders.
The combination of an ensemble fusion-calling pipeline and a knowledge-based filtering strategy identified 67 clinically relevant fusions in our cohort (diagnostic yield of 29.3%), including RBPMS-MET, BCAN-NTRK1, and TRIM22-BRAF fusions. Following clinical confirmation and reporting in the patient’s medical record, both known and novel fusions provided medically meaningful information. Conclusions: The EnFusion pipeline offers a streamlined approach to discovering fusions in cancer, with higher sensitivity and accuracy than single-algorithm methods. Furthermore, this method accurately identifies driver fusions in pediatric cancer, providing clinical impact by contributing evidence to diagnosis and, when appropriate, indicating targeted therapies.
KW - Cancer
KW - Gene fusions
KW - Genomics
KW - Pediatric neoplasms
KW - RNA-Seq
KW - Transcriptomics
UR - http://www.scopus.com/inward/record.url?scp=85120748279&partnerID=8YFLogxK
U2 - 10.1186/s12864-021-08094-z
DO - 10.1186/s12864-021-08094-z
M3 - Article
C2 - 34863095
AN - SCOPUS:85120748279
SN - 1471-2164
VL - 22
JO - BMC Genomics
JF - BMC Genomics
IS - 1
M1 - 872
ER -
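
The consensus and filtering logic summarized in the abstract of this record (fusions reported when at least three of the seven callers agree, recurrent artifacts removed, known pathogenic fusions protected) can be sketched roughly as follows. This is a minimal illustration and not the EnFusion implementation: the function name, the input layout (caller name mapped to a set of "GENE5-GENE3" strings), the placeholder fusion names, and the rule that a known fusion is kept regardless of caller count or artifact status are all assumptions made for the example.

```python
from collections import defaultdict

MIN_CALLERS = 3  # consensus threshold stated in the abstract


def consensus_fusions(calls_by_caller, artifacts=frozenset(),
                      known_fusions=frozenset(), min_callers=MIN_CALLERS):
    """Return {fusion: sorted list of supporting callers} for reportable fusions.

    Hypothetical sketch: `calls_by_caller` maps a caller name to the set of
    fusion names it reported for one sample.
    """
    # Collect which callers support each fusion.
    support = defaultdict(set)
    for caller, fusions in calls_by_caller.items():
        for fusion in fusions:
            support[fusion].add(caller)

    reported = {}
    for fusion, callers in support.items():
        if fusion in known_fusions:
            # Assumption: the known-fusion list overrides both filters.
            reported[fusion] = sorted(callers)
        elif len(callers) >= min_callers and fusion not in artifacts:
            # Consensus reached and not a recurrent artifact.
            reported[fusion] = sorted(callers)
    return reported


# Toy input with placeholder fusion names (not data from the study).
calls = {
    "Arriba": {"A-B", "C-D", "E-F"},
    "STAR-Fusion": {"A-B", "C-D"},
    "JAFFA": {"A-B", "C-D", "G-H"},
    "FusionCatcher": {"C-D"},
}
result = consensus_fusions(calls, artifacts={"C-D"}, known_fusions={"G-H"})
print(result)
```

In this toy run, "A-B" passes on consensus, "C-D" is dropped as a recurrent artifact despite four supporting callers, "E-F" lacks consensus, and "G-H" is retained only because it is on the assumed known-fusion list.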