TY - JOUR
T1 - Integrated analysis of genomic and transcriptomic data for the discovery of splice-associated variants in cancer
AU - Cotto, Kelsy C.
AU - Feng, Yang Yang
AU - Ramu, Avinash
AU - Richters, Megan
AU - Freshour, Sharon L.
AU - Skidmore, Zachary L.
AU - Xia, Huiming
AU - McMichael, Joshua F.
AU - Kunisaki, Jason
AU - Campbell, Katie M.
AU - Chen, Timothy Hung Po
AU - Rozycki, Emily B.
AU - Adkins, Douglas
AU - Devarakonda, Siddhartha
AU - Sankararaman, Sumithra
AU - Lin, Yiing
AU - Chapman, William C.
AU - Maher, Christopher A.
AU - Arora, Vivek
AU - Dunn, Gavin P.
AU - Uppaluri, Ravindra
AU - Govindan, Ramaswamy
AU - Griffith, Obi L.
AU - Griffith, Malachi
N1 - Publisher Copyright: © 2023, The Author(s).
PY - 2023/12
Y1 - 2023/12
AB - Somatic mutations within non-coding regions and even exons may have unidentified regulatory consequences that are often overlooked in analysis workflows. Here we present RegTools (www.regtools.org), a computationally efficient, free, and open-source software package designed to integrate somatic variants from genomic data with splice junctions from bulk or single-cell transcriptomic data to identify variants that may cause aberrant splicing. We apply RegTools to over 9000 tumor samples with both tumor DNA and RNA sequence data. RegTools discovers 235,778 events where a splice-associated variant significantly increases the splicing of a particular junction, across 158,200 unique variants and 131,212 unique junctions. To characterize these somatic variants and their associated splice isoforms, we annotate them with the Variant Effect Predictor, SpliceAI, and Genotype-Tissue Expression junction counts and compare our results to other tools that integrate genomic and transcriptomic data. While many events are corroborated by the aforementioned tools, the flexibility of RegTools also allows us to identify splice-associated variants in known cancer drivers, such as TP53, CDKN2A, and B2M, and other genes.
UR - http://www.scopus.com/inward/record.url?scp=85150858571&partnerID=8YFLogxK
U2 - 10.1038/s41467-023-37266-6
DO - 10.1038/s41467-023-37266-6
M3 - Article
C2 - 36949070
AN - SCOPUS:85150858571
SN - 2041-1723
VL - 14
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 1589
ER -