TY - JOUR
T1 - Discovery of driver non-coding splice-site-creating mutations in cancer
AU - Cao, Song
AU - Zhou, Daniel Cui
AU - Oh, Clara
AU - Jayasinghe, Reyka G.
AU - Zhao, Yanyan
AU - Yoon, Christopher J.
AU - Wyczalkowski, Matthew A.
AU - Bailey, Matthew H.
AU - Tsou, Terrence
AU - Gao, Qingsong
AU - Malone, Andrew
AU - Reynolds, Sheila
AU - Shmulevich, Ilya
AU - Wendl, Michael C.
AU - Chen, Feng
AU - Ding, Li
N1 - Funding Information:
This work was supported by the National Cancer Institute grants R01CA178383, R01CA180006 and U24CA211006 to L.D. F.C. is supported by National Institute of Diabetes and Digestive and Kidney Diseases grant R01DK087960. Additional support came from the National Institute of General Medical Sciences Cell and Molecular Biology training grant GM 007067 (R.G.J.). The Cancer Genome Atlas (cancergenome.nih.gov) and The International Cancer Genome Consortium (ICGC) were the source of primary data. We acknowledge support of computational resources from McDonnell Genome Institute, the Oncology Division of the Washington University School of Medicine, and the Institute for Systems Biology-Cancer Genomics Cloud (ISB-CGC), a pilot project of the National Cancer Institute (under contract number HHSN261201400007C).
Publisher Copyright:
© 2020, The Author(s).
PY - 2020/12/1
Y1 - 2020/12/1
AB - Non-coding mutations can create splice sites; however, the extent to which such somatic non-coding mutations affect RNA splicing is largely unexplored. Here we use the MiSplice pipeline to analyze 783 cancer cases with WGS data and 9494 cases with WES data, discovering 562 non-coding mutations that lead to splicing alterations. Notably, most of these mutations create new exons. Introns associated with new exon creation are significantly larger than the genome-wide average intron size. We find that some mutation-induced splicing alterations are located in genes important in tumorigenesis (ATRX, BCOR, CDKN2B, MAP3K1, MAP3K4, MDM2, SMAD4, STK11, TP53, etc.), often leading to truncated proteins and affecting gene expression. The pattern emerging from these exon-creating mutations suggests that splice sites created by non-coding mutations interact with pre-existing potential splice sites that originally lacked a suitable splicing pair to induce new exon formation. Our study suggests the importance of investigating biological and clinical consequences of non-coding splice-inducing mutations that were previously neglected by conventional annotation pipelines. MiSplice will be useful for automatically annotating the splicing impact of coding and non-coding mutations in future large-scale analyses.
UR - http://www.scopus.com/inward/record.url?scp=85094978271&partnerID=8YFLogxK
U2 - 10.1038/s41467-020-19307-6
DO - 10.1038/s41467-020-19307-6
M3 - Article
C2 - 33149122
AN - SCOPUS:85094978271
SN - 2041-1723
VL - 11
JO - Nature Communications
JF - Nature Communications
IS - 1
M1 - 5573
ER -