TY - JOUR
T1 - Unsupervised detection of cancer driver mutations with parsimony-guided learning
AU - Kumar, Runjun D.
AU - Swamidass, S. Joshua
AU - Bose, Ron
N1 - Publisher Copyright:
© 2016 Nature America, Inc. All rights reserved.
PY - 2016/10/1
Y1 - 2016/10/1
N2 - Methods are needed to reliably prioritize biologically active driver mutations over inactive passengers in high-throughput sequencing cancer data sets. We present ParsSNP, an unsupervised functional impact predictor that is guided by parsimony. ParsSNP uses an expectation-maximization framework to find mutations that explain tumor incidence broadly, without using predefined training labels that can introduce biases. We compare ParsSNP to five existing tools (CanDrA, CHASM, FATHMM Cancer, TransFIC, and Condel) across five distinct benchmarks. ParsSNP outperformed the existing tools in 24 of 25 comparisons. To investigate the real-world benefit of these improvements, we applied ParsSNP to an independent data set of 30 patients with diffuse-type gastric cancer. ParsSNP identified many known and likely driver mutations that other methods did not detect, including truncation mutations in known tumor suppressors and the recurrent driver substitution RHOA p.Tyr42Cys. In conclusion, ParsSNP uses an innovative, parsimony-based approach to prioritize cancer driver mutations and provides dramatic improvements over existing methods.
AB - Methods are needed to reliably prioritize biologically active driver mutations over inactive passengers in high-throughput sequencing cancer data sets. We present ParsSNP, an unsupervised functional impact predictor that is guided by parsimony. ParsSNP uses an expectation-maximization framework to find mutations that explain tumor incidence broadly, without using predefined training labels that can introduce biases. We compare ParsSNP to five existing tools (CanDrA, CHASM, FATHMM Cancer, TransFIC, and Condel) across five distinct benchmarks. ParsSNP outperformed the existing tools in 24 of 25 comparisons. To investigate the real-world benefit of these improvements, we applied ParsSNP to an independent data set of 30 patients with diffuse-type gastric cancer. ParsSNP identified many known and likely driver mutations that other methods did not detect, including truncation mutations in known tumor suppressors and the recurrent driver substitution RHOA p.Tyr42Cys. In conclusion, ParsSNP uses an innovative, parsimony-based approach to prioritize cancer driver mutations and provides dramatic improvements over existing methods.
UR - http://www.scopus.com/inward/record.url?scp=84987668900&partnerID=8YFLogxK
U2 - 10.1038/ng.3658
DO - 10.1038/ng.3658
M3 - Article
C2 - 27618449
AN - SCOPUS:84987668900
SN - 1061-4036
VL - 48
SP - 1288
EP - 1295
JO - Nature Genetics
JF - Nature Genetics
IS - 10
ER -