Methods are needed to reliably prioritize biologically active driver mutations over inactive passengers in high-throughput sequencing cancer data sets. We present ParsSNP, an unsupervised functional impact predictor that is guided by parsimony. ParsSNP uses an expectation-maximization framework to find mutations that explain tumor incidence broadly, without using predefined training labels that can introduce biases. We compare ParsSNP to five existing tools (CanDrA, CHASM, FATHMM Cancer, TransFIC, and Condel) across five distinct benchmarks. ParsSNP outperformed the existing tools in 24 of 25 comparisons. To investigate the real-world benefit of these improvements, we applied ParsSNP to an independent data set of 30 patients with diffuse-type gastric cancer. ParsSNP identified many known and likely driver mutations that other methods did not detect, including truncation mutations in known tumor suppressors and the recurrent driver substitution RHOA p.Tyr42Cys. In conclusion, ParsSNP uses an innovative, parsimony-based approach to prioritize cancer driver mutations and provides dramatic improvements over existing methods.

Original languageEnglish
Pages (from-to)1288-1295
Number of pages8
JournalNature Genetics
Issue number10
StatePublished - Oct 1 2016


Dive into the research topics of 'Unsupervised detection of cancer driver mutations with parsimony-guided learning'. Together they form a unique fingerprint.

Cite this