TY - JOUR
T1 - Systematic discovery of complex insertions and deletions in human cancers
AU - Ye, Kai
AU - Wang, Jiayin
AU - Jayasinghe, Reyka
AU - Lameijer, Eric Wubbo
AU - McMichael, Joshua F.
AU - Ning, Jie
AU - McLellan, Michael D.
AU - Xie, Mingchao
AU - Cao, Song
AU - Yellapantula, Venkata
AU - Huang, Kuan Lin
AU - Scott, Adam
AU - Foltz, Steven
AU - Niu, Beifang
AU - Johnson, Kimberly J.
AU - Moed, Matthijs
AU - Slagboom, P. Eline
AU - Chen, Feng
AU - Wendl, Michael C.
AU - Ding, Li
N1 - Publisher Copyright: © 2016 Nature America, Inc.
PY - 2016/1/1
Y1 - 2016/1/1
AB - Complex insertions and deletions (indels) are formed by simultaneously deleting and inserting DNA fragments of different sizes at a common genomic location. Here we present a systematic analysis of somatic complex indels in the coding sequences of samples from over 8,000 cancer cases using Pindel-C. We discovered 285 complex indels in cancer-associated genes (such as PIK3R1, TP53, ARID1A, GATA3 and KMT2D) in approximately 3.5% of cases analyzed; nearly all instances of complex indels were overlooked (81.1%) or misannotated (17.6%) in previous reports of 2,199 samples. In-frame complex indels are enriched in PIK3R1 and EGFR, whereas frameshifts are prevalent in VHL, GATA3, TP53, ARID1A, PTEN and ATRX. Furthermore, complex indels display strong tissue specificity (such as VHL in kidney cancer samples and GATA3 in breast cancer samples). Finally, structural analyses support findings of previously missed, but potentially druggable, mutations in the EGFR, MET and KIT oncogenes. This study indicates the critical importance of improving complex indel discovery and interpretation in medical research.
UR - http://www.scopus.com/inward/record.url?scp=84954401021&partnerID=8YFLogxK
U2 - 10.1038/nm.4002
DO - 10.1038/nm.4002
M3 - Article
C2 - 26657142
AN - SCOPUS:84954401021
SN - 1078-8956
VL - 22
SP - 97
EP - 104
JO - Nature Medicine
JF - Nature Medicine
IS - 1
ER -