TY - JOUR
T1 - Combinatorial analysis for sequence and spatial motif discovery in short sequence fragments
AU - Jackups, Ronald
AU - Liang, Jie
N1 - Funding Information:
The authors thank Dr. Xiang Li, Hammad Naveed, and Sarah Cheng for helpful discussions. They also thank the anonymous referees whose suggestions helped to expand their models and improve the clarity of their presentation. This work was supported by grants from the US National Science Foundation (CAREER DBI0646035 and DMS0800257) and the National Institutes of Health (GM079804-01 and GM081682-01).
PY - 2010
Y1 - 2010
AB - Motifs are overrepresented sequence or spatial patterns appearing in proteins. They often play important roles in maintaining protein stability and in facilitating protein function. When motifs are located in short sequence fragments, as in transmembrane domains that are only 6-20 residues in length, and when there is only very limited data, it is difficult to identify motifs. In this study, we introduce combinatorial models based on permutation for assessing statistically significant sequence and spatial patterns in short sequences. We show that our method can uncover previously unknown sequence and spatial motifs in β-barrel membrane proteins and that our method outperforms existing methods in detecting statistically significant motifs in this data set. Last, we discuss implications of motif analysis for problems involving short sequences in other families of proteins.
KW - Combinatorial algorithms
KW - Motifs
KW - Permutations and combinations
KW - combinatorial models
KW - sequence analysis
KW - short sequence
UR - http://www.scopus.com/inward/record.url?scp=77955466614&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2008.101
DO - 10.1109/TCBB.2008.101
M3 - Article
C2 - 20671322
AN - SCOPUS:77955466614
VL - 7
SP - 524
EP - 536
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
SN - 1545-5963
IS - 3
M1 - 4633345
ER -