190 Scopus citations

Abstract

Pevzner and Sze [23] considered a precise version of the motif discovery problem and simultaneously issued an algorithmic challenge: find a motif M of length 15, where each planted instance differs from M in 4 positions. Whereas previous algorithms all failed to solve this (15,4)-motif problem, Pevzner and Sze introduced algorithms that succeeded. However, their algorithms failed to solve the considerably more difficult (14,4)-, (16,5)-, and (18,6)-motif problems. We introduce a novel motif discovery algorithm based on the use of random projections of the input's substrings. Experiments on simulated data demonstrate that this algorithm performs better than existing algorithms and, in particular, typically solves the difficult (14,4)-, (16,5)-, and (18,6)-motif problems quite efficiently. A probabilistic estimate shows that the small values of d for which the algorithm fails to recover the planted (l, d)-motif are in all likelihood inherently impossible to solve. We also present experimental results on realistic biological data by identifying ribosome binding sites in prokaryotes as well as a number of known transcriptional regulatory motifs in eukaryotes.
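
The abstract does not spell out how the random projections are used, so the following is only a minimal sketch of the general idea, under stated assumptions: each length-l substring is hashed by the letters at k randomly chosen positions, so that planted instances whose substitutions all fall outside those positions land in the same bucket. The function names, the parameter k, and the helper that plants a motif instance are hypothetical, and any refinement of promising buckets into a final motif is omitted.

```python
import random
from collections import defaultdict


def project_buckets(sequences, l, k, rng):
    """Hash every length-l substring by its letters at k random positions.

    Illustrative assumption: k and this bucketing scheme are not taken
    from the abstract, which only says random projections are used.
    """
    positions = sorted(rng.sample(range(l), k))
    buckets = defaultdict(list)
    for seq_index, seq in enumerate(sequences):
        for start in range(len(seq) - l + 1):
            lmer = seq[start:start + l]
            key = "".join(lmer[p] for p in positions)
            buckets[key].append((seq_index, start))
    return positions, buckets


def plant_motif(motif, d, length, rng, alphabet="ACGT"):
    """Return a random sequence holding one copy of motif with exactly d substitutions."""
    instance = list(motif)
    for p in rng.sample(range(len(motif)), d):
        # Replace with a different letter so the position really differs.
        instance[p] = rng.choice([c for c in alphabet if c != motif[p]])
    background = [rng.choice(alphabet) for _ in range(length)]
    start = rng.randrange(length - len(motif) + 1)
    background[start:start + len(motif)] = instance
    return "".join(background), start


if __name__ == "__main__":
    rng = random.Random(1)
    motif = "ACGTACGTACGTACG"  # hypothetical (15,4) example motif
    sequences = [plant_motif(motif, 4, 600, rng)[0] for _ in range(20)]
    positions, buckets = project_buckets(sequences, l=15, k=7, rng=rng)
    largest = max(buckets.values(), key=len)
    print("projected positions:", positions)
    print("largest bucket size:", len(largest))
```

A single projection may split the planted instances across several buckets, so in practice one would presumably repeat the bucketing with fresh random positions and inspect the buckets that are unusually full; how many repetitions and what threshold to use are left open here.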

Original language: English
Pages: 69-76
Number of pages: 8
State: Published - 2001
Event: 5th Annual International Conference on Computational Biology - Montreal, Que., Canada
Duration: May 22 2001 to May 26 2001

Conference

Conference: 5th Annual International Conference on Computational Biology
Country/Territory: Canada
City: Montreal, Que.
Period: 05/22/01 to 05/26/01

Fingerprint

Dive into the research topics of 'Finding motifs using random projections'. Together they form a unique fingerprint.
