TY - JOUR
T1 - Finding the most significant common sequence and structure motifs in a set of RNA sequences
AU - Gorodkin, J.
AU - Heyer, L. J.
AU - Stormo, G. D.
N1 - Funding Information:
Thanks to B. Javornik and NexStar for providing the data electronically. We thank S. Eddy for advice on the use of COVE and an anonymous reviewer for many helpful comments. This work was sponsored in part by NIH grant HG00249 to GDS. JG was supported by the Danish National Research Foundation.
PY - 1997/9/15
Y1 - 1997/9/15
N2 - We present a computational scheme to locally align a collection of RNA sequences using sequence and structure constraints. In addition, the method searches for the resulting alignments with the most significant common motifs, among all possible collections. The first part utilizes a simplified version of the Sankoff algorithm for simultaneous folding and alignment of RNA sequences, but maintains tractability by constructing multi-sequence alignments from pair-wise comparisons. The algorithm finds the multiple alignments using a greedy approach and has similarities to both CLUSTAL and CONSENSUS, but the core algorithm assures that the pair-wise alignments are optimized for both sequence and structure conservation. The choice of scoring system and the method of progressively constructing the final solution are important considerations that are discussed. Example solutions, and comparisons with other approaches, are provided. The solutions include finding consensus structures identical to published ones.
UR - http://www.scopus.com/inward/record.url?scp=0030812919&partnerID=8YFLogxK
U2 - 10.1093/nar/25.18.3724
DO - 10.1093/nar/25.18.3724
M3 - Article
C2 - 9278497
AN - SCOPUS:0030812919
SN - 0305-1048
VL - 25
SP - 3724
EP - 3732
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 18
ER -