TY - JOUR
T1 - Sequence landscapes
AU - Clift, B.
AU - Haussler, D.
AU - McConnell, R.
AU - Schneider, T. D.
AU - Stormo, G. D.
N1 - Funding Information:
B. Clift, D. Haussler and R. McConnell were supported by NSF grant IST-8317918. T. Schneider and G. Stormo were supported by NIH grant GM28755. The Pyramid 90x computer was purchased for the Department of MCD Biology by NIH grant RR01538. We would like to thank A. Ehrenfeucht, A. Blumer and J. Blumer for their help and criticism during the gradual evolution of this software over the past year and a half.
PY - 1986/1/10
Y1 - 1986/1/10
N2 - We describe a method for representing the structure of repeating sequences in nucleic-acids, proteins and other texts. A portion of the sequence is presented at the bottom of a CRT screen. Above the sequence is its landscape, which looks like a mountain range. Each mountain corresponds to a subsequence of the sequence. At the peak of every mountain is written the number of times that the subsequence appears. A data structure called a DAWG, which can be built in time proportional to the length of the sequence, is used to construct the landscape. For the 40 thousand bases of bacteriophage T7, the DAWG can be built in 30 seconds. The time to display any portion of the landscape is less than a second. Using sequence landscapes, one can quickly locate significant repeats.
AB - We describe a method for representing the structure of repeating sequences in nucleic-acids, proteins and other texts. A portion of the sequence is presented at the bottom of a CRT screen. Above the sequence is its landscape, which looks like a mountain range. Each mountain corresponds to a subsequence of the sequence. At the peak of every mountain is written the number of times that the subsequence appears. A data structure called a DAWG, which can be built in time proportional to the length of the sequence, is used to construct the landscape. For the 40 thousand bases of bacteriophage T7, the DAWG can be built in 30 seconds. The time to display any portion of the landscape is less than a second. Using sequence landscapes, one can quickly locate significant repeats.
UR - http://www.scopus.com/inward/record.url?scp=0023045776&partnerID=8YFLogxK
U2 - 10.1093/nar/14.1.141
DO - 10.1093/nar/14.1.141
M3 - Article
C2 - 3753762
AN - SCOPUS:0023045776
SN - 0305-1048
VL - 14
SP - 141
EP - 158
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 1
ER -