TY - JOUR
T1 - Consensus patterns in DNA
AU - Stormo, Gary D.
N1 - Funding Information: I thank Calvin Harley for sending me the E. coli promoter sequences on a floppy disk, George Hartzell for helping with the figures, and Gerald Hertz for critical comments on the manuscript. Supported by National Institutes of Health Grant GM28755.
PY - 1990/1/1
Y1 - 1990/1/1
N2 - Matrices can provide realistic representations of protein/DNA specificity. In many cases simple mononucleotide-based matrices are adequate representations, but more complex matrices may be needed for other cases. Unlike simple consensus sequences, matrices allow for different penalties to be assessed for different changes to a binding site, a property that is essential for accurate description of a binding site pattern. When only a collection of binding site sequences is known, the best representation for the pattern is an information content formulation, based on both thermodynamic and statistical considerations. Quantitative data on relative binding affinities may be used to determine matrices that provide a best fit to the data. Matrix representations also provide an efficient method of aligning multiple sequences to identify binding site patterns that they have in common.
UR - http://www.scopus.com/inward/record.url?scp=0025342573&partnerID=8YFLogxK
U2 - 10.1016/0076-6879(90)83015-2
DO - 10.1016/0076-6879(90)83015-2
M3 - Article
C2 - 2179676
AN - SCOPUS:0025342573
SN - 0076-6879
VL - 183
SP - 211
EP - 221
JO - Methods in enzymology
JF - Methods in enzymology
IS - C
ER -