Identification of consensus patterns in unaligned DNA sequences known to be functionally related

Gerald Z. Hertz, George W. Hartzell, Gary D. Stormo

Research output: Contribution to journal › Article › peer-review

280 Scopus citations

Abstract

We have developed a method for identifying consensus patterns in a set of unaligned DNA sequences known to bind a common protein or to have some other common biochemical function. The method is based on a matrix representation of binding site patterns. Each row of the matrix represents one of the four possible bases, each column represents one of the positions of the binding site, and each element is determined by the frequency with which the indicated base occurs at the indicated position. The goal of the method is to find the most significant matrix, i.e. the one with the lowest probability of occurring by chance, out of all the matrices that can be formed from the set of related sequences. The reliability of the method improves with the number of sequences, while the time required increases only linearly with the number of sequences. To test this method, we analysed 11 DNA sequences containing promoters regulated by the Escherichia coli LexA protein. The matrices we found were consistent with the known consensus sequence, and could distinguish the generally accepted LexA binding sites from other DNA sequences.
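The matrix representation described in the abstract can be illustrated with a minimal Python sketch. It assumes the candidate sites are already aligned and of equal length, so it covers only the matrix construction, not the search for the most significant matrix among unaligned sequences. The function names `frequency_matrix` and `consensus` and the toy sites are hypothetical and are not taken from the paper.

```python
from collections import Counter

BASES = "ACGT"  # one matrix row per base


def frequency_matrix(sites):
    """Build a base-by-position frequency matrix from equal-length sites.

    Returns a dict mapping each base to a list of frequencies, one per
    position. This is an illustrative sketch, not the authors' code.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = {b: [0.0] * length for b in BASES}
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        for b in BASES:
            matrix[b][pos] = counts[b] / len(sites)
    return matrix


def consensus(matrix):
    """Read off the most frequent base at each position."""
    length = len(matrix["A"])
    return "".join(
        max(BASES, key=lambda b: matrix[b][pos]) for pos in range(length)
    )


# Hypothetical toy sites, not data from the paper.
toy_sites = ["CTGT", "CTGA", "CTGT", "ATGT"]
m = frequency_matrix(toy_sites)
```

Here `m["C"][0]` is the fraction of sites with C at the first position, and `consensus(m)` returns the most frequent base in each column.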

Original language: English
Pages (from-to): 81-92
Number of pages: 12
Journal: Bioinformatics
Volume: 6
Issue number: 2
State: Published - Apr 1 1990
Externally published: Yes
