TY - JOUR
T1 - Identifying target sites for cooperatively binding factors
AU - GuhaThakurta, D.
AU - Stormo, G. D.
N1 - Funding Information:
Compaq Computers and Ray Hookway are gratefully acknowledged for making available CPU cycles for our computational purposes. Chris Workman is thanked for his useful input during development of the method. We thank Chip Lawrence for providing the 1.01.009 version of Gibbs Motif Sampler and instructions on its use. We thank Xiaole Liu and Jun Liu for providing the BioProspector program. One anonymous reviewer is thanked for bringing to our attention several relevant publications. The work was supported by grant HG00249 from the National Institutes of Health to GDS.
PY - 2001
Y1 - 2001
AB - Motivation: Transcriptional activation in eukaryotic organisms normally requires combinatorial interactions of multiple transcription factors. Though several methods exist for identification of individual protein binding site patterns in DNA sequences, there are few methods for discovery of binding site patterns for cooperatively acting factors. Here we present an algorithm, Co-Bind (for COperative BINDing), for discovering DNA target sites for cooperatively acting transcription factors. The method utilizes a Gibbs sampling strategy to model the cooperativity between two transcription factors and defines position weight matrices for the binding sites. Sequences from both the training set and the entire genome are taken into account, in order to discriminate against commonly occurring patterns in the genome and produce patterns which are significant only in the training set. Results: We have tested Co-Bind on semi-synthetic and real data sets to show it can efficiently identify DNA target site patterns for cooperatively binding transcription factors. In cases where binding site patterns are weak and cannot be identified by other available methods, Co-Bind, by virtue of modeling the cooperativity between factors, can identify those sites efficiently. Though developed to model protein-DNA interactions, the scope of Co-Bind may be extended to combinatorial, sequence-specific interactions in other macromolecules.
UR - http://www.scopus.com/inward/record.url?scp=0034894539&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/17.7.608
DO - 10.1093/bioinformatics/17.7.608
M3 - Article
C2 - 11448879
AN - SCOPUS:0034894539
SN - 1367-4803
VL - 17
SP - 608
EP - 621
JO - Bioinformatics
JF - Bioinformatics
IS - 7
ER -