Abstract
Transcription factors (TFs) are major modulators of transcription and subsequent cellular processes. The binding of TFs to specific regulatory elements is governed by their specificity. Given the gap between the number of TFs with known sequence and the number with known specificity, specificity prediction frameworks are highly desirable. Key inputs to such frameworks are the protein residues that modulate the specificity of the TF under consideration. Simple measures such as mutual information (MI) fail to delineate specificity influencing residues (SIRs) from an alignment because of constraints imposed by the three-dimensional structure of the protein. These structural restraints on the evolution of the amino-acid sequence lead to the identification of false SIRs. In this manuscript we extended three methods (direct information, PSICOV and adjusted mutual information), which have been used to disentangle spurious indirect protein residue-residue contacts from direct contacts, to identify SIRs from joint alignments of amino acids and specificity. We predicted SIRs for the homeodomain (HD), helix-loop-helix, LacI and GntR families of TFs using these methods and compared the results to those of MI. Using various measures, we show that the performance of these three methods is comparable to one another and better than that of MI. The implications of these methods for specificity prediction frameworks are discussed. The methods are implemented as an R package, which is available along with the alignments at http://stormo.wustl.edu/SpecPred.
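To make the MI baseline concrete, the following is a minimal sketch of plain mutual information between one amino-acid column and one specificity column of a joint alignment. It is not the R package's implementation and does not include any of the three extended methods. The function name `mutual_information` and the toy columns are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete columns."""
    if len(xs) != len(ys):
        raise ValueError("columns must have the same number of rows")
    n = len(xs)
    count_x = Counter(xs)            # marginal counts for column x
    count_y = Counter(ys)            # marginal counts for column y
    count_xy = Counter(zip(xs, ys))  # joint counts over aligned rows
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy data: each row is one TF; the residue columns come from the
# protein alignment and base_col is the preferred base at one site position.
residue_col_a = list("KKKKRRRR")  # covaries perfectly with the base
residue_col_b = list("AAGGAAGG")  # independent of the base
base_col = list("AAAAGGGG")

mi_a = mutual_information(residue_col_a, base_col)
mi_b = mutual_information(residue_col_b, base_col)
print(mi_a, mi_b)
```

In this toy case the first residue column scores high and the second scores zero. On real alignments, a high MI score can also arise indirectly, through residues that covary with a true SIR for structural reasons, which is the failure mode the abstract describes.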
Original language | English |
---|---|
Pages (from-to) | 115-123 |
Number of pages | 9 |
Journal | Quantitative Biology |
Volume | 3 |
Issue number | 3 |
State | Published - Sep 1 2015 |
Keywords
- co-evolution
- direct information
- feature selection
- motifs
- protein-DNA interactions
- residue co-variance
- specificity determinants