Abstract

This work describes ANN-Spec, a machine learning algorithm, and its application to discovering ungapped patterns in DNA sequences. The approach uses an Artificial Neural Network and a Gibbs sampling method to define the Specificity of a DNA-binding protein. ANN-Spec searches for the parameters of a simple network (or weight matrix) that maximize the specificity for binding sequences of a positive set compared to a background sequence set. Binding sites in the positive data set are found with the resulting weight matrix, and these sites are then used to define a local multiple sequence alignment. Training complexity is O(lN), where l is the width of the pattern and N is the size of the positive training data. A quantitative comparison of ANN-Spec and a few related programs is presented. The comparison shows that ANN-Spec finds patterns of higher specificity when training with a background data set. The program and documentation are available from the authors for UNIX systems.
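
To make the site-finding step concrete, the following is a minimal sketch of how a weight matrix of width l can be scanned along each positive sequence to pick one site per sequence and stack those sites into an ungapped alignment. It is an illustration under stated assumptions, not the ANN-Spec program: the function names, the toy matrix, and the toy sequences are all hypothetical, and the network training and Gibbs sampling steps are not shown.

```python
# Illustrative sketch only; this is not the ANN-Spec implementation.
# All names, weights, and sequences below are hypothetical.
ALPHABET = "ACGT"


def score_window(weights, window):
    """Sum the per-position weights for one window of width l."""
    return sum(weights[i][ALPHABET.index(base)] for i, base in enumerate(window))


def best_site(weights, seq):
    """Return (start, score) of the highest-scoring window in seq."""
    width = len(weights)
    scored = [
        (score_window(weights, seq[start:start + width]), start)
        for start in range(len(seq) - width + 1)
    ]
    # max() keeps the first window when several share the top score.
    top_score, top_start = max(scored, key=lambda pair: pair[0])
    return top_start, top_score


def align_sites(weights, seqs):
    """Take the best site from each sequence to form an ungapped alignment."""
    width = len(weights)
    sites = []
    for seq in seqs:
        start, _ = best_site(weights, seq)
        sites.append(seq[start:start + width])
    return sites


# Hypothetical 3-wide matrix favouring "TGA".
# Rows are pattern positions; columns follow ALPHABET (A, C, G, T).
toy_weights = [
    [-1.0, -1.0, -1.0, 2.0],
    [-1.0, -1.0, 2.0, -1.0],
    [2.0, -1.0, -1.0, -1.0],
]
toy_positive = ["CCTGACC", "ATGAGGG", "GGGCTGA"]
alignment = align_sites(toy_weights, toy_positive)
```

In this sketch, scoring every window of a sequence of length n takes on the order of l times n additions, so one pass over the positive set scales with the pattern width times the amount of positive sequence.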

Original language: English
Pages (from-to): 467-478
Number of pages: 12
Journal: Pacific Symposium on Biocomputing
State: Published - 2000

Title: ANN-Spec: a method for discovering transcription factor binding sites with improved specificity