TY - JOUR
T1 - Analysis of differentially-regulated genes within a regulatory network by GPS genome navigation
AU - Zwir, Igor
AU - Huang, Henry
AU - Groisman, Eduardo A.
N1 - Funding Information:
The authors thank G. Stormo, E. Ruspini and C. Gu for methodological suggestions and H. Salgado for the Salmonella operon information. This work was supported, in part, by the grant AI49561 from the National Institutes of Health to E.A.G., who is an Investigator of the Howard Hughes Medical Institute. I.Z. is also a member of the Computer Science Department at the University of Granada, Spain, and supported, in part, by the Spanish Ministry of Science and Technology under project TIC 2003-00 and BIO2004-0270E. Funding to pay the Open Access publication charges for this article was provided by the Howard Hughes Medical Institute.
PY - 2005/11/15
Y1 - 2005/11/15
N2 - Motivation: A critical challenge of the post-genomic era is to understand how genes are differentially regulated even when they belong to a given network. Because the fundamental mechanism controlling gene expression operates at the level of transcription initiation, computational techniques have been developed that identify cis regulatory features and map such features into expression patterns to classify genes into distinct networks. However, these methods are not focused on distinguishing between differentially regulated genes within a given network. Here we describe an unsupervised machine learning method, termed GPS for gene promoter scan, that discriminates among co-regulated promoters by simultaneously considering both cis-acting regulatory features and gene expression. GPS is particularly useful for knowledge discovery in environments with reduced datasets and high levels of uncertainty. Results: Application of this method to the enteric bacteria Escherichia coli and Salmonella enterica uncovered novel members, as well as regulatory interactions in the regulon controlled by the PhoP protein that were not discovered using previous approaches. The predictions made by GPS were experimentally validated to establish that the PhoP protein uses multiple mechanisms to control gene transcription, and is a central element in a highly connected network.
UR - http://www.scopus.com/inward/record.url?scp=27944461068&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/bti672
DO - 10.1093/bioinformatics/bti672
M3 - Article
C2 - 16159917
AN - SCOPUS:27944461068
SN - 1367-4803
VL - 21
SP - 4073
EP - 4083
JO - Bioinformatics
JF - Bioinformatics
IS - 22
ER -