TY - JOUR
T1 - NetProphet 2.0
T2 - Mapping transcription factor networks by exploiting scalable data resources
AU - Kang, Yiming
AU - Liow, Hien Haw
AU - Maier, Ezekiel J.
AU - Brent, Michael R.
N1 - Publisher Copyright: © The Author 2017. Published by Oxford University Press.
PY - 2018/1/15
Y1 - 2018/1/15
N2 - Motivation: Cells process information, in part, through transcription factor (TF) networks, which control the rates at which individual genes produce their products. A TF network map is a graph that indicates which TFs bind and directly regulate each gene. Previous work has described network mapping algorithms that rely exclusively on gene expression data and 'integrative' algorithms that exploit a wide range of data sources including chromatin immunoprecipitation sequencing (ChIP-seq) of many TFs, genome-wide chromatin marks, and binding specificities for many TFs determined in vitro. However, such resources are available only for a few major model systems and cannot be easily replicated for new organisms or cell types. Results: We present NetProphet 2.0, a 'data light' algorithm for TF network mapping, and show that it is more accurate at identifying direct targets of TFs than other, similarly data light algorithms. In particular, it improves on the accuracy of NetProphet 1.0, which used only gene expression data, by exploiting three principles. First, combining multiple approaches to network mapping from expression data can improve accuracy relative to the constituent approaches. Second, TFs with similar DNA binding domains bind similar sets of target genes. Third, even a noisy, preliminary network map can be used to infer DNA binding specificities from promoter sequences, and these inferred specificities can be used to further improve the accuracy of the network map. Availability and implementation: Source code and comprehensive documentation are freely available at https://github.com/yiming-kang/NetProphet-2.0. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
AB - Motivation: Cells process information, in part, through transcription factor (TF) networks, which control the rates at which individual genes produce their products. A TF network map is a graph that indicates which TFs bind and directly regulate each gene. Previous work has described network mapping algorithms that rely exclusively on gene expression data and 'integrative' algorithms that exploit a wide range of data sources including chromatin immunoprecipitation sequencing (ChIP-seq) of many TFs, genome-wide chromatin marks, and binding specificities for many TFs determined in vitro. However, such resources are available only for a few major model systems and cannot be easily replicated for new organisms or cell types. Results: We present NetProphet 2.0, a 'data light' algorithm for TF network mapping, and show that it is more accurate at identifying direct targets of TFs than other, similarly data light algorithms. In particular, it improves on the accuracy of NetProphet 1.0, which used only gene expression data, by exploiting three principles. First, combining multiple approaches to network mapping from expression data can improve accuracy relative to the constituent approaches. Second, TFs with similar DNA binding domains bind similar sets of target genes. Third, even a noisy, preliminary network map can be used to infer DNA binding specificities from promoter sequences, and these inferred specificities can be used to further improve the accuracy of the network map. Availability and implementation: Source code and comprehensive documentation are freely available at https://github.com/yiming-kang/NetProphet-2.0. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
UR - http://www.scopus.com/inward/record.url?scp=85040627148&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btx563
DO - 10.1093/bioinformatics/btx563
M3 - Article
C2 - 28968736
AN - SCOPUS:85040627148
SN - 1367-4803
VL - 34
SP - 249
EP - 257
JO - Bioinformatics
JF - Bioinformatics
IS - 2
ER -