TY - JOUR
T1 - Dual threshold optimization and network inference reveal convergent evidence from TF binding locations and TF perturbation responses
AU - Kang, Yiming
AU - Patel, Nikhil R.
AU - Shively, Christian
AU - Recio, Pamela Samantha
AU - Chen, Xuhua
AU - Wranik, Bernd J.
AU - Kim, Griffin
AU - McIsaac, R. Scott
AU - Mitra, Robi
AU - Brent, Michael R.
N1 - Publisher Copyright: © 2020 Kang et al.; Published by Cold Spring Harbor Laboratory Press
PY - 2020
Y1 - 2020
AB - A high-confidence map of the direct, functional targets of each transcription factor (TF) requires convergent evidence from independent sources. Two significant sources of evidence are TF binding locations and the transcriptional responses to direct TF perturbations. Systematic data sets of both types exist for yeast and human, but they rarely converge on a common set of direct, functional targets for a TF. Even the few genes that are both bound and responsive may not be direct functional targets. Our analysis shows that when there are many nonfunctional binding sites and many indirect targets, nonfunctional sites are expected to occur in the cis-regulatory DNA of indirect targets by chance. To address this problem, we introduce dual threshold optimization (DTO), a new method for setting significance thresholds on binding and perturbation-response data, and show that it improves convergence. It also enables comparison of binding data to perturbation-response data that have been processed by network inference algorithms, which further improves convergence. The combination of dual threshold optimization and network inference greatly expands the high-confidence TF network map in both yeast and human. Next, we analyze a comprehensive new data set measuring the transcriptional response shortly after inducing overexpression of a yeast TF. We also present a new yeast binding location data set obtained by transposon calling cards and compare it to recent ChIP-exo data. These new data sets improve convergence and expand the high-confidence network synergistically.
UR - http://www.scopus.com/inward/record.url?scp=85082536886&partnerID=8YFLogxK
U2 - 10.1101/gr.259655.119
DO - 10.1101/gr.259655.119
M3 - Article
C2 - 32060051
AN - SCOPUS:85082536886
SN - 1088-9051
VL - 30
SP - 459
EP - 471
JO - Genome Research
JF - Genome Research
IS - 3
ER -