TY - JOUR
T1 - Inferring Disease-Associated microRNAs in Heterogeneous Networks with Node Attributes
AU - Xuan, Ping
AU - Shen, Tonghui
AU - Wang, Xiao
AU - Zhang, Tiangang
AU - Zhang, Weixiong
N1 - Funding Information:
The work was supported by the Natural Science Foundation of China (61702296, 61302139), the United States National Institutes of Health (R01GM100364), the Natural Science Foundation of Heilongjiang Province (F2015013, F2018028), the Young Innovative Talent Research Foundation of Harbin Science and Technology Bureau (2015RAQXJ004, 2016RQQXJ135), the Distinguished Youth Foundation of Heilongjiang University (JCL201405), the Fundamental Research Foundation of Universities in Heilongjiang Province for Technology Innovation, and the Fundamental Research Foundation of Universities in Heilongjiang Province for Youth Innovation Team.
Publisher Copyright:
© 2004-2012 IEEE.
PY - 2020/5/1
Y1 - 2020/5/1
N2 - Identification of disease-associated microRNAs (disease miRNAs) is an essential step towards discovering causal miRNAs and understanding disease pathogenesis. Two sources of information can be exploited for predicting disease miRNAs: the connections between miRNAs, between diseases, and between miRNAs and diseases, and the attributes of miRNA nodes. The former contains information on miRNA similarities, disease similarities, and miRNA-disease associations. The latter includes information on the families and clusters that miRNAs belong to. Similar diseases are usually associated with miRNAs that have similar functions and common attributes. However, most existing methods for disease miRNA prediction focus only on the connections of miRNAs and diseases. It remains challenging to adequately integrate the connections and miRNA node attributes to identify more reliable candidate disease miRNAs. We propose a non-negative matrix factorization based method, FamCluRank, for predicting disease miRNAs in heterogeneous networks with node attributes. One of the novelties of FamCluRank is that it fully utilizes these two overlooked characteristics of miRNAs, focusing particularly on a deep integration of miRNA family and cluster attributes. The integration was achieved by three means. First, we constructed a miRNA-disease heterogeneous network with node attributes, in which the miRNA nodes carry their family and cluster attributes. Second, miRNAs sharing more common families and clusters are more likely to be associated with the diseases that are also related to these families and clusters. On the basis of this biological premise, we constructed a novel prediction model for FamCluRank that deeply integrates the family and cluster attributes of miRNAs. Third, two similar diseases tend to be associated with more common miRNA families and clusters, and vice versa. Hence, FamCluRank's prediction model considers not only the possible associations between miRNAs and diseases but also the possible disease-family and disease-cluster associations. Comparison with state-of-the-art methods showed FamCluRank's superior performance not only on well-characterized diseases but also on new ones. Case studies on colorectal neoplasms, pancreatic neoplasms, lung neoplasms, and 32 new diseases demonstrated its ability to discover potential disease miRNAs. FamCluRank is a potent prioritization tool for screening reliable candidates for subsequent studies of their involvement in the pathogenesis of diseases. The web service of FamCluRank, the candidate disease miRNAs for 329 diseases, and the dataset used to develop FamCluRank are available at http://www.famclurank.top.
KW - Disease-associated miRNA prediction
KW - heterogeneous networks
KW - node attributes
KW - non-negative matrix factorization
UR - http://www.scopus.com/inward/record.url?scp=85054405708&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2018.2872574
DO - 10.1109/TCBB.2018.2872574
M3 - Article
AN - SCOPUS:85054405708
SN - 1545-5963
VL - 17
SP - 1019
EP - 1031
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
IS - 3
M1 - 8476231
ER -