TY - JOUR
T1 - Transactional database transformation and its application in prioritizing human disease genes
AU - Xiang, Yang
AU - Payne, Philip R.O.
AU - Huang, Kun
N1 - Funding Information:
The authors would like to thank the anonymous reviewers for their helpful comments and suggestions. This work was supported by the US National Science Foundation (NSF) under Grant #1019343 to the Computing Research Association for the CIFellows Project, and by the National Cancer Institute under Grant NCI R01CA141090.
PY - 2012
Y1 - 2012
N2 - Binary (0,1) matrices, commonly known as transactional databases, can represent many application data, including gene-phenotype data where "1" represents a confirmed gene-phenotype relation and "0" represents an unknown relation. It is natural to ask what information is hidden behind these "0"s and "1"s. Unfortunately, recent matrix completion methods, though very effective in many cases, are less likely to infer something interesting from these (0,1)-matrices. To answer this challenge, we propose IndEvi, a very succinct and effective algorithm to perform independent-evidence-based transactional database transformation. Each entry of a (0,1)-matrix is evaluated by "independent evidence" (maximal supporting patterns) extracted from the whole matrix for this entry. The value of an entry, regardless of its value as 0 or 1, has completely no effect for its independent evidence. The experiment on a gene-phenotype database shows that our method is highly promising in ranking candidate genes and predicting unknown disease genes.
AB - Binary (0,1) matrices, commonly known as transactional databases, can represent many application data, including gene-phenotype data where "1" represents a confirmed gene-phenotype relation and "0" represents an unknown relation. It is natural to ask what information is hidden behind these "0"s and "1"s. Unfortunately, recent matrix completion methods, though very effective in many cases, are less likely to infer something interesting from these (0,1)-matrices. To answer this challenge, we propose IndEvi, a very succinct and effective algorithm to perform independent-evidence-based transactional database transformation. Each entry of a (0,1)-matrix is evaluated by "independent evidence" (maximal supporting patterns) extracted from the whole matrix for this entry. The value of an entry, regardless of its value as 0 or 1, has completely no effect for its independent evidence. The experiment on a gene-phenotype database shows that our method is highly promising in ranking candidate genes and predicting unknown disease genes.
KW - Transactional database
KW - binary matrix
KW - disease gene
KW - frequent item set mining
KW - matrix completion
KW - maximal biclique
KW - phenotype
KW - prioritization
UR - http://www.scopus.com/inward/record.url?scp=81455139750&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2011.58
DO - 10.1109/TCBB.2011.58
M3 - Article
C2 - 21422495
AN - SCOPUS:81455139750
SN - 1545-5963
VL - 9
SP - 294
EP - 304
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
IS - 1
M1 - 5740841
ER -