TY - JOUR
T1 - The complete parsimony haplotype inference problem and algorithms based on integer programming, branch-and-bound and Boolean satisfiability
AU - Jäger, Gerold
AU - Climer, Sharlee
AU - Zhang, Weixiong
N1 - Funding Information:
The research was supported in part by an Olin Fellowship to SC, an Alzheimer's Association grant, two NSF grants ( IIS-0535257 , DBI-0743797 ) to WZ and also by NIH grant R01GM100364 to WZ and SC.
Publisher Copyright:
© 2016 Elsevier B.V.
PY - 2016/3/1
Y1 - 2016/3/1
N2 - Haplotype inference by pure parsimony (Hipp) is a well-known paradigm for haplotype inference. In order to assess the biological significance of this paradigm, we generalize the problem of Hipp to the problem of finding all optimal solutions, which we call Chipp. We study intrinsic haplotype features, such as backbone haplotypes and fat genotypes as well as equal columns and decomposability. We explicitly exploit these features in three computational approaches that are based on integer linear programming, depth-first branch-and-bound, and Boolean satisfiability. Further we introduce two hybrid algorithms that draw upon the diverse strengths of the approaches. Our experimental analysis shows that our optimized algorithms are significantly superior to the baseline algorithms, often with orders of magnitude faster running time. Finally, our experiments provide some useful insights into the intrinsic features of this important problem.
AB - Haplotype inference by pure parsimony (Hipp) is a well-known paradigm for haplotype inference. In order to assess the biological significance of this paradigm, we generalize the problem of Hipp to the problem of finding all optimal solutions, which we call Chipp. We study intrinsic haplotype features, such as backbone haplotypes and fat genotypes as well as equal columns and decomposability. We explicitly exploit these features in three computational approaches that are based on integer linear programming, depth-first branch-and-bound, and Boolean satisfiability. Further we introduce two hybrid algorithms that draw upon the diverse strengths of the approaches. Our experimental analysis shows that our optimized algorithms are significantly superior to the baseline algorithms, often with orders of magnitude faster running time. Finally, our experiments provide some useful insights into the intrinsic features of this important problem.
KW - Boolean satisfiability
KW - Branch-and-bound
KW - Haplotype inference
KW - Integer programming
KW - Pure parsimony
UR - http://www.scopus.com/inward/record.url?scp=84975678746&partnerID=8YFLogxK
U2 - 10.1016/j.jda.2016.06.001
DO - 10.1016/j.jda.2016.06.001
M3 - Article
AN - SCOPUS:84975678746
SN - 1570-8667
VL - 37
SP - 68
EP - 83
JO - Journal of Discrete Algorithms
JF - Journal of Discrete Algorithms
ER -