TY - JOUR
T1 - Enspara
T2 - Modeling molecular ensembles with scalable data structures and parallel computing
AU - Porter, J. R.
AU - Zimmerman, M. I.
AU - Bowman, G. R.
N1 - Funding Information:
We are grateful to the Folding@home users for computing resources. This work was funded by National Institutes of Health Grant Nos. R01GM12400701 and T32GM02700, as well as by the National Science Foundation CAREER Award No. MCB-1552471. G.R.B. holds a Career Award at the Scientific Interface from the Burroughs Wellcome Fund and a Packard Fellowship for Science and Engineering from The David and Lucile Packard Foundation. M.I.Z. holds a Monsanto Graduate Fellowship and a Center for Biological Systems Engineering Fellowship.
Publisher Copyright:
© 2019 Author(s).
PY - 2019/1/28
Y1 - 2019/1/28
N2 - Markov state models (MSMs) are quantitative models of protein dynamics that are useful for uncovering the structural fluctuations that proteins undergo, as well as the mechanisms of these conformational changes. Given the enormity of conformational space, there has been ongoing interest in identifying a small number of states that capture the essential features of a protein. Generally, this is achieved by making assumptions about the properties of relevant features - for example, that the most important features are those that change slowly. An alternative strategy is to keep as many degrees of freedom as possible and subsequently learn from the model which of the features are most important. In these larger models, however, traditional approaches quickly become computationally intractable. In this paper, we present enspara, a library for working with MSMs that provides several novel algorithms and specialized data structures that dramatically improve the scalability of traditional MSM methods. This includes ragged arrays for minimizing memory requirements, message passing interface-parallelized implementations of compute-intensive operations, and a flexible framework for model construction and analysis.
UR - http://www.scopus.com/inward/record.url?scp=85060813657&partnerID=8YFLogxK
U2 - 10.1063/1.5063794
DO - 10.1063/1.5063794
M3 - Article
C2 - 30709308
AN - SCOPUS:85060813657
SN - 0021-9606
VL - 150
JO - Journal of Chemical Physics
JF - Journal of Chemical Physics
IS - 4
M1 - 044108
ER -