CAMELOT: A machine learning approach for Coarse-grained simulations of aggregation of block-copolymeric protein sequences

Kiersten M. Ruff, Tyler S. Harmon, Rohit V. Pappu

Research output: Contribution to journal, Article, peer-reviewed

28 Scopus citations


We report the development and deployment of a coarse-graining method that is well suited for computer simulations of aggregation and phase separation of protein sequences with block-copolymeric architectures. Our algorithm, named CAMELOT for Coarse-grained simulations Aided by MachinE Learning Optimization and Training, leverages information from converged all-atom simulations to determine a suitable resolution and to parameterize the coarse-grained model. To parameterize a system-specific coarse-grained model, we use a combination of Boltzmann inversion, non-linear regression, and a Gaussian process Bayesian optimization approach. The accuracy of the coarse-grained model is demonstrated through direct comparisons to results from all-atom simulations. We demonstrate the utility of our coarse-graining approach using the block-copolymeric sequence from the exon 1 encoded sequence of the huntingtin protein. This sequence comprises 17 residues from the N-terminal end of huntingtin (N17) followed by a polyglutamine (polyQ) tract. Simulations based on the CAMELOT approach are used to show that the adsorption and unfolding of the wild-type N17 and its sequence variants on the surface of polyQ tracts engender a patchy-colloid-like architecture that promotes the formation of linear aggregates. These results provide a plausible explanation for experimental observations, which show that N17 accelerates the formation of linear aggregates in block-copolymeric N17-polyQ sequences. The CAMELOT approach is versatile and generalizable for simulating the aggregation and phase behavior of a range of block-copolymeric protein sequences.
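
Of the three parameterization ingredients named in the abstract, Boltzmann inversion is the simplest to illustrate: a sampled distribution P(r) of some coordinate is converted into an effective potential U(r) = -k_B T ln P(r). The sketch below is a generic illustration of that textbook relation and is not the authors' implementation. The function name `boltzmann_invert`, the temperature, the bin count, and the synthetic Gaussian "bond length" data are all assumptions made for the example; in practice the samples would come from converged all-atom simulations.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def boltzmann_invert(samples, temperature=298.0, bins=50):
    """Return bin centers and U(r) = -kB*T*ln P(r), shifted so min(U) = 0.

    Hypothetical helper for illustration only.
    """
    # Normalized histogram approximates the probability density P(r).
    hist, edges = np.histogram(samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Empty bins would give an infinite energy, so they are dropped.
    mask = hist > 0
    u = -KB * temperature * np.log(hist[mask])
    return centers[mask], u - u.min()


# Synthetic stand-in for a distance sampled in an all-atom simulation
# (assumed values: mean 3.8, standard deviation 0.1, arbitrary length units).
rng = np.random.default_rng(0)
samples = rng.normal(loc=3.8, scale=0.1, size=100_000)

r, u = boltzmann_invert(samples)
r_min = r[np.argmin(u)]  # location of the potential minimum
```

For Gaussian-distributed samples the inverted potential is approximately harmonic around the mean, so `r_min` should fall close to 3.8. Non-bonded interaction parameters are generally harder to extract this way, which is consistent with the abstract's mention of regression and Bayesian optimization as additional steps.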

Original language: English
Article number: 243123
Journal: Journal of Chemical Physics
Issue number: 24
State: Published - Dec 28 2015

