Automatic Selection of Order Parameters in the Analysis of Large Scale Molecular Dynamics Simulations.

Sultan MM, Kiss G, Shukla D, Pande VS - J Chem Theory Comput (2014)

Bottom Line: We address this challenge by introducing a method called clustering based feature selection (CB-FS) that employs a posterior analysis approach. It combines supervised machine learning (SML) and feature selection with Markov state models to automatically identify the relevant degrees of freedom that separate conformational states. We highlight the utility of the method in the evaluation of large-scale simulations and show that it can be used for the rapid and automated identification of relevant order parameters involved in the functional transitions of two exemplary cell-signaling proteins central to human disease states.


Affiliation: Department of Chemistry, Stanford University, 318 Campus Drive, Stanford, California 94305, United States.

ABSTRACT

Given the large number of crystal structures and NMR ensembles that have been solved to date, classical molecular dynamics (MD) simulations have become powerful tools in the atomistic study of the kinetics and thermodynamics of biomolecular systems on ever increasing time scales. By virtue of the high-dimensional conformational state space that is explored, the interpretation of large-scale simulations faces difficulties not unlike those in the big data community. We address this challenge by introducing a method called clustering based feature selection (CB-FS) that employs a posterior analysis approach. It combines supervised machine learning (SML) and feature selection with Markov state models to automatically identify the relevant degrees of freedom that separate conformational states. We highlight the utility of the method in the evaluation of large-scale simulations and show that it can be used for the rapid and automated identification of relevant order parameters involved in the functional transitions of two exemplary cell-signaling proteins central to human disease states.


Figure 4: A) Gini importance of different dihedrals showing the importance of K295, E310, H384, and the A-loop (404–424). B) The c-Src activation pathways projected onto a two-dimensional reaction coordinate showing the sequence of events that needs to happen for the system to activate. Note how the A-loop (red) needs to first unfold, followed by the rotation of the C-helix (orange). C) On an atomistic scale, the A-loop (red) unfolds, followed by the twisting motion of E310 to interact with K295. D) The salt bridge switching mechanism involved in this system. For more details see the works of Shukla et al. [12] and Meng et al. [34,35].

Mentions: The key results and their biological interpretation are summarized in Figure 4. Src kinase activation is a sequential two-step process in which the activation loop (A-loop, shown in red in Figure 4c) unfolds before the C-helix (shown in orange in Figure 4c), which then swings inward toward the core of the protein to form a critical Glu-Lys ion pair. Projecting the 20,000 frames onto these two degrees of freedom shows this two-step process in Figure 4b. CB-FS selected the key dihedrals that are involved in this sequential activation mechanism (Figure 4a). For example, residues 410–420 are part of the activation loop (A-loop, shown in red in Figure 4c) that needs to unfold for the system to activate (Figure 4b). The CB-FS method also highlighted the importance of H384, which forms part of the regulatory spine critical for catalysis, and E310, which switches from interacting with R409 in the inactive state to K295 in the active state (Figure 4b-d) [12,35].
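
The idea of ranking dihedrals by Gini importance can be illustrated with a small, self-contained sketch. This is not the authors' implementation: it assumes that every frame already carries a conformational state label (in CB-FS these would come from the clustering / Markov state model step), and it replaces a full tree-based classifier with the simplest possible stand-in, the largest Gini impurity decrease obtainable from a single threshold split on each feature. The feature names, the synthetic data, and the helper functions are all hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of state labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for state in labels:
        counts[state] = counts.get(state, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_gini_decrease(values, labels):
    """Largest impurity decrease from one threshold split on a single feature."""
    n = len(labels)
    parent = gini(labels)
    order = sorted(range(n), key=lambda i: values[i])
    best = 0.0
    for k in range(1, n):
        # A threshold can only be placed between two distinct feature values.
        if values[order[k - 1]] == values[order[k]]:
            continue
        left = [labels[i] for i in order[:k]]
        right = [labels[i] for i in order[k:]]
        child = (k / n) * gini(left) + ((n - k) / n) * gini(right)
        best = max(best, parent - child)
    return best

def rank_features(frames, labels, names):
    """Return (name, normalized score) pairs, most state-separating feature first."""
    scores = [best_gini_decrease([frame[j] for frame in frames], labels)
              for j in range(len(names))]
    total = sum(scores) or 1.0
    normalized = [s / total for s in scores]
    return sorted(zip(names, normalized), key=lambda pair: -pair[1])

# Synthetic stand-in for simulation data (assumption): one dihedral-like feature
# that differs between the two states and one that is pure noise.
random.seed(0)
names = ["chi_informative", "phi_noise"]
frames, labels = [], []
for i in range(200):
    state = i % 2  # 0 = "inactive", 1 = "active" (hypothetical state labels)
    informative = random.gauss(1.0 if state else -1.0, 0.3)
    noise = random.gauss(0.0, 1.0)
    frames.append((informative, noise))
    labels.append(state)

ranking = rank_features(frames, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting the feature that tracks the state label receives almost all of the normalized score, which is the same qualitative behavior that Figure 4a reports for K295, E310, H384, and the A-loop dihedrals. A practical analysis would use an ensemble of decision trees over many features rather than one split per feature.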

