Integrating protein structural dynamics and evolutionary analysis with Bio3D.

Skjærven L, Yao XQ, Scarabelli G, Grant BJ - BMC Bioinformatics (2014)

Bottom Line: These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case.


Affiliation: Department of Biomedicine, University of Bergen, Bergen, Norway. lars.skjarven@biomed.uib.no.

ABSTRACT

Background: Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information on molecular structure, dynamics and evolution.

Results: Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case.

Conclusions: The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond predicting the dynamics of a single protein and to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform-independent R package under a GPL2 license from http://thegrantlab.org/bio3d/ .


Figure 2: Results of ensemble PCA and NMA on E. coli DHFR. (A) Available PDB structures projected onto their first two principal components, which together account for 59% of the total variance. (B) Comparison of mode fluctuations calculated for open (black) and closed (red) conformations. The figure is generated by automated functions for plotting and for identifying areas of significant differences in residue fluctuations between groups of conformers (light blue boxes). The locations of major secondary structure elements are shown in the plot margins with β strands in gray and α helices in black. (C) Conformational ensemble obtained from interpolating along the first five modes of all collected E. coli structures. Domain analysis on the generated ensemble identifies two dynamic sub-domains colored red and blue, respectively. See Additional file 2 for full details and corresponding code for this analysis.

Mentions: Following the workflow described in Figure 1 (see the Package overview and architecture section), we collected all 90 E. coli DHFR structures from the PDB, performed a PCA to investigate the major conformational variation, and calculated the normal modes of each structure to probe for potential differences in structural flexibility. The PCA reveals that the ensemble can be divided into three major groups along the first two principal components (which collectively account for 59% of the total coordinate mean square displacements, Figure 2A). These conformers display either a closed, an occluded, or an open conformation of two active site loops (termed the Met20 loop: residues 9-24, and the F-G loop: residues 116-132). NMA reveals that structures adopting an open conformation show enhanced flexibility for the Met20 loop as compared to both the closed and occluded conformations (Figure 2B). Conversely, the F-G loop shows lower fluctuation values for the open conformation as compared to the occluded state (Additional file 2). These differences in mode fluctuations highlight the importance of considering multiple conformers in NMA, which is greatly facilitated by the Bio3D package. Additionally, domain analysis with the function geostas() reveals two dynamic sub-domains corresponding to the adenosine-binding sub-domain and the loop sub-domain, respectively (Figure 2C). These domains are divided by a hinge region corresponding to residues Thr35 and Gln108, in agreement with previous studies [41]. This example demonstrates how integrating PCA, NMA and dynamic domain analysis on E. coli DHFR structures can provide mechanistic insight into protein dynamics of functional relevance.
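To make the "59% of the total variance" statement concrete, the following is a minimal Python sketch of ensemble PCA on a made-up six-conformer toy data set. It is not the DHFR ensemble and not Bio3D's R implementation. The coordinates, the closed/occluded/open labels and all function names (center, covariance, top_eigenpair, pca) are assumptions for illustration only, and power iteration with deflation is swapped in for a full eigen-decomposition so the example needs only the standard library.

```python
import math

# Hypothetical toy ensemble: 6 conformers, each reduced to 3 coordinates.
# Labels are illustrative only (two conformers per state).
labels = ["closed", "closed", "occluded", "occluded", "open", "open"]
toy_ensemble = [
    [-4.0, -2.0, 0.5],
    [-2.0, -4.0, 0.5],
    [2.0, -2.0, -1.0],
    [-2.0, 2.0, -1.0],
    [2.0, 4.0, 0.5],
    [4.0, 2.0, 0.5],
]

def center(rows):
    """Subtract the per-column mean so every coordinate has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(mat, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in mat]

def top_eigenpair(mat, iters=500):
    """Largest eigenvalue/eigenvector of a symmetric matrix by power iteration."""
    v = [float(i + 1) for i in range(len(mat))]  # arbitrary non-zero start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = matvec(mat, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, matvec(mat, v)))  # Rayleigh quotient
    return eigenvalue, v

def pca(rows, n_components=2):
    """Return (variance fractions, per-conformer scores) for the leading PCs."""
    centered = center(rows)
    cov = covariance(centered)
    total_variance = sum(cov[i][i] for i in range(len(cov)))  # trace of covariance
    components, fractions = [], []
    for _ in range(n_components):
        eigenvalue, v = top_eigenpair(cov)
        components.append(v)
        fractions.append(eigenvalue / total_variance)
        # Deflate: remove the component just found before searching for the next.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(len(v))]
               for i in range(len(v))]
    scores = [[sum(a * b for a, b in zip(row, v)) for v in components]
              for row in centered]
    return fractions, scores

fractions, scores = pca(toy_ensemble)
print("variance captured by PC1 + PC2:", round(sum(fractions), 3))
for label, (pc1, pc2) in zip(labels, scores):
    print(label, round(pc1, 2), round(pc2, 2))
```

The sum of the two fractions plays the role of the 59% quoted for the DHFR ensemble, and the PC1 scores are what a plot such as Figure 2A would show on its horizontal axis. In this toy set the three labelled states fall into separate clusters along PC1 by construction.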

