Integrating protein structural dynamics and evolutionary analysis with Bio3D.

Skjærven L, Yao XQ, Scarabelli G, Grant BJ - BMC Bioinformatics (2014)

Bottom Line: These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case.


Affiliation: Department of Biomedicine, University of Bergen, Bergen, Norway. lars.skjarven@biomed.uib.no.

ABSTRACT

Background: Popular bioinformatics approaches for studying protein functional dynamics include comparisons of crystallographic structures, molecular dynamics simulations and normal mode analysis. However, determining how observed displacements and predicted motions from these traditionally separate analyses relate to each other, as well as to the evolution of sequence, structure and function within large protein families, remains a considerable challenge. This is in part due to the general lack of tools that integrate information of molecular structure, dynamics and evolution.

Results: Here, we describe the integration of new methodologies for evolutionary sequence, structure and simulation analysis into the Bio3D package. This major update includes unique high-throughput normal mode analysis for examining and contrasting the dynamics of related proteins with non-identical sequences and structures, as well as new methods for quantifying dynamical couplings and their residue-wise dissection from correlation network analysis. These new methodologies are integrated with major biomolecular databases as well as established methods for evolutionary sequence and comparative structural analysis. New functionality for directly comparing results derived from normal modes, molecular dynamics and principal component analysis of heterogeneous experimental structure distributions is also included. We demonstrate these integrated capabilities with example applications to dihydrofolate reductase and heterotrimeric G-protein families along with a discussion of the mechanistic insight provided in each case.

Conclusions: The integration of structural dynamics and evolutionary analysis in Bio3D enables researchers to go beyond a prediction of single protein dynamics to investigate dynamical features across large protein families. The Bio3D package is distributed with full source code and extensive documentation as a platform-independent R package under a GPL2 license from http://thegrantlab.org/bio3d/.


Figure 3: Cross-species normal modes analysis of DHFR. (A) Sequence conservation of the collected DHFR species. (B) Aligned fluctuation profiles for selected species of DHFR. Shaded blue regions mark the areas discussed in the text, where fluctuation patterns differ between specific species. The region shaded in light red marks the Met20 loop in E. coli DHFR and the corresponding loop in the remaining species. The locations of major secondary structure elements in E. coli DHFR are also shown in the plot margins, with β strands in gray and α helices in black. (C) A visual comparison of mode fluctuations between DHFR from E. coli and H. sapiens. Fluctuation magnitude is represented by thin-to-thick tubes colored from blue (low fluctuations) through white (moderate fluctuations) to red (large fluctuations). See Additional file 3 for full details and corresponding code for this analysis.
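Panel B compares fluctuation profiles of proteins with different lengths, which only works once each profile is laid out on shared alignment columns. The following is a minimal Python sketch of that bookkeeping step, not the Bio3D implementation: the function name `align_profile`, the gap character and the toy values are all assumptions made for illustration.

```python
from typing import List, Optional

GAP = "-"  # assumed gap character in the aligned sequences


def align_profile(aligned_seq: str,
                  fluctuations: List[float]) -> List[Optional[float]]:
    """Place per-residue values onto alignment columns.

    Columns where the sequence has a gap receive None, so that profiles
    from different species share the same column index.
    """
    n_residues = sum(1 for ch in aligned_seq if ch != GAP)
    if n_residues != len(fluctuations):
        raise ValueError("one fluctuation value is required per non-gap residue")
    values = iter(fluctuations)
    return [None if ch == GAP else next(values) for ch in aligned_seq]


# Toy data (hypothetical, not taken from the DHFR analysis)
profile_a = align_profile("MK-LV", [0.2, 0.5, 0.9, 0.3])
profile_b = align_profile("MKALV", [0.1, 0.4, 0.8, 0.7, 0.2])

for column, (a, b) in enumerate(zip(profile_a, profile_b), start=1):
    print(column, a, b)
```

Leaving gap columns as None rather than zero keeps an insertion in one species from being read as a rigid region in another when the profiles are overlaid.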

Mentions: To detect more distantly related DHFR homologues, we built a hidden Markov model (HMM) from the PFAM multiple sequence alignment using the Bio3D interface to PFAM and HMMER (see the Package overview and architecture section). The resulting HMM was used in a new search of the PDB that identified a total of 33 species from bacteria, archaea, and eukaryotes, with pairwise sequence identity as low as 21%. NMA was carried out on 197 of these structures. The resulting fluctuation profiles are plotted for each species along with the sequence conservation in Figure 3A-B. The plot reveals an overall similar trend of residue fluctuations between the species despite their low sequence identity. While the functionally important Met20 loop displays a conserved flexibility trend for most of the species, the E. coli structures have enhanced fluctuations in this region (region I, Figure 3). This has previously been attributed to distinct functional mechanisms for ligand flux: while E. coli DHFR relies on loop flexibility for the opening of the active site, H. sapiens DHFR accomplishes this by subtle subdomain rotational hinge motions [41]. Other important differences include enhanced loop fluctuations in H. sapiens DHFR, which are not evident in the bacterial species (residues 43-50 and 126-131 for human DHFR; Figure 3). These fluctuations have been suggested to be important for facilitating the hinge motions in H. sapiens DHFR [41]. Interestingly, the flexibility pattern of the human DHFR 43-50 loop is shared with two fungal variants: C. albicans and C. glabrata (region II, Figure 3). A similar trend is apparent for residues 62-64 in human DHFR. This flexible loop is also shared with the bacterial species M. tuberculosis (region III), but is missing in the four other bacterial species. Finally, the two fungal species display an additional flexible surface loop (residues 139-150 in C. albicans DHFR; region IV), while C. glabrata contains residues 164-178 that are specific to this species (region V). This example demonstrates how Bio3D version 2.0 can facilitate the investigation of common and divergent protein structural dynamics in large protein superfamilies.
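
The "pairwise sequence identity down to 21%" figure is the minimum over all pairs in the alignment. A minimal Python sketch of how such a minimum could be computed from aligned sequences follows. It is not the Bio3D routine, and it assumes one particular convention that the text above does not specify: identity is counted only over columns where both sequences have a residue. The function names and the three toy sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple

GAP = "-"  # assumed gap character


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def min_pairwise_identity(alignment: Dict[str, str]) -> Tuple[Tuple[str, str], float]:
    """Return the least similar pair of sequences and its percent identity."""
    scores = {
        (p, q): percent_identity(alignment[p], alignment[q])
        for p, q in combinations(sorted(alignment), 2)
    }
    pair = min(scores, key=scores.get)
    return pair, scores[pair]


# Toy alignment (hypothetical sequences, not real DHFR data)
toy = {
    "sp1": "MKTLV-A",
    "sp2": "MKSLVGA",
    "sp3": "MR-IVGS",
}
pair, value = min_pairwise_identity(toy)
print(pair, round(value, 1))  # ('sp1', 'sp3') 40.0
```

Other normalisations, such as dividing by the full alignment length or by the shorter sequence, give different percentages, so a reported threshold like 21% is only comparable across studies that use the same convention.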


