Reconstruction of protein backbones from the BriX collection of canonical protein fragments.

Baeten L, Reumers J, Tur V, Stricher F, Lenaerts T, Serrano L, Rousseau F, Schymkowitz J - PLoS Comput. Biol. (2008)

Bottom Line: As expected, regular structures are well covered, but, interestingly, many loop regions that appear irregular at first glance are also found to form a recurrent structural motif, albeit with lower frequency of occurrence than regular secondary structures. Finally, we observed that a significant amount of short sequences tend to display strong structural ambiguity between alpha helix and extended conformations. When the sequence length increases, this so-called sequence plasticity is no longer observed, illustrating the context dependency of polypeptide structures.


Affiliation: SWITCH Laboratory, Vrije Universiteit Brussel, Brussels, Belgium.

ABSTRACT
As modeling of changes in backbone conformation still lacks a computationally efficient solution, we developed a discretisation of the conformational states accessible to the protein backbone similar to the successful rotamer approach in side chains. The BriX fragment database, consisting of fragments from 4 to 14 residues long, was realized through identification of recurrent backbone fragments from a non-redundant set of high-resolution protein structures. BriX contains an alphabet of more than 1,000 frequently observed conformations per peptide length for 6 different variation levels. Analysis of the performance of BriX revealed an average structural coverage of protein structures of more than 99% within a root mean square distance (RMSD) of 1 Angstrom. Globally, we are able to reconstruct protein structures with an average accuracy of 0.48 Angstrom RMSD. As expected, regular structures are well covered, but, interestingly, many loop regions that appear irregular at first glance are also found to form a recurrent structural motif, albeit with lower frequency of occurrence than regular secondary structures. Larger loop regions could be completely reconstructed from smaller recurrent elements, between 4 and 8 residues long. Finally, we observed that a significant amount of short sequences tend to display strong structural ambiguity between alpha helix and extended conformations. When the sequence length increases, this so-called sequence plasticity is no longer observed, illustrating the context dependency of polypeptide structures.
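The RMSD values quoted in the abstract can be made concrete with a small sketch. It assumes the distance is taken over matched backbone atoms after optimal rigid superposition; the Kabsch algorithm is used here as a standard choice, since the abstract does not say which superposition procedure BriX uses. The function name `kabsch_rmsd` and the NumPy dependency are illustrative assumptions, not part of BriX.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Hypothetical helper: P and Q hold matched atom coordinates in Angstrom,
    for example the backbone atoms of two fragments of equal length.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove translation by centring both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Flip one axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the remaining deviation.
    diff = (R @ Pc.T).T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Under these assumptions, two fragments that differ only by a rotation and a translation give an RMSD of zero, so the measure reflects differences in shape rather than in position or orientation.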

Figure 5. Reconstruction of human protein backbones using BriX classes. (A, B) Local fit approximation for the reconstruction of the set of human protein structures: some examples. The backbones (in red) of α G25K GTP-binding protein (A) and β human C-reactive protein (B) fully covered with BriX classes (green). The covering algorithm selected 35 and 40 redundancy filtered fragment classes to describe the respective structures. (C, D) Global fit approximation for the reconstruction of the set of human protein structures: some examples. A backbone trace of α G25K GTP-binding protein (C) and β human C-reactive protein (D). The target proteins are shown in red and the approximations are shown in green. The overall RMSD is 0.4542 Angstrom and 0.5614 Angstrom, respectively.

Mentions: In order to assess the accuracy with which a fragment library describes known protein structures, two different measures have been proposed in previous works [6],[34]: the local fit and the global fit approximation. The first measure determines how well each fragment of a protein structure can be locally approximated by the best corresponding fragment class. Note that this test does not require the fragments to be assembled into a unique backbone trace. In addition, we calculated the total percentage of the structure that could be covered by the fragment classes (see Materials and Methods section). For reasons of generality, the validation test considered a representative set of human proteins, extracted from the PDB database (see Materials and Methods section). This relatively small set contained 935 structures, evenly balanced over the existing folds (as illustrated in Table 1). In order to fully account for secondary structure differences, separate tests were carried out for α (A) proteins, β (B) proteins, and α and β (A/B and A+B) proteins, according to the SCOP classification. With an average RMSD of 0.16 Angstrom for the local fit approximation, BriX improves on the 0.23 Angstrom RMSD previously obtained by Camproux et al. using 27 structural classes [34]. Kolodny and Levitt achieved an average RMSD of 0.26 and 0.39 Angstrom for 4 and 14 classes, respectively, considering fragments of four-residue length [16]. The 16-state alphabet of De Brevern et al. [43], which describes fragments of 5 residues, approximated the local structure with an accuracy of 0.51 Angstrom. Furthermore, Table 1 shows that BriX achieves a coverage of 99 to 100%. Figure 5 illustrates an all-α (5A) and an all-β (5B) class protein, both from the human protein validation set, entirely covered by BriX classes. Remarkably, even all-β proteins and irregular structures such as loops appeared to be fully covered by BriX classes. This implies that, in spite of their hypervariable character, loops are made up of regular building blocks.
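The local fit and coverage measures described in this paragraph can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that every overlapping window of the target has already been compared against every fragment class, giving a table `rmsd_matrix[i][k]`, and that a residue counts as covered when it lies in at least one window whose best class is within the threshold. The 1 Angstrom default follows the abstract. The function name, its arguments and the coverage rule itself are assumptions, because the exact definition is given in the Materials and Methods section, which is not reproduced here.

```python
def local_fit_and_coverage(rmsd_matrix, n_residues, frag_len, threshold=1.0):
    """Return (mean best RMSD, percentage of residues covered).

    rmsd_matrix[i][k] is the RMSD in Angstrom between the target window that
    starts at residue i and fragment class k (a hypothetical input layout).
    """
    n_windows = n_residues - frag_len + 1
    if n_windows < 1 or len(rmsd_matrix) != n_windows:
        raise ValueError("expected one row per overlapping window")

    # Local fit: each window is approximated by its best-matching class,
    # and no assembly into a single backbone trace is needed.
    best = [min(row) for row in rmsd_matrix]
    local_fit = sum(best) / len(best)

    # Coverage (assumed rule): mark every residue of each window whose
    # best class lies within the RMSD threshold.
    covered = [False] * n_residues
    for start, rmsd in enumerate(best):
        if rmsd <= threshold:
            for j in range(start, start + frag_len):
                covered[j] = True

    coverage = 100.0 * sum(covered) / n_residues
    return local_fit, coverage
```

For a toy target of 6 residues and fragments of length 4, the middle window may have no class within the threshold while the two outer windows still cover every residue. This shows how a structure can reach full coverage even when some individual windows fit poorly.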

