Combining physicochemical and evolutionary information for protein contact prediction.

Schneider M, Brock O - PLoS ONE (2014)

Bottom Line: The resulting contact predictions are highly accurate. As a result of combining two sources of information--evolutionary and physicochemical--we maintain prediction accuracy even when only a few sequence homologs are present. We show that the predicted contacts help to improve ab initio structure prediction.


Affiliation: Robotics and Biology Laboratory, Department of Electrical Engineering and Computer Science, Technische Universität Berlin, Berlin, Germany.

ABSTRACT
We introduce a novel contact prediction method that achieves high prediction accuracy by combining evolutionary and physicochemical information about native contacts. We obtain evolutionary information from multiple-sequence alignments and physicochemical information from predicted ab initio protein structures. These structures represent low-energy states in an energy landscape and thus capture the physicochemical information encoded in the energy function. Such low-energy structures are likely to contain native contacts, even if their overall fold is not native. To differentiate native from non-native contacts in those structures, we develop a graph-based representation of the structural context of contacts. We then use this representation to train a support vector machine classifier to identify the most likely native contacts in otherwise non-native structures. The resulting contact predictions are highly accurate. As a result of combining two sources of information--evolutionary and physicochemical--we maintain prediction accuracy even when only a few sequence homologs are present. We show that the predicted contacts help to improve ab initio structure prediction. A web service is available at http://compbio.robotics.tu-berlin.de/epc-map/.
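
As a rough illustration of the idea that low-energy structures tend to share native contacts, the sketch below extracts residue-residue contacts from a set of decoy structures and counts how often each contact occurs. This is not the graph-based SVM classifier described in the abstract; it is a much simpler stand-in. The function names, the 8 Å distance cutoff, the minimum sequence separation, and the use of one representative coordinate per residue are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from math import dist


def contacts_in_structure(coords, cutoff=8.0, min_separation=6):
    """Return residue index pairs (i, j), i < j, closer than `cutoff` angstroms.

    `coords` holds one (x, y, z) point per residue (assumed representation).
    Pairs closer than `min_separation` along the sequence are skipped.
    """
    return {
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if j - i >= min_separation and dist(coords[i], coords[j]) < cutoff
    }


def contact_frequencies(decoys, cutoff=8.0, min_separation=6):
    """Fraction of decoys in which each contact occurs."""
    counts = Counter()
    for coords in decoys:
        counts.update(contacts_in_structure(coords, cutoff, min_separation))
    return {pair: n / len(decoys) for pair, n in counts.items()}
```

Contacts that appear in many decoys would be natural candidates for closer inspection; in the method described above, that decision is made by a trained classifier rather than by a raw frequency.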

pone-0108438-g009: Tertiary structure prediction improvement of the dissimilatory sulfite reductase D (PDB 1ucrA), of the E. coli SSB-DNA polymerase III (PDB 3sxuB) and of the GIT1 paxillin-binding domain (PDB 2jx0A). Contact maps show false positive predictions in the upper triangle (red), true positive predictions in the lower triangle (blue) and native contacts in grey. For the predictions shown, native structures are in grey and predicted structures are colored from N-terminus (blue) to C-terminus (red). The predictions correspond to the lowest-energy structure generated without use of contacts (middle column) and with EPC-map predicted contacts (right column).

Mentions: Figure 9 shows three example proteins for which the combination of EPC-map and Rosetta yielded significant improvements in prediction accuracy. For these examples, only a few homologous sequences are available (fewer than ). For each protein, we show the contact map obtained by EPC-map with true and false positives, the prediction of Rosetta without the inclusion of contact information, and the prediction based on contact information.
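
The true and false positives shown in such a contact map can be computed by comparing predicted contacts against the native contact set. The following is a minimal sketch of that comparison; the function names and the representation of a contact as a pair of residue indices are assumptions for illustration, not part of EPC-map.

```python
def split_predictions(predicted, native):
    """Split predicted contacts into true positives and false positives.

    Contacts are residue index pairs; (i, j) and (j, i) are treated as equal.
    """
    native_set = {tuple(sorted(pair)) for pair in native}
    true_pos, false_pos = [], []
    for pair in predicted:
        key = tuple(sorted(pair))
        if key in native_set:
            true_pos.append(key)
        else:
            false_pos.append(key)
    return true_pos, false_pos


def precision(predicted, native):
    """Fraction of predicted contacts that are native (0.0 if none predicted)."""
    true_pos, false_pos = split_predictions(predicted, native)
    total = len(true_pos) + len(false_pos)
    return len(true_pos) / total if total else 0.0
```

For plotting in the style of the figure, the false positives would go in the upper triangle and the true positives in the lower triangle of the map.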

