Characteristics of protein residue-residue contacts and their application in contact prediction.

Wozniak PP, Kotulska M - J Mol Model (2014)

Bottom Line: Contact characteristics of specific topologies were compared to the characteristics of their protein classes, showing protein groups with distinctive contact characteristics. We showed that our results could be used to improve the performance of a recent top contact predictor, direct coupling analysis. Our work provides values of contact site propensities that can be incorporated into bioinformatic databases.

Affiliation: Institute of Biomedical Engineering and Instrumentation, Wroclaw University of Technology, Wybrzeże Wyspiańskiego 27, 50-370, Wroclaw, Poland, pawel.p.wozniak@pwr.edu.pl.

ABSTRACT
Contact sites between amino acids characterize important structural features of a protein. We investigated the characteristics of contact sites in a representative set of proteins and their relation to protein class and topology. For this purpose, we used a non-redundant set of 5872 protein domains, identically categorized by the CATH and SCOP databases. The proteins represented the alpha, beta, and alpha+beta classes. Contact maps of protein structures were obtained for a selected set of physical distances in the main backbone and separations in protein sequences. For each set, the dependency between contact degree and the distance parameters was quantified. We identified the residues that form contact sites most frequently and the unique amino acid pairs that created contact sites most often within each structural class. Contact characteristics of specific topologies were compared to the characteristics of their protein classes, showing protein groups with distinctive contact characteristics. We showed that our results could be used to improve the performance of a recent top contact predictor, direct coupling analysis. Our work provides values of contact site propensities that can be incorporated into bioinformatic databases.
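As a rough illustration of how a contact map depends on a physical distance threshold and a sequence separation, the following minimal sketch may help. It is not the authors' implementation: the function names, the use of one backbone coordinate per residue, the example 8.0 Å threshold, the example separation of 6, and the reading of "contact degree" as the number of contacts per residue are all assumptions made for this example.

```python
import numpy as np

def contact_map(coords, max_dist=8.0, min_sep=6):
    """Boolean contact map from one backbone coordinate per residue.

    coords   : (N, 3) array-like, e.g. C-alpha positions in angstroms (assumed).
    max_dist : distance threshold in angstroms (example value, assumed).
    min_sep  : minimal separation |i - j| in the sequence (example value, assumed).
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all residues.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Pairwise separations in the sequence.
    idx = np.arange(len(coords))
    sep = np.abs(idx[:, None] - idx[None, :])
    # A pair is a contact if it is close in space and far enough in sequence.
    return (dist <= max_dist) & (sep >= min_sep)

def contact_degree(cmap):
    """Number of contacts per residue (assumed meaning of 'contact degree')."""
    return cmap.sum(axis=1)
```

Recomputing the map over a grid of `max_dist` and `min_sep` values gives one way to examine how contact degree changes with both parameters.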


Fig. 11: Mean TP rate for different numbers of top-ranked contacts in proteins from the dataset used to calculate the fp value in our study. Results for the original DCA algorithm (black circle) and with application of fp (gray square) are presented for: (a) domains from class alpha, (b) domains from class beta, (c) domains from class alpha+beta, (d) all domains.

Mentions: We compared the results from Fig. 10, obtained for the Morcos et al. [3] dataset, with the results obtained for the dataset used to calculate the fp value in our study. These are presented in Fig. 11. In this case, the improvement for the top 50 contacts is negligible, and the contact site prediction accuracy stays at a similar level after application of the fp value. Even the previously observed decrease in TP rate for more than 100 top-ranked pairs is much smaller. However, the best results were still obtained for domains from the alpha class. The results presented in Fig. 11 suggest that our algorithm performs better for the more specific dataset. The data used by Morcos et al. [3] came from mainly bacterial domain families with large non-redundant multiple sequence alignments. The domains examined in our study do not belong to any specific protein family but can be clearly assigned to one structural group. This shows that the presented algorithm is dataset source-dependent.
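The quantity plotted in Fig. 11, the mean TP rate over the top-ranked predicted contacts, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names and the data layout (a list of `((i, j), score)` predictions and a set of true contacts per protein) are hypothetical, and the sketch does not cover how the fp value modifies the DCA scores.

```python
def tp_rate_at_top_n(scored_pairs, true_contacts, n):
    """Fraction of the n highest-scoring predicted pairs that are true contacts.

    scored_pairs  : list of ((i, j), score) tuples (hypothetical layout).
    true_contacts : set of frozenset({i, j}) taken from the native structure.
    """
    # Rank predictions by score, highest first, and keep the top n.
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)[:n]
    if not ranked:
        return 0.0
    hits = sum(1 for pair, _score in ranked if frozenset(pair) in true_contacts)
    return hits / len(ranked)

def mean_tp_rate(proteins, n):
    """Average the top-n TP rate over (scored_pairs, true_contacts) tuples."""
    if not proteins:
        return 0.0
    rates = [tp_rate_at_top_n(scored, true, n) for scored, true in proteins]
    return sum(rates) / len(rates)
```

Evaluating `mean_tp_rate` for a range of `n`, once with the original scores and once with the adjusted scores, would give two curves of the kind compared in the figure.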

