Deriving high-resolution protein backbone structure propensities from all crystal data using the information maximization device.

Solis AD - PLoS ONE (2014)

Bottom Line: This work demonstrates the effectiveness of the IMD and the superiority of the resulting PDFs by extensive fold recognition experiments and rigorous comparisons with previously published triplet PDFs. Because it automatically optimizes PDFs, IMD results in improved performance of knowledge-based potentials, which rely on such PDFs. Furthermore, it provides an easy computational recipe for empirically deriving other kinds of sequence-dependent structural PDFs with greater detail and precision.


Affiliation: Biological Sciences Department, New York City College of Technology, The City University of New York, Brooklyn, New York, United States of America.

ABSTRACT
The most informative probability distribution functions (PDFs) describing the Ramachandran phi-psi dihedral angle pair, a fundamental descriptor of backbone conformation of protein molecules, are derived from high-resolution X-ray crystal structures using an information-theoretic approach. The Information Maximization Device (IMD) is established, based on fundamental information-theoretic concepts, and then applied specifically to derive highly resolved phi-psi maps for all 20 single amino acid and all 8000 triplet sequences at an optimal resolution determined by the volume of current data. The paper shows that utilizing the latent information contained in all viable high-resolution crystal structures found in the Protein Data Bank (PDB), totaling more than 77,000 chains, permits the derivation of a large number of optimized sequence-dependent PDFs. This work demonstrates the effectiveness of the IMD and the superiority of the resulting PDFs by extensive fold recognition experiments and rigorous comparisons with previously published triplet PDFs. Because it automatically optimizes PDFs, IMD results in improved performance of knowledge-based potentials, which rely on such PDFs. Furthermore, it provides an easy computational recipe for empirically deriving other kinds of sequence-dependent structural PDFs with greater detail and precision. The high-resolution phi-psi maps derived in this work are available for download.
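The quantity the abstract's information-theoretic approach revolves around, the mutual information between a sequence label and a binned phi-psi conformation, can be illustrated with a small sketch. This is not the IMD itself: it is a plain plug-in estimate from counts, the function names `bin_index` and `mutual_information` and the 30-degree default bin width are assumptions made for illustration, and the paper's procedure for choosing the optimal resolution is not reproduced here.

```python
import math
from collections import Counter


def bin_index(phi, psi, bin_width):
    """Map a (phi, psi) pair in degrees to a grid cell.

    Angles are wrapped into [-180, 180) before binning.
    """
    n = int(360 // bin_width)
    i = int(((phi + 180.0) % 360.0) // bin_width) % n
    j = int(((psi + 180.0) % 360.0) // bin_width) % n
    return i, j


def mutual_information(observations, bin_width=30.0):
    """Plug-in estimate of I(C;S) in nats.

    observations: iterable of (sequence_label, phi, psi) tuples.
    The label could be a single residue or a triplet such as "AGL".
    """
    joint = Counter()
    for s, phi, psi in observations:
        joint[(s, bin_index(phi, psi, bin_width))] += 1

    total = sum(joint.values())
    count_s = Counter()
    count_c = Counter()
    for (s, c), n in joint.items():
        count_s[s] += n
        count_c[c] += n

    # I = sum over (s, c) of p(s, c) * ln[ p(s, c) / (p(s) p(c)) ]
    mi = 0.0
    for (s, c), n in joint.items():
        mi += (n / total) * math.log(n * total / (count_s[s] * count_c[c]))
    return mi
```

With finite data, a plug-in estimate like this tends to be biased upward as the bins get finer, so on its own it is not a sound criterion for picking a resolution.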

Figure 12 (pone-0094334-g012): Comparison of BLCLUST and BETAN performance in fold recognition with local decoy threading. Threading trials were performed on 5000 short segments randomly selected from the BLC-NEW data set, using two triplet-sequence KBPs derived from BLCLUST and BETAN PDFs. Threading results are expressed as percentile rank r, while the native scores In(c/s) were computed by Eq. (9). Each point in the plot represents one of the 5000 short segments; its coordinates are the difference in r of the native conformation in threading (x-axis) and the difference in In(c/s) as given by the two KBPs. A positive ΔIn(c/s) means that BLCLUST assigns a higher mutual information score than BETAN. A positive Δr means that native conformations are assigned lower (better) ranks with BLCLUST KBPs than with BETAN KBPs. The strong correlation between the assignment of higher scores and the ability to detect native conformations is evidence of the superiority of BLCLUST PDFs over BETAN PDFs (see Table 3 for the details of this threading test).

Mentions: The results for each 10-mer threading, summarized in Table 3 and plotted in Figure 12, confirm once again the solid correlation between an increase in mutual information and an improvement in performance, as exemplified by a decrease in native score rank. For 70.6% of the 10-mers, BLCLUST assigned a higher mutual information value than BETAN, resulting in a marked decrease in r for about 76.4% of the 10-mers. In aggregate, the mean increase in mutual information is 0.11 nats, while the mean decrease in native score percentile rank <r> is 3.23%. BLCLUST PDFs are significantly superior to BETAN PDFs in recognizing native folds.
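The summary statistics quoted above (fraction of segments with a higher score, fraction with a better rank, and the two mean differences) can be sketched as a paired comparison. The code below is only an illustration under stated assumptions: the names `percentile_rank` and `paired_summary` are hypothetical, a higher score is assumed to be better, and the percentile rank is assumed to be the percentage of decoys scoring at least as high as the native. The paper's Eq. (9) and its exact rank definition are not reproduced here. The sign conventions follow the figure caption: a positive score difference favors the first potential, and a positive rank difference means the first potential ranks the native better (lower).

```python
from statistics import mean


def percentile_rank(native_score, decoy_scores):
    """Percentage of decoys scoring at least as high as the native.

    Assumes higher score = better, so 0.0 is the best possible rank.
    """
    if not decoy_scores:
        raise ValueError("at least one decoy score is required")
    not_worse = sum(1 for d in decoy_scores if d >= native_score)
    return 100.0 * not_worse / len(decoy_scores)


def paired_summary(records):
    """Summarize a paired comparison of two potentials, A and B.

    records: list of (info_a, rank_a, info_b, rank_b), one per segment,
    where info_* is the native score and rank_* its percentile rank.
    """
    d_info = [ia - ib for ia, _, ib, _ in records]   # > 0: A scores higher
    d_rank = [rb - ra for _, ra, _, rb in records]   # > 0: A ranks native better
    n = len(records)
    return {
        "frac_higher_info": sum(d > 0 for d in d_info) / n,
        "frac_better_rank": sum(d > 0 for d in d_rank) / n,
        "mean_d_info": mean(d_info),
        "mean_d_rank": mean(d_rank),
    }
```

In the setting described above, A would correspond to the BLCLUST potential and B to the BETAN potential, with one record per threaded 10-mer.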

