Similarity search for local protein structures at atomic resolution by exploiting a database management system

View Article: PubMed Central - PubMed

ABSTRACT

A method to search for local structural similarities in proteins at atomic resolution is presented. It is demonstrated that a large volume of structural data can be handled within a reasonable CPU time by using a conventional relational database management system with appropriate indexing of geometric data. This method, which we call geometric indexing, can enumerate ligand binding sites that are structurally similar to sub-structures of a query protein among more than 160,000 possible candidates within a few hours of CPU time on an ordinary desktop computer. After a set of high-scoring ligand binding sites is detected by the geometric indexing search, structural alignments at atomic resolution are constructed by iteratively applying the Hungarian algorithm, and the statistical significance of the final score is estimated from an empirical model based on a gamma distribution. Applications of this method to several protein structures clearly show that significant similarities can be detected between local structures of non-homologous as well as homologous proteins.
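The significance step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the P-value of a score is the upper tail of a fitted gamma distribution; the abstract does not give the fitting procedure or the parameter values, so `shape`, `scale`, and both function names are hypothetical. The cumulative distribution is evaluated with the standard power series for the regularized lower incomplete gamma function, using only the Python standard library.

```python
import math


def gamma_cdf(x, shape, scale):
    """Gamma CDF via the power series of the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape          # first term of the series
    total = term
    denom = shape
    for _ in range(10000):      # the series converges for every z > 0
        denom += 1.0
        term *= z / denom
        total += term
        if term < total * 1e-15:
            break
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def gamma_p_value(score, shape, scale):
    """Upper-tail probability of `score` (assumed definition of significance)."""
    return max(0.0, 1.0 - gamma_cdf(score, shape, scale))


# Hypothetical parameters, for illustration only.
print(gamma_p_value(3.0, shape=2.0, scale=1.0))
```

Computing the tail as one minus the CDF loses precision for scores far out in the tail, so a production implementation would more likely evaluate the upper incomplete gamma function directly.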



Figure 5. Scatter plot of the IR scores and coordinate RMS deviations resulting from a search with the PDB entry 101m. The regions enclosed by the circles marked with M and G contain mostly myoglobins and other globins, respectively.

Mentions: In general, good alignments should have high IR scores and low coordinate root mean square (cRMS) deviations, so they should lie in the bottom-right corner of the scatter plot in Figure 5. This trend is clearly observed. The plot shows two high-scoring clusters, at IR scores of about 60–70 and 25–35, which correspond to closely related myoglobins and other globins, respectively. In the region of low IR scores, there are many templates with low cRMS values. A low IR score implies a small number of aligned atoms, and a small set of atoms is easy to superimpose closely, hence the low cRMS values.
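The two quantities plotted in Figure 5 can be combined in a simple filter, sketched here under stated assumptions. The cRMS function assumes that the two atom lists are already paired by the alignment and superimposed; the IR score is taken as a precomputed number because its definition is not given in this excerpt. The function names, the record layout, and both thresholds are hypothetical.

```python
import math


def crms(coords_a, coords_b):
    """Coordinate RMS deviation between paired, already superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def good_alignments(hits, min_ir, max_crms):
    """Keep hits with a high IR score AND a low cRMS (bottom-right of the plot)."""
    return [h for h in hits if h["ir"] >= min_ir and h["crms"] <= max_crms]


# Hypothetical hits: the last one has a low cRMS only because few atoms align.
hits = [
    {"name": "hit_a", "ir": 65.0, "crms": 0.4},
    {"name": "hit_b", "ir": 30.0, "crms": 0.9},
    {"name": "hit_c", "ir": 5.0, "crms": 0.3},
]
print([h["name"] for h in good_alignments(hits, min_ir=25.0, max_crms=1.0)])
```

Requiring both conditions reflects the point made above: a low cRMS alone does not indicate a good alignment when the IR score is low.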

