Similarity search for local protein structures at atomic resolution by exploiting a database management system

ABSTRACT

We present a method to search for local structural similarities in proteins at atomic resolution. We demonstrate that a large amount of structural data can be handled within a reasonable CPU time by using a conventional relational database management system with appropriate indexing of geometric data. This method, which we call geometric indexing, can enumerate ligand binding sites that are structurally similar to sub-structures of a query protein among more than 160,000 possible candidates within a few hours of CPU time on an ordinary desktop computer. After a set of high-scoring ligand binding sites has been detected by the geometric indexing search, structural alignments at atomic resolution are constructed by iteratively applying the Hungarian algorithm, and the statistical significance of the final score is estimated from an empirical model based on a gamma distribution. Applications of this method to several protein structures show that significant similarities can be detected between local structures of non-homologous as well as homologous proteins.
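The idea of indexing geometric data in a relational database can be illustrated with a minimal sketch. The abstract does not give the actual schema or key definition, so everything below is an assumption made for illustration: SQLite stands in for the database system, each atom pair is reduced to a key of two atom types plus a binned distance, and the table names, the bin width `BIN_WIDTH`, and the toy sites `site_A` and `site_B` are hypothetical. Candidate sites are ranked simply by the number of keys they share with the query, which is not the scoring used in the paper.

```python
import math
import sqlite3
from itertools import combinations

BIN_WIDTH = 0.5  # hypothetical distance bin width in angstroms


def pair_keys(atoms):
    """Return the set of (type_a, type_b, distance_bin) keys over all atom pairs.

    Distances are invariant under rotation and translation, so the keys do not
    depend on how a site is placed in space. Distances that fall close to a bin
    boundary can land in different bins; this sketch ignores that case.
    """
    keys = set()
    for (type_a, *xyz_a), (type_b, *xyz_b) in combinations(atoms, 2):
        first, second = sorted((type_a, type_b))
        keys.add((first, second, int(math.dist(xyz_a, xyz_b) / BIN_WIDTH)))
    return keys


def build_index(conn, sites):
    """Store the keys of every site and index them for fast equality lookup."""
    conn.execute(
        "CREATE TABLE pair_key "
        "(site_id TEXT, type_a TEXT, type_b TEXT, dist_bin INTEGER)"
    )
    for site_id, atoms in sites.items():
        conn.executemany(
            "INSERT INTO pair_key VALUES (?, ?, ?, ?)",
            [(site_id, *key) for key in pair_keys(atoms)],
        )
    conn.execute(
        "CREATE INDEX pair_key_idx ON pair_key (type_a, type_b, dist_bin)"
    )


def search(conn, query_atoms):
    """Rank stored sites by the number of keys they share with the query."""
    conn.execute(
        "CREATE TEMP TABLE query_key (type_a TEXT, type_b TEXT, dist_bin INTEGER)"
    )
    conn.executemany(
        "INSERT INTO query_key VALUES (?, ?, ?)", list(pair_keys(query_atoms))
    )
    rows = conn.execute(
        """
        SELECT p.site_id, COUNT(*) AS shared
        FROM query_key AS q
        JOIN pair_key AS p
          ON p.type_a = q.type_a
         AND p.type_b = q.type_b
         AND p.dist_bin = q.dist_bin
        GROUP BY p.site_id
        ORDER BY shared DESC, p.site_id
        """
    ).fetchall()
    conn.execute("DROP TABLE query_key")
    return rows


# Toy data (made up for this sketch): atoms are (atom_type, x, y, z).
sites = {
    "site_A": [("O", 0.0, 0.0, 0.0), ("N", 3.2, 0.0, 0.0), ("O", 0.0, 4.1, 0.0)],
    "site_B": [("O", 0.0, 0.0, 0.0), ("N", 1.3, 0.0, 0.0), ("C", 0.0, 7.2, 0.0)],
}
# The query is site_A translated by (10, -5, 2).
query = [("O", 10.0, -5.0, 2.0), ("N", 13.2, -5.0, 2.0), ("O", 10.0, -0.9, 2.0)]

conn = sqlite3.connect(":memory:")
build_index(conn, sites)
hits = search(conn, query)
print(hits)
```

Because the query is a rigidly translated copy of `site_A`, all of its pairwise distances are preserved and `site_A` ranks first. In a realistic setting the shortlist returned by such an index lookup would only be a filter; the atom-level alignment and significance estimate described in the abstract would follow as separate steps.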


Fig. 6. Optimal superpositions of the query 1svn on templates. The query protein 1svn is shown as a wire-frame model in the CPK color scheme. The template atoms are colored green. Aligned atoms are shown as ball-and-stick models. The ligand of the template is shown as a ball-and-stick model in magenta. A: Peptide-binding site of subtilisin DY (PDB ID: 1bh6 [21]). B: Peptide-binding site of γ-chymotrypsin (PDB ID: 7gch [24]); the labeled Ser, His, and Asp residues are the aligned catalytic triad. The figures were created using PDBjViewer [25].

We next examine the result of a search with the subtilisin Savinase from Bacillus lentus (PDB ID: 1svn [20]) as a query. The top hit was the peptide binding site of subtilisin DY (PDB ID: 1bh6 [21]), with an IR score of 59.8 and a P-value of 1.0×10⁻¹⁴ (Fig. 6A). Subsequent hits were subtilisins and related proteases. After these subtilisin-related templates (and after removing physically implausible templates), we found a Mn²⁺ binding site of Dicer from Giardia intestinalis (PDB ID: 2ffl [22]; P=1.5×10⁻⁵) and a Mg²⁺ binding site of 30S ribosomal subunit S20 from Thermus thermophilus (PDB ID: 1i94 [23]; P=1.8×10⁻⁵). However, these ion binding sites reside within common loop structures, so the matches are unlikely to be biochemically or biologically significant. At the 255th rank, we found the active site of bovine γ-chymotrypsin (PDB ID: 7gch [24]), with an IR score of 20.9 (P-value 2.0×10⁻⁵). This protein has a different fold from the subtilisins but shares with them the catalytic triad of Ser, His, and Asp. The obtained atomic alignment indeed contains these catalytic residues: Asp32, His64, and Ser221 of subtilisin Savinase are aligned with Asp102, His57, and Ser195 of γ-chymotrypsin (Fig. 6B).

