Similarity search for local protein structures at atomic resolution by exploiting a database management system

View Article: PubMed Central - PubMed

ABSTRACT

A method to search for local structural similarities in proteins at atomic resolution is presented. It is demonstrated that a huge amount of structural data can be handled within a reasonable CPU time by using a conventional relational database management system with appropriate indexing of geometric data. This method, which we call geometric indexing, can enumerate ligand-binding sites that are structurally similar to sub-structures of a query protein among more than 160,000 possible candidates within a few hours of CPU time on an ordinary desktop computer. After a set of high-scoring ligand-binding sites is detected by the geometric indexing search, structural alignments at atomic resolution are constructed by iteratively applying the Hungarian algorithm, and the statistical significance of the final score is estimated from an empirical model based on a gamma distribution. Applications of this method to several protein structures clearly show that significant similarities can be detected between local structures of non-homologous as well as homologous proteins.
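The abstract does not spell out the database schema, so the following is only a minimal sketch of what indexing geometric data in a relational database could look like. It assumes, purely for illustration, that each binding site is reduced to atom pairs keyed by the two atom types and a binned inter-atomic distance; the table name `pair_index`, the bin width, and the vote-counting score are hypothetical and are not taken from the article. SQLite is used here only because it ships with Python.

```python
import itertools
import math
import sqlite3

BIN_WIDTH = 0.5  # Å per distance bin (hypothetical value, not from the article)


def pair_keys(atoms):
    """Yield (type_a, type_b, dist_bin) for every atom pair.

    atoms: list of (atom_type, x, y, z) tuples. The key depends only on
    atom types and distances, so it is unchanged by rotation/translation.
    """
    for a, b in itertools.combinations(atoms, 2):
        type_a, type_b = sorted((a[0], b[0]))
        dist_bin = int(math.dist(a[1:], b[1:]) / BIN_WIDTH)
        yield type_a, type_b, dist_bin


def build_index(conn, sites):
    """Store the pair keys of every template site and index them."""
    conn.execute(
        "CREATE TABLE pair_index "
        "(site_id TEXT, type_a TEXT, type_b TEXT, dist_bin INTEGER)"
    )
    for site_id, atoms in sites.items():
        conn.executemany(
            "INSERT INTO pair_index VALUES (?, ?, ?, ?)",
            [(site_id, *key) for key in pair_keys(atoms)],
        )
    # The composite index lets the DBMS answer each key lookup without a scan.
    conn.execute(
        "CREATE INDEX idx_geom ON pair_index (type_a, type_b, dist_bin)"
    )


def search(conn, query_atoms):
    """Rank template sites by how many distinct query keys they share."""
    votes = {}
    for key in set(pair_keys(query_atoms)):
        rows = conn.execute(
            "SELECT DISTINCT site_id FROM pair_index "
            "WHERE type_a = ? AND type_b = ? AND dist_bin = ?",
            key,
        )
        for (site_id,) in rows:
            votes[site_id] = votes.get(site_id, 0) + 1
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: siteA has the same geometry as the query, siteB does not.
sites = {
    "siteA": [("N", 0.0, 0.0, 0.0), ("O", 1.4, 0.0, 0.0), ("C", 0.0, 1.6, 0.0)],
    "siteB": [("N", 0.0, 0.0, 0.0), ("O", 3.0, 0.0, 0.0), ("C", 0.0, 4.0, 0.0)],
}
query = [("N", 5.0, 5.0, 5.0), ("O", 5.0, 6.4, 5.0), ("C", 5.0, 5.0, 6.6)]

conn = sqlite3.connect(":memory:")
build_index(conn, sites)
hits = search(conn, query)
print(hits)
```

In this toy setup only `siteA` shares keys with the query. Hard bin edges mean that two nearly equal distances can fall into different bins, so a practical scheme would probably need some tolerance around each bin; the sketch leaves that out.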


Optimal superpositions of the NAD-binding sites of the query alcohol dehydrogenase (PDB ID: 1het) [31] on templates. A: The template is the NAD-binding site of urocanase protein (PDB ID: 1x87; Tereshko et al., unpublished) from Bacillus stearothermophilus. B: The template is the FAD-binding site of p-hydroxybenzoate hydroxylase (PDB ID: 1iuv) [32] from Pseudomonas aeruginosa. The color scheme is the same as in Fig. 6. The ligand of 1het is also shown as a stick model in CPK colors.

Mentions: The fourth example is the alcohol dehydrogenase (ADH; PDB ID: 1het) [31] from Equus caballus (horse). The top 107 hits are the nicotinamide-adenine-dinucleotide (NAD)-binding sites of ADHs from various species, followed by other dehydrogenases such as formaldehyde dehydrogenase, sorbitol dehydrogenase, and glucose dehydrogenase. We looked for structural similarities with proteins other than dehydrogenases and found a few such examples. One example is the NAD-binding site of the urocanase protein (PDB ID: 1x87; Tereshko et al., unpublished) with an IR score of 24.0 (P = 2.7 × 10⁻⁶). According to the SCOP database, this protein belongs to the urocanase fold, which is clearly different from the NAD(P)-binding Rossmann-fold domain of the ADH. The alignment (Fig. 8A) consists of 76 atom pairs, yielding a cRMS of 1.0 Å. Another example is the flavin-adenine dinucleotide (FAD)-binding site of p-hydroxybenzoate hydroxylase (PHBH; PDB ID: 1iuv) [32], which exhibited a significant IR score of 20.2 (P = 2.3 × 10⁻⁵; Fig. 8B). PHBH belongs to the FAD/NAD(P)-binding domain fold, which is different from the NAD(P)-binding Rossmann fold of ADH.
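To make the reported alignment quality concrete, here is a small sketch of how a coordinate RMS deviation over aligned atom pairs can be computed. It assumes that cRMS denotes the root-mean-square distance between paired atoms after superposition, and that the two coordinate lists are already superposed and listed in matching order; the function name `crms` and the toy coordinates are hypothetical, and the optimal superposition step itself is not shown.

```python
import math


def crms(coords_a, coords_b):
    """Root-mean-square distance over paired atoms (same units as input).

    coords_a, coords_b: equal-length lists of (x, y, z) tuples, where the
    i-th atom of one list is aligned to the i-th atom of the other.
    Assumes the two sets have already been superposed.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: two aligned atom pairs, each displaced by exactly 1 Å.
query_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template_atoms = [(0.0, 0.0, 1.0), (1.0, 1.0, 0.0)]
value = crms(query_atoms, template_atoms)
print(value)
```

Under this reading, a cRMS of 1.0 Å over 76 atom pairs means the aligned atoms lie about 1 Å apart in the root-mean-square sense.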

