TIM-Finder: a new method for identifying TIM-barrel proteins.

Si JN, Yan RX, Wang C, Zhang Z, Su XD - BMC Struct. Biol. (2009)

Bottom Line: The triosephosphate isomerase (TIM)-barrel fold occurs frequently in the proteomes of different organisms, and the known TIM-barrel proteins have been found to play diverse functional roles. With the assistance of Support Vector Machine (SVM), the three descriptors were combined to obtain a new method with improved performance, which we call TIM-Finder. TIM-Finder can serve as a competitive tool for proteome-wide TIM-barrel protein identification.


Affiliation: State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China. sijingna@gmail.com

ABSTRACT

Background: The triosephosphate isomerase (TIM)-barrel fold occurs frequently in the proteomes of different organisms, and the known TIM-barrel proteins have been found to play diverse functional roles. To accelerate the exploration of the sequence-structure protein landscape in the TIM-barrel fold, a computational tool that allows sensitive detection of TIM-barrel proteins is required.

Results: To develop a new TIM-barrel protein identification method in this work, we consider three descriptors: a sequence-alignment-based descriptor using PSI-BLAST e-values and bit scores, a descriptor based on secondary structure element alignment (SSEA), and a descriptor based on the occurrence of PROSITE functional motifs. With the assistance of Support Vector Machine (SVM), the three descriptors were combined to obtain a new method with improved performance, which we call TIM-Finder. When tested on the whole proteome of Bacillus subtilis, TIM-Finder is able to detect 194 TIM-barrel proteins at a 99% confidence level, outperforming the PSI-BLAST search as well as one existing fold recognition method.

Conclusions: TIM-Finder can serve as a competitive tool for proteome-wide TIM-barrel protein identification. The TIM-Finder web server is freely accessible at http://202.112.170.199/TIM-Finder/.


Figure 3: The CE structural alignment of two TIM-barrel proteins 1vpqA and 1i60A. Although the PSI-BLAST-based descriptor was not able to detect the remote homologous relationship between 1vpqA (green) and 1i60A (yellow), the SSEA-based descriptor can successfully recognize their structural similarity based on a SSEA score of 0.814.

Mentions: Predicted secondary structure has long been known to help in protein fold classification and recognition [28], and the SSEA-based descriptor has been reported to be an effective way to exploit predicted secondary structure [14,29,30]. As shown in Figure 2, the SSEA-based descriptor performs best, achieving an AUC value of 0.953. At an FPR below 5%, the SSEA-based descriptor successfully recognizes 78.5% of the TIM-barrel proteins. As reported in our previous study [25], the PSI-BLAST-based descriptor is much better than SSEA at generic fold recognition. Interestingly, SSEA is more powerful than the PSI-BLAST-based descriptor in recognizing TIM-barrel proteins. Generally, the TIM-barrel fold has a well-conserved 3D structure consisting of eight β-strands and eight α-helices. From N-terminus to C-terminus, the secondary structure of a typical TIM-barrel fold is strictly arranged as β1-α1-β2-α2-β3-α3-β4-α4-β5-α5-β6-α6-β7-α7-β8-α8 (Figure 1A), which may explain why the SSEA descriptor is so powerful in recognizing TIM-barrel proteins. The performance of the SSEA-based descriptor is further illustrated by two TIM-barrel proteins that are distant from one another in sequence space: 1vpqA (SCOP index: c.1.32.1) and 1i60A (SCOP index: c.1.15.4). Because the two proteins share only weak sequence similarity, the PSI-BLAST-based descriptor fails to recognize their remote homologous relationship. With a SSEA score of 0.814, however, the SSEA-based descriptor captures the structural similarity of the two proteins. The success of SSEA can be ascribed to the overall conservation of secondary structure topology between these two proteins, which can be observed in their structural alignment derived from the CE algorithm [31] (Figure 3).
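To make the idea of a secondary structure element alignment score concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a commonly described SSEA-style scheme that the text above does not spell out: per-residue states (H, E, C) are collapsed into elements, two elements of the same type score the length of the shorter one, coil against helix or strand scores half of that, helix against strand scores zero, gaps are free, and the alignment total is divided by the mean length of the two sequences. The function names (`to_elements`, `pair_score`, `ssea_score`, `idealized_barrel`) and the example strings are hypothetical; they are idealized (βα)8 patterns, not the predicted secondary structures of 1vpqA or 1i60A.

```python
from itertools import groupby


def to_elements(ss):
    """Collapse a per-residue string, e.g. 'CCEEEHHHH', into
    (state, run length) elements: [('C', 2), ('E', 3), ('H', 4)]."""
    return [(state, len(list(run))) for state, run in groupby(ss)]


def pair_score(a, b):
    """Score for aligning two elements (assumed scheme, see lead-in)."""
    (type_a, len_a), (type_b, len_b) = a, b
    shorter = min(len_a, len_b)
    if type_a == type_b:
        return float(shorter)      # same element type
    if "C" in (type_a, type_b):
        return 0.5 * shorter       # coil against helix or strand
    return 0.0                     # helix against strand


def ssea_score(ss_a, ss_b):
    """Global alignment of elements by dynamic programming with free
    gaps, normalized by the mean sequence length (range 0 to 1)."""
    ea, eb = to_elements(ss_a), to_elements(ss_b)
    n, m = len(ea), len(eb)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair_score(ea[i - 1], eb[j - 1]),
                dp[i - 1][j],      # skip an element of the first sequence
                dp[i][j - 1],      # skip an element of the second sequence
            )
    mean_len = (len(ss_a) + len(ss_b)) / 2
    return dp[n][m] / mean_len if mean_len else 0.0


def idealized_barrel(strand_len, helix_len):
    """Hypothetical (beta-alpha)x8 string with two-residue loops."""
    unit = "E" * strand_len + "CC" + "H" * helix_len + "CC"
    return "C" + unit * 8


barrel = idealized_barrel(5, 10)
variant = idealized_barrel(4, 13)   # same topology, different element lengths

print(ssea_score(barrel, barrel))   # identical strings give 1.0
print(ssea_score(barrel, variant))  # same topology, so the score stays high
print(ssea_score(barrel, "H" * len(barrel)))  # all-helix string scores low
```

Because the score depends only on the order and lengths of the elements, two sequences with the same β-α topology can score highly even when their residues differ, which is consistent with the behaviour described for 1vpqA and 1i60A.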

