TIM-Finder: a new method for identifying TIM-barrel proteins.

Si JN, Yan RX, Wang C, Zhang Z, Su XD - BMC Struct. Biol. (2009)

Bottom Line: The triosephosphate isomerase (TIM)-barrel fold occurs frequently in the proteomes of different organisms, and the known TIM-barrel proteins have been found to play diverse functional roles. With the assistance of Support Vector Machine (SVM), the three descriptors were combined to obtain a new method with improved performance, which we call TIM-Finder. TIM-Finder can serve as a competitive tool for proteome-wide TIM-barrel protein identification.


Affiliation: State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China. sijingna@gmail.com

ABSTRACT

Background: The triosephosphate isomerase (TIM)-barrel fold occurs frequently in the proteomes of different organisms, and the known TIM-barrel proteins have been found to play diverse functional roles. To accelerate the exploration of the sequence-structure protein landscape in the TIM-barrel fold, a computational tool that allows sensitive detection of TIM-barrel proteins is required.

Results: To develop a new TIM-barrel protein identification method in this work, we consider three descriptors: a sequence-alignment-based descriptor using PSI-BLAST e-values and bit scores, a descriptor based on secondary structure element alignment (SSEA), and a descriptor based on the occurrence of PROSITE functional motifs. With the assistance of Support Vector Machine (SVM), the three descriptors were combined to obtain a new method with improved performance, which we call TIM-Finder. When tested on the whole proteome of Bacillus subtilis, TIM-Finder is able to detect 194 TIM-barrel proteins at a 99% confidence level, outperforming the PSI-BLAST search as well as one existing fold recognition method.

Conclusions: TIM-Finder can serve as a competitive tool for proteome-wide TIM-barrel protein identification. The TIM-Finder web server is freely accessible at http://202.112.170.199/TIM-Finder/.


Figure 5: The overall performance of TIM-Finder measured by ROC analysis.

Mentions: Using SVM, the PSI-BLAST-, SSEA- and motif-based descriptors were combined into a prediction system called TIM-Finder. More details of the construction of TIM-Finder are available under Methods. The overall performance of TIM-Finder was measured by ROC analysis (Figure 5). For comparison, prediction based on the combination of the PSI-BLAST- and SSEA-based descriptors was also carried out, and the result from the PSI-BLAST-based descriptor alone is shown in Figure 5 as a benchmark. As shown in Figure 5, TIM-Finder achieves a high AUC value of 0.987. Since performance at low false positive rates (FPRs) is more important for real-world applications, the sensitivity values of TIM-Finder at FPRs of 1%, 5% and 10% are listed in Table 1. With the FPR controlled at 5%, TIM-Finder correctly identifies 92.0% of the TIM-barrel proteins, which is approximately 17 percentage points higher than the PSI-BLAST-based descriptor alone and about 12 percentage points higher than the combination of the PSI-BLAST- and SSEA-based descriptors (Table 1; Figure 5). Although the motif-based descriptor performs weakly on its own, it makes an important contribution to the final performance of TIM-Finder (Figure 5), implying that it relies on quite different features from the PSI-BLAST- and SSEA-based descriptors. Overall, TIM-Finder shows excellent performance in this benchmark, suggesting that it can be applied in practice, for example to proteome-wide TIM-barrel protein detection.
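
The two quantities reported above, the area under the ROC curve and the sensitivity at a fixed false positive rate, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names (`roc_points`, `auc`, `sensitivity_at_fpr`) are hypothetical, and the scores and labels are toy values invented purely to show the arithmetic, not data from the paper.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> Tuple[List[float], List[float]]:
    """Return (fpr, tpr) operating points as the decision threshold sweeps from high to low.

    labels: 1 for a positive (e.g. a TIM-barrel protein), 0 for a negative.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Visit examples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(order):
        current = scores[order[i]]
        # Examples with tied scores cross the threshold together.
        while i < len(order) and scores[order[i]] == current:
            if labels[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr: Sequence[float], tpr: Sequence[float]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
        for k in range(1, len(fpr))
    )


def sensitivity_at_fpr(fpr: Sequence[float], tpr: Sequence[float], max_fpr: float) -> float:
    """Highest sensitivity (TPR) among operating points whose FPR does not exceed max_fpr."""
    return max(t for f, t in zip(fpr, tpr) if f <= max_fpr)


# Toy example (invented values): 4 positives and 4 negatives.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 1, 0, 0, 0]
toy_fpr, toy_tpr = roc_points(toy_scores, toy_labels)
toy_auc = auc(toy_fpr, toy_tpr)
toy_sens = sensitivity_at_fpr(toy_fpr, toy_tpr, max_fpr=0.25)
```

With real classifier outputs, calling `sensitivity_at_fpr` with `max_fpr` set to 0.01, 0.05 and 0.10 would give the kind of values the text says are listed in Table 1. The sketch picks the best operating point at or below the FPR limit rather than interpolating between points, which is one of several reasonable conventions.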

