Statistical learning of peptide retention behavior in chromatographic separations: a new kernel-based approach for computational proteomics.

Pfeifer N, Leinenbach A, Huber CG, Kohlbacher O - BMC Bioinformatics (2007)

Bottom Line: A major problem with existing methods lies within the significant number of false positive and false negative annotations. Identification can thus be improved by comparing measured retention times to predicted retention times. Using the retention time predictor in our retention time filter improves the fraction of correctly identified peptide mass spectra significantly.


Affiliation: Division for Simulation of Biological Systems, Center for Bioinformatics, Eberhard-Karls University, 72076 Tübingen, Germany. npfeifer@informatik.uni-tuebingen.de

ABSTRACT

Background: High-throughput peptide and protein identification technologies have benefited tremendously from strategies based on tandem mass spectrometry (MS/MS) in combination with database searching algorithms. A major problem with existing methods lies in the significant number of false positive and false negative annotations. So far, standard algorithms for protein identification do not use the information gained from the separation processes usually involved in peptide analysis, such as retention time information, which is readily available from chromatographic separation of the sample. Identification can thus be improved by comparing measured retention times to predicted retention times. Current prediction models are derived from a set of measured test analytes, but they usually require large amounts of training data.

Results: We introduce a new kernel function which can be applied in combination with support vector machines to a wide range of computational proteomics problems. We show the performance of this new approach by applying it to the prediction of peptide adsorption/elution behavior in strong anion-exchange solid-phase extraction (SAX-SPE) and ion-pair reversed-phase high-performance liquid chromatography (IP-RP-HPLC). Furthermore, the predicted retention times are used to improve spectrum identifications by a p-value-based filtering approach. The approach was tested on a number of different datasets and shows excellent performance while requiring only very small training sets (about 40 peptides instead of thousands). Using the retention time predictor in our retention time filter improves the fraction of correctly identified peptide mass spectra significantly.

Conclusion: The proposed kernel function is well-suited for the prediction of chromatographic separation in computational proteomics and requires only a limited amount of training data. The performance of this new method is demonstrated by applying it to peptide retention time prediction in IP-RP-HPLC and prediction of peptide sample fractionation in SAX-SPE. Finally, we incorporate the predicted chromatographic behavior in a p-value-based filter to improve peptide identifications based on liquid chromatography-tandem mass spectrometry.
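
The abstract does not spell out how the p-value-based retention time filter is computed. The following is only a minimal sketch of one plausible form, assuming that prediction errors (measured minus predicted retention time) are roughly normally distributed and that their mean and standard deviation are estimated from a small set of trusted identifications. The function names, the dictionary keys `measured_rt` and `predicted_rt`, and the threshold `alpha` are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, stdev


def fit_residuals(measured, predicted):
    """Estimate mean and standard deviation of the prediction errors.

    Assumption: errors are measured minus predicted retention time,
    taken from identifications that are trusted to be correct.
    Needs at least two pairs, because stdev() is undefined otherwise.
    """
    residuals = [m - p for m, p in zip(measured, predicted)]
    return mean(residuals), stdev(residuals)


def rt_p_value(measured_rt, predicted_rt, mu, sigma):
    """Two-sided p-value of the observed deviation under a normal error model."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = abs((measured_rt - predicted_rt) - mu) / sigma
    # For a standard normal Z, P(|Z| >= z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


def rt_filter(candidates, mu, sigma, alpha=0.05):
    """Keep identifications whose deviation is not significant at level alpha."""
    return [
        c for c in candidates
        if rt_p_value(c["measured_rt"], c["predicted_rt"], mu, sigma) >= alpha
    ]
```

In such a scheme, an identification whose measured retention time lies far from the predicted value receives a small p-value and is discarded as a likely false positive. The predicted retention times themselves would come from the trained predictor, which is not modeled here.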


Figure 2: Performance comparison for peptide sample fractionation prediction. Comparison of classification success rates for different methods predicting peptide adsorption on the dataset of Oh et al. [11].

Mentions: A comparison of the classification success rate (SR) of the different methods can be found in Fig. 2. The first two bars represent the SR of the best SVMs using the standard kernels of Table 1. The third bar shows the performance of an SVM with the local-alignment kernel. The fourth bar shows the performance of the best predictor in Oh et al., which is 0.84. The last bar represents the SR of the POBK, the kernel introduced in this paper for peptide sample fractionation and retention time prediction. The SR of this method is 0.87, which is significantly better than all other approaches.

