Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction.

Larsen MV, Lundegaard C, Lamberth K, Buus S, Lund O, Nielsen M - BMC Bioinformatics (2007)

Bottom Line: The higher performance of NetCTL-1.2 as compared to EpiJen and MHC-pathway is, however, not statistically significant on all measures. In the large-scale benchmark calculation consisting of 216 known HIV epitopes covering all 12 recognized HLA supertypes, the NetCTL-1.2 method was shown to have a sensitivity among the 5% top-scoring peptides above 0.72. On this dataset, the best of the other methods achieved a sensitivity of 0.64.


Affiliation: Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark. metteb@cbs.dtu.dk

ABSTRACT

Background: Reliable predictions of cytotoxic T lymphocyte (CTL) epitopes are essential for rational vaccine design. Most importantly, they can minimize the experimental effort needed to identify epitopes. NetCTL is a web-based tool designed for predicting human CTL epitopes in any given protein. It does so by integrating predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. At least four other methods have been developed recently that likewise attempt to predict CTL epitopes: EpiJen, MAPPP, MHC-pathway, and WAPP. To compare the performance of prediction methods, objective benchmarks and standardized performance measures are needed. Here, we develop such a large-scale benchmark and corresponding performance measures and report the performance of an updated version (1.2) of NetCTL in comparison with the four other methods.

Results: We define a number of performance measures that can handle the different types of output data from the five methods. We use two evaluation datasets consisting of known HIV CTL epitopes and their source proteins. The source proteins are split into all possible 9-mers; except for the annotated epitopes, all 9-mers are considered non-epitopes. In the RANK measure, we compare two methods at a time and count how often each method ranks the epitope highest. In another measure, we find the specificity of the methods at three predefined sensitivity values. Lastly, for each method, we calculate the percentage of known epitopes that rank within the 5% of peptides with the highest predicted score.
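The measures described in the Results can be sketched roughly as follows. This is a minimal Python illustration, not the authors' implementation: the function names, the tie handling in the RANK comparison, and the rounding used for the 5% cutoff are all assumptions, and `score` stands for any hypothetical function that maps a 9-mer to a predicted value.

```python
from typing import Callable, List, Tuple

Score = Callable[[str], float]  # hypothetical: maps a peptide to a predicted score


def split_into_9mers(protein: str, k: int = 9) -> List[str]:
    """Return all overlapping k-mers of a protein (k = 9 as in the abstract)."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def epitope_rank(protein: str, epitope: str, score: Score) -> int:
    """1-based rank of the epitope among all 9-mers of its source protein.

    Assumption: peptides with a strictly higher score rank above the epitope.
    """
    own = score(epitope)
    return 1 + sum(1 for p in split_into_9mers(protein) if score(p) > own)


def rank_measure(pairs: List[Tuple[str, str]],
                 score_a: Score, score_b: Score) -> Tuple[int, int, int]:
    """Count how often method A or method B ranks the epitope highest.

    Returns (wins_a, wins_b, ties); counting ties separately is an assumption.
    """
    wins_a = wins_b = ties = 0
    for protein, epitope in pairs:
        ra = epitope_rank(protein, epitope, score_a)
        rb = epitope_rank(protein, epitope, score_b)
        if ra < rb:
            wins_a += 1
        elif rb < ra:
            wins_b += 1
        else:
            ties += 1
    return wins_a, wins_b, ties


def top_fraction_sensitivity(pairs: List[Tuple[str, str]], score: Score,
                             fraction: float = 0.05) -> float:
    """Fraction of known epitopes ranked within the top `fraction` of 9-mers."""
    hits = 0
    for protein, epitope in pairs:
        n_peptides = len(split_into_9mers(protein))
        # Assumption: round the cutoff down, but always keep at least one peptide.
        n_top = max(1, int(n_peptides * fraction))
        if epitope_rank(protein, epitope, score) <= n_top:
            hits += 1
    return hits / len(pairs)
```

Here every epitope-protein pair contributes one rank per method, so methods with different score scales can still be compared, which is the point of rank-based measures.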

Conclusion: NetCTL-1.2 is demonstrated to have a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP on all performance measures. The higher performance of NetCTL-1.2 as compared to EpiJen and MHC-pathway is, however, not statistically significant on all measures. In the large-scale benchmark calculation consisting of 216 known HIV epitopes covering all 12 recognized HLA supertypes, the NetCTL-1.2 method was shown to have a sensitivity among the 5% top-scoring peptides above 0.72. On this dataset, the best of the other methods achieved a sensitivity of 0.64. The NetCTL-1.2 method is available at http://www.cbs.dtu.dk/services/NetCTL. All datasets used are available at http://www.cbs.dtu.dk/suppl/immunology/CTL-1.2.php.


Figure 1: ROC curves. The analysis was performed on 41 A3-restricted epitope-protein pairs from the HIV dataset.

Mentions: In the above section, we used the AUC value to compare NetCTL-1.2 to NetCTL-1.0 and NetMHC-3.0NO_HIV. This measure is, however, not appropriate for the EpiJen and WAPP methods, which do not produce a single, combined score for each peptide in the dataset. Instead, the proteasomal cleavage and TAP transport predictors act as filters that reduce the number of possible epitopes. In addition, the EpiJen server outputs at most the 5% of peptides that have the highest predicted MHC class I affinity and also pass the proteasomal cleavage and TAP transport filters. The problem is exemplified in the ROC (Receiver Operating Characteristic) curves shown in Figure 1. For NetCTL-1.2, MAPPP, and MHC-pathway, the combined score is used as the predicted value. For EpiJen and WAPP, we used the predicted MHC class I affinity as the predicted value. The ROC curves for these two methods come to an abrupt stop, since there are no predicted values for peptides that do not pass the proteasomal cleavage and TAP transport filters. The ROC curves also highlight the need to extract sensitivity at comparable specificity levels, and vice versa, in order to achieve objective benchmark comparisons between methods: any of the methods can be assigned the highest sensitivity if the specificity is not set at a comparable level.
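To illustrate why a filter-based method yields a truncated ROC curve, and why specificity is best read off at a fixed sensitivity, here is a minimal Python sketch. It is not the authors' code: the function names are hypothetical, a score of `None` is an assumed way to represent a peptide removed by the cleavage or TAP filter, and tied scores are simply processed one peptide at a time.

```python
from typing import List, Optional, Tuple


def roc_points(labels: List[bool],
               scores: List[Optional[float]]) -> List[Tuple[float, float]]:
    """Return (1 - specificity, sensitivity) points as the threshold is lowered.

    Peptides whose score is None (assumed: removed by a filter) can never be
    predicted positive, so the curve stops before reaching (1, 1).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one epitope and one non-epitope")
    scored = sorted(
        ((s, y) for s, y in zip(scores, labels) if s is not None),
        key=lambda item: item[0],
        reverse=True,
    )
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, is_epitope in scored:
        if is_epitope:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def specificity_at_sensitivity(points: List[Tuple[float, float]],
                               target: float) -> Optional[float]:
    """Specificity at the first point whose sensitivity reaches `target`.

    Returns None when the (possibly truncated) curve never gets there.
    """
    for false_positive_rate, sensitivity in points:
        if sensitivity >= target:
            return 1.0 - false_positive_rate
    return None


# Toy data (invented for illustration): two epitopes, four non-epitopes.
labels = [True, True, False, False, False, False]
combined = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1]      # every peptide has a score
filtered = [0.9, None, 0.8, None, None, None]  # most peptides were filtered out

full_curve = roc_points(labels, combined)
cut_curve = roc_points(labels, filtered)
```

With these toy numbers, `full_curve` runs all the way to (1.0, 1.0), whereas `cut_curve` ends at a sensitivity of 0.5, so a specificity at full sensitivity cannot be reported for the filtered method.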

