Large-scale validation of methods for cytotoxic T-lymphocyte epitope prediction.

Larsen MV, Lundegaard C, Lamberth K, Buus S, Lund O, Nielsen M - BMC Bioinformatics (2007)

Affiliation: Center for Biological Sequence Analysis, BioCentrum-DTU, Building 208, Technical University of Denmark, DK-2800 Lyngby, Denmark. metteb@cbs.dtu.dk

ABSTRACT

Background: Reliable predictions of cytotoxic T lymphocyte (CTL) epitopes are essential for rational vaccine design. Most importantly, they can minimize the experimental effort needed to identify epitopes. NetCTL is a web-based tool designed for predicting human CTL epitopes in any given protein. It does so by integrating predictions of proteasomal cleavage, TAP transport efficiency, and MHC class I affinity. At least four other methods have recently been developed that likewise attempt to predict CTL epitopes: EpiJen, MAPPP, MHC-pathway, and WAPP. To compare the performance of prediction methods, objective benchmarks and standardized performance measures are needed. Here, we develop such a large-scale benchmark and corresponding performance measures, and report the performance of an updated version of NetCTL, version 1.2, in comparison with the four other methods.

Results: We define a number of performance measures that can handle the different types of output data from the five methods. We use two evaluation datasets consisting of known HIV CTL epitopes and their source proteins. The source proteins are split into all possible 9-mers; except for the annotated epitopes, all 9-mers are considered non-epitopes. In the RANK measure, we compare two methods at a time and count how often each method ranks the epitope highest. In another measure, we find the specificity of the methods at three predefined sensitivity values. Lastly, for each method, we calculate the percentage of known epitopes that rank within the top 5% of peptides by predicted score.

Conclusion: NetCTL-1.2 is demonstrated to have a higher predictive performance than EpiJen, MAPPP, MHC-pathway, and WAPP on all performance measures. The higher performance of NetCTL-1.2 as compared to EpiJen and MHC-pathway is, however, not statistically significant on all measures. In the large-scale benchmark calculation consisting of 216 known HIV epitopes covering all 12 recognized HLA supertypes, the NetCTL-1.2 method was shown to have a sensitivity above 0.72 among the top-scoring 5% of peptides. On this dataset, the best of the other methods achieved a sensitivity of 0.64. The NetCTL-1.2 method is available at http://www.cbs.dtu.dk/services/NetCTL. All datasets used are available at http://www.cbs.dtu.dk/suppl/immunology/CTL-1.2.php.
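
The benchmark construction and two of the measures described in the abstract (splitting a source protein into 9-mers, ranking the annotated epitope, and checking whether it falls within the top-scoring 5%) can be sketched as follows. This is only an illustrative sketch, not the authors' code: the function names, the tie handling (rank = 1 + number of peptides scoring strictly higher), and the rounding of the 5% cutoff are assumptions, and `score` stands in for any prediction method.

```python
from typing import Callable, Iterable, List, Tuple

Scorer = Callable[[str], float]  # hypothetical stand-in for a prediction method


def split_into_kmers(protein: str, k: int = 9) -> List[str]:
    """Return every overlapping k-mer of the protein, in sequence order."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def epitope_rank(protein: str, epitope: str, score: Scorer) -> int:
    """Rank of the epitope among all k-mers of its source protein (1 = best).

    Assumption: ties are resolved in the epitope's favour, i.e. only
    peptides scoring strictly higher push the epitope down.
    """
    target = score(epitope)
    peptides = split_into_kmers(protein, len(epitope))
    return 1 + sum(1 for p in peptides if score(p) > target)


def in_top_fraction(protein: str, epitope: str, score: Scorer,
                    fraction: float = 0.05) -> bool:
    """True if the epitope ranks within the top `fraction` of peptides.

    Assumption: the cutoff is rounded down, with a minimum of one peptide.
    """
    n_peptides = len(protein) - len(epitope) + 1
    n_top = max(1, int(n_peptides * fraction))
    return epitope_rank(protein, epitope, score) <= n_top


def rank_comparison(pairs: Iterable[Tuple[str, str]],
                    score_a: Scorer, score_b: Scorer) -> Tuple[int, int, int]:
    """Count how often method A or method B ranks the epitope highest.

    `pairs` holds (source protein, epitope) tuples.
    Returns (wins for A, wins for B, ties).
    """
    wins_a = wins_b = ties = 0
    for protein, epitope in pairs:
        rank_a = epitope_rank(protein, epitope, score_a)
        rank_b = epitope_rank(protein, epitope, score_b)
        if rank_a < rank_b:
            wins_a += 1
        elif rank_b < rank_a:
            wins_b += 1
        else:
            ties += 1
    return wins_a, wins_b, ties
```

Averaging `in_top_fraction` over all epitope-protein pairs gives a sensitivity among the top-scoring 5% of peptides, under the assumptions stated above.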

Figure 3: Comparing specificities. The HIV dataset has been used for the analysis. In order to include epitopes restricted to as many supertypes as possible, NetCTL-1.2 is compared to each of the other methods separately. For each comparison, only predictions for supertypes that the test method covers are included. The average specificity is found at a predefined average sensitivity using either NetCTL-1.2 or one of the four test methods (EpiJen, MAPPP, MHC-pathway, WAPP). A: Average sensitivity = 0.3, B: Average sensitivity = 0.5, C: Average sensitivity = 0.8. Only NetCTL-1.2, MAPPP and MHC-pathway provide enough predicted scores to obtain a sensitivity of 0.8. The error bars are the standard error. ** The difference is significant at P < 0.01. * The difference is significant at P < 0.05.

Mentions: When using the default settings at the NetCTL-1.2, MAPPP, and WAPP servers, thresholds are defined that separate the predicted epitopes from the predicted non-epitopes. At the EpiJen server, one can choose between defining the top-scoring 5%, 4%, 3%, or 2% of peptides as epitopes. MHC-pathway does not yet offer any thresholds for separating predicted epitopes from non-epitopes. These differences pose a challenge when comparing the sensitivity and specificity of the methods, since calculating these measures requires that the predicted epitopes can be separated from the non-epitopes. Furthermore, as mentioned earlier, it is generally difficult to decide which method has the highest predictive performance if one method has the highest sensitivity while the other has the highest specificity.

To overcome these problems, we chose to compare the specificity of the methods at a series of predefined sensitivity values. We chose three predefined sensitivities: 0.3, 0.5, and 0.8. For the HIV dataset, we again compared two methods at a time, NetCTL-1.2 and one of the four test methods, in order to include epitopes restricted to as many supertypes as possible. For the HIVEpiJen dataset, all methods can be compared simultaneously, since all methods can predict epitopes restricted to the A1, A2, and A3 supertypes. We first identified the prediction threshold values that result in the desired sensitivity when averaging over all epitope-protein pairs. We then used the same thresholds to find the average specificity.

Figure 3 shows the results for the HIV dataset. NetCTL-1.2 has a significantly higher specificity than EpiJen, MAPPP, and WAPP at all sensitivities (P < 0.01, unpaired Student's t-test). Compared to MHC-pathway, NetCTL-1.2 has a higher specificity at average sensitivities of 0.3 and 0.5, although this difference is not statistically significant. At an average sensitivity of 0.8, NetCTL-1.2 has a significantly higher specificity than MHC-pathway (P < 0.05, unpaired Student's t-test).
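
The two-step procedure described above (find the threshold that yields the desired average sensitivity, then read off the average specificity at that threshold) might look like the sketch below. It is an illustration under stated assumptions rather than the authors' implementation: each epitope-protein pair is assumed to contribute one epitope score and a non-empty list of non-epitope scores, a peptide is called an epitope when its score is at or above the threshold, and the threshold is taken to be the lowest epitope score that still reaches the target. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# (score of the annotated epitope, scores of all non-epitope 9-mers)
Pair = Tuple[float, Sequence[float]]


def threshold_for_sensitivity(pairs: List[Pair], target: float) -> float:
    """Highest threshold whose average sensitivity is at least `target`.

    With one epitope per pair, per-pair sensitivity is 0 or 1, so the
    average sensitivity is the fraction of epitopes at or above the
    threshold. The small epsilon guards against floating-point noise.
    """
    epitope_scores = sorted((e for e, _ in pairs), reverse=True)
    k = max(1, math.ceil(target * len(epitope_scores) - 1e-9))
    return epitope_scores[k - 1]


def average_sensitivity(pairs: List[Pair], threshold: float) -> float:
    """Fraction of pairs whose epitope is predicted (score >= threshold)."""
    return sum(e >= threshold for e, _ in pairs) / len(pairs)


def average_specificity(pairs: List[Pair], threshold: float) -> float:
    """Mean over pairs of the fraction of non-epitopes scoring below threshold."""
    per_pair = [sum(s < threshold for s in non) / len(non) for _, non in pairs]
    return sum(per_pair) / len(per_pair)


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    toy_pairs: List[Pair] = [
        (0.9, [0.1, 0.2, 0.95, 0.3]),
        (0.6, [0.1, 0.7, 0.2, 0.3]),
        (0.4, [0.5, 0.1, 0.2, 0.3]),
        (0.2, [0.1, 0.1, 0.3, 0.1]),
    ]
    for target in (0.3, 0.5, 0.8):
        t = threshold_for_sensitivity(toy_pairs, target)
        print(target, t, average_sensitivity(toy_pairs, t),
              average_specificity(toy_pairs, t))
```

Fixing the sensitivity first puts methods with different default thresholds, or with none at all, on a common footing, so that only the specificity needs to be compared.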

