How to evaluate performance of prediction methods? Measures and their interpretation in variation effect analysis.

Vihinen M - BMC Genomics (2012)

Bottom Line: Instructions are given on how to interpret and compare method evaluation results. Predictions of genetic variation effects on DNA, RNA and protein level are important as information about variants can be produced much faster than their disease relevance can be experimentally verified. Comparisons of methods for missense variant tolerance, protein stability changes due to amino acid substitutions, and effects of variations on mRNA splicing are presented.


Affiliation: Institute of Biomedical Technology, University of Tampere, Finland. mauno.vihinen@med.lu.se

ABSTRACT

Background: Prediction methods are increasingly used in biosciences to forecast diverse features and characteristics. Binary two-state classifiers are the most common applications. They are usually based on machine learning approaches. For the end user it is often problematic to evaluate the true performance and applicability of computational tools as some knowledge about computer science and statistics would be needed.

Results: Instructions are given on how to interpret and compare method evaluation results. Systematic analysis of method performance requires established benchmark datasets, which contain cases with known outcomes, as well as suitable evaluation measures. The criteria for benchmark datasets are discussed along with their implementation in VariBench, a benchmark database for variations. No single measure can alone describe all aspects of method performance. Predictions of genetic variation effects on DNA, RNA and protein level are important as information about variants can be produced much faster than their disease relevance can be experimentally verified. Numerous prediction tools have therefore been developed; however, systematic analyses of their performance and comparisons between them have only started to emerge.

Conclusions: The end users of prediction tools should be able to understand how evaluation is done and how to interpret the results. Six main performance evaluation measures are introduced. These include sensitivity, specificity, positive predictive value, negative predictive value, accuracy and Matthews correlation coefficient. Together with receiver operating characteristics (ROC) analysis they provide a good picture about the performance of methods and allow their objective and quantitative comparison. A checklist of items to look at is provided. Comparisons of methods for missense variant tolerance, protein stability changes due to amino acid substitutions, and effects of variations on mRNA splicing are presented.
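The six measures named above are all simple functions of the four cells of the 2x2 contingency table (TP, FP, TN, FN). As a minimal sketch, the following computes them in Python; the counts used in the example are hypothetical, not taken from the paper:

```python
import math

def evaluate_classifier(tp, fp, tn, fn):
    """Compute the six performance measures from a 2x2 contingency table."""
    sensitivity = tp / (tp + fn)                # true positive rate
    specificity = tn / (tn + fp)                # true negative rate
    ppv = tp / (tp + fp)                        # positive predictive value
    npv = tn / (tn + fn)                        # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "mcc": mcc,
    }

# Hypothetical predictor tested on 200 variants with known outcomes:
# 90 TP, 10 FN, 80 TN, 20 FP.
print(evaluate_classifier(tp=90, fp=20, tn=80, fn=10))
```

Note that accuracy alone can be misleading on class-imbalanced data, which is one reason the MCC, a correlation-based measure using all four cells, is included among the six.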


Figure 5: Separation of classes. In most classification problems the two classes overlap. Moving the cut-off position adjusts the extent of the overlap between the classes; FN and FP are the misclassified cases. Prediction methods aim to optimize the cut-off and thereby adjust the numbers in the contingency table.

Mentions: The goal of two-class prediction methods is to separate positive cases from negative ones. Because the predictions for the two classes usually overlap, a cut-off distinguishing the categories has to be optimized (Fig. 5). Moving the cut-off changes the numbers of misclassified cases, FN and FP. With well-behaved, representative data and a well-trained classifier, the misclassifications can be minimized.
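The trade-off described above can be made concrete with a small sketch: sliding the cut-off across two overlapping score distributions trades FN against FP. The score values below are illustrative only, not from the paper:

```python
def misclassifications(pos_scores, neg_scores, cutoff):
    """Count FN and FP for a given cut-off.

    Cases scoring at or above the cut-off are predicted positive.
    """
    fn = sum(s < cutoff for s in pos_scores)   # positives predicted negative
    fp = sum(s >= cutoff for s in neg_scores)  # negatives predicted positive
    return fn, fp

# Toy, overlapping score distributions for known positive and negative cases.
positives = [0.5, 0.6, 0.7, 0.8, 0.9]
negatives = [0.1, 0.2, 0.3, 0.4, 0.6]

for cutoff in (0.35, 0.5, 0.65):
    fn, fp = misclassifications(positives, negatives, cutoff)
    print(f"cut-off {cutoff}: FN={fn}, FP={fp}")
```

Lowering the cut-off reduces FN at the cost of more FP, and vice versa; evaluating this trade-off across all cut-offs is exactly what ROC analysis does.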

