Benchmarking mutation effect prediction algorithms using functionally validated cancer-related missense mutations.

Martelotto LG, Ng CK, De Filippo MR, Zhang Y, Piscuoglio S, Lim RS, Shen R, Norton L, Reis-Filho JS, Weigelt B - Genome Biol. (2014)

Bottom Line: Combinations of predictors modestly improve accuracy and significantly improve negative predictive values. The information provided by mutation effect predictors is not equivalent. Combining algorithms aggregates orthogonal information and may result in improvements in the negative predictive value of mutation effect predictions.


Affiliation: Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.

ABSTRACT

Background: Massively parallel sequencing studies have led to the identification of a large number of mutations present in a minority of cancers of a given site. Hence, methods to identify the likely pathogenic mutations that are worth exploring experimentally and clinically are required. We sought to compare the performance of 15 mutation effect prediction algorithms and their agreement. As a hypothesis-generating aim, we sought to define whether combinations of prediction algorithms would improve the functional effect predictions of specific mutations.

Results: Literature and database mining of single nucleotide variants (SNVs) affecting 15 cancer genes was performed to identify mutations supported by functional evidence or hereditary disease association, which were classified as either non-neutral (n = 849) or neutral (n = 140) with respect to their impact on protein function. These SNVs were used to test the performance of 15 mutation effect prediction algorithms. The accuracy of the prediction algorithms varies considerably. Although all algorithms perform consistently well in terms of positive predictive value, their negative predictive value varies substantially. Cancer-specific mutation effect predictors display no-to-almost perfect agreement in their predictions of these SNVs, whereas the non-cancer-specific predictors show no-to-moderate agreement. Combinations of predictors modestly improve accuracy and significantly improve negative predictive values.

Conclusions: The information provided by mutation effect predictors is not equivalent. No algorithm predicts with sufficient accuracy which SNVs should be taken forward for experimental or clinical testing. Combining algorithms aggregates orthogonal information and may result in improvements in the negative predictive value of mutation effect predictions.
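The contrast between positive and negative predictive value reported above is partly a matter of class balance, and a small calculation may make that concrete. The sketch below uses only the class sizes stated in the abstract (849 non-neutral, 140 neutral); the `Confusion` class, the `metrics` helper, and the "always non-neutral" predictor are illustrative assumptions and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # non-neutral SNVs predicted non-neutral
    fp: int  # neutral SNVs predicted non-neutral
    tn: int  # neutral SNVs predicted neutral
    fn: int  # non-neutral SNVs predicted neutral


def metrics(c: Confusion) -> dict:
    """Return accuracy, PPV and NPV; a value is None when undefined."""
    total = c.tp + c.fp + c.tn + c.fn
    positive_calls = c.tp + c.fp
    negative_calls = c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "ppv": c.tp / positive_calls if positive_calls else None,
        "npv": c.tn / negative_calls if negative_calls else None,
    }


# Hypothetical predictor that labels every SNV as non-neutral.
# Only the class sizes (849 and 140) are taken from the abstract.
always_non_neutral = Confusion(tp=849, fp=140, tn=0, fn=0)
result = metrics(always_non_neutral)
print(result)
```

With this class balance, even an uninformative predictor reaches a PPV of 849/989 (about 0.86) while its NPV is undefined, which is one reason NPV is the more discriminating metric on such a dataset.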

Figure 4: Recurrence of individual mutation effect prediction algorithms in the top-performing mutation effect prediction algorithm combinations ranked by composite score. (A) The top 10, top 20, top 50, and top 100 combinations of prediction algorithms were defined using the non-neutral (n = 849) and neutral (n = 140) single nucleotide variants (SNVs) included in the entire dataset and ranked according to composite score. The frequency of each single mutation effect predictor present in these top combinations was determined in subset 1 and subset 2. (B) The top 10, top 20, top 50, and top 100 combinations of prediction algorithms were defined using the non-neutral (n = 188) and neutral (n = 109) SNVs not present in the COSMIC database and ranked according to composite score. The frequency of each single mutation effect predictor present in these top combinations was determined in subset 1 and subset 2.
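The recurrence analysis described in this caption amounts to counting how often each predictor appears among the top-ranked combinations. A minimal sketch of that counting step follows; the function name `predictor_recurrence`, the toy ranking, and the placeholder "Predictor X" are hypothetical, and the ordering shown is invented for illustration rather than taken from the paper.

```python
from collections import Counter


def predictor_recurrence(ranked_combinations, top_n):
    """Fraction of the top_n combinations that include each predictor.

    ranked_combinations: list of tuples of predictor names, best first.
    """
    top = ranked_combinations[:top_n]
    if not top:
        return {}
    # set(combo) ensures a predictor is counted once per combination.
    counts = Counter(name for combo in top for name in set(combo))
    return {name: count / len(top) for name, count in counts.items()}


# Hypothetical ranking (best first); not the paper's actual results.
ranked = [
    ("CHASM (breast)", "MutationTaster"),
    ("CHASM (lung)", "MutationTaster"),
    ("CHASM (breast)", "CHASM (lung)"),
    ("MutationTaster", "Predictor X"),
]

top2 = predictor_recurrence(ranked, 2)
print(top2)
```

Calling the function with `top_n` set to 10, 20, 50, and 100 on a real ranking would give the per-predictor frequencies that a figure of this kind plots.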

Mentions: Although mutation effect prediction algorithm combinations had a relatively limited impact on accuracy and composite score, some predictor combinations significantly improved the negative predictive value (NPV) compared with the best single predictor and the best meta-predictor (Figure 3, Additional files 20, 21, and 22). Again, the CHASM (breast) and MutationTaster predictor combination resulted in a significant improvement in NPV compared with the NPV of the best single predictor or the best meta-predictor in all subsets. When analyzing the top 10, top 20, top 50, and top 100 combinations of mutation effect prediction algorithms, we noted that MutationTaster, CHASM (breast), and CHASM (lung) were consistently present in the top-performing predictor combinations in subsets 1 and 2 using the 989 functionally validated SNVs, irrespective of whether the combination predictor performance was ranked according to accuracy or composite score (Figure 4; Additional file 23). When only the non-COSMIC SNVs were included in the analysis, the same mutation effect prediction algorithms were consistently present in the best-performing mutation effect prediction algorithm combinations (Figure 4; Additional file 23).
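To illustrate how combining two predictors can raise NPV when their errors do not overlap, the toy example below may help. The text here does not state how the authors combined predictions, so the rule used (call an SNV neutral only when both predictors call it neutral) is an assumption, as are the function names and the ten made-up SNVs.

```python
def npv(truth, calls):
    """NPV = truly neutral SNVs among those called neutral.

    truth, calls: lists of booleans, True meaning non-neutral.
    Returns None when no SNV is called neutral.
    """
    neutral_called = [t for t, c in zip(truth, calls) if not c]
    if not neutral_called:
        return None
    return sum(1 for t in neutral_called if not t) / len(neutral_called)


def combine_either(calls_a, calls_b):
    """Assumed rule: non-neutral if either predictor says non-neutral."""
    return [a or b for a, b in zip(calls_a, calls_b)]


# Toy data (hypothetical): 6 non-neutral SNVs followed by 4 neutral ones.
truth = [True] * 6 + [False] * 4
pred_a = [True, True, True, True, False, False, False, False, False, True]
pred_b = [True, True, False, False, True, True, False, False, True, False]

combined = combine_either(pred_a, pred_b)
print(npv(truth, pred_a), npv(truth, pred_b), npv(truth, combined))
```

In this toy case each predictor alone misses different non-neutral SNVs, so requiring both to agree on a neutral call removes those false negatives; the cost is that fewer SNVs are called neutral at all.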

