OncoScore: a novel, Internet-based tool to assess the oncogenic potential of genes


ABSTRACT

The complicated, evolving landscape of cancer mutations makes it a formidable challenge to identify cancer genes among the large lists of mutations typically generated in NGS experiments. The ability to prioritize these variants is therefore of paramount importance. To address this issue we developed OncoScore, a text-mining tool that ranks genes according to their association with cancer, based on available biomedical literature. Receiver operating characteristic (ROC) curve and area under the curve (AUC) metrics on manually curated datasets confirmed the excellent discriminating capability of OncoScore (OncoScore cut-off threshold = 21.09; AUC = 90.3%, 95% CI: 88.1–92.5%), indicating that OncoScore provides useful results in cases where an efficient prioritization of cancer-associated genes is needed.


Figure 1: OncoScore distribution of ‘Cancer’ and ‘Non-Cancer’ gene sets. Box plot (a) and frequency histogram (b) of the OncoScore distributions for non-cancer and cancer genes. (a) Each box plot is drawn between the lower and upper quartiles of the distributions, with a bold black line showing the median value. The OncoScore distributions of ‘Cancer’ and ‘Non-Cancer’ genes are significantly different (Mann-Whitney-Wilcoxon test: p-value = 2.2e-16). (b) OncoScore frequency distribution plotted with equispaced breaks. (c) OncoScore and (d) Gene Ranker ranking plots of a mixed panel comprising ‘Cancer’ (*) and ‘Non-Cancer’ genes. The horizontal red lines mark the best cut-off threshold of each classifier model.
© Copyright Policy - open-access

Mentions: The distribution of OncoScore values differed significantly between the two groups (mean: 48.8 and 14.8 for CGC and nCan, respectively; p-value = 2.2e-16; Fig. 1a,b). The receiver operating characteristic (ROC) curve and the area under the curve (AUC) metrics (Fig. 2a,b) confirmed the excellent capability of OncoScore in discriminating true positive from true negative cancer genes at different cut-off values (OncoScore cut-off threshold = 21.09; AUC1 = 90.3%, 95% CI: 88.1–92.5%; see Methods section for further details). The same analysis performed on the entire list of known human genes (Supplementary Table 3) using the identical cut-off (21.09) identified a total of 5945 cancer-related genes, corresponding to 35% of the total (Suppl. Fig. 1).
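To make the cut-off and AUC figures above more concrete, the following is a minimal Python sketch, not the authors' implementation. The gene scores are invented for illustration, the function names are hypothetical, and the choice of a strict "greater than" comparison against the cut-off is an assumption (the text does not say how ties at 21.09 are handled). The AUC is computed as the probability that a randomly chosen cancer gene scores higher than a randomly chosen non-cancer gene, which equals the normalized Mann-Whitney U statistic.

```python
CUTOFF = 21.09  # cut-off threshold reported in the text

# Hypothetical scores for illustration only; these are NOT real OncoScore values.
cancer_scores = [55.0, 80.2, 30.5, 18.0]
non_cancer_scores = [5.0, 12.3, 25.0, 0.0]


def is_cancer_associated(score, cutoff=CUTOFF):
    """Assumed rule: a gene is flagged when its score exceeds the cut-off."""
    return score > cutoff


def auc(positive, negative):
    """Fraction of (positive, negative) pairs in which the positive scores
    higher; ties count one half. Equals the normalized Mann-Whitney U."""
    wins = 0.0
    for p in positive:
        for n in negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive) * len(negative))


def sensitivity_specificity(positive, negative, cutoff=CUTOFF):
    """Sensitivity and specificity of the cut-off rule on labelled scores."""
    true_pos = sum(is_cancer_associated(s, cutoff) for s in positive)
    true_neg = sum(not is_cancer_associated(s, cutoff) for s in negative)
    return true_pos / len(positive), true_neg / len(negative)


if __name__ == "__main__":
    print("AUC:", auc(cancer_scores, non_cancer_scores))
    print("sensitivity, specificity:",
          sensitivity_specificity(cancer_scores, non_cancer_scores))
```

Sweeping the cut-off over the range of observed scores and recording sensitivity against (1 - specificity) at each value would trace the ROC curve; the pairwise formula gives the area under that curve directly.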

