Network-based Prediction of Cancer under Genetic Storm.

Ay A, Gong D, Kahveci T - Cancer Inform (2014)

Bottom Line: Here we present a new network-based supervised classification technique, namely the NBC method. We compare NBC to five traditional classification techniques (support vector machines (SVM), k-nearest neighbor (kNN), naïve Bayes (NB), C4.5, and random forest (RF)) using 50-300 genes selected by five feature selection methods. Our analysis suggests that using symmetrical uncertainty (SU) feature selection method with NBC method provides the most accurate classification strategy.


Affiliation: Department of Mathematics, Colgate University, Hamilton, NY, USA; Department of Biology, Colgate University, Hamilton, NY, USA.

ABSTRACT
Classification of cancer patients using traditional methods is a challenging task in medical practice. Owing to rapid advances in microarray technologies, expression levels of thousands of genes from individual cancer patients can now be measured. The classification of cancer patients by supervised statistical learning algorithms using these gene expression datasets provides an alternative to the traditional methods. Here we present a new network-based supervised classification technique, namely the NBC method. We compare NBC to five traditional classification techniques (support vector machines (SVM), k-nearest neighbor (kNN), naïve Bayes (NB), C4.5, and random forest (RF)) using 50-300 genes selected by five feature selection methods. Our results on five large cancer datasets demonstrate that the NBC method outperforms the traditional classification techniques. Our analysis suggests that using the symmetrical uncertainty (SU) feature selection method with the NBC method provides the most accurate classification strategy. Finally, in-depth analysis of the correlation-based co-expression networks chosen by our network-based classifier in different cancer classes shows that there are drastic changes in the network models of different cancer types.
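As a point of reference for the SU criterion named in the abstract, the following is a minimal sketch of the standard symmetrical uncertainty score, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). It is not the authors' code. It assumes that expression values have already been discretized into categories; the abstract does not state how (or whether) this was done, and the function names and toy values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a score between 0 and 1."""
    hx, hy = entropy(x), entropy(y)
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    if hx + hy == 0:
        return 0.0  # both variables are constant; no information to share
    mutual_information = hx + hy - hxy
    return 2.0 * mutual_information / (hx + hy)


# Hypothetical toy data: one discretized gene per list, one class label per sample.
labels = ["A", "A", "B", "B"]
informative_gene = ["low", "low", "high", "high"]
uninformative_gene = ["low", "high", "low", "high"]

su_informative = symmetrical_uncertainty(informative_gene, labels)
su_uninformative = symmetrical_uncertainty(uninformative_gene, labels)
```

Ranking genes by such a score against the class label and keeping the top-scoring ones is one common way to use SU for feature selection.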


Closeness centrality distributions of the networks in different cancer classes. The closeness centrality distributions of the networks created by the NBC method for the NCI60 (A) and leukemia (B) datasets. In each graph, the x-axis represents the closeness centrality score and the y-axis represents the frequency.
© Copyright Policy - open-access

Mentions: Despite the large number of experimental and computational studies, we still have limited knowledge about the mechanisms of different cancer types. To understand cancer-dependent changes in the correlation-based co-expression networks, we give a brief analysis of the network measures for the networks created by the NBC method for different cancer classes in the leukemia and NCI60 cancer datasets (Figs. 3–6). As suggested above, the best feature selection method for the NBC classifier is SU. Therefore, in this section we focus on the association networks built from the genes selected by the SU feature selection method. Owing to their sparse network structures, we omitted the lung, breast, and colon cancer datasets from this experiment (see Figure 3). We compared the networks created for different cancer classes with respect to three network measures: the degree, clustering coefficient, and closeness centrality distributions of the nodes of the network models generated by NBC. For both datasets, we used the network that leads to the best classification when up to 100 genes are used (Table 1 and Fig. 1). For the NCI60 dataset, the best accuracy is achieved with 75 genes and a correlation threshold of 0.725; for the leukemia dataset, with 100 genes and a correlation threshold of 0.75.
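The three network measures named above can be illustrated on a toy network. The sketch below is not the authors' implementation. It assumes Pearson correlation, an edge between two genes whenever the absolute correlation reaches the threshold, and a closeness score computed within a node's connected component (one of several common conventions). The gene names and expression values are made up for illustration.

```python
from collections import deque
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def build_network(expr, threshold):
    """Link two genes when |correlation| >= threshold (assumed edge rule)."""
    genes = list(expr)
    adj = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                adj[g].add(h)
                adj[h].add(g)
    return adj


def degree(adj, v):
    """Number of neighbours of v."""
    return len(adj[v])


def clustering_coefficient(adj, v):
    """Fraction of pairs of neighbours of v that are themselves linked."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i, a in enumerate(nbrs) for b in nbrs[i + 1:] if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def closeness_centrality(adj, v):
    """Reachable node count divided by the sum of BFS distances from v."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Hypothetical expression profiles (one list of sample values per gene).
expr = {
    "g1": [1, 2, 3, 4, 5],
    "g2": [2, 4, 6, 8, 10],
    "g3": [5, 4, 3, 2, 1],
    "g4": [3, 1, 4, 1, 5],
}
adj = build_network(expr, threshold=0.75)
measures = {
    g: (degree(adj, g), clustering_coefficient(adj, g), closeness_centrality(adj, g))
    for g in adj
}
```

Collecting these per-node values into histograms, one per cancer class, would give distributions of the kind the figures compare.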

