Network-based Prediction of Cancer under Genetic Storm.

Ay A, Gong D, Kahveci T - Cancer Inform (2014)

Bottom Line: Here we present a new network-based supervised classification technique, namely the NBC method. We compare NBC to five traditional classification techniques (support vector machines (SVM), k-nearest neighbor (kNN), naïve Bayes (NB), C4.5, and random forest (RF)) using 50–300 genes selected by five feature selection methods. Our analysis suggests that using the symmetrical uncertainty (SU) feature selection method with the NBC method provides the most accurate classification strategy.


Affiliation: Department of Mathematics, Colgate University, Hamilton, NY, USA; Department of Biology, Colgate University, Hamilton, NY, USA.

ABSTRACT
Classification of cancer patients using traditional methods is a challenging task in medical practice. Owing to rapid advances in microarray technologies, the expression levels of thousands of genes from individual cancer patients can now be measured. The classification of cancer patients by supervised statistical learning algorithms using gene expression datasets provides an alternative to the traditional methods. Here we present a new network-based supervised classification technique, namely the NBC method. We compare NBC to five traditional classification techniques (support vector machines (SVM), k-nearest neighbor (kNN), naïve Bayes (NB), C4.5, and random forest (RF)) using 50–300 genes selected by five feature selection methods. Our results on five large cancer datasets demonstrate that the NBC method outperforms the traditional classification techniques. Our analysis suggests that using the symmetrical uncertainty (SU) feature selection method with the NBC method provides the most accurate classification strategy. Finally, in-depth analysis of the correlation-based co-expression networks chosen by our network-based classifier in different cancer classes shows that there are drastic changes in the network models of different cancer types.
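The abstract names symmetrical uncertainty (SU) as the feature selection method that pairs best with NBC. As a minimal sketch of how such a ranking could work, the snippet below uses the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) on discretized expression values. The toy data, the gene names, and the low/high discretization are assumptions made for illustration; the abstract does not state how the authors discretized expression levels or which implementation they used.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a score between 0 and 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, nothing to share
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


# Hypothetical toy data: class labels for six samples and the
# discretized (low/high) expression of two made-up genes.
labels = ["A", "A", "A", "B", "B", "B"]
genes = {
    "gene_informative": ["low", "low", "low", "high", "high", "high"],
    "gene_noise": ["low", "high", "low", "high", "low", "high"],
}

# Score every gene against the class labels and rank from best to worst.
scores = {name: symmetrical_uncertainty(expr, labels) for name, expr in genes.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In a setting like the one described, the top 50–300 genes of such a ranking would be passed on to the classifier. Here `gene_informative` separates the two classes perfectly and ranks first, while `gene_noise` scores close to zero.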


Figure 5: Clustering coefficient distributions of the networks in different cancer classes. The clustering coefficient distributions of the networks are shown for the NBC method for the NCI60 (A) and leukemia (B) datasets. In each graph, the x-axis represents the clustering coefficient score and the y-axis represents the frequency.
© Copyright Policy - open-access


Mentions: Next we measured the clustering coefficient and closeness centrality values for each gene. We observed that in classes 3, 4, 6, 7, and 8, the networks have very small clustering coefficients (Fig. 5A), which suggests that in these cancer classes, most of a gene's neighbors are not associated with each other. With regard to closeness centrality, these five classes showed centrality scores less than or equal to 8 (Fig. 6A), possibly because the networks formed in these cancer classes are small owing to the many isolated genes they contain. We observed slightly different behavior in cancer classes 1, 2, and 5, probably because of the smaller number of isolated genes. In these classes, the networks showed slightly more clustering between genes and higher centrality scores (9–21) (Figs. 5A and 6A).
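To make the two per-gene measures concrete, here is a minimal sketch that computes a clustering coefficient and a closeness centrality on a small undirected network. It uses common textbook definitions, whose scaling may differ from the scores reported above, and the five-gene network is hypothetical, not taken from the article.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # isolated genes and genes with one neighbor score 0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def closeness_centrality(adj, node):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    dist = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search over the unweighted network
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


# Hypothetical co-expression network: g1, g2, g3 form a triangle,
# g4 hangs off g3, and g5 is an isolated gene.
network = {
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g4"},
    "g4": {"g3"},
    "g5": set(),
}

clustering = {g: clustering_coefficient(network, g) for g in network}
closeness = {g: closeness_centrality(network, g) for g in network}
```

The isolated gene `g5` gets zero on both measures, which matches the observation that networks with many isolated genes show small clustering coefficients and low centrality scores.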


