Genetic programming based ensemble system for microarray data classification.

Liu KH, Tong M, Xie ST, Yee Ng VT - Comput Math Methods Med (2015)

Bottom Line: Each individual of the GP is an ensemble system, and they become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved.

Affiliation: Software School of Xiamen University, Xiamen, Fujian 361005, China ; Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon 999077, Hong Kong.

ABSTRACT
Recently, more and more machine learning techniques have been applied to microarray data analysis. The aim of this study is to propose a new genetic programming (GP) based ensemble system (named GPES), which can be used to effectively classify different types of cancers. Decision trees are deployed as base classifiers in this ensemble framework with three operators: Min, Max, and Average. Each individual of the GP is an ensemble system, and the individuals become more and more accurate in the evolutionary process. The feature selection technique and balanced subsampling technique are applied to increase the diversity in each ensemble system. The final ensemble committee is selected by a forward search algorithm, which is shown to be capable of fitting data automatically. The performance of GPES is evaluated using five binary-class and six multiclass microarray datasets, and results show that the algorithm can achieve better results in most cases compared with some other ensemble systems. By using elaborate base classifiers or applying other sampling techniques, the performance of GPES may be further improved.

Figure 2: An example of an individual of GP in the proposed algorithm.

Mentions: In order to make these operators work effectively, each nonterminal is set to contain three children, which can be either terminals or nonterminals. An example of an individual is illustrated in Figure 2. Here, T1–T7 are decision trees, and Average, Min, and Max are fusion operators. In this individual, if T1, T2, and T5 produce negative votes for a sample and others produce positive votes, the final output of the ensemble system is a negative label.
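The evaluation of such an individual can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree layout below is hypothetical (the text does not describe how T1–T7 are arranged in Figure 2), votes are encoded as 0.0 for negative and 1.0 for positive, and a fused score below 0.5 is read as a negative label. The names `evaluate`, `classify`, and `INDIVIDUAL` are invented for this example.

```python
from statistics import mean

# Fusion operators named in the text; each combines the scores of its children.
OPERATORS = {"Min": min, "Max": max, "Average": mean}


def evaluate(node, votes):
    """Return the fused score of a node.

    A terminal is a string such as "T1"; a nonterminal is a tuple
    (operator_name, child, child, child).
    """
    if isinstance(node, str):
        return votes[node]
    op_name, *children = node
    if len(children) != 3:
        raise ValueError("each nonterminal must have exactly three children")
    return OPERATORS[op_name]([evaluate(child, votes) for child in children])


def classify(node, votes, threshold=0.5):
    """Map the fused score to a label (the 0.5 threshold is an assumption)."""
    return "positive" if evaluate(node, votes) >= threshold else "negative"


# Hypothetical layout with three nonterminals and seven terminals.
# It is NOT claimed to be the tree shown in Figure 2.
INDIVIDUAL = (
    "Min",
    ("Average", "T1", "T2", "T3"),
    ("Max", "T4", "T5", "T6"),
    "T7",
)

# T1, T2, and T5 vote negative (0.0); the other trees vote positive (1.0).
votes = {f"T{i}": 1.0 for i in range(1, 8)}
votes.update({"T1": 0.0, "T2": 0.0, "T5": 0.0})

print(classify(INDIVIDUAL, votes))
```

With this layout the Average node yields 1/3, the Max node yields 1.0, and the root Min returns 1/3, so the sample is labeled negative, matching the outcome described for the example. A different arrangement of the same operators and trees could give a different result for the same votes.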