A Highly Efficient Gene Expression Programming (GEP) Model for Auxiliary Diagnosis of Small Cell Lung Cancer.

Yu Z, Lu H, Si H, Liu S, Li X, Gao C, Cui L, Li C, Yang X, Yao X - PLoS ONE (2015)

Bottom Line: GEP successfully discriminated 281 out of 300 cases, showing a correct classification rate for lung cancer patients of 93.75% (225/240) and 93.33% (56/60) for the training and test sets, respectively. Another GEP model incorporating four biomarkers (CEA, NSE, LDH, and CRP) exhibited slightly lower detection sensitivity than the GEP model incorporating six biomarkers. We repeated the modeling with an artificial neural network (ANN), and the accuracy of the GEP models was higher than that of the ANN.


Affiliation: The Affiliated Hospital of Qingdao University, Department of Oncology, Qingdao, Shandong, P.R. China.

ABSTRACT

Background: Lung cancer is an important and common cancer that constitutes a major public health problem, but early detection of small cell lung cancer can significantly improve the survival rate of cancer patients. A number of serum biomarkers have been used in the diagnosis of lung cancers; however, they exhibit low sensitivity and specificity.

Methods: We used biochemical methods to measure blood levels of lactate dehydrogenase (LDH), C-reactive protein (CRP), Na+, Cl-, carcino-embryonic antigen (CEA), and neuron specific enolase (NSE) in 145 small cell lung cancer (SCLC) patients, 155 non-small cell lung cancer (NSCLC) patients, and 155 normal controls. A gene expression programming (GEP) model and Receiver Operating Characteristic (ROC) curves incorporating these biomarkers were developed for the auxiliary diagnosis of SCLC.

Results: After appropriate modification of the parameters, the GEP model was first built on a training set of 115 SCLC patients and 125 normal controls. The model was then applied to the remaining 60 subjects (the test set) for validation. GEP successfully discriminated 281 out of 300 cases, showing a correct classification rate for lung cancer patients of 93.75% (225/240) and 93.33% (56/60) for the training and test sets, respectively. Another GEP model incorporating four biomarkers (CEA, NSE, LDH, and CRP) exhibited slightly lower detection sensitivity than the GEP model incorporating six biomarkers. We repeated the modeling with an artificial neural network (ANN), and the accuracy of the GEP models was higher than that of the ANN. A GEP model incorporating the six serum biomarkers, applied to NSCLC patients and normal controls, showed lower accuracy than the model for SCLC patients, which was taken as sufficient evidence that the GEP model is suitable for SCLC patients.

Conclusion: We have developed a GEP model with high sensitivity and specificity for the auxiliary diagnosis of SCLC. This GEP model has the potential for wide use in the detection of SCLC in less developed regions.
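
The classification rates quoted in the Results can be re-derived from the counts stated there. The short Python sketch below only repeats that arithmetic; the variable names and the helper `rate_percent` are illustrative, and the pooled rate over all 300 cases is a derived figure that the abstract does not quote directly.

```python
# Counts as stated in the abstract.
train_total = 115 + 125      # SCLC patients + normal controls in the training set
test_total = 60              # remaining subjects used for validation
train_correct = 225
test_correct = 56


def rate_percent(correct, total):
    """Correct classification rate as a percentage, rounded to two decimals."""
    return round(100 * correct / total, 2)


train_rate = rate_percent(train_correct, train_total)
test_rate = rate_percent(test_correct, test_total)

# Pooled figure over both sets (derived here, not quoted in the abstract).
pooled_correct = train_correct + test_correct
pooled_total = train_total + test_total
pooled_rate = rate_percent(pooled_correct, pooled_total)
```

The training and test rates reproduce the reported 93.75% and 93.33%, and the pooled counts agree with the stated 281 out of 300 cases.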



pone.0125517.g002: The flowchart of the GEP modeling in this study.

Mentions: GEP is an evolutionary algorithm introduced by Ferreira in 2001[25] that emulates biological evolution in a computer program. It can be regarded, in some sense, as a natural development of genetic programming (GP) that preserves a few properties of genetic algorithms (GA)[36][37]. The GEP algorithm inherits the advantages of GA and GP while overcoming their disadvantages. In contrast to GP, the chromosomes in GEP are not represented as trees but as linear strings of fixed length, a feature taken from GA. Because individuals are encoded as simple fixed-length linear strings and then expressed as nonlinear tree structures, GEP can use a simple encoding to solve complicated nonlinear problems[38]. GEP chromosomes are composed of genes, each structurally organized into a head and a tail. The head may contain function elements such as {Q, +, −, ×, /} as well as terminal elements, where Q denotes the square-root function; the tail contains only terminals. The size of the tail (t) is computed as t = h(n − 1) + 1, where h is the length of the head and n is the maximum number of arguments required by any function in the function set[39]. Once the representation of each gene is given, the genotype is established; it is then converted into the phenotype, the expression tree (ET). The chromosome functions as the genome and is modified by means of mutation, transposition, root transposition, gene transposition, gene recombination, and one- and two-point recombination. The flowchart of a gene expression algorithm (GEA) is shown in Fig 2[24].
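
To make the head/tail encoding concrete, the following Python sketch decodes a single fixed-length gene into an expression tree and evaluates it. It is a minimal illustration under stated assumptions, not the model used in the study: the terminal names `a` to `d`, the example gene, the ASCII symbols `*` and `-` standing in for × and −, and the protected square root and division are all hypothetical choices, and the genetic operators (mutation, transposition, recombination) are omitted.

```python
import math
import operator

# Function set from the text; each entry is (arity, implementation).
# Protected sqrt and division are assumptions to keep evaluation total.
FUNCTIONS = {
    "Q": (1, lambda a: math.sqrt(abs(a))),
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, lambda a, b: a / b if b != 0 else 1.0),
}


def tail_length(head_length, max_arity):
    """Tail size t = h(n - 1) + 1 for head length h and maximum arity n."""
    return head_length * (max_arity - 1) + 1


def decode(gene):
    """Turn a linear gene into an expression tree, filling it level by level."""
    nodes = [[symbol, []] for symbol in gene]
    current = 0      # node whose children are assigned next
    next_free = 1    # next unused symbol in the gene
    while current < next_free:
        symbol, children = nodes[current]
        arity = FUNCTIONS[symbol][0] if symbol in FUNCTIONS else 0
        for _ in range(arity):
            children.append(nodes[next_free])
            next_free += 1
        current += 1
    return nodes[0]  # symbols past next_free are non-coding


def evaluate(node, values):
    """Evaluate an expression tree given a mapping of terminal values."""
    symbol, children = node
    if symbol in FUNCTIONS:
        _, fn = FUNCTIONS[symbol]
        return fn(*(evaluate(child, values) for child in children))
    return values[symbol]


# Hypothetical gene: head "Q*+-" (h = 4), tail of length 4 * (2 - 1) + 1 = 5.
head, tail = "Q*+-", "abcda"
gene = head + tail
tree = decode(gene)  # encodes sqrt((a + b) * (c - d)); the last "a" is unused
result = evaluate(tree, {"a": 3.0, "b": 1.0, "c": 6.0, "d": 2.0})
```

Because the tail is long enough to supply arguments even when every head position holds a two-argument function, any string of this shape decodes to a valid tree, which is why the genetic operators can modify the linear chromosome freely.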

