Lung cancer prediction using neural network ensemble with histogram of oriented gradient genomic features.

Adetiba E, Olugbara OO - ScientificWorldJournal (2015)

Bottom Line: The Voss DNA encoding was used to map the nucleotide sequences of mutated and normal genomes to obtain the equivalent numerical genomic sequences for training the selected classifiers. The histogram of oriented gradient (HOG) and local binary pattern (LBP) state-of-the-art feature extraction schemes were applied to extract representative genomic features from the encoded sequences of nucleotides. The ANN ensemble and HOG best fit the training dataset of this study with an accuracy of 95.90% and mean square error of 0.0159.


Affiliation: ICT and Society Research Group, Durban University of Technology, P.O. Box 1334, Durban 4000, South Africa.

ABSTRACT
This paper reports an experimental comparison of artificial neural network (ANN) and support vector machine (SVM) ensembles and their "nonensemble" variants for lung cancer prediction. These machine learning classifiers were trained to predict lung cancer using samples of patient nucleotides with mutations in the epidermal growth factor receptor, Kirsten rat sarcoma viral oncogene, and tumor suppressor p53 genomes collected as biomarkers from the IGDB.NSCLC corpus. The Voss DNA encoding was used to map the nucleotide sequences of mutated and normal genomes to obtain the equivalent numerical genomic sequences for training the selected classifiers. The histogram of oriented gradient (HOG) and local binary pattern (LBP) state-of-the-art feature extraction schemes were applied to extract representative genomic features from the encoded sequences of nucleotides. The ANN ensemble and HOG best fit the training dataset of this study with an accuracy of 95.90% and mean square error of 0.0159. The result of the ANN ensemble and HOG genomic features is promising for automated screening and early detection of lung cancer. This will hopefully assist pathologists in administering targeted molecular therapy and offering counsel to early-stage lung cancer patients and persons in at-risk populations.
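
The Voss mapping mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard Voss representation, in which a nucleotide sequence becomes four binary indicator sequences, one per nucleotide. The function name `voss_encode` and the row order A, C, G, T are illustrative choices and are not taken from the authors' implementation.

```python
# Assumed row order for the four indicator sequences (illustrative choice).
NUCLEOTIDES = "ACGT"


def voss_encode(sequence):
    """Return four binary indicator sequences, one per nucleotide.

    Row i holds a 1 at every position where the sequence contains
    NUCLEOTIDES[i] and a 0 elsewhere. Symbols outside A, C, G, T
    (for example N) map to 0 in every row.
    """
    seq = sequence.upper()
    return [[1 if symbol == base else 0 for symbol in seq]
            for base in NUCLEOTIDES]


if __name__ == "__main__":
    for base, row in zip(NUCLEOTIDES, voss_encode("ACGTTA")):
        print(base, row)
    # A [1, 0, 0, 0, 0, 1]
    # C [0, 1, 0, 0, 0, 0]
    # G [0, 0, 1, 0, 0, 0]
    # T [0, 0, 0, 1, 1, 0]
```

The resulting 4 x N binary matrix is the kind of numerical sequence on which a feature extractor such as HOG or LBP could then operate.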


© Copyright Policy - open-access
Figure 3: Time domain plot of HOG features for the first samples of EGFR deletion, EGFR substitution, KRAS substitution, TP53 substitution, and TP53 deletion mutations.

Mentions: The foregoing HOG algorithmic steps were implemented in MATLAB R2012a. Using the results obtained from the code, we plotted the time domain graph of the first sample in each of the classes in our experimental dataset, as shown in Figure 3. The graph shows visibly distinct patterns for the different classes of mutations in our training dataset, which is strong evidence of the discriminatory power of the HOG descriptor. Our second objective, using Voss mapping to encode the sequences and HOG to extract representative genomic features, has therefore been realized with the procedures discussed in Section 2.2 and this section. Apart from the first application of the HOG descriptor for human recognition by Dalal and Triggs [26], the method has also been used with good results in domains as diverse as activity recognition [28, 47], pedestrian detection [48], and speaker classification [49]. To automate the classification of the different patterns (mutation classes) captured by the HOG feature vectors in this work, we designed and trained ensemble and nonensemble artificial neural networks and support vector machines.
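
The core idea of a HOG feature vector can be sketched in a few lines. This is a simplified illustration, not the authors' MATLAB implementation: it treats a small numeric matrix (for example, a Voss-encoded sequence) as a single cell, and it omits the cell grid, block normalization, and bin interpolation of the full Dalal and Triggs descriptor. The function name `orientation_histogram` and the choice of 9 unsigned orientation bins are assumptions made for the example.

```python
import math


def orientation_histogram(matrix, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    Simplified, single-cell variant of HOG: gradients are centred
    differences with edge replication, orientations fall in [0, 180)
    degrees, and the histogram is L2-normalized.
    """
    rows, cols = len(matrix), len(matrix[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for r in range(rows):
        for c in range(cols):
            # Horizontal and vertical centred differences.
            gx = matrix[r][min(c + 1, cols - 1)] - matrix[r][max(c - 1, 0)]
            gy = matrix[min(r + 1, rows - 1)][c] - matrix[max(r - 1, 0)][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat region contributes nothing
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle // bin_width), n_bins - 1)
            hist[index] += magnitude
    norm = math.sqrt(sum(value * value for value in hist))
    return [value / norm for value in hist] if norm > 0 else hist


if __name__ == "__main__":
    # Toy 4 x 6 binary matrix standing in for an encoded sequence.
    toy = [
        [1, 0, 0, 0, 0, 1],
        [0, 1, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0],
    ]
    print(orientation_histogram(toy))
```

Plotting such a feature vector against its bin index gives a curve per sample. A plot like Figure 3 compares curves of this kind across mutation classes, although the feature vectors in the paper come from the full HOG procedure.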

