Screening of feature genes in distinguishing different types of breast cancer using support vector machine.

Wang Q, Liu X - Onco Targets Ther (2015)

Bottom Line: Feature genes obtained by the SVM classifier were subjected to function and pathway enrichment analysis via the Database for Annotation, Visualization and Integrated Discovery and the KEGG Orthology Based Annotation System, respectively. The SVM classifier showed that these genes could distinguish the two subtypes with an accuracy greater than 90%, with good sensitivity, specificity, positive and negative predictive values, and area under the receiver operating characteristic curve. The gene-expression profile data can provide feature genes to distinguish ER+ and ER- samples, and the identified genes can be used as biomarkers for ER+ samples.


Affiliation: Department of Emergency Surgery, Affiliated Hospital of Inner Mongolia Medical University, Hohhot, People's Republic of China.

ABSTRACT

Objective: To screen the feature genes in estrogen receptor-positive (ER+) breast cancer in comparison with estrogen receptor-negative (ER-) breast cancer.

Methods: Nine microarray datasets of ER+ and ER- breast cancer samples were collected from the Gene Expression Omnibus database. After preprocessing, the data in five training sets were analyzed using significance analysis of microarrays to screen for differentially expressed genes (DEGs). The DEGs were then used to construct a support vector machine (SVM) classifier with the SVM function in the e1071 package of R. The efficacy of the classifier was verified using leave-one-out cross-validation on four testing sets and on their combination with the training sets. Feature genes obtained by the SVM classifier were subjected to function and pathway enrichment analysis via the Database for Annotation, Visualization and Integrated Discovery and the KEGG Orthology Based Annotation System, respectively.
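
The leave-one-out cross-validation step can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the study used the SVM function of the R package e1071, whereas the sketch below swaps in a simple nearest-centroid classifier so that it runs with the standard library alone. The function names and the toy two-gene expression values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier (NOT an SVM): return the label whose class
    centroid is closest to `sample` in squared Euclidean distance."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        # Column-wise mean of all training rows with this label.
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return min(centroids, key=lambda label: sq_dist(centroids[label], sample))


def leave_one_out_accuracy(xs, ys, predict=nearest_centroid_predict):
    """Hold out each sample once, fit on the rest, and score the held-out one."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical normalized expression values for two genes (toy data only).
toy_x = [[5.1, 0.9], [4.8, 1.2], [5.3, 1.0], [1.0, 4.9], [1.2, 5.2], [0.8, 5.0]]
toy_y = ["ER+", "ER+", "ER+", "ER-", "ER-", "ER-"]

loo_acc = leave_one_out_accuracy(toy_x, toy_y)
print(f"Leave-one-out accuracy on toy data: {loo_acc:.2f}")
```

Any classifier with the same `predict(train_x, train_y, sample)` shape could be passed in place of the stand-in, which is where an SVM would be plugged in under this sketch.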

Results: A total of 526 DEGs were screened between ER+ and ER- breast cancer. The SVM classifier showed that these genes could distinguish the two subtypes with an accuracy greater than 90%, with good sensitivity, specificity, positive and negative predictive values, and area under the receiver operating characteristic curve. Inflammatory and hormone-related biological processes were enriched in both function analyses, indicating that inflammatory genes (eg, IL8) and hormone regulation genes (eg, CGA) may be among the feature genes that distinguish ER+ from ER- breast cancer.

Conclusion: The gene-expression profile data can provide feature genes to distinguish ER+ and ER- samples, and the identified genes can be used as biomarkers for ER+ samples.


Figure 1: Classification of three sample datasets by the constructed support vector machine classifier. Notes: (A) 626 samples for training; (B) 663 samples for testing; (C) 1,289 combined samples for testing. (Aa, Ba, and Ca) indicate the sample distribution for ER+ and ER−. (Ab, Bb, and Cb) indicate the scatterplot of the classification, in which black dots represent ER− and red dots represent ER+ breast cancer samples. Abbreviations: ER+, estrogen receptor-positive; ER−, estrogen receptor-negative.

Mentions: Using the normalized expression values of the DEGs, an SVM classifier was constructed (Figure 1), and its accuracy was then assessed. For the training, testing, and combined datasets, two (one ER− and one ER+), 29 (16 ER− and 13 ER+), and 22 (seven ER− and 15 ER+) samples, respectively, were misclassified by the SVM classifier. Nevertheless, the accuracies all exceeded 90% (99.7% [99.5% for ER− and 99.8% for ER+], 95.6% [90.6% for ER− and 97.3% for ER+], and 98.2% [98.1% for ER− and 98.4% for ER+], respectively), indicating the reliability of the classifier. Moreover, the sensitivity (Se), specificity (Sp), positive predictive value, negative predictive value, and area under the receiver operating characteristic curve showed that the SVM classifier distinguished not only the training datasets but also the testing datasets well (Table 2; Figure 2).
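
As a rough check on how these measures relate, the following Python sketch derives overall accuracy from a misclassification count and computes sensitivity, specificity, and the predictive values from a 2×2 confusion matrix. The sample sizes (626 training, 663 testing) and error counts (2 and 29) are taken from the text and figure caption above. The confusion-matrix counts are hypothetical, because the per-class sample totals are not given here, and the function names are illustrative only.

```python
def accuracy_from_errors(n_samples, n_misclassified):
    """Overall accuracy given a total sample count and a count of errors."""
    return 1 - n_misclassified / n_samples


def binary_metrics(tp, fn, tn, fp):
    """Standard diagnostic measures from a 2x2 confusion matrix.

    tp/fn: positive-class samples classified correctly/incorrectly.
    tn/fp: negative-class samples classified correctly/incorrectly.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Sample sizes and misclassification counts as reported in the text.
reported = {"training": (626, 2), "testing": (663, 29)}
for name, (n, wrong) in reported.items():
    print(f"{name}: {accuracy_from_errors(n, wrong):.1%}")

# Hypothetical counts, used only to show the formulas (not the study's data).
example = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print(example)
```

Under these formulas, the reported training and testing error counts give about 99.7% and 95.6%, matching the accuracies quoted in the text.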

