Predicting breast cancer using an expression values weighted clinical classifier.

Thomas M, De Brabanter K, Suykens JA, De Moor B - BMC Bioinformatics (2014)

Bottom Line: Studies that build a single prediction model from both gene expression and clinical data often remain inconclusive about whether prediction performance improves. Building on the benefits of LS-SVM classifiers and generalized eigenvalue/singular value decompositions, we propose a machine learning approach, a weighted LS-SVM classifier, to integrate two data sources: microarray and clinical parameters. A clinical classifier weighted with microarray data thus results in significantly improved diagnosis, prognosis and prediction of response to therapy.


Affiliation: KU Leuven, Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics/iMinds Future Health Department, Kasteelpark Arenberg 10, Leuven, 3001, Belgium. minta.thomas@esat.kuleuven.be.

ABSTRACT

Background: Clinical data, such as patient history, laboratory analysis and ultrasound parameters, are the basis of day-to-day clinical decision support and are often used to guide the clinical management of cancer in the presence of microarray data. Several data fusion techniques are available to integrate genomics or proteomics data, but only a few studies have created a single prediction model using both gene expression and clinical data. These studies often remain inconclusive about whether prediction performance improves. To improve clinical management, these data should be fully exploited, which requires efficient algorithms to integrate the data sets and design a final classifier. LS-SVM classifiers and generalized eigenvalue/singular value decompositions are used successfully in many bioinformatics applications for prediction tasks. Building on the benefits of these two techniques, we propose a machine learning approach, a weighted LS-SVM classifier, to integrate two data sources: microarray and clinical parameters.
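For readers unfamiliar with LS-SVM classifiers, the following is a minimal sketch of the standard (unweighted) LS-SVM classifier with an RBF kernel, which replaces the quadratic program of a classical SVM with a single linear system. It is not the weighted classifier proposed in the article. The function names, the toy data, and the values of gamma and sigma are assumptions chosen for illustration only.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM linear system for the bias b and the multipliers alpha.

    y must contain labels in {-1, +1}. gamma is the regularization constant.
    """
    n = len(y)
    Omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y                      # first row enforces sum(alpha * y) = 0
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_decision(X_train, y_train, alpha, b, X_new, sigma=1.0):
    """Latent decision values; their sign gives the predicted class."""
    return rbf_kernel(X_new, X_train, sigma) @ (alpha * y_train) + b

# Toy data (hypothetical): two well-separated clusters in two dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, size=(20, 2)),
               rng.normal(2.0, 0.5, size=(20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])

b, alpha = lssvm_train(X, y)
pred = np.sign(lssvm_decision(X, y, alpha, b, X))
```

The latent decision values, rather than their signs, are what one would rank when computing an AUC.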

Results: We compared and evaluated the proposed methods on five breast cancer case studies. Compared with LS-SVM classifiers on the individual data sets, generalized eigenvalue decomposition (GEVD) and kernel GEVD, the proposed weighted LS-SVM classifier offers good prediction performance, in terms of test area under the ROC curve (AUC), on all breast cancer case studies.
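As a reminder of what the test AUC measures, here is a minimal sketch that computes the area under the ROC curve as the fraction of (positive, negative) pairs that the classifier ranks correctly, counting ties as one half. The function name and the example labels and scores are illustrative assumptions and are not taken from the article.

```python
import numpy as np

def auc_score(y_true, scores):
    """AUC as the probability that a positive case outranks a negative case."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]         # scores of the positive class
    neg = scores[y_true != 1]         # scores of everything else
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (len(pos) * len(neg))

# Hypothetical example: three of the four positive/negative pairs are ordered correctly.
example_auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.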

Conclusions: A clinical classifier weighted with microarray data thus results in significantly improved diagnosis, prognosis and prediction of response to therapy. The proposed model is a promising mathematical framework for both data fusion and non-linear classification problems.


Figure 2: Comparison of the prediction accuracy of the classifiers. Boxplots of the test AUC values obtained in 100 repetitions for 5 breast cancer cases. (a) Case I (b) Case II (c) Case III (d) Case IV (e) Case V.

Mentions: The optimal parameters of the kernel GEVD (the bandwidths of the clinical and microarray kernels) are selected using leave-one-out cross-validation (LOO-CV) performance. We applied kernel GEVD to the microarray and clinical kernels and then obtained the scores by projecting the clinical kernels onto the direction of the kernel GEV. As with GEVD, the LS-SVM model is trained and validated on the scores corresponding to the training set and test set, respectively. High-throughput data such as microarray data are used only for model development. The results show that considering the two data sets in a single framework improves prediction performance over the individual data sets. In addition, kernel GEVD significantly improves classification performance over GEVD. The results of the five case studies are shown in Table 2 and Figure 2. We represent the expression and clinical data with kernel matrices based on the RBF kernel function. The RBF kernel function transforms these data, which have diverse structures, into kernel matrices of the same size.
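The point about kernel matrices of the same size, and the projection step, can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the data are random, the numbers of clinical variables and genes are made up, the bandwidths are fixed rather than tuned by LOO-CV, and the generalized eigenvalue problem is posed as K_clin v = lambda (K_expr + eps I) v with a small ridge term eps added for numerical stability. The exact formulation used in the article may differ.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

rng = np.random.default_rng(0)
n = 30                                    # patients (hypothetical)
X_clin = rng.normal(size=(n, 8))          # 8 clinical variables (hypothetical)
X_expr = rng.normal(size=(n, 500))        # 500 expression values (hypothetical)

# Different feature counts, yet both kernels are n x n.
K_clin = rbf_kernel(X_clin, X_clin, sigma=np.sqrt(8.0))
K_expr = rbf_kernel(X_expr, X_expr, sigma=np.sqrt(500.0))

# Assumed problem: K_clin v = lam * B v, with B = K_expr + eps * I.
eps = 1e-3
B = K_expr + eps * np.eye(n)

# Reduce to a symmetric standard eigenproblem via the Cholesky factor of B.
L = np.linalg.cholesky(B)
C = np.linalg.solve(L, np.linalg.solve(L, K_clin).T)   # L^-1 K_clin L^-T
C = 0.5 * (C + C.T)                                    # remove round-off asymmetry
lam, U = np.linalg.eigh(C)                             # ascending eigenvalues
V_all = np.linalg.solve(L.T, U)                        # generalized eigenvectors

# Keep the two leading directions and project the clinical kernel onto them.
lam_top = lam[-2:]
V = V_all[:, -2:]
scores = K_clin @ V                                    # n x 2 score matrix
```

In a setting like the one described, a classifier would then be trained on the rows of such a score matrix for the training patients and evaluated on the rows for the test patients.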

