Network and data integration for biomarker signature discovery via network smoothed T-statistics.

Cun Y, Fröhlich H - PLoS ONE (2013)

Bottom Line: We here propose a technique that integrates network information as well as different kinds of experimental data (here exemplified by mRNA and miRNA expression) into one classifier. Moreover, obtained gene lists can be clearly associated to biological knowledge, such as known disease genes and KEGG pathways. We demonstrate that our data integration strategy can improve classification performance compared to using a single data source only.


Affiliation: Algorithmic Bioinformatics, Bonn-Aachen International Center for IT, Bonn, Germany.

ABSTRACT
Predictive, stable and interpretable gene signatures are generally seen as an important step towards better personalized medicine. Over the last decade, various methods have been proposed for that purpose. However, one important obstacle to making gene signatures a standard tool in clinics is the typically low reproducibility of signatures, combined with the difficulty of achieving a clear biological interpretation. For this reason, recent years have seen a growing interest in approaches that integrate information from molecular interaction networks. Here we propose a technique that integrates network information as well as different kinds of experimental data (here exemplified by mRNA and miRNA expression) into one classifier. This is done by smoothing t-statistics of individual genes or miRNAs over the structure of a combined protein-protein interaction (PPI) and miRNA-target gene network. A permutation test is conducted to select features in a highly consistent manner, and subsequently a Support Vector Machine (SVM) classifier is trained. Compared with several competing methods, our algorithm shows overall better prediction performance for early versus late disease relapse and higher signature stability. Moreover, the obtained gene lists can be clearly associated with biological knowledge, such as known disease genes and KEGG pathways. We demonstrate that our data integration strategy can improve classification performance compared to using a single data source only. Our method, called stSVM, is available in the R package netClass on CRAN (http://cran.r-project.org).
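
The feature selection steps named in the abstract (per-feature t-statistics, smoothing over a network, permutation test) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the netClass implementation: the abstract does not specify the smoothing operator, so a simple neighbour-averaging diffusion stands in for it, and the use of absolute t-values, the parameters `alpha` and `steps`, all function names and the toy data are hypothetical. The final SVM training step is omitted because it would require an external library.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch two-sample t-statistic for one feature (0.0 if no variation)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def t_statistics(expr, labels):
    """expr maps feature name -> list of values (one per sample); labels are 0/1."""
    stats = {}
    for feat, values in expr.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        stats[feat] = welch_t(pos, neg)
    return stats


def smooth(scores, edges, alpha=0.5, steps=1):
    """Assumed stand-in smoother: mix each |t| with the mean |t| of its neighbours."""
    neighbours = {f: set() for f in scores}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    s = {f: abs(v) for f, v in scores.items()}
    for _ in range(steps):
        nxt = {}
        for f, val in s.items():
            if neighbours[f]:
                nb_mean = mean(s[n] for n in neighbours[f])
                nxt[f] = (1 - alpha) * val + alpha * nb_mean
            else:
                nxt[f] = val  # isolated nodes keep their own score
        s = nxt
    return s


def permutation_pvalues(expr, labels, edges, n_perm=200, seed=0):
    """Empirical p-value per feature from label permutations of the smoothed score."""
    rng = random.Random(seed)
    observed = smooth(t_statistics(expr, labels), edges)
    exceed = {f: 0 for f in observed}
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        perm = smooth(t_statistics(expr, shuffled), edges)
        for f in observed:
            if perm[f] >= observed[f]:
                exceed[f] += 1
    return {f: (1 + exceed[f]) / (1 + n_perm) for f in observed}


# Hypothetical toy data: three connected features that separate the classes
# and one isolated feature (geneC) that carries no class information.
expr = {
    "geneA": [5.1, 4.9, 5.3, 5.0, 1.0, 1.2, 0.9, 1.1],
    "geneB": [4.0, 4.2, 3.9, 4.1, 2.0, 2.1, 1.9, 2.2],
    "miR-1": [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.2],
    "geneC": [2.0, 2.1, 1.9, 2.2, 2.0, 2.1, 1.9, 2.2],
}
labels = [1, 1, 1, 1, 0, 0, 0, 0]
edges = [("geneA", "geneB"), ("miR-1", "geneA")]  # PPI edge and miRNA-target edge

pvals = permutation_pvalues(expr, labels, edges)
selected = sorted(f for f, p in pvals.items() if p < 0.1)
print(selected)
```

The features in `selected` would then be passed to an SVM. Mixing mRNA and miRNA nodes in one edge list mirrors the idea of a combined PPI and miRNA-target gene network, so both data types are scored and selected in a single pass.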


Prediction performance of stSVM on integrated gene and miRNA expression data compared to other approaches.

pone-0073074-g004: Prediction performance of stSVM on integrated gene and miRNA expression data compared to other approaches.

Mentions: The comparison of our stSVM(mi-mRNA) approach with the graph fusion algorithm and with the meta-classifier approach described above (sgSVM(meta)) revealed a superior performance of our method. GraphFusion was outperformed by a large margin (Figure 4), and the gain compared to sgSVM(meta) was still weakly to moderately significant ( for ovarian and for prostate cancer; Wilcoxon signed rank test). In that context it was interesting that a significant improvement from integrating mRNA and miRNA data could be observed only on the prostate cancer dataset: the comparison of stSVM(meta) versus stSVM yielded a p-value of (Wilcoxon signed rank test). On the ovarian cancer dataset, miRNA expression data did not appear to contribute any useful classification information. This is also highlighted by the weak performance of the sgSVM classifier trained only on miRNA expression data (sgSVM(miRNA)).
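
The paired comparisons above rely on the Wilcoxon signed rank test. As a rough sketch of how such a comparison between two classifiers could be computed, the following pure-Python function derives an exact two-sided p-value from paired performance values (for example, one value per cross-validation repeat). The function name and the numbers in the usage lines are hypothetical and are not results from the paper; ranks are doubled internally so that tied ranks stay integers.

```python
def signed_rank_test(x, y):
    """Exact two-sided Wilcoxon signed rank test for paired samples x and y.

    Returns (W_plus, p_value). Zero differences are dropped; tied absolute
    differences receive their average rank.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))

    # Twice the average rank of each difference (keeps everything integer).
    ranks2 = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks2[order[k]] = i + j + 2  # 2 * mean of ranks i+1 .. j+1
        i = j + 1

    w_plus2 = sum(r for r, d in zip(ranks2, diffs) if d > 0)
    total2 = sum(ranks2)

    # Null distribution: every rank is positive or negative with probability 1/2.
    counts = {0: 1}
    for r in ranks2:
        nxt = {}
        for s, c in counts.items():
            nxt[s] = nxt.get(s, 0) + c
            nxt[s + r] = nxt.get(s + r, 0) + c
        counts = nxt

    stat2 = min(w_plus2, total2 - w_plus2)
    lower = sum(c for s, c in counts.items() if s <= stat2)
    p_value = min(1.0, 2 * lower / 2 ** n)
    return w_plus2 / 2, p_value


# Hypothetical AUC values from repeated cross-validation (illustrative only).
auc_integrated = [0.74, 0.71, 0.78, 0.69, 0.75, 0.73, 0.77, 0.72]
auc_single = [0.70, 0.72, 0.71, 0.66, 0.70, 0.69, 0.70, 0.68]
w_plus, p = signed_rank_test(auc_integrated, auc_single)
print(w_plus, p)
```

The test is paired because both classifiers are evaluated on the same data splits, so only the per-split differences enter the statistic.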

