Identifying module biomarker in type 2 diabetes mellitus by discriminative area of functional activity.

Zhang X, Gao L, Liu ZP, Chen L - BMC Bioinformatics (2015)

Bottom Line: This module biomarker is enriched with known causal genes and related functions of T2DM. Further analysis shows that the module biomarker is of superior performance in classification, and has consistently high accuracies across tissues and experiments. The proposed approach can efficiently identify robust and functionally meaningful module biomarkers in T2DM, and could be employed in biomarker discovery of other complex diseases characterized by expression profiles.


Affiliation: School of Computer Science and Technology, Xidian University, Xi'an, 710000, China. zxd841@163.com.

ABSTRACT

Background: Identifying diagnostic and prognostic biomarkers from expression profiling data is of great significance for achieving personalized medicine and designing therapeutic strategies in complex diseases. However, the reproducibility of identified biomarkers across tissues and experiments remains a challenge.

Results: We propose a strategy based on the discriminative area of module activities to identify gene biomarkers that interconnect as a subnetwork, or module, by integrating gene expression data and protein-protein interactions. We then apply the procedure to type 2 diabetes mellitus (T2DM) as a case study and identify a module biomarker of 32 genes from mRNA expression data in skeletal muscle. This module biomarker is enriched with known causal genes and related functions of T2DM. Further analysis shows that the module biomarker achieves superior classification performance and consistently high accuracy across tissues and experiments.

Conclusion: The proposed approach can efficiently identify robust and functionally meaningful module biomarkers in T2DM, and could be employed in biomarker discovery of other complex diseases characterized by expression profiles.

Figure 4: Performance analysis of the identified module biomarker. (A) Robustness of classification accuracy on perturbed data with different ratios of artificial noise. The mean accuracy of the proposed classifier decreases progressively from 84.02% to 73.26% as the noise ratio increases from 1% to 10%. (B) Comparison of biomarkers identified by different methods in GSE18732. The ROC curves show superior classification performance for the module biomarker identified in this work (AUC = 0.96). (C) Histogram of mean accuracy with variance for biomarkers identified by our method, SVM-RFE and PAC. We also randomized the interactions of the background network (PPIs) 50 times and identified a module biomarker using the proposed method; mean accuracy and variance were then calculated by 10-fold cross-validation across the 5 datasets used in this work. The results show stable performance across tissues for the identified biomarkers.

To avoid over-fitting of the classifier, we employed 10-fold cross-validation and randomly changed a certain percentage of class attributes in the training dataset as artificial noise, repeating this 100 times. The confidence interval was used to measure correlations between artificial noise and classification accuracy. We used GSE18732 as a case study because it has enough instances. The results show that the identified module biomarker maintains a relatively high mean accuracy as the percentage of artificial noise increases from 1% to 10%, which implies that the classifier induced by the identified module biomarker is robust (Figure 4A).
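The noise-robustness check described above can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier, the one-dimensional synthetic "activity" scores, and all function names (`flip_labels`, `noisy_cv_accuracy`, `mean_accuracy`) are assumptions made for illustration, and the GSE18732 data are not reproduced. The sketch also assumes that "100 times" means 100 independent repetitions of the label perturbation, each evaluated by 10-fold cross-validation against the clean held-out labels.

```python
import random
import statistics


def flip_labels(labels, ratio, rng):
    """Return a copy of binary labels with round(ratio * n) entries flipped."""
    noisy = list(labels)
    n_flip = round(ratio * len(noisy))
    for i in rng.sample(range(len(noisy)), n_flip):
        noisy[i] = 1 - noisy[i]
    return noisy


def fit_centroids(xs, ys):
    """Stand-in classifier (an assumption): mean feature value per class."""
    return {c: statistics.fmean(x for x, y in zip(xs, ys) if y == c)
            for c in (0, 1)}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))


def noisy_cv_accuracy(xs, ys, ratio, n_folds=10, seed=0):
    """One k-fold run: noise is added to training labels only;
    accuracy is measured against the clean held-out labels."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        train_x = [xs[i] for i in train_idx]
        train_y = flip_labels([ys[i] for i in train_idx], ratio, rng)
        model = fit_centroids(train_x, train_y)
        correct += sum(predict(model, xs[i]) == ys[i] for i in test_idx)
    return correct / len(xs)


def mean_accuracy(xs, ys, ratio, repeats=100):
    """Average cross-validated accuracy over repeated random perturbations."""
    return statistics.fmean(
        noisy_cv_accuracy(xs, ys, ratio, seed=r) for r in range(repeats)
    )


# Synthetic two-class data standing in for per-sample module activity scores.
data_rng = random.Random(42)
ys = [0] * 50 + [1] * 50
xs = [data_rng.gauss(2.0 * y, 1.0) for y in ys]

for ratio in (0.01, 0.05, 0.10):
    print(f"noise {ratio:.0%}: mean accuracy {mean_accuracy(xs, ys, ratio):.3f}")
```

Perturbing only the training labels keeps the evaluation honest: a drop in held-out accuracy then reflects how much the fitted classifier is misled by the noise, not noise in the test labels themselves.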

