Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset.

Bauer C, Kleinjung F, Smith CJ, Towers MW, Tiss A, Chadt A, Dreja T, Beule D, Al-Hasani H, Reinert K, Schuchhardt J, Cramer R - BMC Bioinformatics (2011)

Bottom Line: The combination of ANOVA and redundancy exploitation allows for identification of biomarker candidates in multi-dimensional MALDI-TOF MS profiling studies with complex experimental design. With respect to feature selection our method provides a fast and intuitive alternative to global optimization strategies with comparable performance. The method is implemented in R and the scripts are available by contacting the corresponding author.


Affiliation: MicroDiscovery GmbH, Marienburger Str. 1, 10405 Berlin, Germany. chris.bauer@microdiscovery.de

ABSTRACT

Background: Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multi-factorial studies with complex experimental design are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy, such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for bioinformatics.

Results: We present a comprehensive work-flow tailored for analyzing complex data, including data from multi-factorial studies. The developed approach aims at revealing effects caused by a distinct combination of experimental factors, in our case genotype and diet. Applying the developed work-flow to the analysis of an established polygenic mouse model for diet-induced type 2 diabetes, we found peptides with significant fold changes exclusively for the combination of a particular strain and diet. Exploitation of redundancy enables the visualization of peptide correlation and provides a natural way of feature selection for classification and prediction. Classification based on the features selected using our approach performs similarly to classification based on more complex feature selection methods.

Conclusions: The combination of ANOVA and redundancy exploitation allows for identification of biomarker candidates in multi-dimensional MALDI-TOF MS profiling studies with complex experimental design. With respect to feature selection our method provides a fast and intuitive alternative to global optimization strategies with comparable performance. The method is implemented in R and the scripts are available by contacting the corresponding author.
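To illustrate what an effect "exclusively for the combination of a particular strain and diet" means statistically, the following is a minimal sketch of the interaction F statistic in a balanced two-way ANOVA for a single peptide. It is not the authors' R implementation: the function name `interaction_f`, the level labels, and the intensity values are hypothetical, and a balanced design with at least two replicates per cell is assumed. Converting F to a p-value needs an F distribution, which is left out here to keep the example within the Python standard library.

```python
from statistics import fmean


def interaction_f(cells):
    """Return the genotype x diet interaction F statistic for one peptide.

    cells maps (genotype, diet) -> list of intensities.
    Assumes a balanced design: every cell has the same number (>= 2) of replicates.
    """
    genotypes = sorted({g for g, _ in cells})
    diets = sorted({d for _, d in cells})
    n = len(next(iter(cells.values())))
    if n < 2 or any(len(v) != n for v in cells.values()):
        raise ValueError("balanced design with >= 2 replicates per cell required")

    # Cell, marginal, and grand means.
    cell_mean = {k: fmean(v) for k, v in cells.items()}
    g_mean = {g: fmean([cell_mean[(g, d)] for d in diets]) for g in genotypes}
    d_mean = {d: fmean([cell_mean[(g, d)] for g in genotypes]) for d in diets}
    grand = fmean(list(cell_mean.values()))

    # Interaction sum of squares: what the two main effects cannot explain.
    ss_inter = n * sum(
        (cell_mean[(g, d)] - g_mean[g] - d_mean[d] + grand) ** 2
        for g in genotypes
        for d in diets
    )
    # Residual (within-cell) sum of squares.
    ss_error = sum((x - cell_mean[k]) ** 2 for k, v in cells.items() for x in v)

    df_inter = (len(genotypes) - 1) * (len(diets) - 1)
    df_error = len(genotypes) * len(diets) * (n - 1)
    return (ss_inter / df_inter) / (ss_error / df_error)


# Hypothetical peptide that changes only for strain B on the high-fat diet.
example = {
    ("strainA", "standard"): [1.0, 1.2, 0.8],
    ("strainA", "high_fat"): [1.1, 0.9, 1.0],
    ("strainB", "standard"): [1.0, 1.1, 0.9],
    ("strainB", "high_fat"): [3.0, 3.2, 2.8],
}
print(interaction_f(example))
```

A large interaction F indicates that the diet effect depends on the genotype, which is the pattern the Results describe; a peptide whose genotype and diet effects merely add up gives an F near zero.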

Figure 1: Work-flow. Complete work-flow of the cluster-based ANOVA approach with feature selection for multi-factorial MALDI MS profiling data in biomarker discovery.

Mentions: Although many approaches have been developed for biomarker identification from MALDI MS profile data, only a few studies have assessed the influence of correlation in these datasets [27]. As correlation within large MS data sets can confound statistical analyses, we developed statistical methods that exploit data correlation and integrated them into a comprehensive work-flow designed for the analysis of multi-factorial experimental MALDI-TOF MS data. By merging similarity and significance information, our approach allows complex biological data to be interpreted in an intuitive manner. The soundness of the statistical methods is demonstrated, and a special plot is provided for easy visualization and understanding. Furthermore, the presented methods provide a natural way of feature selection for classification and prediction. The complete work-flow of the analysis is shown in Figure 1.
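The idea of merging similarity and significance information for feature selection can be sketched as follows. This is only an illustrative stand-in, not the clustering procedure used in the paper: it assumes a simple greedy grouping by Pearson correlation, where the most significant peptide becomes the representative of each group. The names `pearson` and `select_representatives`, the threshold `r_min`, and all data values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation information
    return sxy / sqrt(sxx * syy)


def select_representatives(profiles, p_values, r_min=0.9):
    """Greedily group correlated peptides around the most significant one.

    profiles: peptide name -> intensities across samples
    p_values: peptide name -> p-value from a preceding test (e.g. ANOVA)
    Returns a dict: representative peptide -> list of redundant peptides.
    """
    remaining = sorted(profiles, key=lambda name: p_values[name])
    clusters = {}
    while remaining:
        rep = remaining.pop(0)  # most significant peptide still unassigned
        members = [
            m for m in remaining
            if pearson(profiles[rep], profiles[m]) >= r_min
        ]
        clusters[rep] = members
        remaining = [m for m in remaining if m not in members]
    return clusters


# Hypothetical data: pep2 closely tracks pep1, pep3 behaves differently.
profiles = {
    "pep1": [1.0, 2.0, 3.0, 4.0],
    "pep2": [2.0, 4.0, 6.0, 8.1],
    "pep3": [4.0, 1.0, 3.0, 2.0],
}
p_values = {"pep1": 0.001, "pep2": 0.01, "pep3": 0.02}
print(select_representatives(profiles, p_values))
```

The keys of the returned dictionary form a reduced feature set with one peptide per correlated group, which could then be passed to a classifier in place of the full, redundant peptide list.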

