A Markov classification model for metabolic pathways.

Hancock T, Mamitsuka H - Algorithms Mol Biol (2010)

Bottom Line: The results clearly show that HME3M outperformed the comparison methods in the presence of increasing network complexity and pathway noise. This paper clearly shows HME3M to be an accurate and robust method for classifying metabolic pathways. HME3M is shown to outperform all comparison methods and further is capable of identifying known biologically active pathways within microarray data.

Affiliation: Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan. timhancock@kuicr.kyoto-u.ac.jp

ABSTRACT

Background: This paper considers the problem of identifying pathways through metabolic networks that relate to a specific biological response. Our proposed model, HME3M, first identifies frequently traversed network paths using a Markov mixture model. Then, by employing a hierarchical mixture of experts, separate classifiers are built using information specific to each path and combined into an ensemble prediction for the response.

Results: We compared the performance of HME3M with logistic regression and support vector machines (SVM) on both simulated pathways and two metabolic networks, glycolysis and the pentose phosphate pathway, for Arabidopsis thaliana. We use AltGenExpress microarray data and focus on the pathway differences in the developmental stages and stress responses of Arabidopsis. The results clearly show that HME3M outperformed the comparison methods in the presence of increasing network complexity and pathway noise. Furthermore, an analysis of the paths identified by HME3M for each metabolic network confirmed known biological responses of Arabidopsis.

Conclusions: This paper clearly shows HME3M to be an accurate and robust method for classifying metabolic pathways. HME3M is shown to outperform all comparison methods and further is capable of identifying known biologically active pathways within microarray data.

Figure 4: Arabidopsis thaliana Oxidative Pentose Phosphate Cycle. For visual simplicity, we show only a single edge connecting each compound; however in the actual network there is a separate edge for each gene label displayed.

Mentions: To assess the performance of HME3M in a realistic setting, we use two different metabolic networks, both extracted from KEGG [1] for the plant Arabidopsis thaliana. The networks are selected for their differing structure and complexity. We deliberately use Arabidopsis because it has become a benchmark organism, and it is well known that different components of its core metabolic pathways are activated during the developmental stages and under stress conditions. The first network is glycolysis (Figure 3), a simple left-to-right network; the second is the pentose phosphate pathway (Figure 4), a simple directed cycle. Because of the large number of paths extracted from the KEGG networks, we assess the performance of HME3M with 20-fold inverse cross-validation for model sizes M = 2 to M = 10. Inverse 20-fold cross-validation first divides the observations randomly into 20 groups; then, for each group in turn, it trains using only the observations from that group and tests on the observations from the other 19. The performance of HME3M under 20-fold inverse cross-validation is compared with that of PLR and the SVM models.
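
The inverse cross-validation split described above can be sketched in a few lines of Python. This is only an illustration of the splitting scheme, not the authors' code: the function name inverse_k_fold, the seed argument, the round-robin grouping and the example size of 200 observations are all assumptions introduced here.

```python
import random


def inverse_k_fold(n_obs, k=20, seed=0):
    """Yield (train, test) index lists for inverse k-fold cross-validation.

    Unlike ordinary k-fold, each split trains on ONE group and tests on
    the remaining k - 1 groups. (Hypothetical helper, for illustration.)
    """
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k groups of near-equal size.
    groups = [indices[i::k] for i in range(k)]
    for i, train in enumerate(groups):
        test = [idx for j, group in enumerate(groups) if j != i for idx in group]
        yield train, test


# Assumed example size: 200 observations split into 20 groups of 10.
splits = list(inverse_k_fold(n_obs=200, k=20))
```

With these assumed numbers, each of the 20 models is fitted on 10 observations and evaluated on the other 190, so the test set is always far larger than the training set. Repeating the loop for each model size M = 2 to M = 10 would then give one performance estimate per size.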

