Rapid Prediction of Bacterial Heterotrophic Fluxomics Using Machine Learning and Constraint Programming.

Wu SG, Wang Y, Jiang W, Oyetunde T, Yao R, Zhang X, Shimizu K, Tang YJ, Bao FS - PLoS Comput. Biol. (2016)

Bottom Line: Mining the relationship between environmental and genetic factors and metabolic fluxes hidden in existing fluxomic data will lead to predictive models that can significantly accelerate flux quantification. Because most studies focus on model organisms grown on particular carbon sources, fluxome bias in the dataset may limit the applicability of machine learning models. This problem can be resolved as more 13C-MFA papers are published for non-model species.


Affiliation: Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, Missouri, United States of America.

ABSTRACT
13C metabolic flux analysis (13C-MFA) has been widely used to measure in vivo enzyme reaction rates (i.e., metabolic flux) in microorganisms. Mining the relationship between environmental and genetic factors and metabolic fluxes hidden in existing fluxomic data will lead to predictive models that can significantly accelerate flux quantification. In this paper, we present a web-based platform, MFlux (http://mflux.org), that predicts bacterial central metabolism via machine learning, leveraging data from approximately 100 13C-MFA papers on heterotrophic bacterial metabolism. Three machine learning methods, namely Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and Decision Tree, were employed to study the complex relationship between influential factors and metabolic fluxes. We performed a grid search for the best parameter set for each algorithm and verified their performance through 10-fold cross-validation. SVM yielded the highest accuracy among the three algorithms. Further, we employed quadratic programming to adjust flux profiles so that they satisfy stoichiometric constraints. Multiple case studies have shown that MFlux can reasonably predict fluxomes as a function of bacterial species, substrate types, growth rate, oxygen conditions, and cultivation methods. Because most studies focus on model organisms grown on particular carbon sources, fluxome bias in the dataset may limit the applicability of machine learning models. This problem can be resolved as more 13C-MFA papers are published for non-model species.
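The quadratic-programming adjustment mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible form of the problem: find the flux vector closest (in least squares) to the machine-learning prediction while satisfying the steady-state balances S v = 0, with no bounds or inequality constraints. Under that assumption the solution has the closed form v = v_pred - Sᵀ(S Sᵀ)⁻¹ S v_pred. The toy network, the flux values, and the function names `solve_linear` and `adjust_fluxes` are hypothetical and are not taken from MFlux.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("constraints are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def adjust_fluxes(s_matrix, v_pred):
    """Minimize ||v - v_pred||^2 subject to S v = 0 (equality constraints only)."""
    n_rows = len(s_matrix)
    # Mass-balance residual of the raw prediction: r = S v_pred
    r = [sum(sij * vj for sij, vj in zip(row, v_pred)) for row in s_matrix]
    # Gram matrix G = S S^T
    g = [[sum(x * y for x, y in zip(ri, rj)) for rj in s_matrix] for ri in s_matrix]
    # Lagrange multipliers from G lam = r
    lam = solve_linear(g, r)
    # Corrected fluxes: v = v_pred - S^T lam
    return [
        vj - sum(lam[i] * s_matrix[i][j] for i in range(n_rows))
        for j, vj in enumerate(v_pred)
    ]


# Hypothetical toy network (not from the paper):
#   metabolite A: v1 - v2 - v3 = 0
#   metabolite B: v2 - v4 = 0
S = [
    [1.0, -1.0, -1.0, 0.0],
    [0.0, 1.0, 0.0, -1.0],
]
v_pred = [100.0, 62.0, 35.0, 60.0]  # made-up prediction that violates both balances
v_adj = adjust_fluxes(S, v_pred)
residuals = [sum(sij * vj for sij, vj in zip(row, v_adj)) for row in S]
print(v_adj, residuals)
```

A realistic formulation would also carry flux bounds and reversibility constraints, which call for a general QP solver rather than this closed-form projection.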



pcbi.1004838.g003: Overview of central metabolic fluxes collected in our dataset. “Flux range” represents the variation of each flux in the 13C-MFA dataset. “95% confidence interval” marks the narrower range within which 95% of the flux data fall. “Average flux value” is the average of each flux over all data in our 13C-MFA dataset.

Mentions: By statistical analysis, we determined the variation between each flux profile and the average flux profile from our 13C-MFA dataset. The average value, the range, and the 95% confidence interval for each flux are shown in Fig 3. The most conserved fluxes in our dataset include the non-oxidative pentose phosphate pathway and the glyoxylate shunt. The former pathway supplies precursors for the biosynthesis of amino acids (i.e., histidine, phenylalanine, and tyrosine) and nucleotides. The latter acts as an alternative carbon-reserving path to the TCA cycle and is inhibited by the presence of glucose (most 13C-MFA studies are based on glucose metabolism). All 29 fluxes have a relatively narrow confidence interval compared to their possible flux ranges, suggesting that the fluxes of different bacterial species vary within a relatively small range. This is because most 13C-MFA studies focus on model species (e.g., E. coli and B. subtilis) and glucose-based metabolism, while far fewer MFA efforts study non-model species or the metabolism of carbon substrates other than sugars (i.e., bias of fluxome research across).
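The per-flux summary described above (average, full range, and the interval holding the central 95% of the data) can be sketched as follows. This is an illustration only: the figure caption suggests the "95% confidence interval" is the band containing 95% of observed values, so the sketch uses the 2.5th and 97.5th percentiles, which is an assumption about how the authors computed it. The flux names and values are made up, and `summarize_flux` is a hypothetical helper.

```python
import statistics


def summarize_flux(values):
    """Return mean, full range, and central 95% band for one flux."""
    # n=40 yields cut points every 2.5%; the first and last are the
    # 2.5th and 97.5th percentiles. "inclusive" keeps them inside the data range.
    cuts = statistics.quantiles(values, n=40, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "range": (min(values), max(values)),
        "central_95": (cuts[0], cuts[-1]),
    }


# Hypothetical normalized flux values (not from the paper's dataset)
fluxes = {
    "glyoxylate_shunt": [0.0, 0.0, 1.5, 0.0, 3.0, 0.0, 35.0, 0.0, 2.0, 0.0],
    "pgi": [72.0, 65.0, 80.0, 58.0, 91.0, 70.0, 12.0, 68.0, 75.0, 66.0],
}

for name, values in fluxes.items():
    s = summarize_flux(values)
    print(name, s["mean"], s["range"], s["central_95"])
```

A single outlier, such as the 35.0 in the first series, widens the full range far more than it widens the central 95% band, which is the pattern the paragraph above describes.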

