Rapid Prediction of Bacterial Heterotrophic Fluxomics Using Machine Learning and Constraint Programming.

Wu SG, Wang Y, Jiang W, Oyetunde T, Yao R, Zhang X, Shimizu K, Tang YJ, Bao FS - PLoS Comput. Biol. (2016)

Bottom Line: Mining the relationship between environmental and genetic factors and metabolic fluxes hidden in existing fluxomic data will lead to predictive models that can significantly accelerate flux quantification. Because most studies focus on model organisms grown on particular carbon sources, the fluxomes in the dataset are biased, which may limit the applicability of machine learning models. This problem can be resolved as more 13C-MFA papers are published for non-model species.


Affiliation: Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, Missouri, United States of America.

ABSTRACT
13C metabolic flux analysis (13C-MFA) has been widely used to measure in vivo enzyme reaction rates (i.e., metabolic flux) in microorganisms. Mining the relationship between environmental and genetic factors and metabolic fluxes hidden in existing fluxomic data will lead to predictive models that can significantly accelerate flux quantification. In this paper, we present a web-based platform, MFlux (http://mflux.org), that predicts bacterial central metabolism via machine learning, leveraging data from approximately 100 13C-MFA papers on heterotrophic bacterial metabolism. Three machine learning methods, namely Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and Decision Tree, were employed to study the sophisticated relationship between influential factors and metabolic fluxes. We performed a grid search for the best parameter set for each algorithm and verified their performance through 10-fold cross-validation. SVM yields the highest accuracy among the three algorithms. Further, we employed quadratic programming to adjust flux profiles to satisfy stoichiometric constraints. Multiple case studies have shown that MFlux can reasonably predict fluxomes as a function of bacterial species, substrate types, growth rate, oxygen conditions, and cultivation methods. Because most studies focus on model organisms grown on particular carbon sources, the fluxomes in the dataset are biased, which may limit the applicability of machine learning models. This problem can be resolved as more 13C-MFA papers are published for non-model species.
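The flux-adjustment step mentioned in the abstract can be illustrated with a small sketch. It assumes the simplest possible form of the problem: find the flux vector v closest (in least squares) to the machine-learning prediction p subject only to the mass-balance equalities S v = 0, which has the closed-form solution v = p - Sᵀ(S Sᵀ)⁻¹ S p. The stoichiometric matrix, the flux values, and the function names below are hypothetical; the actual MFlux formulation may include flux bounds and other constraints that this sketch ignores.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: constraints are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def balance_fluxes(S, predicted):
    """Return the flux vector closest to `predicted` that satisfies S v = 0."""
    n_met, n_rxn = len(S), len(predicted)
    # Residual S p: how far the raw prediction is from mass balance.
    residual = [sum(S[i][j] * predicted[j] for j in range(n_rxn))
                for i in range(n_met)]
    # Gram matrix S S^T (invertible when the rows of S are independent).
    gram = [[sum(S[i][k] * S[j][k] for k in range(n_rxn))
             for j in range(n_met)] for i in range(n_met)]
    lam = solve_linear(gram, residual)  # Lagrange multipliers
    # v = p - S^T lambda
    return [predicted[j] - sum(S[i][j] * lam[i] for i in range(n_met))
            for j in range(n_rxn)]


# Hypothetical toy pathway: uptake -> A, A -> B, B -> secretion.
# Rows are metabolites A and B; columns are the three reactions.
S = [[1, -1, 0],
     [0, 1, -1]]
predicted = [100.0, 90.0, 95.0]  # made-up, unbalanced prediction
balanced = balance_fluxes(S, predicted)
print(balanced)  # approximately [95.0, 95.0, 95.0]
```

In this linear toy pathway every reaction must carry the same flux at steady state, so the projection moves the three predicted values to their mean. A general-purpose quadratic programming solver would be needed once inequality constraints such as flux bounds are added.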



pcbi.1004838.g010: A comparison of the 13C-MFA flux, the flux predicted by MFlux, and the flux predicted by FBA. FBA was simulated with the E. coli iJO1366 model (latest version) using default boundary settings from reference [54]. The default values of growth-associated maintenance energy (GAM) and non-growth-associated maintenance energy (NGAM) were adopted. A) E. coli fluxome of glucose metabolism, precisely measured via parallel labeling experiments (a recent paper not in our dataset) [12]. B) E. coli fluxome of glycerol and glucose co-metabolism, as measured by Drs. Yao and Shimizu (unpublished data). The E. coli strain was cultured in a chemostat fermentor with a working volume of 1 L (37 °C). The dilution rate in the continuous culture was 0.35 h⁻¹. [1-13C] glucose and [1,3-13C] glycerol were used for tracer experiments. The flux calculation is based on a previous method [42]. The RMSE from FBA is 22.5, while the RMSE from MFlux (this work) is 5.1. The COBRA toolbox running on MATLAB R2012b was employed for FBA/pFBA/geometricFBA simulation, and Gurobi 5.5 was used for linear programming. Detailed information is included in S2 Table.

Mentions: Stoichiometry-based flux balance analysis (FBA) is an important mechanistic tool for predicting unknown cell metabolism [50]. Accurate FBA prediction relies heavily on setting the objective function and the flux constraints appropriately (based on thermodynamics or experimental analysis). Here, we compare FBA with MFlux for predicting E. coli metabolism. The latest version of the E. coli iJO1366 genome-scale model (2583 fluxes) was used [51]. Two comparative case studies were performed on E. coli fluxomes: one for glucose-based 13C-MFA via parallel labeling experiments [12] and the other for glucose and glycerol co-utilization (unpublished data from the Shimizu Group). Neither test case was included in the training set of MFlux. With the 13C-MFA results as the control, MFlux predictions have smaller RMSEs than FBA predictions. In the first case, FBA has an RMSE of 11.3, while MFlux has an RMSE of 6.5 (Fig 10A). In the second case, FBA has an RMSE of 22.5, while MFlux has an RMSE of 5.1 (Fig 10B). To circumvent variations caused by alternative solutions in FBA, we also employed pFBA and geometricFBA for both cases [52, 53] (S2 Table). In general, pFBA did not give better results than FBA in either case, while geometricFBA did not converge in our calculations.
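For readers unfamiliar with the metric, the RMSE comparison above can be reproduced in a few lines. The sketch below assumes the RMSE is taken over matched reactions, with the 13C-MFA values as the reference; the flux numbers are made up for illustration and are not taken from Fig 10 or S2 Table.

```python
import math


def rmse(reference, predicted):
    """Root-mean-square error between two equally long flux vectors."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("flux vectors must be non-empty and the same length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical values for three reactions (not data from the paper).
mfa_flux = [100.0, 70.0, 25.0]     # reference fluxes, e.g. from 13C-MFA
model_flux = [97.0, 74.0, 25.0]    # fluxes from some predictive model
error = rmse(mfa_flux, model_flux)
print(round(error, 3))
```

A lower value means the predicted flux profile sits closer to the measured one across the compared reactions, which is the sense in which the two methods are ranked in the text.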

