Rapid Prediction of Bacterial Heterotrophic Fluxomics Using Machine Learning and Constraint Programming.

Wu SG, Wang Y, Jiang W, Oyetunde T, Yao R, Zhang X, Shimizu K, Tang YJ, Bao FS - PLoS Comput. Biol. (2016)

Bottom Line: Mining the relationship between environmental and genetic factors and metabolic fluxes hidden in existing fluxomic data will lead to predictive models that can significantly accelerate flux quantification. Because studies have focused on model organisms grown on particular carbon sources, fluxome bias in the dataset may limit the applicability of machine learning models. This problem can be resolved after more 13C-MFA papers are published for non-model species.


Affiliation: Department of Energy, Environmental and Chemical Engineering, Washington University in St. Louis, St. Louis, Missouri, United States of America.

ABSTRACT
13C metabolic flux analysis (13C-MFA) has been widely used to measure in vivo enzyme reaction rates (i.e., metabolic flux) in microorganisms. Mining the relationship between environmental and genetic factors and metabolic fluxes hidden in existing fluxomic data will lead to predictive models that can significantly accelerate flux quantification. In this paper, we present a web-based platform, MFlux (http://mflux.org), that predicts bacterial central metabolism via machine learning, leveraging data from approximately 100 13C-MFA papers on heterotrophic bacterial metabolism. Three machine learning methods, namely Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), and Decision Tree, were employed to study the sophisticated relationship between influential factors and metabolic fluxes. We performed a grid search for the best parameter set for each algorithm and verified their performance through 10-fold cross-validation. SVM yields the highest accuracy among the three algorithms. Further, we employed quadratic programming to adjust flux profiles to satisfy stoichiometric constraints. Multiple case studies have shown that MFlux can reasonably predict fluxomes as a function of bacterial species, substrate types, growth rate, oxygen conditions, and cultivation methods. Because studies have focused on model organisms grown on particular carbon sources, fluxome bias in the dataset may limit the applicability of machine learning models. This problem can be resolved after more 13C-MFA papers are published for non-model species.
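
The grid search with 10-fold cross-validation mentioned in the abstract can be illustrated with a minimal sketch. This is not the MFlux implementation: it uses a plain NumPy k-NN regressor, a hypothetical grid over the neighbor count `k`, and synthetic data standing in for the curated fluxomic dataset. The function names (`knn_predict`, `cv_rmse`, `grid_search_k`) are assumptions made for illustration only.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k):
    """Predict each test target as the mean target of its k nearest training points."""
    # Pairwise Euclidean distances, shape (n_test, n_train)
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def cv_rmse(X, y, k, n_folds=10, seed=0):
    """Mean RMSE of the k-NN regressor over n_folds cross-validation folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the held-out one
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = knn_predict(X[train_idx], y[train_idx], X[test_idx], k)
        scores.append(np.sqrt(np.mean((pred - y[test_idx]) ** 2)))
    return float(np.mean(scores))

def grid_search_k(X, y, k_grid, n_folds=10):
    """Return the k with the lowest cross-validated RMSE, plus all scores."""
    scores = {k: cv_rmse(X, y, k, n_folds) for k in k_grid}
    best_k = min(scores, key=scores.get)
    return best_k, scores

# Synthetic stand-in data (NOT from the paper): 3 numeric factors, 1 flux value
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(100, 3))
y = 10.0 * X[:, 0] + 5.0 * X[:, 1] + rng.normal(0.0, 0.5, size=100)

best_k, scores = grid_search_k(X, y, k_grid=[1, 3, 5, 7, 9])
print("best k:", best_k)
```

The same loop structure would apply to SVM or Decision Tree models by swapping the predictor and the parameter grid; the choice of k-NN here only keeps the sketch dependency-free.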


Fig 8 (pcbi.1004838.g008): A comparison of the 13C-MFA flux, the flux predicted by ML only, and the flux predicted by MFlux in Case 8. B. subtilis was incubated in a shake flask (37 °C, 300 rpm, aerobic conditions) and supplied with labeled succinate and glutamate as carbon sources in M9 minimal medium. Detailed information is in S1 Table.

Mentions: In Case 8, the B. subtilis strain takes up the mixed substrates succinate and glutamate. To illustrate co-metabolism of mixed substrates, we tested MFlux with 13C-MFA data of B. subtilis reported by Chubukov et al. [44]. Microbial fermentation fed with multiple low-cost substrates is promising for the biotechnology industry. However, there are very few quantitative analyses of this topic. In this test, we adopted the same set of parameters found in the literature (S1 Table, Case 8) as the inputs of MFlux. For flux correction, we directly took the default boundary settings for quadratic programming. A comparison of flux profiles reported by 13C-MFA, predicted by ML only, and predicted by MFlux (i.e., ML + quadratic programming) is illustrated in Fig 8. Both the ML-only approach and MFlux predict most fluxes accurately, closely matching the 13C-MFA flux profiles with a Root Mean Squared Error (RMSE) under 5. For ML alone, the predictions show large deviations on specific fluxes (e.g., v11 in the oxidative PP pathway and v19 in the TCA cycle). Quadratic programming can further adjust flux profiles and reduce deviations of flux predictions. The corrected flux profiles also satisfy the basic stoichiometric relationships of the metabolic network. The final prediction from MFlux shows improvement, with the RMSE reduced to 3.2.
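
The flux-correction step and the RMSE comparison described above can be sketched as follows. This is a simplified illustration, not the MFlux quadratic program: it keeps only the steady-state equality constraint S v = 0 and omits the flux boundary settings, so the least-squares correction has a closed form. The four-reaction network, the stoichiometric matrix `S`, and all flux values are hypothetical toy numbers, not data from Case 8.

```python
import numpy as np

def correct_fluxes(S, v_ml):
    """Closest flux vector to v_ml (least squares) that satisfies S @ v = 0.

    Solves: minimize ||v - v_ml||^2 subject to S v = 0.
    Simplifying assumption: no lower/upper flux bounds, so the solution is
    the orthogonal projection v = v_ml - S^T (S S^T)^{-1} S v_ml.
    """
    lam = np.linalg.solve(S @ S.T, S @ v_ml)  # Lagrange multipliers
    return v_ml - S.T @ lam

def rmse(a, b):
    """Root mean squared error between two flux vectors."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

# Hypothetical toy network: v1 produces metabolite M; v2 and v3 consume M;
# v3 produces metabolite N; v4 consumes N. Rows = metabolites, columns = fluxes.
S = np.array([
    [1.0, -1.0, -1.0,  0.0],   # balance on M
    [0.0,  0.0,  1.0, -1.0],   # balance on N
])

v_ref = np.array([100.0, 63.0, 37.0, 37.0])  # toy "measured" fluxes (balanced)
v_ml = np.array([100.0, 62.0, 35.0, 38.0])   # toy ML-only prediction (unbalanced)

v_corr = correct_fluxes(S, v_ml)

print("mass-balance residual before:", S @ v_ml)
print("mass-balance residual after: ", S @ v_corr)
print("RMSE, ML only:  ", rmse(v_ml, v_ref))
print("RMSE, corrected:", rmse(v_corr, v_ref))
```

Because the reference profile itself satisfies the mass balances, projecting the ML prediction onto the balanced subspace cannot move it further from the reference, which mirrors the RMSE improvement reported above. Handling the boundary settings would require a general QP solver rather than this closed-form projection.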

