Predicting Ecological Roles in the Rhizosphere Using Metabolome and Transportome Modeling.

Larsen PE, Collart FR, Dai Y - PLoS ONE (2015)

Bottom Line: The strongest predictive molecular mechanism features for rhizosphere ecological niche overlap with many previously reported analyses of Pseudomonad interactions in the rhizosphere, suggesting that this approach successfully informs a system-scale understanding of how Pseudomonads sense and interact with their environments. The observation that an organism's transportome is highly predictive of its ecological niche is a novel discovery and may have implications for our understanding of microbial ecology. The framework developed here can be generalized to the analysis of any bacteria across a wide range of environments and ecological niches, making this approach a powerful tool for providing insights into functional predictions from bacterial genomic data.


Affiliation: Argonne National Laboratory, Biosciences Division, Argonne, IL, United States of America; University of Illinois at Chicago, Department of Bioengineering, Chicago, IL, United States of America.

ABSTRACT
The ability to obtain complete genome sequences from bacteria in environmental samples, such as soil samples from the rhizosphere, has highlighted the microbial diversity and complexity of environmental communities. However, new algorithms to analyze genome sequence information in the context of community structure are needed to enhance our understanding of the specific ecological roles of these organisms in soil environments. We present a machine learning approach using sequenced Pseudomonad genomes coupled with outputs of metabolic and transportomic computational models for identifying the most predictive molecular mechanisms indicative of a Pseudomonad's ecological role in the rhizosphere: biofilm former, biocontrol agent, promoter of plant growth, or plant pathogen. Computational predictions of ecological niche were highly accurate overall, with models trained on transportomic model output being the most accurate (Leave One Out Validation F-scores between 0.82 and 0.89). The strongest predictive molecular mechanism features for rhizosphere ecological niche overlap with many previously reported analyses of Pseudomonad interactions in the rhizosphere, suggesting that this approach successfully informs a system-scale understanding of how Pseudomonads sense and interact with their environments. The observation that an organism's transportome is highly predictive of its ecological niche is a novel discovery and may have implications for our understanding of microbial ecology. The framework developed here can be generalized to the analysis of any bacteria across a wide range of environments and ecological niches, making this approach a powerful tool for providing insights into functional predictions from bacterial genomic data.




pone.0132837.g002: Clustering Pseudomonad genomes using enzyme function counts. The “Primer 6” core package and enzyme function profile data were used to generate hierarchical clusters. No obvious pattern by species or by ecological function is apparent using only enzyme function counts and hierarchical clustering, suggesting that additional data and/or alternate methods are required to deduce Pseudomonad environmental niche from sequenced and annotated genomes.

Mentions: SVMs were generated using a One Versus Rest (OVR) strategy, implemented as a set of four independent binary classifiers, and validated using a Leave One Out Validation (LOOV) scheme (Fig 2). In the OVR binary classification approach, a separate SVM was generated for each ecological niche class (Biocontrol, Biofilm, Plant Pathogen, and Plant Growth Promoter), that is, Biocontrol vs. non-Biocontrol, Biofilm vs. non-Biofilm, Plant Pathogen vs. non-Plant Pathogen, and Plant Growth Promoter vs. non-Plant Growth Promoter. A LOOV scheme is a special case of K-fold cross-validation in which K equals the number of samples. It is the most appropriate scheme for the data in this study because the number of Pseudomonas genomes is small relative to the number of possible model features, and some Pseudomonads are represented by so few examples that they would go unrepresented in the training sets of a K-fold cross-validation. In the LOOV experimental design, a single genome is held out as the validation set and the model is trained on the remaining genomes with a 10-fold cross-validation procedure and linear kernels. The selection of a validation sample and the training of an SVM are repeated until each of the 43 Pseudomonas genomes has been used as the validation sample once. SVMs were generated with the package ‘e1071’ v1.6-1 in R (August 29, 2013, http://cran.r-project.org/web/packages/e1071/index.html). The outputs collected included class predictions, decision values for all training and validation samples, and SVM files. A total of 16 SVM models, each with 43 LOOV iterations, were generated: four feature types based on computational model output types (enzyme function profiles, metabolomic model, secondary metabolism model, and transportomic model) were each used to train predictors for each of the four ecological niche classes (biofilm formation, biocontrol agent, plant pathogen, and plant growth promoter).
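
The validation design described above (one binary task per niche, each scored by leaving one genome out at a time) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the study used linear-kernel SVMs from the R package e1071, whereas the sketch swaps in a simple nearest-centroid classifier so that it runs on the standard library alone. The names `ovr_loov`, `fit_predict`, `TOY_FEATURES`, and `TOY_LABELS` are hypothetical, the toy data are invented for the example, and the inner 10-fold tuning step is omitted.

```python
NICHES = ["biofilm", "biocontrol", "plant_pathogen", "plant_growth_promoter"]


def binarize(labels, positive):
    """One-versus-rest relabeling: True for the target niche, False otherwise."""
    return [label == positive for label in labels]


def leave_one_out(n):
    """Yield (training indices, held-out index) once for each of the n samples."""
    for held_out in range(n):
        yield [i for i in range(n) if i != held_out], held_out


def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_predict(train_x, train_y, test_x):
    """Stand-in binary classifier (nearest centroid), NOT the paper's linear SVM."""
    pos = [x for x, y in zip(train_x, train_y) if y]
    neg = [x for x, y in zip(train_x, train_y) if not y]
    if not pos or not neg:
        # Only one class remains in the training set; predict that class.
        return bool(pos)
    return sq_dist(test_x, centroid(pos)) < sq_dist(test_x, centroid(neg))


def f_score(truth, predicted):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t and p for t, p in zip(truth, predicted))
    fp = sum((not t) and p for t, p in zip(truth, predicted))
    fn = sum(t and (not p) for t, p in zip(truth, predicted))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def ovr_loov(features, labels, classes=NICHES):
    """Run one binary task per niche; score each by leave-one-out predictions."""
    scores = {}
    for niche in classes:
        truth = binarize(labels, niche)
        predicted = []
        for train_idx, held_out in leave_one_out(len(features)):
            train_x = [features[i] for i in train_idx]
            train_y = [truth[i] for i in train_idx]
            predicted.append(fit_predict(train_x, train_y, features[held_out]))
        scores[niche] = f_score(truth, predicted)
    return scores


# Invented toy data: 8 "genomes", 2 per niche, 4 hypothetical feature counts each.
TOY_FEATURES = [
    [5, 0, 0, 0], [6, 0, 0, 0],
    [0, 5, 0, 0], [0, 6, 0, 0],
    [0, 0, 5, 0], [0, 0, 6, 0],
    [0, 0, 0, 5], [0, 0, 0, 6],
]
TOY_LABELS = [niche for niche in NICHES for _ in range(2)]

if __name__ == "__main__":
    for niche, score in ovr_loov(TOY_FEATURES, TOY_LABELS).items():
        print(niche, round(score, 2))
```

With 43 genomes, `leave_one_out(43)` produces 43 train/validate splits per model, so the 16 models in the text (four feature types times four niche classes) correspond to 16 × 43 = 688 held-out predictions in total. Keeping the classifier behind a single `fit_predict` function is a design choice for the sketch: a linear SVM could be substituted there without touching the validation loop.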

