Weighted correlation network analysis (WGCNA) applied to the tomato fruit metabolome.

DiLeo MV, Strahan GD, den Bakker M, Hoekenga OA - PLoS ONE (2011)

Bottom Line: A matching revolution in our understanding of biological systems, however, will only be realized when similar advances are made in informatic analysis of the resulting "big data." Here, we compare the capabilities of three conventional and novel statistical approaches to summarize and decipher the tomato metabolome. Both BL-SOM and WGCNA clustered metabolites by expression, but WGCNA additionally defined "modules" of co-expressed metabolites explicitly and provided additional network statistics that described the systems properties of the tomato metabolic network. Our first application of WGCNA to tomato metabolomics data identified three major modules of metabolites that were associated with ripening-related traits and genetic background.

Affiliation: Boyce Thompson Institute for Plant Research, Ithaca, New York, United States of America.

ABSTRACT

Background: Advances in "omics" technologies have revolutionized the collection of biological data. A matching revolution in our understanding of biological systems, however, will only be realized when similar advances are made in informatic analysis of the resulting "big data." Here, we compare the capabilities of three conventional and novel statistical approaches to summarize and decipher the tomato metabolome.

Methodology: Principal component analysis (PCA), batch learning self-organizing maps (BL-SOM) and weighted gene co-expression network analysis (WGCNA) were applied to a multivariate NMR dataset collected from developmentally staged tomato fruits belonging to several genotypes. While PCA and BL-SOM are appropriate and commonly used methods, WGCNA holds several advantages in the analysis of highly multivariate, complex data.

Conclusions: PCA separated the two major genetic backgrounds (AC and NC), but provided little further information. Both BL-SOM and WGCNA clustered metabolites by expression, but WGCNA additionally defined "modules" of co-expressed metabolites explicitly and provided additional network statistics that described the systems properties of the tomato metabolic network. Our first application of WGCNA to tomato metabolomics data identified three major modules of metabolites that were associated with ripening-related traits and genetic background.


Figure 5 (pone-0026683-g005): Association of WGCNA module eigenmetabolites with tomato genotypes. ANOVA was used to compare the typical expression patterns (eigenmetabolites) of each module. Significant differences in eigenmetabolites among genotypes within each module are indicated by letters (at a Bonferroni-adjusted threshold of 0.0167). Metabolites in the blue module were more highly expressed in the NC than in the AC background. Metabolites in both the blue and turquoise modules were more highly expressed in fully-ripe relative to partially-ripe and unripe fruit. Interestingly, the metabolites in the red module were high in both NC and NC rin, but not in the NC F1 hybrid.

Mentions: One of the most common challenges in systems biology experiments is that of multiple testing. It is extremely common to have very few observations on hundreds to tens of thousands of different entities (e.g. metabolites, genes). WGCNA addresses this issue by allowing the user to investigate associations between specific network nodes or clusters and other factors, such as genetic background or the impact of a mutation. For example, instead of searching for correlations between a given factor or trait (e.g. ripeness) and thousands of genes in a dataset, attention could be focused only on the most highly-connected “hub” genes that might be expected to play the most influential regulatory roles. Alternatively, external traits can be compared to the typical expression pattern (an “eigengene”, or analogously an “eigenmetabolite”) of each putatively co-regulated module instead of to every constituent molecule individually. In our analysis, the turquoise module is positively associated with wild-type fruit ripening (and the presence of functional Rin alleles) (Figure 5). The blue module shares this pattern and is additionally positively associated with the difference in genetic background (AC versus NC), possibly indicating which metabolites prevented principal component 1 of the PCA from completely predicting ripeness. By estimating a set of eigenmetabolites, WGCNA allows the user to apply commonly used and well-understood statistical approaches such as ANOVA to investigate specific hypotheses within the data, while limiting the number of comparisons needed to query the entire data set.
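
The eigenmetabolite comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline (WGCNA itself is an R package). The data values and the names `module`, `genotypes`, `eigenmetabolite`, `anova_f` and `permutation_p` are hypothetical. The eigenmetabolite is taken to be the first principal component of the standardized module profiles, computed here by power iteration and oriented along the average profile. A permutation test stands in for the parametric ANOVA p-value so that the sketch needs only the standard library, and the threshold of 0.0167 is assumed to be 0.05 divided by the three modules tested.

```python
import random
from statistics import mean, pstdev


def standardize(profile):
    """Scale one metabolite profile to zero mean and unit variance."""
    m, s = mean(profile), pstdev(profile)
    return [(x - m) / s for x in profile]


def eigenmetabolite(profiles, iterations=100):
    """First principal component (over samples) of a module's profiles."""
    z = [standardize(p) for p in profiles]
    n = len(z[0])
    # Start from the average profile so the result keeps its orientation.
    e = [mean([row[j] for row in z]) for j in range(n)]
    for _ in range(iterations):
        scores = [sum(r * x for r, x in zip(row, e)) for row in z]
        e = [sum(s * row[j] for s, row in zip(scores, z)) for j in range(n)]
        norm = sum(x * x for x in e) ** 0.5
        e = [x / norm for x in e]
    return e


def anova_f(values, groups):
    """One-way ANOVA F statistic for values labelled by group."""
    grand = mean(values)
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    ss_between = sum(len(vs) * (mean(vs) - grand) ** 2 for vs in by_group.values())
    ss_within = sum((v - mean(vs)) ** 2 for vs in by_group.values() for v in vs)
    df_between = len(by_group) - 1
    df_within = len(values) - len(by_group)
    return (ss_between / df_between) / (ss_within / df_within)


def permutation_p(values, groups, n_perm=2000, seed=0):
    """Permutation p-value for the F statistic (assumed stand-in for ANOVA)."""
    rng = random.Random(seed)
    observed = anova_f(values, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if anova_f(values, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data (invented): three metabolites of one module across twelve fruits.
genotypes = ["AC"] * 6 + ["NC"] * 6
module = [
    [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 2.1, 2.3, 2.0, 2.2, 2.4, 2.1],
    [0.5, 0.6, 0.4, 0.5, 0.7, 0.6, 1.5, 1.4, 1.6, 1.5, 1.7, 1.4],
    [3.0, 3.1, 2.9, 3.2, 3.0, 3.1, 4.2, 4.0, 4.1, 4.3, 4.1, 4.2],
]

n_modules = 3                      # one test per module, as in the text
alpha_adjusted = 0.05 / n_modules  # Bonferroni adjustment (assumed alpha = 0.05)

eigen = eigenmetabolite(module)
f_stat = anova_f(eigen, genotypes)
p_value = permutation_p(eigen, genotypes)
print(f"F = {f_stat:.1f}, p = {p_value:.4f}, significant: {p_value < alpha_adjusted}")
```

The point of the design is the one made in the text: the genotype comparison is run once per module on its eigenmetabolite rather than once per metabolite, so the multiple-testing correction only has to account for the number of modules.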

