Weighted correlation network analysis (WGCNA) applied to the tomato fruit metabolome.

DiLeo MV, Strahan GD, den Bakker M, Hoekenga OA - PLoS ONE (2011)

Bottom Line: A matching revolution in our understanding of biological systems, however, will only be realized when similar advances are made in informatic analysis of the resulting "big data." Here, we compare the capabilities of three conventional and novel statistical approaches to summarize and decipher the tomato metabolome. Both BL-SOM and WGCNA clustered metabolites by expression, but WGCNA additionally defined "modules" of co-expressed metabolites explicitly and provided additional network statistics that described the systems properties of the tomato metabolic network. Our first application of WGCNA to tomato metabolomics data identified three major modules of metabolites that were associated with ripening-related traits and genetic background.


Affiliation: Boyce Thompson Institute for Plant Research, Ithaca, New York, United States of America.

ABSTRACT

Background: Advances in "omics" technologies have revolutionized the collection of biological data. A matching revolution in our understanding of biological systems, however, will only be realized when similar advances are made in informatic analysis of the resulting "big data." Here, we compare the capabilities of three conventional and novel statistical approaches to summarize and decipher the tomato metabolome.

Methodology: Principal component analysis (PCA), batch learning self-organizing maps (BL-SOM) and weighted gene co-expression network analysis (WGCNA) were applied to a multivariate NMR dataset collected from developmentally staged tomato fruits belonging to several genotypes. While PCA and BL-SOM are appropriate and commonly used methods, WGCNA holds several advantages in the analysis of highly multivariate, complex data.

Conclusions: PCA separated the two major genetic backgrounds (AC and NC), but provided little further information. Both BL-SOM and WGCNA clustered metabolites by expression, but WGCNA additionally defined "modules" of co-expressed metabolites explicitly and provided additional network statistics that described the systems properties of the tomato metabolic network. Our first application of WGCNA to tomato metabolomics data identified three major modules of metabolites that were associated with ripening-related traits and genetic background.


Figure 2: Batch Learning Self Organizing Map (BL-SOM) analysis of metabolic profiles of whole tomato fruit. Six tomato genotypes from two genetic backgrounds were analyzed by BL-SOM using 46 NMR-profiled metabolites. Metabolites were clustered by expression patterns among six genotypes, with highly similar metabolites appearing in the same cells and similar metabolites appearing in adjacent cells. A. BL-SOM Lattice. Numbers indicate how many metabolites are contained within each cell. Metabolites that were clustered in the “blue” module of WGCNA are labeled to facilitate comparison. A complete representation of metabolite locations within the BL-SOM analysis is presented as Figure S1. B. Comparison of AC sample #1 with population mean values. The metabolome of a single wild type AC fruit was compared with the population mean values using BL-SOM. Red highlighting indicates cells where one or more metabolites are more abundant in AC than the population (red: greater than one standard deviation above the mean; pink: less than one standard deviation above the mean). Blue highlighting indicates where one or more metabolites were less abundant in AC than the population (blue: more than one standard deviation less than the population mean; turquoise: less than one standard deviation less than the population mean). The four cells without highlighting were not different in the comparison. C. Comparison of AC sample #2 with NC sample #5. The metabolomes of two single fruits were compared with each other. Nine cells contained metabolites more abundant in AC than NC; five cells contained metabolites less abundant in AC than NC.
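The highlighting rule described for panel B can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the caption, not the BL-SOM program's own code: the function name `highlight` is hypothetical, the rule is applied to one metabolite at a time (the caption colors whole cells), the use of the sample standard deviation is an assumption, and a value exactly one standard deviation from the mean is arbitrarily assigned to the lighter color because the caption leaves that case open.

```python
import statistics

def highlight(sample_value, population_values):
    """Classify one metabolite in one sample against the population mean.

    Assumptions (not stated in the source): sample standard deviation,
    and a difference of exactly one SD falls in the lighter color.
    """
    mean = statistics.mean(population_values)
    sd = statistics.stdev(population_values)
    diff = sample_value - mean
    if diff == 0:
        return "none"       # not different from the population mean
    if diff > sd:
        return "red"        # more than one SD above the mean
    if diff > 0:
        return "pink"       # above the mean, within one SD
    if diff < -sd:
        return "blue"       # more than one SD below the mean
    return "turquoise"      # below the mean, within one SD

# Made-up abundance values for a single metabolite across five fruits.
population = [1, 2, 3, 4, 5]
colors = [highlight(v, population) for v in (5, 3.5, 3, 2.5, 1)]
```

Here `colors` covers each of the five categories once, which makes the thresholds easy to check by hand against the caption.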

Mentions: BL-SOM clusters metabolites in a two-dimensional matrix according to relative similarities in expression patterns and has been used extensively in plant metabolomic studies [7], [40], [41], [42]. Metabolites with similar expression patterns are located in adjacent cells in the matrix, and metabolites with nearly identical expression patterns share the same cell. This method makes comparisons between any two samples easy to visualize. One can also easily compare one sample with the population mean values. BL-SOM was used to analyze our tomato data set [43]. Based on the relationships detected and the rules of the program, lysine was placed adjacent to phenylalanine, while fructose and glucose were collocated, as were aspartate and glutamate (Figure 2A, Figure S1). A heat map was generated to describe the metabolome of a wild type AC fruit relative to the mean values for the complete data set; thirteen clustered metabolites were more abundant in AC than the population mean, while fifteen were less abundant (Figure 2B). A relative comparison was made between one of the AC and NC samples; nine clustered metabolites were more abundant in AC sample #2 than NC sample #5, while five were less abundant (Figure 2C). However, there is no direct way to compare genotypes represented by multiple replicate samples, or to make higher-order comparisons. Together, PCA and BL-SOM indicate which metabolites share similar expression patterns and which vary the most among genotypes, but they provide limited insight into higher-order relationships within a complex data set.
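To make the "same cell" and "adjacent cell" idea concrete, the following is a minimal sketch of a generic batch self-organizing map in plain Python. It is not the BL-SOM software used in the study: the grid size, the Gaussian neighborhood, the linearly shrinking radius, and the seeding of cells from the input vectors are all simplifying assumptions, and the metabolite names and abundance values are invented for illustration.

```python
import math

def zscore(profile):
    """Standardize one metabolite's abundance profile across samples."""
    mean = sum(profile) / len(profile)
    var = sum((x - mean) ** 2 for x in profile) / len(profile)
    sd = math.sqrt(var) or 1.0  # guard against flat profiles
    return [(x - mean) / sd for x in profile]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_cell(weights, vec):
    """Return the grid cell whose weight vector is closest to vec."""
    return min(weights, key=lambda cell: sq_dist(weights[cell], vec))

def train_batch_som(vectors, rows=2, cols=2, epochs=20):
    """Train a small SOM with batch updates (a simplified stand-in)."""
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    # Assumption: seed cells from the inputs in order, for determinism.
    weights = {cell: list(vectors[i % len(vectors)])
               for i, cell in enumerate(cells)}
    for epoch in range(epochs):
        # The neighborhood radius shrinks linearly over the epochs.
        radius = max(rows, cols) * (1 - epoch / epochs) + 0.5
        # Batch step: assign every input to its best cell first ...
        assigned = [best_cell(weights, v) for v in vectors]
        # ... then set each cell to a neighborhood-weighted mean.
        for cell in cells:
            num = [0.0] * len(vectors[0])
            den = 0.0
            for v, bmu in zip(vectors, assigned):
                d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * radius ** 2))
                den += h
                num = [n + h * x for n, x in zip(num, v)]
            weights[cell] = [n / den for n in num]
    return weights

# Hypothetical metabolites measured in six samples (made-up values).
profiles = {
    "met_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "met_B": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "met_C": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
    "met_D": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}
scaled = {name: zscore(p) for name, p in profiles.items()}
weights = train_batch_som(list(scaled.values()))
placement = {name: best_cell(weights, v) for name, v in scaled.items()}
```

In this sketch `placement` maps each metabolite to a `(row, column)` cell. Because `met_A` and `met_B` have identical profiles, they necessarily land in the same cell, mirroring how the paper reports collocated metabolites; where the other two land depends on the assumptions above.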


