Weighted correlation network analysis (WGCNA) applied to the tomato fruit metabolome.

DiLeo MV, Strahan GD, den Bakker M, Hoekenga OA - PLoS ONE (2011)

Bottom Line: A matching revolution in our understanding of biological systems, however, will only be realized when similar advances are made in informatic analysis of the resulting "big data." Here, we compare the capabilities of three conventional and novel statistical approaches to summarize and decipher the tomato metabolome. Both BL-SOM and WGCNA clustered metabolites by expression, but WGCNA additionally defined "modules" of co-expressed metabolites explicitly and provided additional network statistics that described the systems properties of the tomato metabolic network. Our first application of WGCNA to tomato metabolomics data identified three major modules of metabolites that were associated with ripening-related traits and genetic background.


Affiliation: Boyce Thompson Institute for Plant Research, Ithaca, New York, United States of America.

ABSTRACT

Background: Advances in "omics" technologies have revolutionized the collection of biological data. A matching revolution in our understanding of biological systems, however, will only be realized when similar advances are made in informatic analysis of the resulting "big data." Here, we compare the capabilities of three conventional and novel statistical approaches to summarize and decipher the tomato metabolome.

Methodology: Principal component analysis (PCA), batch learning self-organizing maps (BL-SOM) and weighted gene co-expression network analysis (WGCNA) were applied to a multivariate NMR dataset collected from developmentally staged tomato fruits belonging to several genotypes. While PCA and BL-SOM are appropriate and commonly used methods, WGCNA holds several advantages in the analysis of highly multivariate, complex data.

Conclusions: PCA separated the two major genetic backgrounds (AC and NC), but provided little further information. Both BL-SOM and WGCNA clustered metabolites by expression, but WGCNA additionally defined "modules" of co-expressed metabolites explicitly and provided additional network statistics that described the systems properties of the tomato metabolic network. Our first application of WGCNA to tomato metabolomics data identified three major modules of metabolites that were associated with ripening-related traits and genetic background.


Figure 3. Weighted correlation network analysis (WGCNA) of metabolic profiles of whole tomato fruit. Six tomato genotypes from two genetic backgrounds were analyzed by WGCNA using 46 NMR-profiled metabolites. Metabolites were clustered by expression patterns as represented by the dendrogram and correlation heat map. Clusters of like-regulated metabolites are referred to as modules by color (red, blue, turquoise). Metabolites that could not be assigned to a module are labeled gray. In the heat map, intensity of red coloring indicates strength of correlation between pairs of metabolites on a linear scale.

Mentions: WGCNA supports the assembly of both signed and unsigned networks. Here, we constructed unsigned networks using the 46 metabolite data set, which co-localize both positively and negatively correlated metabolites into three modules (Figure 3). The WGCNA package additionally provides easy quantification of several network statistics, or indices [5], [44]. In a weighted correlation network, connectivity equals the sum of connection strengths between a node and all of its neighbors, which has been associated with essentiality in protein and metabolic networks [45], [46]. Additionally, highly connected hubs may play a disproportionate role either in influencing the expression patterns of other nodes in the network, or alternatively may act as "sentries," communicating changes that occur elsewhere in the network. Scaled connectivity indicates the connectivity of a given node relative to the most connected node within the same module. The maximum adjacency ratio is related to connectivity; low values indicate nodes with many, weak connections to their neighbors, while high values indicate nodes with few, strong connections to their neighbors. In some situations, the maximum adjacency ratio may be more effective than connectivity to identify important hub features [5]. The clustering coefficient indicates the local density of a network, or the extent to which a node's neighbors are all strongly connected to each other. Other statistics are used to describe modules instead of individual nodes. Network density describes how tightly co-expressed a set of nodes within a module are, while centralization and heterogeneity describe the extent to which nodes within a given module differ in connectivity. Network heterogeneity describes the variation in connectivity within a module while centralization describes the extent to which a network contains many nodes that connect to a central hub node, but do not connect to their neighbors [5].
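The indices described above can be sketched numerically. The following is a minimal illustration in Python with NumPy, not the authors' code and not the WGCNA R package itself. It assumes the commonly used definitions of these network concepts for a symmetric weighted adjacency matrix with a zero diagonal. The function names and the soft-threshold power `beta=6` are hypothetical choices for illustration; the text above does not state which power was used.

```python
import numpy as np


def unsigned_adjacency(data, beta=6):
    """Unsigned weighted adjacency from a samples x metabolites array.

    beta is a hypothetical soft-threshold power; the source does not state one.
    """
    corr = np.corrcoef(data, rowvar=False)  # metabolite-by-metabolite Pearson r
    adj = np.abs(corr) ** beta              # unsigned: the sign of r is discarded
    np.fill_diagonal(adj, 0.0)              # no self-connections
    return adj


def node_statistics(adj):
    """Per-node indices; assumes >= 3 nodes, each with nonzero connectivity."""
    k = adj.sum(axis=1)                     # connectivity: sum of connection strengths
    sq = (adj ** 2).sum(axis=1)             # sum of squared connection strengths
    triangles = np.diag(adj @ adj @ adj)    # weighted closed paths i -> l -> m -> i
    return {
        "connectivity": k,
        "scaled_connectivity": k / k.max(),  # relative to the most connected node
        "max_adjacency_ratio": sq / k,       # low: many weak links; high: few strong
        "clustering_coefficient": triangles / (k ** 2 - sq),
    }


def module_statistics(adj):
    """Whole-module indices for the same kind of adjacency matrix (>= 3 nodes)."""
    n = adj.shape[0]
    k = adj.sum(axis=1)
    density = k.sum() / (n * (n - 1))       # mean off-diagonal adjacency
    return {
        "density": density,
        "centralization": n / (n - 2) * (k.max() / (n - 1) - density),
        "heterogeneity": k.std() / k.mean(),  # coefficient of variation of connectivity
    }
```

To obtain module-level values such as scaled connectivity within a module, one would pass only the sub-matrix of the adjacency that covers that module's metabolites. Under these definitions a pure star network (one hub, no links among its neighbors) has centralization 1, while a network in which every pair is equally connected has centralization and heterogeneity 0.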

