Weighted correlation network analysis (WGCNA) applied to the tomato fruit metabolome.

DiLeo MV, Strahan GD, den Bakker M, Hoekenga OA - PLoS ONE (2011)

Bottom Line: A matching revolution in our understanding of biological systems, however, will only be realized when similar advances are made in informatic analysis of the resulting "big data." Here, we compare the capabilities of three conventional and novel statistical approaches to summarize and decipher the tomato metabolome. Both BL-SOM and WGCNA clustered metabolites by expression, but WGCNA additionally defined "modules" of co-expressed metabolites explicitly and provided additional network statistics that described the systems properties of the tomato metabolic network. Our first application of WGCNA to tomato metabolomics data identified three major modules of metabolites that were associated with ripening-related traits and genetic background.

Affiliation: Boyce Thompson Institute for Plant Research, Ithaca, New York, United States of America.

ABSTRACT

Background: Advances in "omics" technologies have revolutionized the collection of biological data. A matching revolution in our understanding of biological systems, however, will only be realized when similar advances are made in informatic analysis of the resulting "big data." Here, we compare the capabilities of three conventional and novel statistical approaches to summarize and decipher the tomato metabolome.

Methodology: Principal component analysis (PCA), batch learning self-organizing maps (BL-SOM) and weighted gene co-expression network analysis (WGCNA) were applied to a multivariate NMR dataset collected from developmentally staged tomato fruits belonging to several genotypes. While PCA and BL-SOM are appropriate and commonly used methods, WGCNA holds several advantages in the analysis of highly multivariate, complex data.

Conclusions: PCA separated the two major genetic backgrounds (AC and NC), but provided little further information. Both BL-SOM and WGCNA clustered metabolites by expression, but WGCNA additionally defined "modules" of co-expressed metabolites explicitly and provided additional network statistics that described the systems properties of the tomato metabolic network. Our first application of WGCNA to tomato metabolomics data identified three major modules of metabolites that were associated with ripening-related traits and genetic background.

Figure 4: WGCNA of whole tomato fruit metabolic profiles as represented by node and edge graph. 22 of 46 NMR-profiled metabolites were clustered into three modules (red, blue, turquoise); remaining metabolites were not assigned to any module (color coded as gray). Connection strength is represented by edge width (edges <0.10 omitted). The topological overlap measure from the WGCNA was displayed using Cytoscape to illustrate the network assembled from the 22 metabolites.
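
The topological overlap measure and the 0.10 edge cutoff named in the caption can be illustrated with a minimal Python sketch. It assumes the standard unsigned WGCNA definitions: adjacency a_ij = |cor(x_i, x_j)|^beta, and topological overlap computed from the adjacency plus the neighbors two nodes share. The soft-threshold power beta = 6, the function names, and the toy profiles are hypothetical choices for illustration; none of them are taken from the article, and this is not the authors' pipeline.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def adjacency(profiles, beta=6):
    """Unsigned soft-threshold adjacency a_ij = |cor|^beta with a zero diagonal.

    beta=6 is an assumed value; the article text here does not state the power used.
    """
    m = len(profiles)
    a = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            a[i][j] = a[j][i] = abs(pearson(profiles[i], profiles[j])) ** beta
    return a


def topological_overlap(a):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    m = len(a)
    k = [sum(row) for row in a]  # connectivity; the diagonal is zero
    tom = [[1.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            shared = sum(a[i][u] * a[u][j] for u in range(m) if u not in (i, j))
            tom[i][j] = tom[j][i] = (shared + a[i][j]) / (min(k[i], k[j]) + 1.0 - a[i][j])
    return tom


def edges_above(tom, names, cutoff=0.10):
    """Keep only edges at or above the cutoff, mirroring 'edges <0.10 omitted'."""
    m = len(names)
    return [(names[i], names[j], tom[i][j])
            for i in range(m) for j in range(i + 1, m)
            if tom[i][j] >= cutoff]


if __name__ == "__main__":
    # Hypothetical metabolite abundances across four samples.
    names = ["met_A", "met_B", "met_C"]
    profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 1, 3, 2]]
    for edge in edges_above(topological_overlap(adjacency(profiles)), names):
        print(edge)
```

An edge list of this shape (source, target, weight) is the kind of table a network viewer such as Cytoscape can import, with the weight mapped to edge width.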

Mentions: Module and metabolite-specific network statistics were calculated for this network (Tables 2 and 3). The turquoise and blue modules were the largest modules, containing 13 and 12 compounds respectively, while the red module contained 6 (Figure 3, Table 2). Fifteen metabolites were not assigned to any module, and were labeled with the color gray. All three modules are similar in centralization, while the red module is denser and less heterogeneous than the blue and turquoise modules; these features can easily be observed when the network is displayed using Cytoscape (Figure 4) [47]. Aspartate has the highest connectivity in the dataset (1.66) and, logically, the highest scaled connectivity within its module (1.00; Table 3). It surpasses all other metabolites in its number of connections, many of which are strong and link shared neighbors. Sterol, which has a single, very weak connection, has the lowest connectivity among the three modules (0.45). Fructose has the highest maximum adjacency ratio among all metabolites (0.26), which is reflected in its single, yet extremely strong connection to glucose. This connection is consistent with the precursor/product roles of glucose and fructose in glycolysis and gluconeogenesis [48]. Similar precursor/product or product/co-factor relationships were recognized by the WGCNA between isoleucine and threonine, leucine and GABA, and alanine and AMP, as evidenced by their high connectivity within the network (Figure 4) [48]. Formate has one of the lowest maximum adjacency ratios in the dataset (0.08) and has many weak connections. The lowest maximum adjacency ratio within the turquoise, blue, and red modules (0.05) belongs to tyrosine, which has a single, very weak connection. NADP+ has the highest clustering coefficient (0.11) due to the dense connections among all of its neighbors. Glucose has the lowest clustering coefficient within the three modules (0.05), which is reflected in its connection to two completely unconnected nodes.
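
The statistics named in this passage (connectivity, scaled connectivity, maximum adjacency ratio, clustering coefficient, density, centralization, heterogeneity) can be sketched in Python as follows. The formulas are the commonly used weighted-network definitions and are assumed here; the text above does not spell them out. The function names and the 4-node adjacency matrix are hypothetical and do not reproduce the values in Tables 2 and 3.

```python
from math import sqrt


def node_statistics(a):
    """Per-node statistics for a symmetric weighted adjacency matrix with entries in [0, 1]."""
    n = len(a)
    # Connectivity: sum of a node's connection strengths to all other nodes.
    k = [sum(a[i][j] for j in range(n) if j != i) for i in range(n)]
    k_max = max(k)
    # Scaled connectivity: connectivity relative to the most connected node.
    scaled = [ki / k_max if k_max > 0 else 0.0 for ki in k]
    sq = [sum(a[i][j] ** 2 for j in range(n) if j != i) for i in range(n)]
    # Maximum adjacency ratio: high when a node's connections are few but strong.
    mar = [sq[i] / k[i] if k[i] > 0 else 0.0 for i in range(n)]
    # Weighted clustering coefficient: how strongly a node's neighbors connect to each other.
    cc = []
    for i in range(n):
        tri = sum(a[i][j] * a[j][l] * a[l][i]
                  for j in range(n) for l in range(n)
                  if len({i, j, l}) == 3)
        denom = k[i] ** 2 - sq[i]
        cc.append(tri / denom if denom > 1e-12 else 0.0)
    return {"connectivity": k, "scaled_connectivity": scaled,
            "mar": mar, "clustering": cc}


def network_statistics(k):
    """Module-level statistics from a list of node connectivities (needs at least 3 nodes)."""
    n = len(k)
    total = sum(k)
    mean_k = total / n
    density = total / (n * (n - 1))
    centralization = n * (max(k) - mean_k) / ((n - 1) * (n - 2))
    # Coefficient of variation of connectivity; clamp tiny negative rounding errors.
    heterogeneity = sqrt(max(0.0, n * sum(ki ** 2 for ki in k) / total ** 2 - 1.0))
    return {"density": density, "centralization": centralization,
            "heterogeneity": heterogeneity}


if __name__ == "__main__":
    # Hypothetical adjacency matrix: three well-connected nodes and one weak outlier.
    a = [[0.0, 0.8, 0.6, 0.0],
         [0.8, 0.0, 0.5, 0.0],
         [0.6, 0.5, 0.0, 0.1],
         [0.0, 0.0, 0.1, 0.0]]
    nodes = node_statistics(a)
    print(nodes)
    print(network_statistics(nodes["connectivity"]))
```

In this toy matrix the last node has a single weak connection, so its clustering coefficient is zero, which is the same pattern the passage describes for nodes whose neighbors are not connected to one another.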

