Leveraging non-targeted metabolite profiling via statistical genomics.

Shen M, Broeckling CD, Chu EY, Ziegler G, Baxter IR, Prenni JE, Hoekenga OA - PLoS ONE (2013)

Bottom Line: Nineteen modules returned significant results, illustrating the genetic control of biochemical networks within the maize kernel. Our approach leverages the correlations between the genome and metabolome to mutually enhance their annotation and thus enable biological interpretation. This method is applicable to any organism with sufficient bioinformatic resources.


Affiliation: Boyce Thompson Institute for Plant Research, Ithaca, New York, United States of America.

ABSTRACT
One of the challenges of systems biology is to integrate multiple sources of data in order to build a cohesive view of the system under study. Here we describe the mass spectrometry-based profiling of maize kernels, a model system for genomic studies and a cornerstone of the agroeconomy. Using a network analysis, we can include 97.5% of the 8,710 features detected from 210 varieties in a single framework. More conservatively, 47.1% of the compounds detected can be organized into a network with 48 distinct modules. Eigenvalues were calculated for each module and then used as inputs for genome-wide association studies. Nineteen modules returned significant results, illustrating the genetic control of biochemical networks within the maize kernel. Our approach leverages the correlations between the genome and metabolome to mutually enhance their annotation and thus enable biological interpretation. This method is applicable to any organism with sufficient bioinformatic resources.
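The per-module summary described in the abstract can be illustrated with a small sketch. In WGCNA this summary is usually called the module eigengene: the first principal component of the standardized abundances of the features in a module, giving one score per variety that can then serve as a trait in an association study. The code below is only an illustration under assumptions: the function names and the toy matrix are hypothetical, power iteration stands in for the singular value decomposition a real analysis would use, and every feature is assumed to vary across samples.

```python
import math
import statistics


def standardize(column):
    """Center a feature to mean 0 and scale it to unit (population) SD."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)  # assumes the feature is not constant
    return [(x - mean) / sd for x in column]


def module_eigengene(samples_by_features, iterations=200):
    """Return first-principal-component scores, one per sample.

    Power iteration on X^T X finds the leading loading vector v;
    the eigengene is X v, rescaled to unit length.
    """
    columns = [standardize(list(col)) for col in zip(*samples_by_features)]
    rows = [list(row) for row in zip(*columns)]  # back to samples x features
    n_features = len(columns)
    v = [1.0 / math.sqrt(n_features)] * n_features
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, v)) for row in rows]
        new_v = [sum(s * row[j] for s, row in zip(scores, rows))
                 for j in range(n_features)]
        norm = math.sqrt(sum(w * w for w in new_v))
        v = [w / norm for w in new_v]
    scores = [sum(x * w for x, w in zip(row, v)) for row in rows]
    norm = math.sqrt(sum(s * s for s in scores))
    return [s / norm for s in scores]


# Hypothetical toy module: 5 varieties (rows) x 3 correlated features (columns).
toy_module = [
    [1.0, 2.1, 0.9],
    [2.0, 3.9, 2.1],
    [3.0, 6.2, 2.9],
    [4.0, 8.0, 4.2],
    [5.0, 9.8, 5.0],
]
eigengene = module_eigengene(toy_module)
print(eigengene)
```

Because all three toy features rise together across the five varieties, the resulting scores rise as well; a single vector of this kind per module is what would be passed to the association analysis.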

Figure 2 (pone-0057667-g002): Visualization of maize grain metabolome. This node and edge projection describes the grain metabolome observed in the methanolic extract from 210 inbred line varieties of maize. This network requires a minimum degree of connectivity between any two nodes (i.e. biochemical markers detected by mass spectrometry) that exceeds four standard deviations above the mean connectivity observed between detected markers. According to this threshold, 4,102 nodes are organized into 48 modules, each represented by a particular color. However, some modules have separated into multiple, distinct clusters because their internal connectivity may fall beneath the 4 standard deviation cutoff, such that there are 101 objects in this projection.

Mentions: For our dataset, 97.5% of the detected molecular features (nodes) could be included in a network with 56 defined modules (Table S2). The network was then pruned to require that the minimum connectivity between nodes exceed 4 standard deviations (SD) above the mean connectivity observed between all nodes. At this threshold, the network contained 48 modules and 4,102 nodes (47% of nodes, 3.1% of the theoretical connections; Figure 2). The network was redefined under even stricter terms, using a 6 SD threshold (Table S2). As the modules were defined by the strength of the correlations among members, modules varied in size and membership according to the inclusion threshold. For example, the turquoise module in the initial description had 2,105 nodes and ∼3.73 million edges (Table S2). At the 4 SD threshold, the turquoise module was reduced to 1,597 nodes with more than 0.62 million edges, and at the 6 SD threshold it shrank further to 635 nodes with 40,217 connections. Nodes within the turquoise module were also connected with members of the black module, which likewise contained connections to both the turquoise and purple modules. Other modules were much less elaborate; orange contained 81 nodes in its initial description and 63 nodes at 1 SD, dropping to 9 nodes and 56 connections at 4 SD and disappearing completely at 6 SD (Table S2). At the 4 SD threshold, some modules broke into distinct clusters because connections that had helped to define the original module dropped below the significance threshold (Figure 2). This facet of the WGCNA procedure represents both a strength and a weakness of the approach. Information can be applied to poorly connected members of a particular module using guilt by association with tightly connected central elements. However, as the module eigenvalues were estimated when the network was initially described, the poorly connected nodes may transmit an excessive degree of variance to these values and perhaps confound downstream applications.
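The pruning step described above, and the way it can split a module into separate clusters, can be sketched as follows. This is a minimal illustration under assumptions, not the authors' pipeline: the node names, the connectivity values and the function names are hypothetical, and the toy example uses cutoffs of 0 and 1 SD because a network of only 15 pairs cannot contain a value 4 SD above its own mean.

```python
import itertools
import statistics


def prune_network(weights, k_sd):
    """Keep only pairs whose connectivity exceeds mean + k_sd * SD.

    weights maps frozenset({a, b}) to a connectivity value; the mean and
    SD are taken over all pairs, as in the description above.
    """
    values = list(weights.values())
    cutoff = statistics.fmean(values) + k_sd * statistics.pstdev(values)
    return {pair: w for pair, w in weights.items() if w > cutoff}


def connected_clusters(edges):
    """Group the nodes that still have an edge into connected clusters."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path compression
            node = parent[node]
        return node

    for pair in edges:
        a, b = tuple(pair)
        parent[find(a)] = find(b)

    clusters = {}
    for node in list(parent):
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())


# Hypothetical six-node network: weak background plus a few strong links.
nodes = "ABCDEF"
weights = {frozenset(p): 0.05 for p in itertools.combinations(nodes, 2)}
weights[frozenset("AB")] = 0.9
weights[frozenset("BC")] = 0.9
weights[frozenset("DE")] = 0.9
weights[frozenset("CD")] = 0.3  # moderate bridge between the two groups

loose = connected_clusters(prune_network(weights, k_sd=0))
strict = connected_clusters(prune_network(weights, k_sd=1))
print(sorted(sorted(c) for c in loose))
print(sorted(sorted(c) for c in strict))
```

At the looser cutoff the moderate C-D link survives and holds five nodes together in one cluster; at the stricter cutoff that link is lost and the group separates into two clusters, which mirrors how a module defined on the full network can appear as several objects after pruning. Node F, which has only weak links, drops out at both cutoffs.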

