Leveraging non-targeted metabolite profiling via statistical genomics.

Shen M, Broeckling CD, Chu EY, Ziegler G, Baxter IR, Prenni JE, Hoekenga OA - PLoS ONE (2013)

Bottom Line: Nineteen modules returned significant results, illustrating the genetic control of biochemical networks within the maize kernel. Our approach leverages the correlations between the genome and metabolome to mutually enhance their annotation and thus enable biological interpretation. This method is applicable to any organism with sufficient bioinformatic resources.

Affiliation: Boyce Thompson Institute for Plant Research, Ithaca, New York, United States of America.

ABSTRACT
One of the challenges of systems biology is to integrate multiple sources of data in order to build a cohesive view of the system under study. Here we describe the mass spectrometry-based profiling of maize kernels, a model system for genomic studies and a cornerstone of the agroeconomy. Using a network analysis, we can include 97.5% of the 8,710 features detected from 210 varieties in a single framework. More conservatively, 47.1% of the compounds detected can be organized into a network with 48 distinct modules. Eigenvalues were calculated for each module and then used as inputs for genome-wide association studies. Nineteen modules returned significant results, illustrating the genetic control of biochemical networks within the maize kernel. Our approach leverages the correlations between the genome and metabolome to mutually enhance their annotation and thus enable biological interpretation. This method is applicable to any organism with sufficient bioinformatic resources.
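
The abstract does not spell out how a module is reduced to one value per variety. A minimal sketch, assuming the usual WGCNA convention in which a module is summarized by the first principal component of its standardized feature profiles, might look like the following. The function name `module_eigenvalue` and the toy data are hypothetical, and Python is used here only for illustration; it is not the authors' code.

```python
import numpy as np

def module_eigenvalue(features: np.ndarray) -> np.ndarray:
    """Summarize one module as a single value per sample.

    features: array of shape (n_samples, n_features) holding the intensities
    of the features assigned to one module (hypothetical input layout).
    """
    # Standardize each feature (column) to mean 0 and unit variance.
    centered = features - features.mean(axis=0)
    scaled = centered / centered.std(axis=0, ddof=1)
    # The first left singular vector gives the sample scores on the
    # first principal component, scaled to unit length.
    u, _, _ = np.linalg.svd(scaled, full_matrices=False)
    eigen = u[:, 0]
    # The sign of a singular vector is arbitrary; orient it so that it
    # correlates positively with the average standardized profile.
    if np.corrcoef(eigen, scaled.mean(axis=1))[0, 1] < 0:
        eigen = -eigen
    return eigen

# Toy data: 210 samples and 12 features sharing one underlying signal.
rng = np.random.default_rng(0)
shared = rng.normal(size=210)
module = shared[:, None] + 0.3 * rng.normal(size=(210, 12))
me = module_eigenvalue(module)
```

Each module then contributes a single trait vector (`me`) to the association analysis instead of one trait per feature, which is the variable reduction the abstract describes.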

Figure 3: Genome-wide association studies on three module eigenvalues (ME). Nineteen modules returned significantly correlated SNP markers according to GAPIT. Three are shown here. Significance thresholds were empirically calculated for each trait using GAPIT; FDR-corrected p-values at both a conservative (p<0.001; green line) and a generous (p<0.05; aqua line) threshold are displayed. MEmidnightblue identified one region of chromosome 7 with high confidence and a second region of chromosome 1 with lower confidence. MEplum2 identified multiple genomic intervals with high confidence. MEdarkslateblue identified no significant regions at the conservative threshold, but several regions at the lower threshold.
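
The caption refers to FDR-corrected p-values at two thresholds but does not describe the correction itself. As a rough illustration only, the sketch below assumes a Benjamini-Hochberg adjustment, a common FDR procedure; whether GAPIT applies exactly this calculation is an assumption, and the p-values are made-up toy numbers, not results from the paper.

```python
import numpy as np

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the i-th smallest p-value by n / i.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted

# Toy per-SNP p-values for one module trait (hypothetical values).
toy_p = [0.00001, 0.0004, 0.01, 0.03, 0.2, 0.6]
q = benjamini_hochberg(toy_p)

conservative = q < 0.001  # analogous to the green line in the caption
generous = q < 0.05       # analogous to the aqua line in the caption
```

A SNP that passes the generous threshold but not the conservative one corresponds to the "lower confidence" regions described for MEmidnightblue and MEdarkslateblue.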

Mentions: We expected that using WGCNA to analyze mass spectrometry-based non-targeted metabolite profiling data would accomplish two goals: (1) define the co-regulated networks of metabolites and peptides that contribute to maize kernel quality and composition and (2) reduce the number of variables for downstream analyses. One such analysis is a genome-wide association study (GWAS), which correlates particular genomic regions with phenotypes of interest. So far as we are aware, this approach has already been applied to maize, but not to derived variables such as module eigenvalues. Although computational resources are improving, conducting GWAS with a SNP dataset as large as that available for the Buckler Diversity Panel remains time-intensive even with optimized procedures (0.5 hr/trait, or >150 d for the original data) [12], [44]. Module eigenvalues for all 56 modules were analyzed, 19 of which yielded significant associations (FDR-corrected p-value <0.05; Table S2). Modules that were detected under the most stringent membership conditions (>6SD) were more likely to produce significant GWAS outcomes than those present only under lesser requirements (14 of 27 versus 5 of 21; Table S2). However, modules with fewer connections at 4SD were more likely to identify significantly correlated SNPs with GWAS (χ2 = 4.56, p = 0.0328). Although GWAS identified 4,830 SNPs, nearly two-thirds were associated with only two modules (plum2 and salmon). The results showed a variety of patterns, ranging from few to many SNPs and from wide to narrow distributions across the genome (Figure 3).
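
The comparison of 14 of 27 versus 5 of 21 modules is reported without a test statistic. One way a reader could check how strongly those counts differ is a Pearson chi-square test on the 2x2 table, sketched below without continuity correction. This is an illustration under that assumption, not the authors' analysis, and it is unrelated to the reported χ2 = 4.56, which concerns connection counts at 4SD.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper tail of the chi-square
    # distribution equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Counts from the text: 14 of 27 stringent (>6SD) modules were significant,
# versus 5 of 21 modules found only under lesser requirements.
stat, p_value = chi_square_2x2(14, 27 - 14, 5, 21 - 5)
print(f"chi-square = {stat:.2f}, p = {p_value:.3f}")
```

With counts this small, Fisher's exact test or a continuity-corrected statistic would be a reasonable alternative and could give a less clear-cut answer.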