A systems-genetics approach and data mining tool to assist in the discovery of genes underlying complex traits in Oryza sativa.

Ficklin SP, Feltus FA - PLoS ONE (2013)

Bottom Line: GeneNet Engine does not provide the exact set of genes underlying a given complex trait, but through the evidence of gene-marker correspondence, co-expression, and functional enrichment, site visitors can identify genes with potential shared causality for a trait, which could then be used for experimental validation. A set of 2 million SNPs was incorporated into the database and serves as a potential set of testable biomarkers for genes in modules that overlap with genetic traits. Herein, we describe two modules found using GeneNet Engine, one with significant overlap with the trait amylose content and another with significant overlap with blast disease resistance.


Affiliation: Plant and Environmental Sciences, Clemson University, Clemson, South Carolina, United States of America.

ABSTRACT
Many traits of biological and agronomic significance in plants are controlled in a complex manner where multiple genes and environmental signals affect the expression of the phenotype. In Oryza sativa (rice), thousands of quantitative genetic signals have been mapped to the rice genome. In parallel, thousands of gene expression profiles have been generated across many experimental conditions. Through the discovery of networks with real gene co-expression relationships, it is possible to identify co-localized genetic and gene expression signals that implicate complex genotype-phenotype relationships. In this work, we used a knowledge-independent systems genetics approach to discover a high-quality set of co-expression networks, termed Gene Interaction Layers (GILs). Twenty-two GILs were constructed from 1,306 Affymetrix microarray rice expression profiles that were pre-clustered to allow for improved capture of gene co-expression relationships. Functional genomic and genetic data, including over 8,000 QTLs and 766 phenotype-tagged SNPs (p-value ≤ 0.001) from genome-wide association studies, both covering over 230 different rice traits, were integrated with the GILs. An online systems genetics data-mining resource, the GeneNet Engine, was constructed to enable dynamic discovery of gene sets (i.e. network modules) that overlap with genetic traits. GeneNet Engine does not provide the exact set of genes underlying a given complex trait, but through the evidence of gene-marker correspondence, co-expression, and functional enrichment, site visitors can identify genes with potential shared causality for a trait, which could then be used for experimental validation. A set of 2 million SNPs was incorporated into the database and serves as a potential set of testable biomarkers for genes in modules that overlap with genetic traits. Herein, we describe two modules found using GeneNet Engine, one with significant overlap with the trait amylose content and another with significant overlap with blast disease resistance.

Figure 6 (pone-0068551-g006): A Significant Module for Amylose Content. Module OsK25v1.0_G0023_LCM0301 significantly overlaps with 15 different genetic features (2 SNPs, 13 QTLs, p-value = 1.9e-4) and is significantly enriched for Bifunctional trypsin/alpha-amylase inhibitor helical domain and starch synthase. A) Red circles indicate nodes that overlap with genetic features; green circles indicate nodes that do not. B) The distribution of module edges along the genomic chromosomes. GWAS SNPs are barely visible as tick marks, whereas QTLs are visible as small colored blocks along the chromosomes. Edges are red if one node lies within the region of a genetic feature.

Mentions: To demonstrate the use of the GeneNet Engine, we use as an example the trait amylose content. It is well understood that the Waxy gene (Wx) plays a major role in amylose content [55]. This gene resides on chromosome 6 of Oryza sativa and is at locus LOC_Os06g04200 on the MSU v6.0 genome. A recent study of 171 rice accessions shows that two SNPs in the Waxy gene account for 86.7% of the variation in amylose content [56], indicating it is a large-effect gene. Recently, Zhao et al. included amylose content as a trait in their GWAS and identified 68 SNPs significantly associated with amylose content (mixed model p-value < 1e-4) [4]. In an effort to find small-effect loci that may affect variation in amylose content, a search was performed using the GeneNet Engine. On the search page, a filter was entered that specified the Waxy gene locus, LOC_Os06g04200, and required overlap with the amylose content trait. In this case, the genetic feature was limited to a ‘GWAS SNP’. The search yielded 6 modules from the Rice GIL collection and one from a previous global rice network [25], which has also been added to the GeneNet Explorer. Most of the network modules were small (5–15 nodes). In the GIL collection, the largest module was OsK25v1.0_G0023_LCM0301, with 30 nodes, and it had the largest average connectivity (<k> = 17.47), indicating that its nodes were more highly interconnected than those of the other 5 modules. The GeneNet Engine provides a Fisher’s p-value as a simple means for filtering modules that may have a high probability of false positives. As mentioned previously, this p-value is simply a guide and does not necessarily imply a high probability of causality for the trait. The top enriched functional terms for all 7 modules included seed storage protein (IPR006044), alpha-amylase inhibitor (IPR013771), and transcription factor CBF/NF-Y (IPR003958).
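
As a rough illustration of how an overlap p-value of this kind can be computed, the sketch below evaluates a one-sided Fisher's exact test as the upper tail of a hypergeometric distribution. The text does not describe how the GeneNet Engine builds its contingency table, so the function name, its parameters, and the toy counts are all assumptions made for illustration and are not values from the study.

```python
from math import comb


def overlap_p_value(overlap, module_size, trait_genes, background):
    """One-sided Fisher's exact test for enrichment.

    Returns P(X >= overlap), where X is the number of trait-associated
    genes found in a module of `module_size` genes drawn without
    replacement from `background` genes, of which `trait_genes` are
    associated with the trait. All arguments are hypothetical inputs.
    """
    upper = min(module_size, trait_genes)
    total = comb(background, module_size)
    tail = sum(
        comb(trait_genes, i) * comb(background - trait_genes, module_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (assumed, not from the paper): 20 background genes,
# 5 of them trait-associated, a 4-gene module with 3 overlapping genes.
p = overlap_p_value(overlap=3, module_size=4, trait_genes=5, background=20)
print(f"{p:.4f}")  # 155/4845, about 0.0320
```

A small p-value here only says that the observed overlap would be unusual under random gene sampling, which matches the caution above that the value is a filtering guide rather than evidence of causality.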
All 6 GIL collection modules were present in GIL G0023 except for one (enriched for transcription factor CBF/NF-Y), which was present in GIL G0003. Starch synthase (K00703) was also enriched in all 7 modules. All 6 of the Rice GIL modules overlapped with only 1 or 2 GWAS SNPs, with relatively high p-values (from 0.03 to 0.2), indicating a high probability of false positives. However, after including overlapping genes underlying QTLs using the ‘Filter by Trait’ tab in the Module Explorer, the p-values were all lower, and the most highly connected GIL module, OsK25v1.0_G0023_LCM0301, overlapped with 13 QTLs and 2 GWAS SNPs (15 genetic features) with a p-value of 1.9e-4 (Figure 6). The module from the global network was much larger and overlapped 4 GWAS SNPs and 34 QTLs, but had a high probability of false positives (p-value = 0.03). While p-values were not significant for some of the smaller modules, it would seem that any of these modules could be potential candidates to explore small-effect variation in amylose content. Potentially, combining several of these modules may provide, as a group, a set of possible small-effect candidate genes. The OsK25v1.0_G0023_LCM0301 module seemed most suited for exploration as it was relatively small (only 30 genes), had a significant p-value (1.9e-4), and all of its nodes were highly connected, indicating a high degree of cooperation. The effects of these genes may be examined through additional lab experiments, such as growing and phenotyping plants that carry mutations in them. As a direct means for verification through experimentation, GeneNet Explorer can provide a list of SNPs that could potentially serve as biomarkers. For module OsK25v1.0_G0023_LCM0301, over 4200 SNPs were obtained, all within 50 kb of genes that overlapped genetic features for amylose content.
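
The selection of SNPs within 50 kb of a set of genes can be sketched as a simple interval check. This is a minimal illustration only: the `Gene` and `Snp` record types, the function name, the inclusive window measured from the gene boundaries, and all coordinates are hypothetical, and the text does not say how GeneNet Explorer performs this lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    locus: str
    chrom: str
    start: int  # 1-based start coordinate (hypothetical)
    end: int    # 1-based end coordinate (hypothetical)


@dataclass(frozen=True)
class Snp:
    snp_id: str
    chrom: str
    pos: int


def snps_near_genes(genes, snps, window=50_000):
    """Return SNPs lying within `window` bp of any gene (inclusive)."""
    hits = []
    for snp in snps:
        for gene in genes:
            same_chrom = snp.chrom == gene.chrom
            in_window = gene.start - window <= snp.pos <= gene.end + window
            if same_chrom and in_window:
                hits.append(snp)
                break  # report each SNP once, even if near several genes
    return hits


# Made-up example data; identifiers and positions are not from the paper.
genes = [Gene("geneA", "chr6", 100_000, 105_000)]
snps = [
    Snp("s1", "chr6", 60_000),   # 40 kb upstream of geneA
    Snp("s2", "chr6", 40_000),   # 60 kb upstream, outside the window
    Snp("s3", "chr1", 101_000),  # different chromosome
    Snp("s4", "chr6", 155_000),  # exactly 50 kb downstream
]
print([s.snp_id for s in snps_near_genes(genes, snps)])  # ['s1', 's4']
```

The nested loop is adequate for a single module of a few dozen genes; scanning millions of SNPs against many genes would normally call for sorted positions or an interval index instead.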

