Association mapping across numerous traits reveals patterns of functional variation in maize.

Wallace JG, Bradbury PJ, Zhang N, Gibon Y, Stitt M, Buckler ES - PLoS Genet. (2014)

Bottom Line: Phenotypic variation in natural populations results from a combination of genetic effects, environmental effects, and gene-by-environment interactions. We also find that genes tagged by GWAS are enriched for regulatory functions and are ∼50% more likely to have a paralog than expected by chance, indicating that gene regulation and gene duplication are strong drivers of phenotypic variation. These results will likely apply to many other organisms, especially ones with large and complex genomes like maize.

Affiliation: Institute for Genomic Diversity, Cornell University, Ithaca, New York, United States of America.

ABSTRACT
Phenotypic variation in natural populations results from a combination of genetic effects, environmental effects, and gene-by-environment interactions. Despite the vast amount of genomic data becoming available, many pressing questions remain about the nature of genetic mutations that underlie functional variation. We present the results of combining genome-wide association analysis of 41 different phenotypes in ∼5,000 inbred maize lines to analyze patterns of high-resolution genetic association among 28.9 million single-nucleotide polymorphisms (SNPs) and ∼800,000 copy-number variants (CNVs). We show that genic and intergenic regions have opposite patterns of enrichment, minor allele frequencies, and effect sizes, implying tradeoffs among the probability that a given polymorphism will have an effect, the detectable size of that effect, and its frequency in the population. We also find that genes tagged by GWAS are enriched for regulatory functions and are ∼50% more likely to have a paralog than expected by chance, indicating that gene regulation and gene duplication are strong drivers of phenotypic variation. These results will likely apply to many other organisms, especially ones with large and complex genomes like maize.

Figure 7 (pgen-1004845-g007): Comparison of paralogous to nonparalogous genes. Maize paralogous genes (identified by Schnable & Freeling [52]) were examined for any differences from nonparalogous genes that might spuriously contribute to their enrichment in GWAS analyses. There are no strong differences in either minor allele frequency distribution (A) or linkage disequilibrium decay (B), and the slightly lower SNP density (C) (median 32.8 SNPs/kb versus 33.4 SNPs/kb for nonparalogous genes) would, if anything, be expected to decrease the probability of hitting paralogous genes, albeit by a very small amount.

Mentions: Finally, we found that genes with GWAS hits in their primary transcripts are ∼50% more likely to have a paralog than expected by chance (36.4% of 970 GWAS-hit genes vs. 24.2% of 39,656 total genes in the maize AGPv2 filtered gene set; p = 3.79×10⁻¹⁷ by two-sided exact binomial test and p = 1.06×10⁻¹⁷ by Fisher's exact test). Paralogous genes do not appear to differ significantly from non-paralogous genes in either allele frequency or LD structure, and their marginally lower SNP density would, all else being equal, disfavor their selection by GWAS (Fig. 7). Thus the enrichment for paralogous genes is probably due to the benefits of gene duplication: having redundant copies of a gene allows one of them to more easily take on altered (and phenotypically significant) roles through either subfunctionalization or neofunctionalization [31]. We also ran a parallel analysis restricted to paralogs resulting from maize's most recent genome duplication to see whether they followed a different distribution. The resulting enrichment ratio and p-value are nearly identical to those from the analysis with all paralogs (30.7% paralogous in GWAS versus 20.0% in the maize filtered gene set; p = 2.91×10⁻¹⁷ by exact binomial test), so we conclude that for this analysis the source of paralogs does not play a significant role.
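
The enrichment arithmetic in the paragraph above can be illustrated with a short sketch. This is not the authors' code: it uses only the Python standard library, the function names (log_binom_pmf, binom_upper_tail, enrichment) are hypothetical, and the gene counts are back-calculated from the rounded percentages quoted in the text, so they are assumptions. It also assumes the 30.7% figure refers to the same 970 GWAS-hit genes. The tail computed here is one-sided, so it should not be expected to match the published two-sided p-values exactly; it only shows that the quoted proportions imply roughly a 1.5-fold enrichment and a very small tail probability.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log1p(-p)


def binom_upper_tail(k, n, p):
    """One-sided exact tail P(X >= k), summed in log space for stability."""
    logs = [log_binom_pmf(i, n, p) for i in range(k, n + 1)]
    peak = max(logs)
    return math.exp(peak) * sum(math.exp(x - peak) for x in logs)


def enrichment(n_hit, frac_hit, frac_background):
    """Enrichment ratio and one-sided binomial tail for one comparison."""
    # ASSUMPTION: the hit count is reconstructed from a rounded percentage.
    k_hit = round(n_hit * frac_hit)
    return {
        "k_hit": k_hit,
        "ratio": frac_hit / frac_background,
        "p_upper": binom_upper_tail(k_hit, n_hit, frac_background),
    }


N_GWAS_GENES = 970  # GWAS-hit genes, as quoted in the text

results = {
    # 36.4% of GWAS-hit genes vs. 24.2% of the filtered gene set
    "all paralogs": enrichment(N_GWAS_GENES, 0.364, 0.242),
    # 30.7% vs. 20.0%; ASSUMPTION: same 970 GWAS-hit genes
    "most recent duplication": enrichment(N_GWAS_GENES, 0.307, 0.200),
}

for label, r in results.items():
    print(f"{label}: k = {r['k_hit']}, ratio = {r['ratio']:.2f}, "
          f"one-sided tail = {r['p_upper']:.2e}")
```

Both comparisons give a ratio close to 1.5, which is the sense in which the text describes the two enrichment ratios as nearly identical. A two-sided exact binomial test or Fisher's exact test, as used in the paper, would need the true counts rather than these reconstructed ones.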

