Dissecting genome-wide association signals for loss-of-function phenotypes in sorghum flavonoid pigmentation traits.

Morris GP, Rhodes DH, Brenton Z, Ramu P, Thayil VM, Deshpande S, Hash CT, Acharya C, Mitchell SE, Buckler ES, Yu J, Kresovich S - G3 (Bethesda) (2013)

Bottom Line: Genome-wide association studies are a powerful method to dissect the genetic basis of traits, although in practice the effects of complex genetic architecture and population structure remain poorly understood. Interestingly, a simple loss-of-function genome scan, for genotype-phenotype covariation only in the putative loss-of-function allele, is able to precisely identify the Tannin1 gene without considering relatedness. These findings highlight that complex association signals can emerge from even the simplest traits given epistasis and structured alleles, but that gene-resolution mapping of these traits is possible with high marker density and appropriate models.


Affiliation: Department of Biological Sciences, University of South Carolina, Columbia, South Carolina 29208.

ABSTRACT
Genome-wide association studies are a powerful method to dissect the genetic basis of traits, although in practice the effects of complex genetic architecture and population structure remain poorly understood. To compare mapping strategies, we dissected the genetic control of flavonoid pigmentation traits in the cereal grass sorghum by using high-resolution genotyping-by-sequencing single-nucleotide polymorphism markers. Studying the grain tannin trait, we find that general linear models (GLMs) are not able to precisely map tan1-a, a known loss-of-function allele of the Tannin1 gene, with either a small panel (n = 142) or large association panel (n = 336), and that indirect associations limit the mapping of the Tannin1 locus to Mb-resolution. A GLM that accounts for population structure (Q) or a standard mixed linear model that accounts for kinship (K) can identify tan1-a, whereas a compressed mixed linear model performs worse than the naive GLM. Interestingly, a simple loss-of-function genome scan, for genotype-phenotype covariation only in the putative loss-of-function allele, is able to precisely identify the Tannin1 gene without considering relatedness. We also find that the tan1-a allele can be mapped with gene resolution in a biparental recombinant inbred line family (n = 263) using genotyping-by-sequencing markers, but lower precision in the mapping of vegetative pigmentation traits suggests that consistent gene-level resolution will likely require larger families or multiple recombinant inbred lines. These findings highlight that complex association signals can emerge from even the simplest traits given epistasis and structured alleles, but that gene-resolution mapping of these traits is possible with high marker density and appropriate models.
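
The idea behind the loss-of-function genome scan can be illustrated with a minimal sketch. The abstract does not state the exact test used, so the scoring rule here is an assumption: an allele is kept as a candidate only if every accession carrying it shows the loss-of-function (nontannin) phenotype, and candidates are ranked by the number of such carriers. The function name `lof_scan`, the SNP names, and the toy genotypes are all hypothetical and are not data from the study.

```python
from typing import Dict, List, Tuple


def lof_scan(
    genotypes: Dict[str, List[str]],
    is_lof_phenotype: List[bool],
) -> List[Tuple[str, str, int]]:
    """Return (snp, allele, carrier_count) for alleles seen only in
    accessions with the loss-of-function phenotype.

    Assumed scoring rule (not taken from the paper): an allele qualifies
    if all of its carriers have the loss-of-function phenotype. Accessions
    with the phenotype that lack the allele do not count against it, since
    they may carry a different loss-of-function allele.
    """
    hits = []
    for snp, calls in genotypes.items():
        # "N" marks a missing genotype call in this toy encoding.
        for allele in sorted(set(calls) - {"N"}):
            carriers = [
                lof for call, lof in zip(calls, is_lof_phenotype) if call == allele
            ]
            if carriers and all(carriers):
                hits.append((snp, allele, len(carriers)))
    # Rank candidates by the number of supporting accessions.
    hits.sort(key=lambda hit: -hit[2])
    return hits


# Hypothetical toy data: six inbred accessions, three SNPs.
genotypes = {
    "snp_1": ["G", "G", "T", "T", "G", "T"],
    "snp_2": ["A", "C", "A", "C", "A", "C"],
    "snp_3": ["C", "C", "C", "T", "C", "C"],
}
# True = nontannin (loss-of-function phenotype).
is_nontannin = [False, False, True, True, True, True]

candidates = lof_scan(genotypes, is_nontannin)
print(candidates)
```

In this toy panel the fifth accession is nontannin but carries the wild-type allele at `snp_1`. A scan that looks only at carriers of the putative loss-of-function allele is not penalized by such accessions, which is one plausible reading of why it tolerates heterogeneous alleles.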


Figure 2: Distribution of tannin phenotype and loss-of-function alleles in a worldwide sorghum diversity panel. (A) Accessions plotted according to the first two principal components of sorghum population structure (small panel; n = 142), with phenotyped accessions color-coded by Tannin1 allele and tannin phenotype (nonphenotyped accessions in gray). Although some nontannin accessions are explained by tan1-a or tan1-b (green), others must be due to additional loss-of-function alleles (blue). (ND: Not Determined) (B) Distribution of wild-type allele (G; orange) and loss-of-function allele (T; red) for the tan1-a SNP (S4_61667908) in source-identified sorghum accessions.


Mentions: Why do some models that account for population structure (CMLM) perform worse than a naive model (GLM), generating a false-negative result for the tan1-a SNP? To better understand the population structure of natural variation in tannins, we characterized the distribution of pigmented testa phenotype in worldwide sorghum collections. The tannin trait segregates in all the botanical races of sorghum but shows modest population structure, with durra and guinea types having the lowest proportion of tannin accessions (15%) and caudatum and guinea-caudatum accessions having the greatest proportion (76% and 83%, respectively). However, with respect to model selection in GWAS, the structuring of the trait itself may be less important than structuring of the alleles underlying the trait. The tan1-a allele is found at high frequency in African and Indian durra accessions and at low frequency in Chinese and southern African accessions (Figure 2). The tan1-a allele explains the tannin phenotypes in all durra-derived accessions studied, but only partially accounts for the phenotype in caudatum accessions and not at all in guinea accessions. Among caudatum, guinea, and kafir types there are numerous accessions that have wild-type Tannin1 coding regions yet have nontannin phenotypes, which suggests that the population structuring of heterogeneous alleles may account for the overcorrection by the CMLM and the effective correction by the MLM.
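
The distinction between structuring of the trait and structuring of the underlying alleles can be made concrete with a small sketch. It computes, for each group, the fraction of accessions with tannins and the fraction of nontannin accessions that carry a given loss-of-function allele. The function name `summarize_by_group`, the group labels, and the records are hypothetical toy values chosen for illustration; they are not counts from the study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

Record = Tuple[str, bool, bool]  # (group, has_tannin, carries_lof_allele)


def summarize_by_group(records: Iterable[Record]) -> Dict[str, Dict[str, Optional[float]]]:
    """Per group, report the tannin fraction and the share of nontannin
    accessions that carry the loss-of-function allele."""
    counts = defaultdict(lambda: {"n": 0, "tannin": 0, "nontannin": 0, "explained": 0})
    for group, has_tannin, carries_lof in records:
        c = counts[group]
        c["n"] += 1
        if has_tannin:
            c["tannin"] += 1
        else:
            c["nontannin"] += 1
            if carries_lof:
                c["explained"] += 1
    summary = {}
    for group, c in counts.items():
        summary[group] = {
            "tannin_fraction": c["tannin"] / c["n"],
            # None when the group has no nontannin accessions to explain.
            "nontannin_explained": (
                c["explained"] / c["nontannin"] if c["nontannin"] else None
            ),
        }
    return summary


# Hypothetical toy records: both groups have the same trait frequency,
# but only group_A's nontannin accessions carry the allele.
records = [
    ("group_A", True, False),
    ("group_A", False, True),
    ("group_A", False, True),
    ("group_A", False, True),
    ("group_B", True, False),
    ("group_B", False, False),
    ("group_B", False, False),
    ("group_B", False, False),
]

summary = summarize_by_group(records)
for group, stats in summary.items():
    print(group, stats)
```

Here the two groups are indistinguishable by trait frequency, yet the allele accounts for every nontannin accession in one group and none in the other. That is the kind of allele-level structure the passage suggests may matter more for model choice than structure in the trait itself.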

