Deleterious alleles in the human genome are on average younger than neutral alleles of the same frequency.

Kiezun A, Pulit SL, Francioli LC, van Dijk F, Swertz M, Boomsma DI, van Duijn CM, Slagboom PE, van Ommen GJ, Wijmenga C, Genome of the Netherlands Consortium, de Bakker PI, Sunyaev SR - PLoS Genet. (2013)

Bottom Line: A key challenge is to identify, among the myriad alleles, those variants that have an effect on molecular function, phenotypes, and reproductive fitness. When applied to human sequence data from the Genome of the Netherlands Project, our approach distinguishes low-frequency coding non-synonymous variants from synonymous and non-coding variants at the same allele frequency and discriminates between sets of variants independently predicted to be benign or damaging for protein structure and function. The results confirm the abundance of slightly deleterious coding variation in humans.


Affiliation: Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA.

ABSTRACT
Large-scale population sequencing studies provide a complete picture of human genetic variation within the studied populations. A key challenge is to identify, among the myriad alleles, those variants that have an effect on molecular function, phenotypes, and reproductive fitness. Most non-neutral variation consists of deleterious alleles segregating at low population frequency due to incessant mutation. To date, studies characterizing selection against deleterious alleles have been based on allele frequency (testing for a relative excess of rare alleles) or ratio of polymorphism to divergence (testing for a relative increase in the number of polymorphic alleles). Here, starting from Maruyama's theoretical prediction (Maruyama T (1974), Am J Hum Genet USA 6:669-673) that a (slightly) deleterious allele is, on average, younger than a neutral allele segregating at the same frequency, we devised an approach to characterize selection based on allelic age. Unlike existing methods, it compares sets of neutral and deleterious sequence variants at the same allele frequency. When applied to human sequence data from the Genome of the Netherlands Project, our approach distinguishes low-frequency coding non-synonymous variants from synonymous and non-coding variants at the same allele frequency and discriminates between sets of variants independently predicted to be benign or damaging for protein structure and function. The results confirm the abundance of slightly deleterious coding variation in humans.


Figure 5 (pgen-1003301-g005): Empirical cumulative distribution function of the NC statistic for alleles at minor allele count 3 in GoNL data. Synonymous derived variants serve as the baseline distribution. The distribution of NC for probably damaging derived missense variants is notably shifted towards higher values, consistent with their younger age. The NC-statistic distribution for ancestral alleles at minor allele count 3 is strongly shifted towards lower values, consistent with the much older age of those alleles.

Mentions: The NC statistic can discriminate between non-synonymous and synonymous SNPs at the same derived allele frequency (Figure 5 and Table 1) and bootstrap analysis shows that the effect is not explained by a small number of variants (Figure 6). This is consistent with the abundance of low frequency deleterious non-synonymous alleles in humans. Variants predicted to be probably damaging by PolyPhen-2 have higher values of NC statistics. Overall, we observe a positive correlation between PolyPhen-2 predictions of damaging effects of derived missense variants and the NC test statistic (Table 2). This result indicates that the NC statistic independently captures some of the same selective characteristics of variants as PolyPhen-2, and it may contain additional signal not present in the conservation or structural properties which PolyPhen-2 is based on.
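The comparison described above (empirical distributions of a per-variant statistic for two variant classes at the same allele count, plus a bootstrap check that the shift is not driven by a few variants) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the excerpt does not define how NC is computed, so the NC values below are invented, and the helper names `ecdf`, `prob_superiority`, and `bootstrap_interval` are hypothetical. The rank-based measure used here (the Mann-Whitney U statistic divided by the number of pairs) is a stand-in chosen for the sketch and may differ from the test used in the paper.

```python
import random
from bisect import bisect_right


def ecdf(sample):
    """Return F, where F(x) is the fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def prob_superiority(a, b):
    """Estimate P(A > B) + 0.5 * P(A == B) over all pairs.

    Equals the Mann-Whitney U statistic divided by len(a) * len(b);
    0.5 means no shift, values above 0.5 mean a tends to be larger.
    """
    wins = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    return wins / (len(a) * len(b))


def bootstrap_interval(a, b, n_boot=1000, seed=0):
    """Percentile 95% interval for prob_superiority, resampling variants
    with replacement within each class."""
    rng = random.Random(seed)
    stats = sorted(
        prob_superiority(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        for _ in range(n_boot)
    )
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Hypothetical NC values at one fixed allele count (NOT GoNL data).
synonymous = [0.8, 1.1, 1.3, 1.6, 2.0, 2.2, 2.5, 3.1]
damaging = [1.2, 1.9, 2.4, 2.8, 3.0, 3.3, 3.9, 4.4]

f_syn, f_dmg = ecdf(synonymous), ecdf(damaging)

# A distribution shifted towards higher values has a LOWER ECDF at a given x.
print("ECDF at 2.0 (synonymous, damaging):", f_syn(2.0), f_dmg(2.0))
print("P(damaging > synonymous):", prob_superiority(damaging, synonymous))
print("bootstrap 95% interval:", bootstrap_interval(damaging, synonymous))
```

If the bootstrap interval stays above 0.5 across resamples, the shift is unlikely to depend on a handful of extreme variants, which is the kind of robustness the excerpt attributes to its bootstrap analysis. Both samples must come from the same allele count, since the approach compares variant classes only at matched frequency.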

