A simple method for analyzing exome sequencing data shows distinct levels of nonsynonymous variation for human immune and nervous system genes.

Freudenberg J, Gregersen PK, Freudenberg-Hua Y - PLoS ONE (2012)

Bottom Line: This parameter can be interpreted as the proportion of nonsynonymous sites where mutations are tolerated to segregate with an allele frequency notably greater than 0 in the population, given the performed normalization of the observed nsSNV to sSNV ratio. A smaller y-intercept is displayed by NSGs, indicating more nonsynonymous sites under strong negative selection. This predicts more monogenically inherited or de-novo mutation diseases that affect the nervous system.


Affiliation: Robert S. Boas Center for Human Genetics and Genomics, The Feinstein Institute for Medical Research, Northshore LIJ Healthsystem, Manhasset, New York, United States of America. jan.freudenberg@nshs.edu

ABSTRACT
To measure the strength of natural selection that acts upon single nucleotide variants (SNVs) in a set of human genes, we calculate the ratio between nonsynonymous SNVs (nsSNVs) per nonsynonymous site and synonymous SNVs (sSNVs) per synonymous site. We transform this ratio with a respective factor f that corrects for the bias of synonymous sites towards transitions in the genetic code and different mutation rates for transitions and transversions. This method approximates the relative density of nsSNVs (rdnsv) in comparison with the neutral expectation as inferred from the density of sSNVs. Using SNVs from a diploid genome and 200 exomes, we apply our method to immune system genes (ISGs), nervous system genes (NSGs), randomly sampled genes (RSGs), and gene ontology annotated genes. The estimate of rdnsv in an individual exome is around 20% for NSGs and 30-40% for ISGs and RSGs. This smaller rdnsv of NSGs indicates overall stronger purifying selection. To quantify the relative shift of nsSNVs towards rare variants, we next fit a linear regression model to the estimates of rdnsv over different SNV allele frequency bins. The obtained regression models show a negative slope for NSGs, ISGs and RSGs, supporting an influence of purifying selection on the frequency spectrum of segregating nsSNVs. The y-intercept of the model predicts rdnsv for an allele frequency close to 0. This parameter can be interpreted as the proportion of nonsynonymous sites where mutations are tolerated to segregate with an allele frequency notably greater than 0 in the population, given the performed normalization of the observed nsSNV to sSNV ratio. A smaller y-intercept is displayed by NSGs, indicating more nonsynonymous sites under strong negative selection. This predicts more monogenically inherited or de-novo mutation diseases that affect the nervous system.
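The regression step described in the abstract can be illustrated with a minimal sketch. It assumes ordinary least squares on allele frequency bin midpoints; the exact binning and fitting procedure used by the authors is not given in this excerpt, and the function name `fit_line` and all numbers below are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical allele frequency bin midpoints and rdnsv estimates per bin
# (illustrative values only, not data from the paper).
bin_midpoints = [0.05, 0.15, 0.25, 0.35, 0.45]
rdnsv_per_bin = [0.38, 0.34, 0.30, 0.26, 0.22]

slope, intercept = fit_line(bin_midpoints, rdnsv_per_bin)
print(f"slope = {slope:.2f}, y-intercept = {intercept:.2f}")
```

In this toy input the slope is negative, the pattern the abstract attributes to purifying selection, and the y-intercept is the predicted rdnsv for an allele frequency close to 0.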


Figure 1 (pone-0038087-g001): Relative density of nsSNVs (rdnsv) in different gene sets as estimated with different SNV datasets. Nervous system genes (NSG, light grey) show a smaller rdnsv than immune system genes (ISG, medium grey) or randomly sampled genes (RSG, dark grey) in a European diploid genome sequence (A, C) and a pooled set of 200 European exome sequences (B, D). The greater rdnsv in the pooled 200 exomes than the individual genome indicates an enrichment of nsSNVs among rare SNVs.

Mentions: However, to additionally relate the Rn/Rs ratio to the neutral expectation, it is important to consider that transition mutations occur with higher likelihood than transversion mutations and that transitions are enriched among synonymous changes in the genetic code [31]. Here we correct for this nonsynonymous to synonymous mutation rate bias by multiplying the observed Rn/Rs ratio with a respective factor f that is defined by equation (2) in the Materials and Methods. This strategy is designed to estimate the relative density of nonsynonymous variants as compared with neutral expectation (rdnsv) as defined above by equation (3). We estimate rdnsv to be around 20% in NSGs, around 31% in ISGs and around 38% in RSGs with the SNVs from the diploid genome (Figure 1a, Table 1a). We next retrieve a second set of candidate genes through keyword search of the EntrezGene database [27]. These keyword-based candidates may differ from the expression-based candidates in the sense that they are more likely to have been experimentally studied in detail. When analyzing the SNVs from the diploid genome in these keyword-based candidates, the estimates of rdnsv are ∼21% in NSGs and ∼41% in ISGs (Figure 1c, Table 1a). Thus, keyword-based NSGs again display a significantly (P < 10^-5) smaller level of nonsynonymous variation than ISGs (Table S2b).
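The correction described above amounts to multiplying the observed Rn/Rs ratio by the factor f. The following is a minimal sketch of that arithmetic, not the authors' implementation: equations (2) and (3) are not reproduced in this excerpt, so f is simply taken as an input, and the function name `rdnsv`, the counts, and the value of f are all hypothetical.

```python
def rdnsv(ns_snvs, ns_sites, s_snvs, s_sites, f):
    """Relative density of nsSNVs, computed as f * (Rn / Rs).

    Rn = nonsynonymous SNVs per nonsynonymous site
    Rs = synonymous SNVs per synonymous site
    f  = correction factor (defined by equation (2) of the paper;
         supplied by the caller here because it is not shown in this excerpt)
    """
    if ns_sites <= 0 or s_sites <= 0:
        raise ValueError("site counts must be positive")
    if s_snvs <= 0:
        raise ValueError("at least one synonymous SNV is needed to form the ratio")
    rn = ns_snvs / ns_sites
    rs = s_snvs / s_sites
    return f * rn / rs


# Hypothetical counts for one gene set; f = 1.25 is a placeholder value,
# not the factor reported in the paper.
estimate = rdnsv(ns_snvs=240, ns_sites=750_000, s_snvs=500, s_sites=250_000, f=1.25)
print(f"rdnsv = {estimate:.2f}")
```

With these made-up inputs the estimate is about 0.20, which under the paper's interpretation would mean nonsynonymous variants occur at roughly 20% of the density expected under neutrality.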

