A simple method for analyzing exome sequencing data shows distinct levels of nonsynonymous variation for human immune and nervous system genes.

Freudenberg J, Gregersen PK, Freudenberg-Hua Y - PLoS ONE (2012)



Affiliation: Robert S. Boas Center for Human Genetics and Genomics, The Feinstein Institute for Medical Research, Northshore LIJ Healthsystem, Manhasset, New York, United States of America. jan.freudenberg@nshs.edu

ABSTRACT
To measure the strength of natural selection that acts upon single nucleotide variants (SNVs) in a set of human genes, we calculate the ratio between nonsynonymous SNVs (nsSNVs) per nonsynonymous site and synonymous SNVs (sSNVs) per synonymous site. We transform this ratio with a respective factor f that corrects for the bias of synonymous sites towards transitions in the genetic code and different mutation rates for transitions and transversions. This method approximates the relative density of nsSNVs (rdnsv) in comparison with the neutral expectation as inferred from the density of sSNVs. Using SNVs from a diploid genome and 200 exomes, we apply our method to immune system genes (ISGs), nervous system genes (NSGs), randomly sampled genes (RSGs), and gene ontology annotated genes. The estimate of rdnsv in an individual exome is around 20% for NSGs and 30-40% for ISGs and RSGs. This smaller rdnsv of NSGs indicates overall stronger purifying selection. To quantify the relative shift of nsSNVs towards rare variants, we next fit a linear regression model to the estimates of rdnsv over different SNV allele frequency bins. The obtained regression models show a negative slope for NSGs, ISGs and RSGs, supporting an influence of purifying selection on the frequency spectrum of segregating nsSNVs. The y-intercept of the model predicts rdnsv for an allele frequency close to 0. This parameter can be interpreted as the proportion of nonsynonymous sites where mutations are tolerated to segregate with an allele frequency notably greater than 0 in the population, given the performed normalization of the observed nsSNV to sSNV ratio. A smaller y-intercept is displayed by NSGs, indicating more nonsynonymous sites under strong negative selection. This predicts more monogenically inherited or de-novo mutation diseases that affect the nervous system.
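As a rough illustration of the ratio described in the abstract, the following Python sketch computes rdnsv from four counts. The function name, the argument names, the example counts, and the choice to apply f as a simple multiplicative factor are assumptions made for illustration; the abstract does not state how f is derived or exactly how it enters the calculation.

```python
def rdnsv(n_ns_snvs, n_ns_sites, n_s_snvs, n_s_sites, f=1.0):
    """Relative density of nonsynonymous SNVs (illustrative sketch).

    n_ns_snvs  : number of observed nonsynonymous SNVs in the gene set
    n_ns_sites : number of nonsynonymous sites in the gene set
    n_s_snvs   : number of observed synonymous SNVs in the gene set
    n_s_sites  : number of synonymous sites in the gene set
    f          : correction factor for transition/transversion bias
                 (ASSUMPTION: applied multiplicatively; default 1.0
                 means no correction)
    """
    if n_ns_sites <= 0 or n_s_sites <= 0:
        raise ValueError("site counts must be positive")
    if n_s_snvs <= 0:
        raise ValueError("need at least one synonymous SNV as neutral reference")

    ns_density = n_ns_snvs / n_ns_sites  # nsSNVs per nonsynonymous site
    s_density = n_s_snvs / n_s_sites     # sSNVs per synonymous site
    return f * ns_density / s_density


# Hypothetical counts, not data from the paper.
example = rdnsv(n_ns_snvs=30, n_ns_sites=3000, n_s_snvs=50, n_s_sites=1000)
print(round(example, 3))
```

With these made-up counts the nonsynonymous density is one fifth of the synonymous density, which is the kind of value the abstract reports for NSGs in an individual exome.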


Figure 3. Estimates of rdnsv over different allele frequency bins. The estimates of rdnsv decrease with SNV allele frequency in all gene categories. The slope of the fitted regression model can be interpreted as a measure for the influence of purifying selection on segregating nsSNVs. The y-intercept (rdnsv0) can be interpreted as the proportion of nsSites where mutations are tolerated to segregate with an allele frequency notably greater than 0. A) Expression-based NSGs (circles), ISGs (triangles) or RSGs (crosses). The fitted models are rdnsv(NSG) = 0.45 − 0.061x; rdnsv(ISG) = 0.58 − 0.079x and rdnsv(RSG) = 0.58 − 0.071x. B) Keyword-based NSGs (blue) and ISGs (red). The fitted models are rdnsv(NSG) = 0.43 − 0.061x; rdnsv(ISG) = 0.58 − 0.045x.

Mentions: When rdnsv is estimated in a diploid genome or a pooled set of chromosomes, its value reflects the level of nonsynonymous variation on a mixture of SNVs that range from rare to common in their population frequency. To additionally exploit the information that is contained in the change of rdnsv with allele frequency, we next group SNVs into disjoint frequency bins and separately estimate for each bin its rdnsv value in the pooled set of 200 exomes. This shows that expression-based NSGs display a reduced rdnsv value across all frequency bins (Figure 3a). When we further estimate rdnsv for our keyword-based candidate genes, we again see smaller rdnsv values for NSGs across all bins (Figure 3b). Additionally, rdnsv tends to decrease with SNV allele frequency in all sets of candidate genes.
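The binning and regression step described in this paragraph can be sketched as an ordinary least-squares fit of rdnsv against frequency bin. This is a minimal sketch, not the authors' code: the function name `fit_line`, the use of integer bin indices as the x values, and the per-bin rdnsv values are all hypothetical, since the excerpt does not give the bin boundaries or the underlying estimates.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope). Written in plain Python so the
    arithmetic is visible; a statistics library would do the same.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# ASSUMPTION: x is an index over disjoint allele frequency bins,
# ordered from rarest (0) to most common (4).
bin_index = [0, 1, 2, 3, 4]
# Hypothetical per-bin rdnsv estimates, not values from the paper.
rdnsv_per_bin = [0.44, 0.40, 0.33, 0.27, 0.20]

rdnsv0, slope = fit_line(bin_index, rdnsv_per_bin)
print(f"rdnsv0 (y-intercept) = {rdnsv0:.3f}, slope = {slope:.3f}")
```

A negative slope in such a fit corresponds to rdnsv decreasing with allele frequency, and the intercept plays the role of rdnsv0, the predicted rdnsv for an allele frequency close to 0.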

