From mouse to human: evolutionary genomics analysis of human orthologs of essential genes.

Georgi B, Voight BF, Bućan M - PLoS Genet. (2013)

Bottom Line: Studies in model organisms identified a significant fraction of essential genes through the analysis of mutations that lead to lethality. Consistent with the action of strong purifying selection, these genes exhibit comparatively reduced levels of sequence variation, a skew in allele frequency towards rarer variants, and increased conservation across the primate and rodent lineages relative to the remainder of genes in the genome. While incomplete, our set of human orthologs shows characteristics fully consistent with essential function in humans and thus provides a resource to inform and facilitate interpretation of sequence data in studies of human disease.


Affiliation: Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.

ABSTRACT
Understanding the core set of genes that are necessary for basic developmental functions is one of the central goals in biology. Studies in model organisms identified a significant fraction of essential genes through the analysis of mutations that lead to lethality. Recent large-scale next-generation sequencing efforts have provided unprecedented data on genetic variation in humans. However, the evolutionary and genomic characteristics of human essential genes have never been directly studied on a genome-wide scale. Here we use detailed phenotypic resources available for the mouse and deep genomic sequencing data from human populations to characterize patterns of genetic variation and mutational burden in a set of 2,472 human orthologs of known essential genes in the mouse. Consistent with the action of strong purifying selection, these genes exhibit comparatively reduced levels of sequence variation, a skew in allele frequency towards rarer variants, and increased conservation across the primate and rodent lineages relative to the remainder of genes in the genome. In individual genomes we observed ~12 rare mutations within essential genes that are predicted to be damaging. Consistent with the hypothesis that mutations in essential genes are risk factors for neurodevelopmental disease, we show that de novo variants in patients with Autism Spectrum Disorder are more likely to occur in this collection of genes. While incomplete, our set of human orthologs shows characteristics fully consistent with essential function in humans and thus provides a resource to inform and facilitate interpretation of sequence data in studies of human disease.


Population genetics properties of essential genes. A) Average numbers of exonic missense variants in EG, NLG and ALL. The plotted Z-score is normalized relative to the genome average. The plotted range is truncated to visualize differences between gene sets, with a full log-transformed plot available in Figure S7. B) Differences in the allele frequency distributions in four continental populations of the 1000G data for EG, NLG and ALL. A data point above the zero line corresponds to a relative excess of variants of a given allele frequency. Essential genes contain significantly more rare variants than either NLG or ALL. The reported p-values are with respect to all 1000 Genomes samples combined.
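The quantity plotted in panel B can be illustrated with a minimal sketch. It assumes the difference is taken between the fraction of variants per allele-frequency bin in a gene set and the same fraction in a background set; the caption does not state the exact procedure. The bin edges, the function names, and the toy frequencies are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right

def frequency_spectrum(freqs, edges):
    """Return the fraction of variants that fall into each allele-frequency bin."""
    counts = [0] * (len(edges) - 1)
    for f in freqs:
        if not edges[0] <= f <= edges[-1]:
            raise ValueError(f"allele frequency {f} outside the bin range")
        # Bins are half-open [lo, hi); a value equal to the top edge goes in the last bin.
        i = min(bisect_right(edges, f) - 1, len(counts) - 1)
        counts[i] += 1
    return [c / len(freqs) for c in counts]

def spectrum_difference(set_freqs, background_freqs, edges):
    """Per-bin difference in variant fractions: gene set minus background."""
    a = frequency_spectrum(set_freqs, edges)
    b = frequency_spectrum(background_freqs, edges)
    return [x - y for x, y in zip(a, b)]

# Hypothetical bin edges and toy allele frequencies (illustration only).
EDGES = [0.0, 0.01, 0.05, 0.5, 1.0]
eg_freqs = [0.001, 0.002, 0.004, 0.03, 0.2]
all_freqs = [0.001, 0.02, 0.03, 0.1, 0.3, 0.6, 0.004, 0.2]

diff = spectrum_difference(eg_freqs, all_freqs, EDGES)
for lo, hi, d in zip(EDGES, EDGES[1:], diff):
    print(f"[{lo}, {hi}): {d:+.3f}")
```

A positive value in the lowest bin corresponds to a point above the zero line, i.e. a relative excess of rare variants in the gene set. Because both spectra sum to one, the per-bin differences sum to zero.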

Mentions: In addition to evolutionary constraint across species, we hypothesized that genes identified as essential in the mouse should also be subject to significant background selection in recent human history. This pressure would be expected to leave a signature of (a) a reduction in overall polymorphism levels, particularly in the levels of missense and loss-of-function mutations, and (b) a skewing of the allele frequency distribution towards increasingly rare variants in EG relative to NLG. Using data from the 1000 Genomes Project [18] Phase 1 release, and after controlling for the total exon length in each gene, we observed a significant reduction in the level of exonic single nucleotide polymorphisms (SNPs) in EG relative to either NLG or ALL (Wilcoxon Test P = 1.08×10⁻⁵⁹, Figure 2A) as well as a shift in the distribution of allele frequencies towards rare variants (Wilcoxon Test P = 3.12×10⁻³⁵, Figure 2B, Figure S8). Both of these results hold even after stratifying by continental population group (Asian, African, American or European) or when considering individual subpopulations (Table S3, Table S4). To ensure that this observed constraint is not simply a result of the sequencing technology used to produce the data or the number of individuals characterized, we also examined re-sequencing data reported recently for ~200 drug-target genes in 14,002 individuals [28]. After adjusting for total exon length, we confirmed a significant reduction in the level of polymorphisms among 55 EG compared to 115 NLG in this set of genes (Wilcoxon Test P = 9.78×10⁻⁷).
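The comparison described above can be sketched as a two-sample Wilcoxon rank-sum test on length-normalized SNP counts. This is a minimal illustration under stated assumptions, not the authors' pipeline: dividing by exon length is only one possible way to control for it, the p-value uses a normal approximation without tie or continuity correction, and all gene counts and lengths below are made up.

```python
import math

def snps_per_kb(snp_count, exon_length_bp):
    """Length-normalized polymorphism level (assumed normalization)."""
    return 1000.0 * snp_count / exon_length_bp

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Returns (U statistic for x, z-score, p-value). The approximation is
    rough for very small samples and ignores the tie correction.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u1, z, p

# Hypothetical (SNP count, total exon length in bp) pairs, illustration only.
eg_genes = [(2, 1500), (1, 2000), (3, 3000), (0, 1200), (2, 2500)]
nlg_genes = [(6, 1500), (5, 2000), (9, 3000), (4, 1200), (7, 2500)]

eg_density = [snps_per_kb(n, length) for n, length in eg_genes]
nlg_density = [snps_per_kb(n, length) for n, length in nlg_genes]

u, z, p = rank_sum_test(eg_density, nlg_density)
print(f"U = {u:.1f}, z = {z:.2f}, p = {p:.4f}")
```

A negative z-score here means the first group (EG in this toy input) tends to have lower SNP density than the second. For real data, an established implementation with exact or tie-corrected p-values would be the safer choice.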

