Identification and analysis of unitary pseudogenes: historic and contemporary gene losses in humans and other primates.

Zhang ZD, Frankish A, Hunt T, Harrow J, Gerstein M - Genome Biol. (2010)

Bottom Line: Furthermore, we identify 11 unitary pseudogenes that are polymorphic - that is, they have both nonfunctional and functional alleles currently segregating in the human population. Comparing them with their orthologs in other primates, we find that two of them are in fact pseudogenes in non-human primates, suggesting that they represent cases of a gene being resurrected in the human lineage. This analysis of unitary pseudogenes provides insights into the evolutionary constraints faced by different organisms and the timescales of functional gene loss in humans.

Affiliation: Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA. zdzmg@gersteinlab.org

ABSTRACT

Background: Unitary pseudogenes are a class of unprocessed pseudogenes without functioning counterparts in the genome. They constitute only a small fraction of annotated pseudogenes in the human genome. However, as they represent distinct functional losses over time, they shed light on the unique features of humans in primate evolution.

Results: We have developed a pipeline to detect human unitary pseudogenes through analyzing the global inventory of orthologs between the human genome and its mammalian relatives. We focus on gene losses along the human lineage after the divergence from rodents about 75 million years ago. In total, we identify 76 unitary pseudogenes, including previously annotated ones, and many novel ones. By comparing each of these to its functioning ortholog in other mammals, we can approximately date the creation of each unitary pseudogene (that is, the gene 'death date') and show that for our group of 76, the functional genes appear to be disabled at a fairly uniform rate throughout primate evolution - not all at once, correlated, for instance, with the 'Alu burst'. Furthermore, we identify 11 unitary pseudogenes that are polymorphic - that is, they have both nonfunctional and functional alleles currently segregating in the human population. Comparing them with their orthologs in other primates, we find that two of them are in fact pseudogenes in non-human primates, suggesting that they represent cases of a gene being resurrected in the human lineage.

Conclusions: This analysis of unitary pseudogenes provides insights into the evolutionary constraints faced by different organisms and the timescales of functional gene loss in humans.

© Copyright Policy - open-access
Figure 6: Population structure analysis for SNP rs4940595. (a) Hierarchical clustering of 11 populations using the FST metric. Two subdivisions in the meta-population, as indicated by the dashed line, are clearly visible in the cluster. (b) Histogram of FST from the permutation test using the population subdivisions as seen in (a).

Mentions: Various genomic and genetic features of the HapMap SNPs rs17097921, rs4940595, and rs2842899 are summarized in Table 3 (see Table S4 in Additional file 1 for allele frequency information). Each of the nonsense alleles should effectively pseudogenize the gene, as all three SNPs are located in the early part of the coding sequences. Using the HapMap genotype data, several recent studies [30,31] scanned the human genome to detect positive selection in human populations. These three SNPs were not found to be under recent positive selection. Such negative results, however, could reflect a lack of detection power due to limitations in the data and/or the methods. The human reference alleles of all three SNPs are pseudogenic. The reference alleles in other primates are functional for rs17097921 but pseudogenic for both rs4940595 and rs2842899. Using the genotype and allele frequency data from the HapMap Project, we test whether the two alleles of each SNP are at Hardy-Weinberg equilibrium in each population and in all populations combined. Our statistical analysis shows that, in the meta-population, the two alleles, T/G, of rs4940595 are not at Hardy-Weinberg equilibrium (χ² goodness-of-fit test, degrees of freedom = 2, χ² = 8.659, P = 0.013). We calculate FST between pairs of populations to measure their difference (distance), and the FST metric shows population subdivision in the meta-population. Hierarchical clustering groups the 11 populations into two subdivisions: one composed of the Europeans in Utah, the Tuscans in Italy, and the Gujarati Indians in Houston, Texas, and the other comprising the remaining eight populations (Figure 6a). The FST between these two subdivisions is 0.044, which is highly significant based on the permutation test (Figure 6b). Such population structure at rs4940595 - the difference in the allelic frequencies in different populations - could be the result, and thus a sign, of different selective regimes that the same allele at rs4940595 is subjected to in different population subdivisions.
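
The statistics named in this passage can be illustrated with a minimal Python sketch. It is not the authors' code: the excerpt does not say which FST estimator or permutation scheme was used, so the sketch assumes Wright's FST = (H_T - H_S) / H_T for a single biallelic SNP and a permutation test that shuffles allele copies between two groups. All function names are hypothetical. The only figure taken from the text is the reported test result (χ² = 8.659, degrees of freedom = 2), whose P value the closed-form tail probability reproduces.

```python
import math
import random


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit statistic against Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch supports only df = 1 or df = 2")


def fst_two_groups(p1, n1, p2, n2):
    """Assumed estimator: Wright's FST for one biallelic SNP and two groups.

    p1, p2 are allele frequencies; n1, n2 are sample sizes used as weights.
    """
    w1 = n1 / (n1 + n2)
    w2 = 1.0 - w1
    p_bar = w1 * p1 + w2 * p2
    h_t = 2 * p_bar * (1 - p_bar)                          # pooled heterozygosity
    h_s = w1 * 2 * p1 * (1 - p1) + w2 * 2 * p2 * (1 - p2)  # mean within-group
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def permutation_p_value(alleles_1, alleles_2, n_perm=1000, seed=0):
    """Assumed scheme: shuffle 0/1 allele copies between the two groups."""
    def fst(a, b):
        return fst_two_groups(sum(a) / len(a), len(a), sum(b) / len(b), len(b))

    observed = fst(alleles_1, alleles_2)
    pooled = list(alleles_1) + list(alleles_2)
    rng = random.Random(seed)
    k = len(alleles_1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if fst(pooled[:k], pooled[k:]) >= observed:
            hits += 1
    # Add-one correction so the estimated P value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# P value implied by the statistic reported in the text (df = 2).
reported_p = chi2_sf(8.659, 2)
```

With the reported statistic, `chi2_sf(8.659, 2)` evaluates to exp(-8.659 / 2), which rounds to the stated P = 0.013. Reproducing the FST of 0.044 would require the HapMap allele counts, which are not included in this excerpt.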

