Patterns of positive selection in six Mammalian genomes.

Kosiol C, Vinar T, da Fonseca RR, Hubisz MJ, Bustamante CD, Nielsen R, Siepel A - PLoS Genet. (2008)

Bottom Line: The increased phylogenetic depth of this dataset results in substantially improved statistical power, and permits several new lineage- and clade-specific tests to be applied. A detailed analysis of Affymetrix exon array data indicated that PSGs are expressed at significantly lower levels, and in a more tissue-specific manner, than non-PSGs. Genes that are specifically expressed in the spleen, testes, liver, and breast are significantly enriched for PSGs, but no evidence was found for an enrichment for PSGs among brain-specific genes.


Affiliation: Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York, United States of America.

ABSTRACT
Genome-wide scans for positively selected genes (PSGs) in mammals have provided insight into the dynamics of genome evolution, the genetic basis of differences between species, and the functions of individual genes. However, previous scans have been limited in power and accuracy owing to small numbers of available genomes. Here we present the most comprehensive examination of mammalian PSGs to date, using the six high-coverage genome assemblies now available for eutherian mammals. The increased phylogenetic depth of this dataset results in substantially improved statistical power, and permits several new lineage- and clade-specific tests to be applied. Of approximately 16,500 human genes with high-confidence orthologs in at least two other species, 400 genes showed significant evidence of positive selection (FDR<0.05), according to a standard likelihood ratio test. An additional 144 genes showed evidence of positive selection on particular lineages or clades. As in previous studies, the identified PSGs were enriched for roles in defense/immunity, chemosensory perception, and reproduction, but enrichments were also evident for more specific functions, such as complement-mediated immunity and taste perception. Several pathways were strongly enriched for PSGs, suggesting possible co-evolution of interacting genes. A novel Bayesian analysis of the possible "selection histories" of each gene indicated that most PSGs have switched multiple times between positive selection and nonselection, suggesting that positive selection is often episodic. A detailed analysis of Affymetrix exon array data indicated that PSGs are expressed at significantly lower levels, and in a more tissue-specific manner, than non-PSGs. Genes that are specifically expressed in the spleen, testes, liver, and breast are significantly enriched for PSGs, but no evidence was found for an enrichment for PSGs among brain-specific genes. 
This study provides additional evidence for widespread positive selection in mammalian evolution and new genome-wide insights into the functional implications of positive selection.


Figure 2 (pgen-1000144-g002): Hierarchical clustering of 27 over-represented GO categories identified by the Mann-Whitney U test (“biological process” group only), based on the genes assigned to each category. This dendrogram is derived from a dissimilarity matrix defined such that any two GO categories, X and Y, have dissimilarity 0 when all genes assigned to X are also assigned to Y (or vice-versa), and dissimilarity 1 when the sets of genes assigned to X and Y do not overlap. Specifically, X and Y have dissimilarity $1 - |S_X \cap S_Y| / \min(|S_X|, |S_Y|)$, where $S_C$ denotes the (nonempty) set of genes assigned to GO category C. Thus, GO categories associated with similar sets of genes group together in the dendrogram, even if these categories are not closely related in the GO hierarchy (such as “cytolysis” and “single fertilization”). Full names of abbreviated categories (*) are “humoral immune response mediated by circulating immunoglobulin,” “activation of plasma proteins during acute inflammatory response,” and “adaptive immune response based on somatic recombination of immune receptors built from immunoglobulin superfamily domains.” (Dendrogram produced by the hclust function in R with method = “average”.)
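The dissimilarity described in the caption, and the average-linkage clustering built on it, can be sketched in a few lines. This is a minimal illustration, not the authors' code: the caption says the dendrogram came from R's hclust, whereas the version below is a plain-Python reimplementation of average linkage. It assumes the dissimilarity is one minus the overlap divided by the smaller set size, which matches the stated endpoints (0 for nested sets, 1 for disjoint sets). The function names and the gene identifiers g1 to g7 are hypothetical placeholders, not data from the paper.

```python
from itertools import combinations


def dissimilarity(genes_x, genes_y):
    """Return 1 - |X & Y| / min(|X|, |Y|) for two nonempty gene sets."""
    if not genes_x or not genes_y:
        raise ValueError("gene sets must be nonempty")
    overlap = len(genes_x & genes_y)
    return 1.0 - overlap / min(len(genes_x), len(genes_y))


def average_linkage(categories):
    """Agglomerative clustering with average linkage.

    `categories` maps a category name to its set of genes.
    Returns the merges in order as (cluster_a, cluster_b, height),
    where each cluster is a tuple of category names.
    """
    names = sorted(categories)
    # Pairwise dissimilarities between individual categories.
    d = {
        frozenset((a, b)): dissimilarity(categories[a], categories[b])
        for a, b in combinations(names, 2)
    }

    def dist(c1, c2):
        # Average linkage: mean dissimilarity over all cross-cluster pairs.
        total = sum(d[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((c1, c2, dist(c1, c2)))
        merged = tuple(sorted(c1 + c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [merged]
    return merges


# Hypothetical toy input: placeholder gene identifiers only.
toy = {
    "cytolysis": {"g1", "g2", "g3"},
    "single fertilization": {"g2", "g3", "g4", "g5"},
    "taste perception": {"g6", "g7"},
}
merges = average_linkage(toy)
```

In this toy input the two categories that share genes merge first, and the category with no shared genes joins last at height 1, which is the behaviour the caption describes for categories that are distant in the GO hierarchy but share many genes.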

Mentions: The identified PSGs are significantly enriched for a large number of functional categories, according to the Gene Ontology (GO) [32] and Protein Analysis Through Evolutionary Relationships (PANTHER) databases (Tables 2, S2, and S3). If these over-represented categories are clustered by the PSGs that are assigned to them, major groups corresponding to sensory perception, immunity, and defense emerge (Figure 2), in agreement with previous genome-wide scans [4],[5]. However, the increased power of our analysis allows biological processes and functions associated with positive selection to be identified at much finer resolution than in previous analyses, as discussed below. The increased power also seems to diminish the dependency of functional enrichments on the database or statistical methodology selected for the analysis. In particular, better agreement was observed between functional categories over-represented among the identified PSGs, as determined by Fisher's exact test (FET), and categories whose genes displayed a significant shift toward smaller LRT P-values (whether or not they met the significance threshold for PSGs), as determined by the Mann-Whitney U (MWU) test (see Methods). Better agreement was also observed between analyses based on the GO and PANTHER databases (see Tables S2 and S3). The observed enrichments do not appear to be an artifact of differences between categories in gene length or alignment depth per gene (Text S1). In the discussion below, we focus on GO categories and nominal P-values based on the MWU test, as applied to P-values from the LRT for selection on any branch of the tree (except when otherwise indicated); full results are shown in Table 2 and Text S1.
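The two enrichment tests contrasted above answer slightly different questions, which a small sketch may make concrete. FET works on counts of genes that passed the PSG threshold, while MWU compares the full distributions of LRT P-values inside and outside a category. The code below is an illustrative simplification, not the authors' pipeline: the FET is written as a one-sided hypergeometric tail, the MWU P-value uses a normal approximation without tie or continuity correction, and all function names and numeric inputs are hypothetical.

```python
from math import comb, erfc, sqrt


def fisher_exact_enrichment(psg_in_cat, cat_size, psg_total, gene_total):
    """One-sided Fisher's exact test for over-representation.

    Probability of seeing at least `psg_in_cat` PSGs in a category of
    `cat_size` genes, given `psg_total` PSGs among `gene_total` genes.
    """
    upper = min(cat_size, psg_total)
    tail = sum(
        comb(psg_total, k) * comb(gene_total - psg_total, cat_size - k)
        for k in range(psg_in_cat, upper + 1)
    )
    return tail / comb(gene_total, cat_size)


def mann_whitney_shift(in_cat, out_cat):
    """Mann-Whitney U for a shift of `in_cat` toward smaller values.

    Returns (U, p), where U counts (in, out) pairs with in < out
    (ties count 0.5) and p is a one-sided normal-approximation
    P-value. Simplification: no tie or continuity correction.
    """
    n1, n2 = len(in_cat), len(out_cat)
    u = sum(
        1.0 if a < b else 0.5 if a == b else 0.0
        for a in in_cat
        for b in out_cat
    )
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    # Upper-tail probability of a standard normal at z.
    return u, 0.5 * erfc(z / sqrt(2))


# Hypothetical LRT P-values for genes inside and outside one category.
inside = [0.001, 0.002, 0.003]
outside = [0.2, 0.5, 0.7, 0.9]
u_stat, p_shift = mann_whitney_shift(inside, outside)

# Hypothetical counts: all 3 genes in a category are PSGs, 3 of 10 overall.
p_fet = fisher_exact_enrichment(3, 3, 3, 10)
```

Because MWU uses every gene's P-value rather than a pass/fail call, it can flag a category whose genes lean toward small P-values even when few of them individually reach the PSG threshold, which is why the text notes the "whether or not they met the significance threshold" distinction.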


