Evolutionary dynamics of co-segregating gene clusters associated with complex diseases.

Preuss C, Riemenschneider M, Wiedmann D, Stoll M - PLoS ONE (2012)

Bottom Line: We observed distinct clustering of disease-associated SNPs in evolutionarily rearranged regions of low recombination and high gene density, which harbor genes involved in immunity, such as the interleukin cluster on 5q31 or RhoA on 3p21. Our results suggest that multiple lineage-specific rearrangements led to a physical clustering of functionally related and linked genes exhibiting an enrichment of susceptibility loci for complex traits. This implies that, besides recent evolutionary adaptations, other evolutionary dynamics have played a role in the formation of linked gene clusters associated with complex disease traits.


Affiliation: Genetic Epidemiology of Vascular Disorders, Leibniz Institute for Arteriosclerosis Research (LIFA) at the University of Muenster, Muenster, Germany. christoph.preuss@lifa-muenster.de

ABSTRACT

Background: The distribution of human disease-associated mutations is not random across the human genome. Despite the fact that natural selection continually removes disease-associated mutations, an enrichment of these variants can be observed in regions of low recombination. There are a number of mechanisms by which such a clustering could occur, including genetic perturbations or demographic effects within different populations. Recent genome-wide association studies (GWAS) suggest that single nucleotide polymorphisms (SNPs) associated with complex disease traits are not randomly distributed throughout the genome, but tend to cluster in regions of low recombination.

Principal findings: Here we investigated whether deleterious mutations have accumulated in regions of low recombination due to the impact of recent positive selection and genetic hitchhiking. Using publicly available data on common complex diseases and population demography, we observed an enrichment of hitchhiked disease associations in conserved gene clusters subject to selection pressure. Evolutionary analysis revealed that these conserved gene clusters arose by multiple concerted rearrangement events across the vertebrate lineage. We observed distinct clustering of disease-associated SNPs in evolutionarily rearranged regions of low recombination and high gene density, which harbor genes involved in immunity, such as the interleukin cluster on 5q31 or RhoA on 3p21.

Conclusions: Our results suggest that multiple lineage-specific rearrangements led to a physical clustering of functionally related and linked genes exhibiting an enrichment of susceptibility loci for complex traits. This implies that, besides recent evolutionary adaptations, other evolutionary dynamics have played a role in the formation of linked gene clusters associated with complex disease traits.


Figure 5 (pone-0036205-g005): Plots of the Crohn’s disease risk locus at chromosome 3p21. (A) Map of the 3p21 risk locus containing -log(P) values of SNPs, LD blocks defined by proxy SNPs with r2 > 0.8, and positions of SNPs considered iHS signals (light colour) or strong iHS signals (darker colour) for the three HapMap populations (CEU: blue, ASN: yellow, YRI: brown). (B) Reference allele frequencies of SNPs showing allele frequency differences in the 95th percentile between at least two of the three populations according to 1000 Genomes data. (C) Density plot of reference allele frequencies of SNPs associated with Crohn’s disease. Allele frequencies were retrieved from 1000 Genomes (CEU: blue, ASN: yellow, YRI: brown). (D) Percentages of SNPs associated with Crohn’s disease that are iHS signals (left) or show allele frequency differences in the 95th percentile between populations (right).

Mentions: The second population-specific risk locus is located on chromosome 3p21 and includes genes such as GPX1, MST1 and BSN (Figure 5A) [19]. In contrast to the region on chromosome 5q31, the enrichment of disease variants is not accompanied by a selective sweep within the European population. For the GPX1 gene, a recent selective sweep in the Asian population has already been established [20] and could be reproduced in this study. Within LD of the CD SNPs associated in the European population, a large number of strong iHS signals could be observed in the ASN dataset (72% of SNPs) and a more moderate, but still elevated, number in the YRI data (11%, see Figure 5B). Consistently, the allele frequencies show large differences. In the ASN population, most CD SNPs feature extreme reference allele frequencies, which are lower than 0.2 or greater than 0.8 (Figure 5C). These differences in allele frequencies due to recent selection events might have had a profound impact on disease prevalence between populations. While there is a strong association with CD in Europeans, an association signal in individuals of Asian ancestry could not be replicated. The substantial variation in the frequency of disease variants due to recent selective events across human populations may point to differences in disease prevalence between the populations. In the European and African populations, allele frequencies revolve around an intermediate range. This suggests that the strong wide-range selective sweep in the ASN population also had effects on CD SNPs (Figure 5D) and might in fact have lowered the disease risk originating from 3p21. The index SNP rs3197999 was shown to be a non-synonymous coding SNP in the MST1 gene [21]. Its risk allele A is more common in EUR (0.28) and YRI (0.24) and less frequent in ASN (0.08).
Thus, assuming that the risk conferred by disease variants is constant across populations, our data suggest that the common-disease-common-variant hypothesis does not necessarily extend across populations, since risk alleles discovered in the European population are found at extremely low or high frequencies in other populations.
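
The frequency comparison in this passage can be illustrated with a minimal Python sketch. It uses only the three risk allele frequencies quoted above for rs3197999 and the 0.2/0.8 cut-offs the text gives for "extreme" frequencies. The function names `is_extreme` and `pairwise_differences` are hypothetical and are not taken from the study. The text applies the cut-offs to reference allele frequencies, and it does not state whether A is the reference allele; because the cut-offs are symmetric around 0.5, the classification is the same for either allele. The 95th-percentile criterion from Figure 5 is not reproduced here, since it requires the genome-wide distribution of frequency differences.

```python
from itertools import combinations

# Risk allele A frequencies for rs3197999, as quoted in the text.
RISK_ALLELE_FREQ = {"EUR": 0.28, "YRI": 0.24, "ASN": 0.08}


def is_extreme(freq, low=0.2, high=0.8):
    """Return True if freq lies outside the intermediate range [low, high]."""
    return freq < low or freq > high


def pairwise_differences(freqs):
    """Absolute allele frequency difference for every pair of populations."""
    return {
        (a, b): round(abs(freqs[a] - freqs[b]), 2)
        for a, b in combinations(freqs, 2)
    }


extreme = {pop: is_extreme(f) for pop, f in RISK_ALLELE_FREQ.items()}
diffs = pairwise_differences(RISK_ALLELE_FREQ)

for pop, flag in extreme.items():
    print(f"{pop}: frequency {RISK_ALLELE_FREQ[pop]:.2f}, extreme={flag}")
for (a, b), d in diffs.items():
    print(f"{a} vs {b}: difference {d:.2f}")
```

Under these inputs, only the ASN frequency falls outside the intermediate range, and the two comparisons involving ASN show larger differences than the EUR versus YRI comparison, which matches the pattern described in the passage.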

