Evidence for evolutionary and nonevolutionary forces shaping the distribution of human genetic variants near transcription start sites.

Scala G, Affinito O, Miele G, Monticelli A, Cocozza S - PLoS ONE (2014)

Bottom Line: We found that, in this 10 kb region, the distribution of variants depends on their frequency and on their localization relative to the TSS. We found a significant relationship between the distribution of rare variants and nucleosome occupancy scores. In conclusion, this study provides a novel and detailed view of the distribution of genomic variants around TSSs, providing insight into the forces that instigate and maintain variability in such critical regions.


Affiliation: Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli "Federico II", Naples, Italy; Dipartimento di Fisica, Università degli Studi di Napoli "Federico II", Naples, Italy; Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Naples, Italy.

ABSTRACT
The regions surrounding transcription start sites (TSSs) of genes play a critical role in the regulation of gene expression. At the same time, current evidence indicates that these regions are particularly stressed by transcription-related mutagenic phenomena. In this work we performed a genome-wide analysis of the distribution of single nucleotide polymorphisms (SNPs) inside the 10 kb region flanking human TSSs by dividing SNPs into four classes according to their frequency (rare, two intermediate classes, and common). We found that, in this 10 kb region, the distribution of variants depends on their frequency and on their localization relative to the TSS. We found that the distribution of variants is generally different for TSSs located inside or outside of CpG islands. We found a significant relationship between the distribution of rare variants and nucleosome occupancy scores. Furthermore, our analysis suggests that evolutionary (purifying selection) and nonevolutionary (biased gene conversion) forces both play a role in determining the relative SNP frequency around TSSs. Finally, we analyzed the potential pathogenicity of each class of variant using the Combined Annotation Dependent Depletion score. In conclusion, this study provides a novel and detailed view of the distribution of genomic variants around TSSs, providing insight into the forces that instigate and maintain variability in such critical regions.

Figure 3 (pone-0114432-g003): Nucleosome density distribution is different between CGI-TSSs and nCGI-TSSs. The BNP values are plotted together for CGI-TSSs (black line) and nCGI-TSSs (red line). On the x-axis is the position of the bin relative to the TSS.

Mentions: In general, one can expect variants belonging to different classes (frequency and/or CGI) to be differentially susceptible to different evolutionary forces. Rare variants are likely to be more closely linked to the mutational process, and their frequency is likely to be influenced by the presence of mutational “hotspots”. By contrast, stochastic and evolutionary events (such as drift and selection) can influence the localization of common variants. As a first step, we decided to explore forces potentially affecting the distribution of “rare” variants. It is well known that DNA packaging structures such as nucleosomes can affect the emergence of novel mutations, thus influencing the presence of low-frequency variants in a genomic region. Therefore, we searched for possible relationships between nucleosome position and rare variant distribution. We downloaded nucleosome positioning scores of the Gm12878 cell line from the UCSC “Stanf Nucleosome” track. Following an approach analogous to the BVF computation (see Materials and Methods), we computed the “average nucleosome positioning score” for each bin (BNP) by averaging the nucleosome scores for that bin across all TSSs. As expected [27]–[28], the nucleosome positioning distribution around the TSS differed between CGI-TSSs and nCGI-TSSs (Fisher p-value <1e-4), with CGI-TSSs characterized by a marked depression in nucleosome density in the proximity of the TSS (Figure 3).
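The per-bin averaging described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the input layout (a dictionary of per-base scores), the 100 bp bin size, the ±5 kb window, the strand handling, and the choice to pool all per-base scores in a bin before averaging are all assumptions made for illustration; the actual procedure is the one given in the paper's Materials and Methods.

```python
def bin_index(pos, tss, strand, half_window, bin_size):
    """Return the bin index of `pos` relative to `tss`, or None if outside the window.

    Positions are oriented by strand so that negative offsets are upstream.
    """
    rel = (pos - tss) if strand == "+" else (tss - pos)
    if rel < -half_window or rel >= half_window:
        return None
    return (rel + half_window) // bin_size


def average_bin_scores(tss_sites, scores, half_window=5000, bin_size=100):
    """Average nucleosome scores per bin over all TSSs (a hypothetical BNP-like value).

    tss_sites: list of (chrom, position, strand) tuples.
    scores:    dict mapping (chrom, position) -> nucleosome positioning score.
    Returns a list with one average per bin; bins without data are None.
    """
    n_bins = (2 * half_window) // bin_size
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for chrom, tss, strand in tss_sites:
        # Scan every base in the window around this TSS.
        for pos in range(tss - half_window, tss + half_window + 1):
            score = scores.get((chrom, pos))
            if score is None:
                continue
            b = bin_index(pos, tss, strand, half_window, bin_size)
            if b is None:
                continue
            sums[b] += score
            counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Under these assumptions, calling `average_bin_scores` once on the CGI-TSSs and once on the nCGI-TSSs would give the two curves that a plot like Figure 3 compares. Averaging per-TSS bin means instead of pooling per-base scores is an equally plausible reading of the text and would weight TSSs with sparse score coverage differently.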

