Evidence for evolutionary and nonevolutionary forces shaping the distribution of human genetic variants near transcription start sites.

Scala G, Affinito O, Miele G, Monticelli A, Cocozza S - PLoS ONE (2014)

Bottom Line: We found that, in this 10 kb region, the distribution of variants depends on their frequency and on their localization relative to the TSS. We found a significant relationship between the distribution of rare variants and nucleosome occupancy scores. In conclusion, this study provides a novel and detailed view of the distribution of genomic variants around TSSs, providing insight into the forces that instigate and maintain variability in such critical regions.


Affiliation: Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli "Federico II", Naples, Italy; Dipartimento di Fisica, Università degli Studi di Napoli "Federico II", Naples, Italy; Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Naples, Italy.

ABSTRACT
The regions surrounding transcription start sites (TSSs) of genes play a critical role in the regulation of gene expression. At the same time, current evidence indicates that these regions are particularly stressed by transcription-related mutagenic phenomena. In this work we performed a genome-wide analysis of the distribution of single nucleotide polymorphisms (SNPs) inside the 10 kb region flanking human TSSs by dividing SNPs into four classes according to their frequency (rare, two intermediate classes, and common). We found that, in this 10 kb region, the distribution of variants depends on their frequency and on their localization relative to the TSS. We found that the distribution of variants is generally different for TSSs located inside or outside of CpG islands. We found a significant relationship between the distribution of rare variants and nucleosome occupancy scores. Furthermore, our analysis suggests that evolutionary (purifying selection) and nonevolutionary (biased gene conversion) forces both play a role in determining the relative SNP frequency around TSSs. Finally, we analyzed the potential pathogenicity of each class of variant using the Combined Annotation Dependent Depletion score. In conclusion, this study provides a novel and detailed view of the distribution of genomic variants around TSSs, providing insight into the forces that instigate and maintain variability in such critical regions.
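The four-way split by frequency described in the abstract can be illustrated with a minimal sketch. The class names (rare, mid1, mid2, common) follow the figure caption; the cut-off values in `CLASS_BOUNDS` are hypothetical placeholders, since this record does not state the thresholds the authors used, and `frequency_class` is an illustrative helper rather than the authors' code.

```python
from bisect import bisect_right

# Hypothetical minor-allele-frequency cut-offs (NOT taken from the paper;
# the abstract only says there are four classes).
CLASS_BOUNDS = [0.01, 0.05, 0.20]
CLASS_NAMES = ["rare", "mid1", "mid2", "common"]


def frequency_class(maf: float) -> str:
    """Assign a SNP to one of four frequency classes by its minor allele frequency."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    # bisect_right counts how many cut-offs are <= maf, which is the class index.
    return CLASS_NAMES[bisect_right(CLASS_BOUNDS, maf)]


if __name__ == "__main__":
    for maf in (0.002, 0.03, 0.10, 0.35):
        print(maf, frequency_class(maf))
```

Keeping the cut-offs in a single sorted list makes it easy to swap in the real thresholds once they are known.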

Figure 2 (pone-0114432-g002). BVF distribution is different among classes. Normalized BVF values for rare (black line), mid1 (red line), mid2 (green line) and common (blue line) variants are reported together on the same plot for CGI-TSSs (left panel) and nCGI-TSSs (right panel). A dot is placed over the bins where the difference of normalized BVF among the four classes is statistically significant. On the x-axis is the position of the bin relative to the TSS.

Mentions: Looking in more detail at the rare variant class, for CGI-TSSs we observed a depression of BVF values in the near vicinity of the TSS, with a relative peak of BVF values in the first four bins downstream of the TSS. In this restricted region, corresponding to the transcribed 200 bp after the TSS, the BVF values increased ∼1.7-fold in comparison to the corresponding upstream, nontranscribed region. In regions around nCGI-TSSs we found a significant positional effect only for rare variants. Also, a significant deviation from the BVF neutral distribution was found in the near vicinity of TSSs, but only in the downstream, transcribed region. Figure 1 also shows that the BVF signals of the four classes seem to deviate from the neutral model in different ways. This phenomenon is better shown in Figure 2, where all normalized signals are shown on the same graph.
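The downstream versus upstream comparison in the passage above can be sketched as follows. The 50 bp bin size is an inference from the text (four bins spanning 200 bp), not a stated parameter. Raw per-bin variant counts stand in for BVF here, because this record does not describe how BVF is computed or normalized. The helper names `bin_index`, `bin_counts` and `downstream_upstream_fold` are hypothetical.

```python
from collections import Counter

BIN_SIZE = 50  # bp; inferred from "first four bins" covering "200 bp" (assumption)


def bin_index(snp_pos: int, tss_pos: int, strand: str = "+") -> int:
    """Signed bin index relative to the TSS; bins 0..3 are the first 200 bp downstream."""
    offset = snp_pos - tss_pos
    if strand == "-":
        offset = -offset  # on the minus strand, downstream means lower coordinates
    # Floor division keeps upstream bins negative: bin -1 covers offsets -50..-1.
    return offset // BIN_SIZE


def bin_counts(snp_positions, tss_pos: int, strand: str = "+") -> Counter:
    """Count variants per bin around one TSS (a raw-count stand-in for BVF)."""
    return Counter(bin_index(p, tss_pos, strand) for p in snp_positions)


def downstream_upstream_fold(counts: Counter, n_bins: int = 4) -> float:
    """Ratio of variants in the first n_bins downstream to the n_bins just upstream."""
    down = sum(counts.get(i, 0) for i in range(n_bins))
    up = sum(counts.get(-i - 1, 0) for i in range(n_bins))
    if up == 0:
        raise ValueError("no upstream variants; fold change is undefined")
    return down / up


if __name__ == "__main__":
    counts = bin_counts([1000, 1010, 1060, 1199, 990, 900], tss_pos=1000)
    print(downstream_upstream_fold(counts))
```

In a genome-wide setting the counts would be pooled over all TSSs of one group (CGI or nCGI) and one frequency class before taking the ratio; this sketch handles a single TSS only.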

