Genomic sequence is highly predictive of local nucleosome depletion.

Yuan GC, Liu JS - PLoS Comput. Biol. (2007)

Bottom Line: This new approach has significantly improved the prediction accuracy. Regulatory elements are enriched in low N-score regions. While our model is derived from yeast data, the N-score pattern computed from this model agrees well with recent high-resolution protein-binding data in human.


Affiliation: Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America. gcyuan@jimmy.harvard.edu

ABSTRACT
The regulation of DNA accessibility through nucleosome positioning is important for transcription control. Computational models have been developed to predict genome-wide nucleosome positions from DNA sequences, but these models consider only nucleosome sequences, which may have limited their power. We developed a statistical multi-resolution approach to identify a sequence signature, called the N-score, that distinguishes nucleosome-binding DNA from non-nucleosome DNA. This new approach has significantly improved the prediction accuracy. The sequence information is highly predictive for local nucleosome enrichment or depletion, whereas predictions of the exact positions are only modestly more accurate than a null model, suggesting the importance of other regulatory factors in fine-tuning the nucleosome positions. The N-score in promoter regions is negatively correlated with gene expression levels. Regulatory elements are enriched in low N-score regions. While our model is derived from yeast data, the N-score pattern computed from this model agrees well with recent high-resolution protein-binding data in human.

Figure 4 (pcbi-0040013-g004): Correlation Between Poly dA:dT Run Length and N-Score, and the BLAST-Entropy Normalized Log-Ratio in Yuan et al. [6]. (A) N-score. (B) BLAST-entropy normalized log-ratio in Yuan et al. [6].

Mentions: A few short sequence features are also known to be associated with nucleosome positioning. Poly dA:dT tracts destabilize nucleosomes in vitro and in vivo [32,33]. Recent genomic studies have also associated poly dA:dT with nucleosome-free regions [2,6]. To test whether such an association can also be predicted from N-scores, we examined the relationship between the N-score distribution at poly dA:dT loci (repeat length ≥ 3) in the yeast genome and the dA:dT run length. Figure 4A shows a clear negative correlation between the N-score and the length of its center poly dA:dT tract (R = −0.15, p < 1.0 × 10⁻¹⁶), consistent with experimental results (Figure 4B).
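
The run-length analysis described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that a poly dA:dT locus is a maximal run of A or of T on the given strand, that a per-base N-score track is already available as a list (computing the N-score itself is out of scope here), and that the score at the middle base of each run is the value correlated with run length. The function names `poly_da_dt_runs`, `pearson_r`, and `run_length_vs_score` are hypothetical.

```python
import math
import re


def poly_da_dt_runs(seq, min_len=3):
    """Return (center_index, run_length) for each maximal A or T run.

    Assumption: a poly dA:dT locus is a homopolymer run of A or of T
    read on one strand, kept only if it has at least min_len bases.
    """
    runs = []
    for match in re.finditer(r"A+|T+", seq.upper()):
        length = match.end() - match.start()
        if length >= min_len:
            # Middle base of the run (left-middle for even lengths).
            center = match.start() + (length - 1) // 2
            runs.append((center, length))
    return runs


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def run_length_vs_score(seq, scores, min_len=3):
    """Correlate run length with the score at each run's center.

    scores is a hypothetical per-base N-score track aligned to seq.
    """
    runs = poly_da_dt_runs(seq, min_len)
    lengths = [length for _, length in runs]
    center_scores = [scores[center] for center, _ in runs]
    return pearson_r(lengths, center_scores)
```

On real data, `seq` would be a chromosome sequence and `scores` the matching N-score track. A negative return value would correspond to the direction of the trend reported for Figure 4A, although this sketch does not compute a p-value.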

