Genomic sequence is highly predictive of local nucleosome depletion.

Yuan GC, Liu JS - PLoS Comput. Biol. (2007)

Bottom Line: This new approach has significantly improved the prediction accuracy. Regulatory elements are enriched in low N-score regions. While our model is derived from yeast data, the N-score pattern computed from this model agrees well with recent high-resolution protein-binding data in human.


Affiliation: Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America. gcyuan@jimmy.harvard.edu

ABSTRACT
The regulation of DNA accessibility through nucleosome positioning is important for transcription control. Computational models have been developed to predict genome-wide nucleosome positions from DNA sequences, but these models consider only nucleosome sequences, which may have limited their power. We developed a statistical multi-resolution approach to identify a sequence signature, called the N-score, that distinguishes nucleosome-binding DNA from non-nucleosome DNA. This new approach has significantly improved the prediction accuracy. The sequence information is highly predictive for local nucleosome enrichment or depletion, whereas predictions of the exact positions are only modestly more accurate than a null model, suggesting the importance of other regulatory factors in fine-tuning the nucleosome positions. The N-score in promoter regions is negatively correlated with gene expression levels. Regulatory elements are enriched in low N-score regions. While our model is derived from yeast data, the N-score pattern computed from this model agrees well with recent high-resolution protein-binding data in human.

Figure 2 (pcbi-0040013-g002): The Average Promoter N-Score Pattern
(A) The average N-score pattern over promoters for all verified non–chromosome III genes. Promoters are aligned by the ATG codon.
(B) The average log-ratio over non–chromosome III promoters probed by the tiling array [6].
(C) Same as (A), except that promoters are divided into groups according to the gene transcription rate r (in mRNA/h) as in Holstege et al. [29]. Different curves correspond to different gene groups.

Mentions: One of the most striking features of global nucleosome positioning is that most active promoters contain a nucleosome-free region (NFR) near transcription start sites (TSS). This feature has been identified in a number of organisms including yeast [4,6], Drosophila [10], and human [11–13]. This overall NFR pattern has also been predicted from sequence information by aligning the promoters of all yeast genes at their initial ATG codon and evaluating the average predicted nucleosome occupancy [7,25]. We repeated the analysis but averaged the N-score pattern instead. The results are shown in Figure 2A. Consistent with previous studies, we found good agreement between the average N-score pattern and the experimental data on NFR (Figure 2B). Both the N-score and NFR have a pronounced dip near −200 bp, with a width of about 150 bp. The N-score is noticeably higher in coding than in promoter regions. This is also consistent with the experimentally verified bias of the nucleosome occupancy [3,4]. Compared with the results from [7,25], the N-score curve appears to be smoother and less oscillatory (Figures S1A and S1B).
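The alignment-and-averaging step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the dictionary layout (position relative to the ATG mapped to a score), and the toy profiles are all assumptions made for illustration, and the toy values are invented rather than taken from the paper.

```python
from statistics import mean


def average_aligned_profile(profiles, upstream, downstream):
    """Average per-base scores over promoters aligned at the ATG (position 0).

    profiles: list of dicts mapping position relative to the ATG (bp) to a
    score. This layout is a hypothetical choice for the sketch. Positions
    missing from a promoter are skipped rather than counted as zero.
    """
    averaged = {}
    for pos in range(-upstream, downstream + 1):
        values = [p[pos] for p in profiles if pos in p]
        if values:
            averaged[pos] = mean(values)
    return averaged


def locate_dip(averaged):
    """Return the aligned position with the lowest average score."""
    return min(averaged, key=averaged.get)


def toy_profile(center, depth, half_width=75):
    """Invented V-shaped profile: baseline 1.0 with a dip of the given depth."""
    return {
        pos: 1.0 - depth * max(0.0, 1.0 - abs(pos - center) / half_width)
        for pos in range(-500, 201)
    }


# Three invented promoters, each with a dip centered 200 bp upstream of the ATG.
profiles = [toy_profile(-200, 1.0), toy_profile(-200, 2.0), toy_profile(-200, 1.5)]
avg = average_aligned_profile(profiles, upstream=500, downstream=200)
dip_position = locate_dip(avg)
print(dip_position, avg[dip_position])
```

The same averaging could be applied separately to groups of promoters binned by transcription rate, which is the kind of comparison shown in Figure 2C.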

