Genomic sequence is highly predictive of local nucleosome depletion.

Yuan GC, Liu JS - PLoS Comput. Biol. (2007)

Bottom Line: This new approach has significantly improved the prediction accuracy. Regulatory elements are enriched in low N-score regions. While our model is derived from yeast data, the N-score pattern computed from this model agrees well with recent high-resolution protein-binding data in human.


Affiliation: Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America. gcyuan@jimmy.harvard.edu

ABSTRACT
The regulation of DNA accessibility through nucleosome positioning is important for transcription control. Computational models have been developed to predict genome-wide nucleosome positions from DNA sequences, but these models consider only nucleosome sequences, which may have limited their power. We developed a statistical multi-resolution approach to identify a sequence signature, called the N-score, that distinguishes nucleosome-binding DNA from non-nucleosome DNA. This new approach has significantly improved the prediction accuracy. The sequence information is highly predictive for local nucleosome enrichment or depletion, whereas predictions of the exact positions are only modestly more accurate than a null model, suggesting the importance of other regulatory factors in fine-tuning the nucleosome positions. The N-score in promoter regions is negatively correlated with gene expression levels. Regulatory elements are enriched in low N-score regions. While our model is derived from yeast data, the N-score pattern computed from this model agrees well with recent high-resolution protein-binding data in human.

pcbi-0040013-g007: Application of the N-Score Model Derived from the Yeast Data to the Human Genome. (A) The average N-score pattern for all human promoters aligned by TSS [39]. (B) The average N-score pattern aligned by CTCF binding sites [13,38].

Mentions: Since the nucleosome is conserved across all eukaryotes, an intriguing question is whether the sequence preference for nucleosome binding is also conserved. Previous studies have found that the nucleosome sequence patterns in chicken and yeast are similar [7]; however, it is unclear whether the information in non-nucleosome DNA, which is critical for the prediction of local nucleosome depletion as demonstrated above, is also conserved. Recently, high-resolution genome-wide binding sites of several important histone-related proteins in human have been experimentally identified [13,38]. To test whether our model predictions agree with these data, we calculated the average N-score at the human promoter regions aligned by known transcription start sites (TSS) obtained from DBTSS [39], keeping the model parameters for the N-score computation at the values estimated from the yeast data (Figure 7A). The ∼100 bp wide dip at the TSS and the two peaks located at approximately −100 bp and +200 bp agree well with the locations of the experimentally identified NFRs and adjacent nucleosomes (Figure 2B and 2L in Barski et al. [13]), strongly suggesting that the sequence specificity of nucleosome-binding DNA is conserved across eukaryotes.
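
The TSS-aligned averaging described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes a per-base N-score track has already been computed by some other means, and the function name `average_profile`, its parameters, and the toy numbers are all hypothetical. For minus-strand sites the sketch simply reverses the window so that "upstream" always comes first; it does not recompute the score on the reverse-complement sequence.

```python
from typing import List, Sequence, Tuple


def average_profile(
    scores: Sequence[float],
    sites: Sequence[Tuple[int, str]],
    upstream: int,
    downstream: int,
) -> List[float]:
    """Average a per-base score track over windows aligned at each site.

    scores     -- hypothetical per-base score track for one chromosome
    sites      -- (position, strand) pairs, e.g. TSS coordinates; strand is '+' or '-'
    upstream   -- number of bases to include before each site
    downstream -- number of bases to include after each site
    """
    width = upstream + downstream + 1
    totals = [0.0] * width
    used = 0
    for pos, strand in sites:
        # Window bounds in track coordinates depend on the strand.
        if strand == "+":
            start, end = pos - upstream, pos + downstream
        else:
            start, end = pos - downstream, pos + upstream
        if start < 0 or end >= len(scores):
            continue  # skip sites whose window runs off the track
        window = list(scores[start:end + 1])
        if strand == "-":
            window.reverse()  # orient so index 0 is the most upstream base
        for i, value in enumerate(window):
            totals[i] += value
        used += 1
    if used == 0:
        raise ValueError("no site had a complete window")
    return [total / used for total in totals]


# Toy data (invented for illustration): two sites, one on each strand.
track = [1.0, 1.0, 0.0, 1.0, 2.0, 2.0, 0.0, 3.0]
profile = average_profile(track, [(2, "+"), (6, "-")], upstream=1, downstream=1)
print(profile)  # [2.0, 0.0, 1.5]
```

In the toy output the middle entry, which corresponds to the aligned site, is the lowest value. This is the same kind of dip the text reports at the TSS, although here it only reflects the invented numbers.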

