Limits...
Automated mapping of large-scale chromatin structure in ENCODE.

Lian H, Thompson WA, Thurman R, Stamatoyannopoulos JA, Noble WS, Lawrence CE - Bioinformatics (2008)

Bottom Line: The CPM produces a better fit to the observed data than the HMM.The 5.8% of the bases that are in regions of intermediate sensitivity are clearly enriched in functional elements, including genes and activating histone modifications, while the remaining 2.2% of the bases in hypersensitive regions are very strongly enriched in these elements.The CPM software is available upon request from the authors.

View Article: PubMed Central - PubMed

Affiliation: Division of Mathematical Sciences, SPMS, Nanyang Technological University, Singapore.

ABSTRACT

Motivation: A recently developed DNaseI assay has given us our first genome-wide view of chromatin structure. In addition to cataloging DNaseI hypersensitive sites, these data allows us to more completely characterize overall features of chromatin accessibility. We employed a Bayesian hierarchical change-point model (CPM), a generalization of a hidden Markov Model (HMM), to characterize tiled microarray DNaseI sensitivity data available from the ENCODE project.

Results: Our analysis shows that the accessibility of chromatin to cleavage by DNaseI is well described by a four state model of local segments with each state described by a continuous mixture of Gaussian variables. The CPM produces a better fit to the observed data than the HMM. The large posterior probability for the four-state CPM suggests that the data falls naturally into four classes of regions, which we call major and minor DNaseI hypersensitive sites (DHSs), regions of intermediate sensitivity, and insensitive regions. These classes agree well with a model of chromatin in which local disruptions (DHSs) are concentrated within larger domains of intermediate sensitivity, the accessibility islands. The CPM assigns 92% of the bases within the ENCODE regions to the insensitive regions. The 5.8% of the bases that are in regions of intermediate sensitivity are clearly enriched in functional elements, including genes and activating histone modifications, while the remaining 2.2% of the bases in hypersensitive regions are very strongly enriched in these elements.

Availability: The CPM software is available upon request from the authors.

Show MeSH

Related in: MedlinePlus

Intensity distributions of segments. The two panels plot histograms of probe intensities and segment means for the four states, repectively.
© Copyright Policy - creative-commons
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC2519158&req=5

Figure 6: Intensity distributions of segments. The two panels plot histograms of probe intensities and segment means for the four states, repectively.

Mentions: Figure 5 shows histograms of the distribution of segment lengths, and Figure 6 shows histograms of probe intensities and segment mean intensities for the four states. These two figures show that there is considerable overlap between the probe intensities of adjacent states, but little overlap in the mean values of the four states. Thus, Figure 6 supports the use of subinterval averaging of the CPM as an effective means of averaging across noisy probes for these data. A key feature of the CPM is that we average over adjacent probes by using the hierarchical model and do not require that the means and variances of the four classes are fixed, but rather are drawn from state specific models. The strong evidence for four classes also likely stems from this strong separation. The CPM assigns 92% of the bases to the insensitive state, 5.8% to the intermediate sensitivity states and 2.2% to the hypersensitive state.Fig. 5.


Automated mapping of large-scale chromatin structure in ENCODE.

Lian H, Thompson WA, Thurman R, Stamatoyannopoulos JA, Noble WS, Lawrence CE - Bioinformatics (2008)

Intensity distributions of segments. The two panels plot histograms of probe intensities and segment means for the four states, repectively.
© Copyright Policy - creative-commons
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC2519158&req=5

Figure 6: Intensity distributions of segments. The two panels plot histograms of probe intensities and segment means for the four states, repectively.
Mentions: Figure 5 shows histograms of the distribution of segment lengths, and Figure 6 shows histograms of probe intensities and segment mean intensities for the four states. These two figures show that there is considerable overlap between the probe intensities of adjacent states, but little overlap in the mean values of the four states. Thus, Figure 6 supports the use of subinterval averaging of the CPM as an effective means of averaging across noisy probes for these data. A key feature of the CPM is that we average over adjacent probes by using the hierarchical model and do not require that the means and variances of the four classes are fixed, but rather are drawn from state specific models. The strong evidence for four classes also likely stems from this strong separation. The CPM assigns 92% of the bases to the insensitive state, 5.8% to the intermediate sensitivity states and 2.2% to the hypersensitive state.Fig. 5.

Bottom Line: The CPM produces a better fit to the observed data than the HMM.The 5.8% of the bases that are in regions of intermediate sensitivity are clearly enriched in functional elements, including genes and activating histone modifications, while the remaining 2.2% of the bases in hypersensitive regions are very strongly enriched in these elements.The CPM software is available upon request from the authors.

View Article: PubMed Central - PubMed

Affiliation: Division of Mathematical Sciences, SPMS, Nanyang Technological University, Singapore.

ABSTRACT

Motivation: A recently developed DNaseI assay has given us our first genome-wide view of chromatin structure. In addition to cataloging DNaseI hypersensitive sites, these data allows us to more completely characterize overall features of chromatin accessibility. We employed a Bayesian hierarchical change-point model (CPM), a generalization of a hidden Markov Model (HMM), to characterize tiled microarray DNaseI sensitivity data available from the ENCODE project.

Results: Our analysis shows that the accessibility of chromatin to cleavage by DNaseI is well described by a four state model of local segments with each state described by a continuous mixture of Gaussian variables. The CPM produces a better fit to the observed data than the HMM. The large posterior probability for the four-state CPM suggests that the data falls naturally into four classes of regions, which we call major and minor DNaseI hypersensitive sites (DHSs), regions of intermediate sensitivity, and insensitive regions. These classes agree well with a model of chromatin in which local disruptions (DHSs) are concentrated within larger domains of intermediate sensitivity, the accessibility islands. The CPM assigns 92% of the bases within the ENCODE regions to the insensitive regions. The 5.8% of the bases that are in regions of intermediate sensitivity are clearly enriched in functional elements, including genes and activating histone modifications, while the remaining 2.2% of the bases in hypersensitive regions are very strongly enriched in these elements.

Availability: The CPM software is available upon request from the authors.

Show MeSH
Related in: MedlinePlus