Automated mapping of large-scale chromatin structure in ENCODE.

Lian H, Thompson WA, Thurman R, Stamatoyannopoulos JA, Noble WS, Lawrence CE - Bioinformatics (2008)

Bottom Line: The CPM produces a better fit to the observed data than the HMM. The 5.8% of the bases that are in regions of intermediate sensitivity are clearly enriched in functional elements, including genes and activating histone modifications, while the remaining 2.2% of the bases in hypersensitive regions are very strongly enriched in these elements. The CPM software is available upon request from the authors.


Affiliation: Division of Mathematical Sciences, SPMS, Nanyang Technological University, Singapore.

ABSTRACT

Motivation: A recently developed DNaseI assay has given us our first genome-wide view of chromatin structure. In addition to cataloging DNaseI hypersensitive sites, these data allow us to characterize the overall features of chromatin accessibility more completely. We employed a Bayesian hierarchical change-point model (CPM), a generalization of a hidden Markov model (HMM), to characterize tiled microarray DNaseI sensitivity data available from the ENCODE project.

Results: Our analysis shows that the accessibility of chromatin to cleavage by DNaseI is well described by a four-state model of local segments, with each state described by a continuous mixture of Gaussian variables. The CPM produces a better fit to the observed data than the HMM. The large posterior probability for the four-state CPM suggests that the data fall naturally into four classes of regions, which we call major and minor DNaseI hypersensitive sites (DHSs), regions of intermediate sensitivity, and insensitive regions. These classes agree well with a model of chromatin in which local disruptions (DHSs) are concentrated within larger domains of intermediate sensitivity, the accessibility islands. The CPM assigns 92% of the bases within the ENCODE regions to the insensitive regions. The 5.8% of the bases that are in regions of intermediate sensitivity are clearly enriched in functional elements, including genes and activating histone modifications, while the remaining 2.2% of the bases in hypersensitive regions are very strongly enriched in these elements.

Availability: The CPM software is available upon request from the authors.



Figure 1: Sample segmentation of a 27 kb region within ENr212. The x-axis is the probe number; consecutive probes with no gap between them are connected by a solid line, so that gaps are visually obvious. The figure contains four change points (c1, …, c4) and five regions (δ0, …, δ4), with each region assigned to a particular model state. Within each region, the mean and SD are shown by horizontal red dashed lines. Note that δ3 and δ4 are both in state 0, even though there is a change point at c4.
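
The relationship between change points and regions described in the caption can be illustrated with a minimal sketch. It is not the authors' software: the function names, probe counts, change-point positions, state labels and signal values below are all hypothetical, chosen only to reproduce the shape of the example (four change points, five regions, and two adjacent regions that share state 0).

```python
from statistics import mean, pstdev
from typing import List, Tuple


def split_at_change_points(n_probes: int, change_points: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) probe-index ranges, one per region."""
    bounds = [0] + sorted(change_points) + [n_probes]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


def region_summaries(signal: List[float], regions: List[Tuple[int, int]]) -> List[Tuple[float, float]]:
    """Per-region mean and SD, the quantities drawn as dashed lines in the figure."""
    return [(mean(signal[a:b]), pstdev(signal[a:b])) for a, b in regions]


# Hypothetical layout: 4 change points over 100 probes give 5 regions.
change_points = [20, 35, 50, 80]
regions = split_at_change_points(100, change_points)

# Hypothetical state labels for the five regions. The last two regions are
# both in state 0: a change point separates segments, not necessarily states.
states = [0, 2, 1, 0, 0]

# Hypothetical piecewise-constant signal, one level per region.
levels = [0.1, 3.0, 1.5, 0.2, -0.1]
signal = [lvl for lvl, (a, b) in zip(levels, regions) for _ in range(b - a)]
summaries = region_summaries(signal, regions)
```

In this sketch a change point only marks a boundary between segments with different parameters, which is why the last two regions can carry the same state label while having different means.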

Mentions: For example, consider the data shown in Figure 1. The figure contains three segments in state 0 (δ0, δ3 and δ4), each with its own mean and variance. While the means of these segments differ from one another, all three means are lower than those of the segments in higher states (δ1 and δ2), because the means of segments in state 0 are drawn from a distribution that is shifted toward lower values. Because each state of the hierarchical CPM is described by a family of normal distributions, rather than by a single distribution, the model gains flexibility to capture complex variability in the data.
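
The hierarchical idea in this passage can be sketched as a two-level sampling scheme: a state fixes a distribution over segment means, and each segment then draws its own mean before generating probe values. The code below is an illustrative simplification, not the published model. The hyperparameter values, the names `STATE_PRIORS` and `sample_segment`, and the choice to fix the within-segment SD per state (the passage says each segment also has its own variance) are all assumptions.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical hyperparameters per state:
# (mean of segment means, SD of segment means, within-segment SD).
STATE_PRIORS: Dict[int, Tuple[float, float, float]] = {
    0: (0.0, 0.3, 0.5),  # state 0: segment means centered on low values
    1: (2.0, 0.3, 0.5),
    2: (4.0, 0.3, 0.5),
}


def sample_segment(state: int, length: int, rng: random.Random) -> Tuple[float, List[float]]:
    """Draw a segment-specific mean from the state's distribution, then probe values."""
    mu0, tau, sigma = STATE_PRIORS[state]
    mu = rng.gauss(mu0, tau)  # level 1: this segment's own mean
    values = [rng.gauss(mu, sigma) for _ in range(length)]  # level 2: probe values
    return mu, values


rng = random.Random(0)
mu_a, seg_a = sample_segment(0, 30, rng)  # two segments in the same state...
mu_b, seg_b = sample_segment(0, 20, rng)  # ...each get a different mean
mu_c, seg_c = sample_segment(2, 15, rng)  # a segment in a higher state
```

Under these assumed settings, two segments in state 0 have different means, yet both are drawn from the same low-centered distribution, which mirrors the behavior described for δ0, δ3 and δ4. A plain HMM with one Gaussian per state would correspond to setting the SD of segment means to zero.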

