Multi-scale chromatin state annotation using a hierarchical hidden Markov model

ABSTRACT

Chromatin-state analysis is widely applied in studies of development and disease. However, existing methods operate at a single length scale and therefore cannot distinguish large domains from isolated elements of the same type. To overcome this limitation, we present a hierarchical hidden Markov model, diHMM, to systematically annotate chromatin states at multiple length scales. We apply diHMM to analyse a public ChIP-seq data set. diHMM not only accurately captures nucleosome-level information, but also identifies domain-level states that vary in nucleosome-level state composition, spatial distribution and functionality. The domain-level states recapitulate known patterns such as super-enhancers, bivalent promoters and Polycomb repressed regions, and identify additional patterns whose biological functions are not yet characterized. By integrating chromatin-state information with gene expression and Hi-C data, we identify context-dependent functions of nucleosome-level states. Thus, diHMM provides a powerful tool for investigating the role of higher-order chromatin structure in gene regulation.

Figure 3: Context-specific functionality of diHMM nucleosome-level states. (a,b) Heatmaps represent average gene expression (z-score for each gene and cell line obtained from a panel of 17 cell lines studied by ENCODE2) for genes mapped to enhancers in different domain contexts. In each row, genes are selected by proximity (±2 kb from TSS) to nucleosome-level enhancers (states N9 to N13) in super-enhancer domains (D10–D13) or in the rest of the domains, as indicated by the small cartoon in each heatmap. Each column represents the average gene expression values for the different sets of genes when estimated in different cell lines. Numbers indicate the fraction of enhancers distributed between the different domains. (c–e) Heatmaps represent average gene expression for genes mapped to bivalent promoter state N7 in different domain contexts as indicated.
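
As a rough illustration of how one heatmap row of this kind could be computed, the sketch below standardizes each gene's expression across cell lines and then averages the z-scores over a gene set, giving one value per cell line. It assumes the z-score is taken per gene across the cell-line panel, which the caption suggests but does not spell out. The function names, gene names and expression values are all hypothetical and are not taken from the article.

```python
from statistics import mean, pstdev


def zscore_rows(expr):
    """Standardize each gene across cell lines.

    expr: dict mapping gene -> list of expression values, one per cell line.
    Assumption: the z-score is computed per gene across the panel.
    """
    out = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = pstdev(values)
        if sd == 0:
            # A gene with constant expression carries no cell-type signal.
            out[gene] = [0.0] * len(values)
        else:
            out[gene] = [(v - mu) / sd for v in values]
    return out


def heatmap_row(zscores, gene_set):
    """Average z-score per cell line over the genes in gene_set."""
    rows = [zscores[g] for g in gene_set if g in zscores]
    if not rows:
        return []
    # zip(*rows) iterates over cell lines (columns).
    return [mean(col) for col in zip(*rows)]


# Toy data: three hypothetical genes measured in three hypothetical cell lines.
expr = {
    "geneA": [10.0, 1.0, 1.0],
    "geneB": [8.0, 2.0, 2.0],
    "geneC": [3.0, 3.0, 3.0],
}
z = zscore_rows(expr)
row = heatmap_row(z, ["geneA", "geneB"])
```

A row dominated by one strongly positive column, as in this toy example, is the pattern one would read as cell-type-specific expression.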

diHMM provides an opportunity to systematically investigate how the function of enhancer elements is influenced by large-scale chromatin organization, an effect that cannot be evaluated with a single-scale model. For example, the enhancer state N13 occurs in both poised enhancer (D8) and super-enhancer (D10) domains (Fig. 2d and Supplementary Fig. 6), but its spatial context differs markedly between them. In D8, it transitions to heterochromatin (N27, N28) and the Polycomb repressed state (N26), whereas in D10 it often transitions to strong enhancer states (N9–N11) or transcribed enhancer states (N14–N19). To test whether such contextual differences are functionally relevant, we divided the nucleosome-level enhancer states (N9–N13) into two broad categories, those in super-enhancer domains and those in all other domains, and compared the expression levels of their target genes. Remarkably, the expression of genes associated with enhancers in super-enhancer domains was much more cell-type specific (Fig. 3), indicating that this subset of enhancers may play a more important role in maintaining cell identity than other enhancers. This difference was not obvious for other enhancer-associated domains (poised enhancer, upstream enhancer and intron/enhancer) (Supplementary Fig. 8). We also compared our super-enhancer domains with the super-enhancers originally identified by Young and colleagues [23] and found a high degree of overlap, which justifies the name (Supplementary Figs 9a and 10). These domains also overlapped strongly with stretch enhancers [22] and broad H3K4me3 domains [24] (Supplementary Fig. 10). Next, we observed that downregulated genes were typically associated with bivalent promoter nucleosome-level states in the context of Polycomb repressed domains (Fig. 3b). We repeated this analysis for other domain-level contexts and found a weaker trend for bivalent promoter domains (Fig. 3).
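
The partitioning step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes each genomic bin carries both a nucleosome-level and a domain-level label, uses the state groupings from the text and the Figure 3 caption (enhancer states N9–N13, super-enhancer domains D10–D13, a ±2 kb window around the TSS), and treats bins as half-open intervals. The function names, coordinates and gene names are hypothetical.

```python
ENHANCER_STATES = {"N9", "N10", "N11", "N12", "N13"}
SUPER_ENHANCER_DOMAINS = {"D10", "D11", "D12", "D13"}
TSS_WINDOW = 2000  # +/- 2 kb around the transcription start site


def split_enhancers(bins):
    """Split enhancer bins by domain context.

    bins: iterable of (chrom, start, end, nucleosome_state, domain_state),
    with half-open coordinates [start, end).
    Returns (enhancers in super-enhancer domains, enhancers elsewhere).
    """
    in_se, elsewhere = [], []
    for chrom, start, end, nuc, dom in bins:
        if nuc not in ENHANCER_STATES:
            continue  # not a nucleosome-level enhancer state
        target = in_se if dom in SUPER_ENHANCER_DOMAINS else elsewhere
        target.append((chrom, start, end))
    return in_se, elsewhere


def genes_near(enhancers, tss):
    """Return genes whose TSS window overlaps at least one enhancer bin.

    tss: dict mapping gene -> (chrom, position).
    """
    hits = set()
    for gene, (chrom, pos) in tss.items():
        lo, hi = pos - TSS_WINDOW, pos + TSS_WINDOW
        for e_chrom, start, end in enhancers:
            if e_chrom == chrom and start <= hi and end > lo:
                hits.add(gene)
                break
    return hits


# Toy annotation: the same state N13 in two different domain contexts,
# plus one non-enhancer bin.
bins = [
    ("chr1", 1000, 1200, "N13", "D10"),    # super-enhancer domain
    ("chr1", 50000, 50200, "N13", "D8"),   # poised enhancer domain
    ("chr1", 90000, 90200, "N26", "D8"),   # Polycomb repressed state
]
tss = {
    "geneA": ("chr1", 2500),
    "geneB": ("chr1", 51000),
    "geneC": ("chr1", 90100),
}
in_se, elsewhere = split_enhancers(bins)
se_genes = genes_near(in_se, tss)
other_genes = genes_near(elsewhere, tss)
```

The two resulting gene sets would then be compared by their expression across cell lines; a linear scan over enhancers is used here only for clarity and would be replaced by an interval index on genome-scale data.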

