Predicting chromatin organization using histone marks.

Huang J, Marco E, Pinello L, Yuan GC - Genome Biol. (2015)

Bottom Line: To aid experimental effort and to understand the determinants of long-range chromatin interactions, we have developed a computational model integrating Hi-C and histone mark ChIP-seq data to predict two important features of chromatin organization: chromatin interaction hubs and topologically associated domain (TAD) boundaries. Cell-type specific histone mark information is required for prediction of chromatin interaction hubs but not for TAD boundaries. Our predictions provide a useful guide for the exploration of chromatin organization.

Affiliation: Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA. jhuang@jimmy.harvard.edu.

ABSTRACT
Genome-wide mapping of three-dimensional chromatin organization is an important yet technically challenging task. To aid experimental effort and to understand the determinants of long-range chromatin interactions, we have developed a computational model integrating Hi-C and histone mark ChIP-seq data to predict two important features of chromatin organization: chromatin interaction hubs and topologically associated domain (TAD) boundaries. Our model accurately and robustly predicts these features across datasets and cell types. Cell-type-specific histone mark information is required for prediction of chromatin interaction hubs but not for TAD boundaries. Our predictions provide a useful guide for the exploration of chromatin organization.

Fig. 4: Analysis of the Rao2014 dataset. (a) Workflow for identifying hubs from the raw interaction matrix. (b) Comparison between the Rao2014 and Jin2013 datasets. Genome browser snapshots show two hubs adjacent to the LIN28A locus (indicated in red and blue, respectively) and their associated targets in each dataset. (c) Prediction accuracy for the Rao2014 IMR90 hubs. The ROC curves correspond to the testing data. AUC scores are shown in parentheses. (d) Prediction accuracy when applying the Rao2014 IMR90 model to predict hubs in other datasets (Jin2013) or cell types (GM12878 (Rao2014) and K562 (Rao2014)). The ROC curves correspond to the testing data. AUC scores are shown in parentheses.

Mentions: To test the robustness of our prediction, we repeated our analysis on a recently published Hi-C dataset with higher spatial resolution in multiple cell types [12]. To identify hubs from this dataset, we first normalized the raw interaction matrix (at 5 kb resolution) using the ICE (Iterative Correction and Eigenvector Decomposition) algorithm [25]. We then identified statistically significant chromatin interactions using Fit-Hi-C [26] (Methods). We ranked the 5 kb segments by interaction frequency and defined the hubs as the top 10 % of segments (Fig. 4a, Additional file 1: Figure S2A). We refer to this set as the Rao2014 hubs to distinguish it from the set of hubs defined from ref. 11 (referred to as the Jin2013 hubs). Despite the difference in experimental protocols, the two sets of hubs overlapped substantially: about 60 % of the Rao2014 hubs overlapped with the Jin2013 hubs. For example, the chromatin interaction profiles identified from the two datasets were very similar at the LIN28A locus, and the hub locations were nearly identical (Fig. 4b).
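
The hub-calling and overlap steps described above can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes that "interaction frequency" means the number of significant interactions a 5 kb segment takes part in, that the top 10 % is taken over segments with at least one significant interaction, and that two hubs overlap when their intervals share at least one base. The function names `call_hubs` and `fraction_overlapping` and the toy data are hypothetical. ICE normalization and Fit-Hi-C are not reproduced here; the input is taken to be a list of already-called significant pairs.

```python
import math
from collections import Counter

BIN_SIZE = 5000  # 5 kb segments, as in the text


def call_hubs(significant_pairs, top_fraction=0.10):
    """Return the top fraction of segments ranked by number of significant interactions.

    Each segment is a (chrom, start) tuple and each pair is one significant
    interaction (assumed to come from an upstream caller such as Fit-Hi-C).
    """
    counts = Counter()
    for seg_a, seg_b in significant_pairs:
        counts[seg_a] += 1
        counts[seg_b] += 1
    # Descending count; ties are broken by coordinate so the result is deterministic.
    ranked = sorted(counts, key=lambda seg: (-counts[seg], seg))
    n_hubs = max(1, math.ceil(top_fraction * len(ranked)))
    return set(ranked[:n_hubs])


def fraction_overlapping(query, reference):
    """Fraction of query intervals (chrom, start, end) overlapping any reference interval.

    Intervals are treated as half-open: [start, end).
    """
    if not query:
        return 0.0
    hits = 0
    for chrom, start, end in query:
        if any(c == chrom and s < end and start < e for c, s, e in reference):
            hits += 1
    return hits / len(query)


# Toy data (hypothetical): one highly connected segment among ten.
hub = ("chr1", 10000)
others = [("chr1", 100000 + i * BIN_SIZE) for i in range(9)]
toy_pairs = [
    (hub, others[0]), (hub, others[1]), (hub, others[2]), (hub, others[3]),
    (others[0], others[1]), (others[4], others[5]),
    (others[6], others[7]), (others[7], others[8]),
]

toy_hubs = call_hubs(toy_pairs)
toy_intervals = [(chrom, start, start + BIN_SIZE) for chrom, start in toy_hubs]
other_dataset_hubs = [("chr1", 8000, 12000)]  # stand-in for a second hub set
print(fraction_overlapping(toy_intervals, other_dataset_hubs))
```

The overlap check is a simple quadratic scan, which is adequate for a toy example; on genome-wide hub sets an interval index or sorted sweep would be the more practical choice.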

