Probing long-range interactions by extracting free energies from genome-wide chromosome conformation capture data.

Saberi S, Farré P, Cuvier O, Emberly E - BMC Bioinformatics (2015)

Bottom Line: PCA identifies systematic effects as well as high-frequency spatial noise in the Hi-C data, which can be filtered out. The result of fitting is a set of predictions for the coupling energies between the various chromatin factors and their effect on the energetics of looping. PCA filtering can improve the fit, and the predicted coupling energies lead to biologically meaningful insights into how various chromatin-bound factors influence the stability of DNA loops in chromatin.


Affiliation: Physics Department, Simon Fraser University, 8888 University Drive, Burnaby, V5A 1S6, BC, Canada. saied.sabery.m@gmail.com.

ABSTRACT

Background: A variety of DNA-binding proteins are involved in regulating and shaping the packing of chromatin. They aid the formation of loops in the DNA that function to isolate different structural domains. A recent experimental technique, Hi-C, provides a method for determining the frequency of such looping between all distant parts of the genome. Given that the binding locations of many chromatin-associated proteins have also been measured, it has been possible to estimate their influence on the long-range interactions measured by Hi-C. However, a challenge in this analysis is the predominance of non-specific contacts that mask the specific interactions of interest.

Results: We show that transforming the Hi-C contact frequencies into free energies gives a natural method for separating out the distance-dependent non-specific interactions. In particular, we apply Principal Component Analysis (PCA) to the transformed free energy matrix to identify the dominant modes of interaction. PCA identifies systematic effects as well as high-frequency spatial noise in the Hi-C data, which can be filtered out. Thus it can be used as a data-driven approach for normalizing Hi-C data. We assess this PCA-based normalization approach, along with several other normalization schemes, by fitting the transformed Hi-C data using a pairwise interaction model that takes as input the known locations of bound chromatin factors. The result of fitting is a set of predictions for the coupling energies between the various chromatin factors and their effect on the energetics of looping. We show that the quality of the fit can be used to determine how much PCA filtering should be applied to the Hi-C data.

Conclusions: We find that the different normalizations of the Hi-C data vary in the quality of fit to the pairwise interaction model. PCA filtering can improve the fit, and the predicted coupling energies lead to biologically meaningful insights into how various chromatin-bound factors influence the stability of DNA loops in chromatin.
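
The transformation from contact frequencies to free energies described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so the code below assumes a Boltzmann-style relation, F = -ln(contact frequency) in units of kT, and assumes that the distance-dependent non-specific part is estimated as the mean free energy at each genomic separation |i - j|. The function names `contacts_to_free_energy` and `remove_distance_trend` and the `pseudocount` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np

def contacts_to_free_energy(counts, pseudocount=1.0):
    """Convert a Hi-C contact count matrix into free energies (units of kT).

    Assumption: F_ij = -ln(p_ij), where p_ij is the contact frequency.
    The pseudocount (hypothetical) avoids taking the log of zero.
    """
    counts = np.asarray(counts, dtype=float) + pseudocount
    freq = counts / counts.sum()
    return -np.log(freq)

def remove_distance_trend(F):
    """Subtract the mean free energy at each genomic separation |i - j|.

    Assumption: the non-specific contribution depends only on separation,
    so what remains approximates the specific interaction energy.
    """
    n = F.shape[0]
    idx = np.arange(n)
    sep = np.abs(idx[:, None] - idx[None, :])
    dF = np.empty_like(F)
    for d in range(n):
        mask = sep == d
        dF[mask] = F[mask] - F[mask].mean()
    return dF

# Toy symmetric contact matrix (synthetic data, for illustration only)
rng = np.random.default_rng(0)
c = rng.poisson(5, size=(6, 6))
c = c + c.T
F = contacts_to_free_energy(c)
dF = remove_distance_trend(F)
```

With this sign convention, negative entries of `dF` mark bin pairs that contact more often than typical for their separation (attractive), and positive entries mark pairs that contact less often (effectively repulsive).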


Figure 3: Specific energies of interaction and associated chromatin factor contacts. Panels (A, B, C) show the energies of interaction, δFi,j, of a portion of chromosome 2L for three different Fi,j matrices: A) raw, B) raw + PC filtered and C) hierarchical. (The first 35 PCs were used in reconstructing the raw + PC free energies.) All have been aligned so that the zeroth column corresponds to i=j. Blue regions correspond to attractive interactions (negative) and red regions to effective repulsive interactions (positive). Panels (D, E) show the locations of pairwise self contacts for the insulator factor BEAF and the polycomb group protein Pc (blue corresponds to  and red to ). Comparing the interaction energies (A, B, C) with the locations of pairwise contacts (D, E) highlights how these contacts could be generating the observed interactions.

Mentions: In Figure 3(A-C) we show the specific interaction energies for a portion of chromosome 2L. As can be seen, PCA filtering dramatically smooths the data, highlighting domains of attractive (blue) and repulsive (red) interactions. As a comparison, we show the energies computed from hierarchical normalized data for the same region. The two normalized energy matrices agree in many domains but differ in places, such as in the size of the interacting domain situated around 9 Mb. Many of these interactions are due to specific contacts between chromatin factors at the given loci. We highlight this connection by showing the pairwise self contacts for the same region for the insulator BEAF and the polycomb factor Pc (Figure 3D,E). For example, some of the attractive energies (blue region near 8 Mb in the δFi,j heat maps) are likely due to interactions between insulators (BEAF-BEAF domain in Figure 3D), whereas other attractive interactions (region between 5 Mb and 6 Mb) could be due to interactions between the polycomb group of factors (Pc-Pc domain in Figure 3E). We now assess how well the interaction energies are fit by a model that takes the distribution of contacts between bound factors as input.
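
The smoothing effect of PCA filtering can be sketched as a truncated reconstruction from the leading principal components. This is a generic SVD-based PCA on a column-centered matrix, offered as an illustration under that assumption and not as the authors' exact procedure. The function name `pca_filter` is hypothetical, and the test matrix is synthetic. The figure caption reports that the first 35 PCs were kept; the toy example below keeps 3.

```python
import numpy as np

def pca_filter(matrix, n_components):
    """Reconstruct a matrix from its first n_components principal components.

    Rows are treated as observations and columns are centered before the
    SVD (an assumption; the source does not specify these details).
    """
    X = np.asarray(matrix, dtype=float)
    col_mean = X.mean(axis=0)
    Xc = X - col_mean
    # Singular values come back sorted from largest to smallest
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, s.size)
    # Keep the k leading components, discard the rest as noise
    return (U[:, :k] * s[:k]) @ Vt[:k] + col_mean

# Synthetic example: a rank-2 signal plus small noise
rng = np.random.default_rng(1)
low = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 20))
noisy = low + 0.001 * rng.normal(size=(20, 20))
filtered = pca_filter(noisy, 3)
```

Keeping fewer components removes more of the high-frequency variation, at the risk of discarding real structure. The abstract proposes choosing the amount of filtering by the quality of fit to the pairwise interaction model.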

