Probing long-range interactions by extracting free energies from genome-wide chromosome conformation capture data.

Saberi S, Farré P, Cuvier O, Emberly E - BMC Bioinformatics (2015)

Bottom Line: PCA identifies systematic effects as well as high frequency spatial noise in the Hi-C data which can be filtered out. The result of fitting is a set of predictions for the coupling energies between the various chromatin factors and their effect on the energetics of looping. PCA filtering can improve the fit, and the predicted coupling energies lead to biologically meaningful insights for how various chromatin bound factors influence the stability of DNA loops in chromatin.


Affiliation: Physics Department, Simon Fraser University, 8888 University Drive, Burnaby, V5A 1S6, BC, Canada. saied.sabery.m@gmail.com.

ABSTRACT

Background: A variety of DNA binding proteins are involved in regulating and shaping the packing of chromatin. They aid the formation of loops in the DNA that function to isolate different structural domains. A recent experimental technique, Hi-C, provides a method for determining the frequency of such looping between all distant parts of the genome. Given that the binding locations of many chromatin associated proteins have also been measured, it has been possible to make estimates for their influence on the long-range interactions as measured by Hi-C. However, a challenge in this analysis is the predominance of non-specific contacts that mask out the specific interactions of interest.

Results: We show that transforming the Hi-C contact frequencies into free energies gives a natural method for separating out the distance dependent non-specific interactions. In particular we apply Principal Component Analysis (PCA) to the transformed free energy matrix to identify the dominant modes of interaction. PCA identifies systematic effects as well as high frequency spatial noise in the Hi-C data which can be filtered out. Thus it can be used as a data driven approach for normalizing Hi-C data. We assess this PCA based normalization approach, along with several other normalization schemes, by fitting the transformed Hi-C data using a pairwise interaction model that takes as input the known locations of bound chromatin factors. The result of fitting is a set of predictions for the coupling energies between the various chromatin factors and their effect on the energetics of looping. We show that the quality of the fit can be used as a means to determine how much PCA filtering should be applied to the Hi-C data.

Conclusions: We find that the different normalizations of the Hi-C data vary in the quality of fit to the pairwise interaction model. PCA filtering can improve the fit, and the predicted coupling energies lead to biologically meaningful insights for how various chromatin bound factors influence the stability of DNA loops in chromatin.
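
The transformation and filtering steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption: the free energy is taken as the negative log of the normalized contact frequency (in units of kT), the distance-dependent non-specific part is approximated by the mean free energy at each genomic separation, and PCA filtering is approximated by a truncated SVD reconstruction that keeps the leading components. The function names and the pseudocount are hypothetical.

```python
import numpy as np

def contacts_to_free_energy(counts, pseudocount=1.0):
    """Convert contact counts to free energies in units of kT.

    Assumption: F_ij = -ln(p_ij), with p_ij the normalized contact
    frequency; the pseudocount avoids log(0) for empty bins.
    """
    counts = np.asarray(counts, dtype=float) + pseudocount
    return -np.log(counts / counts.sum())

def remove_distance_trend(F):
    """Subtract the mean free energy at each genomic separation |i - j|."""
    n = F.shape[0]
    idx = np.arange(n)
    sep = np.abs(idx[:, None] - idx[None, :])
    out = F.copy()
    for d in range(n):
        mask = sep == d
        out[mask] -= F[mask].mean()
    return out

def pca_filter(F, n_components):
    """Reconstruct F from its leading principal components.

    Columns are centered, the centered matrix is decomposed by SVD,
    and only the first n_components modes are kept.
    """
    mean = F.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(F - mean, full_matrices=False)
    k = n_components
    return (U[:, :k] * s[:k]) @ Vt[:k] + mean
```

In such a sketch, the number of retained components would be scanned and the value giving the best fit of the downstream pairwise model would be kept, in the spirit of the selection criterion described above.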


Figure 5: Chromatin factor coupling energies from fitting. The fitted coupling energies, Jμ,ν, between chromosome associated factors. The left heat maps show the chromosomal average J’s, and the right heat maps show the associated standard deviations in the average values. The following free energy matrices were used: A) raw, B) raw + PC filtering (optimal number of PCs used was 35), and C) hierarchical.

Mentions: In Figure 5, we show the fitted coupling energies from the fits to the raw, raw + PC filtered and hierarchical data. As mentioned, PCA filtering improved the fit, yet the resulting J’s show an overall agreement between the different data sets. Here we show the average J’s over all the chromosomes (left heat maps) and their associated standard deviations (right heat maps). The parameter error estimates show that many of the couplings are consistently predicted from one chromosome to the next. An inspection of the fitted couplings that are consistent across chromosomes shows that many of the insulators and factors that are linked to euchromatic domains have attractive (negative) interactions, speaking to their ability to stabilize loops in such domains [24,25]. Many of these have effective repulsive (loop hindering) interactions with polycomb group proteins (PCL, Pc), though some have attractive interactions with Pho. Other features shared between these sets of J’s are the associations between BEAF, Chromator and Cohesin and the transcriptional machinery factors, PolII and Nurf. Interestingly, the predicted interactions between CTCF and such factors are more complex, highlighted by the effective positive interactions. We should also point out that a given J represents a pair’s effect on looping and should not be interpreted as a prediction of whether the two factors interact or not. Factors may very well interact (i.e. have attractive protein-protein interactions) yet have a destabilizing effect on loop formation. We note that within both the insulator and polycomb groups, some pairs of factors are predicted to effectively raise the energy of loop formation. We also point out that other models could be fit, for instance leaving out self-interactions, which may help to reveal more specific interactions, though potentially at the cost of reducing the quality of the fit.
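
As a rough illustration of how such couplings and their chromosome-to-chromosome spread could be obtained, the sketch below fits a symmetric coupling matrix by ordinary least squares. The functional form is an assumption, since the excerpt does not state it: the free energy between sites i and j is modeled as the sum over factor pairs of J[mu, nu] * n_i[mu] * n_j[nu], where n_i[mu] is the occupancy of factor mu at site i. Self-interactions (mu equal to nu) are included, and negative J corresponds to an attractive, loop-stabilizing effect, following the sign convention in the text. The function names are hypothetical.

```python
import numpy as np

def fit_couplings(F, occupancy):
    """Least-squares fit of a symmetric coupling matrix J.

    Assumed model: F[i, j] ~ sum_{mu, nu} J[mu, nu] * n_i[mu] * n_j[nu],
    fitted over all site pairs i < j.
    """
    occupancy = np.asarray(occupancy, dtype=float)
    n_sites, n_factors = occupancy.shape
    iu, ju = np.triu_indices(n_sites, k=1)   # site pairs i < j
    mu, nu = np.triu_indices(n_factors)      # factor pairs mu <= nu
    # One design column per unordered factor pair.
    X = occupancy[iu][:, mu] * occupancy[ju][:, nu]
    off = mu != nu
    # Off-diagonal pairs contribute through both J[mu, nu] and J[nu, mu].
    X[:, off] += occupancy[iu][:, nu[off]] * occupancy[ju][:, mu[off]]
    coef, *_ = np.linalg.lstsq(X, F[iu, ju], rcond=None)
    J = np.zeros((n_factors, n_factors))
    J[mu, nu] = coef
    J[nu, mu] = coef
    return J

def summarize_couplings(J_per_chromosome):
    """Mean and standard deviation of J across chromosomes."""
    stack = np.stack(J_per_chromosome)
    return stack.mean(axis=0), stack.std(axis=0)
```

Fitting each chromosome separately and passing the resulting matrices to `summarize_couplings` would give the two kinds of heat map described in the caption: the average J’s and their standard deviations.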

