Normalization of ChIP-seq data with control.

Liang K, Keleş S - BMC Bioinformatics (2012)

Bottom Line: ChIP-seq has become an important tool for identifying genome-wide protein-DNA interactions, including transcription factor binding and histone modifications. Proper normalization between the ChIP and control samples is an essential aspect of ChIP-seq data analysis. Our results indicate that proper normalization between the ChIP and control samples is an important step in ChIP-seq analysis in terms of power and error rate control.


Affiliation: Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada. k22liang@uwaterloo.ca

ABSTRACT

Background: ChIP-seq has become an important tool for identifying genome-wide protein-DNA interactions, including transcription factor binding and histone modifications. In ChIP-seq experiments, ChIP samples are usually coupled with their matching control samples. Proper normalization between the ChIP and control samples is an essential aspect of ChIP-seq data analysis.

Results: We have developed a novel method for estimating the normalization factor between the ChIP and the control samples. Our method, named NCIS (Normalization of ChIP-seq), can accommodate both low and high sequencing depth datasets. We compare the statistical properties of NCIS against existing methods in a diverse set of simulation settings, where NCIS achieves the best estimation precision. In addition, we illustrate the impact of the normalization factor on FDR control and show that NCIS leads to more power among methods that control FDR at nominal levels.

Conclusion: Our results indicate that proper normalization between the ChIP and control samples is an important step in ChIP-seq analysis in terms of power and error rate control. Our proposed method shows excellent statistical properties and is useful across the full range of ChIP-seq applications, especially with deeply sequenced data.

Figure 4: ChIP versus control bin counts for yeast strain SEG1, plotted with a bin-width of 500 bp. The upper black line represents the sequencing depth ratio, and the lower blue line represents the NCIS normalization factor estimate.

Mentions: To further illustrate how the normalization factors differ, we plotted the ChIP versus the control bin counts with bin-width w = 500 bp in Figure 4. In this plot, colors indicate the density of bins, with the scale annotated on the right-hand side. Many bins have relatively high ChIP counts because of the enrichment signal. The slope of the upper black line is the sequencing depth ratio, and the majority of bins (83.7%) fall below this line. We should expect fewer than 50% of bins to fall below the normalization factor line, because binding regions have a probability smaller than 0.5 of exhibiting a ChIP count/control count ratio below the normalization factor. The NCIS normalization factor is represented by the lower blue line, which passes right through the densest area of bins and has 49.7% of bins below it.
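
The diagnostic described above (the share of bins lying below a candidate normalization line) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the NCIS estimator itself: the bin counts are made-up toy values, the function names are hypothetical, and the slope 0.5 is an assumed stand-in for a normalization factor estimate. The sequencing depth ratio is taken here as total ChIP counts divided by total control counts.

```python
def sequencing_depth_ratio(chip_counts, control_counts):
    """Total ChIP count divided by total control count."""
    return sum(chip_counts) / sum(control_counts)


def fraction_below_line(chip_counts, control_counts, slope):
    """Fraction of bins whose ChIP count lies strictly below slope * control count."""
    pairs = list(zip(chip_counts, control_counts))
    if not pairs:
        raise ValueError("at least one bin is required")
    below = sum(1 for chip, ctrl in pairs if chip < slope * ctrl)
    return below / len(pairs)


# Toy data (hypothetical, not from the paper): ten bins, each with a control
# count of 10. Eight background bins scatter around 5 ChIP reads; the last
# two bins carry an enrichment signal.
control = [10] * 10
chip = [3, 4, 4, 6, 7, 3, 6, 7, 40, 60]

depth_ratio = sequencing_depth_ratio(chip, control)
print(depth_ratio)                                     # 1.4
print(fraction_below_line(chip, control, depth_ratio)) # 0.8
print(fraction_below_line(chip, control, 0.5))         # 0.4 (0.5 is an assumed factor)
```

In this toy setting the enriched bins inflate the sequencing depth ratio, so most bins fall below that line, while a line through the background bins leaves somewhat fewer than half of all bins below it, mirroring the pattern reported for Figure 4.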

