Multiplexing of ChIP-Seq Samples in an Optimized Experimental Condition Has Minimal Impact on Peak Detection.

Kacmarczyk TJ, Bourque C, Zhang X, Jiang Y, Houvras Y, Alonso A, Betel D - PLoS ONE (2015)

Bottom Line: In most cases the number of samples that are multiplexed is determined by financial consideration or experimental convenience, with limited understanding of the effects on the experimental results. We found that, for histone marker H3K4me3, one can multiplex up to 8 samples (7 IP + 1 input) at ~21 million single-end reads each and still detect over 90% of all peaks found when using a full lane for a single sample (~181 million reads). Furthermore, there are no variations introduced by indexing or lane batch effects and, importantly, there is no significant reduction in the number of genes with neighboring H3K4me3 peaks.


Affiliation: Department of Medicine, Division of Hematology/Oncology, Epigenomics Core Facility, Weill Cornell Medical College, New York, New York, United States of America.

ABSTRACT
Multiplexing samples in sequencing experiments is a common approach to maximize information yield while minimizing cost. In most cases the number of samples that are multiplexed is determined by financial consideration or experimental convenience, with limited understanding of the effects on the experimental results. Here we set out to examine the impact of multiplexing ChIP-seq experiments on the ability to identify a specific epigenetic modification. We performed peak detection analyses to determine the effects of multiplexing. These include false discovery rates; the size, position and statistical significance of detected peaks; and changes in gene annotation. We found that, for histone marker H3K4me3, one can multiplex up to 8 samples (7 IP + 1 input) at ~21 million single-end reads each and still detect over 90% of all peaks found when using a full lane for a single sample (~181 million reads). Furthermore, there are no variations introduced by indexing or lane batch effects and, importantly, there is no significant reduction in the number of genes with neighboring H3K4me3 peaks. We conclude that, for a well-characterized antibody and, therefore, a model IP condition, multiplexing 8 samples per lane is sufficient to capture most of the biological signal.
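The "over 90% of all peaks" figure is a recovery fraction: the share of full-lane peaks that are still detected in a multiplexed library. The sketch below shows one way such a fraction could be computed. It assumes peaks are given as (chromosome, start, end) tuples with half-open coordinates and that any overlap counts as a recovered peak; the function name `fraction_recovered` and the toy coordinates are hypothetical and are not taken from the study, whose exact overlap criterion is not stated here.

```python
from collections import defaultdict


def fraction_recovered(reference_peaks, multiplexed_peaks):
    """Fraction of reference peaks overlapped by at least one multiplexed peak.

    Peaks are (chrom, start, end) tuples with half-open coordinates.
    Assumption: any overlap of one base or more counts as a recovered peak.
    """
    if not reference_peaks:
        return 0.0

    # Group multiplexed peaks by chromosome so each reference peak is only
    # compared against peaks on the same chromosome.
    by_chrom = defaultdict(list)
    for chrom, start, end in multiplexed_peaks:
        by_chrom[chrom].append((start, end))

    recovered = 0
    for chrom, start, end in reference_peaks:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(s < end and start < e for s, e in by_chrom[chrom]):
            recovered += 1
    return recovered / len(reference_peaks)


# Toy coordinates invented for illustration only (not data from the study).
full_lane = [("chr1", 100, 900), ("chr1", 5000, 5800), ("chr2", 300, 1000)]
eight_plex = [("chr1", 150, 850), ("chr2", 250, 950)]
print(fraction_recovered(full_lane, eight_plex))
```

The linear scan per reference peak is kept for readability; on genome-scale peak sets a sorted or interval-tree lookup would be the usual choice.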


pone.0129350.g004: Peak characteristics. A) P-values for detected peaks shift towards reduced significance as multiplexing increases. B) The difference between peak apex positions in multiplexed libraries and peak apex positions in the non-multiplexed library is consistent across all multiplex levels, with variability increasing as multiplexing increases. C) Peak width distributions show a marginal reduction across multiplex levels.

Mentions: Initially we compared the number of peaks and the average peak widths to see whether there were any critical differences. To understand in greater detail the variability of called peaks for each lane fraction, we examined the distributions of three peak characteristics: peak p-values, peak apex positions, and peak widths. The p-value distributions show an overall shift towards reduced significance (higher p-values) as multiplexing increases (Fig 4A and S3 Fig), due to the reduction in sequence coverage. While there is significant overlap between peaks identified from multiplexed libraries and those identified from the non-multiplexed library, we were interested in the extent to which the positions (genomic coordinates) of the overlapping peaks had shifted. When peak apex positions were compared with those of peaks identified from the 1-plex library, most peaks (88.1–85.1%; differences increase as multiplexing increases) were within the mean median peak size (751 bp), suggesting little difference in peak positions (Fig 4B and S4 Fig). Libraries sequenced at higher multiplexing show more variability in peak apex position, possibly because less data supports each detected peak. Finally, for each multiplex level we also examined the peak width distribution of identified peaks and found a marginal reduction in peak width across multiplexed libraries (Fig 4C and S5 Fig). Taken together, these results indicate that while the number of significant peaks identified is reduced by multiplexing, there is very little effect on the uniformity and positional coverage of H3K4me3.
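The apex comparison described above can be illustrated with a minimal sketch. It assumes apexes are given as (chromosome, position) pairs and pairs each multiplexed apex with the nearest reference apex on the same chromosome; the study instead compares overlapping peaks, so the nearest-apex pairing is a simplification. The function name `fraction_within` and the toy positions are hypothetical; only the 751 bp threshold comes from the text.

```python
from bisect import bisect_left
from collections import defaultdict


def fraction_within(reference_apexes, multiplexed_apexes, max_offset=751):
    """Fraction of multiplexed apexes lying within max_offset bp of the
    nearest reference apex on the same chromosome.

    Apexes are (chrom, position) tuples. Apexes on chromosomes with no
    reference apex are skipped (assumption: they have no counterpart).
    """
    ref = defaultdict(list)
    for chrom, pos in reference_apexes:
        ref[chrom].append(pos)
    for positions in ref.values():
        positions.sort()

    matched = 0
    close = 0
    for chrom, pos in multiplexed_apexes:
        positions = ref.get(chrom)
        if not positions:
            continue
        # The nearest reference apex is either just before or at the
        # insertion point of pos in the sorted list.
        i = bisect_left(positions, pos)
        neighbours = positions[max(i - 1, 0):i + 1]
        offset = min(abs(pos - p) for p in neighbours)
        matched += 1
        if offset <= max_offset:
            close += 1
    return close / matched if matched else 0.0


# Toy positions invented for illustration only (not data from the study).
one_plex = [("chr1", 1000), ("chr1", 5000), ("chr2", 2000)]
eight_plex_apexes = [("chr1", 1100), ("chr1", 6000), ("chr2", 2700), ("chr3", 50)]
print(fraction_within(one_plex, eight_plex_apexes))
```

Running the same function for each multiplex level would give one fraction per level, which is the kind of figure the 88.1–85.1% range summarizes.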

