TileProbe: modeling tiling array probe effects using publicly available data.

Judy JT, Ji H - Bioinformatics (2009)

Bottom Line: Individual probes on an Affymetrix tiling array usually behave differently. We propose TileProbe, a new technique that builds upon the MAT algorithm by incorporating publicly available data sets to remove tiling array probe effects. When applied to analyzing ChIP-chip data, TileProbe performs consistently better than MAT across a variety of analytical conditions.


Affiliation: Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA.

ABSTRACT

Motivation: Individual probes on an Affymetrix tiling array usually behave differently. Modeling and removing these probe effects are critical for detecting signals from the array data. Current data processing techniques either require control samples or use probe sequences to model probe-specific variability, such as with MAT. Although the MAT approach can be applied without control samples, residual probe effects continue to distort the true biological signals.

Results: We propose TileProbe, a new technique that builds upon the MAT algorithm by incorporating publicly available data sets to remove tiling array probe effects. By using a large number of these readily available arrays, TileProbe robustly models the residual probe effects that the MAT model cannot explain. When applied to ChIP-chip data, TileProbe performs consistently better than MAT across a variety of analytical conditions. This shows that TileProbe resolves the issue of probe-specific effects more completely.
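
The idea of estimating residual probe effects from many public arrays can be illustrated with a minimal sketch. It assumes the residual effect of each probe is summarized by the median of its MAT-corrected intensities across the public arrays (the quantity shown as MedianMAT_All-GEO-Arrays in Figure 1c) and that correction is a simple subtraction. The function name and array layout are hypothetical, and the actual TileProbe model may differ, for example by also rescaling each probe.

```python
import numpy as np


def subtract_probe_medians(mat_public, mat_new):
    """Center new samples by per-probe medians learned from public arrays.

    mat_public: (n_public_arrays, n_probes) MAT-corrected intensities.
    mat_new:    (n_samples, n_probes) MAT-corrected intensities to correct.
    Assumption: the residual probe effect is the per-probe median.
    """
    mat_public = np.asarray(mat_public, dtype=float)
    mat_new = np.asarray(mat_new, dtype=float)
    # One median per probe (column), taken across all public arrays.
    probe_median = np.median(mat_public, axis=0)
    # Broadcasting subtracts the same probe-wise offset from every sample.
    return mat_new - probe_median


# Toy data: three public arrays, three probes, one new sample.
public = np.array([[1.0, 2.0, 3.0],
                   [1.0, 4.0, 3.0],
                   [3.0, 6.0, 9.0]])
new = np.array([[2.0, 5.0, 3.0]])
corrected = subtract_probe_medians(public, new)
```

The median is used here rather than the mean so that a few arrays with true biological signal at a probe do not dominate the estimate of that probe's background behavior.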

Availability: http://www.biostat.jhsph.edu/~hji/cisgenome/index_files/tileprobe.htm.

Figure 1: Illustration of probe effects on Affymetrix Mouse Promoter 1.0R arrays. (a) IP1–IP3, CT1–CT3: quantile normalized Gli3 ChIP and control probe intensities at log2 scale. Log2(FC): log2(IP/CT) fold change. IP1_MAT-IP3_MAT: MAT background corrected probe intensities for IP1–IP3. (b) MAT corrected probe intensities for samples collected from different studies. (c) IP_MAT, CT_MAT: MAT corrected probe intensities. MedianMAT_All-GEO-Arrays: median MAT corrected probe intensities across all samples stored in GEO. IP_TileProbe, CT_TileProbe: TileProbe background corrected probe intensities.

Mentions: High density tiling arrays are widely used to study transcription factor binding (Cawley et al., 2004; Carroll et al., 2005), the transcriptome (Bertone et al., 2004; Kapranov et al., 2002), DNA methylation (Weber et al., 2007; Zhang et al., 2006), chromatin modification (Bernstein et al., 2006), nucleosome positioning (Yuan et al., 2005; Ozsolak et al., 2007) and copy number variations (Urban et al., 2006). Among the various array platforms, Affymetrix tiling arrays offer the lowest price per probe and the highest resolution, and can be used in most of the applications above (see Liu, 2007 for a review). These arrays use densely spaced probes to interrogate either the entire genome or part of it. As on other microarray platforms, different probes on an Affymetrix tiling array usually behave differently. These probe-specific behaviors, also known as probe effects (Irizarry et al., 2003; Johnson et al., 2006; Li and Wong, 2001; Wu et al., 2004), need to be properly controlled before meaningful biological signals can be extracted from the data. Figure 1a provides an example that illustrates the probe effects in a typical ChIP-chip experiment, in which DNA fragments bound by a transcription factor are collected through chromatin immunoprecipitation (ChIP) and hybridized to tiling arrays. The first three tracks show log2 transformed probe intensities of three independent ChIP samples for the transcription factor Gli3, representing three biological replicates. The next three tracks show log2 transformed probe intensities of three control samples in which the immunoprecipitation step was skipped. Track 7 shows log2 fold changes between the ChIP and control intensities averaged across three replicates. The peak in this track is a functional Gli3 binding site that has been experimentally verified.
The existence of probe effects is clearly demonstrated by the fact that many probes outside the binding region have higher intensity values than probes inside it (e.g. compare the probes highlighted by the boxes), and this trend is consistent across all the samples. A direct consequence of probe effects is that the first three tracks alone (ChIP samples without controls) incorrectly define the location of transcription factor binding.
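
The fold-change track described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: intensities are log2 transformed first and then quantile normalized across all ChIP and control arrays together, and "averaged across three replicates" is taken to mean the difference of per-probe means on the log2 scale. The function names are hypothetical.

```python
import numpy as np


def quantile_normalize(x):
    """Give every column (array) of x the same empirical distribution.

    x: (n_probes, n_arrays). Ties within a column are broken by sort order.
    """
    x = np.asarray(x, dtype=float)
    # Rank of each value within its own column (0 = smallest).
    ranks = np.argsort(np.argsort(x, axis=0), axis=0)
    # Reference distribution: mean of the sorted columns, rank by rank.
    reference = np.sort(x, axis=0).mean(axis=1)
    return reference[ranks]


def log2_fold_change(ip, ct):
    """Per-probe log2(IP/CT), averaged over replicates on the log2 scale.

    ip, ct: (n_probes, n_replicates) raw positive intensities.
    """
    ip = np.asarray(ip, dtype=float)
    ct = np.asarray(ct, dtype=float)
    # Normalize ChIP and control arrays jointly so they are comparable.
    both = quantile_normalize(np.log2(np.hstack([ip, ct])))
    n_ip = ip.shape[1]
    return both[:, :n_ip].mean(axis=1) - both[:, n_ip:].mean(axis=1)


# Small check of the normalization on a 3-probe, 2-array matrix.
normalized = quantile_normalize([[5, 4], [2, 1], [3, 6]])
# Identical ChIP and control intensities should give zero fold change.
toy = np.array([[4.0, 8.0], [16.0, 2.0], [64.0, 32.0]])
no_change = log2_fold_change(toy, toy)
```

Because the control arrays share the same probes, subtracting them on the log2 scale cancels probe-specific offsets, which is why the fold-change track locates the binding site where the ChIP tracks alone do not.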


