Graphical technique for identifying a monotonic variance stabilizing transformation for absolute gene intensity signals.

Archer KJ, Dumur CI, Ramakrishnan V - BMC Bioinformatics (2004)

Bottom Line: For Affymetrix data, where absolute intensities are indicative of number of transcripts, there is a systematic relationship between variance and magnitude of measurements. For the data presented, the spread-versus-level plot identified a power transformation that successfully stabilized the variance of probe set summaries. This is robust against outliers and avoids assumption of models and maximizations.


Affiliation: Department of Biostatistics, Virginia Commonwealth University, Richmond, VA 23298, USA. kjarcher@vcu.edu

ABSTRACT

Background: The usefulness of the log2 transformation for cDNA microarray data has led to its widespread application to Affymetrix data. For Affymetrix data, where absolute intensities are indicative of number of transcripts, there is a systematic relationship between variance and magnitude of measurements. Application of the log2 transformation expands the scale of genes with low intensities while compressing the scale of genes with higher intensities, thus reversing the mean by variance relationship. The usefulness of these transformations needs to be examined.

Results: Using an Affymetrix GeneChip dataset, problems associated with applying the log2 transformation to absolute intensity data are demonstrated. Use of the spread-versus-level plot to identify an appropriate variance stabilizing transformation is presented. For the data presented, the spread-versus-level plot identified a power transformation that successfully stabilized the variance of probe set summaries.

Conclusion: The spread-versus-level plot is helpful for identifying transformations for variance stabilization. The approach is robust against outliers and requires neither model assumptions nor maximization.


Figure 4: Mean versus variance plot of power transformed data. Plot of the mean of the probe set signal intensities after applying the transformation by the associated variance for the 16 HG-U133A GeneChips®.

Mentions: Applying this transformation to the signal intensities in the QAQC dataset and plotting mean versus the variance as before (Figure 4) shows that stabilization of the variance is achieved. There are a few outlying probe sets identifiable in Figure 4. One of the advantages of using the estimated slope from the spread-versus-level plot to identify a power transformation is that this plot uses robust measures of location (median) and spread (fourth-spread). For this dataset, the "outlying" probe sets were defined as those with a variance greater than 50. The estimated slope after removing the outliers was 0.568; as expected, the slope of the linear regression fit to the spread-versus-level plot is fairly robust against outliers. Thus, the transformation identified by the spread-versus-level plot is not affected by these few outlying probe sets.
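The procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the data are simulated rather than taken from the QAQC dataset, the fourth-spread is approximated by the interquartile range, the function names are hypothetical, and the rule "power = 1 - slope" is the usual spread-versus-level convention rather than something stated in the text above.

```python
import numpy as np

def spread_level_slope(x):
    """Slope of log10(spread) on log10(level) across probe sets.

    x: array of shape (probe sets, chips) with positive intensities.
    Level is the median; spread is the interquartile range, used here
    as an approximation to the fourth-spread (an assumption).
    """
    level = np.median(x, axis=1)
    q75, q25 = np.percentile(x, [75, 25], axis=1)
    spread = q75 - q25
    keep = (level > 0) & (spread > 0)  # logs need positive values
    slope, _intercept = np.polyfit(np.log10(level[keep]),
                                   np.log10(spread[keep]), 1)
    return slope

def power_transform(x, p):
    """Apply x**p, falling back to the log when p is (near) zero."""
    return np.log(x) if abs(p) < 1e-8 else x ** p

# Simulated stand-in for 16 chips: the standard deviation grows as
# mean**0.6, so the slope should come out near 0.6 (assumed values).
rng = np.random.default_rng(0)
n_probe_sets, n_chips = 2000, 16
mu = 10 ** rng.uniform(np.log10(50), np.log10(5000), size=n_probe_sets)
sd = 0.5 * mu ** 0.6
x = mu[:, None] + sd[:, None] * rng.standard_normal((n_probe_sets, n_chips))

slope = spread_level_slope(x)
power = 1.0 - slope                       # suggested exponent
slope_after = spread_level_slope(power_transform(x, power))

print(f"slope before: {slope:.3f}, suggested power: {power:.3f}")
print(f"slope after transformation: {slope_after:.3f}")
```

A slope close to zero after the transformation indicates that spread no longer depends on level, which is the same check the mean versus variance plot provides visually. Because both the median and the interquartile range ignore extreme values, a handful of outlying probe sets has little effect on the fitted slope.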

