A first principles approach to differential expression in microarray data analysis.

Rubin RA - BMC Bioinformatics (2009)

Bottom Line: Here we take the approach of making the fewest assumptions about the structure of the microarray data. We applied the technique to the HGU-133A, HG-U95A, and "Golden Spike" spike-in data sets. The resulting receiver operating characteristic (ROC) curves compared favorably with other published results.


Affiliation: Mathematics Department, Whittier College, 13406 E. Philadelphia St., Whittier, CA 90608, USA. brubin698@earthlink.net

ABSTRACT

Background: The disparate results from the methods commonly used to determine differential expression in Affymetrix microarray experiments may well result from the wide variety of probe set and probe level models employed. Here we take the approach of making the fewest assumptions about the structure of the microarray data. Specifically, we only require that, under the hypothesis that a gene is not differentially expressed for specified conditions, for any probe position in the gene's probe set: a) the probe amplitudes are independent and identically distributed over the conditions, and b) the distributions of the replicated probe amplitudes are amenable to classical analysis of variance (ANOVA). Log-amplitudes that have been standardized within-chip meet these conditions well enough for our approach, which is to perform ANOVA across conditions for each probe position, and then take the median of the resulting (1 - p) values as a gene-level measure of differential expression.

Results: We applied the technique to the HGU-133A, HG-U95A, and "Golden Spike" spike-in data sets. The resulting receiver operating characteristic (ROC) curves compared favorably with other published results. This procedure is quite sensitive, so much so that it has revealed the presence of probe sets that might properly be called "unanticipated positives" rather than "false positives", because plots of these probe sets strongly suggest that they are differentially expressed.

Conclusion: The median ANOVA (1-p) approach presented here is a very simple methodology that does not depend on any specific probe level or probe models, and does not require any pre-processing other than within-chip standardization of probe level log amplitudes. Its performance is comparable to other published methods on the standard spike-in data sets, and it has revealed the presence of new categories of probe sets that might properly be referred to as "unanticipated positives" and "unanticipated negatives" that need to be taken into account when using spiked-in data sets as "truthed" test beds.
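
The procedure outlined in the Background (a one-way ANOVA across conditions at each probe position, followed by the median of the resulting (1 - p) values) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the author's implementation: the function names (`f_sf`, `anova_p`, `median_one_minus_p`), the data layout, and the toy numbers are all hypothetical, and the input is assumed to be log-amplitudes that have already been standardized within-chip. The F-distribution tail probability is computed with a standard continued-fraction evaluation of the incomplete beta function so that the sketch needs only the standard library.

```python
import math
from statistics import mean, median


def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even and odd coefficients of the continued fraction for step m.
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h


def f_sf(f, d1, d2):
    """Upper-tail probability P(F > f) for an F(d1, d2) distribution."""
    a, b, x = d2 / 2.0, d1 / 2.0, d2 / (d2 + d1 * f)
    if x >= 1.0:
        return 1.0
    if x <= 0.0:
        return 0.0
    log_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
              + a * math.log(x) + b * math.log1p(-x))
    bt = math.exp(log_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def anova_p(groups):
    """Classical one-way ANOVA p-value; `groups` holds one list of replicates per condition."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    if ss_within == 0.0:
        return 0.0 if ss_between > 0.0 else 1.0
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_sf(f, k - 1, n - k)


def median_one_minus_p(probe_set):
    """Gene-level measure: median over probe positions of (1 - ANOVA p).

    probe_set[j][c] is the list of replicate log-amplitudes at probe
    position j under condition c (hypothetical layout).
    """
    return median([1.0 - anova_p(groups) for groups in probe_set])


# Toy probe set (made-up numbers): 4 probe positions, 2 conditions, 3 replicates.
toy_probe_set = [
    [[1.10, 1.05, 1.12], [1.60, 1.58, 1.66]],
    [[0.40, 0.47, 0.43], [0.95, 0.90, 0.99]],
    [[0.80, 0.95, 0.70], [0.85, 0.75, 0.90]],  # little difference at this position
    [[1.30, 1.28, 1.35], [1.85, 1.80, 1.91]],
]
score = median_one_minus_p(toy_probe_set)
print(f"median ANOVA (1 - p) = {score:.4f}")
```

Because the score is a median over probe positions, a single probe position that shows little difference between conditions (the third one in the toy data) does not by itself pull the gene-level measure down.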



Figure 2: An example of why the median of the (1-p)'s might be a better measure of differential expression than the (trimmed) Total p. This chart contains the probe set plot and ANOVA results for the gene 208010_s_at from the HGU-133A Latin Square experiment for concentrations of 128 pM (replicates in cyan) and 64 pM (replicates in red). The lack of a significant difference between conditions at probe positions 1, 2, 6 and 7 adversely affects even the trimmed Total p, while the median-based ranking of this gene (among all 22300 genes on the two chips involved in the comparison) is much more consistent with the Latin Square design. This is one among many cases for which the median seems to be the most robust measure.

Mentions: In practice, however, as a tool for assessing differential expression, Total p can be overly influenced by a few large (non-significant) probe level p-values, as can the mean of the p's. Other summary measures, such as the trimmed mean or trimmed geometric mean of the p's, also do not appear to be as effective as the median in ranking genes in accordance with the known differences in concentration for the conditions being examined. As shown in the table in Figure 2, even after trimming the highest and lowest p-values, Total p can, in some cases, produce a much lower ranking of a condition than would have been expected. For gene 208010_s_at from the HGU-133A Latin Square experiment, the Total p ranking of the comparison of 64 versus 128 pM concentrations was 279 out of the 22300 genes in the comparison. By contrast, the rank based on the median of the ANOVA (1-p)'s for the same comparison was 6, which is much more consistent with the concentrations involved. While determining the "best" way of combining the probe level p-values deserves a great deal more study, from this point on we will base our measure of differential expression on the median of the probe level p-values. In order to make a larger measure correspond to the condition of being more differentially expressed, our measure of differential expression for the gene will be the median of the probe level ANOVA (1-p)'s. This is a harmless change, since the median of a set of (1-p)'s is the same as (1 - the median of the p's).
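
A small numerical illustration of the two points made above may help: a few large probe level p-values pull the mean strongly while leaving the median almost untouched, and the median of the (1-p)'s equals 1 minus the median of the p's. The p-values below are invented for illustration (they are not the values from Figure 2); they merely mimic an 11-probe set in which positions 1, 2, 6 and 7 show no significant difference.

```python
import math
from statistics import mean, median

# Hypothetical probe-level ANOVA p-values for one 11-probe probe set.
# Positions 1, 2, 6 and 7 are deliberately non-significant.
p_values = [0.62, 0.85, 0.001, 0.002, 0.0005, 0.71, 0.44,
            0.003, 0.001, 0.004, 0.002]
one_minus_p = [1.0 - p for p in p_values]

# Gene-level measure used in the text: median of the (1 - p) values.
median_score = median(one_minus_p)

# A mean-based summary, shown only for comparison.
mean_score = mean(one_minus_p)

# The identity: median(1 - p) == 1 - median(p), up to floating-point rounding.
identity_holds = math.isclose(median_score, 1.0 - median(p_values))

print(f"median of (1 - p): {median_score:.4f}")
print(f"mean of (1 - p):   {mean_score:.4f}")
print(f"median(1 - p) == 1 - median(p): {identity_holds}")
```

The identity holds because p -> 1 - p simply reverses the ordering of the values, so the middle element (or the average of the two middle elements) maps to 1 minus itself.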

