Exploratory differential gene expression analysis in microarray experiments with no or limited replication.
Bottom Line:
We describe an exploratory, data-oriented approach for identifying candidates for differential gene expression in cDNA microarray experiments in terms of alpha-outliers and outlier regions, using simultaneous tolerance intervals relative to the line of equivalence (Cy5 = Cy3). We demonstrate the improved performance of our approach over existing single-slide methods using public datasets and simulation studies.
Affiliation: Department of Nutritional Sciences and Toxicology, University of California at Berkeley, Morgan Hall, Berkeley, CA 94720, USA. Avl53@aol.com
ABSTRACT
We describe an exploratory, data-oriented approach for identifying candidates for differential gene expression in cDNA microarray experiments in terms of alpha-outliers and outlier regions, using simultaneous tolerance intervals relative to the line of equivalence (Cy5 = Cy3). We demonstrate the improved performance of our approach over existing single-slide methods using public datasets and simulation studies.
Mentions: Microarray data typically involve thousands of genes, so multiplicity of comparisons is clearly a problem. Other model-based single-slide approaches do not consider this issue explicitly (see the single-slide procedures described in [1,13,14,17,18]). First, we identify candidate outliers without correction to obtain unadjusted p-values (Table 3). A p-value is the probability, computed assuming the hypothesis is true, of a result at least as extreme as the one observed; it is a measure of statistical significance in terms of the false positive rate. One way to obtain adjusted p-values is to apply a Bonferroni correction based on N (the sample size of the entire dataset), but this may be too conservative, so we examine two alternative corrections. In the first, we apply a multiplicity of comparison correction based on an estimate of k (the number of non-regular observations) rather than on the sample size of the entire dataset. This approach emphasizes stable outliers at the expense of the other possible outliers (that is, the N-k observations that are inliers in the current single-slide experiment). Clearly, the Bonferroni correction by k gives a much less conservative result than the correction by N and is, we would argue, a more reasonable correction for identifying true outliers. Other robust exploratory tools (see Methods) can be used to estimate k. In the second, more sophisticated approach, the q-value is calculated from the ordered list of unadjusted p-values [45,46] (Figure 24). The q-value is the minimum false discovery rate [47] at which a particular feature from a list of all features is called significant [45,46]. The false discovery rate is the proportion of true null hypotheses among all hypotheses found to be significant: for example, a false discovery rate of 1% means that among all candidates for differential expression found significant, 1% are true nulls on average [46].
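The three adjustments discussed above can be compared on a toy example. The following is a minimal sketch, not the authors' implementation: the p-values, the value of k, and the function names are hypothetical, and the q-value routine assumes the proportion of true nulls is 1 (pi0 = 1.0), which reduces it to the Benjamini-Hochberg step-up adjustment rather than the full procedure of [45,46].

```python
def bonferroni(p_values, m):
    """Bonferroni-adjusted p-values for m comparisons, capped at 1."""
    return [min(1.0, p * m) for p in p_values]


def q_values(p_values, pi0=1.0):
    """Step-up q-values computed from unadjusted p-values.

    pi0 is the assumed proportion of true nulls. Fixing pi0 = 1.0 is an
    assumption of this sketch; it gives Benjamini-Hochberg adjusted values.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so q-values stay monotone in rank.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pi0 * p_values[i] * n / rank)
        q[i] = running_min
    return q


# Hypothetical unadjusted p-values for five spots (illustrative only).
p = [0.0001, 0.004, 0.03, 0.2, 0.8]
N = len(p)  # correction by the size of the whole dataset
k = 2       # assumed estimate of the number of non-regular observations

adj_N = bonferroni(p, N)  # conservative: scales every p-value by N
adj_k = bonferroni(p, k)  # less conservative: scales by k only
q = q_values(p)           # minimum FDR at which each spot is called significant

for row in zip(p, adj_N, adj_k, q):
    print("p=%.4f  bonf_N=%.4f  bonf_k=%.4f  q=%.4f" % row)
```

Because k is never larger than N, every value in `adj_k` is at most the corresponding value in `adj_N`, which is the sense in which the correction by k is less conservative.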