pone-0019640-g001: Relationship between sample mean and sample variance. Sample mean versus log sample variance plots of three different datasets from either control or treatment conditions. Variances smoothed with a non-parametric method [6], [7] are displayed as green lines. Sample size n is indicated for each dataset. The datasets were normalized with the RMA method.

Mentions: Microarrays have become a powerful tool in biological and medical science for monitoring transcriptome changes under different treatments. However, because microarray experiments are expensive, the number of replicates per experiment is limited in most cases. The combination of few replicates and many genes (e.g., about 6,000 in yeast and 23,000 in Arabidopsis) usually results in poor estimates of gene-specific variances. Several methods have been suggested for modifying gene-specific variances or covariances to improve the estimation. For example, Efron et al. [1] suggested modifying the denominator of the t-statistic to make the estimate less sensitive to gene-specific variances. Smyth [2] proposed smoothing gene-specific variances toward a common value. Cui et al. [3] and Tong and Wang [4] developed shrinkage estimators for gene-specific variances using Stein-type estimation under a squared error loss function; these estimators were used to construct traditional F-type and t-type statistics. In all of the above estimators, gene-specific means were assumed to be independent of the variances. However, means have been observed to be related to variances in microarray experiments: genes with high expression levels usually show high variances, while genes with low expression levels show small variances (Figure 1).
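
As a rough illustration of the denominator modification attributed to Efron et al. [1] above, the Python sketch below adds a constant offset to the standard error of a two-sample t-like statistic. The function name `offset_t_statistic`, the parameter `s0`, and the default of using the median standard error as the offset are assumptions made for illustration; they are not taken from the cited work or from the authors' R code.

```python
import numpy as np

def offset_t_statistic(control, treatment, s0=None):
    """Two-sample t-like statistic with a constant added to the denominator.

    control, treatment: arrays of shape (genes, replicates).
    s0: non-negative offset added to each gene's standard error.
        Assumption: if None, the median standard error across genes is used.
    """
    n1, n2 = control.shape[1], treatment.shape[1]
    diff = treatment.mean(axis=1) - control.mean(axis=1)
    # Pooled per-gene variance from the two groups (unbiased, ddof=1).
    pooled_var = ((n1 - 1) * control.var(axis=1, ddof=1)
                  + (n2 - 1) * treatment.var(axis=1, ddof=1)) / (n1 + n2 - 2)
    se = np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    if s0 is None:
        s0 = np.median(se)
    # A larger s0 damps genes whose small variance would inflate the statistic.
    return diff / (se + s0)

# Hypothetical data: 50 genes, 3 replicates per condition.
rng = np.random.default_rng(0)
control = rng.normal(size=(50, 3))
treatment = rng.normal(size=(50, 3))
ordinary = offset_t_statistic(control, treatment, s0=0.0)  # plain t-statistic
damped = offset_t_statistic(control, treatment)            # offset applied
```

With `s0=0.0` the function reduces to the ordinary pooled-variance t-statistic; any positive offset shrinks every statistic toward zero, most strongly for genes with very small estimated variances.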

A Nonparametric Mean-Variance Smoothing Method to Assess Arabidopsis Cold Stress Transcriptional Regulator CBF2 Overexpression Microarray Data

Hu P, Maiti T - PLoS ONE (2011)

Bottom Line: The good performance of NPMVS is mainly due to its shrinkage estimation for both means and variances. In addition, NPMVS exploits a non-parametric regression between mean and variance, instead of assuming a specific parametric relationship between mean and variance. The source code written in R is available from the authors on request.

Affiliation: Department of Energy-Plant Research Laboratory, Michigan State University, East Lansing, Michigan, United States of America. phu@msu.edu

ABSTRACT
Microarrays are a powerful tool for genome-wide gene expression analysis. In microarray expression data, the mean and variance are often related. We present a non-parametric mean-variance smoothing method (NPMVS) to identify differentially expressed genes. In this method, a nonlinear smoothing curve is fitted to estimate the relationship between mean and variance. Inference is then based on shrinkage estimation of posterior means, assuming the variances are known. NPMVS and two other popular shrinkage estimation methods were applied to simulated datasets on which a variety of mean-variance relationships were imposed. The simulation study showed that NPMVS outperformed the other two methods under some mean-variance relationships and was competitive with them under the others. A real biological dataset, in which a cold stress transcription factor gene, CBF2, was overexpressed, was also analyzed with the three methods. Gene ontology and cis-element analysis showed that NPMVS identified more cold- and stress-responsive genes than the other two methods did. The good performance of NPMVS is mainly due to its shrinkage estimation for both means and variances. In addition, NPMVS exploits a non-parametric regression between mean and variance, instead of assuming a specific parametric relationship between mean and variance. The source code written in R is available from the authors on request.
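
The smoothing step described in the abstract can be pictured with the minimal Python sketch below, which regresses log sample variance on sample mean using a Gaussian kernel smoother. This is not the authors' implementation (their code is in R and is not shown here): the kernel smoother, the function name `smooth_log_variance`, the `bandwidth` value, and the simulated data are all assumptions chosen for illustration, and the shrinkage estimation of posterior means is not reproduced.

```python
import numpy as np

def smooth_log_variance(x, bandwidth=0.5):
    """Smooth per-gene log variances as a function of per-gene means.

    x: array of shape (genes, replicates).
    bandwidth: Gaussian kernel width on the mean-expression scale (assumed value).
    Returns (sample_mean, sample_var, smoothed_var), each of shape (genes,).
    """
    sample_mean = x.mean(axis=1)
    sample_var = x.var(axis=1, ddof=1)
    log_var = np.log(sample_var)
    # Pairwise Gaussian weights between gene means (genes x genes matrix).
    # Memory grows quadratically with gene count, so this is for small demos only.
    z = (sample_mean[:, None] - sample_mean[None, :]) / bandwidth
    weights = np.exp(-0.5 * z ** 2)
    # Kernel-weighted average of log variances at each gene's mean.
    smoothed_log_var = (weights @ log_var) / weights.sum(axis=1)
    return sample_mean, sample_var, np.exp(smoothed_log_var)

# Hypothetical simulation: 500 genes, 3 replicates, SD proportional to the mean.
rng = np.random.default_rng(1)
true_mean = rng.uniform(4.0, 12.0, size=500)
true_sd = 0.1 * true_mean
data = rng.normal(true_mean[:, None], true_sd[:, None], size=(500, 3))
sample_mean, sample_var, smoothed_var = smooth_log_variance(data)
```

Plotting `np.log(sample_var)` against `sample_mean` and overlaying `np.log(smoothed_var)` would give a picture similar in spirit to the green curves in the figure, with the smoothed values varying far less from gene to gene than the raw three-replicate variances.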
