Analyzing 2D gel images using a two-component empirical Bayes model.

Li F, Seillier-Moiseiwitsch F - BMC Bioinformatics (2011)

Bottom Line: The estimation of the mixture density does not take into account assumptions about the null density. The proposed constrained estimation method always yields valid estimates and more stable results. The proposed estimation approach can be applied to other contexts where large-scale hypothesis testing occurs.

Affiliation: Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, MD, USA. feng.li@fda.hhs.gov

ABSTRACT

Background: Two-dimensional polyacrylamide gel electrophoresis (2D gel, 2D PAGE, 2-DE) is a powerful tool for analyzing the proteome of an organism. Differential analysis of 2D gel images aims at finding proteins that change under different conditions, which leads to large-scale hypothesis testing as in microarray data analysis. Two-component empirical Bayes (EB) models have been widely discussed for large-scale hypothesis testing and applied in the context of genomic data. They have not been implemented for the differential analysis of 2D gel data. In the literature, the mixture and null densities of the test statistics are estimated separately. The estimation of the mixture density does not take into account assumptions about the null density. Thus, there is no guarantee that the estimated null component will be no greater than the mixture density, as it should be.

Results: We present an implementation of a two-component EB model for the analysis of 2D gel images. In contrast to the published estimation method, we propose to estimate the mixture and null densities simultaneously using a constrained estimation approach, which relies on an iteratively re-weighted least-squares algorithm. The assumption about the null density is naturally taken into account in the estimation of the mixture density. This strategy is illustrated using a set of 2D gel images from a factorial experiment. The proposed approach is validated using a set of simulated gels.

Conclusions: The two-component EB model is very useful for large-scale hypothesis testing. In proteomic analysis, the theoretical null density is often not appropriate. We demonstrate how to implement a two-component EB model for analyzing a set of 2D gel images. We show that it is necessary to estimate the mixture density and empirical null component simultaneously. The proposed constrained estimation method always yields valid estimates and more stable results. The proposed estimation approach can be applied to other contexts where large-scale hypothesis testing occurs.
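
For reference, a two-component model of this kind is commonly written as below. The notation is an assumption made here for illustration (the abstract itself defines no symbols): p_0 is the proportion of true null hypotheses, f_0 the null density, f_1 the alternative density and f the mixture density of the test statistics z.

```latex
f(z) = p_0 f_0(z) + (1 - p_0) f_1(z), \qquad
\mathrm{fdr}(z) = \frac{p_0 f_0(z)}{f(z)}, \qquad
p_0 f_0(z) \le f(z) \quad \text{for all } z .
```

The inequality on the right is the requirement that the null component be no greater than the mixture density. It holds automatically for the true densities, but not necessarily for estimates of p_0 f_0 and f that are obtained separately, in which case fdr(z) can exceed 1.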

Figure 3: Estimation results using the constrained approach. Estimates of the mixture densities and their null components from the constrained estimation approach, with the corresponding local FDR estimates. Upper panel: the solid green curves are the spline-fitted mixture densities; the blue dashed curves are the empirical null densities from the constrained estimation approach. Lower panel: the blue solid curves represent the local FDR estimates.

Mentions: Figure 2 also demonstrates that neither CME nor MLE yields a desirable empirical null estimate. The estimated null components do not stay below the estimated mixture density over the whole range of z-values. Consequently, the estimated local FDR has multiple peaks and takes values greater than 1 at many z's. The estimate of the proportion of true null hypotheses can also be greater than 1, which is not a valid outcome. There is a considerable discrepancy between the results from CME and MLE, as demonstrated by the plots for the gender effect. We tried alternative specifications for the intervals used for estimating the empirical null density and different degrees of freedom for the splines: all yielded very similar results. Moreover, we found that MLE is more sensitive to the choice of the interval [a, b], as also observed in [24]. Next, we applied the proposed constrained estimation approach with the same choices of intervals. The spline degrees of freedom that minimize the AIC were 5, 9 and 5 for the treatment, gender and interaction effects, respectively. Figure 3 displays the results. The green solid and blue dashed curves in the upper panel represent the mixture and empirical null densities, respectively. The lower panel shows the estimated local FDR at different z-values.
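
The failure described above can be reproduced with a small numerical sketch. The local FDR at z is taken as the null component (null proportion times null density) divided by the mixture density, so it exceeds 1 wherever the estimated null component rises above the estimated mixture. Everything in the example is hypothetical: the densities, the proportions and the function names are made up for illustration, and the code only checks the constraint. It is not the authors' spline-based, iteratively re-weighted least-squares estimator.

```python
import math


def normal_pdf(z, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at z."""
    u = (z - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))


def local_fdr(z, p0, f0, f):
    """Local FDR p0 * f0(z) / f(z) for density callables f0 and f."""
    return p0 * f0(z) / f(z)


def constraint_violations(grid, p0, f0, f):
    """Grid points where the null component p0 * f0 exceeds the mixture f."""
    return [z for z in grid if p0 * f0(z) > f(z)]


def mixture(z):
    # Hypothetical mixture: 90% N(0, 1) nulls and 10% N(3, 1) alternatives.
    return 0.9 * normal_pdf(z) + 0.1 * normal_pdf(z, mu=3.0)


def separate_null(z):
    # Hypothetical null fitted on its own: slightly too wide (sigma = 1.2).
    return normal_pdf(z, sigma=1.2)


grid = [i / 10.0 for i in range(-40, 61)]  # z-values from -4.0 to 6.0

# Consistent pair: the null component is part of the mixture, so no violations.
ok = constraint_violations(grid, 0.9, normal_pdf, mixture)

# Inconsistent pair: too-wide null and too-large p0 break the constraint.
bad = constraint_violations(grid, 0.95, separate_null, mixture)

print("violations with consistent estimates:", len(ok))
print("violations with separate estimates:", len(bad))
print("local FDR at z = -3 (separate):", local_fdr(-3.0, 0.95, separate_null, mixture))
```

With the consistent pair the local FDR stays between 0 and 1 on the whole grid; with the separately specified null it exceeds 1 in the left tail, which is the kind of invalid estimate that motivates estimating the two densities jointly under the constraint.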

