Comparative evaluation of gene-set analysis methods.

Liu Q, Dinu I, Adewale AJ, Potter JD, Yasui Y - BMC Bioinformatics (2007)

Bottom Line: In the simulation experiment, we found that the use of the asymptotic distribution in the two Global Tests leads to a statistical test with an incorrect size. After the standardization, the three methods gave very similar biologically-sensible results, with slightly higher statistical significance given by SAM-GS. The three methods gave similar patterns of results in the analysis of the other two microarray datasets.


Affiliation: School of Public Health, University of Alberta, Edmonton, Alberta, T6G2G3, Canada. qliu@phs.med.ualberta.ca

ABSTRACT

Background: Multiple data-analytic methods have been proposed for evaluating gene-expression levels in specific biological pathways, assessing differential expression associated with a binary phenotype. Following Goeman and Bühlmann's recent review, we compared the statistical performance of three methods, namely Global Test, ANCOVA Global Test, and SAM-GS, that test "self-contained hypotheses" via subject sampling. The three methods were compared based on a simulation experiment and analyses of three real-world microarray datasets.

Results: In the simulation experiment, we found that the use of the asymptotic distribution in the two Global Tests leads to a statistical test with an incorrect size. Specifically, p-values calculated by the scaled χ² distribution of Global Test and the asymptotic distribution of ANCOVA Global Test are too liberal, while the asymptotic distribution with a quadratic form of the Global Test results in p-values that are too conservative. The two Global Tests with permutation-based inference, however, gave a correct size. While the three methods showed similar power using permutation inference after a proper standardization of gene expression data, SAM-GS showed slightly higher power than the Global Tests. In the analysis of a real-world microarray dataset, the two Global Tests gave markedly different results, compared to SAM-GS, in identifying pathways whose gene expressions are associated with p53 mutation in cancer cell lines. A proper standardization of gene expression variances is necessary for the two Global Tests in order to produce biologically sensible results. After the standardization, the three methods gave very similar biologically-sensible results, with slightly higher statistical significance given by SAM-GS. The three methods gave similar patterns of results in the analysis of the other two microarray datasets.
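The permutation-based inference referred to above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a SAM-GS-style statistic (the sum of squared SAM-type d statistics over the genes in a set), and the function names, the fixed constant `s0`, and the default number of permutations are hypothetical choices made for illustration only.

```python
import numpy as np

def gene_set_statistic(expr, labels, s0):
    """Sum of squared SAM-type d statistics over the genes of one set.

    expr   : array of shape (genes, samples) for the genes in the set
    labels : boolean array of length samples (True = phenotype group 1)
    s0     : small positive constant added to each standard error
             (assumption: fixed here rather than estimated from the data)
    """
    a = expr[:, labels]
    b = expr[:, ~labels]
    n1, n2 = a.shape[1], b.shape[1]
    diff = a.mean(axis=1) - b.mean(axis=1)
    pooled_var = ((n1 - 1) * a.var(axis=1, ddof=1)
                  + (n2 - 1) * b.var(axis=1, ddof=1)) / (n1 + n2 - 2)
    se = np.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    d = diff / (se + s0)
    return float(np.sum(d ** 2))

def permutation_p_value(expr, labels, s0=0.1, n_perm=1000, seed=0):
    """Permutation p-value obtained by shuffling the phenotype labels."""
    rng = np.random.default_rng(seed)
    observed = gene_set_statistic(expr, labels, s0)
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(labels)  # reassign subjects to groups
        if gene_set_statistic(expr, permuted, s0) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

Because the reference distribution is built from the data by relabelling subjects, the p-value does not rely on an asymptotic approximation, which is the property the simulation results point to.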

Conclusion: An appropriate standardization makes the performance of all three methods similar, given the use of permutation-based inference. SAM-GS tends to have slightly higher power in the lower alpha-level region (i.e. gene sets that are of the greatest interest). Global Test and ANCOVA Global Test have the important advantage of being able to analyze continuous and survival phenotypes and to adjust for covariates. A free Microsoft Excel Add-In to perform SAM-GS is available from http://www.ualberta.ca/~yyasui/homepage.html.


Figure 9: Lowest P-values in the p53 data analysis: p-values of Global Test and ANCOVA Global Test after the VSN normalization vs. SAM-GS p-values after the VSN normalization. The line indicates equal p-values between SAM-GS and Global Tests.

Mentions: In the User Guides for Global Test and ANCOVA Global Test, Variance Stabilization (VSN) was used to normalize the data [10,11]. We also assessed the performance of the three methods on the p53 dataset, male vs. female dataset, and the ALL/AML dataset using VSN. The results for the p53 dataset are shown in Table 2 and Figure 9. When VSN was used for the normalization of the data, we observed: (1) p-values of Global Test and ANCOVA Global Test became similar to those of SAM-GS, but not as close as the p-values after the z-score standardization; and (2) in the lower range of p-values, the p-values for SAM-GS tended to be smaller than those of Global Test and ANCOVA Global Test (Table 2, Figure 9).
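For comparison with VSN, the z-score standardization mentioned above might look like the following Python sketch. It assumes the standardization is applied per gene across all samples, which this excerpt does not spell out, and the function name `zscore_genes` is hypothetical. VSN itself is not reproduced here.

```python
import numpy as np

def zscore_genes(expr):
    """Center each gene (row) to mean 0 and scale it to unit sample variance.

    expr : array of shape (genes, samples)
    Assumption: statistics are taken across all samples, ignoring phenotype.
    """
    mean = expr.mean(axis=1, keepdims=True)
    sd = expr.std(axis=1, ddof=1, keepdims=True)
    sd = np.where(sd == 0, 1.0, sd)  # leave constant genes at zero
    return (expr - mean) / sd
```

After this step every gene contributes on the same variance scale, so high-variance genes no longer dominate a statistic that sums over the set.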

