Testing gene set enrichment for subset of genes: Sub-GSE.

Yan X, Sun F - BMC Bioinformatics (2008)

Bottom Line: The results based on gene set analysis are generally more biologically interpretable, accurate, and robust than the results based on individual gene analysis. Sub-GSE is more sensitive than previous methods in detecting gene sets associated with a phenotype of interest, particularly in cases where only a fraction of the genes in the gene set are associated with the phenotypes. Applications to two simulated datasets and two real datasets show that Sub-GSE is sensitive to the associations between gene sets and phenotype.

Affiliation: Molecular and Computational Biology Program, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089-2910, USA. xitingya@usc.edu

ABSTRACT

Background: Many methods have been developed to test the enrichment of genes related to certain phenotypes or cell states in gene sets. These approaches usually combine gene expression data with functionally related gene sets as defined in databases such as Gene Ontology (GO), KEGG, or BioCarta. The results based on gene set analysis are generally more biologically interpretable, accurate, and robust than the results based on individual gene analysis. However, while most available methods for gene set enrichment analysis test the enrichment of the entire gene set, it is more likely that only a subset of the genes in the gene set is related to the phenotypes of interest.

Results: In this paper, we develop a novel method, termed Sub-GSE, which measures the enrichment of a predefined gene set, or pathway, by testing its subsets. The application of Sub-GSE to two simulated and two real datasets shows Sub-GSE to be more sensitive than previous methods, such as GSEA, GSA, and SigPath, in detecting gene sets associated with a phenotype of interest. This is particularly true for cases in which only a fraction of the genes in the gene set are associated with the phenotypes. Furthermore, the application of Sub-GSE to the two real datasets demonstrates that it can detect more biologically meaningful gene sets than GSEA.

Conclusion: We developed a new method to measure gene set enrichment. Applications to two simulated datasets and two real datasets show that this method is sensitive to the associations between gene sets and phenotype. The program Sub-GSE can be downloaded from http://www-rcf.usc.edu/~fsun.

© Copyright Policy - open-access

Figure 6: Comparison of the different tests in Simulation II. The average ranks of the target gene sets by Sub-GSE, GSEA, GSA and SigPath for different percentages of correlated genes (PCG) and correlation coefficients. The left panel compares the average ranks in 2-D plots in which each subplot corresponds to one value of PCG. For a given PCG, the average ranks of the target sets from the four methods are plotted against the correlation coefficient between the correlated genes. The right panel shows the average ranks of the target sets versus the PCG and correlation coefficient in a 3-D plot. The four cubes correspond to Sub-GSE, GSEA, GSA and SigPath.

Mentions: For this simulation study, we again apply the four different methods to prioritize the gene sets as in the first simulation study and calculate the average rank of the two target gene sets. The results can be found in Figure 6.
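
The average-rank comparison described above can be illustrated with a minimal Python sketch. It is not the authors' code: the method names, gene set names, and p-values are hypothetical placeholders, and it assumes each method yields one p-value per gene set, with a smaller p-value meaning stronger evidence of enrichment.

```python
from statistics import mean


def rank_gene_sets(p_values):
    """Map each gene set to its rank; rank 1 is the smallest p-value.

    Ties are broken by insertion order (an arbitrary choice for this sketch).
    """
    ordered = sorted(p_values, key=lambda name: p_values[name])
    return {name: position for position, name in enumerate(ordered, start=1)}


def average_target_rank(p_values, targets):
    """Average the ranks that one method assigns to the target gene sets."""
    ranks = rank_gene_sets(p_values)
    return mean(ranks[name] for name in targets)


# Hypothetical p-values from two unnamed methods on four gene sets.
example = {
    "method_A": {"target_1": 0.001, "target_2": 0.020, "set_3": 0.010, "set_4": 0.500},
    "method_B": {"target_1": 0.300, "target_2": 0.040, "set_3": 0.010, "set_4": 0.200},
}
targets = ["target_1", "target_2"]

# Lower average rank means the method places the target sets nearer the top.
summary = {
    method: average_target_rank(p_values, targets)
    for method, p_values in example.items()
}
print(summary)
```

In a simulation study this summary would typically be averaged again over many simulated datasets for each parameter setting; the sketch shows only a single dataset.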

