CoGA: An R Package to Identify Differentially Co-Expressed Gene Sets by Analyzing the Graph Spectra.

Santos Sde S, Galatro TF, Watanabe RA, Oba-Shinjo SM, Nagahashi Marie SK, Fujita A - PLoS ONE (2015)

Bottom Line: Gene set analysis aims to identify predefined sets of functionally related genes that are differentially expressed between two conditions. In this work, we present CoGA (Co-expression Graph Analyzer), an R package for the identification of groups of differentially associated genes between two phenotypes. The analysis is based on concepts of Information Theory applied to the spectral distributions of the gene co-expression graphs, such as the spectral entropy to measure the randomness of a graph structure and the Jensen-Shannon divergence to discriminate classes of graphs.


Affiliation: Department of Computer Science, Institute of Mathematics and Statistics, University of São Paulo, São Paulo, Brazil.

ABSTRACT
Gene set analysis aims to identify predefined sets of functionally related genes that are differentially expressed between two conditions. Although gene set analysis has been very successful, by incorporating biological knowledge about the gene sets and enhancing statistical power over gene-by-gene analyses, it does not take into account the correlation (association) structure among the genes. In this work, we present CoGA (Co-expression Graph Analyzer), an R package for the identification of groups of differentially associated genes between two phenotypes. The analysis is based on concepts of Information Theory applied to the spectral distributions of the gene co-expression graphs, such as the spectral entropy to measure the randomness of a graph structure and the Jensen-Shannon divergence to discriminate classes of graphs. The package also includes common measures to compare gene co-expression networks in terms of their structural properties, such as centrality, degree distribution, shortest path length, and clustering coefficient. Besides the structural analyses, CoGA also includes graphical interfaces for visual inspection of the networks, ranking of genes according to their "importance" in the network, and the standard differential expression analysis. We show by both simulation experiments and analyses of real data that the statistical tests performed by CoGA indeed control the rate of false positives and are able to identify differentially co-expressed gene sets that other methods failed to detect.
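
The two information-theoretic quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the CoGA implementation (CoGA is an R package). Everything here is an assumption made for illustration: the function names are hypothetical, the spectral distribution is approximated by a simple histogram of the adjacency eigenvalues, the entropy is the discrete Shannon entropy of the binned probabilities, and the two inputs are toy unweighted graphs (a cycle and a complete graph on 8 vertices, whose adjacency spectra have known closed forms) rather than gene co-expression graphs.

```python
import math

def spectral_density(eigenvalues, lo, hi, bins=10):
    """Histogram estimate of a spectral distribution on [lo, hi].
    Assumption: a histogram stands in for whatever density estimator
    a real analysis would use."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for ev in eigenvalues:
        # Clamp the index so that ev == hi falls into the last bin.
        idx = min(int((ev - lo) / width), bins - 1)
        counts[idx] += 1
    total = len(eigenvalues)
    return [c / total for c in counts]

def entropy(p):
    """Discrete Shannon entropy (natural log) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence: H((p+q)/2) - (H(p) + H(q)) / 2."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2

n = 8
# Adjacency eigenvalues of the cycle C_n: 2*cos(2*pi*k/n), k = 0..n-1.
cycle_eigs = [2 * math.cos(2 * math.pi * k / n) for k in range(n)]
# Adjacency eigenvalues of the complete graph K_n: n-1 once, -1 (n-1) times.
complete_eigs = [n - 1] + [-1] * (n - 1)

# Both spectra are binned on the same grid so the densities are comparable.
p_cycle = spectral_density(cycle_eigs, lo=-2.5, hi=7.5, bins=10)
p_complete = spectral_density(complete_eigs, lo=-2.5, hi=7.5, bins=10)

print("entropy (cycle):   ", entropy(p_cycle))
print("entropy (complete):", entropy(p_complete))
print("JS divergence:     ", jensen_shannon(p_cycle, p_complete))
```

The divergence is zero when the two binned spectra coincide and is bounded above by log 2, which is what makes it usable as a distance-like statistic between two graphs.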



pone.0135831.g002: Venn diagrams of the gene sets co-identified by the methods. Each diagram shows the number of gene sets co-identified by the Spectral distribution test from the CoGA package, and the GSCA and GSNCA methods. In (A), (B), and (C) the significance level of the tests is set to 0.01, 0.05, and 0.1, respectively.

Mentions: For each permutation test, we set the number of random resamples to 10,000. We show the resulting p-values for all gene sets in S1 Table. In Fig 2, we show Venn diagrams of the gene sets co-identified by the methods for different significance levels (α = 0.01, 0.05, 0.1). When the significance level (α) is 0.01, the CoGA package identified four sets that were not detected by the other methods. For α = 0.05 and α = 0.1, the number of sets identified only by CoGA is 25 and 40, respectively. Thus, the CoGA method can identify sets that were not detected by the GSCA and the GSNCA tests.
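
The label-permutation scheme mentioned above can be sketched generically in Python. This is an illustration under stated assumptions, not the procedure implemented in CoGA, GSCA, or GSNCA: the function names are hypothetical, the test statistic is a toy stand-in (the absolute difference in Pearson correlation of a single gene pair between the two phenotypes) rather than a divergence between spectral distributions, the expression values are made up, and the add-one correction in the p-value is a common convention that the source does not specify.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corr_difference(group_a, group_b):
    """Toy statistic (assumption): |r_A - r_B| for one gene pair.
    Each sample is a (gene1, gene2) tuple of expression values."""
    r_a = pearson(*zip(*group_a))
    r_b = pearson(*zip(*group_b))
    return abs(r_a - r_b)

def permutation_p_value(group_a, group_b, statistic, n_resamples=10000, seed=0):
    """Shuffle phenotype labels and count resamples whose statistic is at
    least as large as the observed one."""
    rng = random.Random(seed)
    observed = statistic(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # reassigns samples to the two groups at random
        if statistic(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive (assumption).
    return (hits + 1) / (n_resamples + 1)

# Made-up data: the gene pair is positively correlated in phenotype A
# and negatively correlated in phenotype B.
phenotype_a = [(1.0, 1.2), (2.0, 1.9), (3.0, 3.1), (4.0, 4.3),
               (5.0, 4.8), (6.0, 6.1), (7.0, 7.2), (8.0, 7.9)]
phenotype_b = [(1.5, -1.5), (2.5, -2.5), (3.5, -3.5), (4.5, -4.5),
               (5.5, -5.5), (6.5, -6.5), (7.5, -7.5), (8.5, -8.5)]

# A smaller number of resamples than the default keeps the demo fast.
p_value = permutation_p_value(phenotype_a, phenotype_b, corr_difference,
                              n_resamples=2000, seed=1)
print("permutation p-value:", p_value)
```

With this construction the smallest attainable p-value is 1 / (n_resamples + 1), so the number of resamples bounds how small a significance level can be tested meaningfully.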

