Large-scale analysis of Arabidopsis transcription reveals a basal co-regulation network.

Atias O, Chor B, Chamovitz DA - BMC Syst Biol (2009)

Bottom Line: We found clusters of globally co-expressed Arabidopsis genes that are enriched for known Gene Ontology annotations. The analysis reveals that part of the Arabidopsis transcriptome is globally co-expressed, and can be further divided into known as well as novel functional gene modules. Our methodology is general enough to apply to any set of microarray experiments, using any scoring function.


Affiliation: Department of Plant Sciences, The George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel. dafniosn@post.tau.ac.il

ABSTRACT

Background: Analysis of gene expression data from microarray experiments has become a central tool for identifying co-regulated, functional gene modules. A crucial aspect of such analysis is the integration of data from different experiments and different laboratories. How to weigh the contribution of different experiments is an important point influencing the final outcomes. We have developed a novel method for this integration, and applied it to genome-wide data from multiple Arabidopsis microarray experiments performed under a variety of experimental conditions. The goal of this study is to identify functional globally co-regulated gene modules in the Arabidopsis genome.

Results: Following the analysis of 21,000 Arabidopsis genes in 43 datasets and about 2 × 10^8 gene pairs, we identified a globally co-expressed gene network. We found clusters of globally co-expressed Arabidopsis genes that are enriched for known Gene Ontology annotations. Two types of modules were identified in the regulatory network that differed in their sensitivity to the node-scoring parameter; we further showed that these two types correspond to general and specialized modules. Some of these modules were further investigated using the Genevestigator compendium of microarray experiments. Analyses of smaller subsets of data led to the identification of condition-specific modules.

Conclusion: Our method for identification of gene clusters allows the integration of diverse microarray experiments from many sources. The analysis reveals that part of the Arabidopsis transcriptome is globally co-expressed, and can be further divided into known as well as novel functional gene modules. Our methodology is general enough to apply to any set of microarray experiments, using any scoring function.


Figure 3: Comparison between experimental and random networks. Each cluster found in the 0.3 (A) or 0.4 (B) networks was plotted as a dot according to cluster size and clustering coefficient. Red dots represent clusters in the experimental networks. Blue dots represent clusters found in 10 different random networks, created by random shuffling of edges in the experimental networks.
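
The quantities plotted in Figure 3 can be illustrated with a small sketch. The excerpt does not state how the clustering coefficient of a cluster was computed, so the code below assumes one common definition: the mean local clustering coefficient of the cluster's members, measured inside the cluster only. The function name `cluster_point` and the toy gene names are hypothetical.

```python
from itertools import combinations

def cluster_point(cluster, edges):
    """Return (size, mean local clustering coefficient) for one cluster.

    Assumption: the coefficient is averaged over cluster members and uses
    only edges whose two endpoints both lie inside the cluster.
    """
    members = set(cluster)
    if not members:
        return 0, 0.0
    # Build the neighbour sets of the subgraph induced by the cluster.
    nbrs = {g: set() for g in members}
    for a, b in edges:
        if a in members and b in members and a != b:
            nbrs[a].add(b)
            nbrs[b].add(a)
    coeffs = []
    for g in members:
        k = len(nbrs[g])
        if k < 2:
            # Fewer than two neighbours: no triangle is possible.
            coeffs.append(0.0)
            continue
        # Count neighbour pairs that are themselves connected.
        links = sum(1 for u, v in combinations(nbrs[g], 2) if v in nbrs[u])
        coeffs.append(2 * links / (k * (k - 1)))
    return len(members), sum(coeffs) / len(coeffs)

# Toy example: a triangle (a, b, c) with one extra gene d attached to c.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
size, coefficient = cluster_point(["a", "b", "c", "d"], toy_edges)
```

Each cluster would then contribute one dot at `(size, coefficient)`, red for the experimental network and blue for a shuffled one.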

Mentions: To select a threshold for tscore, we took into account network size, as well as the number of clusters detected by MCODE for different threshold values (Figure 1). We chose to further explore the networks built using the thresholds 0.3 and 0.4 as a good compromise between compact network size and a relatively large number of clusters (Figure 2 and Table 2). We compared these results to those obtained from randomized networks, produced by two different methods, as follows. First, the scores calculated for each gene pair were randomly shuffled, and a new network was built using either the 0.3 or the 0.4 threshold. This procedure was repeated 10 times, and no clusters were detected by MCODE in these randomized networks, indicating that the scores calculated using our method represent meaningful interactions between the genes. As a second verification, we randomly shuffled the edges of the 0.3 and of the 0.4 networks, which were created using the correct scores. This procedure was independently repeated 10 times for each threshold, and each random network was searched for clusters using MCODE. Results summarized in Table 3 show that the shuffled random networks have significantly fewer clusters than the real ones. Average cluster size in the random networks was 4-5 times larger than in the real networks, whose clusters tended to be smaller and denser than those detected in the random networks (Figure 3). This indicates that the experimental networks have a rich topology, which may represent biological meaning.
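
The two randomization controls described above can be sketched on toy data as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes that an edge is kept when the pair score is at least the threshold, and that "shuffling edges" means placing the same number of edges between uniformly chosen gene pairs (the excerpt does not say whether node degrees were preserved). All function names and the toy scores are hypothetical, and cluster detection with MCODE is not reproduced here.

```python
import random
from itertools import combinations

def build_network(scores, threshold):
    """Keep gene pairs whose score is at least the threshold (assumed rule)."""
    return {pair for pair, score in scores.items() if score >= threshold}

def shuffle_scores(scores, rng):
    """Control 1: reassign the observed scores to gene pairs at random."""
    pairs = list(scores)
    values = list(scores.values())
    rng.shuffle(values)
    return dict(zip(pairs, values))

def shuffle_edges(edges, genes, rng):
    """Control 2: place the same number of edges between random gene pairs.

    Assumption: uniform placement, without preserving node degrees.
    Enumerating all pairs is only practical for toy-sized gene lists.
    """
    all_pairs = list(combinations(genes, 2))
    return set(rng.sample(all_pairs, len(edges)))

# Toy data: 30 hypothetical genes with random pair scores.
rng = random.Random(0)
genes = [f"g{i}" for i in range(30)]
scores = {pair: rng.random() for pair in combinations(genes, 2)}

real = build_network(scores, 0.3)
score_shuffled = build_network(shuffle_scores(scores, rng), 0.3)
edge_shuffled = shuffle_edges(real, genes, rng)
```

In this sketch both controls keep the number of edges fixed and change only which gene pairs are connected, so any difference in the clusters found afterwards would come from the wiring rather than from network size. In the study, each control was repeated 10 times per threshold before comparing cluster counts and sizes.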

