MADIBA: a web server toolkit for biological interpretation of Plasmodium and plant gene clusters.

Law PJ, Claudel-Renard C, Joubert F, Louw AI, Berger DK - BMC Genomics (2008)

Bottom Line: While many algorithms and software tools have been developed for analysing gene expression, extracting relevant information from experimental data remains a substantial challenge that requires significant time and skill. MADIBA is an integrated, online tool that will assist researchers in interpreting their results and understanding the meaning of the co-expression of a cluster of genes. In most cases, the conclusions reached by the original authors were quickly and easily obtained by analysing the gene clusters with MADIBA.


Affiliation: Bioinformatics and Computational Biology Unit, African Centre for Gene Technologies (ACGT), Department of Biochemistry, Faculty of Natural and Agricultural Sciences, University of Pretoria, Pretoria, 0002, South Africa. plaw@tuks.co.za

ABSTRACT

Background: Microarray technology makes it possible to identify changes in the gene expression of an organism under various conditions. Data mining is thus essential for deducing significant biological information, such as the identification of new biological mechanisms or putative drug targets. While many algorithms and software tools have been developed for analysing gene expression, extracting relevant information from experimental data remains a substantial challenge that requires significant time and skill.

Description: MADIBA (MicroArray Data Interface for Biological Annotation) facilitates the assignment of biological meaning to gene expression clusters by automating the post-processing stage. A relational database has been designed to store the data from gene to pathway for Plasmodium, rice and Arabidopsis. Tools within the web interface allow rapid analysis of each cluster: identifying the relevant Gene Ontology terms, visualising the metabolic pathways in which the genes are implicated, showing their genomic localisations, finding putative common transcriptional regulatory elements in the upstream sequences, and running an analysis specific to the organism being studied.

Conclusion: MADIBA is an integrated, online tool that will assist researchers in interpreting their results and understanding the meaning of the co-expression of a cluster of genes. The functionality of MADIBA was validated by analysing a number of gene clusters from several published experiments: expression profiling of the Plasmodium life cycle, and salt stress treatments of Arabidopsis and rice. In most cases, the conclusions reached by the original authors were quickly and easily obtained by analysing the gene clusters with MADIBA.


Figure 4: Results from the Arabidopsis data. (A) Analysis of cluster 0 from the Arabidopsis salt stress experiment [42] with the Metabolic Pathways module revealed that the cluster contained genes involved in lignin biosynthesis. The red colour indicates that the annotations were found by two annotation methods (PRIAM and KEGG in this case), and the purple indicates the enzyme was annotated by PRIAM only. (B) After analysing cluster 8 of the Arabidopsis data [42] with the Transcription Regulation module, it was possible to identify putative transcription factor binding sites. The output of the oligo-analysis tool of RSAT is shown, indicating two motifs on the reverse complement that were identified as similar to the WRKY binding site ((C/T)TGAC(T/C)) (highlighted in the red box). Cluster 8 is known to contain several WRKY transcription factors and several disease-resistance genes. (C) Output from the Patch program of the TRANSFAC sub-module. Shown is the PR-1a (a pathogenesis related protein) promoter binding site that was identified. The table headers are provided for convenience.

Mentions: MADIBA was used to analyse data from a study of the response of Arabidopsis to salt stress [42]. The genes had been grouped into 10 major clusters using fuzzy k-means clustering. Analysis of the individual clusters with MADIBA supported the authors' conclusions. One example is cluster 0, which contained genes responding to osmotic stress in leaves and salt stress in roots; meta-analysis with other array data led the authors to conclude that it contained many biotic stress response genes. Analysis with the Metabolic Pathways module of MADIBA showed over-representation of several enzymes involved in lignin biosynthesis in cluster 0 (p = 0.001) (Figure 4a), which is indicative of a defence response. The Gene Ontology analysis showed enrichment of the term dihydrocamalexic acid decarboxylase activity, which refers to an enzyme responsible for the production of camalexin, a phytoalexin that Arabidopsis produces in response to pathogen infection [43]. Other enriched terms included chitinase activity, ion channel activity, and terms related to calcium ions, including ion binding and calmodulin binding. The calcium-related terms most likely reflect the effect of salinity on the plant. In the biological process ontology, enriched terms included regulation of cellular defence responses, hypersensitive response, and several biotic stress indicators, including responses to ethylene stimulus, jasmonic acid stimulus, salicylic acid stimulus and abscisic acid stimulus. These hormones are known to be involved in plant defences as well as in salt-stress signalling, again suggesting cross-talk between the various signalling responses. Cluster 8 was annotated as immediate response genes, and contained members of the WRKY transcription factor family and disease-resistance protein genes.
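The over-representation p-value quoted for cluster 0 can be illustrated with a small calculation. The sketch below assumes a one-sided hypergeometric test, which is a common choice for this kind of enrichment analysis; the text above does not state which test MADIBA applies, and the function name and all counts in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(cluster_hits, cluster_size, genome_hits, genome_size):
    """One-sided hypergeometric p-value, P(X >= cluster_hits).

    Models the cluster as cluster_size genes drawn without replacement from
    genome_size genes, of which genome_hits carry the annotation of interest
    (for example, an enzyme of a given pathway).
    """
    max_hits = min(cluster_size, genome_hits)
    total = comb(genome_size, cluster_size)
    # Sum the probability mass of observing cluster_hits or more annotated genes.
    tail = sum(
        comb(genome_hits, k) * comb(genome_size - genome_hits, cluster_size - k)
        for k in range(cluster_hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts, not taken from the study:
# 6 of 200 cluster genes annotated, against 60 of 25000 genes genome-wide.
p_value = hypergeom_enrichment_p(6, 200, 60, 25000)
print(f"p = {p_value:.3g}")
```

Exact integer binomials keep the sketch dependency-free; for many terms per cluster, a multiple-testing correction would normally be applied on top of the raw p-values.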
Analysis with the oligo-analysis program of RSAT showed that, on the reverse complement, the TTGACT and TTTGAC motifs were over-represented in the cluster (E-values 1.1 × 10⁻⁷ and 2 × 10⁻⁷, respectively); both are similar to the WRKY binding site (C/T)TGAC(T/C) [44] (Figure 4b). In addition, analysis with the TRANSFAC sub-module of the Transcription Regulation module showed that a large proportion of the genes (110 of the 142 genes in the cluster) contained a motif (ATTTAC) that is functionally important in the promoter of PR-1a, a well-characterised pathogenesis-related protein [45] (Figure 4c).
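To make the comparison with the WRKY binding site concrete, the sketch below encodes the consensus (C/T)TGAC(T/C) as a regular expression and scans both strands of a sequence. It is only an illustration of consensus matching, not the RSAT oligo-analysis algorithm (which scores oligonucleotide over-representation); the function names and the example sequence are hypothetical.

```python
import re

# WRKY binding site consensus (C/T)TGAC(T/C) written as a regular expression.
WRKY_CONSENSUS = "[CT]TGAC[CT]"
# A lookahead wrapper lets finditer report overlapping occurrences as well.
WRKY_SCAN = re.compile(f"(?=({WRKY_CONSENSUS}))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq):
    """List (strand, start, site) for every consensus match on either strand.

    start is 0-based and counted on the strand that was scanned.
    """
    seq = seq.upper()
    hits = []
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        for match in WRKY_SCAN.finditer(scanned):
            hits.append((strand, match.start(), match.group(1)))
    return hits


# TTGACT matches the full consensus; TTTGAC only contains the (C/T)TGAC part.
print(re.fullmatch(WRKY_CONSENSUS, "TTGACT") is not None)
print(re.fullmatch(WRKY_CONSENSUS, "TTTGAC") is not None)
print(re.search("[CT]TGAC", "TTTGAC") is not None)

# Hypothetical upstream fragment whose site lies on the reverse strand.
print(find_sites("GGAGTCAAGG"))
```

Under this encoding TTGACT is a complete instance of the site, while TTTGAC overlaps it in five of six positions, which is consistent with both motifs being described as similar to the WRKY site.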