DoOPSearch: a web-based tool for finding and analysing common conserved motifs in the promoter regions of different chordate and plant genes.

Sebestyén E, Nagy T, Suhai S, Barta E - BMC Bioinformatics (2009)

Bottom Line: Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s). The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that might provide a clue to the function of the motifs and genes.


Affiliation: Agricultural Research Institute of the Hungarian Academy of Sciences, Martonvásár, Brunszvik u, 2, H-2462, Hungary. sebestyene@mail.mgki.hu

ABSTRACT

Background: The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases shows thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). To identify common conserved motifs, we need a specific tool to be able to search amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s).

Results: We have developed a new tool called DoOPSearch (http://doopsearch.abc.hu) for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows users to search these motif collections or the promoter regions of DoOP with user-supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program.
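The over-representation step can be illustrated with a small sketch. It is not the modified GeneMerge program itself; it assumes a one-sided hypergeometric test with a Bonferroni correction, a common way to score Gene Ontology enrichment. The function names and the gene-to-term dictionary layout are hypothetical.

```python
from math import comb


def hypergeom_tail(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` carry the GO term."""
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(population, sample)


def enriched_terms(study_genes, annotations, alpha=0.05):
    """Return (term, hits, corrected_p) for terms overrepresented in the study set.

    `annotations` maps every background gene to its set of GO terms
    (hypothetical layout, chosen for this example only).
    """
    study = set(study_genes) & set(annotations)

    # How many background genes and study genes carry each term.
    background_counts = {}
    for terms in annotations.values():
        for term in terms:
            background_counts[term] = background_counts.get(term, 0) + 1
    study_counts = {}
    for gene in study:
        for term in annotations[gene]:
            study_counts[term] = study_counts.get(term, 0) + 1

    n_tests = len(study_counts)  # Bonferroni: one test per term seen in the study set
    results = []
    for term, hits in study_counts.items():
        p = hypergeom_tail(len(annotations), background_counts[term], len(study), hits)
        corrected = min(1.0, p * n_tests)
        if corrected <= alpha:
            results.append((term, hits, corrected))
    return sorted(results, key=lambda r: r[2])
```

Here the study set would be the gene list returned by a motif search, and the background would be all genes in the searched collection.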

Conclusion: We present here a promoter analysis tool based on comparative genomics. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user-specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that might provide a clue to the function of the motifs and genes.
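To make the idea of querying with a consensus sequence concrete, the sketch below expands an IUPAC consensus into a regular expression and reports exact matches on the forward strand. This is only an illustration of consensus matching; it is not the scoring search that DoOPSearch performs, and the function names are hypothetical.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_pattern(consensus):
    """Translate an IUPAC consensus (e.g. 'TGACGY') into a regex string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_sites(consensus, sequence):
    """Return (position, matched_text) for every forward-strand match.

    A zero-width lookahead is used so that overlapping matches are reported.
    """
    lookahead = re.compile("(?=(" + consensus_to_pattern(consensus) + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence.upper())]


if __name__ == "__main__":
    for position, site in find_sites("TGACGY", "AATGACGTCATGACGCAA"):
        print(position, site)
```

A fuller search would also scan the reverse complement and tolerate mismatches, which this sketch leaves out.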


Figure 2: MOFEXT and GeneMerge analysis of the 300 base pair upstream region of the matrilin-1 and the FABP4 genes. We downloaded the 500 base pair promoter region of the matrilin-1 (A1) and FABP4 (B1) genes. We used the last 300 base pairs of these sequences as a query in the MOFEXT search with the following parameters: wordsize: 8, cutoff: 70 and the 1000 base pair E subset (A2 and B2). The MOFEXT search returned 30548 (MATN1) and 23463 (FABP4) hits. We used the score range 151-40 (MATN1) and 105-40 (FABP4) for the GeneMerge analysis (A3 and B3). The genes in the GO term "Extracellular matrix (sensu metazoan)" are listed in panel A4. Some genes in the GO term "positive regulation of transcription from RNA polymerase II promoter" are listed in panel B4.

Mentions: First we demonstrate how a longer sequence can be used for promoter analysis (Figure 2). Earlier, we identified a promoter element in the upstream region of the matrilin-1 gene using both experimental and in silico analysis tools [24]. In this example we used a 300 base pair upstream fragment starting from the ATG start codon of the human matrilin-1 gene available at the DoOP database [15]. As a control, we chose the FABP4 gene and used the same promoter region. After the MOFEXT search with exactly the same parameters, we filtered the result and ran the GeneMerge analysis. Specific Gene Ontology categories are overrepresented in each example, while some categories, such as "transcription" or "transcription factor activity", appear in both results. One possible explanation is that transcription factor genes contain more conserved motifs in their promoter regions than other types of genes, but confirming this requires further analyses.
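The two data-handling steps in this example, taking the last 300 base pairs of a downloaded promoter as the query and keeping only hits within a score range before the GeneMerge analysis, could be sketched as follows. The `Hit` record and both function names are hypothetical; the actual MOFEXT output format is not described here.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical search hit: the gene linked to the matched motif and its score."""
    gene: str
    score: int


def upstream_query(promoter, length=300):
    """Return the last `length` bases of a downloaded promoter sequence."""
    if len(promoter) < length:
        raise ValueError("promoter is shorter than the requested query length")
    return promoter[-length:]


def genes_in_score_range(hits, low, high):
    """Collect the distinct genes whose hits score between `low` and `high` inclusive."""
    return sorted({hit.gene for hit in hits if low <= hit.score <= high})


if __name__ == "__main__":
    # Toy data only; not taken from the paper.
    promoter = "A" * 200 + "C" * 300
    query = upstream_query(promoter)
    hits = [Hit("geneA", 151), Hit("geneB", 39), Hit("geneA", 60), Hit("geneC", 40)]
    print(len(query), genes_in_score_range(hits, 40, 151))
```

The resulting gene list is what would then be passed on as the study set for the over-representation analysis.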

