Biocomputational prediction of non-coding RNAs in model cyanobacteria.

Voss B, Georg J, Schön V, Ude S, Hess WR - BMC Genomics (2009)

Bottom Line: Removing also transposon-associated repeats, finally 78, 53, 42 and 168 sequences, respectively, are left belonging to 109 different clusters in the data set. Yfr2a promoter-luxAB fusions confirmed a very strong activity of this promoter and indicated a stimulation of expression if the cultures were exposed to elevated light intensities. Modelling of RNA secondary structures indicated two conserved single-stranded sequence motifs that might be involved in RNA-protein interactions or in the recognition of target RNAs.


Affiliation: University of Freiburg, Faculty of Biology, Genetics and Experimental Bioinformatics, Freiburg, Germany. bjoern.voss@biologie.uni-freiburg.de

ABSTRACT

Background: In bacteria, non-coding RNAs (ncRNA) are crucial regulators of gene expression, controlling various stress responses, virulence, and motility. Previous work revealed a relatively high number of ncRNAs in some marine cyanobacteria. However, for efficient genetic and biochemical analysis it would be desirable to identify a set of ncRNA candidate genes in model cyanobacteria that are easy to manipulate and for which extended mutant, transcriptomic and proteomic data sets are available.

Results: Here we have used comparative genome analysis for the biocomputational prediction of ncRNA genes and other sequence/structure-conserved elements in intergenic regions of the three unicellular model cyanobacteria Synechocystis PCC6803, Synechococcus elongatus PCC6301 and Thermosynechococcus elongatus BP1 plus the toxic Microcystis aeruginosa NIES843. The unfiltered numbers of predicted elements in these strains are 383, 168, 168, and 809, respectively, combined into 443 sequence clusters, whereas the numbers of individual elements with high support are 94, 56, 64, and 406, respectively. After transposon-associated repeats are also removed, 78, 53, 42 and 168 sequences, respectively, remain, belonging to 109 different clusters in the data set. Experimental analysis of selected ncRNA candidates in Synechocystis PCC6803 validated new ncRNAs originating from the fabF-hoxH and apcC-prmA intergenic spacers and three highly expressed ncRNAs belonging to the Yfr2 family of ncRNAs. Yfr2a promoter-luxAB fusions confirmed a very strong activity of this promoter and indicated that expression is stimulated when the cultures are exposed to elevated light intensities.

Conclusion: Comparison to entries in Rfam and experimental testing of selected ncRNA candidates in Synechocystis PCC6803 indicate a high reliability of the current prediction, despite some contamination by the high number of repetitive sequences in some of these species. In particular, we identified in the four species altogether 8 new ncRNA homologs belonging to the Yfr2 family of ncRNAs. Modelling of RNA secondary structures indicated two conserved single-stranded sequence motifs that might be involved in RNA-protein interactions or in the recognition of target RNAs. Since our analysis has been restricted to finding ncRNA candidates with a reasonably high degree of conservation among these four cyanobacteria, there might be many more, requiring direct experimental approaches for their identification.


Figure 6: Sequence alignments and secondary structure predictions of the 8 Yfr2-type ncRNAs identify conserved structure and sequence motifs. A. Secondary structure predictions of the three experimentally confirmed ncRNAs Yfr2a, Yfr2b and Yfr2c from Synechocystis 6803. They share a 12 nt central loop on a long helical stem that is interrupted by at least one bulge at position -4 with regard to this loop (red arrows). Moreover, the first 8–13 nt are predicted to be single-stranded. B. Alignments of all eight predicted Yfr2-type DNA sequences reveal two extremely conserved nucleotide stretches: the short unpaired region at the 5' end as well as the predicted centrally located loop element (labelled by horizontal black arrows). In contrast, the region between these two elements is not conserved in sequence or in its length. The single nucleotide breaking the stem at position -4 with regard to the loop is indicated by a red arrow. Note that the 3' end of the transcribed region has not been mapped. Those sequence stretches resembling "GGA" and "ANGGA" motifs are labelled by a set of black arrows. The non-Synechocystis 6803 sequences are one from Thermosynechococcus (Thermo_Yfr2), two from Microcystis (Micro_Yfr2a and Micro_Yfr2b) and two from Synechococcus 6301 (6301_Yfr2a and 6301_Yfr2b).
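The structural features named in the caption (a hairpin loop of a given length, an unpaired 5' leader, an unpaired nucleotide at position -4 relative to the loop) can be read off a dot-bracket string. The following is a minimal sketch, not the authors' method: the function names are hypothetical, the toy structure is invented for illustration and is not a predicted Yfr2 fold, and treating "position -4" as the fourth nucleotide upstream of the first loop nucleotide is an assumption.

```python
def hairpin_loops(structure):
    """Return (start, length) for each hairpin loop in a dot-bracket string.

    A hairpin loop is a run of unpaired positions enclosed directly by
    an opening and a closing bracket.
    """
    loops = []
    last_open = None  # index of the most recent "(" not yet followed by ")"
    for i, ch in enumerate(structure):
        if ch == "(":
            last_open = i
        elif ch == ")":
            if last_open is not None:
                # Only dots can lie between last_open and i at this point.
                loops.append((last_open + 1, i - last_open - 1))
                last_open = None
    return loops


def leader_length(structure):
    """Number of unpaired positions at the 5' end."""
    return len(structure) - len(structure.lstrip("."))


def unpaired_at_offset(structure, loop_start, offset=-4):
    """True if the position at loop_start + offset is unpaired (assumed reading of 'position -4')."""
    return structure[loop_start + offset] == "."


# Hypothetical toy structure: 10 nt leader, stem with one unpaired
# nucleotide on the 5' strand, 12 nt loop. Not a real Yfr2 prediction.
toy = "." * 10 + "((((" + "." + "(((" + "." * 12 + ")))" + "))))"

loops = hairpin_loops(toy)
start, length = loops[0]
print("5' leader:", leader_length(toy), "nt")
print("loop start:", start, "length:", length)
print("unpaired at -4:", unpaired_at_offset(toy, start))
```

The same functions could be applied to dot-bracket output from any folding program; only the string format is assumed here.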

Mentions: Sequence alignments and secondary structure predictions of the 8 Yfr2-5-type ncRNAs suggest a centrally located single-stranded loop element and a short unpaired region at the 5' end, both of which are highly conserved (Fig. 6). The long helical stem bearing the 12 nt loop is predicted in all sequences to be interrupted by at least one bulge at position -4 with regard to this loop (Fig. 6). Interestingly, this feature is shared with the Yfr2-Yfr5 ncRNAs from marine cyanobacteria [8]. Bulge motifs have been recognized in a wide range of RNAs as key structural elements determining molecular recognition by other molecules [38]. Therefore, the conserved bulges in Yfr2-type ncRNAs may indicate their interaction with proteins. A further hint comes from the unpaired regions of these ncRNAs, which resemble the "GGA" and extended "ANGGA" RsmA-binding motifs. The ncRNAs RsmX, RsmY and RsmZ found in Pseudomonas species contain several GGA and extended ANGGA motifs [39]. For RsmY, these motifs have been shown to be essential for sequestration of RsmA and its homolog RsmE in Pseudomonas fluorescens [40]. Non-coding RNAs containing this motif frequently have a titrating role on their target protein, regulating gene expression at the translational level. It was not possible, however, to identify RsmA and RsmE homologs in cyanobacteria.
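Locating "GGA" and "ANGGA" stretches in a sequence is a simple pattern search. The sketch below is illustrative only and is not taken from the paper: the function name is hypothetical, N is assumed to mean any of A, C, G or U, and the input is a made-up toy sequence rather than a real Yfr2 sequence. A match by itself says nothing about whether the stretch is unpaired or bound by a protein.

```python
import re

# Lookahead patterns so that overlapping occurrences are all reported.
MOTIFS = {
    "GGA": re.compile(r"(?=(GGA))"),
    "ANGGA": re.compile(r"(?=(A[ACGU]GGA))"),  # N assumed to be any nucleotide
}


def find_motifs(seq):
    """Return {motif name: [0-based start positions]} for an RNA or DNA sequence."""
    rna = seq.upper().replace("T", "U")  # accept DNA-style input as well
    return {
        name: [m.start() for m in pattern.finditer(rna)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy sequence, NOT a real Yfr2 sequence.
toy_seq = "ACGGAUUAAGGACC"
hits = find_motifs(toy_seq)
print(hits)
```

To restrict hits to single-stranded regions, the reported positions could be intersected with the unpaired positions of a predicted structure.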

