Transcriptome analysis of thermophilic methylotrophic Bacillus methanolicus MGA3 using RNA-sequencing provides detailed insights into its previously uncharted transcriptional landscape.

Irla M, Neshat A, Brautaset T, Rückert C, Kalinowski J, Wendisch VF - BMC Genomics (2015)

Bottom Line: Further investigation of the identified 5'-UTRs resulted in the detailed characterization of their length distribution and the detection of 75 hitherto unknown cis-regulatory RNA elements. The analysis of the operon structures revealed that almost half of the genes are transcribed monocistronically (940), whereas 1,164 genes are organized in 381 operons. Several of the genes related to methylotrophy had highly abundant transcripts.


Affiliation: Genetics of Prokaryotes, Faculty of Biology & Center for Biotechnology, Bielefeld University, Universitätsstr. 25, 33615, Bielefeld, Germany. mairla@cebitec.uni-bielefeld.de.

ABSTRACT

Background: Bacillus methanolicus MGA3 is a thermophilic, facultative ribulose monophosphate (RuMP) cycle methylotroph. This, together with its ability to produce high yields of amino acids, makes the microorganism a promising candidate for biotechnological applications. The B. methanolicus MGA3 genome consists of a circular chromosome of 3,337,035 nucleotides (nt), the 19,174 nt plasmid pBM19 and the 68,999 nt plasmid pBM69. A total of 3,218 protein-coding regions were annotated on the chromosome, 22 on pBM19 and 82 on pBM69. In the present study, RNA-seq was used to comprehensively investigate the transcriptome of B. methanolicus MGA3 in order to improve the genome annotation, identify novel transcripts, analyze conserved sequence motifs involved in gene expression and reveal operon structures. To this end, two different cDNA library preparation methods were applied: one that allows characterization of the whole transcriptome and another that includes enrichment of primary transcript 5'-ends.

Results: Analysis of the primary transcriptome data enabled the detection of 2,167 putative transcription start sites (TSSs), which were categorized into 1,642 TSSs located in the upstream region (5'-UTR) of known protein-coding genes and 525 TSSs of novel antisense, intragenic, or intergenic transcripts. First, 14 incorrectly annotated translation start sites (TLSs) were corrected based on the primary transcriptome data. Further investigation of the identified 5'-UTRs resulted in the detailed characterization of their length distribution and the detection of 75 hitherto unknown cis-regulatory RNA elements. Moreover, the exact TSS positions were used to define conserved sequence motifs for translation start sites, ribosome binding sites and promoters in B. methanolicus MGA3. Based on the whole transcriptome data set, novel transcripts, operon structures and mRNA abundances were determined. The analysis of the operon structures revealed that almost half of the genes are transcribed monocistronically (940), whereas 1,164 genes are organized in 381 operons. Several of the genes related to methylotrophy had highly abundant transcripts.
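The counts quoted in the Results can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are ours, and taking the 2,104 genes assigned to a transcription unit as the denominator behind "almost half" is an assumption, since the abstract does not state which denominator was used.

```python
# Counts quoted in the abstract (variable names are illustrative).
tss_total = 2167      # all putative transcription start sites
tss_5utr = 1642       # TSSs upstream of known protein-coding genes
tss_novel = 525       # antisense, intragenic or intergenic TSSs

mono_genes = 940      # genes transcribed monocistronically
operon_genes = 1164   # genes organized in operons
operon_count = 381    # number of operons

# The two TSS categories should add up to the reported total.
tss_sum = tss_5utr + tss_novel

# ASSUMPTION: "almost half" refers to genes assigned to a transcription
# unit (monocistronic + operon genes), not to all annotated genes.
genes_assigned = mono_genes + operon_genes
mono_fraction = mono_genes / genes_assigned        # about 0.447
mean_operon_size = operon_genes / operon_count     # about 3.06 genes

print(f"TSS categories sum to {tss_sum} (reported: {tss_total})")
print(f"Monocistronic fraction: {mono_fraction:.1%}")
print(f"Mean genes per operon: {mean_operon_size:.2f}")
```

Under that assumption the monocistronic share is roughly 45%, which is consistent with the wording "almost half".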

Conclusion: The extensive insights into the transcriptional landscape of B. methanolicus MGA3, gained in this study, represent a valuable foundation for further comparative quantitative transcriptome analyses and possibly also for the development of molecular biology tools which at present are very limited for this organism.


Figure 4: Distribution of nucleotides within the −10 and −35 regions of B. methanolicus MGA3 promoter regions. The conserved sequences were determined using the Improbizer motif-finding program [23]. For this analysis, the upstream regions of the 1,642 TSSs located in the 5′-UTR of annotated protein-coding genes were used. Conserved −10 motifs were detected in 1,619 sequences (98.6%), whereas 1,616 of the analyzed sequences contributed to the identification of the −35 motif (98.4%). The conservation of a specific nucleotide at a given position is measured in bits and is represented in the illustration by the size of the nucleotide. The hexamer of the core −10 region is underlined. The position values below the nucleotides are given relative to the positions of the identified TSSs, while the two spacers represent the mean distance between the extended −10 region and the TSS, and between the −10 and −35 regions, respectively. The depicted sequence logo was created with the software WebLogo [24].
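The "bits" scale in the caption refers to the information content of each alignment column. As a rough illustration, the sketch below computes it for a DNA column as log2(4) minus the Shannon entropy of the observed base frequencies. The function name and the toy columns are hypothetical, and the small-sample correction that sequence-logo tools may apply is deliberately left out.

```python
import math
from collections import Counter


def column_information(column: str, alphabet: str = "ACGT") -> float:
    """Information content (bits) of one alignment column.

    Computed as log2(len(alphabet)) minus the Shannon entropy of the
    base frequencies. No small-sample correction is applied.
    """
    counts = Counter(column.upper())
    total = sum(counts[base] for base in alphabet)
    if total == 0:
        return 0.0
    entropy = 0.0
    for base in alphabet:
        p = counts[base] / total
        if p > 0:
            entropy -= p * math.log2(p)
    return math.log2(len(alphabet)) - entropy


# Toy columns (not data from the study):
fully_conserved = column_information("TTTT")   # one base only
two_bases = column_information("TTAA")         # two bases, equal share
uniform = column_information("ACGT")           # all four bases equal
print(fully_conserved, two_bases, uniform)
```

A fully conserved position reaches the 2-bit maximum for DNA, while a position where all four bases are equally frequent carries no information and would be drawn with negligible height in the logo.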

Mentions: The upstream regions of the 1,642 TSSs that were identified in the 5′-UTR of annotated genes (see section “Detection of putative transcription start sites from RNA-seq data of enriched 5′ ends of primary transcripts”) were searched for conserved motifs within 70 bases upstream of each TSS using Improbizer [23]. The −10 hexamer sequence TAtaaT was identified in 1,619 of the upstream sequences (98.6%) (Figure 4), and the motif TGN was present in 33% of the sequences. The distance from an identified −10 region to the corresponding TSS ranges from 4 to 10 nt and averages 6.7 bases. Upstream of an identified −10 hexamer sequence, a weakly conserved −35 motif, ttgana, was found in 1,616 (98.4%) of the upstream sequences (Figure 4). The first three bases of the motif are present in approximately 70% of the sequences and the following three bases in about 46%. The average distance between the −10 and the −35 regions was 16.6 bases in B. methanolicus MGA3.
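To make the spacing figures concrete, the sketch below scans the 3′ end of an upstream sequence for a −10-like hexamer by simple consensus matching. This is not the Improbizer procedure used in the study; it is a much cruder stand-in. The function names, the mismatch threshold and the example sequence are all hypothetical, and the gap is assumed to be the number of bases between the last base of the hexamer and the TSS, which the text does not define explicitly.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_minus10(upstream, consensus="TATAAT", min_gap=4, max_gap=10,
                 max_mismatches=2):
    """Best -10-like hexamer in a sequence that ends directly before the TSS.

    Returns (start_index, gap_to_tss, mismatches, has_tgn) or None.
    ASSUMPTION: gap_to_tss counts the bases between the hexamer's last
    base and the TSS; the 4 to 10 nt window follows the reported range.
    """
    upstream = upstream.upper()
    k = len(consensus)
    best = None
    for gap in range(min_gap, max_gap + 1):
        start = len(upstream) - gap - k
        if start < 0:
            break
        mismatches = hamming(upstream[start:start + k], consensus)
        if mismatches > max_mismatches:
            continue
        # Extended -10: "TG" followed by any base, directly upstream.
        has_tgn = start >= 3 and upstream[start - 3:start - 1] == "TG"
        if best is None or mismatches < best[2]:
            best = (start, gap, mismatches, has_tgn)
    return best


# Synthetic 70-nt upstream region (not a real B. methanolicus sequence):
example = "C" * 54 + "TGC" + "TATAAT" + "GCAGCAG"
hit = find_minus10(example)
print(hit)
```

For the synthetic sequence, the hit is a perfect hexamer at index 57 with a 7-nt gap and a TGN extension. A real analysis would need a probabilistic motif model rather than a fixed consensus, since the reported motif (TAtaaT) is only partly conserved.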

