A comprehensive metatranscriptome analysis pipeline and its validation using human small intestine microbiota datasets.

Leimena MM, Ramiro-Garcia J, Davids M, van den Bogert B, Smidt H, Smid EJ, Boekhorst J, Zoetendal EG, Schaap PJ, Kleerebezem M - BMC Genomics (2013)

Bottom Line: Reproducibility of the metatranscriptome sequencing approach was established by independent duplicate experiments. In addition, comparison of metatranscriptome analysis employing single- or paired-end sequencing methods indicated that the latter approach does not provide improved functional or phylogenetic insights. The set-up of the pipeline is very generic and can be applied for (bacterial) metatranscriptome analysis in any chosen niche.


Affiliation: TI Food and Nutrition (TIFN), P.O. Box 557, 6700 AN, Wageningen, The Netherlands.

ABSTRACT

Background: Next generation sequencing (NGS) technologies can be applied in complex microbial ecosystems for metatranscriptome analysis by employing direct cDNA sequencing, which is known as RNA sequencing (RNA-seq). RNA-seq generates large datasets of great complexity, the comprehensive interpretation of which requires a reliable bioinformatic pipeline. In this study, we focus on the development of such a metatranscriptome pipeline, which we validate using Illumina RNA-seq datasets derived from the small intestine microbiota of two individuals with an ileostomy.

Results: The metatranscriptome pipeline developed here enabled effective removal of rRNA-derived sequences, followed by confident assignment of the predicted function and taxonomic origin of the mRNA reads. Phylogenetic analysis of the small intestine metatranscriptome datasets revealed a strong similarity with the community composition profiles obtained from 16S rDNA and rRNA pyrosequencing, indicating considerable congruency between community composition (rDNA) and the taxonomic distribution of overall (rRNA) and specific (mRNA) activity among its microbial members. Reproducibility of the metatranscriptome sequencing approach was established by independent duplicate experiments. In addition, comparison of metatranscriptome analysis employing single- or paired-end sequencing methods indicated that the latter approach does not provide improved functional or phylogenetic insights. Metatranscriptome functional mapping allowed the analysis of global and genus-specific activity of the microbiota, and illustrated the potential of these approaches to unravel syntrophic interactions in microbial ecosystems.

Conclusions: A reliable pipeline for metatranscriptome data analysis was developed and evaluated using RNA-seq datasets obtained for the human small intestine microbiota. The set-up of the pipeline is very generic and can be applied for (bacterial) metatranscriptome analysis in any chosen niche.


Figure 3: Distribution of mRNA read assignments. The mRNA reads were assigned to the reference genome database and classified based on their alignment to protein-encoding genes (dark bars) or non-coding regions (light bars). Depending on the bit score of their alignment to the genome, reads obtained phylogenetic and functional identification at the genus level (blue; bit score of at least 148) or at the family level (green; bit score between 110 and 148), while reads with an alignment bit score between 74 and 110 obtained functional assignments only (red). Unassigned reads are shown in black. The specific read numbers that belong to each classification are presented in Table S4.

Mentions: All read alignments with a bit score of 74 or higher could reliably (>95% confidence) be assigned to a COG-based function (see Additional file 4 and Additional file 5: Figure S3A). Using this minimum bit score threshold, 78 to 93% of the mRNA reads (Figure 3, Additional file 3: Table S4) could be assigned to homologous loci in bacterial genomes using MegaBLAST (65-85%) and BLASTN (8-14%). This relatively high hit frequency illustrates that the NCBI genome database provides a good representation of the functional diversity encountered in the human small intestine ecosystem. However, it should be noted that samples from other ecosystems may be less well represented in this database.
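The bit score tiers described in Figure 3 can be illustrated with a small sketch. This is not the authors' pipeline code: the function and constant names are hypothetical, the example scores are made up, and the handling of reads that fall exactly on a boundary (110 or 148) is an assumption, since the caption only gives the ranges.

```python
from collections import Counter

# Thresholds taken from the Figure 3 caption; treating each lower bound
# as inclusive is an assumption made for this sketch.
GENUS_MIN = 148.0     # genus-level phylogenetic + functional assignment
FAMILY_MIN = 110.0    # family-level phylogenetic + functional assignment
FUNCTION_MIN = 74.0   # functional assignment only

LABELS = ("genus", "family", "function_only", "unassigned")


def classify_read(bit_score):
    """Return the assignment tier for one read's best-hit bit score."""
    if bit_score >= GENUS_MIN:
        return "genus"
    if bit_score >= FAMILY_MIN:
        return "family"
    if bit_score >= FUNCTION_MIN:
        return "function_only"
    return "unassigned"


def summarize(bit_scores):
    """Return the fraction of reads in each tier (all zeros if no reads)."""
    counts = Counter(classify_read(score) for score in bit_scores)
    total = len(bit_scores)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts.get(label, 0) / total for label in LABELS}


if __name__ == "__main__":
    # Illustrative bit scores only, not data from the study.
    example_scores = [180.0, 150.2, 120.5, 95.0, 60.1]
    for label, fraction in summarize(example_scores).items():
        print(f"{label}: {fraction:.0%}")
```

In practice the bit scores would come from the MegaBLAST or BLASTN output for each read; how the best hit per read is selected is outside the scope of this sketch.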

