Mining RNA-seq data for infections and contaminations.

Bonfert T, Csaba G, Zimmer R, Friedel CC - PLoS ONE (2013)

Bottom Line: In particular, ContextMap vastly outperformed GASiC and GRAMMy in terms of runtime. In contrast to MEGAN4, it was capable of providing individual read mappings to species and resolving non-unique mappings, thus allowing the identification of misalignments caused by sequence similarities between genomes and missing genome sequences. By using ContextMap, gene expression of infecting agents can be analyzed and novel insights in infection processes and tumorigenesis can be obtained.

Affiliation: Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany.

ABSTRACT
RNA sequencing (RNA-seq) provides novel opportunities for transcriptomic studies at nucleotide resolution, including transcriptomics of viruses or microbes infecting a cell. However, standard approaches for mapping the resulting sequencing reads generally ignore sources of expression other than the host cell and are poorly equipped to address the problems arising from redundancies and gaps among sequenced microbe and virus genomes. We show that screening of sequencing reads for contaminations and infections can be performed easily using ContextMap, our recently developed mapping software. Based on mapping-derived statistics, mapping confidence, similarities and misidentifications (e.g. due to missing genome sequences) of species/strains can be assessed. Performance of our approach is evaluated on three real-life sequencing data sets and compared to state-of-the-art metagenomics tools. In particular, ContextMap vastly outperformed GASiC and GRAMMy in terms of runtime. In contrast to MEGAN4, it was capable of providing individual read mappings to species and resolving non-unique mappings, thus allowing the identification of misalignments caused by sequence similarities between genomes and missing genome sequences. Our study illustrates the importance and potential of routinely mining RNA-seq experiments for infections or contaminations by microbes and viruses. By using ContextMap, gene expression of infecting agents can be analyzed and novel insights into infection processes and tumorigenesis can be obtained.

pone-0073071-g005: Hierarchical clustering (average linkage) of microbes and viruses. Results are shown for hits with a coverage and at least 20 mapped reads as determined by ContextMap. Microbes actually contained in the sample are indicated in red and by an asterisk and the three clusters discussed in the text are marked by rectangles. In addition, number of reads, confidence and are indicated next to the microbe names.
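
The kind of clustering named in the caption can be illustrated with a minimal sketch of average-linkage agglomerative clustering over species hits that pass the 20-read filter. The hit names, read counts, pairwise distances, and the `threshold` stopping rule are hypothetical and chosen only for illustration; how the distance between two hits is actually derived is not stated in this excerpt.

```python
from itertools import combinations

def average_linkage(names, dist, threshold):
    """Merge the two closest clusters until the closest pair is farther
    apart than `threshold` (an illustrative stopping rule)."""
    clusters = [[n] for n in names]

    def pair_dist(a, b):
        # Distances are stored once per unordered pair.
        return dist[(a, b)] if (a, b) in dist else dist[(b, a)]

    def cluster_dist(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        total = sum(pair_dist(a, b) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_dist(clusters[p[0]], clusters[p[1]]),
        )
        if cluster_dist(clusters[i], clusters[j]) > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Hypothetical hits with mapped read counts; keep those with >= 20 reads.
read_counts = {"hit_A": 500, "hit_B": 40, "hit_C": 300, "hit_D": 25, "hit_E": 5}
kept = [name for name, n in read_counts.items() if n >= 20]

# Hypothetical pairwise distances between the kept hits (0 = identical).
distances = {
    ("hit_A", "hit_B"): 0.10, ("hit_A", "hit_C"): 0.90,
    ("hit_A", "hit_D"): 0.95, ("hit_B", "hit_C"): 0.80,
    ("hit_B", "hit_D"): 0.90, ("hit_C", "hit_D"): 1.00,
}

clusters = average_linkage(kept, distances, threshold=0.5)
```

In this toy input only `hit_A` and `hit_B` are similar enough to merge, while `hit_C` and `hit_D` stay in clusters of their own, which mirrors the situation the caption describes for hits that share no similarity with any other hit.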

Mentions: For five species, identification is straightforward based on this list. A. cellulolyticus, S. amazonensis, L. brevis, and M. xanthus are characterized by high mapping confidence (), low () and high number of reads and coverage. For P. pentosaceus confidence is also high and still relatively low (0.064), but coverage is quite small (). However, as 90% of the reads map to distinct positions, it is clearly a correct hit and the low coverage is likely due to low abundance of P. pentosaceus in the simulated community. In the clustering of species hits, these five species also form distinct clusters with no similarities to any of the other species hits (Figure 5).
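
The reasoning about P. pentosaceus (low coverage, yet most reads mapping to distinct positions) can be made concrete with a small sketch. It assumes that coverage means the fraction of genome positions covered by at least one read and that "distinct positions" refers to distinct read start positions; both definitions, the function name `mapping_stats`, and the input values are assumptions for illustration, not the paper's implementation.

```python
def mapping_stats(read_starts, read_length, genome_length):
    """Summarize the reads mapped to one genome.

    read_starts: 0-based start position of each mapped read.
    Returns the read count, the fraction of reads with a distinct start
    position, and the fraction of the genome covered by at least one read.
    """
    if not read_starts:
        raise ValueError("no mapped reads")
    covered = set()
    for start in read_starts:
        end = min(start + read_length, genome_length)
        covered.update(range(start, end))
    return {
        "reads": len(read_starts),
        "distinct_fraction": len(set(read_starts)) / len(read_starts),
        "coverage": len(covered) / genome_length,
    }

# Hypothetical low-abundance hit: 10 reads of length 50 on a 10,000 bp genome.
starts = [0, 10, 10, 50, 90, 120, 200, 300, 400, 500]
stats = mapping_stats(starts, read_length=50, genome_length=10_000)
```

Here nine of the ten reads start at distinct positions while under 4% of the genome is covered. Reads spread over many positions point to genuine but sparse sampling of the genome, whereas reads piled onto a few positions would be more suspicious.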

