MG-Digger: An Automated Pipeline to Search for Giant Virus-Related Sequences in Metagenomes.

Verneau J, Levasseur A, Raoult D, La Scola B, Colson P - Front Microbiol (2016)

Bottom Line: Metagenomes previously found to contain megavirus-like sequences were tested as controls. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification.


Affiliation: Aix-Marseille University, URMITE UM 63 CNRS 7278 IRD 198 INSERM U1095 Marseille, France.

ABSTRACT
The number of metagenomic studies conducted each year is growing dramatically. Storing and analyzing such large data sets is difficult and time-consuming. Interestingly, analysis shows that environmental and human metagenomes include a significant number of non-annotated sequences, representing a 'dark matter.' We established a bioinformatics pipeline that automatically detects metagenome reads matching query sequences from a given set and applied this tool to the detection of sequences matching large and giant DNA viral members of the proposed order Megavirales or virophages. A total of 1,045 environmental and human metagenomes (≈ 1 Terabase) were collected, processed, and stored on our bioinformatics server. In addition, nucleotide and protein sequences from 93 Megavirales representatives, including 19 giant viruses of amoeba, and 5 virophages, were collected. The pipeline, named MG-Digger, consists of scripts written in Python. Metagenomes previously found to contain megavirus-like sequences were tested as controls. MG-Digger was able to annotate hundreds of metagenome sequences as best matching those of giant viruses. These sequences were most often found to be similar to phycodnavirus or mimivirus sequences, but included reads related to recently available pandoraviruses, Pithovirus sibericum, and faustoviruses. Compared to other tools, MG-Digger combined stand-alone use on Linux or Windows operating systems through a user-friendly interface, implementation of ready-to-use customized metagenome databases and query sequence databases, adjustable parameters for BLAST searches, and creation of output files containing selected reads with best match identification. Compared to Metavir 2, a reference tool in viral metagenome analysis, MG-Digger detected 8% more true positive Megavirales-related reads in a control metagenome.
The present work shows that massive, automated and recurrent analyses of metagenomes are effective in improving knowledge about the presence and prevalence of giant viruses in the environment and the human body.

Figure 1: Flowchart of the MG-Digger tool.

Mentions: The pipeline dedicated to the search for giant virus-related sequences in metagenomes comprises several Python scripts organized as independent modules (Figure 1). These modules run in succession automatically, without any user intervention. Alternatively, the user can launch a single module to perform only part of the analysis. The type of BLAST analysis performed by the pipeline can be chosen according to the nature of the sequence set under study: the searches can use nucleotide or protein sequences, both as queries and as targets.
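The two ideas in the paragraph above (modules that run in succession or one at a time, and a BLAST type chosen from the nature of the query and target sequences) can be illustrated with a minimal Python sketch. This is not the MG-Digger source code: the function names, the module dictionary, the default E-value, and the file names are all hypothetical and chosen for illustration. Only the program names (blastn, blastp, blastx, tblastn) and the command-line options shown are standard NCBI BLAST+ ones.

```python
from typing import Callable, Dict, List, Optional

# Standard NCBI BLAST+ programs keyed by (query type, target type).
# tblastx (translated query vs. translated target) is left out for brevity.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",
    ("protein", "nucleotide"): "tblastn",
}


def choose_blast_program(query_type: str, target_type: str) -> str:
    """Return the BLAST+ program matching the query and target sequence types."""
    try:
        return BLAST_PROGRAMS[(query_type, target_type)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {query_type!r} vs {target_type!r}"
        ) from None


def build_blast_command(
    query_fasta: str,
    database: str,
    query_type: str,
    target_type: str,
    evalue: float = 1e-5,        # hypothetical default, adjustable by the user
    out_path: str = "hits.tsv",  # hypothetical output file name
) -> List[str]:
    """Build (but do not run) a BLAST+ command line with tabular output."""
    program = choose_blast_program(query_type, target_type)
    return [
        program,
        "-query", query_fasta,
        "-db", database,
        "-evalue", str(evalue),
        "-outfmt", "6",  # tab-separated hit table
        "-out", out_path,
    ]


def run_pipeline(
    modules: Dict[str, Callable[[], None]],
    only: Optional[str] = None,
) -> List[str]:
    """Run every module in insertion order, or a single named module."""
    names = [only] if only is not None else list(modules)
    for name in names:
        modules[name]()
    return names
```

The list returned by `build_blast_command` could be passed to `subprocess.run` on a machine where BLAST+ is installed. Calling `run_pipeline(modules)` mirrors the fully automatic mode, while `run_pipeline(modules, only="...")` mirrors launching a single module.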

