Solving the problem of comparing whole bacterial genomes across different sequencing platforms.

Kaas RS, Leekitcharoenphon P, Aarestrup FM, Lund O - PLoS ONE (2014)

Bottom Line: We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. We conclude that these procedures succeed because every informative site included in the analysis is validated. The procedures are available as web tools.

Affiliation: National Food Institute, Technical University of Denmark, Lyngby, Denmark.

ABSTRACT
Whole genome sequencing (WGS) shows great potential for real-time monitoring and identification of infectious disease outbreaks. However, rapid and reliable comparison of data generated in multiple laboratories and using multiple technologies is essential. So far, studies have focused on using one technology because each technology has a systematic bias, making integration of data generated from different platforms difficult. We developed two different procedures for identifying variable sites and inferring phylogenies in WGS data across multiple platforms. The methods were evaluated on three bacterial data sets sequenced on three different platforms (Illumina, 454, Ion Torrent). We show that the methods are able to overcome the systematic biases caused by the sequencers and infer the expected phylogenies. We conclude that these procedures succeed because every informative site included in the analysis is validated. The procedures are available as web tools.


pone-0104984-g003: Salmonella DT104 phylogeny. Labels are colored according to isolate. The sequencing platform used is appended to the end of each label. If an isolate was sequenced more than once, “1” or “2” is also appended to the label. (A) Phylogeny inferred with snpTree; (B) phylogeny inferred with the novel SNP procedure; (C) phylogeny inferred with the Nucleotide Difference (ND) method.

Mentions: snpTree seems to have problems differentiating between the sequences of closely related isolates (Figure 3A), even with a closely related reference. With a distantly related reference, the sequences cluster clearly by platform rather than by isolate (Figure S5). The ND method and the novel SNP procedure both cluster the isolates correctly (Figures 3C and 3B, respectively). The two methods produce identical phylogenies regardless of the distance to the reference used (see Figures S6 and S7 for phylogenies inferred with a distant reference). The novel SNP method finds between 1 and 1.5 SNPs on average between identical isolates; the ND method finds none.
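The idea of counting differences only at validated sites can be illustrated with a minimal sketch. This is not the authors' ND implementation: the function names, the use of "N" to mark a position that failed validation, and the toy sequences are all assumptions made for illustration. The sketch keeps only positions where every isolate has a validated base call, then counts pairwise nucleotide differences over those positions.

```python
from itertools import combinations


def validated_positions(calls, unknown="N"):
    """Return indices where every isolate has a validated (non-unknown) call.

    `calls` maps a label to a string of per-position base calls against one
    reference; `unknown` marks a position that failed validation (assumed
    convention, not taken from the paper).
    """
    sequences = list(calls.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all call strings must have the same length")
    return [i for i in range(length)
            if all(seq[i] != unknown for seq in sequences)]


def pairwise_differences(calls, unknown="N"):
    """Count nucleotide differences per pair, using validated positions only."""
    keep = validated_positions(calls, unknown)
    result = {}
    for a, b in combinations(sorted(calls), 2):
        result[(a, b)] = sum(calls[a][i] != calls[b][i] for i in keep)
    return result


# Hypothetical toy data: the same isolate on two platforms, plus a second isolate.
calls = {
    "iso1_Illumina": "ACGTACGT",
    "iso1_IonTorrent": "ACGTNCGT",  # position 4 failed validation
    "iso2_454": "ACCTTCGT",
}

for pair, count in pairwise_differences(calls).items():
    print(pair, count)
```

Because position 4 is unvalidated in one sequence, it is dropped for all pairs, so a platform-specific miscall at that position cannot separate the two runs of the same isolate. The resulting counts could feed a distance-based tree builder, which is outside the scope of this sketch.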

