The Streptomyces leeuwenhoekii genome: de novo sequencing and assembly in single contigs of the chromosome, circular plasmid pSLE1 and linear plasmid pSLE2.

Gomez-Escribano JP, Castro JF, Razmilic V, Chandra G, Andrews B, Asenjo JA, Bibb MJ - BMC Genomics (2015)

Bottom Line: This project has served to evaluate the current state of NGS for efficient and effective genome mining of high GC actinomycetes. The PacBio technology now permits the assembly of actinomycete replicons into single contigs with >99 % accuracy. The assembled Illumina sequence permitted not only the correction of omissions found in GC homopolymers in the PacBio assembly (exacerbated by the high GC content of actinomycete DNA) but it also allowed us to obtain the sequences of the termini of the chromosome and of a linear plasmid that were not assembled by PacBio.


Affiliation: Department of Molecular Microbiology, John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, United Kingdom. Juan-Pablo.Gomez-Escribano@jic.ac.uk.

ABSTRACT

Background: Next Generation DNA Sequencing (NGS) and genome mining of actinomycetes and other microorganisms is currently one of the most promising strategies for the discovery of novel bioactive natural products, potentially revealing novel chemistry and enzymology involved in their biosynthesis. This approach also allows rapid insights into the biosynthetic potential of microorganisms isolated from unexploited habitats and ecosystems, which in many cases may prove difficult to culture and manipulate in the laboratory. Streptomyces leeuwenhoekii (formerly Streptomyces sp. strain C34) was isolated from the hyper-arid high-altitude Atacama Desert in Chile and shown to produce novel polyketide antibiotics.

Results: Here we present the de novo sequencing of the S. leeuwenhoekii linear chromosome (8 Mb) and two extrachromosomal replicons, the circular pSLE1 (86 kb) and the linear pSLE2 (132 kb), all in single contigs, obtained by combining Pacific Biosciences SMRT (PacBio) and Illumina MiSeq technologies. We identified the biosynthetic gene clusters for chaxamycin, chaxalactin, hygromycin A and desferrioxamine E, metabolites all previously shown to be produced by this strain (J Nat Prod, 2011, 74:1965) and an additional 31 putative gene clusters for specialised metabolites. As well as gene clusters for polyketides and non-ribosomal peptides, we also identified three gene clusters encoding novel lasso-peptides.

Conclusions: The S. leeuwenhoekii genome contains 35 gene clusters apparently encoding the biosynthesis of specialised metabolites, most of them completely novel and uncharacterised. This project has served to evaluate the current state of NGS for efficient and effective genome mining of high GC actinomycetes. The PacBio technology now permits the assembly of actinomycete replicons into single contigs with >99 % accuracy. The assembled Illumina sequence permitted not only the correction of omissions found in GC homopolymers in the PacBio assembly (exacerbated by the high GC content of actinomycete DNA) but it also allowed us to obtain the sequences of the termini of the chromosome and of a linear plasmid that were not assembled by PacBio. We propose an experimental pipeline that uses the Illumina assembled contigs, in addition to just the reads, to complement the current limitations of the PacBio sequencing technology and assembly software.
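The omissions described above occur in runs of G or C, so a first step when comparing a PacBio assembly with Illumina contigs is to list where such runs lie. The following is a minimal sketch of that idea, not the procedure used in this work (the corrections here were checked with GC Frame Plot and BLAST). The function name `gc_homopolymers`, the `min_len` threshold of 3 and the example sequence are all illustrative assumptions.

```python
import re
from typing import List, Tuple


def gc_homopolymers(seq: str, min_len: int = 3) -> List[Tuple[int, str, int]]:
    """Return (start, base, length) for each run of G or C of at least min_len.

    min_len is a hypothetical threshold chosen for illustration; it is not
    taken from the paper.
    """
    # Match either a run of G or a run of C, never a mixed G/C stretch.
    pattern = re.compile(r"G{%d,}|C{%d,}" % (min_len, min_len))
    runs = []
    for match in pattern.finditer(seq.upper()):
        run = match.group()
        runs.append((match.start(), run[0], len(run)))
    return runs


# Invented toy sequence, 0-based coordinates.
example = "ATGCCCCGTAGGGGGCATCC"
for start, base, length in gc_homopolymers(example):
    print(f"{base} x {length} at position {start}")
```

Positions reported this way are only candidates for inspection; whether a given run is actually truncated in one assembly has to be decided by aligning it against the other data set.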


Fig. 2: Sequencing and assembly pipeline. The sequencing and assembly pipeline followed in this work (data specific to this project are shown in brackets) and suggested as a strategy for actinomycete genome sequencing.

Mentions: In summary, for a de novo shotgun genome sequence from an actinomycete aimed at yielding a single contig per replicon, we currently propose a strategy (Fig. 2) that combines two data sets. The first is PacBio RSII sequencing of genomic DNA from a >20 kb insert library, initially using two SMRT cells (and a third later if required) and aiming at >100x coverage. The second is Illumina MiSeq paired-end sequencing of a 500 bp library prepared without PCR amplification (to avoid introducing bias from uneven amplification of high G + C actinomycete DNA), aiming at >90x coverage. Both data sets are assembled and compared, and the Illumina contigs are used to correct the PacBio nucleotide omissions/additions; these corrections should be confirmed using GC Frame Plot and BLAST analyses. This consensus is further corrected with the Illumina reads. Despite the highly efficient current assembly algorithms, a considerable amount of human input was still needed to obtain a high quality single contig assembly and accurate annotation of gene function.
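The coverage targets in this strategy translate directly into a minimum sequencing yield, since coverage is total sequenced bases divided by genome size. The sketch below works through that arithmetic using the approximate replicon sizes given in the abstract. The function names and the 250 bp read length are assumptions made for illustration; the read length actually used is not stated in this excerpt.

```python
import math

# Approximate replicon sizes from the abstract, in base pairs.
REPLICONS_BP = {
    "chromosome": 8_000_000,
    "pSLE1": 86_000,
    "pSLE2": 132_000,
}


def required_yield_bp(genome_bp: int, target_coverage: int) -> int:
    """Sequenced bases needed, from coverage = total bases / genome size."""
    return genome_bp * target_coverage


def required_read_pairs(genome_bp: int, target_coverage: int, read_length_bp: int) -> int:
    """Paired-end read pairs needed; each pair contributes 2 * read_length_bp bases."""
    total_bp = required_yield_bp(genome_bp, target_coverage)
    return math.ceil(total_bp / (2 * read_length_bp))


genome_bp = sum(REPLICONS_BP.values())

# Lower bounds only: the proposed strategy aims above 100x and 90x.
pacbio_min_bp = required_yield_bp(genome_bp, 100)
illumina_min_bp = required_yield_bp(genome_bp, 90)

# ASSUMPTION: 250 bp reads, a hypothetical value chosen for the example.
illumina_min_pairs = required_read_pairs(genome_bp, 90, 250)

print(f"Genome size: {genome_bp:,} bp")
print(f"PacBio yield for 100x: {pacbio_min_bp:,} bp")
print(f"Illumina yield for 90x: {illumina_min_bp:,} bp")
print(f"Illumina read pairs at 2 x 250 bp: {illumina_min_pairs:,}")
```

These figures ignore reads lost to quality filtering and uneven coverage, so they are best read as a floor when planning how many SMRT cells or MiSeq runs to request.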

