Haloquadratum walsbyi: limited diversity in a global pond.

Dyall-Smith ML, Pfeiffer F, Klee K, Palm P, Gross K, Schuster SC, Rampp M, Oesterhelt D - PLoS ONE (2011)

Bottom Line: Strain C23(T) carries two ∼6 kb plasmids that show similarity to halovirus His1 and to sequences near halovirus/plasmid gene clusters commonly found in haloarchaea. Change is also driven by mobile genetic elements, but these do not by themselves explain the atypically low gene coding density found in this species. The remarkable genome conservation despite the presence of active systems for genome rearrangement implies both an efficient global dispersal system and a high selective fitness for this species.


Affiliation: Department of Membrane Biochemistry, Max-Planck-Institute of Biochemistry, Martinsried, Germany. mdyall-smith@csu.edu.au

ABSTRACT

Background: Haloquadratum walsbyi commonly dominates the microbial flora of hypersaline waters. Its cells are extremely fragile squares requiring >14% (w/v) salt for growth, properties that should limit its dispersal and promote geographical isolation and divergence. To assess this, the genome sequences of two isolates recovered from sites at near maximum distance on Earth were compared.

Principal findings: Both chromosomes are 3.1 Mb in size, and 84% of each sequence was highly similar to the other (98.6% identity), comprising the core sequence. ORFs of this shared sequence were completely syntenic (conserved in genomic orientation and order), without inversion or rearrangement. Strain-specific insertions/deletions could be precisely mapped, often allowing the genetic events to be inferred. Many inferred deletions were associated with short direct repeats (4-20 bp). Deletion-coupled insertions are frequent, producing different sequences at identical positions. In cases where the inserted and deleted sequences are homologous, this leads to variant genes in a common syntenic background (as already described by others). Cas/CRISPR systems are present in C23(T) but have been lost in HBSQ001 except for a few spacer remnants. Numerous types of mobile genetic elements occur in both strains, most of which appear to be active, and some of which specifically target others. Strain C23(T) carries two ∼6 kb plasmids that show similarity to halovirus His1 and to sequences near halovirus/plasmid gene clusters commonly found in haloarchaea.

Conclusions: Deletion-coupled insertions show that Hqr. walsbyi evolves by uptake and precise integration of foreign DNA, probably originating from close relatives. Change is also driven by mobile genetic elements but these do not by themselves explain the atypically low gene coding density found in this species. The remarkable genome conservation despite the presence of active systems for genome rearrangement implies both an efficient global dispersal system, and a high selective fitness for this species.


Features of the strain C23T chromosome. The constant horizontal axis in all cases is the genome from left to right (first to last base of deposited sequence), with a scale given in Mbp. From top to bottom are plots of: (a) %G+C, shown where the value for a 1 kb window deviates by more than 2.5 SD from the average, (b) protein-coding pseudogenes (vertical triangles), excluding those of transposases, (c) variation in tetramer nucleotide composition (TETRA), where darker colors indicate more prominent deviation, (d) GC-profile, (e) cumulative GC-skew, (f) positions and orientations of the following gene categories: CDC6, orc1/cdc6 homologues; tRNA, transfer RNA genes; rRNA, ribosomal RNA operons; r-Prot, ribosomal protein genes; RNAP, RNA polymerase subunit genes; CRISPR, loci of clustered regularly interspaced short palindromic repeats. Smaller, unfilled arrowheads in the CDC6 line represent the positions of cdc6 pseudogenes. DV6 (divergent region 6, see Figure 5) is indicated below the cumulative GC-skew plot. Vertical grey-shaded stripes mark correlating features.
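
As a rough illustration of the filter described for panel (a), the sketch below flags fixed-size windows whose %G+C deviates from the average by more than a chosen number of standard deviations. It is not the authors' pipeline: the function names, the use of non-overlapping windows, and the choice to take the mean and SD over the window values themselves are all assumptions made for the example.

```python
from statistics import mean, pstdev


def gc_percent(seq):
    """Percent G+C of a sequence (case-insensitive); 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_outlier_windows(genome, window=1000, n_sd=2.5):
    """Return (start, %G+C) for windows deviating by more than n_sd SD.

    Assumptions (not taken from the paper): windows are non-overlapping,
    a trailing partial window is ignored, and the average and SD are
    computed over the per-window %G+C values.
    """
    values = [
        (start, gc_percent(genome[start:start + window]))
        for start in range(0, len(genome) - window + 1, window)
    ]
    if not values:
        return []
    gcs = [gc for _, gc in values]
    avg, sd = mean(gcs), pstdev(gcs)
    if sd == 0:
        return []  # perfectly uniform composition: nothing to flag
    return [(start, gc) for start, gc in values if abs(gc - avg) > n_sd * sd]
```

With real data one would pass the chromosome sequence as a single string and keep the default 1 kb window; the returned start coordinates can then be compared against features such as the prophage or the halomucin gene.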

Figure 1 presents the results of several global analyses of the main chromosome of C23T. Major deviations from the average %G+C content (topmost plot) correlate closely with changes in tetramer frequency, as shown by the intense vertical bands in the TETRA plot below (third level). The second-level graph shows the distribution of pseudogenes derived from non-transposase ORFs, and these are clearly associated with many of the variant regions identified in the adjacent TETRA and %G+C plots. Bacterial genomes often show large-scale organizational patterns, such as a systematic bias in the nucleotide composition of their leading and lagging strands, preferential placement of ORFs on the leading strand, and highly expressed genes close to the replication origin [22], [23]. In such cases, a plot of cumulative GC-skew versus genome position can show a simple, geometric pattern where the replication origin and the terminus occur near major inflections [24], [25]. In general, statistical deviations are much weaker in archaea, so that cumulative GC-skew plots do not give a simple pattern (fifth level of Figure 1), nor does the GC-profile graph shown just above it, a type of cumulative GC-skew that is more sensitive to local changes in %G+C content [26]. However, in comparing the plots from the two strains (see later) one can distinguish between strong local deviations due to insertion of foreign DNA and weak positional deviations related to replication origins. Strong changes in the GC-profile correlate well with significant alterations in %G+C and tetramer composition. These atypical genome regions represent a mixture of unusual genomic features, described in detail below.
Two prominent features are: (1) near the left end of the %G+C panel, a distinct peak of higher %G+C, labeled hmuI (corresponding to a very long ORF encoding the halomucin gene with a highly biased codon usage), and (2) a peak of lower %G+C at around 1.6 Mb that corresponds to a prophage integrated into a tRNA gene.
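
The cumulative GC-skew plot discussed above can be sketched in a few lines. The example below is a minimal illustration, not the method used for Figure 1: it assumes the common definition of skew as (G - C) / (G + C) per non-overlapping window, and the function names and window size are hypothetical choices.

```python
def cumulative_gc_skew(genome, window=1000):
    """Return [(window_start, cumulative_skew), ...] along the sequence.

    Skew per window is taken as (G - C) / (G + C); windows containing
    no G or C contribute zero. Both choices are assumptions of this sketch.
    """
    total = 0.0
    curve = []
    for start in range(0, len(genome), window):
        chunk = genome[start:start + window].upper()
        g, c = chunk.count("G"), chunk.count("C")
        if g + c:
            total += (g - c) / (g + c)
        curve.append((start, total))
    return curve


def skew_extrema(curve):
    """Return the (start, value) points at the minimum and maximum of the curve."""
    lowest = min(curve, key=lambda point: point[1])
    highest = max(curve, key=lambda point: point[1])
    return lowest, highest
```

In many bacterial genomes the extrema of this curve are read as candidate positions for the replication origin and terminus. As the text notes, the signal is much weaker in archaea, so for a genome like this one the extrema should be treated as hints to compare between strains rather than as origin calls.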

