Sequences, sequence clusters and bacterial species.

Hanage WP, Fraser C, Spratt BG - Philos. Trans. R. Soc. Lond., B, Biol. Sci. (2006)

Bottom Line: Single gene sequences (or rRNA gene sequences) have very few informative sites to resolve the strains of closely related species, and relationships among similar species may be confounded by interspecies recombination. The development of large MLSA Internet databases provides the ability to assign new strains to previously defined species clusters and an electronic taxonomy. The advantages and problems in using sequence clusters as the basis of species assignments are discussed.


Affiliation: Department of Infectious Disease Epidemiology, Imperial College London, St Mary's Hospital Campus, Norfolk Place, London W2 1PG, UK.

ABSTRACT
Whatever else they should share, strains of bacteria assigned to the same species should have house-keeping genes that are similar in sequence. Single gene sequences (or rRNA gene sequences) have very few informative sites to resolve the strains of closely related species, and relationships among similar species may be confounded by interspecies recombination. A more promising approach (multilocus sequence analysis, MLSA) is to concatenate the sequences of multiple house-keeping loci and to observe the patterns of clustering among large populations of strains of closely related named bacterial species. Recent studies have shown that large populations can be resolved into non-overlapping sequence clusters that agree well with species assigned by the standard microbiological methods. The use of clustering patterns to inform the division of closely related populations into species has many advantages for poorly studied bacteria (or to re-evaluate well-studied species), as it provides a way of recognizing natural discontinuities in the distribution of similar genotypes. Clustering patterns can be used by expert groups as the basis of a pragmatic approach to assigning species, taking into account whatever additional data are available (e.g. similarities in ecology, phenotype and gene content). The development of large MLSA Internet databases provides the ability to assign new strains to previously defined species clusters and an electronic taxonomy. The advantages and problems in using sequence clusters as the basis of species assignments are discussed.
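The concatenation step behind MLSA can be illustrated with a minimal Python sketch. It assumes each locus has already been aligned and that sequences are held in plain dictionaries; the function names, locus names and toy sequences are hypothetical and are not taken from the paper or from any MLST database.

```python
from typing import Dict, List


def concatenate_loci(
    alignments: Dict[str, Dict[str, str]], locus_order: List[str]
) -> Dict[str, str]:
    """Join per-locus aligned sequences into one sequence per strain.

    `alignments` maps locus name -> {strain name -> aligned sequence}.
    Only strains with a sequence at every locus are kept.
    """
    shared = set.intersection(*(set(alignments[locus]) for locus in locus_order))
    return {
        strain: "".join(alignments[locus][strain] for locus in locus_order)
        for strain in sorted(shared)
    }


def p_distance(a: str, b: str) -> float:
    """Proportion of sites that differ between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Toy input (hypothetical): strain s3 lacks locusB and is therefore dropped.
toy = {
    "locusA": {"s1": "ATGAAA", "s2": "ATGAAG", "s3": "ATGCCC"},
    "locusB": {"s1": "GGGTTT", "s2": "GGGTTT"},
}
concat = concatenate_loci(toy, ["locusA", "locusB"])
```

A fixed locus order matters here: every strain must contribute the same loci in the same order, otherwise sites in the concatenated sequences are not comparable.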


Figure 3: Resolving populations of N. meningitidis, N. lactamica and N. gonorrhoeae. Bayesian tree constructed using the concatenated sequences (seven loci) of the first 500 different strains (STs) of N. meningitidis in the public Neisseria MLST database, all different strains of N. lactamica (171) and N. gonorrhoeae (67). The arrow shows the two strains of N. lactamica that cluster anomalously and have probably been incorrectly identified (see text). Only third codon positions were used in the analysis. The scale shows genetic distance, corrected for the best-fitting substitution model determined using MrModeltest and MrBayes. Details as in figure 2 with rate matrix r(A↔C) 0.044: r(A↔G) 0.541: r(A↔T) 0.018: r(C↔G) 0.044: r(C↔T) 0.299: r(G↔T) 0.053; nucleotide frequencies A 0.11: C 0.44: G 0.24: T 0.21; gamma parameter α=0.481; Pinvar=0.30.
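As a rough illustration of two details in this caption, the Python sketch below extracts third codon positions and assembles a rate matrix from the reported values. It assumes the concatenated sequence is in reading frame, and it assumes the six r(X↔Y) values are exchangeabilities of a general time-reversible (GTR) model with entries q_ij = r_ij × π_j; the caption does not state the model name, and the function names are hypothetical. Gamma rate variation and invariant sites are not modelled here.

```python
BASES = "ACGT"


def third_codon_positions(seq: str) -> str:
    """Keep every third site, assuming the sequence starts in frame."""
    return seq[2::3]


def gtr_rate_matrix(rates, freqs):
    """Build a GTR-style rate matrix, scaled to a mean rate of 1.

    `rates` maps frozenset({i, j}) -> exchangeability r_ij;
    `freqs` maps base -> equilibrium frequency pi.
    """
    q = {i: {j: 0.0 for j in BASES} for i in BASES}
    for i in BASES:
        for j in BASES:
            if i != j:
                q[i][j] = rates[frozenset((i, j))] * freqs[j]
        # Diagonal entries make each row sum to zero.
        q[i][i] = -sum(q[i][j] for j in BASES if j != i)
    # Scale so that the expected substitution rate at equilibrium is 1.
    scale = -sum(freqs[i] * q[i][i] for i in BASES)
    return {i: {j: q[i][j] / scale for j in BASES} for i in BASES}


# Values as reported in the caption.
FIG3_RATES = {
    frozenset("AC"): 0.044, frozenset("AG"): 0.541, frozenset("AT"): 0.018,
    frozenset("CG"): 0.044, frozenset("CT"): 0.299, frozenset("GT"): 0.053,
}
FIG3_FREQS = {"A": 0.11, "C": 0.44, "G": 0.24, "T": 0.21}

Q = gtr_rate_matrix(FIG3_RATES, FIG3_FREQS)
```

Each row of Q sums to zero, and π_i × q_ij equals π_j × q_ji, which is the time-reversibility property of this parameterisation.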


Mentions: Analysis of the sequences of N. meningitidis house-keeping genes has shown extensive evidence for recombinational imports from related commensal species (Feil et al. 1995; Zhou et al. 1997), and the trees of different Neisseria house-keeping genes (and 16S rRNA genes) suggest different phylogenetic relationships between these species (Smith et al. 1999). As expected, the individual trees derived from the sequences of each MLST locus fail to resolve N. meningitidis strains from N. lactamica strains (Hanage et al. 2005a). However, the concatenated sequences of the seven MLST loci completely resolve the N. lactamica strains from those of N. meningitidis: the group of strains shown as N. lactamica clustered together in 100% of trees drawn from the posterior distribution. A few strains arise from the branch separating these two species (figure 3), and two strains located at the end of a long branch arising from the meningococcal cluster are clear examples of mistaken identity (arrow in figure 3). When examples of other Neisseria species are included, these two ‘lactamica’ strains cluster with those species (Hanage et al. 2005a), and they have probably been misidentified as N. lactamica by the submitting microbiological laboratory. Strains of the ecologically isolated species, N. gonorrhoeae, form a tight genotypic cluster on a long branch (100% support) and are closely allied to N. meningitidis.
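The idea of spotting probable misidentifications from clustering can be sketched with a much simpler stand-in for the Bayesian tree used in the paper: a nearest-neighbour check on pairwise distances between concatenated sequences. This heuristic is an assumption for illustration only, and the strain names, labels and sequences below are invented.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of sites that differ between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def flag_label_conflicts(seqs, labels):
    """Return strains whose nearest neighbour carries a different species label.

    Ties in distance are broken by strain name so the result is deterministic.
    """
    flagged = []
    for strain in sorted(seqs):
        others = [t for t in seqs if t != strain]
        nearest = min(
            others, key=lambda t: (p_distance(seqs[strain], seqs[t]), t)
        )
        if labels[nearest] != labels[strain]:
            flagged.append(strain)
    return flagged


# Invented toy data: x1 is labelled "lactamica" but resembles the
# "meningitidis" sequences.
toy_seqs = {
    "m1": "AAAAAAAA",
    "m2": "AAAAAAAT",
    "l1": "CCCCCCCC",
    "l2": "CCCCCCCG",
    "x1": "AAAAAAGG",
}
toy_labels = {
    "m1": "meningitidis",
    "m2": "meningitidis",
    "l1": "lactamica",
    "l2": "lactamica",
    "x1": "lactamica",
}
suspect = flag_label_conflicts(toy_seqs, toy_labels)
```

A flagged strain is only a candidate for re-examination; as the text notes, confirming a misidentification requires comparison with other species and, ideally, the submitting laboratory's records.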

