Sequences, sequence clusters and bacterial species.

Hanage WP, Fraser C, Spratt BG - Philos. Trans. R. Soc. Lond., B, Biol. Sci. (2006)

Bottom Line: Single gene sequences (or rRNA gene sequences) have very few informative sites to resolve the strains of closely related species, and relationships among similar species may be confounded by interspecies recombination. The development of large MLSA Internet databases provides the ability to assign new strains to previously defined species clusters and an electronic taxonomy. The advantages and problems in using sequence clusters as the basis of species assignments are discussed.


Affiliation: Department of Infectious Disease Epidemiology, Imperial College London, St Mary's Hospital Campus, Norfolk Place, London W2 1PG, UK.

ABSTRACT
Whatever else they should share, strains of bacteria assigned to the same species should have house-keeping genes that are similar in sequence. Single gene sequences (or rRNA gene sequences) have very few informative sites to resolve the strains of closely related species, and relationships among similar species may be confounded by interspecies recombination. A more promising approach (multilocus sequence analysis, MLSA) is to concatenate the sequences of multiple house-keeping loci and to observe the patterns of clustering among large populations of strains of closely related named bacterial species. Recent studies have shown that large populations can be resolved into non-overlapping sequence clusters that agree well with species assigned by the standard microbiological methods. The use of clustering patterns to inform the division of closely related populations into species has many advantages for poorly studied bacteria (or to re-evaluate well-studied species), as it provides a way of recognizing natural discontinuities in the distribution of similar genotypes. Clustering patterns can be used by expert groups as the basis of a pragmatic approach to assigning species, taking into account whatever additional data are available (e.g. similarities in ecology, phenotype and gene content). The development of large MLSA Internet databases provides the ability to assign new strains to previously defined species clusters and an electronic taxonomy. The advantages and problems in using sequence clusters as the basis of species assignments are discussed.
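The concatenation step described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the locus names, strain names and sequences below are invented toy data, and the uncorrected proportion of differing sites (p-distance) stands in for the model-corrected distances used in the paper.

```python
from itertools import combinations

# Hypothetical toy data: strain -> {locus name -> aligned allele sequence}.
# Locus names, strain names and sequences are invented for illustration.
LOCI = ["locusA", "locusB", "locusC"]
strains = {
    "strain1": {"locusA": "ATGCCG", "locusB": "GGCTTA", "locusC": "TTAACC"},
    "strain2": {"locusA": "ATGCCG", "locusB": "GGCTTA", "locusC": "TTAACT"},
    "strain3": {"locusA": "ATGTCA", "locusB": "GACTTC", "locusC": "CTAGCC"},
}


def concatenate(alleles, loci):
    """Join the allele sequences of one strain in a fixed locus order."""
    return "".join(alleles[locus] for locus in loci)


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


# One concatenated sequence per strain, then all pairwise distances.
concatenated = {name: concatenate(alleles, LOCI) for name, alleles in strains.items()}
distances = {
    (s, t): p_distance(concatenated[s], concatenated[t])
    for s, t in combinations(sorted(concatenated), 2)
}

for pair, dist in distances.items():
    print(pair, round(dist, 3))
```

In this toy set, strain1 and strain2 differ at a single site while strain3 is distant from both, which is the kind of discontinuity that a clustering or tree-building method would then pick up.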



Figure 2. Resolving populations of B. pseudomallei, B. mallei and B. thailandensis. All of the isolates in the B. pseudomallei MLST database (which includes isolates of closely related species) were extracted and the sequences at the seven MLST loci were concatenated for each different multilocus genotype (strain) and a tree was constructed using MrBayes v. 3.1. The dataset included 400 different strains (STs) of B. pseudomallei, 17 of B. thailandensis, and two each of B. mallei and B. oklahomensis. The scale shows genetic distance, corrected for the best-fitting substitution model determined using MrModeltest and MrBayes. All nucleotide sites were used in the analysis. A general time reversible model was implemented with rate matrix r(A↔C) 0.012: r(A↔G) 0.419: r(A↔T) 0.020: r(C↔G) 0.024: r(C↔T) 0.509: r(G↔T) 0.016; nucleotide frequencies A 0.18: C 0.35: G 0.32: T 0.15; gamma parameter α=0.11; and proportion of invariant sites Pinvar=0.82. All trees and model parameters are based on 10 000 samples from the posterior probability at stationarity.
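To make the general time reversible (GTR) parameters in the caption concrete, the sketch below assembles an instantaneous rate matrix from the reported exchangeabilities and nucleotide frequencies, using the standard parameterisation q(i→j) = r(i↔j) × π(j). Scaling the matrix to one expected substitution per unit time is a common convention assumed here, not something the caption states, and the gamma rate variation (α) and invariant-site proportion (Pinvar) are left out. It is not a reproduction of the MrBayes analysis.

```python
BASES = "ACGT"

# Exchangeabilities and base frequencies as reported in the figure caption.
EXCHANGEABILITY = {
    ("A", "C"): 0.012, ("A", "G"): 0.419, ("A", "T"): 0.020,
    ("C", "G"): 0.024, ("C", "T"): 0.509, ("G", "T"): 0.016,
}
FREQUENCY = {"A": 0.18, "C": 0.35, "G": 0.32, "T": 0.15}


def gtr_rate_matrix(exchangeability, frequency):
    """Build a GTR rate matrix Q as a nested dict, Q[i][j] = rate i -> j."""
    q = {i: {j: 0.0 for j in BASES} for i in BASES}
    for i in BASES:
        for j in BASES:
            if i != j:
                # r(i<->j) is symmetric, so look the pair up in either order.
                r = exchangeability.get((i, j), exchangeability.get((j, i)))
                q[i][j] = r * frequency[j]
        # Each diagonal entry makes its row sum to zero.
        q[i][i] = -sum(q[i][j] for j in BASES if j != i)
    # Assumed convention: rescale to one expected substitution per unit time.
    mean_rate = -sum(frequency[i] * q[i][i] for i in BASES)
    return {i: {j: q[i][j] / mean_rate for j in BASES} for i in BASES}


Q = gtr_rate_matrix(EXCHANGEABILITY, FREQUENCY)
for i in BASES:
    print(i, [round(Q[i][j], 3) for j in BASES])
```

The two transition exchangeabilities, r(A↔G) and r(C↔T), together account for about 93% of the total in the caption, so the transition entries dominate the off-diagonal rates.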

Mentions: Godoy et al. (2003) characterized strains of Burkholderia pseudomallei, Burkholderia mallei and Burkholderia thailandensis by MLST, and examined the resolution of these species using the MLSA approach. The clustering patterns can be re-examined using the current B. pseudomallei MLST database (http://bpseudomallei.mlst.net), which now includes 770 isolates of B. pseudomallei, 36 of B. mallei and 24 of B. thailandensis. A tree constructed from the concatenated sequences of the seven MLST loci from one example of each of the 421 different multilocus genotypes (strains) in the current database shows that all B. pseudomallei are tightly clustered and are well resolved from a second cluster, which includes all B. thailandensis (figure 2). Both of these named species are soil saprophytes and are very closely related, but can be distinguished phenotypically by whether or not they can assimilate arabinose, and clinically by the fact that B. pseudomallei can cause serious disease following inoculation or inhalation, whereas B. thailandensis is considered to be avirulent.
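Taking "one example of each" multilocus genotype amounts to de-duplicating isolates by their seven-locus allelic profile. The sketch below illustrates that step on a few invented isolates; the isolate names and allele numbers are hypothetical and are not taken from the B. pseudomallei MLST database.

```python
from collections import Counter

# Hypothetical isolates: (name, species, allele numbers at seven loci).
isolates = [
    ("iso1", "B. pseudomallei", (1, 1, 1, 1, 1, 1, 1)),
    ("iso2", "B. pseudomallei", (1, 1, 1, 1, 1, 1, 1)),  # same genotype as iso1
    ("iso3", "B. pseudomallei", (1, 2, 1, 1, 1, 1, 1)),
    ("iso4", "B. thailandensis", (5, 6, 7, 5, 6, 7, 5)),
    ("iso5", "B. thailandensis", (5, 6, 7, 5, 6, 7, 5)),  # same genotype as iso4
]


def one_per_genotype(records):
    """Keep the first isolate seen for each distinct allelic profile."""
    representatives = {}
    for name, species, profile in records:
        representatives.setdefault(profile, (name, species))
    return representatives


representatives = one_per_genotype(isolates)
strains_per_species = Counter(species for _, species in representatives.values())

print(len(representatives), "distinct multilocus genotypes")
print(dict(strains_per_species))
```

Applied to the real database, this kind of reduction is what takes the several hundred isolates down to the 421 distinct genotypes used for the tree (400 + 17 + 2 + 2 in the figure 2 caption).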

