Identifying currents in the gene pool for bacterial populations using an integrative approach.

Tang J, Hanage WP, Fraser C, Corander J - PLoS Comput. Biol. (2009)

Bottom Line: However, the traditional statistical methods for evolutionary inference, such as phylogenetic analysis, are associated with several difficulties under such an extensive sampling scenario, in particular when a considerable amount of recombination is anticipated to have taken place. Also, we introduce a model-based description of the shape of a population in sequence space, in terms of its molecular variability and affinity towards other populations. Extensive real data from the genus Neisseria are utilized to demonstrate the potential of an approach where these population genetic tools are combined with a phylogenetic analysis.


Affiliation: Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland. jing.tang@helsinki.fi

ABSTRACT
The evolution of bacterial populations has recently become considerably better understood due to large-scale sequencing of population samples. It has become clear that DNA sequences from a multitude of genes, as well as a broad sample coverage of a target population, are needed to obtain a relatively unbiased view of its genetic structure and the patterns of ancestry connected to the strains. However, the traditional statistical methods for evolutionary inference, such as phylogenetic analysis, are associated with several difficulties under such an extensive sampling scenario, in particular when a considerable amount of recombination is anticipated to have taken place. To meet the needs of large-scale analyses of population structure for bacteria, we introduce here several statistical tools for the detection and representation of recombination between populations. Also, we introduce a model-based description of the shape of a population in sequence space, in terms of its molecular variability and affinity towards other populations. Extensive real data from the genus Neisseria are utilized to demonstrate the potential of an approach where these population genetic tools are combined with a phylogenetic analysis. The statistical tools introduced here are freely available in BAPS 5.2 software, which can be downloaded from http://web.abo.fi/fak/mnf/mate/jc/software/baps.html.

Testing partition accuracy for different choices of gene flow weights for a small population size  (upper panel) and a large population size  (lower panel). The number of segregating sites for both settings is  and the ratio of mutations at two stages is . Data were generated by assigning  and  randomly at the interval [0,1] with the gene flow topology fixed as in Figure 2. A brighter area corresponds to a range of  and , within which the true partition has been identified by BAPS with a higher accuracy as measured by Rand Index (RI).


Mentions: We reported the partition accuracy with respect to different choices of and under a constant population size in one scenario and in another. The partition accuracy measured by the Rand Index (RI) (see e.g. [29]) is summarized as a grey-scale map (Figure 3). In the presence of a small amount of admixture, i.e. , the tentative population structure can be identified with high accuracy. As the recombination rate increases over a critical threshold, e.g. as for the current setting, the partition accuracy drops quickly. Therefore, a higher recombination rate, indicated by a lower , would imply a lower partition stability. Such an observation matches our expectation that an excessive amount of admixture tends to obscure the putative population structure.
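
As an illustration of the accuracy measure used above, the Rand Index between a true partition and an inferred one can be computed as the fraction of sample pairs on which the two partitions agree (both place the pair in the same cluster, or both place it in different clusters). The following is a minimal sketch in Python, not the BAPS implementation; the function name `rand_index` and the label lists are hypothetical and chosen only for this example.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Unadjusted Rand Index between two partitions of the same samples.

    Each argument is a sequence of cluster labels, one per sample.
    Label names are arbitrary: only co-membership of pairs matters.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both partitions must label the same samples")
    n = len(labels_true)
    if n < 2:
        raise ValueError("at least two samples are needed to form a pair")

    agreements = 0
    pairs = 0
    for i, j in combinations(range(n), 2):
        same_in_true = labels_true[i] == labels_true[j]
        same_in_pred = labels_pred[i] == labels_pred[j]
        # A pair counts as an agreement when both partitions treat it alike.
        if same_in_true == same_in_pred:
            agreements += 1
        pairs += 1
    return agreements / pairs


if __name__ == "__main__":
    # Hypothetical example: four strains, two true populations.
    true_partition = [0, 0, 1, 1]
    inferred_partition = [0, 0, 1, 2]
    print(rand_index(true_partition, inferred_partition))
```

An RI of 1 means the inferred partition matches the true one exactly (up to relabelling), and lower values indicate more pairs assigned inconsistently. The pairwise loop is quadratic in the number of samples, which is adequate for illustration but may be slow for very large data sets.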

