On the use of bootstrapped topologies in coalescent-based Bayesian MCMC inference: a comparison of estimation and computational efficiencies.

Rodrigo AG, Tsai P, Shearman H - Evol. Bioinform. Online (2009)

Bottom Line: To do this, we use bootstrapped topologies as fixed genealogies, perform a single MCMC analysis on each genealogy without topological rearrangements, and pool the results across all MCMC analyses. We show, through simulations, that although the standard MCMC performs better than the bootstrap-MCMC at estimating the effective population size (scaled by mutation rate), the bootstrap-MCMC returns better estimates of growth rates. Additionally, we find that our bootstrap-MCMC analyses are, on average, 37 times faster for equivalent effective sample sizes.


Affiliation: The Bioinformatics Institute, and The Allan Wilson Centre for Molecular Ecology and Evolution, University of Auckland, Private Bag 92019, Auckland, New Zealand. a.rodrigo@auckland.ac.nz

ABSTRACT
Coalescent-based Bayesian Markov chain Monte Carlo (MCMC) inference generates estimates of evolutionary parameters and their posterior probability distributions. As the number of sequences increases, the length of time taken to complete an MCMC analysis increases as well. Here, we investigate an approach to distribute the MCMC analysis across a cluster of computers. To do this, we use bootstrapped topologies as fixed genealogies, perform a single MCMC analysis on each genealogy without topological rearrangements, and pool the results across all MCMC analyses. We show, through simulations, that although the standard MCMC performs better than the bootstrap-MCMC at estimating the effective population size (scaled by mutation rate), the bootstrap-MCMC returns better estimates of growth rates. Additionally, we find that our bootstrap-MCMC analyses are, on average, 37 times faster for equivalent effective sample sizes.
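The workflow described in the abstract (bootstrap, one fixed-genealogy MCMC run per replicate, then pooling) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual column-resampling bootstrap, and both the sampler and the way a replicate is turned into a target are toy placeholders. The names `bootstrap_columns`, `toy_log_posterior`, `metropolis`, and `pooled_bootstrap_mcmc` are hypothetical.

```python
import math
import random


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement (assumed bootstrap scheme)."""
    n_sites = len(alignment[0])
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


def toy_log_posterior(theta, centre):
    """Placeholder target: unnormalised Gaussian log-density around `centre`.

    A real analysis would evaluate the coalescent posterior on a fixed genealogy.
    """
    return -0.5 * (theta - centre) ** 2


def metropolis(centre, n_steps, rng, step=0.5):
    """Random-walk Metropolis on one parameter; no topology moves are proposed."""
    theta = centre
    logp = toy_log_posterior(theta, centre)
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        logp_new = toy_log_posterior(proposal, centre)
        # Accept with probability min(1, exp(logp_new - logp)).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            theta, logp = proposal, logp_new
        samples.append(theta)
    return samples


def pooled_bootstrap_mcmc(alignment, n_replicates, n_steps, seed=0):
    """Run one chain per bootstrap replicate and pool all samples."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_replicates):
        replicate = bootstrap_columns(alignment, rng)
        # Hypothetical stand-in for "build a tree from the replicate and fix it":
        # a simple summary of the replicate sets the location of the toy target.
        centre = sum(seq.count("A") for seq in replicate) / len(replicate[0])
        pooled.extend(metropolis(centre, n_steps, rng))
    return pooled


if __name__ == "__main__":
    toy_alignment = ["ACGTACGT", "ACGTTCGT", "ACGAACGT"]
    samples = pooled_bootstrap_mcmc(toy_alignment, n_replicates=4, n_steps=50)
    print(len(samples), "pooled samples")
```

Because each replicate's chain is independent of the others, the loop body is the unit of work that could be sent to a separate node of a cluster; the loop is kept serial here only for simplicity.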



Posterior distribution from bootstrap-MCMC and standard-MCMC. Example of the log-posterior probability distribution from both bootstrap-MCMC (top) and standard-MCMC (below) obtained with 210 sequences simulated with a constant population size. Note also the difference in scales of the horizontal axes.


Mentions: Interestingly, the frequency distribution of posterior probabilities is multimodal for the bootstrap-MCMC and unimodal for the standard MCMC (Figs. 1A, B). In retrospect, this is not surprising, since only a small part of topology space is explored under the bootstrap-MCMC. Note, however, that the number of modes in the marginal distribution of log-posterior probabilities obtained using the bootstrap-MCMC does not correspond to the number of unique topologies obtained using the bootstrap: there are more topologies than modes. The bootstrap-MCMC also obtains lower log-posterior probabilities than the standard MCMC.
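One way a pooled distribution can show fewer modes than topologies is if some fixed topologies yield similar log-posterior levels, so their contributions overlap. The source does not state that this is the cause; the sketch below is only a toy illustration of the arithmetic. It treats each topology's log-posterior values as an equal-weight Gaussian component with invented centres and spread, and counts local maxima of the mixture on a grid. The function names and all numbers are hypothetical.

```python
import math


def mixture_density(x, centres, sd):
    """Equal-weight Gaussian mixture (unnormalised), one component per topology."""
    return sum(math.exp(-0.5 * ((x - c) / sd) ** 2) for c in centres) / len(centres)


def count_modes(centres, sd, lo, hi, step=0.01):
    """Count strict local maxima of the mixture density on a regular grid."""
    n = int(round((hi - lo) / step))
    ys = [mixture_density(lo + i * step, centres, sd) for i in range(n + 1)]
    return sum(1 for i in range(1, n) if ys[i - 1] < ys[i] > ys[i + 1])


# Invented log-posterior levels for four fixed topologies; the first two
# differ by less than two standard deviations, so they merge into one mode.
centres = [-1000.0, -999.5, -990.0, -980.0]
n_modes = count_modes(centres, sd=1.0, lo=-1010.0, hi=-970.0)
print(len(centres), "topologies,", n_modes, "modes")
```

Two equal-weight Gaussian components with a common standard deviation form a single mode whenever their centres are no more than two standard deviations apart, which is why the first two toy topologies are indistinguishable in the pooled histogram.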

