BitPhylogeny: a probabilistic framework for reconstructing intra-tumor phylogenies.
Bottom Line: Here, we present BitPhylogeny, a probabilistic framework to reconstruct intra-tumor evolutionary pathways. Using a full Bayesian approach, we jointly estimate the number and composition of clones in the sample as well as the most likely tree connecting them. We validate our approach in the controlled setting of a simulation study and compare it against several competing methods.
Cancer has long been understood as a somatic evolutionary process, but many details of tumor progression remain elusive. Here, we present BitPhylogeny, a probabilistic framework to reconstruct intra-tumor evolutionary pathways. Using a full Bayesian approach, we jointly estimate the number and composition of clones in the sample as well as the most likely tree connecting them. We validate our approach in the controlled setting of a simulation study and compare it against several competing methods. In two case studies, we demonstrate how BitPhylogeny reconstructs tumor phylogenies from methylation patterns in colon cancer and from single-cell exomes in myeloproliferative neoplasm.
Mentions: To compare the tree topologies explicitly, we developed a distance measure called consensus node-based shortest path distance (see Materials and methods for details). The performance of BitPhylogeny is examined based on the empirical MAP solution (see Materials and methods). The results for all synthetic data sets (five clone compositions and four noise levels) are presented in Figure 4. For all clonal compositions and noise levels, BitPhylogeny constructs trees that are much closer to the true tree than those of both baseline methods. For the monoclonal tree, all three methods are able to reconstruct the two clones accurately. However, as the clonal composition becomes more complex, the performance of the two baseline methods degrades quickly. The baseline methods overestimate the number of clones and produce much deeper trees for most synthetic data sets. As a result, they perform poorly when the complexity of the clone composition increases.
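To give a feel for how a shortest-path-based tree distance can behave, the following is a minimal sketch in Python. It is not the definition from Materials and methods, which is not reproduced here. It assumes, for illustration only, that both trees are given as lists of (parent, child) edges over the same set of clone labels, and it sums the absolute differences in hop counts over all pairs of labels. The function names `build_adjacency`, `hops_from`, and `pairwise_path_distance` are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Turn (parent, child) edges into an undirected adjacency map."""
    adjacency = {}
    for parent, child in edges:
        adjacency.setdefault(parent, set()).add(child)
        adjacency.setdefault(child, set()).add(parent)
    return adjacency


def hops_from(adjacency, source):
    """Breadth-first search: number of edges from source to every reachable node."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


def pairwise_path_distance(edges_a, edges_b, shared_labels):
    """Sum of |path length in tree A - path length in tree B| over all label pairs.

    Assumption (illustrative, not taken from the paper): every label in
    shared_labels occurs in both trees, and each tree is connected.
    """
    adj_a = build_adjacency(edges_a)
    adj_b = build_adjacency(edges_b)
    hops_a = {label: hops_from(adj_a, label) for label in shared_labels}
    hops_b = {label: hops_from(adj_b, label) for label in shared_labels}
    total = 0
    for u, v in combinations(sorted(shared_labels), 2):
        total += abs(hops_a[u][v] - hops_b[u][v])
    return total


# Toy example: a chain root -> A -> B versus a star with A and B under the root.
chain = [("root", "A"), ("A", "B")]
star = [("root", "A"), ("root", "B")]
labels = {"root", "A", "B"}
print(pairwise_path_distance(chain, star, labels))  # 2
```

In this toy case the root-to-B path shortens by one edge and the A-to-B path lengthens by one edge, so the distance is 2, while two identical trees score 0. A tree that is deeper than the true tree, as reported for the baseline methods, would inflate many pairwise path lengths and therefore score a larger distance under a measure of this kind.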