Reconstruction of clonal trees and tumor composition from multi-sample sequencing data.
Bottom Line: We derive a combinatorial characterization of the solutions to this problem and show that the problem is NP-complete. We derive an integer linear programming solution to the VAF factorization problem in the case of error-free data and extend this solution to real data with a probabilistic model for errors. The resulting AncesTree algorithm is better able to identify ancestral relationships between individual mutations than existing approaches, particularly in ultra-deep sequencing data, where high read counts for mutations yield high-confidence VAFs.
Affiliation: Center for Computational Molecular Biology and Department of Computer Science, Brown University, Providence, RI 02912, USA.
Mentions: We further analyzed one renal tumor, EV006, for which we obtained a relatively low mixing proportion of 0.21 (Fig. 6). Samples R6 and R7 from this tumor were found to be mixtures of two and three distinct clones, respectively, that do not appear in other samples. This shows that AncesTree can infer the composition of individual samples containing clones distinct from all other samples. The remaining samples in this tumor all include a clone that appears in at least one other sample. Notably, the two lymph node samples, LN1a and LN1b, are inferred to be mixtures of the same two clones. The only difference between these two samples appears to be that LN1b contains a higher admixture with normal cells (0.45) than LN1a (< 0.01), and indeed the two lymph node samples are grouped together in the original analysis of this tumor by Gerlinger et al. (2014).
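To make the idea of two samples sharing the same clones but differing in normal admixture concrete, the following is a minimal sketch of the error-free forward model behind VAF factorization. It assumes the factorization takes the form F = 1/2 · U · B for heterozygous mutations in diploid regions, where U holds the clone proportions per sample and B is a binary clone-by-mutation matrix. The function names, the two-clone tree, and all numeric values are hypothetical illustrations, not data from the tumor discussed above, and this is not the AncesTree integer linear program itself.

```python
def expected_vafs(usage, clonal_matrix):
    """Error-free forward model, assumed here to be F = 0.5 * U * B.

    usage[p][k]         : fraction of sample p made up of clone k
    clonal_matrix[k][j] : 1 if clone k carries mutation j, else 0
    """
    n_mutations = len(clonal_matrix[0])
    return [
        [
            0.5 * sum(u * clone[j] for u, clone in zip(sample, clonal_matrix))
            for j in range(n_mutations)
        ]
        for sample in usage
    ]


def normal_admixture(sample_usage):
    """Fraction of a sample not attributed to any tumor clone."""
    return 1.0 - sum(sample_usage)


# Hypothetical two-clone tree: clone 0 carries mutation 0 only;
# clone 1 descends from clone 0 and additionally carries mutation 1.
B = [
    [1, 0],
    [1, 1],
]

# Two hypothetical samples built from the same two clones in the same
# 3:2 ratio, differing only in how much normal tissue is mixed in.
U = [
    [0.60, 0.40],  # no normal admixture
    [0.33, 0.22],  # 45% normal admixture
]

F = expected_vafs(U, B)
admixtures = [normal_admixture(sample) for sample in U]
```

Under these assumptions, both samples contain the same clones, and the higher normal admixture of the second sample scales all of its expected VAFs down by the same factor.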