Limits...
Reconstruction of clonal trees and tumor composition from multi-sample sequencing data.

El-Kebir M, Oesper L, Acheson-Field H, Raphael BJ - Bioinformatics (2015)

Bottom Line: We derive a combinatorial characterization of the solutions to this problem and show that the problem is NP-complete.We derive an integer linear programming solution to the VAF factorization problem in the case of error-free data and extend this solution to real data with a probabilistic model for errors.The resulting AncesTree algorithm is better able to identify ancestral relationships between individual mutations than existing approaches, particularly in ultra-deep sequencing data when high read counts for mutations yield high confidence VAFs.

View Article: PubMed Central - PubMed

Affiliation: Center for Computational Molecular Biology and Department of Computer Science, Brown University, Providence, RI 02912, USA.

Show MeSH

Related in: MedlinePlus

Spanning arborescences of the ancestry graph. (A) F cannot be factorized as its ancestry graph does not admit a spanning arborescence. (B) Red arcs indicate a spanning arborescence T of the ancestry graph of F with corresponding matrix B. B does not generate F as the matrix
© Copyright Policy - creative-commons
Related In: Results  -  Collection

License
getmorefigures.php?uid=PMC4542783&req=5

btv261-F2: Spanning arborescences of the ancestry graph. (A) F cannot be factorized as its ancestry graph does not admit a spanning arborescence. (B) Red arcs indicate a spanning arborescence T of the ancestry graph of F with corresponding matrix B. B does not generate F as the matrix

Mentions: If an ancestry graph does not have a spanning arborescence then there exists no tree T that generates F. Checking whether G has a spanning, arborescence can be done in time since by definition A contains all transitive arcs. Figure 2A shows an example of a frequency matrix whose ancestry graph has no spanning arborescence. Furthermore, not all spanning arborescences T of G generate F. Figure 2B shows such an example, where the matrix U obtained from T and F has negative entries and thus is not a usage matrix. Hence, the existence of a spanning arborescence in G is a necessary but not a sufficient condition for a solution to the VAFFP.Fig. 2.


Reconstruction of clonal trees and tumor composition from multi-sample sequencing data.

El-Kebir M, Oesper L, Acheson-Field H, Raphael BJ - Bioinformatics (2015)

Spanning arborescences of the ancestry graph. (A) F cannot be factorized as its ancestry graph does not admit a spanning arborescence. (B) Red arcs indicate a spanning arborescence T of the ancestry graph of F with corresponding matrix B. B does not generate F as the matrix
© Copyright Policy - creative-commons
Related In: Results  -  Collection

License
Show All Figures
getmorefigures.php?uid=PMC4542783&req=5

btv261-F2: Spanning arborescences of the ancestry graph. (A) F cannot be factorized as its ancestry graph does not admit a spanning arborescence. (B) Red arcs indicate a spanning arborescence T of the ancestry graph of F with corresponding matrix B. B does not generate F as the matrix
Mentions: If an ancestry graph does not have a spanning arborescence then there exists no tree T that generates F. Checking whether G has a spanning, arborescence can be done in time since by definition A contains all transitive arcs. Figure 2A shows an example of a frequency matrix whose ancestry graph has no spanning arborescence. Furthermore, not all spanning arborescences T of G generate F. Figure 2B shows such an example, where the matrix U obtained from T and F has negative entries and thus is not a usage matrix. Hence, the existence of a spanning arborescence in G is a necessary but not a sufficient condition for a solution to the VAFFP.Fig. 2.

Bottom Line: We derive a combinatorial characterization of the solutions to this problem and show that the problem is NP-complete.We derive an integer linear programming solution to the VAF factorization problem in the case of error-free data and extend this solution to real data with a probabilistic model for errors.The resulting AncesTree algorithm is better able to identify ancestral relationships between individual mutations than existing approaches, particularly in ultra-deep sequencing data when high read counts for mutations yield high confidence VAFs.

View Article: PubMed Central - PubMed

Affiliation: Center for Computational Molecular Biology and Department of Computer Science, Brown University, Providence, RI 02912, USA.

Show MeSH
Related in: MedlinePlus