Reconstruction of clonal trees and tumor composition from multi-sample sequencing data.

El-Kebir M, Oesper L, Acheson-Field H, Raphael BJ - Bioinformatics (2015)

Bottom Line: We derive a combinatorial characterization of the solutions to this problem and show that the problem is NP-complete. We derive an integer linear programming solution to the variant allele frequency (VAF) factorization problem in the case of error-free data and extend this solution to real data with a probabilistic model for errors. The resulting AncesTree algorithm is better able to identify ancestral relationships between individual mutations than existing approaches, particularly in ultra-deep sequencing data, where high read counts for mutations yield high-confidence VAFs.


Affiliation: Center for Computational Molecular Biology and Department of Computer Science, Brown University, Providence, RI 02912, USA.


Fig. 1. Model for clonal evolution and inference. (A) An example of the evolution of a tumor containing seven distinct clones. Passenger mutations (white) occurring before the first clonal expansion will be indistinguishable from mutations driving the growth of the founding clone (light blue). Each subsequent mutation (green, purple, dark blue, orange, red and tan) creates a new clone. (B) Three sequenced tumor samples. Some clones may no longer exist at the time of sequencing (orange). Samples 1 and 2 each contain a single clone (purple and blue, respectively), whereas Sample 3 is a mixture of three clones (light blue, red and tan). (C) The frequency matrix F observed for the three sequenced samples indicated in part B. (D) The usage matrix U and clonal matrix B that generate F. Even though some clones existing at the current time may not be contained within a sequenced sample (green), their existence in the evolutionary history of the tumor may be recovered. (E) Tree of the inferred tumor clones. Solid black edges are the clonal tree T corresponding to the clonal matrix B. Gray dashed edges indicate internal vertices used in the mixing of some sample. The number next to each clone in each sample indicates the fraction of cells in the sample from that clone. (F) The ancestry graph for the observed data. The bold arcs indicate the spanning arborescence corresponding to T.
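The relationship between the clonal tree T, the clonal matrix B, the usage matrix U and the frequency matrix F described in the caption can be illustrated with a minimal sketch. It rests on several assumptions that are not stated on this page: each clone introduces exactly one mutation, every mutation is heterozygous in a diploid region (hence the factor 1/2, so that F = (1/2) U B), and the data are error-free. The tree, the usage values and the helper names `clonal_matrix` and `frequency_matrix` are hypothetical and are not taken from the figure or from the AncesTree software.

```python
def clonal_matrix(parent):
    """Build B from a clonal tree given as a parent list.

    parent[i] is the parent clone of clone i, or None for the founding clone.
    Assumption: clone i introduces mutation i, so B[i][j] = 1 exactly when
    j is clone i itself or one of its ancestors.
    """
    n = len(parent)
    B = [[0] * n for _ in range(n)]
    for clone in range(n):
        node = clone
        while node is not None:  # walk up to the root, marking inherited mutations
            B[clone][node] = 1
            node = parent[node]
    return B


def frequency_matrix(U, B):
    """Compute F = (1/2) * U * B for error-free data.

    U[p][k] is the fraction of cells in sample p that belong to clone k.
    The factor 1/2 assumes heterozygous mutations in diploid regions.
    """
    n = len(B)
    return [
        [0.5 * sum(u_row[k] * B[k][j] for k in range(n)) for j in range(n)]
        for u_row in U
    ]


# Hypothetical tree: clone 0 is the founder; clones 1 and 2 are its children.
parent = [None, 0, 0]
B = clonal_matrix(parent)

# Hypothetical usage: sample 0 is pure clone 1; sample 1 mixes all three clones.
U = [
    [0.0, 1.0, 0.0],
    [0.5, 0.25, 0.25],
]
F = frequency_matrix(U, B)

for row in F:
    print(row)
```

In this toy example the founding mutation has frequency 0.5 in both samples because every tumor cell carries it, while the two later mutations appear only at the frequency contributed by their own clone. The inference problem runs in the opposite direction: given only F, recover a U and B that are consistent with it.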
© Copyright Policy - creative-commons

Mentions: Cancer is a disease resulting from somatic mutations that accumulate during an individual’s lifetime and lead to uncontrolled growth of a collection of cells into a tumor. The clonal theory of cancer (Nowell, 1976) predicts that all cells within a tumor have descended from a single founder cell and that subsequent clonal expansions occur from additional advantageous mutations. As a result, the cells within a tumor may differ in their complement of somatic mutations, with each cell being a descendant of a clone from a clonal expansion (Fig. 1A). High-coverage sequencing of tumor genomes allows one to study this intra-tumor heterogeneity by measuring the frequencies of mutations within a tumor (Ding et al., 2012; Nik-Zainal et al., 2012; Shah et al., 2012). Characterization of intra-tumor heterogeneity and inference of the clonal evolutionary history of somatic mutations within a tumor provide useful insight into the tumor’s development and may help inform treatment.

