PhyloWGS: reconstructing subclonal composition and evolution from whole-genome sequencing of tumors.

Deshwar AG, Vembu S, Yung CK, Jang GH, Stein L, Morris Q - Genome Biol. (2015)

Bottom Line: Tumors often contain multiple subpopulations of cancerous cells defined by distinct somatic mutations. We describe a new method, PhyloWGS, which can be applied to whole-genome sequencing data from one or more tumor samples to reconstruct complete genotypes of these subpopulations based on variant allele frequencies (VAFs) of point mutations and population frequencies of structural variations. We introduce a principled phylogenic correction for VAFs in loci affected by copy number alterations and we show that this correction greatly improves subclonal reconstruction compared to existing methods.


ABSTRACT
Tumors often contain multiple subpopulations of cancerous cells defined by distinct somatic mutations. We describe a new method, PhyloWGS, which can be applied to whole-genome sequencing data from one or more tumor samples to reconstruct complete genotypes of these subpopulations based on variant allele frequencies (VAFs) of point mutations and population frequencies of structural variations. We introduce a principled phylogenic correction for VAFs in loci affected by copy number alterations and we show that this correction greatly improves subclonal reconstruction compared to existing methods. PhyloWGS is free, open-source software, available at https://github.com/morrislab/phylowgs.
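To illustrate why VAFs in copy-number-altered loci need a correction, here is a minimal sketch of the general mixture relationship between cell populations and the expected VAF at one locus. It is not the PhyloWGS model: the function name `expected_vaf`, the tuple layout and the example fractions are all assumptions made for illustration.

```python
def expected_vaf(populations):
    """Expected variant allele frequency at a single locus.

    populations: iterable of (cell_fraction, total_copies, variant_copies),
    one tuple per cell population in the sample (hypothetical layout).
    """
    variant_alleles = sum(frac * var for frac, _, var in populations)
    all_alleles = sum(frac * tot for frac, tot, _ in populations)
    return variant_alleles / all_alleles


# Illustrative sample: 30% normal cells, 70% cells carrying the mutation.
normal = (0.3, 2, 0)

# Diploid locus, heterozygous mutation: VAF is half the mutated cell fraction.
diploid = expected_vaf([normal, (0.7, 2, 1)])

# Same cells after a one-copy gain. The VAF depends on whether the gained
# copy carries the mutation, which depends on the order of the two events.
gain_mutated_allele = expected_vaf([normal, (0.7, 3, 2)])
gain_other_allele = expected_vaf([normal, (0.7, 3, 1)])

print(diploid, gain_mutated_allele, gain_other_allele)
```

In this toy setting the same 70% of cells carry the mutation in all three cases, yet the expected VAF is roughly 0.35, 0.52 or 0.26, so the VAF alone cannot be converted to a population frequency without knowing how the mutation and the copy number change relate.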



Figure 11: Subclonal reconstruction algorithms applied to breast tumor PD4120. Left: Area under the precision–recall curve (AUPRC) for PhyloWGS, PyClone and SciClone when looking at SSMs in areas of normal copy number. Right: AUPRC for PhyloWGS, PyClone and SciClone when looking at SSMs in areas of altered and normal copy number. CN, copy number; SSM, simple somatic mutation.

Mentions: We analyzed data from WGS at 288× coverage for tumor PD4120a, first reported in [26] and re-analyzed in [15] (available as accession [EGAD:00001000138]). We confined our analysis to SSMs in genomic regions where THetA and the original analysis agreed on the copy number status of the genome (chr 3, 4q, 5, 10, 13, 16q, 17, 19 and 20). These regions contain a total of 26,029 SSMs, of which 4,739 were in regions affected by clonal copy number changes and 2,171 were in regions affected by subclonal copy number changes. We then ran PhyloWGS, PyClone and SciClone on SSMs in regions of normal copy number and on SSMs in regions of both altered and normal copy number. PyClone uses a non-phylogenic correction for copy number alterations and SciClone performs no correction. Based on the semi-manual clustering from [26], we identified those mutations assigned to clusters D, C and B with high probability, which we used as our gold standard for clustering. We then compared the AUPRC for all three algorithms on the two datasets (see Figure 11). All three algorithms have very similar performance when only looking at SSMs in normal regions (Figure 11, left panel). PhyloWGS continues to have very high performance when SSMs in regions of copy number alterations are included, while both PyClone and SciClone have much worse performance than PhyloWGS (Figure 11, right panel).
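The comparison above can be sketched in code under some stated assumptions. The text does not spell out how the AUPRC was computed, so this example assumes that each pair of mutations is scored for co-clustering and compared with the gold-standard clusters, and it uses average precision as a common estimate of the area under the precision-recall curve. The function names and the toy labels are hypothetical, not taken from PhyloWGS, PyClone or SciClone.

```python
from itertools import combinations


def coclustering_scores(labels):
    """Score every mutation pair: 1.0 if both share a cluster, else 0.0."""
    return [float(labels[i] == labels[j])
            for i, j in combinations(range(len(labels)), 2)]


def average_precision(scores, truth):
    """Average precision of ranked pairs against binary truth values.

    Ties in score are broken by input order, so soft (probabilistic)
    scores give a more meaningful ranking than hard 0/1 assignments.
    """
    ranked = sorted(zip(scores, truth), key=lambda pair: -pair[0])
    hits = 0
    precision_sum = 0.0
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / hits if hits else 0.0


# Toy gold standard with clusters named as in the text (B, C, D).
gold = ["B", "B", "C", "C", "D"]
truth = coclustering_scores(gold)

# A reconstruction that recovers the clusters, and one that merges B and C.
perfect = average_precision(coclustering_scores(["x", "x", "y", "y", "z"]), truth)
merged = average_precision(coclustering_scores(["x", "x", "x", "x", "z"]), truth)

print(perfect, merged)
```

Merging two true clusters ranks many false pairs alongside the true ones, which lowers precision. This is the kind of degradation the right panel of Figure 11 reports for methods that do not correct VAFs in copy-number-altered regions.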

