Applying unmixing to gene expression data for tumor phylogeny inference.

Schwartz R, Shackney SE - BMC Bioinformatics (2010)

Bottom Line: Validation on simulated data shows the method can accurately separate small numbers of cell states and infer phylogenetic relationships among them. Application to a lung cancer dataset shows that the method can identify cell states corresponding to common lung tumor types and suggest possible evolutionary relationships among them that show good correspondence with our current understanding of lung tumor development. Unmixing methods provide a way to make use of both intra-tumor heterogeneity and large probe sets for tumor phylogeny inference, establishing a new avenue towards the construction of detailed, accurate portraits of common tumor sub-types and the mechanisms by which they develop.


Affiliation: Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA, USA. russells@andrew.cmu.edu

ABSTRACT

Background: While in principle a seemingly infinite variety of combinations of mutations could result in tumor development, in practice it appears that most human cancers fall into a relatively small number of "sub-types," each characterized by a roughly equivalent sequence of mutations by which it progresses in different patients. There is currently great interest in identifying the common sub-types and applying them to the development of diagnostics or therapeutics. Phylogenetic methods have shown great promise for inferring common patterns of tumor progression, but suffer from limits of the technologies available for assaying differences between and within tumors. One approach to tumor phylogenetics uses differences between single cells within tumors, gaining valuable information about intra-tumor heterogeneity but allowing only a few markers per cell. An alternative approach uses tissue-wide measures of whole tumors to provide a detailed picture of averaged tumor state but at the cost of losing information about intra-tumor heterogeneity.

Results: The present work applies "unmixing" methods, which separate complex data sets into combinations of simpler components, to attempt to gain advantages of both tissue-wide and single-cell approaches to cancer phylogenetics. We develop an unmixing method to infer recurring cell states from microarray measurements of tumor populations and use the inferred mixtures of states in individual tumors to identify possible evolutionary relationships among tumor cells. Validation on simulated data shows the method can accurately separate small numbers of cell states and infer phylogenetic relationships among them. Application to a lung cancer dataset shows that the method can identify cell states corresponding to common lung tumor types and suggest possible evolutionary relationships among them that show good correspondence with our current understanding of lung tumor development.

Conclusions: Unmixing methods provide a way to make use of both intra-tumor heterogeneity and large probe sets for tumor phylogeny inference, establishing a new avenue towards the construction of detailed, accurate portraits of common tumor sub-types and the mechanisms by which they develop. These reconstructions are likely to have future value in discovering and diagnosing novel cancer sub-types and in identifying targets for therapeutic development.



Figure 3: Accuracy of methods in inferring simulated mixture components and assigning mixture fractions to data points. (a) Root mean square error in inferred mixture components as a function of noise level for uniform mixtures of k = 3 to k = 7 mixture components. (b) Root mean square error in fractional assignments of components to data points as a function of noise level for uniform mixtures of k = 3 to k = 7 mixture components. (c) Root mean square error in inferred mixture components as a function of noise level for tree-embedded mixtures of k = 3 to k = 7 mixture components. (d) Root mean square error in fractional assignments of components to data points as a function of noise level for tree-embedded mixtures of k = 3 to k = 7 mixture components.

Mentions: Fig. 3 quantifies performance across a range of simulated data qualities and evolution scenarios. Fig. 3(a) assesses accuracy on uniform mixtures by the error in inferred components and Fig. 3(b) by the error in inferred mixture fractions. Figs. 3(a, b) reveal that mixture components could be identified with high accuracy provided there were few mixture components and low noise. Accuracy degraded as component number or noise level increased. Errors appear to have grown superlinearly with component number but sublinearly with noise level. Accuracy of mixture fraction inference appears sensitive to component number but largely insensitive to noise level over the ranges examined here. This insensitivity to noise level likely depended on the assumption that noise in each gene is independent, which allows extremely accurate estimates when noise can be averaged over many genes. Correlated noise between genes or systematic sample-wide errors would be expected to yield poorer performance.
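
The remark about independent noise being averaged over many genes can be illustrated with a small simulation. The sketch below is not the authors' unmixing method. It assumes a linear mixture model, Gaussian noise that is independent per gene, mixture fractions drawn uniformly from the simplex, and components that are already known, so that fractions can be recovered by ordinary least squares. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_uniform_mixtures(n_genes, n_samples, k, noise_sd, rng):
    """Hypothetical simulator: k random components, uniform mixture fractions."""
    # Each row is one component's expression profile over n_genes (assumed Gaussian).
    components = rng.normal(size=(k, n_genes))
    # Each row is one sample's mixture fractions, uniform on the simplex (rows sum to 1).
    fractions = rng.dirichlet(np.ones(k), size=n_samples)
    # Assumed linear mixing plus noise drawn independently for every gene and sample.
    noise = rng.normal(scale=noise_sd, size=(n_samples, n_genes))
    return components, fractions, fractions @ components + noise


def estimate_fractions(data, components):
    """Least-squares fractions given KNOWN components (a simplification)."""
    # Solve components.T @ F.T ~= data.T for F, one column per sample.
    solution, *_ = np.linalg.lstsq(components.T, data.T, rcond=None)
    return solution.T


def rmse(estimate, truth):
    """Root mean square error between two arrays of equal shape."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))


def fraction_rmse(n_genes, k=3, noise_sd=1.0, n_samples=50, seed=0):
    """RMSE of recovered mixture fractions for one simulated data set."""
    rng = np.random.default_rng(seed)
    components, fractions, data = simulate_uniform_mixtures(
        n_genes, n_samples, k, noise_sd, rng
    )
    return rmse(estimate_fractions(data, components), fractions)


if __name__ == "__main__":
    for n_genes in (20, 200, 2000):
        print(n_genes, fraction_rmse(n_genes))
```

Under these assumptions the error in the recovered fractions shrinks as the number of genes grows, because independent per-gene noise averages out. If the noise were correlated across genes, adding genes would help less, which is consistent with the caveat stated above.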

