Applying unmixing to gene expression data for tumor phylogeny inference.

Schwartz R, Shackney SE - BMC Bioinformatics (2010)

Bottom Line: Validation on simulated data shows the method can accurately separate small numbers of cell states and infer phylogenetic relationships among them. Application to a lung cancer dataset shows that the method can identify cell states corresponding to common lung tumor types and suggest possible evolutionary relationships among them that show good correspondence with our current understanding of lung tumor development. Unmixing methods provide a way to make use of both intra-tumor heterogeneity and large probe sets for tumor phylogeny inference, establishing a new avenue towards the construction of detailed, accurate portraits of common tumor sub-types and the mechanisms by which they develop.


Affiliation: Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA USA. russells@andrew.cmu.edu

ABSTRACT

Background: While in principle a seemingly infinite variety of combinations of mutations could result in tumor development, in practice it appears that most human cancers fall into a relatively small number of "sub-types," each characterized by a roughly equivalent sequence of mutations by which it progresses in different patients. There is currently great interest in identifying the common sub-types and applying them to the development of diagnostics or therapeutics. Phylogenetic methods have shown great promise for inferring common patterns of tumor progression, but suffer from limits of the technologies available for assaying differences between and within tumors. One approach to tumor phylogenetics uses differences between single cells within tumors, gaining valuable information about intra-tumor heterogeneity but allowing only a few markers per cell. An alternative approach uses tissue-wide measures of whole tumors to provide a detailed picture of averaged tumor state but at the cost of losing information about intra-tumor heterogeneity.

Results: The present work applies "unmixing" methods, which separate complex data sets into combinations of simpler components, to attempt to gain advantages of both tissue-wide and single-cell approaches to cancer phylogenetics. We develop an unmixing method to infer recurring cell states from microarray measurements of tumor populations and use the inferred mixtures of states in individual tumors to identify possible evolutionary relationships among tumor cells. Validation on simulated data shows the method can accurately separate small numbers of cell states and infer phylogenetic relationships among them. Application to a lung cancer dataset shows that the method can identify cell states corresponding to common lung tumor types and suggest possible evolutionary relationships among them that show good correspondence with our current understanding of lung tumor development.

Conclusions: Unmixing methods provide a way to make use of both intra-tumor heterogeneity and large probe sets for tumor phylogeny inference, establishing a new avenue towards the construction of detailed, accurate portraits of common tumor sub-types and the mechanisms by which they develop. These reconstructions are likely to have future value in discovering and diagnosing novel cancer sub-types and in identifying targets for therapeutic development.



Figure 1: Illustration of the geometric mixture model used in the present work. The image shows a hypothetical set of three mixture components (C1, C2, and C3) and two mixed samples (M1 and M2) produced from different mixtures of those components. The triangular simplex enclosed by the mixture components is shown with dashed lines. To the right are the matrices M, C, and F corresponding to the example data points.

Mentions: The unmixing problem is illustrated in Fig. 1, which shows a small hypothetical example of a possible M, C, and F for k = 3. In the example, we see two data points, M1 and M2, meant to represent primary tumor samples derived from three mixture components, C1, C2, and C3. For this example, we assume data are assayed on just two genes, G1 and G2. The matrix M provides the coordinates of the observed mixed samples, M1 and M2, in terms of the gene expression levels G1 and G2. We assume here that M1 and M2 are mixtures of the three components, C1, C2, and C3, meaning that they will lie in the triangular simplex that has the components as its vertices. The matrix C provides the coordinates of the three components in terms of G1 and G2. The matrix F then describes how M1 and M2 are generated from C. The first row of F indicates that M1 is a mixture of equal parts of C1 and C2, and thus appears at the midpoint of the line between those two components. The second row of F indicates that M2 is a mixture of 80% C3 with 10% each C1 and C2, thus appearing internal to the simplex but close to C3. In the real problem, we get to observe only M and must therefore infer the C and F matrices likely to have generated the observed M.
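The relationship among M, C, and F described above can be sketched numerically. The snippet below is only an illustration: the mixture fractions in F are the ones stated in the text, but the coordinates in C are hypothetical values chosen for the example (the values plotted in Fig. 1 are not reproduced here), and the helper names mix and fractions_given_components are assumptions rather than anything from the paper. The second helper recovers F only in the easy case where C is already known, by solving for barycentric coordinates; it is not the inference method of the paper, which must estimate both C and F from M alone.

```python
import numpy as np

# Hypothetical component coordinates (rows: C1, C2, C3; columns: genes G1, G2).
# Illustrative assumption only, not the values shown in Fig. 1.
C = np.array([
    [0.0, 0.0],  # C1
    [4.0, 0.0],  # C2
    [2.0, 3.0],  # C3
])

# Mixture fractions as stated in the text (rows: M1, M2; columns: C1, C2, C3).
F = np.array([
    [0.5, 0.5, 0.0],  # M1: equal parts C1 and C2
    [0.1, 0.1, 0.8],  # M2: 80% C3, 10% each C1 and C2
])


def mix(F, C):
    """Forward model M = F C: each sample is a convex combination of components."""
    F = np.asarray(F, dtype=float)
    C = np.asarray(C, dtype=float)
    if np.any(F < 0) or not np.allclose(F.sum(axis=1), 1.0):
        raise ValueError("each row of F must be non-negative and sum to 1")
    return F @ C


def fractions_given_components(M, C):
    """Recover F when C is known and its k rows are affinely independent.

    Requires k = (number of genes) + 1, as in the two-gene, three-component
    example. Appending a column of ones enforces that each row of F sums to 1.
    """
    M = np.asarray(M, dtype=float)
    C = np.asarray(C, dtype=float)
    A = np.hstack([C, np.ones((C.shape[0], 1))])  # k x (genes + 1)
    B = np.hstack([M, np.ones((M.shape[0], 1))])  # samples x (genes + 1)
    # Solve F A = B, written as A^T F^T = B^T.
    return np.linalg.solve(A.T, B.T).T


M = mix(F, C)
F_recovered = fractions_given_components(M, C)
print(M)
print(F_recovered)
```

With these assumed coordinates, M1 lands at the midpoint of the C1-C2 edge and M2 falls inside the triangle near C3, matching the geometric description. Given only two samples and no knowledge of C, many different triangles would enclose the same points, which is why the real inference problem relies on many tumor samples.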

