Clinical and biological relevance of genomic heterogeneity in chronic lymphocytic leukemia.

Friedman DR, Lucas JE, Weinberg JB - PLoS ONE (2013)

Bottom Line: We used unsupervised approaches to divide the data into subgroups, evaluated the biological pathways and genetic aberrations that were associated with the subgroups, and compared prognostic and clinical outcome data between the subgroups. We identified seven genomically-defined CLL subgroups that have distinct biological properties, are associated with specific chromosomal deletions and amplifications, and have marked differences in molecular prognostic markers and clinical outcomes. Our results indicate that investigations focusing on small numbers of patient samples likely provide a biased outlook on CLL biology.


Affiliation: Department of Medicine, Duke University, Durham, North Carolina, USA. daphne.friedman@duke.edu

ABSTRACT

Background: Chronic lymphocytic leukemia (CLL) is typically regarded as an indolent B-cell malignancy. However, there is wide variability with regard to need for therapy, time to progressive disease, and treatment response. This clinical variability is due, in part, to biological heterogeneity among individual patients' leukemias. While much has been learned about this biological variation using genomic approaches, it is unclear whether such efforts have sufficiently evaluated biological and clinical heterogeneity in CLL.

Methods: To study the extent of genomic variability in CLL and the biological and clinical attributes of genomic classification in CLL, we evaluated 893 unique CLL samples from fifteen publicly available gene expression profiling datasets. We used unsupervised approaches to divide the data into subgroups, evaluated the biological pathways and genetic aberrations that were associated with the subgroups, and compared prognostic and clinical outcome data between the subgroups.

Results: Using an unsupervised approach, we determined that approximately 600 CLL samples are needed to define the spectrum of diversity in CLL genomic expression. We identified seven genomically-defined CLL subgroups that have distinct biological properties, are associated with specific chromosomal deletions and amplifications, and have marked differences in molecular prognostic markers and clinical outcomes.

Conclusions: Our results indicate that investigations focusing on small numbers of patient samples likely provide a biased outlook on CLL biology. These findings may have important implications in identifying patients who should be treated with specific targeted therapies, which could have efficacy against CLL cells that rely on specific biological pathways.


Figure 3 (pone-0057356-g003): Representative examples of Consensus Cumulative Distribution Function (CDF) plots for the entire dataset (right) and randomly selected sub-datasets of 100 and 600 CLL samples (left and middle, respectively). Judged by the area under the curve and the slope of the curves, CDF plots from Consensus Clustering of sub-datasets that include 600 CLL samples are similar to the CDF plot of the entire dataset containing 893 CLL samples. However, CDF plots obtained from smaller sub-datasets, for example those comprising 100 CLL samples, are not similar to the CDF plot of the entire dataset.

Mentions: Before assessing the biological and clinical relevance of genomically-defined CLL subgroups, it was important to determine whether the number of samples in the combined dataset was sufficient to evaluate genomic heterogeneity in CLL. Assuming there is no bias in the availability of genomic data, we would expect that adding samples to the combined dataset would cease to increase the number of subgroups once maximum genomic heterogeneity has been reached. Therefore, we evaluated the combined dataset in an iterative fashion to determine whether a smaller number of CLL samples could be used to obtain the same subgroups as the entire combined dataset. To do so, we used the Consensus Clustering algorithm to evaluate the CDF of two to eight subgroups on increasing numbers of samples randomly selected from within the entire dataset. This process was repeated 25 times. CDF plots of sub-datasets were compared to the CDF plot of the entire dataset. The CDF plots for sub-datasets of 50 to 550 samples were different from the CDF plot for the entire dataset (p<0.0001, Fisher’s Exact Test), whereas those for sub-datasets of 600 to 850 samples were not statistically different from the CDF plot for the entire dataset (p>0.05, Fisher’s Exact Test). Representative plots are displayed in Figure 3. Thus, approximately 600 or more CLL samples are required to evaluate genomic complexity in CLL as a whole.
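
The subsampling procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several things in it are assumptions made for illustration: the data are synthetic two-dimensional points rather than gene expression profiles, a plain k-means routine stands in for whatever clustering method was used inside Consensus Clustering, only a single number of subgroups k is evaluated rather than two to eight, and the subset is compared with the full dataset by the absolute difference in area under the consensus CDF rather than by Fisher's Exact Test. All function names (kmeans, consensus_values, cdf, cdf_area, subset_vs_full) are hypothetical.

```python
import random

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means (a stand-in clusterer); returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def consensus_values(points, k, rng, resamples=20, frac=0.8):
    """Upper-triangle entries of the consensus matrix for k clusters."""
    n = len(points)
    together = [[0] * n for _ in range(n)]  # times i and j shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times i and j were drawn together
    for _ in range(resamples):
        idx = sorted(rng.sample(range(n), max(k, int(frac * n))))
        labels = kmeans([points[i] for i in idx], k, rng)
        for a in range(len(idx)):
            for b in range(a + 1, len(idx)):
                i, j = idx[a], idx[b]
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [together[i][j] / sampled[i][j]
            for i in range(n) for j in range(i + 1, n) if sampled[i][j]]

def cdf(values, x):
    """Empirical CDF: fraction of consensus values <= x."""
    return sum(v <= x for v in values) / len(values)

def cdf_area(values, steps=100):
    """Right Riemann sum approximating the area under the CDF on [0, 1]."""
    return sum(cdf(values, s / steps) for s in range(1, steps + 1)) / steps

def subset_vs_full(points, size, k, rng):
    """Absolute difference in CDF area between a random subset and the full set."""
    subset = rng.sample(points, size)
    return abs(cdf_area(consensus_values(subset, k, rng))
               - cdf_area(consensus_values(points, k, rng)))

if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for expression profiles: two Gaussian blobs in 2-D.
    data = [(rng.gauss(m, 1.0), rng.gauss(m, 1.0)) for m in (0.0, 6.0) for _ in range(30)]
    for size in (10, 30, 50):
        print(size, round(subset_vs_full(data, size, 2, rng), 3))
```

In this toy setting, the idea is that the difference printed for each subset size plays the role of the visual comparison in Figure 3: once a subset is large enough to capture the structure of the full dataset, its consensus CDF should resemble the full-dataset CDF. Repeating the draw many times per size, as the text does 25 times, would give a distribution of differences rather than a single value.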

