Ensemble analyses improve signatures of tumour hypoxia and reveal inter-platform differences.

Fox NS, Starmans MH, Haider S, Lambin P, Boutros PC - BMC Bioinformatics (2014)

Affiliation: Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, Canada. Paul.Boutros@oicr.on.ca.

ABSTRACT

Background: The reproducibility of transcriptomic biomarkers across datasets remains poor, limiting clinical application. We and others have suggested that this is caused in part by differences in error structure between datasets, and by the incomplete removal of those errors by pre-processing algorithms.

Methods: To test this hypothesis, we systematically assessed the effects of pre-processing on biomarker classification using 24 different pre-processing methods and 15 distinct signatures of tumour hypoxia in 10 datasets (2,143 patients).

Results: We confirm strong pre-processing effects for all datasets and signatures, and find that these effects differ between microarray versions. Importantly, combining different pre-processing techniques in an ensemble approach improved classification for a majority of signatures.

Conclusions: Assessing biomarkers using an ensemble of pre-processing techniques shows clear value across multiple diseases, datasets and biomarkers. Importantly, ensemble classification improves biomarkers with initially good results but does not result in spuriously improved performance for poor biomarkers. While further research is required, this approach has the potential to become a standard for transcriptomic biomarkers.
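
The ensemble approach evaluated here combines the patient classifications produced under each of the 24 pre-processing pipelines into a single per-patient call. The excerpt does not include code, so the following is a minimal sketch of that combination, assuming binary per-pipeline calls and a simple majority vote; the function name ensemble_classify and the tie-breaking rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def ensemble_classify(pipeline_calls):
    """Combine binary patient classifications from several pre-processing
    pipelines into an ensemble call.

    pipeline_calls : (n_pipelines, n_patients) array of 0/1 labels,
        e.g. 1 = "high hypoxia" group, 0 = "low hypoxia" group.

    Returns the per-patient ensemble score (number of pipelines voting 1,
    range 0..n_pipelines) and a majority-vote label (ties broken towards 0).
    """
    calls = np.asarray(pipeline_calls)
    n_pipelines = calls.shape[0]
    score = calls.sum(axis=0)                       # ensemble score, 0..n_pipelines
    label = (score > n_pipelines / 2).astype(int)   # simple majority vote (assumed rule)
    return score, label

# Toy example: 24 pipelines x 5 patients of random calls (illustration only).
rng = np.random.default_rng(0)
calls = rng.integers(0, 2, size=(24, 5))
score, label = ensemble_classify(calls)
print(score, label)
```

With 24 pipelines the score ranges from 0 to 24, matching the ensemble scores shown per patient in Figure 5B.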

Figure 5: Signature comparison. Analysis of consistency between signatures. In A, heatmaps are shown for the pair-wise comparison of all individual pipeline variants; each pair of pipelines is compared using the percent agreement between their patient groupings. B shows the ensemble scores (range 0 to 24) per patient for each signature, with patients on the y-axis and signatures on the x-axis. The signatures are ordered by the number of patients classified unanimously: the signature most consistent across single-pipeline classifications is on the far left and the least consistent one is on the far right. Finally, the scatter plots compare all significant signatures as the number of pipelines used to create the ensemble classification is varied. In C, each point is the log2 of the mean hazard ratio of 2,000 permutations. D similarly shows the effect of the number of methods combined on the number of patients classified. For each array platform, only signatures with statistically significant prognostic power under the ensemble classifier (including all 24 methods) by Cox modeling are shown. For HG-U133 Plus 2.0, the Hu signature and the Winter Metagene signature have equivalent numbers of patients classified, so the Winter Metagene line hides the Hu signature line.
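
Panel C reports, for each ensemble size, the log2 of the mean hazard ratio over 2,000 permutations. The exact resampling scheme is not spelled out in this excerpt; one plausible reading is that, for each ensemble size k, k of the 24 pipelines are drawn at random, an ensemble classification is formed by majority vote, and a univariate Cox model is fitted to that grouping. The sketch below follows that reading and uses the lifelines package for the Cox fit; both the resampling scheme and the tooling are assumptions, not the authors' pipeline.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter  # Cox proportional-hazards model (assumed tooling)

def hazard_ratio(group, time, event):
    """Hazard ratio of a binary patient grouping from a univariate Cox model."""
    df = pd.DataFrame({"group": group, "time": time, "event": event})
    cph = CoxPHFitter()
    cph.fit(df, duration_col="time", event_col="event")
    return cph.hazard_ratios_["group"]

def log2_mean_hr(pipeline_calls, time, event, k, n_perm=2000, seed=0):
    """Draw k pipelines at random n_perm times, majority-vote an ensemble
    classification each time, and return log2 of the mean hazard ratio.
    The resampling scheme is an assumption about how panel C was built."""
    rng = np.random.default_rng(seed)
    calls = np.asarray(pipeline_calls)
    hrs = []
    for _ in range(n_perm):
        subset = rng.choice(calls.shape[0], size=k, replace=False)
        label = (calls[subset].sum(axis=0) > k / 2).astype(int)
        if label.min() == label.max():
            continue  # skip draws that put every patient in a single group
        hrs.append(hazard_ratio(label, time, event))
    return np.log2(np.mean(hrs))
```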

Mentions: To better understand which signatures were more successful, all individual classifications were compared. Unsupervised clustering of the percent agreement of patient classifications between individual pipeline variants for each signature showed that they clustered mainly by signature rather than by pipeline composition (Figure 5A). This indicated that, although pre-processing substantially influenced biomarker performance, the genes in the signature characterized the overall partition and determined whether it was a poor or a good biomarker. The Buffa metagene had the most consistent patient classifications across pipelines, but its hazard ratios still ranged from 1.51 to 1.87. Although we evaluated only hypoxia signatures, patient classifications did not agree across signatures (Figure 5A,B and Additional file 7: Figure S4). Signatures whose ensemble classifications were statistically significant generally classified a larger fraction of patients (Additional file 7: Figure S4B).
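
The clustering described above is driven by the percent agreement between the patient classifications of every pair of pipeline variants (Figure 5A). A minimal sketch of that comparison is shown below, using average-linkage hierarchical clustering from scipy; the linkage method and the conversion of agreement to a distance are assumptions, since the excerpt only states that the clustering was unsupervised.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def percent_agreement(calls):
    """Pairwise percent agreement between classifications.

    calls : (n_classifiers, n_patients) array of 0/1 labels, one row per
    pipeline variant (or per pipeline-and-signature combination)."""
    calls = np.asarray(calls)
    n = calls.shape[0]
    agree = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            agree[i, j] = 100.0 * np.mean(calls[i] == calls[j])
    return agree

# Toy example: 24 pipeline variants x 100 patients of random calls.
calls = np.random.default_rng(1).integers(0, 2, size=(24, 100))
agreement = percent_agreement(calls)

# Convert agreement to a distance, then cluster (average linkage assumed).
distance = 100.0 - agreement
np.fill_diagonal(distance, 0.0)
tree = linkage(squareform(distance, checks=False), method="average")
clusters = fcluster(tree, t=2, criterion="maxclust")
```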

