Quantifying, displaying and accounting for heterogeneity in the meta-analysis of RCTs using standard and generalised Q statistics.
Bottom Line:
Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Differing results were obtained when the standard Q and I2 statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate.
Affiliation: MRC Clinical Trials Unit, 222 Euston Road, London NW1 2DA, UK. jack.bowden@mrc-bsu.cam.ac.uk
ABSTRACT
Background: Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Heterogeneity is usually assessed via the well known Q and I2 statistics, along with the random effects estimate they imply. In recent years, alternative methods for quantifying heterogeneity have been proposed that are based on a 'generalised' Q statistic.

Methods: We review 18 IPD meta-analyses of RCTs into treatments for cancer, in order to quantify the amount of heterogeneity present and also to discuss practical methods for explaining heterogeneity.

Results: Differing results were obtained when the standard Q and I2 statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate. Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses.

Conclusions: Explaining heterogeneity via the pre-specification of trial subgroups, graphical diagnostic tools and sensitivity analyses produced a more desirable outcome than an automatic application of the random effects model. Generalised Q statistic methods for quantifying and adjusting for heterogeneity should be incorporated as standard into statistical software. Software is provided to help achieve this aim.
On closer inspection of these data, the size of a study appears to be correlated with its estimated effect - indeed the largest study (Oxford) and the smallest study (CEP-85) cover the complete range of all the estimates. Figure 3 (right) shows a funnel plot [29] of the same data to highlight this more clearly. If the independence assumption in equation (1) holds, then we would expect the plot to be symmetrical around the mean estimate. Although funnel plot asymmetry is not necessarily indicative of dissemination bias (or 'small study' effects), the prevailing, uncontroversial view is that unbiased study dissemination is more likely for larger studies than for smaller studies, and it is certainly one possible explanation for what we see here. Egger's regression [30] - which can be thought of as a very general test for dissemination bias [31] - provides some evidence of a higher than average correlation between effect size and precision (p-value 0.098). This correlation is the reason for the large difference between the fixed and random effects estimates. As the contributing trials spanned 30 years, the types of chemotherapy used varied considerably, and consequently the pre-specified main analysis sub-divided trials into chemotherapy categories. This was indeed helpful in resolving the issue before any further, more subjective analyses were attempted. Of the 11 relevant trials identified, 2 used long-term alkylating agents, 1 used a Vinca alkaloid/etoposide agent and 8 used a Cisplatin-based regimen. The results of this subgroup analysis are shown against the results across all trials in Table 2. Among the 8 Cisplatin trials we saw a slight decrease in the estimated magnitude of heterogeneity, with I2 down to 68%. The homogeneity p-value from the standard Q statistic was also far less significant, though this is arguably due in part to a loss of power resulting from splitting the data.
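Egger's regression described above amounts to a simple least-squares computation. The sketch below is a minimal illustration, not the software accompanying the paper: it regresses each study's standardised effect (estimate divided by its standard error) on its precision, and an intercept far from zero signals funnel plot asymmetry. A normal approximation replaces the t distribution with k - 2 degrees of freedom in the p-value, purely for brevity.

```python
import math

def egger_test(effects, ses):
    """Egger's regression test for funnel plot asymmetry (sketch).

    Regresses the standardised effect y/se on the precision 1/se.
    Returns (intercept, two-sided p-value); a non-zero intercept
    suggests small-study effects.
    """
    n = len(effects)
    x = [1.0 / s for s in ses]                 # precision
    z = [y / s for y, s in zip(effects, ses)]  # standardised effect
    xbar = sum(x) / n
    zbar = sum(z) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (zi - zbar) for xi, zi in zip(x, z)) / sxx
    intercept = zbar - slope * xbar
    resid = [zi - (intercept + slope * xi) for xi, zi in zip(x, z)]
    s2 = sum(r * r for r in resid) / (n - 2)   # residual variance
    se_int = math.sqrt(s2 * (1.0 / n + xbar ** 2 / sxx))
    t = intercept / se_int
    # normal approximation to the reference distribution, for brevity
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(t) / math.sqrt(2.0))))
    return intercept, p
```

A p-value of 0.098, as reported for these data, would correspond to weak evidence of asymmetry under this test.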
The main effect, however, was a clear reduction in the amount of funnel plot asymmetry (Egger's regression p-value 0.38) and consequently much better agreement between the fixed and random effects model estimates.
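The heterogeneity quantities used throughout - Cochran's Q, I2 and the generalised Q statistic - can all be computed directly from the study estimates and their standard errors. The sketch below is an illustration under our own naming, not the paper's supplied software: it evaluates the standard Q (the generalised Q at tau2 = 0), converts it to I2, and solves the generalised Q estimating equation Q(tau2) = k - 1 for tau2 by bisection, using the fact that Q is decreasing in tau2.

```python
def q_statistic(effects, ses, tau2=0.0):
    """Generalised Q: weighted squared deviations from the weighted
    mean, with weights 1/(se_i^2 + tau2). tau2 = 0 gives Cochran's Q."""
    w = [1.0 / (s * s + tau2) for s in ses]
    ybar = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, effects))

def i_squared(effects, ses):
    """I2 as a percentage: the share of Q in excess of its df."""
    q = q_statistic(effects, ses)
    df = len(effects) - 1
    return max(0.0, (q - df) / q) * 100.0

def tau2_generalised_q(effects, ses, hi=100.0, tol=1e-8):
    """Solve Q(tau2) = k - 1 for tau2 by bisection.

    Q decreases monotonically in tau2, so a root (if one exists) is
    unique; returns 0 when even tau2 = 0 gives Q <= k - 1.
    """
    df = len(effects) - 1
    if q_statistic(effects, ses, 0.0) <= df:
        return 0.0
    lo = 0.0
    while q_statistic(effects, ses, hi) > df:  # bracket the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q_statistic(effects, ses, mid) > df:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With homogeneous data the solver returns tau2 = 0 and the fixed effects analysis is recovered; as between-study variation grows, the solved tau2 downweights the largest studies and the random effects estimate moves accordingly.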