Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range.

Wan X, Wang W, Liu J, Tong T - BMC Med Res Methodol (2014)

Bottom Line: For the third scenario, our method still performs very well for both normal data and skewed data. Furthermore, we compare the estimators of the sample mean and standard deviation under all three scenarios and present some suggestions on which scenario is preferred in real-world applications. We conclude our work with a summary table (an Excel spreadsheet including all formulas) that serves as comprehensive guidance for performing meta-analysis in different situations.


Affiliation: Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong. xwan@comp.hkbu.edu.hk.

ABSTRACT

Background: In systematic reviews and meta-analyses, researchers often pool the sample means and standard deviations from a set of similar clinical trials. A number of trials, however, report their results using the median, the minimum and maximum values, and/or the first and third quartiles. Hence, in order to combine results, one may have to estimate the sample mean and standard deviation for such trials.

Methods: In this paper, we propose to improve the existing literature in several directions. First, we show that the sample standard deviation estimation in Hozo et al.'s method (BMC Med Res Methodol 5:13, 2005) has some serious limitations and is always less satisfactory in practice. Inspired by this, we propose a new estimation method by incorporating the sample size. Second, we systematically study the sample mean and standard deviation estimation problem under several other interesting settings where the interquartile range is also available for the trials.

Results: We demonstrate the performance of the proposed methods through simulation studies for each of the three frequently encountered scenarios. For the first two scenarios, our method greatly improves on existing methods and provides a nearly unbiased estimate of the true sample standard deviation for normal data and a slightly biased estimate for skewed data. For the third scenario, our method still performs very well for both normal data and skewed data. Furthermore, we compare the estimators of the sample mean and standard deviation under all three scenarios and present some suggestions on which scenario is preferred in real-world applications.

Conclusions: In this paper, we discuss different approximation methods for estimating the sample mean and standard deviation and propose some new estimation methods to improve the existing literature. We conclude our work with a summary table (an Excel spreadsheet including all formulas) that serves as comprehensive guidance for performing meta-analysis in different situations.
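
For reference, the estimators usually cited from this paper can be written as below. The abstract itself does not state them, so the notation here (a, q_1, m, q_3, b, n, and the scenario labels) is an assumption that should be checked against the full article and its spreadsheet before use.

```latex
% Assumed notation: a = minimum, q_1 = first quartile, m = median,
% q_3 = third quartile, b = maximum, n = sample size,
% \Phi^{-1} = quantile function of the standard normal distribution.
\begin{align*}
\mathcal{C}_1=\{a,m,b;n\}:\quad
  & \bar{X}\approx\frac{a+2m+b}{4},
  \qquad
  S\approx\frac{b-a}{2\,\Phi^{-1}\!\left(\frac{n-0.375}{n+0.25}\right)} \\[4pt]
\mathcal{C}_2=\{a,q_1,m,q_3,b;n\}:\quad
  & \bar{X}\approx\frac{a+2q_1+2m+2q_3+b}{8},
  \qquad
  S\approx\frac{b-a}{4\,\Phi^{-1}\!\left(\frac{n-0.375}{n+0.25}\right)}
   +\frac{q_3-q_1}{4\,\Phi^{-1}\!\left(\frac{0.75n-0.125}{n+0.25}\right)} \\[4pt]
\mathcal{C}_3=\{q_1,m,q_3;n\}:\quad
  & \bar{X}\approx\frac{q_1+m+q_3}{3},
  \qquad
  S\approx\frac{q_3-q_1}{2\,\Phi^{-1}\!\left(\frac{0.75n-0.125}{n+0.25}\right)}
\end{align*}
```

In this form the sample size enters only through the normal quantile in each denominator. The divisor applied to the range b - a grows with n, which reflects that the expected range of a sample widens as more observations are drawn.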


Fig4: Relative errors of the sample mean and standard deviation estimations for normal data, where the black solid circles represent the method under scenario C1, the red solid triangles represent the method under scenario C2, and the green empty circles represent the method under scenario C3.

Mentions: In each simulation, we first draw a random sample of size n from each distribution. The true sample mean and the true sample standard deviation are computed using the whole sample. The summary statistics are also computed and categorized into Scenarios C1, C2, and C3. We then use the aforementioned formulas to estimate the sample mean and standard deviation. The sample sizes are n = 4Q + 1, where Q takes values from 1 to 50. With 1000 simulations, we report the average relative errors in Figure 4 for both X̄ and S with the normal distribution, in Figure 5 for the sample mean estimation with the non-normal distributions, and in Figure 6 for the sample standard deviation estimation with the non-normal distributions.
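
The simulation described above can be sketched in Python roughly as follows. This is a minimal illustration and not the authors' code. Several parts are assumptions:

- The estimator formulas are the ones commonly cited from the paper.
- The quartiles are taken as the order statistics at positions Q+1, 2Q+1 and 3Q+1 of the sorted sample.
- The relative error is computed as (average estimate - average true value) / average true value.
- The normal parameters mu=50 and sigma=17 are illustrative.
- The function names `summarize`, `estimate` and `average_relative_errors` are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev

PHI_INV = NormalDist().inv_cdf  # standard normal quantile function
SCENARIOS = ("C1", "C2", "C3")


def summarize(sample):
    """Return (a, q1, m, q3, b) for a sample of size n = 4Q + 1.

    Assumption: quartiles are the order statistics x_(Q+1), x_(2Q+1), x_(3Q+1).
    """
    xs = sorted(sample)
    n = len(xs)
    if n < 5 or (n - 1) % 4 != 0:
        raise ValueError("sample size must be n = 4Q + 1 with Q >= 1")
    q = (n - 1) // 4
    return xs[0], xs[q], xs[2 * q], xs[3 * q], xs[-1]


def estimate(scenario, a, q1, m, q3, b, n):
    """Estimate (mean, SD) from summary statistics under one scenario."""
    range_z = PHI_INV((n - 0.375) / (n + 0.25))        # used with b - a
    iqr_z = PHI_INV((0.75 * n - 0.125) / (n + 0.25))   # used with q3 - q1
    if scenario == "C1":   # min, median, max
        return (a + 2 * m + b) / 4, (b - a) / (2 * range_z)
    if scenario == "C2":   # min, q1, median, q3, max
        mean_hat = (a + 2 * q1 + 2 * m + 2 * q3 + b) / 8
        sd_hat = (b - a) / (4 * range_z) + (q3 - q1) / (4 * iqr_z)
        return mean_hat, sd_hat
    if scenario == "C3":   # q1, median, q3
        return (q1 + m + q3) / 3, (q3 - q1) / (2 * iqr_z)
    raise ValueError(f"unknown scenario: {scenario}")


def average_relative_errors(n, n_sims=1000, mu=50.0, sigma=17.0, seed=0):
    """Return {scenario: (rel. error of mean, rel. error of SD)} for normal data."""
    rng = random.Random(seed)
    true_means, true_sds = [], []
    est_means = {s: [] for s in SCENARIOS}
    est_sds = {s: [] for s in SCENARIOS}
    for _ in range(n_sims):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        true_means.append(mean(sample))   # "true" values use the whole sample
        true_sds.append(stdev(sample))
        stats = summarize(sample)
        for s in SCENARIOS:
            mean_hat, sd_hat = estimate(s, *stats, n)
            est_means[s].append(mean_hat)
            est_sds[s].append(sd_hat)
    avg_mean, avg_sd = mean(true_means), mean(true_sds)
    return {
        s: ((mean(est_means[s]) - avg_mean) / avg_mean,
            (mean(est_sds[s]) - avg_sd) / avg_sd)
        for s in SCENARIOS
    }


if __name__ == "__main__":
    for big_q in (1, 10, 50):             # the text uses Q = 1, ..., 50
        n = 4 * big_q + 1
        print(n, average_relative_errors(n))
```

Choosing n = 4Q + 1 makes the minimum, both quartiles, the median and the maximum fall exactly on observed values, so no interpolation rule for the quartiles is needed. Replacing `rng.gauss` with a skewed generator would give a rough analogue of the non-normal comparisons mentioned for Figures 5 and 6.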

