On the challenge of fitting tree size distributions in ecology.

Taubert F, Hartig F, Dobner HJ, Huth A - PLoS ONE (2013)

Bottom Line: We test whether three typical frequency distributions, namely the power-law, negative exponential and Weibull distribution, can be precisely identified, and how parameter estimates are biased when observations are additionally either binned or contain measurement error. We show that uncorrected MLE already loses the ability to discern functional form and parameters at relatively small levels of uncertainties. We conclude that it is important to reduce binning of observations, if possible, and to quantify observation accuracy in empirical studies for fitting strongly skewed size distributions.


Affiliation: Department of Ecological Modelling, Helmholtz Centre for Environmental Research, Leipzig, Saxony, Germany. franziska.taubert@ufz.de

ABSTRACT
Patterns that resemble strongly skewed size distributions are frequently observed in ecology. A typical example is the size distribution of tree stem diameters. Empirical tests of ecological theories predicting their parameters have been conducted, but the results are difficult to interpret because the statistical methods that are applied to fit such decaying size distributions vary. In addition, binning of field data as well as measurement errors may bias parameter estimates. Here, we compare three different methods for parameter estimation: the common maximum likelihood estimation (MLE) and two modified types of MLE that correct for binning of observations or for random measurement errors. We test whether three typical frequency distributions, namely the power-law, negative exponential and Weibull distribution, can be precisely identified, and how parameter estimates are biased when observations are additionally either binned or contain measurement error. We show that uncorrected MLE already loses the ability to discern functional form and parameters at relatively small levels of uncertainties. The modified MLE methods that consider such uncertainties (either binning or measurement error) are considerably more robust. We conclude that it is important to reduce binning of observations, if possible, and to quantify observation accuracy in empirical studies for fitting strongly skewed size distributions. In general, modified MLE methods that correct for binning or measurement errors can be applied to ensure reliable results.


Figure 6 (pone-0058036-g006): Effect of errors on Akaike weights for the correct determination of the underlying distribution. In each row, virtual data sets of sample size 500 that originate from the three truncated distributions (power-law, negative exponential and Weibull distribution) are evaluated. Weights are calculated supposing these distributions (power-law, negative exponential and Weibull distribution) with (a)–(c) multinomial MLE and (d)–(f) Gaussian MLE. The highest Akaike weight determines the best fit of a frequency distribution to the data. (a)–(c) Effect of binning of virtual data sets, with the bin width used on the x-axis (in cm), on Akaike weights. (d)–(f) Effect of random measurement errors added to the virtual data sets on Akaike weights, whereby errors are Gaussian distributed with a given mean (in cm) and an assumed standard deviation (x-axis, in cm). Solid lines represent the mean of Akaike weights and shaded areas show the standard deviation (of (a)–(c) 1000 values and (d)–(f) 250 values).
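For readers unfamiliar with Akaike weights, the following is a minimal sketch of the standard definition: each candidate's AIC is compared with the smallest AIC, and the relative likelihoods exp(-Δ/2) are normalized to sum to one. The log-likelihood values and the function name are hypothetical. Whether the article uses plain AIC or a small-sample correction is not stated in this excerpt, so plain AIC is an assumption here.

```python
import math


def akaike_weights(log_likelihoods, n_params):
    """Akaike weights from maximized log-likelihoods and parameter counts."""
    # AIC = 2k - 2 ln L for each candidate distribution.
    aic = [2.0 * k - 2.0 * ll for ll, k in zip(log_likelihoods, n_params)]
    best = min(aic)
    # Relative likelihood of each candidate given the data.
    rel = [math.exp(-0.5 * (a - best)) for a in aic]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical maximized log-likelihoods for one virtual data set.
candidates = ["power-law", "negative exponential", "Weibull"]
log_l = [-100.0, -102.0, -101.0]
k = [1, 1, 2]  # one exponent, one rate, and shape plus scale
for name, w in zip(candidates, akaike_weights(log_l, k)):
    print(name, round(w, 3))
```

The candidate with the highest weight is taken as the best-fitting distribution, which is how the panels of the figure are read.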

Mentions: Identifying the underlying distribution with MLE variants that account for observation uncertainties (multinomial MLE and Gaussian MLE) shows a clear improvement over standard MLE (Fig. 6). An underlying power-law or Weibull distribution is always correctly determined (Fig. 6a, 6c, 6d, 6f). For exponentially distributed data, the correct distribution is identified with at least 50% probability for a large range of bin widths (Table S1). Beyond this range, Akaike weights favour a power-law distribution (Fig. 6b). Concerning measurement errors, the exponential distribution is identified for all error levels in the range of our investigations (Fig. 6e). An increase in sample size has considerable positive effects for both modified MLE methods (Fig. S2, S4).
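One common way to account for random measurement error in a likelihood is to convolve the assumed size density with the error density. The sketch below does this for a truncated negative exponential and zero-mean Gaussian errors, using a midpoint rule for the integral. The zero mean, the 2 cm standard deviation, the 10 to 100 cm range, the grid resolution and all function names are assumptions made for illustration; the excerpt does not give the article's exact formulation of Gaussian MLE. The grid step should stay well below the error standard deviation for the midpoint rule to be accurate.

```python
import math
import random


def truncated_exp_pdf(x, rate, x_min, x_max):
    """Density of a negative exponential truncated to [x_min, x_max]."""
    norm = 1.0 - math.exp(-rate * (x_max - x_min))
    return rate * math.exp(-rate * (x - x_min)) / norm


def gaussian_kernel(observed, grid, sigma):
    """Normal densities of (observed - true) for every grid point."""
    c = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [[c * math.exp(-0.5 * ((y - x) / sigma) ** 2) for x in grid]
            for y in observed]


def error_log_likelihood(rate, kernel, grid, step, x_min, x_max):
    """Log-likelihood of noisy observations under the convolved density."""
    pdf = [truncated_exp_pdf(x, rate, x_min, x_max) for x in grid]
    total = 0.0
    for row in kernel:
        # Midpoint-rule approximation of the convolution integral.
        density = step * sum(k * f for k, f in zip(row, pdf))
        total += math.log(density)
    return total


def fit_rate_with_error(observed, sigma, x_min, x_max,
                        n_cells=150, lo=1e-3, hi=1.0, tol=1e-5):
    """Golden-section search for the rate maximizing the log-likelihood."""
    step = (x_max - x_min) / n_cells
    grid = [x_min + (j + 0.5) * step for j in range(n_cells)]
    kernel = gaussian_kernel(observed, grid, sigma)  # computed once

    def objective(rate):
        return error_log_likelihood(rate, kernel, grid, step, x_min, x_max)

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if objective(c) > objective(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0


def simulate(rng, n, rate, sigma, x_min, x_max):
    """Virtual diameters from the truncated exponential plus Gaussian noise."""
    span = 1.0 - math.exp(-rate * (x_max - x_min))
    true = [x_min - math.log(1.0 - rng.random() * span) / rate for _ in range(n)]
    return [x + rng.gauss(0.0, sigma) for x in true]


# Hypothetical example: 400 stems, true rate 0.1 per cm, error sd 2 cm.
data = simulate(random.Random(7), 400, 0.1, 2.0, 10.0, 100.0)
print("estimated rate:", fit_rate_with_error(data, 2.0, 10.0, 100.0))
```

Because the error standard deviation enters the likelihood directly, it has to be known or estimated separately, which is consistent with the article's recommendation to quantify observation accuracy in empirical studies.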

