The Lambert Way to Gaussianize Heavy-Tailed Data with the Inverse of Tukey's h Transformation as a Special Case.

Goerg GM - ScientificWorldJournal (2015)

Bottom Line: For Gaussian X, the transformation reduces to Tukey's h distribution. Parameters can be estimated by maximum likelihood, and applications to S&P 500 log-returns demonstrate the usefulness of the presented methodology. The R package LambertW implements most of the introduced methodology and is publicly available on CRAN.


Affiliation: Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, USA.

ABSTRACT
I present a parametric, bijective transformation to generate heavy-tailed versions of arbitrary random variables. The tail behavior of this heavy-tailed Lambert W × F_X random variable Y depends on a tail parameter δ ≥ 0: for δ = 0, Y ≡ X; for δ > 0, Y has heavier tails than X. For Gaussian X, it reduces to Tukey's h distribution. The Lambert W function provides an explicit inverse transformation, which can thus remove heavy tails from observed data. It also provides closed-form expressions for the cumulative distribution function (cdf) and probability density function (pdf). As a special case, these yield analytic expressions for Tukey's h pdf and cdf. Parameters can be estimated by maximum likelihood, and applications to S&P 500 log-returns demonstrate the usefulness of the presented methodology. The R package LambertW implements most of the introduced methodology and is publicly available on CRAN.
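The forward and inverse transformations described in the abstract can be sketched for a standardized input (zero location, unit scale). This is a minimal illustration, not the LambertW package's code: it assumes the Tukey's h form z = u · exp(δ/2 · u²) and its inverse u = sgn(z) · sqrt(W(δ z²)/δ), where W is the principal branch of the Lambert W function. The function names `lambert_w0`, `heavy_tail`, and `gaussianize` are hypothetical, and the Newton solver is a stand-in for a library routine.

```python
import math


def lambert_w0(x, tol=1e-12, max_iter=100):
    """Principal branch of Lambert W for x >= 0, solved by Newton's method."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    w = math.log1p(x)  # starting point at or above the root for x >= 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def heavy_tail(u, delta):
    """Forward map: add heavy tails to a standardized value u (delta >= 0)."""
    return u * math.exp(0.5 * delta * u * u)


def gaussianize(z, delta):
    """Inverse map: remove heavy tails from a standardized value z."""
    if delta == 0.0:
        return z  # delta = 0 is the identity, matching Y = X
    magnitude = math.sqrt(lambert_w0(delta * z * z) / delta)
    return math.copysign(magnitude, z)


if __name__ == "__main__":
    for u in (-2.0, 0.0, 1.5):
        z = heavy_tail(u, 0.2)
        print(u, z, gaussianize(z, 0.2))
```

Because δ z² is never negative, only the principal branch of W is needed, which is what makes the inverse single-valued and the transformation bijective.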


© Copyright Policy - open-access

Figure 7: Lambert W Gaussianization of S&P 500 log-returns. In (a) and (b): data (top left); autocorrelation function (ACF) (top right); histogram, Gaussian fit, and KDE (bottom left); Normal QQ plot (bottom right).

Mentions: Figure 7(a) shows the S&P 500 log-returns with a total of N = 2,780 daily observations (R package MASS, dataset SP500). Table 3(b) confirms the heavy tails (sample kurtosis 7.70) but also indicates negative skewness (−0.296). As the sample skewness is very sensitive to outliers, we fit a distribution which allows skewness and test for symmetry. In the case of the double-tail Lambert W × Gaussian this means testing H0: δ_ℓ = δ_r = δ versus H1: δ_ℓ ≠ δ_r. Using the likelihood expression in (28), we can use a likelihood ratio test with one degree of freedom (3 versus 4 parameters). The log-likelihood of the double-tail fit (Table 4(a)) equals −3606.0 = −2972.27 + (−633.73) (input log-likelihood + penalty), while the symmetric δ fit gives −3606.56 = −2971.47 + (−635.09). Here the symmetric fit gives a transformed sample that is more Gaussian, but it pays a greater penalty for transforming the data. Comparing twice their difference to a χ² distribution with one degree of freedom gives a P-value of 0.29. For comparison, a skew-t fit [51], with location c, scale s, shape α, and ν degrees of freedom (function st.mle in the R package sn), also yields a nonsignificant shape estimate α (Table 4(b)). Thus neither fit can reject symmetry.
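The likelihood ratio arithmetic above can be checked with a few lines of Python. This is only a sketch that re-uses the four numbers quoted from Table 4(a); the variable names are illustrative, and the χ² tail probability for one degree of freedom is computed through the identity P(χ²₁ > x) = erfc(sqrt(x/2)) rather than a statistics library.

```python
import math

# Log-likelihoods quoted in the text: input log-likelihood + penalty.
ll_double_tail = -2972.27 + (-633.73)  # 4 parameters
ll_symmetric = -2971.47 + (-635.09)    # 3 parameters

# Likelihood ratio statistic: twice the log-likelihood difference.
lr_stat = 2.0 * (ll_double_tail - ll_symmetric)

# Upper-tail probability of a chi-squared variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

print(f"LR statistic = {lr_stat:.2f}, p-value = {p_value:.2f}")
```

The statistic comes out near 1.12 and the P-value near 0.29, in agreement with the value reported in the text.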

