Tukey lambda distribution

Notation Tukey(λ)
Parameters λ ∈ ℝ (shape parameter)
Support x ∈ [−1/λ, 1/λ] for λ > 0; x ∈ ℝ for λ ≤ 0
PDF (Q(p;\lambda),\, Q'(p;\lambda)^{-1}),\quad 0 \leq p \leq 1
CDF (e^{-x}+1)^{-1} for λ = 0
Mean 0 for λ > −1
Median 0
Mode 0
Variance \frac{2}{\lambda^2}\bigg(\frac{1}{1+2\lambda}-\frac{\Gamma(\lambda+1)^2}{\Gamma(2\lambda+2)}\bigg) for λ > −1/2, λ ≠ 0; \frac{\pi^2}{3} for λ = 0
Skewness 0 for λ > −1/3
Ex. kurtosis \frac{(2\lambda+1)^2}{2(4\lambda+1)} \frac{g_2^2\big(3g_2^2-4g_1g_3+g_4\big)}{g_4\big(g_1^2-g_2\big)^2} - 3 for λ > −1/4, λ ≠ 0, where g_k = \Gamma(k\lambda+1); 1.2 for λ = 0
Entropy h(\lambda) = \int_0^1 \log (Q'(p;\lambda))\,dp [1]
CF \phi(t;\lambda) = \int_0^1 \exp (\,i t\,Q(p;\lambda))\,dp [2]

Formalized by John Tukey, the Tukey lambda distribution is a continuous probability distribution defined in terms of its quantile function. It is typically used to identify an appropriate distribution for a data set (see Comments below) rather than as a statistical model in its own right.

The Tukey lambda distribution has a single shape parameter λ. As with other probability distributions, the Tukey lambda distribution can be transformed with a location parameter, μ, and a scale parameter, σ. Since the general form of a probability distribution can be expressed in terms of its standard form, the formulas below are given for the standard form only.
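
As a brief illustration of the location-scale transformation mentioned above, the relations below sketch how the general form would be recovered from the standard form. The symbols Q(p; λ, μ, σ), f and F for the general quantile function, density and distribution function are notation assumed here for illustration, and σ > 0 is assumed.

```latex
X = \mu + \sigma Z, \quad Z \sim \mathrm{Tukey}(\lambda)
\quad\Longrightarrow\quad
Q(p;\lambda,\mu,\sigma) = \mu + \sigma\, Q(p;\lambda),
\qquad
f(x;\lambda,\mu,\sigma) = \frac{1}{\sigma}\, f\!\left(\frac{x-\mu}{\sigma};\lambda\right),
\qquad
F(x;\lambda,\mu,\sigma) = F\!\left(\frac{x-\mu}{\sigma};\lambda\right).
```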

Quantile function

For the standard form of the Tukey lambda distribution, the quantile function Q(p) (i.e. the inverse of the cumulative distribution function) and the quantile density function Q'(p) (i.e. the derivative of the quantile function) are


Q\left(p;\lambda\right) =
\begin{cases}
\frac{1}{\lambda} \left[p^\lambda - (1 - p)^\lambda\right], & \text{if } \lambda \ne 0 \\
\log\left(\frac{p}{1-p}\right), & \text{if } \lambda = 0,
\end{cases}

Q'\left(p;\lambda\right) = p^{\lambda-1} + \left(1-p\right)^{\lambda-1}.
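
Because the distribution is defined through its quantile function, these two formulas translate almost directly into code. The following is a minimal sketch in Python using only the standard library; the function names are chosen here for illustration and are not taken from any particular package. It also shows inverse transform sampling (X = Q(U) with U uniform on (0, 1)), which is a standard technique rather than something prescribed by the text above.

```python
import math
import random


def tukey_lambda_quantile(p, lam):
    """Quantile function Q(p; lambda) of the standard Tukey lambda distribution."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if lam == 0:
        # Logistic case, the limit of the general formula as lambda -> 0.
        return math.log(p / (1.0 - p))
    return (p ** lam - (1.0 - p) ** lam) / lam


def tukey_lambda_quantile_density(p, lam):
    """Quantile density Q'(p; lambda), the derivative of Q with respect to p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p ** (lam - 1.0) + (1.0 - p) ** (lam - 1.0)


def sample_tukey_lambda(n, lam, seed=None):
    """Draw n variates by inverse transform sampling: X = Q(U), U ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # random() may return exactly 0.0, which Q does not accept here
            draws.append(tukey_lambda_quantile(u, lam))
    return draws
```

For example, `sample_tukey_lambda(1000, 0.14, seed=1)` would produce a sample whose shape is roughly normal, in line with the approximation listed under Comments.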

The probability density function (pdf) and cumulative distribution function (cdf) are generally computed numerically, because the Tukey lambda distribution has no simple closed form except for a few special values of λ, such as λ = 0 (see logistic distribution) and λ = 1 or λ = 2 (uniform). However, the pdf can be expressed in parametric form for all values of λ: the point x = Q(p;λ) has density 1/Q'(p;λ), for 0 ≤ p ≤ 1.
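
One way to carry out the numerical computation is sketched below, under the assumption that plain bisection is accurate enough: since Q is strictly increasing in p, the cdf at x is the root p of Q(p;λ) = x, and the pdf then follows from the parametric form as 1/Q'(p;λ). The function names and the tolerance are illustrative choices, not part of the source.

```python
import math


def quantile(p, lam):
    """Standard Tukey lambda quantile function Q(p; lambda), 0 < p < 1."""
    if lam == 0:
        return math.log(p / (1.0 - p))
    return (p ** lam - (1.0 - p) ** lam) / lam


def cdf(x, lam, tol=1e-12):
    """Solve Q(p; lambda) = x for p by bisection on the interval (0, 1)."""
    if lam > 0:
        # For lambda > 0 the support is the bounded interval [-1/lambda, 1/lambda].
        if x <= -1.0 / lam:
            return 0.0
        if x >= 1.0 / lam:
            return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if quantile(mid, lam) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def pdf(x, lam):
    """Density via the parametric form: f(Q(p)) = 1 / Q'(p)."""
    p = cdf(x, lam)
    if p <= 0.0 or p >= 1.0:
        # Outside the bounded support; in this sketch the endpoints themselves
        # are also treated as outside.
        return 0.0
    return 1.0 / (p ** (lam - 1.0) + (1.0 - p) ** (lam - 1.0))
```

Bisection is slow but robust; a production implementation would more likely use a bracketing root finder with faster convergence.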

Moments

The Tukey lambda distribution is symmetric about zero, so its expected value, which exists for λ > −1, is equal to zero. The variance exists for λ > −½ and, except when λ = 0, is given by the formula


    \operatorname{Var}[X] = \frac{2}{\lambda^2}\bigg(\frac{1}{1+2\lambda} - \frac{\Gamma(\lambda+1)^2}{\Gamma(2\lambda+2)}\bigg).

More generally, the n-th order moment is finite when λ > −1/n and, except when λ = 0, is expressed in terms of the beta function Β(x,y):


    \mu_n = \operatorname{E}[X^n] = \frac{1}{\lambda^n} \sum_{k=0}^n (-1)^k {n \choose k}\, \operatorname{B}(\lambda k+1,\, \lambda(n-k)+1 ).

Note that due to symmetry of the density function, all moments of odd orders are equal to zero.
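
The moment formula can be evaluated directly. The sketch below, with illustrative function names, computes the beta function through `math.lgamma` and checks the general formula against the closed-form variance; the excess kurtosis is obtained here as μ₄/μ₂² − 3, which is valid because the mean is zero.

```python
import math


def beta(a, b):
    """Beta function B(a, b) for a, b > 0, computed through log-gamma."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))


def raw_moment(n, lam):
    """n-th moment E[X^n] for lambda != 0; finite only when lambda > -1/n."""
    if lam == 0:
        raise ValueError("the beta-function formula does not apply at lambda = 0")
    if lam <= -1.0 / n:
        raise ValueError("moment is not finite for lambda <= -1/n")
    total = 0.0
    for k in range(n + 1):
        total += (-1) ** k * math.comb(n, k) * beta(lam * k + 1.0, lam * (n - k) + 1.0)
    return total / lam ** n


def variance(lam):
    """Closed-form variance for lambda > -1/2, lambda != 0."""
    return (2.0 / lam ** 2) * (
        1.0 / (1.0 + 2.0 * lam)
        - math.gamma(lam + 1.0) ** 2 / math.gamma(2.0 * lam + 2.0)
    )


def excess_kurtosis(lam):
    """Excess kurtosis mu_4 / mu_2^2 - 3 for lambda > -1/4, lambda != 0."""
    return raw_moment(4, lam) / raw_moment(2, lam) ** 2 - 3.0
```

For λ = 1 the distribution is uniform on (−1, 1), so `raw_moment(2, 1)` should agree with 1/3 up to rounding, which gives a quick sanity check. Both formulas subtract nearly equal terms when λ is close to zero, so some loss of precision is expected there.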

L-moments

Unlike the central moments, the L-moments can be expressed in closed form. The L-moment of order r > 1, multiplied by λ, is given by[3]


\lambda L_{r} = \sum_{k=0}^{r-1} (-1)^{r-k-1} \binom{r-1}{k} \binom{r+k-1}{k} \left(\frac{1}{k+1+\lambda}+\frac{(-1)^{r}}{k+1+\lambda} \right).

The first six L-moments can be presented as follows:[3]


L_1 = 0

\lambda L_2 = -\frac{2}{1+\lambda} + \frac{4}{2+\lambda}

L_3 = 0

\lambda L_4 = -\frac{2}{1+\lambda} + \frac{24}{2+\lambda} - \frac{60}{3+\lambda} + \frac{40}{4+\lambda}

L_5 = 0

\lambda L_6 = -\frac{2}{1+\lambda} + \frac{60}{2+\lambda} - \frac{420}{3+\lambda} + \frac{1120}{4+\lambda} - \frac{1260}{5+\lambda} + \frac{504}{6+\lambda}.
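
The summation formula is easy to evaluate exactly with rational arithmetic, which makes it convenient for checking expanded expressions term by term. The sketch below reads the left-hand side of the formula as the product of λ and L_r; the function names are illustrative, and λ is assumed to be rational (or an exact decimal string) so that `fractions.Fraction` can represent it.

```python
from fractions import Fraction
from math import comb


def lambda_times_l_moment(r, lam):
    """Return lambda * L_r from the summation formula, as an exact Fraction."""
    lam = Fraction(lam)
    total = Fraction(0)
    for k in range(r):
        sign = (-1) ** (r - k - 1)
        weight = comb(r - 1, k) * comb(r + k - 1, k)
        # The two fractions in the bracket share a denominator, so the bracket
        # equals 2 / (k + 1 + lambda) for even r and 0 for odd r.
        bracket = (1 + (-1) ** r) / (k + 1 + lam)
        total += sign * weight * bracket
    return total


def l_moment(r, lam):
    """Return L_r for lambda != 0 by dividing the sum by lambda."""
    lam = Fraction(lam)
    if lam == 0:
        raise ValueError("divide-by-lambda form is undefined at lambda = 0")
    return lambda_times_l_moment(r, lam) / lam
```

As a sanity check, λ = 1 corresponds to the uniform distribution on (−1, 1), whose second L-moment is (b − a)/6 = 1/3 and whose higher L-moments vanish; `l_moment(2, 1)` and `l_moment(4, 1)` can be compared against those values.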

Comments

The Tukey lambda distribution is actually a family of distributions that can approximate a number of common distributions. For example,

λ = −1 approx. Cauchy C(0,π)
λ = 0 exactly logistic
λ = 0.14 approx. normal N(0, 2.142)
λ = 0.5 strictly concave (∩-shaped)
λ = 1 exactly uniform U(−1, 1)
λ = 2 exactly uniform U(−½, ½)
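
The exact cases in this list can be verified by substituting into the quantile function. The short derivation below is a worked check rather than part of the source: 2p − 1 and p − ½ are the quantile functions of U(−1, 1) and U(−½, ½), and the λ → 0 limit recovers the logistic quantile function.

```latex
\begin{aligned}
\lambda = 1:&\quad Q(p;1) = p - (1-p) = 2p - 1, \\
\lambda = 2:&\quad Q(p;2) = \tfrac{1}{2}\left[p^2 - (1-p)^2\right] = p - \tfrac{1}{2}, \\
\lambda \to 0:&\quad \frac{p^\lambda - (1-p)^\lambda}{\lambda} \longrightarrow \log p - \log(1-p) = \log\frac{p}{1-p}.
\end{aligned}
```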

The most common use of this distribution is to generate a Tukey lambda probability plot correlation coefficient (PPCC) plot of a data set, which shows the correlation between the ordered data and the corresponding Tukey lambda quantiles as a function of λ. Based on the PPCC plot, an appropriate model for the data is suggested. For example, if the maximum correlation occurs for a value of λ at or near 0.14, then the data can reasonably be modeled with a normal distribution. Values of λ below 0.14 imply a heavier-tailed distribution: as the optimal value of λ goes from 0.14 to −1 (approximately Cauchy), increasingly heavy tails are implied. Similarly, as the optimal value of λ increases above 0.14, shorter tails are implied.

Since the Tukey lambda distribution is a symmetric distribution, the use of the Tukey lambda PPCC plot to determine a reasonable distribution to model the data only applies to symmetric distributions. A histogram of the data should provide evidence as to whether the data can be reasonably modeled with a symmetric distribution.[4]
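
The PPCC procedure described above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not a reference implementation: the plotting positions use Filliben's approximation to the uniform order statistic medians, the candidate values of λ are supplied by the caller as a grid, and all function names are hypothetical.

```python
import math


def tukey_quantile(p, lam):
    """Standard Tukey lambda quantile function Q(p; lambda), 0 < p < 1."""
    if lam == 0:
        return math.log(p / (1.0 - p))
    return (p ** lam - (1.0 - p) ** lam) / lam


def plotting_positions(n):
    """Filliben's approximation to the uniform order statistic medians (n >= 2)."""
    m = [(i - 0.3175) / (n + 0.365) for i in range(1, n + 1)]
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ppcc(data, lam):
    """Correlation between the ordered data and Tukey lambda quantiles."""
    ordered = sorted(data)
    theoretical = [tukey_quantile(p, lam) for p in plotting_positions(len(ordered))]
    return pearson(ordered, theoretical)


def best_lambda(data, grid):
    """Return the grid value of lambda with the highest PPCC."""
    return max(grid, key=lambda lam: ppcc(data, lam))
```

Plotting `ppcc(data, lam)` against a fine grid of λ values gives the PPCC plot itself; `best_lambda` only reports where its maximum falls on the chosen grid. Because both λ = 1 and λ = 2 are exactly uniform, uniformly distributed data will score almost equally well at those two values.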

References

  1. Vasicek, Oldrich (1976), "A Test for Normality Based on Sample Entropy", Journal of the Royal Statistical Society, Series B 38 (1): 54–59.
  2. Shaw, W. T.; McCabe, J. (2009), "Monte Carlo sampling given a Characteristic Function: Quantile Mechanics in Momentum Space", arXiv:0903.1592
  3. Karvanen, Juha; Nuutinen, Arto (2008), "Characterizing the generalized lambda distribution by L-moments", Computational Statistics & Data Analysis 52: 1971–1983, doi:10.1016/j.csda.2007.06.021
  4. Joiner, Brian L.; Rosenblatt, Joan R. (1971), "Some Properties of the Range in Samples from Tukey's Symmetric Lambda Distributions", Journal of the American Statistical Association 66 (334): 394–399, doi:10.2307/2283943, JSTOR 2283943

External links

 This article incorporates public domain material from websites or documents of the National Institute of Standards and Technology.

This article is issued from Wikipedia - version of the Friday, April 22, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.