L-moment

In statistics, L-moments are a sequence of statistics used to summarize the shape of a probability distribution.^[1]^[2]^[3]^[4] They are linear combinations of order statistics (L-statistics) analogous to conventional moments, and can be used to calculate quantities analogous to standard deviation, skewness and kurtosis, termed the L-scale, L-skewness and L-kurtosis respectively (the L-mean is identical to the conventional mean). Standardised L-moments are called L-moment ratios and are analogous to standardized moments. Just as for conventional moments, a theoretical distribution has a set of population L-moments. Sample L-moments can be defined for a sample from the population, and can be used as estimators of the population L-moments.

Population L-moments

For a random variable X, the rth population L-moment is^[1]

\lambda_r = r^{-1} \sum_{k=0}^{r-1} {(-1)^k \binom{r-1}{k} \mathrm{E}X_{r-k:r}},

where $X_{k:n}$ denotes the $k$th order statistic ($k$th smallest value) in an independent sample of size $n$ from the distribution of $X$ and $\mathrm{E}$ denotes expected value. In particular, the first four population L-moments are

\lambda_1 = \mathrm{E}X

\lambda_2 = (\mathrm{E}X_{2:2} - \mathrm{E}X_{1:2})/2

\lambda_3 = (\mathrm{E}X_{3:3} - 2\mathrm{E}X_{2:3} + \mathrm{E}X_{1:3})/3

\lambda_4 = (\mathrm{E}X_{4:4} - 3\mathrm{E}X_{3:4} + 3\mathrm{E}X_{2:4} - \mathrm{E}X_{1:4})/4.

Note that the coefficients of the k-th L-moment are the same as in the k-th term of the binomial transform, as used in the k-th order finite difference (the finite analog of the derivative).
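As a worked check of the definition, the sketch below evaluates the first four population L-moments of the standard uniform distribution on (0, 1). It assumes the known closed form $\mathrm{E}X_{k:n} = k/(n+1)$ for uniform order statistics; the function names are illustrative and not taken from any library.

```python
from fractions import Fraction
from math import comb


def uniform_expected_order_stat(k, n):
    """E[X_{k:n}] for the standard uniform distribution on (0, 1)."""
    return Fraction(k, n + 1)


def population_l_moment(r, expected_order_stat):
    """r-th population L-moment, given a function returning E[X_{k:n}]."""
    total = sum(
        (-1) ** k * comb(r - 1, k) * expected_order_stat(r - k, r)
        for k in range(r)
    )
    return total / r


for r in range(1, 5):
    print(r, population_l_moment(r, uniform_expected_order_stat))
```

Exact rational arithmetic is used so that the alternating binomial sums cancel without rounding error. The first two values, 1/2 and 1/6, are the mean and L-scale of the uniform distribution on (0, 1), and the third and fourth vanish.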

The first two of these L-moments have conventional names:

\lambda_1 = \text{mean, L-mean or L-location},

\lambda_2 = \text{L-scale}.

The L-scale is equal to half the mean difference.^[5]
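The sample analogue of this identity can be checked numerically. The following is a minimal sketch, assuming the mean difference is the mean absolute difference between two distinct observations and that the data are integers (so exact fractions can be used); the function names and the data are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def half_mean_difference(sample):
    """Half the average absolute difference over all distinct pairs."""
    pairs = list(combinations(sample, 2))
    return Fraction(sum(abs(a - b) for a, b in pairs), 2 * len(pairs))


def sample_l_scale(sample):
    """L-scale estimate as a weighted sum of the order statistics."""
    x = sorted(sample)
    n = len(x)
    # Weight of the i-th order statistic (1-based): C(i-1, 1) - C(n-i, 1).
    weighted = sum(
        (comb(i - 1, 1) - comb(n - i, 1)) * x[i - 1] for i in range(1, n + 1)
    )
    return Fraction(weighted, 2 * comb(n, 2))


data = [3, 1, 4, 1, 5, 9, 2, 6]  # illustrative integer data
print(half_mean_difference(data), sample_l_scale(data))
```

Both functions return the same value for any sample, because each pair contributes its larger element with weight +1 and its smaller element with weight −1.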

Sample L-moments

The sample L-moments can be computed as the population L-moments of the sample, summing over r-element subsets of the sample $\left\{ x_1 < \cdots < x_j < \cdots < x_r \right\},$ hence averaging by dividing by the binomial coefficient:

\lambda_r = r^{-1}{\tbinom{n}{r}}^{-1} \sum_{x_1 < \cdots < x_j < \cdots < x_r} \sum_{j=1}^{r} {(-1)^{r-j} \binom{r-1}{j-1} x_j}.

Grouping these by order statistic counts the number of ways an element of an n-element sample can be the jth element of an r-element subset, and yields formulas of the form below. Direct estimators for the first four L-moments in a finite sample of n observations are:^[6]

\ell_1 = {\tbinom{n}{1}}^{-1} \sum_{i=1}^n x_{(i)}

\ell_2 = \tfrac{1}{2} {\tbinom{n}{2}}^{-1} \sum_{i=1}^n \left\{ \tbinom{i-1}{1} - \tbinom{n-i}{1} \right\} x_{(i)}

\ell_3 = \tfrac{1}{3} {\tbinom{n}{3}}^{-1} \sum_{i=1}^n \left\{ \tbinom{i-1}{2} - 2\tbinom{i-1}{1}\tbinom{n-i}{1} + \tbinom{n-i}{2} \right\} x_{(i)}

\ell_4 = \tfrac{1}{4} {\tbinom{n}{4}}^{-1} \sum_{i=1}^n \left\{ \tbinom{i-1}{3} - 3\tbinom{i-1}{2}\tbinom{n-i}{1} + 3\tbinom{i-1}{1}\tbinom{n-i}{2} - \tbinom{n-i}{3} \right\} x_{(i)}

where $x_{(i)}$ is the $i$th order statistic and $\tbinom{\cdot}{\cdot}$ is a binomial coefficient. Sample L-moments can also be defined indirectly in terms of probability weighted moments,^[1]^[7]^[8] which leads to a more efficient algorithm for their computation.^[6]^[9]
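The two routes described above (averaging over all r-element subsets, and a single weighted sum over the order statistics) can be compared on a small sample. The sketch below is illustrative only: the function names are hypothetical, the data are assumed to be integers so that exact fractions can be used, and the direct weights are written for general r by the same counting argument rather than copied from the cited paper.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def l_moment_by_subsets(sample, r):
    """Average the L-moment combination over all r-element subsets."""
    total = 0
    for subset in combinations(sorted(sample), r):
        # Each subset is in ascending order; j is the 1-based rank within it.
        total += sum(
            (-1) ** (r - j) * comb(r - 1, j - 1) * subset[j - 1]
            for j in range(1, r + 1)
        )
    return Fraction(total, r * comb(len(sample), r))


def l_moment_direct(sample, r):
    """Direct estimator: one weighted sum over the order statistics."""
    x = sorted(sample)
    n = len(x)
    total = 0
    for i in range(1, n + 1):
        # x_(i) is the (r-k)-th smallest of an r-subset when the subset holds
        # r-1-k of the i-1 smaller values and k of the n-i larger values.
        weight = sum(
            (-1) ** k * comb(r - 1, k) * comb(i - 1, r - 1 - k) * comb(n - i, k)
            for k in range(r)
        )
        total += weight * x[i - 1]
    return Fraction(total, r * comb(n, r))


data = [3, 1, 4, 1, 5, 9, 2, 6]  # illustrative integer data
for r in range(1, 5):
    print(r, l_moment_by_subsets(data, r), l_moment_direct(data, r))
```

The subset version enumerates every r-element subset and is only practical for small samples; the direct version needs one sort and one pass over the data for each r.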

L-moment ratios

A set of L-moment ratios, or scaled L-moments, is defined by

\tau_r = \lambda_r / \lambda_2, \qquad r=3,4, \dots.

The most useful of these are $\tau_3$ , called the L-skewness, and $\tau_4$ , the L-kurtosis.

L-moment ratios lie within the interval (−1, 1). Tighter bounds can be found for some specific L-moment ratios; in particular, the L-kurtosis $\tau_4$ lies in [−¼, 1), and^[1]

\tfrac{1}{4}(5\tau_3^2-1) \leq \tau_4 < 1.

A quantity analogous to the coefficient of variation, but based on L-moments, can also be defined: $\tau = \lambda_2 / \lambda_1,$ which is called the "coefficient of L-variation", or "L-CV". For a non-negative random variable, this lies in the interval (0,1)^[1] and is identical to the Gini coefficient.
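As a small illustration of these definitions, the sketch below computes the L-CV, L-skewness and L-kurtosis from a set of four L-moments and checks the bound on $\tau_4$. The input values are those of the exponential distribution with rate 1, taken here as an assumption ($\lambda_1 = 1$, $\lambda_2 = 1/2$, $\lambda_3 = 1/6$, $\lambda_4 = 1/12$); the function names are illustrative.

```python
from fractions import Fraction


def l_moment_ratios(lam1, lam2, lam3, lam4):
    """Return (L-CV, L-skewness, L-kurtosis) from the first four L-moments."""
    return lam2 / lam1, lam3 / lam2, lam4 / lam2


def satisfies_l_kurtosis_bound(tau3, tau4):
    """Check (5 * tau3**2 - 1) / 4 <= tau4 < 1."""
    return (5 * tau3 ** 2 - 1) / 4 <= tau4 < 1


# Assumed L-moments of the exponential distribution with rate 1.
lams = (Fraction(1), Fraction(1, 2), Fraction(1, 6), Fraction(1, 12))
l_cv, tau3, tau4 = l_moment_ratios(*lams)
print(l_cv, tau3, tau4, satisfies_l_kurtosis_bound(tau3, tau4))
```

For these inputs the L-CV is 1/2, and the ratios 1/3 and 1/6 satisfy the bound, since (5/9 − 1)/4 = −1/9.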

Related quantities

L-moments are statistical quantities derived from probability weighted moments^[10] (PWMs), which were defined earlier, in 1979.^[7] PWMs are used to efficiently estimate the parameters of distributions expressible in inverse form, such as the Gumbel,^[8] the Tukey, and the Wakeby distributions.
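The relation between the two families can be sketched in code. The block below assumes the usual unbiased sample PWMs, $b_r = n^{-1} \sum_i \tbinom{i-1}{r} \tbinom{n-1}{r}^{-1} x_{(i)}$, and the standard linear relations $\ell_1 = b_0$, $\ell_2 = 2b_1 - b_0$, $\ell_3 = 6b_2 - 6b_1 + b_0$, $\ell_4 = 20b_3 - 30b_2 + 12b_1 - b_0$. Neither formula is stated in this article, so treat both as assumptions; the function names are illustrative and integer data are assumed so that exact fractions can be used.

```python
from fractions import Fraction
from math import comb


def sample_pwms(sample, count=4):
    """Unbiased sample PWMs b_0 .. b_{count-1} (needs len(sample) >= count)."""
    x = sorted(sample)
    n = len(x)
    return [
        Fraction(
            sum(comb(i - 1, r) * x[i - 1] for i in range(1, n + 1)),
            n * comb(n - 1, r),
        )
        for r in range(count)
    ]


def l_moments_from_pwms(b):
    """First four sample L-moments as linear combinations of b_0 .. b_3."""
    b0, b1, b2, b3 = b
    return [
        b0,
        2 * b1 - b0,
        6 * b2 - 6 * b1 + b0,
        20 * b3 - 30 * b2 + 12 * b1 - b0,
    ]


data = [3, 1, 4, 1, 5, 9, 2, 6]  # illustrative integer data
print(l_moments_from_pwms(sample_pwms(data)))
```

Once the data are sorted, each PWM is a single pass over the sample, which is why this route is cheaper than enumerating subsets.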

Usage

There are two common ways that L-moments are used, in both cases analogously to the conventional moments:

As summary statistics for data.
To derive estimators for the parameters of probability distributions, applying the method of moments to the L-moments rather than conventional moments.

Both can also be done with conventional moments, and estimation is more commonly done using maximum likelihood methods; however, using L-moments provides a number of advantages. Specifically, L-moments are more robust than conventional moments, and existence of higher L-moments only requires that the random variable have finite mean. One disadvantage of L-moment ratios for estimation is their typically smaller sensitivity. For instance, the Laplace distribution has a kurtosis of 6 and weak exponential tails, but a larger 4th L-moment ratio than, e.g., the Student's t-distribution with 4 degrees of freedom, which has an infinite kurtosis and much heavier tails (0.2357 versus 0.2168; see the table of values for common distributions).
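A minimal sketch of the second use, the method of L-moments, for the Gumbel distribution: it equates the sample values of the first two L-moments to the population expressions $\lambda_1 = \mu + \gamma\beta$ and $\lambda_2 = \beta \log 2$ and solves for the parameters. The function names are illustrative, and the "data" are a deterministic stand-in (Gumbel quantiles at evenly spaced probabilities) rather than real observations.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler–Mascheroni constant


def sample_l1_l2(sample):
    """First two sample L-moments from the order statistics."""
    x = sorted(sample)
    n = len(x)
    l1 = sum(x) / n
    # (1/2) * C(n,2)^-1 * sum{(i-1) - (n-i)} x_(i), simplified.
    l2 = sum((2 * i - n - 1) * x[i - 1] for i in range(1, n + 1)) / (n * (n - 1))
    return l1, l2


def fit_gumbel_by_l_moments(sample):
    """Solve lambda_1 = mu + gamma*beta and lambda_2 = beta*log(2)."""
    l1, l2 = sample_l1_l2(sample)
    beta = l2 / math.log(2)
    mu = l1 - EULER_GAMMA * beta
    return mu, beta


# Stand-in data: quantiles of a Gumbel(mu=10, beta=2) at probabilities (i - 0.5)/n.
n = 2000
true_mu, true_beta = 10.0, 2.0
data = [
    true_mu - true_beta * math.log(-math.log((i - 0.5) / n))
    for i in range(1, n + 1)
]
mu_hat, beta_hat = fit_gumbel_by_l_moments(data)
print(mu_hat, beta_hat)
```

With this stand-in sample the recovered parameters should lie close to the values used to generate it. With real data the estimates carry sampling error like any other estimator.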

As an example, consider a dataset with a few data points and one outlying value. The ordinary standard deviation of this dataset is highly influenced by the outlying point, whereas the L-scale is far less sensitive to it. Consequently, L-moments are far more meaningful than conventional moments when dealing with outliers in data. However, other methods are better suited to achieving even higher robustness than simply replacing moments with L-moments. One example is the use of L-moments as summary statistics in extreme value theory (EVT). This application shows the limited robustness of L-moments: L-statistics are not resistant statistics, as a single extreme value can throw them off, but because they are only linear (not higher-order statistics), they are less affected by extreme values than conventional moments.
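The effect of a single outlying value can be illustrated on toy data. The sketch below uses made-up numbers and an illustrative helper that computes the sample L-scale as half the mean absolute difference over all pairs.

```python
from itertools import combinations
from statistics import stdev


def sample_l_scale(sample):
    """Half the mean absolute difference over all distinct pairs."""
    pairs = list(combinations(sample, 2))
    return sum(abs(a - b) for a, b in pairs) / (2 * len(pairs))


clean = [1, 2, 3, 4, 5]           # toy data
contaminated = [1, 2, 3, 4, 100]  # same data with one outlying value

sd_inflation = stdev(contaminated) / stdev(clean)
l_scale_inflation = sample_l_scale(contaminated) / sample_l_scale(clean)
print(sd_inflation, l_scale_inflation)
```

For this toy data the standard deviation grows by a factor of about 27.6 and the L-scale by a factor of 20. Both are pulled by the outlying value, the L-scale less so, which matches the point that L-moments are less affected by extreme values without being resistant.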

Another advantage L-moments have over conventional moments is that their existence only requires the random variable to have finite mean, so the L-moments exist even if the higher conventional moments do not exist (for example, for Student's t distribution with low degrees of freedom). A finite variance is required in addition in order for the standard errors of estimates of the L-moments to be finite.^[1]

Some appearances of L-moments in the statistical literature include the book by David & Nagaraja (2003, Section 9.9)^[11] and a number of papers.^[12]^[13]^[14]^[15]^[16] A number of favourable comparisons of L-moments with ordinary moments have been reported.^[17]^[18]

Values for some common distributions

The table below gives expressions for the first two L-moments and numerical values of the first two L-moment ratios of some common continuous probability distributions with constant L-moment ratios.^[1]^[5] More complex expressions have been derived for some further distributions for which the L-moment ratios vary with one or more of the distributional parameters, including the log-normal, Gamma, generalized Pareto, generalized extreme value, and generalized logistic distributions.^[1]

Distribution	Parameters	Mean, $\lambda_1$	L-scale, $\lambda_2$	L-skewness, $\tau_3$	L-kurtosis, $\tau_4$
Uniform	a, b	(a + b) / 2	(b − a) / 6	0	0
Logistic	μ, s	μ	s	0	1/6 ≈ 0.1667
Normal	μ, σ²	μ	σ / √π	0	0.1226
Laplace	μ, b	μ	3b / 4	0	1 / (3√2) ≈ 0.2357
Student's t, 2 d.f.	ν = 2	0	π / 2^(3/2) ≈ 1.111	0	3/8 = 0.375
Student's t, 4 d.f.	ν = 4	0	15π / 64 ≈ 0.7363	0	111/512 ≈ 0.2168
Exponential	λ	1 / λ	1 / (2λ)	1/3 ≈ 0.3333	1/6 ≈ 0.1667
Gumbel	μ, β	μ + γβ	β log 2	0.1699	0.1504

The notation for the parameters of each distribution is the same as that used in the linked article. In the expression for the mean of the Gumbel distribution, γ is the Euler–Mascheroni constant 0.57721… .
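The decimal values in the table can be reproduced from the closed-form expressions. In the sketch below, the first four entries evaluate expressions that appear in the table. The last three use closed forms that are not given in this article (for the normal $\tau_4$ and the Gumbel $\tau_3$ and $\tau_4$) and should be read as assumptions, checked here only against the tabulated decimals.

```python
import math

# name: (value computed from a closed form, decimal value quoted in the table)
closed_forms = {
    "Laplace tau_4": (1 / (3 * math.sqrt(2)), 0.2357),
    "Student t (2 d.f.) lambda_2": (math.pi / 2 ** 1.5, 1.111),
    "Student t (4 d.f.) lambda_2": (15 * math.pi / 64, 0.7363),
    "Student t (4 d.f.) tau_4": (111 / 512, 0.2168),
    # Assumed closed forms, not stated in the table:
    "Normal tau_4": (30 / math.pi * math.atan(math.sqrt(2)) - 9, 0.1226),
    "Gumbel tau_3": (math.log(9 / 8) / math.log(2), 0.1699),
    "Gumbel tau_4": ((16 * math.log(2) - 10 * math.log(3)) / math.log(2), 0.1504),
}

for name, (computed, tabulated) in closed_forms.items():
    print(f"{name}: computed {computed:.5f}, table {tabulated}")
```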

Extensions

Trimmed L-moments are generalizations of L-moments that give zero weight to extreme observations. They are therefore more robust to the presence of outliers, and unlike L-moments they may be well-defined for distributions for which the mean does not exist, such as the Cauchy distribution.^[19]
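A minimal sketch of the idea for the first trimmed L-moment, assuming trimming of one observation at each end, so that the population quantity is the expected median of three independent draws, $\mathrm{E}X_{2:3}$. The estimator shown simply averages the median over all 3-element subsets of the sample; the function name and data are illustrative, and this is not presented as the estimator of the cited paper.

```python
from itertools import combinations
from statistics import mean, median


def tl_mean(sample):
    """Average of the medians of all 3-element subsets (assumed TL-mean, trim 1)."""
    return mean(median(subset) for subset in combinations(sample, 3))


clean = [1, 2, 3, 4, 5]           # toy data
contaminated = [1, 2, 3, 4, 100]  # same data with one outlying value

print(mean(clean), mean(contaminated))
print(tl_mean(clean), tl_mean(contaminated))
```

The largest observation is never the median of a 3-element subset, so it receives zero weight: replacing 5 by 100 changes the ordinary mean from 3 to 22 but leaves this trimmed mean at 3.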

References

[1] Hosking, J.R.M. (1990). "L-moments: analysis and estimation of distributions using linear combinations of order statistics". Journal of the Royal Statistical Society, Series B 52: 105–124. JSTOR 2345653.
[2] Hosking, J.R.M. (1992). "Moments or L moments? An example comparing two measures of distributional shape". The American Statistician 46 (3): 186–189. doi:10.2307/2685210. JSTOR 2685210.
[3] Hosking, J.R.M. (2006). "On the characterization of distributions by their L-moments". Journal of Statistical Planning and Inference 136: 193–198. doi:10.1016/j.jspi.2004.06.004.
[4] Asquith, W.H. (2011). Distributional analysis with L-moment statistics using the R environment for statistical computing. Create Space Independent Publishing Platform [print-on-demand]. ISBN 1-463-50841-7.
[5] Jones, M.C. (2002). "Student's Simplest Distribution". Journal of the Royal Statistical Society, Series D 51 (1): 41–49. doi:10.1111/1467-9884.00297. JSTOR 3650389.
[6] Wang, Q.J. (1996). "Direct Sample Estimators of L Moments". Water Resources Research 32 (12): 3617–3619. doi:10.1029/96WR02675.
[7] Greenwood, J.A.; Landwehr, J.M.; Matalas, N.C.; Wallis, J.R. (1979). "Probability Weighted Moments: Definition and relation to parameters of several distributions expressed in inverse form". Water Resources Research 15: 1049–1054. doi:10.1029/WR015i005p01049. Retrieved 17 January 2013.
[8] Landwehr, J.M.; Matalas, N.C.; Wallis, J.R. (1979). "Probability weighted moments compared with some traditional techniques in estimating Gumbel parameters and quantiles". Water Resources Research 15: 1055–1064. doi:10.1029/WR015i005p01055. Retrieved 4 February 2013.
[9] "L Moments". NIST Dataplot documentation. 6 January 2006. Retrieved 19 January 2013.
[10] Hosking, J.R.M.; Wallis, J.R. (2005). Regional Frequency Analysis: An Approach Based on L-moments. Cambridge University Press. p. 3. ISBN 0521019400. Retrieved 22 January 2013.
[11] David, H.A.; Nagaraja, H.N. (2003). Order Statistics (3rd ed.). Wiley. ISBN 0-471-38926-9.
[12] Serfling, R.; Xiao, P. (2007). "A contribution to multivariate L-moments: L-comoment matrices". Journal of Multivariate Analysis 98 (9): 1765–1781. doi:10.1016/j.jmva.2007.01.008.
[13] Delicado, P.; Goria, M.N. (2008). "A small sample comparison of maximum likelihood, moments and L-moments methods for the asymmetric exponential power distribution". Computational Statistics & Data Analysis 52 (3): 1661–1673. doi:10.1016/j.csda.2007.05.021.
[14] Alkasasbeh, M.R.; Raqab, M.Z. (2009). "Estimation of the generalized logistic distribution parameters: comparative study". Statistical Methodology 6 (3): 262–279. doi:10.1016/j.stamet.2008.10.001.
[15] Jones, M.C. (2004). "On some expressions for variance, covariance, skewness and L-moments". Journal of Statistical Planning and Inference 126 (1): 97–106. doi:10.1016/j.jspi.2003.09.001.
[16] Jones, M.C. (2009). "Kumaraswamy's distribution: A beta-type distribution with some tractability advantages". Statistical Methodology 6 (1): 70–81. doi:10.1016/j.stamet.2008.04.001.
[17] Royston, P. (1992). "Which measures of skewness and kurtosis are best?". Statistics in Medicine 11 (3): 333–343. doi:10.1002/sim.4780110306.
[18] Ulrych, T.J.; Velis, D.R.; Woodbury, A.D.; Sacchi, M.D. (2000). "L-moments and C-moments". Stochastic Environmental Research and Risk Assessment 14 (1): 50–68. doi:10.1007/s004770050004.
[19] Elamir, Elsayed A.H.; Seheult, Allan H. (2003). "Trimmed L-moments". Computational Statistics & Data Analysis 43 (3): 299–314. doi:10.1016/S0167-9473(02)00250-5.

External links

The L-moments page Jonathan R.M. Hosking, IBM Research
L Moments. Dataplot reference manual, vol. 1, auxiliary chapter. National Institute of Standards and Technology, 2006. Accessed 2010-05-25.

This article is issued from Wikipedia - version of the Monday, August 10, 2015. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.