Average absolute deviation

The average absolute deviation (or mean absolute deviation) of a data set is the average of the absolute deviations from a central point. It is a summary statistic of statistical dispersion or variability. In this general form, the central point can be the mean, median, mode, or the result of another measure of central tendency. Furthermore, the operation used to average the deviations may itself be the mean or the median. Combining these choices gives at least four types of average absolute deviation.

Measures of dispersion

Several measures of statistical dispersion are defined in terms of the absolute deviation. The term "average absolute deviation" does not uniquely identify a measure of statistical dispersion, because there are several ways to average the absolute deviations and several measures of central tendency around which the deviations can be taken. To identify a measure uniquely, it is therefore necessary to specify both the averaging operation and the measure of central tendency. Unfortunately, the statistical literature has not adopted a standard notation: both the mean absolute deviation around the mean and the median absolute deviation around the median have been denoted by their initials "MAD", which may lead to confusion, since their values can differ considerably.
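As a minimal sketch of how the two choices combine, the following Python snippet takes both the central point and the averaging operation as parameters. The function name average_absolute_deviation and the sample data are assumptions made for illustration; they are not notation from the literature.

```python
from statistics import mean, median

def average_absolute_deviation(data, center=mean, average=mean):
    """Apply `average` to the absolute deviations of `data` from `center(data)`."""
    c = center(data)
    return average([abs(x - c) for x in data])

data = [2, 2, 3, 4, 14]  # illustrative data set

# The four combinations of {mean, median} averaging around {mean, median}.
combos = {
    "mean around mean": average_absolute_deviation(data, mean, mean),
    "mean around median": average_absolute_deviation(data, median, mean),
    "median around mean": average_absolute_deviation(data, mean, median),
    "median around median": average_absolute_deviation(data, median, median),
}
for name, value in combos.items():
    print(f"{name}: {value}")
```

For this data set the four combinations give four different values, which is why naming both choices matters.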

Mean absolute deviation around a central point

For arbitrary differences (not around a central point), see Mean absolute difference.

The mean absolute deviation of a set {x₁, x₂, ..., x_n} is

\frac{1}{n}\sum_{i=1}^n |x_i-m(X)|.

The choice of measure of central tendency, $m(X)$, has a marked effect on the value of the mean deviation. For example, for the data set {2, 2, 3, 4, 14}:

Measure of central tendency $m(X)$	Mean absolute deviation
Mean = 5	$\frac{|2 - 5| + |2 - 5| + |3 - 5| + |4 - 5| + |14 - 5|}{5} = 3.6$
Median = 3	$\frac{|2 - 3| + |2 - 3| + |3 - 3| + |4 - 3| + |14 - 3|}{5} = 2.8$
Mode = 2	$\frac{|2 - 2| + |2 - 2| + |3 - 2| + |4 - 2| + |14 - 2|}{5} = 3.0$

The mean absolute deviation from the median is less than or equal to the mean absolute deviation from the mean. In fact, the mean absolute deviation from the median is always less than or equal to the mean absolute deviation from any other fixed number.
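One way to see why the median is the minimizer is the following sketch for a finite sample. The function name f is introduced here only for illustration. Write

f(c) = \frac{1}{n}\sum_{i=1}^n |x_i - c|.

Away from the data points, f is differentiable with

f'(c) = \frac{1}{n}\left(\#\{i : x_i < c\} - \#\{i : x_i > c\}\right),

so f decreases while fewer than half of the points lie below c and increases once more than half do. Since f is continuous and piecewise linear, it is smallest where the two counts balance, which is at a median.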

The mean absolute deviation from the mean is less than or equal to the standard deviation; one way of proving this relies on Jensen's inequality.

Proof

Jensen's inequality is

\varphi\left(\mathbb{E}[X]\right) \leq \mathbb{E}\left[\varphi(X)\right],

where φ is a convex function. Applying it with $\varphi(y) = y^2$ to the random variable $|X - \mu|$, where $\mu = \mathbb{E}[X]$, gives

\left(\mathbb{E}|X - \mu|\right)^{2} \leq \mathbb{E}\left(|X - \mu|^2\right),

that is,

\left(\mathbb{E}|X - \mu|\right)^{2} \leq \operatorname{Var}(X).

Since both sides are nonnegative, and the square root is a monotonically increasing function on the nonnegative reals:

\mathbb{E}|X - \mu| \leq \sqrt{\operatorname{Var}(X)}.

For a general case of this statement, see Hölder's inequality.
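The inequality can be checked numerically on a small example. The sketch below uses an illustrative data set (an assumption, chosen to match the example {2, 2, 3, 4, 14}) and the population standard deviation, which divides by n and so corresponds to the variance used in the proof.

```python
from statistics import mean, pstdev

data = [2, 2, 3, 4, 14]  # illustrative data set

mu = mean(data)
# Mean absolute deviation around the mean.
mad = mean([abs(x - mu) for x in data])
# Population standard deviation (divides by n, not n - 1).
sd = pstdev(data)

print(f"MAD = {mad}, SD = {sd:.4f}, MAD <= SD: {mad <= sd}")
```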

For the normal distribution, the ratio of mean absolute deviation to standard deviation is $\sqrt{2/\pi} = 0.79788456\ldots$. Thus, if X is a normally distributed random variable with expected value 0, then (see Geary, 1935):^[1]

w=\frac{ E|X| }{ \sqrt{E(X^2)} } = \sqrt{\frac{2}{\pi}}.

In other words, for a normal distribution, the mean absolute deviation is about 0.8 times the standard deviation. However, in-sample measurements of the ratio of mean absolute deviation to standard deviation for a Gaussian sample of size n take values within the bounds $w_n \in [0,1]$, with a bias for small n.^[2]
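The limiting ratio can be illustrated by simulation. The following is a rough sketch, not a test of normality. The sample size and seed are arbitrary assumptions, and the printed ratio only approximates $\sqrt{2/\pi}$ up to sampling error.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 200_000             # arbitrary, large enough for a stable estimate
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]

m = sum(xs) / n
# Mean absolute deviation around the sample mean.
mad = sum(abs(x - m) for x in xs) / n
# Standard deviation with divisor n.
sd = math.sqrt(sum((x - m) ** 2 for x in xs) / n)

ratio = mad / sd
print(f"sample ratio = {ratio:.5f}, sqrt(2/pi) = {math.sqrt(2 / math.pi):.5f}")
```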

Mean absolute deviation around the mean

The mean absolute deviation (MAD), also referred to as the "mean deviation" or sometimes "average absolute deviation", is the mean of the data's absolute deviations around the data's mean: the average (absolute) distance from the mean. "Average absolute deviation" can refer to either this usage, or to the general form with respect to a specified central point (see above).

It has been proposed that the MAD be used in place of the standard deviation, on the grounds that it corresponds better to real life.^[3] Because the MAD is a simpler measure of variability than the standard deviation, it can be used as a pedagogical tool to help motivate the standard deviation.^[4]^[5]

As a measure of forecast accuracy, the MAD is closely related to the mean squared error (MSE), which is the average squared error of the forecasts. Although the two measures are closely related, MAD is more commonly used because it is both easier to compute (avoiding the need for squaring)^[6] and easier to understand.^[7]

Mean absolute deviation around the median

Mean absolute deviation around the median (MAD median) offers a direct measure of the scale of a random variable around its median:

D_\text{med} = E|X-\text{median}|

For the normal distribution we have $D_\text{med} = \sigma \sqrt{2/\pi}$. Since the median minimizes the average absolute distance, we have $D_\text{med} \le D_\text{mean}$, where $D_\text{mean}$ denotes the mean absolute deviation around the mean. Using the general dispersion function, Habib (2011) defined the MAD about the median as

D_\text{med} = E|X-\text{median}| = 2\operatorname{Cov}(X,\mathbf{I}_O),

where the indicator function is

\mathbf{I}_O := \begin{cases} 1 &\text{if } X > \text{median}, \\ 0 &\text{otherwise}. \end{cases}

This representation allows for obtaining MAD median correlation coefficients.^[8]
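The covariance representation can be checked on a small empirical distribution. The sketch below assumes that exactly half of the probability mass lies above the median, as it does for a continuous distribution. The data set (even size, distinct values) is chosen to satisfy that assumption, and the helper name expectation is hypothetical.

```python
from fractions import Fraction
from statistics import median

# Even-sized sample with distinct values: exactly half the points exceed the median.
xs = [Fraction(v) for v in (1, 2, 4, 9)]
med = median(xs)  # midpoint of the two middle values

indicator = [1 if x > med else 0 for x in xs]

def expectation(values):
    """Average under the empirical distribution (each point has weight 1/n)."""
    return sum(values, Fraction(0)) / len(values)

# Left-hand side: mean absolute deviation around the median.
d_med = expectation([abs(x - med) for x in xs])

# Right-hand side: twice the covariance of X with the indicator.
cov = expectation([x * i for x, i in zip(xs, indicator)]) \
    - expectation(xs) * expectation(indicator)

print(d_med, 2 * cov)
```

When the mass above the median is not exactly one half, the two sides can differ. For the data set {2, 2, 3, 4, 14}, for instance, the mean absolute deviation around the median is 2.8 while twice the empirical covariance is 3.2.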

Median absolute deviation around a central point

Median absolute deviation around the mean

In principle the mean could be taken as the central point for the median absolute deviation, but more often the median value is taken instead.

Median absolute deviation around the median

Main article: Median absolute deviation

The median absolute deviation (also MAD) is the median of the absolute deviations from the median. It is a robust estimator of dispersion.

For the example {2, 2, 3, 4, 14}, the median is 3, so the absolute deviations from the median are {1, 1, 0, 1, 11} (reordered as {0, 1, 1, 1, 11}), whose median is 1. The median absolute deviation is therefore 1, in this case unaffected by the value of the outlier 14.
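The robustness to the outlier can be made concrete with a short sketch. The function names and the second data set, in which the outlier 14 is replaced by 1400, are assumptions introduced for illustration.

```python
from statistics import mean, median

def median_abs_dev(data):
    """Median of the absolute deviations from the median."""
    med = median(data)
    return median([abs(x - med) for x in data])

def mean_abs_dev_about_median(data):
    """Mean of the absolute deviations from the median."""
    med = median(data)
    return mean([abs(x - med) for x in data])

original = [2, 2, 3, 4, 14]
extreme = [2, 2, 3, 4, 1400]  # same data with a far larger outlier

for name, data in (("original", original), ("extreme", extreme)):
    print(name, median_abs_dev(data), mean_abs_dev_about_median(data))
```

The median absolute deviation is the same for both data sets, whereas the mean absolute deviation around the median grows with the outlier.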

Maximum absolute deviation

The maximum absolute deviation around an arbitrary point is the maximum of the absolute deviations of a sample from that point. While not strictly a measure of central tendency, the maximum absolute deviation can be found using the formula for the average absolute deviation as above with $m(X)=\max(X)$, where $\max(X)$ is the sample maximum.

Minimization

The measures of statistical dispersion derived from absolute deviation characterize various measures of central tendency as minimizing dispersion. The median is the measure of central tendency most associated with the absolute deviation. Some location parameters can be compared as follows:

L² norm statistics: the mean minimizes the mean squared error
L¹ norm statistics: the median minimizes the average absolute deviation
L^∞ norm statistics: the mid-range minimizes the maximum absolute deviation
trimmed L^∞ norm statistics: for example, the midhinge (average of first and third quartiles), which minimizes the median absolute deviation of the whole distribution, also minimizes the maximum absolute deviation of the distribution after the top and bottom 25% have been trimmed off
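The first three characterizations in the list above can be illustrated with a brute-force search over candidate central points. This is a sketch only. The data set and the candidate grid (steps of 0.5 from 0 to 16) are assumptions, and a grid search can only locate minimizers that lie on the grid.

```python
data = [2, 2, 3, 4, 14]                      # illustrative data set
candidates = [k / 2 for k in range(0, 33)]   # 0.0, 0.5, ..., 16.0

def mean_squared_error(c):
    return sum((x - c) ** 2 for x in data) / len(data)

def mean_abs_dev(c):
    return sum(abs(x - c) for x in data) / len(data)

def max_abs_dev(c):
    return max(abs(x - c) for x in data)

best_l2 = min(candidates, key=mean_squared_error)  # expected: the mean
best_l1 = min(candidates, key=mean_abs_dev)        # expected: the median
best_linf = min(candidates, key=max_abs_dev)       # expected: the mid-range

print(best_l2, best_l1, best_linf)
```

For this data set the mean is 5, the median is 3, and the mid-range is (2 + 14)/2 = 8, which are the three grid points the search selects.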

Estimation

The mean absolute deviation of a sample is a biased estimator of the mean absolute deviation of the population. For the absolute deviation to be an unbiased estimator, the expected value (average) of all the sample absolute deviations must equal the population absolute deviation. However, it does not. For the population {1, 2, 3}, both the population absolute deviation about the median and the population absolute deviation about the mean are 2/3. Over all samples of size 3 that can be drawn from the population, the average of the sample absolute deviations about the mean is 44/81, while the average of the sample absolute deviations about the median is 4/9. Therefore the absolute deviation is a biased estimator.
However, this argument is based on the notion of mean-unbiasedness. Each measure of location has its own form of unbiasedness (see the entry on biased estimator). The relevant form of unbiasedness here is median unbiasedness.
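The figures 44/81 and 4/9 can be reproduced by exhaustive enumeration. The sketch below assumes that "all samples of size 3" means the 27 ordered samples drawn with replacement, an interpretation that yields the stated values. The helper name abs_dev is hypothetical.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, median

population = [Fraction(v) for v in (1, 2, 3)]

def abs_dev(sample, center):
    """Mean absolute deviation of `sample` around `center(sample)`."""
    c = center(sample)
    return sum(abs(x - c) for x in sample) / len(sample)

# Assumption: ordered samples of size 3 drawn with replacement (3**3 = 27 samples).
samples = list(product(population, repeat=3))

avg_about_mean = sum(abs_dev(s, mean) for s in samples) / len(samples)
avg_about_median = sum(abs_dev(s, median) for s in samples) / len(samples)
population_dev = abs_dev(population, mean)  # equals the deviation about the median here

print(avg_about_mean, avg_about_median, population_dev)
```

Both sample averages fall below the population value of 2/3, which is the mean-bias described above.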

References

↑ Geary, R. C. (1935). The ratio of the mean deviation to the standard deviation as a test of normality. Biometrika, 27(3/4), 310–332.
↑ See also Geary's 1936 and 1947 papers: Geary, R. C. (1936). Moments of the ratio of the mean deviation to the standard deviation for normal samples. Biometrika, 28(3/4), 295–307 and Geary, R. C. (1947). Testing for normality. Biometrika, 34(3/4), 209–242.
↑ http://www.edge.org/response-detail/25401
↑ Kader, Gary (March 1999). "Means and MADS". Mathematics Teaching in the Middle School 4 (6): 398–403. Retrieved 20 February 2013.
↑ Franklin, Christine, Gary Kader, Denise Mewborn, Jerry Moreno, Roxy Peck, Mike Perry, and Richard Scheaffer (2007). Guidelines for Assessment and Instruction in Statistics Education (PDF). American Statistical Association. ISBN 978-0-9791747-1-1.
↑ Nahmias, Steven; Olsen, Tava Lennon (2015), Production and Operations Analysis (7th ed.), Waveland Press, p. 62, ISBN 9781478628248, MAD is often the preferred method of measuring the forecast error because it does not require squaring.
↑ Stadtler, Hartmut; Kilger, Christoph; Meyr, Herbert, eds. (2014), Supply Chain Management and Advanced Planning: Concepts, Models, Software, and Case Studies, Springer Texts in Business and Economics (5th ed.), Springer, p. 143, ISBN 9783642553097, the meaning of the MAD is easier to interpret.
↑ Habib, Elsayed A.E. (2011). "Correlation coefficients based on mean absolute deviation about median". International Journal of Statistics and Systems 6 (4): 413–428.


This article is issued from Wikipedia - version of the Thursday, April 07, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.