Wilcoxon signed-rank test

The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used to compare two related samples, matched samples, or repeated measurements on a single sample, in order to assess whether their population mean ranks differ (i.e. it is a paired difference test). It can be used as an alternative to the paired Student's t-test (also called the t-test for matched pairs or the t-test for dependent samples) when the population cannot be assumed to be normally distributed.^[1]

History

The test is named for Frank Wilcoxon (1892–1965) who, in a single paper, proposed both it and the rank-sum test for two independent samples (Wilcoxon, 1945).^[2] The test was popularized by Sidney Siegel (1956) in his influential textbook on non-parametric statistics.^[3] Siegel used the symbol $T$ for a value related to, but not the same as, $W$. As a consequence, the test is sometimes referred to as the Wilcoxon T test, and the test statistic is reported as a value of $T$.

Assumptions

Data are paired and come from the same population.
Each pair is chosen randomly and independently.
The data are measured at least on an ordinal scale (cannot be nominal).

Test procedure

Let $N$ be the sample size, i.e. the number of pairs. Thus, there are a total of $2N$ data points. For $i = 1, ..., N$, let $x_{1,i}$ and $x_{2,i}$ denote the two measurements of pair $i$.

$H_0$: the difference between the pairs follows a symmetric distribution around zero.

$H_1$: the difference between the pairs does not follow a symmetric distribution around zero.

For $i = 1, ..., N$, calculate $|x_{2,i} - x_{1,i}|$ and $\sgn(x_{2,i} - x_{1,i})$, where $\sgn$ is the sign function.
Exclude pairs with $|x_{2,i} - x_{1,i}| = 0$. Let $N_r$ be the reduced sample size.
Order the remaining $N_r$ pairs from smallest absolute difference to largest absolute difference, $|x_{2,i} - x_{1,i}|$.
Rank the pairs, starting with the smallest as 1. Ties receive a rank equal to the average of the ranks they span. Let $R_i$ denote the rank.
Calculate the test statistic $W$:
$W = \sum_{i=1}^{N_r} [\sgn(x_{2,i} - x_{1,i}) \cdot R_i]$, the sum of the signed ranks.
Under the null hypothesis, $W$ follows a specific distribution with no simple expression. This distribution has an expected value of 0 and a variance of $\frac{N_r(N_r + 1)(2N_r + 1)}{6}$.
$W$ can be compared to a critical value from a reference table.^[1]
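
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `signed_rank_statistic` is hypothetical, ties are detected by exact equality of the absolute differences (reasonable for integer data, an assumption for floating-point data), and the input is the data from the worked example in this article.

```python
def signed_rank_statistic(x1, x2):
    """Return (W, N_r, signed_ranks) for paired samples x1 and x2."""
    if len(x1) != len(x2):
        raise ValueError("x1 and x2 must have the same length")
    # Compute the differences x2 - x1 and exclude pairs with zero difference.
    diffs = [b - a for a, b in zip(x1, x2) if b != a]
    # Order the remaining pairs by absolute difference.
    diffs.sort(key=abs)
    n_r = len(diffs)
    # Rank the pairs; tied absolute differences share the average rank.
    ranks = [0.0] * n_r
    i = 0
    while i < n_r:
        j = i
        while j + 1 < n_r and abs(diffs[j + 1]) == abs(diffs[i]):
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2  # ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    # W is the sum of the signed ranks.
    signed_ranks = [r if d > 0 else -r for d, r in zip(diffs, ranks)]
    return sum(signed_ranks), n_r, signed_ranks


# Data from the worked example in this article.
x1 = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]
x2 = [125, 115, 130, 140, 140, 115, 140, 125, 140, 135]
w, n_r, signed_ranks = signed_rank_statistic(x1, x2)
print(w, n_r)  # 9.0 9
```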

The two-sided test rejects $H_0$ if $|W| \ge W_{critical, N_r}$.
As $N_r$ increases, the sampling distribution of $W$ converges to a normal distribution. Thus, for $N_r \ge 10$, a z-score can be calculated as $z = \frac{W}{\sigma_W}$, where $\sigma_W = \sqrt{\frac{N_r(N_r + 1)(2N_r + 1)}{6}}$.

If $|z| > z_{critical}$, then reject $H_0$ (two-sided test).

Alternatively, one-sided tests can be carried out with either the exact or the approximate distribution. A p-value can also be calculated.
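
The normal approximation can be sketched as follows. The function name `normal_approximation` is hypothetical, and the sketch assumes the plain formula given above, with no continuity correction and no adjustment of the variance for ties. P-values are obtained from the standard normal distribution through the complementary error function.

```python
import math


def normal_approximation(w, n_r):
    """Return (z, two-sided p, one-sided p for W > 0) for signed-rank sum w."""
    sigma_w = math.sqrt(n_r * (n_r + 1) * (2 * n_r + 1) / 6)
    z = w / sigma_w
    # P(|Z| >= |z|) for a standard normal Z.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    # P(Z >= z): one-sided alternative that the differences tend to be positive.
    p_greater = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_two_sided, p_greater


# Illustrative call with W = 9 and N_r = 9, the values from this article's
# example. N_r = 9 is below the N_r >= 10 guideline, so this is only a
# demonstration of the arithmetic.
z, p_two_sided, p_greater = normal_approximation(9, 9)
print(round(z, 3), round(p_two_sided, 3))
```

For small $N_r$, or when a decision is close to the threshold, the exact distribution or a reference table is the safer choice.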

The $T$ statistic used by Siegel is the smaller of the two sums of ranks of a given sign; in the example below, $T$ would therefore equal 3 + 4 + 5 + 6 = 18. Low values of $T$ are required for significance. As the example below shows, $T$ is easier to calculate by hand than $W$, and the test is equivalent to the two-sided test described above (the distribution of the statistic under $H_0$ has to be adjusted accordingly).
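
The link between $T$ and $W$ can be made explicit. Writing $T^+$ and $T^-$ for the sums of the ranks of the positive and negative differences (notation introduced here for illustration, not taken from the cited sources), the definitions above give:

$W = T^+ - T^-$
$T^+ + T^- = \frac{N_r(N_r + 1)}{2}$
$T = \min(T^+, T^-) = \frac{1}{2}\left(\frac{N_r(N_r + 1)}{2} - |W|\right)$

For the example below, $N_r = 9$ and $|W| = 9$, so $T = \frac{1}{2}(45 - 9) = 18$, matching the direct sum 3 + 4 + 5 + 6. A small $T$ therefore corresponds to a large $|W|$, which is why low values of $T$ indicate significance.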

Example

			$x_{2,i} - x_{1,i}$
$i$	$x_{2,i}$	$x_{1,i}$	$\sgn$	$\text{abs}$
1	125	110	1	15
2	115	122	–1	7
3	130	125	1	5
4	140	120	1	20
5	140	140		0
6	115	124	–1	9
7	140	123	1	17
8	125	137	–1	12
9	140	135	1	5
10	135	145	–1	10

Ordered by absolute difference:

			$x_{2,i} - x_{1,i}$
$i$	$x_{2,i}$	$x_{1,i}$	$\sgn$	$\text{abs}$	$R_i$	$\sgn \cdot R_i$
5	140	140		0
3	130	125	1	5	1.5	1.5
9	140	135	1	5	1.5	1.5
2	115	122	–1	7	3	–3
6	115	124	–1	9	4	–4
10	135	145	–1	10	5	–5
8	125	137	–1	12	6	–6
1	125	110	1	15	7	7
7	140	123	1	17	8	8
4	140	120	1	20	9	9

$\sgn$ is the sign function, $\text{abs}$ is the absolute value, and $R_i$ is the rank. Notice that pairs 3 and 9 are tied in absolute value. They would be ranked 1 and 2, so each gets the average of those ranks, 1.5.

$N_r = 10 - 1 = 9$, $|W| = |1.5 + 1.5 - 3 - 4 - 5 - 6 + 7 + 8 + 9| = 9$.

$|W| < W_{\alpha = 0.05,\, 9,\, \text{two-sided}} = 35 \therefore \text{fail to reject } H_0.$
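
For a sample this small, the comparison with the tabulated critical value can be cross-checked by brute force: under $H_0$ each rank is equally likely to carry a plus or a minus sign, so the exact null distribution of $W$ can be obtained by enumerating all $2^{N_r}$ sign assignments. The sketch below is illustrative only; the function name `exact_two_sided_p` is hypothetical, and the enumeration is practical only for small $N_r$.

```python
from itertools import product


def exact_two_sided_p(ranks, w_observed):
    """Exact P(|W| >= |w_observed|) under H0 by enumerating all sign patterns."""
    total = 0
    extreme = 0
    for signs in product((-1, 1), repeat=len(ranks)):
        w = sum(s * r for s, r in zip(signs, ranks))
        total += 1
        if abs(w) >= abs(w_observed):
            extreme += 1
    return extreme / total


# Untied ranks 1..9: the tabulated two-sided 5% critical value is 35.
print(exact_two_sided_p(range(1, 10), 35))  # 20/512 = 0.0390625
print(exact_two_sided_p(range(1, 10), 33))  # 28/512 = 0.0546875

# Ranks from the example (pairs 3 and 9 tied at 1.5), observed W = 9.
example_ranks = [1.5, 1.5, 3, 4, 5, 6, 7, 8, 9]
p_example = exact_two_sided_p(example_ranks, 9)
print(p_example > 0.05)  # True
```

The first two calls show that $|W| \ge 35$ has probability below 0.05 while $|W| \ge 33$ does not, which is consistent with the critical value of 35 used above. The observed $|W| = 9$ is far from that threshold.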

Effect size

To compute an effect size for the signed-rank test, one can use the rank correlation.

If the test statistic W is reported, Kerby (2014) has shown that the rank correlation r is equal to the test statistic W divided by the total rank sum S, or r = W/S.^[4] Using the above example, the test statistic is W = 9. The sample size of 9 has a total rank sum of S = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9) = 45. Hence, the rank correlation is 9/45, so r = 0.20.

If the test statistic T is reported, an equivalent way to compute the rank correlation is with the difference in proportion between the two rank sums, which is the Kerby (2014) simple difference formula.^[4] To continue with the current example, the sample size is 9, so the total rank sum is 45. T is the smaller of the two rank sums, so T is 3 + 4 + 5 + 6 = 18. From this information alone, the remaining rank sum can be computed, because it is the total sum S minus T, or in this case 45 - 18 = 27. Next, the two rank-sum proportions are 27/45 = 60% and 18/45 = 40%. Finally, the rank correlation is the difference between the two proportions (.60 minus .40), hence r = .20.
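
Both routes to the rank correlation can be written as short functions. This is a sketch under the definitions given above; the function names are hypothetical. Note that the version based on $T$ recovers only the magnitude of $r$, because $T$ does not record which sign had the smaller rank sum.

```python
def rank_correlation_from_w(w, n_r):
    """r = W / S, where S = N_r (N_r + 1) / 2 is the total rank sum."""
    total_rank_sum = n_r * (n_r + 1) / 2
    return w / total_rank_sum


def rank_correlation_from_t(t, n_r):
    """Simple difference formula: difference of the two rank-sum proportions."""
    total_rank_sum = n_r * (n_r + 1) / 2
    larger_proportion = (total_rank_sum - t) / total_rank_sum
    smaller_proportion = t / total_rank_sum
    return larger_proportion - smaller_proportion


# Values from the example: W = 9, T = 18, N_r = 9.
print(round(rank_correlation_from_w(9, 9), 2))   # 0.2
print(round(rank_correlation_from_t(18, 9), 2))  # 0.2
```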

Implementations

ALGLIB includes an implementation of the Wilcoxon signed-rank test in C++, C#, Delphi, Visual Basic, etc.
The free statistical software R includes an implementation of the test as wilcox.test(x,y, paired=TRUE), where x and y are vectors of equal length.^[5]
GNU Octave implements various one-tailed and two-tailed versions of the test in the wilcoxon_test function.
SciPy includes an implementation of the Wilcoxon signed-rank test in Python, as scipy.stats.wilcoxon.

References

1. Lowry, Richard. "Concepts & Applications of Inferential Statistics". Retrieved 24 March 2011.
2. Wilcoxon, Frank (Dec 1945). "Individual comparisons by ranking methods". Biometrics Bulletin 1 (6): 80–83.
3. Siegel, Sidney (1956). Non-parametric statistics for the behavioral sciences. New York: McGraw-Hill. pp. 75–83.
4. Kerby, D. S. (2014). "The simple difference formula: An approach to teaching nonparametric correlation". Innovative Teaching 3, article 1. doi:10.2466/11.IT.3.1.
5. Dalgaard, Peter (2008). Introductory Statistics with R. Springer Science & Business Media. pp. 99–100. ISBN 978-0-387-79053-4.



This article is issued from Wikipedia - version of the Friday, April 08, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.