Wald test

The Wald test is a parametric statistical test named after the Hungarian statistician Abraham Wald. Whenever a relationship within or between data items can be expressed as a statistical model with parameters to be estimated from a sample, the Wald test can be used to test the true value of the parameter based on the sample estimate.

Suppose an economist, who has data on social class and shoe size, wonders whether social class is associated with shoe size. Say $\theta$ is the average increase in shoe size for upper-class people compared to middle-class people: then the Wald test can be used to test whether $\theta$ is 0 (in which case social class has no association with shoe size) or non-zero (shoe size varies between social classes). Here, $\theta$, the hypothetical difference in shoe sizes between upper- and middle-class people in the whole population, is a parameter. An estimate of $\theta$ might be the difference in shoe size between upper- and middle-class people in the sample. In the Wald test, the economist uses the estimate and an estimate of variability (see below) to draw conclusions about the unobserved true $\theta$. Or, for a medical example, suppose smoking multiplies the risk of lung cancer by some number R: then the Wald test can be used to test whether R = 1 (i.e. there is no effect of smoking) or is greater (or less) than 1 (i.e. smoking alters risk).

A Wald test can be used in a great variety of different models including models for dichotomous variables and models for continuous variables.^[1]

Mathematical details

Under the Wald test, the maximum likelihood estimate $\hat\theta$ of the parameter(s) of interest $\theta$ is compared with the proposed value $\theta_0$, with the assumption that the difference between the two will be approximately normally distributed. Typically the square of the difference is compared to a chi-squared distribution.

Test on a single parameter

In the univariate case, the Wald statistic is

\frac{ ( \widehat{ \theta}-\theta_0 )^2 }{\operatorname{var}(\hat \theta )}

which is compared against a chi-squared distribution with one degree of freedom.

Alternatively, the difference can be compared to a normal distribution. In this case the test statistic is

\frac{\widehat{\theta}-\theta_0}{\operatorname{se}(\hat\theta)}

where $\operatorname{se}(\widehat\theta)$ is the standard error of the maximum likelihood estimate (MLE). A reasonable estimate of the standard error for the MLE is given by $\frac{1}{\sqrt{I_n(\hat\theta)}}$, where $I_n$ is the Fisher information of the parameter, evaluated at the MLE.
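As an illustration of the single-parameter case, the following is a minimal sketch for one specific model that is not taken from the text above: n independent Bernoulli trials with success probability p, where the MLE is the sample proportion and the Fisher information is n / (p(1 - p)). The function name and the sample counts are hypothetical, and the sketch assumes the observed proportion is strictly between 0 and 1.

```python
import math

def wald_test_proportion(successes, n, p0):
    """Wald test of H0: p = p0 for a Bernoulli success probability.

    Assumes 0 < successes < n so that the estimated variance is positive.
    """
    p_hat = successes / n                       # MLE of p
    fisher_info = n / (p_hat * (1.0 - p_hat))   # I_n evaluated at the MLE
    se = 1.0 / math.sqrt(fisher_info)           # se = 1 / sqrt(I_n(MLE))
    z = (p_hat - p0) / se                       # normal form of the statistic
    w = z * z                                   # chi-squared form (1 degree of freedom)
    # Two-sided normal tail probability; equals the upper tail of chi-squared(1) at w
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, w, p_value

# Hypothetical data: 60 successes in 100 trials, testing p = 0.5
z, w, p_value = wald_test_proportion(successes=60, n=100, p0=0.5)
print(f"z = {z:.3f}, W = {w:.3f}, p-value = {p_value:.4f}")
```

Both forms of the statistic lead to the same p-value here, because the square of a standard normal variable follows a chi-squared distribution with one degree of freedom.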

Test(s) on multiple parameters

The Wald test can be used to test a single hypothesis on multiple parameters, as well as to test jointly multiple hypotheses on single or multiple parameters. Let $\hat{\theta}_n$ be our sample estimator of P parameters (i.e., $\hat{\theta}_n$ is a $P \times 1$ vector), which is assumed to be asymptotically normally distributed with covariance matrix V, $\sqrt{n}(\hat{\theta}_n-\theta)\xrightarrow{\mathcal{D}} N(0, V)$. The test of Q hypotheses on the P parameters is expressed with a $Q \times P$ matrix R:

H_0: R\theta=r

H_1: R\theta\neq r

The test statistic is:

(R\hat{\theta}_n-r)^{'}[R(\hat{V}_n/n)R^{'}]^{-1}(R\hat{\theta}_n-r) \quad \xrightarrow{\mathcal{D}}\quad \Chi^2_Q

where $\hat{V}_n$ is an estimator of the covariance matrix.^[2]
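The statistic can be computed directly from its definition. The sketch below uses plain Python lists and a small Gaussian elimination instead of a linear algebra library; the function name and all numerical values are hypothetical, and R is assumed to have full row rank so that the Q x Q matrix being inverted is non-singular. The example tests Q = 2 hypotheses ($\theta_1 = \theta_2$ and $\theta_3 = 0$) on P = 3 parameters.

```python
import math

def wald_statistic(theta_hat, V_hat, n, R, r):
    """Compute (R th - r)' [R (V_hat / n) R']^{-1} (R th - r)."""
    Q, P = len(R), len(theta_hat)
    # d = R theta_hat - r
    d = [sum(R[i][j] * theta_hat[j] for j in range(P)) - r[i] for i in range(Q)]
    # M = R (V_hat / n) R'
    RV = [[sum(R[i][k] * V_hat[k][j] for k in range(P)) / n for j in range(P)]
          for i in range(Q)]
    M = [[sum(RV[i][k] * R[j][k] for k in range(P)) for j in range(Q)]
         for i in range(Q)]
    # Solve M x = d by Gaussian elimination with partial pivoting
    A = [row[:] + [d[i]] for i, row in enumerate(M)]
    for col in range(Q):
        pivot = max(range(col, Q), key=lambda i: abs(A[i][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for i in range(col + 1, Q):
            f = A[i][col] / A[col][col]
            for j in range(col, Q + 1):
                A[i][j] -= f * A[col][j]
    x = [0.0] * Q
    for i in reversed(range(Q)):
        x[i] = (A[i][Q] - sum(A[i][j] * x[j] for j in range(i + 1, Q))) / A[i][i]
    # Quadratic form d' M^{-1} d
    return sum(d[i] * x[i] for i in range(Q))

# Hypothetical estimates for P = 3 parameters from n = 50 observations
theta_hat = [1.2, 0.8, 0.1]
V_hat = [[2.0, 0.5, 0.0],
         [0.5, 1.0, 0.0],
         [0.0, 0.0, 1.5]]
R = [[1.0, -1.0, 0.0],   # theta_1 - theta_2 = 0
     [0.0, 0.0, 1.0]]    # theta_3 = 0
r = [0.0, 0.0]

W = wald_statistic(theta_hat, V_hat, 50, R, r)
# With Q = 2 the chi-squared upper tail has the closed form exp(-W / 2)
p_value = math.exp(-W / 2.0)
print(f"W = {W:.4f}, p-value = {p_value:.4f}")
```

The closed-form p-value applies only to two degrees of freedom; for other values of Q a chi-squared survival function from a statistics library would be needed.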

Proof

Suppose $\sqrt{n}(\hat{\theta}_n-\theta)\xrightarrow{\mathcal{D}} N(0, V)$. Then, by Slutsky's theorem and the properties of the normal distribution, multiplying by R gives, under the null hypothesis $R\theta = r$:

R\sqrt{n}(\hat{\theta}_n-\theta) =\sqrt{n}(R\hat{\theta}_n-r)\xrightarrow{\mathcal{D}} N(0, RVR^{'})

Recalling that a quadratic form in a normally distributed vector, weighted by the inverse of its covariance matrix, has a chi-squared distribution:

\sqrt{n}(R\hat{\theta}_n-r)^{'}[RVR^{'}]^{-1}\sqrt{n}(R\hat{\theta}_n-r) \xrightarrow{\mathcal{D}} \Chi^2_Q

Absorbing the two factors of $\sqrt{n}$ into the covariance term finally gives:

(R\hat{\theta}_n-r)^{'}[R(V/n)R^{'}]^{-1}(R\hat{\theta}_n-r) \quad \xrightarrow{\mathcal{D}}\quad \Chi^2_Q

What if the covariance matrix is not known a priori and needs to be estimated from the data? If we have a consistent estimator $\hat{V}_n \sim \Chi^2_{n-P}$ of $V$, then by independence of the covariance estimator and the equation above, we have:

(R\hat{\theta}_n-r)^{'}[R(\hat{V}_n/n)R^{'}]^{-1}(R\hat{\theta}_n-r) \quad \xrightarrow{\mathcal{D}}\quad F(Q,n-P)

Nonlinear hypothesis

In its standard form, the Wald test is used to test linear hypotheses that can be represented by a single matrix R. A non-linear hypothesis can be written in the form:

H_0: c(\theta)=0

H_1: c(\theta)\neq 0

The test statistic becomes:

c(\hat{\theta}_n)^{'}[c^{'}(\hat{\theta}_n)(\hat{V}_n/n)c^{'}(\hat{\theta}_n)^{'}]^{-1}c(\hat{\theta}_n) \quad \xrightarrow{\mathcal{D}}\quad \Chi^2_Q

where $c^{'}(\hat{\theta}_n)$ is the derivative of c evaluated at the sample estimator. This result is obtained using the delta method, which uses a first-order approximation of the variance.
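The following minimal sketch applies this formula to a single non-linear restriction (Q = 1), where the matrix inverse reduces to a division. The restriction $c(\theta) = \theta_1\theta_2 - 1$, the function name and the numerical values are hypothetical choices made only for illustration.

```python
import math

def wald_nonlinear_scalar(c, grad_c, theta_hat, V_hat, n):
    """Wald statistic for one non-linear restriction c(theta) = 0."""
    value = c(theta_hat)        # c evaluated at the estimate
    g = grad_c(theta_hat)       # derivative c'(theta_hat), a 1 x P row
    P = len(theta_hat)
    # Delta-method variance: c'(th) (V_hat / n) c'(th)'
    var_c = sum(g[i] * V_hat[i][j] * g[j]
                for i in range(P) for j in range(P)) / n
    w = value * value / var_c
    # Upper tail of chi-squared with 1 degree of freedom
    p_value = math.erfc(math.sqrt(w / 2.0))
    return w, p_value

# Hypothetical restriction: theta_1 * theta_2 = 1
c = lambda th: th[0] * th[1] - 1.0
grad_c = lambda th: [th[1], th[0]]

theta_hat = [2.0, 0.6]
V_hat = [[1.0, 0.2],
         [0.2, 0.5]]

w, p_value = wald_nonlinear_scalar(c, grad_c, theta_hat, V_hat, n=100)
print(f"W = {w:.4f}, p-value = {p_value:.4f}")
```

For Q > 1 restrictions, the derivative becomes a Q x P matrix and the division is replaced by the inverse of a Q x Q matrix, as in the displayed formula.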

Non-invariance to re-parametrisations

Because it relies on an approximation of the variance, the Wald statistic is not invariant to a non-linear transformation or reparametrisation of the hypothesis: it can give different answers to the same question, depending on how the question is phrased.^[3] For example, asking whether R = 1 is the same as asking whether log R = 0; but the Wald statistic for R = 1 is not the same as the Wald statistic for log R = 0 (because there is in general no neat relationship between the standard errors of R and log R, so it needs to be approximated).
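A small numerical sketch makes the point concrete. The estimate and standard error below are hypothetical, and the standard error of log R is obtained from the delta method, se(log R) ≈ se(R) / R; other approximations would give other values.

```python
import math

r_hat = 1.5   # hypothetical estimate of R
se_r = 0.3    # hypothetical standard error of the estimate

# Wald statistic for H0: R = 1
w_ratio = ((r_hat - 1.0) / se_r) ** 2

# Wald statistic for H0: log R = 0, using the delta method for se(log R)
se_log_r = se_r / r_hat
w_log = (math.log(r_hat) / se_log_r) ** 2

# Approximate 95th percentile of chi-squared with 1 degree of freedom
critical = 3.841

print(f"W for R = 1:     {w_ratio:.3f}, reject at 5%: {w_ratio > critical}")
print(f"W for log R = 0: {w_log:.3f}, reject at 5%: {w_log > critical}")
```

With these particular numbers the two statistics fall on opposite sides of the 5% critical value, even though the two hypotheses are logically identical.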

Alternatives to the Wald test

The main alternatives to the Wald test are the likelihood-ratio test and the Lagrange multiplier test (also known as the score test). Robert F. Engle showed that these three tests (the Wald test, the likelihood-ratio test and the Lagrange multiplier test) are asymptotically equivalent.^[4] Although they are asymptotically equivalent, in finite samples they can disagree enough to lead to different conclusions.

There are several reasons to prefer the likelihood-ratio test or the Lagrange multiplier test to the Wald test:^[5]^[6]^[7]

Non-invariance: As argued above, the Wald test is not invariant to a reparametrisation, while the likelihood-ratio test gives exactly the same answer whether we work with R, log R or any other monotonic transformation of R.
Approximations: The Wald test uses two approximations (that we know the standard error, and that the distribution is chi-squared), whereas the likelihood-ratio test uses one approximation (that the distribution is chi-squared).
Estimation: The Wald test requires an estimate under the alternative hypothesis, corresponding to the full model. In some cases, the model is simpler under the null hypothesis, so that one might prefer to use the score test (also called the Lagrange multiplier test), which has the advantage that it can be formulated in situations where the variability is difficult to estimate; e.g. the Cochran–Mantel–Haenszel test is a score test.^[8]
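To show how the three tests can differ in a finite sample, the sketch below computes all of them for the same hypothetical data under a Bernoulli model, testing H0: p = p0. The model, the function name and the counts are assumptions made for illustration; the formulas are the standard ones for a binomial proportion, and 0 < successes < n is assumed.

```python
import math

def three_tests_proportion(successes, n, p0):
    """Wald, score and likelihood-ratio statistics for H0: p = p0."""
    p_hat = successes / n
    diff_sq = (p_hat - p0) ** 2
    # Wald: variance evaluated at the unrestricted estimate p_hat
    wald = diff_sq / (p_hat * (1.0 - p_hat) / n)
    # Score: variance evaluated at the null value p0
    score = diff_sq / (p0 * (1.0 - p0) / n)
    # Likelihood ratio: twice the log-likelihood difference
    lr = 2.0 * (successes * math.log(p_hat / p0)
                + (n - successes) * math.log((1.0 - p_hat) / (1.0 - p0)))
    return wald, score, lr

def chi2_1_upper_tail(x):
    """Upper tail probability of chi-squared with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical data: 60 successes in 100 trials, testing p = 0.5
wald, score, lr = three_tests_proportion(successes=60, n=100, p0=0.5)
for name, stat in (("Wald", wald), ("Score", score), ("LR", lr)):
    print(f"{name:5s} statistic = {stat:.4f}, p-value = {chi2_1_upper_tail(stat):.4f}")
```

All three statistics are referred to the same chi-squared distribution with one degree of freedom, but they differ in where the variance or likelihood is evaluated, which is why their values do not coincide in a finite sample.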

References

[1] Harrell, Frank E., Jr. (2001). "Sections 9.2, 10.5". Regression modeling strategies. New York: Springer-Verlag. ISBN 0387952322.
[2] Harrell, Frank E., Jr. (2001). "Section 9.3.1". Regression modeling strategies. New York: Springer-Verlag. ISBN 0387952322.
[3] Fears, Thomas R.; Benichou, Jacques; Gail, Mitchell H. (1996). "A reminder of the fallibility of the Wald statistic". The American Statistician 50 (3): 226–227. doi:10.1080/00031305.1996.10474384.
[4] Engle, Robert F. (1983). "Wald, Likelihood Ratio, and Lagrange Multiplier Tests in Econometrics". In Intriligator, M. D.; Griliches, Z. (eds.). Handbook of Econometrics II. Elsevier. pp. 796–801. ISBN 978-0-444-86185-6.
[5] Harrell, Frank E., Jr. (2001). "Section 9.3.3". Regression modeling strategies. New York: Springer-Verlag. ISBN 0387952322.
[6] Collett, David (1994). Modelling Survival Data in Medical Research. London: Chapman & Hall. ISBN 0412448807.
[7] Pawitan, Yudi (2001). In All Likelihood. New York: Oxford University Press. ISBN 0198507658.
[8] Agresti, Alan (2002). Categorical Data Analysis (2nd ed.). Wiley. p. 232. ISBN 0471360937.


This article is issued from Wikipedia - version of the Friday, April 29, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.