Vector autoregression

Vector autoregression (VAR) is an econometric model used to capture the linear interdependencies among multiple time series. VAR models generalize the univariate autoregressive model (AR model) by allowing for more than one evolving variable. All variables in a VAR are treated symmetrically in a structural sense (although the estimated quantitative response coefficients will not in general be the same); each variable has an equation explaining its evolution based on its own lags and the lags of the other model variables. VAR modeling does not require as much knowledge about the forces influencing a variable as do structural models with simultaneous equations: the only prior knowledge required is a list of variables that can be hypothesized to affect each other intertemporally.

Specification

Definition

A VAR model describes the evolution of a set of k variables (called endogenous variables) over the same sample period (t = 1, ..., T) as a linear function of only their past values. The variables are collected in a k × 1 vector y_t, whose i^th element, y_i,t, is the observation at time t of the i^th variable. For example, if the i^th variable is GDP, then y_i,t is the value of GDP at time t.

A p-th order VAR, denoted VAR(p), is

y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t, \,

where the l-periods back observation y_t−l is called the l-th lag of y, c is a k × 1 vector of constants (intercepts), A_i is a time-invariant k × k matrix and e_t is a k × 1 vector of error terms satisfying

1. $\mathrm{E}(e_t) = 0\,$: every error term has mean zero;
2. $\mathrm{E}(e_t e_t') = \Omega\,$: the contemporaneous covariance matrix of the error terms is Ω (a k × k positive-semidefinite matrix);
3. $\mathrm{E}(e_t e_{t-j}') = 0\,$ for any non-zero j: there is no correlation across time; in particular, no serial correlation in individual error terms.^[1]

A p-th order VAR is also called a VAR with p lags. Choosing the maximum lag p requires special attention because inference depends on the correctness of the selected lag order.^[2]^[3]

Order of integration of the variables

All variables have to be of the same order of integration. The following cases are distinguished:

- All the variables are I(0) (stationary): this is the standard case, i.e. a VAR in levels.
- All the variables are I(d) (non-stationary) with d > 0:
  - The variables are cointegrated: the error correction term has to be included in the VAR. The model becomes a vector error correction model (VECM), which can be seen as a restricted VAR.
  - The variables are not cointegrated: the variables first have to be differenced d times, and one has a VAR in differences.
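
For the non-cointegrated case, the differencing step might be sketched as follows in Python with NumPy. The simulated random walks and the names `shocks`, `levels` and `differenced` are illustrative assumptions; in practice d would be chosen with unit-root tests, which this sketch does not perform.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed example: two independent random walks, i.e. I(1) series (d = 1).
shocks = rng.normal(size=(100, 2))
levels = np.cumsum(shocks, axis=0)

d = 1
# Difference every column d times along the time axis; d observations are lost.
differenced = np.diff(levels, n=d, axis=0)
```

A VAR would then be fitted to `differenced` rather than to `levels`.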

Concise matrix notation

One can stack the vectors in order to write a VAR(p) with a concise matrix notation:

Y=BZ +U \,

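Here Y = [y_1, ..., y_T] is the k × T matrix of observations, B = [c, A_1, ..., A_p] is the k × (kp + 1) matrix of coefficients, Z is the (kp + 1) × T matrix whose t-th column stacks a 1 and the lagged vectors y_{t-1}, ..., y_{t-p}, and U = [e_1, ..., e_T] is the k × T matrix of error terms. The lags needed for the first columns of Z are taken from p pre-sample observations. The matrices are described in detail on a separate page.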
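
The stacking can be illustrated with a short NumPy sketch. It assumes the common convention in which the first p rows of the data serve as pre-sample values; the helper name `stack_var` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def stack_var(data, p):
    """Return (Y, Z) for a VAR(p) with intercept.

    data: array of shape (n_obs, k), rows ordered in time.
    The first p rows are used as pre-sample values, so T = n_obs - p.
    """
    n_obs, k = data.shape
    T = n_obs - p
    Y = data[p:].T                                               # k x T
    lags = [data[p - l:n_obs - l].T for l in range(1, p + 1)]    # each k x T
    Z = np.vstack([np.ones((1, T))] + lags)                      # (kp + 1) x T
    return Y, Z

# Simulate a small VAR(2) with assumed coefficients.
rng = np.random.default_rng(2)
k, p, n_obs = 2, 2, 12
c = np.array([0.1, -0.1])
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
e = rng.normal(size=(n_obs, k))
data = np.zeros((n_obs, k))
for t in range(p, n_obs):
    data[t] = c + A1 @ data[t - 1] + A2 @ data[t - 2] + e[t]

Y, Z = stack_var(data, p)
B = np.hstack([c[:, None], A1, A2])    # k x (kp + 1)
U = e[p:].T                            # k x T
```

With these definitions, `Y` equals `B @ Z + U` up to floating-point rounding.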

Example

For a general example of a VAR(p) with k variables, see General matrix notation of a VAR(p).

A VAR(1) in two variables can be written in matrix form (more compact notation) as

\begin{bmatrix}y_{1,t} \\ y_{2,t}\end{bmatrix} = \begin{bmatrix}c_{1} \\ c_{2}\end{bmatrix} + \begin{bmatrix}A_{1,1}&A_{1,2} \\ A_{2,1}&A_{2,2}\end{bmatrix}\begin{bmatrix}y_{1,t-1} \\ y_{2,t-1}\end{bmatrix} + \begin{bmatrix}e_{1,t} \\ e_{2,t}\end{bmatrix},

(in which only a single A matrix appears because this example has a maximum lag p equal to 1), or, equivalently, as the following system of two equations

y_{1,t} = c_{1} + A_{1,1}y_{1,t-1} + A_{1,2}y_{2,t-1} + e_{1,t}\,

y_{2,t} = c_{2} + A_{2,1}y_{1,t-1} + A_{2,2}y_{2,t-1} + e_{2,t}.\,

Each variable in the model has one equation. The current (time t) observation of each variable depends on its own lagged values as well as on the lagged values of each other variable in the VAR.
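
As an illustration, the two-variable VAR(1) can be simulated with NumPy. The parameter values below are assumptions chosen for the sketch, not values from any data set; the last lines recompute the final observation from the two scalar equations to show that both forms agree.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) parameter values.
c = np.array([0.5, -0.2])
A = np.array([[0.6, 0.1],
              [0.2, 0.3]])

T = 200
e = rng.normal(size=(T, 2))    # error terms e_t with mean zero
y = np.zeros((T, 2))           # y[0] is the starting value

for t in range(1, T):
    # Matrix form: y_t = c + A y_{t-1} + e_t
    y[t] = c + A @ y[t - 1] + e[t]

# The same step written as the two scalar equations, for the last period.
t = T - 1
y1_t = c[0] + A[0, 0] * y[t - 1, 0] + A[0, 1] * y[t - 1, 1] + e[t, 0]
y2_t = c[1] + A[1, 0] * y[t - 1, 0] + A[1, 1] * y[t - 1, 1] + e[t, 1]
```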

Writing VAR(p) as VAR(1)

A VAR with p lags can always be equivalently rewritten as a VAR with only one lag by appropriately redefining the dependent variable. The transformation amounts to stacking the lags of the VAR(p) variable in the new VAR(1) dependent variable and appending identities to complete the number of equations.

For example, the VAR(2) model

y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + e_t

can be recast as the VAR(1) model

\begin{bmatrix}y_{t} \\ y_{t-1}\end{bmatrix} = \begin{bmatrix}c \\ 0\end{bmatrix} + \begin{bmatrix}A_{1}&A_{2} \\ I&0\end{bmatrix}\begin{bmatrix}y_{t-1} \\ y_{t-2}\end{bmatrix} + \begin{bmatrix}e_{t} \\ 0\end{bmatrix},

where I is the identity matrix.

The equivalent VAR(1) form is more convenient for analytical derivations and allows more compact statements.
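
The equivalence can be checked numerically. The following NumPy sketch, with assumed coefficient values, simulates a VAR(2) directly and again through the stacked VAR(1); the names `F`, `c_stack` and `y_stacked` are introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 2

# Assumed VAR(2) coefficients.
c = np.array([0.1, 0.2])
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])

# Stacked VAR(1): coefficient matrix [[A1, A2], [I, 0]] and intercept [c; 0].
F = np.block([[A1, A2], [np.eye(k), np.zeros((k, k))]])
c_stack = np.concatenate([c, np.zeros(k)])

T = 50
e = rng.normal(size=(T, k))

# Direct VAR(2) recursion (y[0] and y[1] are starting values).
y = np.zeros((T, k))
for t in range(2, T):
    y[t] = c + A1 @ y[t - 1] + A2 @ y[t - 2] + e[t]

# The same path through the stacked VAR(1).
state = np.concatenate([y[1], y[0]])          # [y_{t-1}; y_{t-2}] for t = 2
y_stacked = np.zeros((T, k))
y_stacked[:2] = y[:2]
for t in range(2, T):
    e_stack = np.concatenate([e[t], np.zeros(k)])
    state = c_stack + F @ state + e_stack     # now [y_t; y_{t-1}]
    y_stacked[t] = state[:k]
```

The lower block of the stacked system only carries y_{t-1} forward, which is the appended identity.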

Structural vs. reduced form

Structural VAR

A structural VAR with p lags (sometimes abbreviated SVAR) is

B_0 y_t = c_0 + B_1 y_{t-1} + B_2 y_{t-2} + \cdots + B_p y_{t-p} + \epsilon_t,

where c₀ is a k × 1 vector of constants, B_i is a k × k matrix (for every i = 0, ..., p) and ε_t is a k × 1 vector of error terms. The main diagonal terms of the B₀ matrix (the coefficients on the i^th variable in the i^th equation) are scaled to 1.

The error terms ε_t (structural shocks) satisfy the three conditions in the definition above, with the additional requirement that all off-diagonal elements of the covariance matrix $\mathrm{E}(\epsilon_t\epsilon_t') = \Sigma$ are zero. That is, the structural shocks are uncorrelated.

For example, a two variable structural VAR(1) is:

\begin{bmatrix}1&B_{0;1,2} \\ B_{0;2,1}&1\end{bmatrix}\begin{bmatrix}y_{1,t} \\ y_{2,t}\end{bmatrix} = \begin{bmatrix}c_{0;1} \\ c_{0;2}\end{bmatrix} + \begin{bmatrix}B_{1;1,1}&B_{1;1,2} \\ B_{1;2,1}&B_{1;2,2}\end{bmatrix}\begin{bmatrix}y_{1,t-1} \\ y_{2,t-1}\end{bmatrix} + \begin{bmatrix}\epsilon_{1,t} \\ \epsilon_{2,t}\end{bmatrix},

where

\Sigma = \mathrm{E}(\epsilon_t \epsilon_t') = \begin{bmatrix}\sigma_{1}^2&0 \\ 0&\sigma_{2}^2\end{bmatrix};

that is, the variances of the structural shocks are denoted $\mathrm{var}(\epsilon_i) = \sigma_i^2$ (i = 1, 2) and the covariance is $\mathrm{cov}(\epsilon_1,\epsilon_2) = 0$ .

Writing the first equation explicitly and moving y_2,t to the right-hand side, one obtains

y_{1,t} = c_{0;1} - B_{0;1,2}y_{2,t} + B_{1;1,1}y_{1,t-1} + B_{1;1,2}y_{2,t-1} + \epsilon_{1,t}\,

Note that y_2,t can have a contemporaneous effect on y_1,t if B_0;1,2 is not zero. This differs from the case in which B₀ is the identity matrix (all off-diagonal elements are zero, as in the initial definition), where y_2,t can directly affect y_1,t+1 and subsequent future values, but not y_1,t.

Because of the parameter identification problem, ordinary least squares estimation of the structural VAR would yield inconsistent parameter estimates. This problem can be overcome by rewriting the VAR in reduced form.

From an economic point of view, if the joint dynamics of a set of variables can be represented by a VAR model, then the structural form is a depiction of the underlying, "structural", economic relationships. Two features of the structural form make it the preferred candidate to represent the underlying relations:

1. Error terms are not correlated. The structural, economic shocks which drive the dynamics of the economic variables are assumed to be independent, which implies zero correlation between error terms as a desired property. This is helpful for separating out the effects of economically unrelated influences in the VAR. For instance, there is no reason why an oil price shock (as an example of a supply shock) should be related to a shift in consumers' preferences towards a style of clothing (as an example of a demand shock); therefore one would expect these factors to be statistically independent.

2. Variables can have a contemporaneous impact on other variables. This is a desirable feature especially when using low frequency data. For example, an indirect tax rate increase would not affect tax revenues the day the decision is announced, but one could find an effect in that quarter's data.

Reduced-form VAR

By premultiplying the structural VAR by the inverse of B₀

y_t = B_0^{-1}c_0 + B_0^{-1} B_1 y_{t-1} + B_0^{-1} B_2 y_{t-2} + \cdots + B_0^{-1} B_p y_{t-p} + B_0^{-1}\epsilon_t,

and denoting

B_{0}^{-1} c_0 = c,\quad B_{0}^{-1}B_i = A_{i}\text{ for }i = 1, \dots, p\text{ and }B_{0}^{-1}\epsilon_t = e_t

one obtains the pth order reduced VAR

y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + e_t

Note that in the reduced form all right-hand side variables are predetermined at time t. As there are no time t endogenous variables on the right-hand side, no variable has a direct contemporaneous effect on other variables in the model.

However, the error terms in the reduced VAR are composites of the structural shocks: e_t = B₀⁻¹ε_t. The occurrence of one structural shock ε_i,t can therefore lead to shocks in all error terms e_j,t, creating contemporaneous movement in all endogenous variables. Consequently, the covariance matrix of the reduced VAR

\Omega = \mathrm{E}(e_t e_t') = \mathrm{E} (B_0^{-1} \epsilon_t \epsilon_t' (B_0^{-1})') = B_0^{-1}\Sigma(B_0^{-1})'\,

can have non-zero off-diagonal elements, thus allowing non-zero correlation between error terms.
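
A small numerical sketch of this mapping, using NumPy and assumed structural parameters for a two-variable VAR(1), shows how a diagonal Σ turns into a non-diagonal Ω:

```python
import numpy as np

# Assumed structural parameters (illustrative values only).
B0 = np.array([[1.0, 0.5],
               [-0.3, 1.0]])       # main diagonal scaled to 1
B1 = np.array([[0.6, 0.1],
               [0.2, 0.4]])
c0 = np.array([1.0, 0.5])
Sigma = np.diag([1.0, 4.0])        # uncorrelated structural shocks

B0_inv = np.linalg.inv(B0)

# Reduced-form parameters: c = B0^{-1} c0, A1 = B0^{-1} B1.
c = B0_inv @ c0
A1 = B0_inv @ B1

# Reduced-form error covariance: Omega = B0^{-1} Sigma (B0^{-1})'.
Omega = B0_inv @ Sigma @ B0_inv.T
```

Because B₀ has non-zero off-diagonal elements here, `Omega[0, 1]` is non-zero even though `Sigma` is diagonal.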

Estimation

Estimation of the regression parameters

Starting from the concise matrix notation introduced above:

Y=BZ +U \,

The multivariate least squares (MLS) estimator of B is:

\hat B= YZ^{'}(ZZ^{'})^{-1}

It can be written alternatively as:

\operatorname{Vec}(\hat B) = ((ZZ^{'})^{-1} Z \otimes I_{k})\ \operatorname{Vec}(Y)

where $\otimes$ denotes the Kronecker product and Vec denotes the vectorization (column stacking) of its matrix argument.

This estimator is consistent and asymptotically efficient. It is furthermore equal to the conditional maximum likelihood estimator.^[4]

As the explanatory variables are the same in each equation, the multivariate least squares estimator is equivalent to the ordinary least squares estimator applied to each equation separately.^[5]
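
The three expressions for the estimator can be compared in a NumPy sketch. The data are simulated from an assumed VAR(1), and the first row of `data` is treated as the pre-sample value; all names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed data-generating VAR(1); the values are illustrative only.
k, p, T = 2, 1, 500
c = np.array([0.5, -0.2])
A = np.array([[0.6, 0.1],
              [0.2, 0.3]])
data = np.zeros((T + p, k))                       # first row: pre-sample value
for t in range(p, T + p):
    data[t] = c + A @ data[t - 1] + rng.normal(size=k)

Y = data[p:].T                                    # k x T
Z = np.vstack([np.ones((1, T)), data[:-1].T])     # (kp + 1) x T

# Multivariate least squares: B_hat = Y Z' (Z Z')^{-1}
ZZt_inv = np.linalg.inv(Z @ Z.T)
B_hat = Y @ Z.T @ ZZt_inv

# Ordinary least squares applied to each equation separately.
B_by_equation = np.vstack([
    np.linalg.lstsq(Z.T, Y[i], rcond=None)[0] for i in range(k)
])

# Vectorized form; Vec stacks columns, hence order="F".
vec_B_hat = np.kron(ZZt_inv @ Z, np.eye(k)) @ Y.flatten(order="F")
```

The first column of `B_hat` estimates c and the remaining columns estimate A_1. Forming the explicit inverse mirrors the formula; a linear solver is usually preferred numerically.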

Estimation of the covariance matrix of the errors

As in the standard case, the maximum likelihood (ML) estimator of the covariance matrix differs from the ordinary least squares (OLS) estimator. In this subsection, $\hat \Sigma$ denotes the estimated covariance matrix of the reduced-form errors (Ω in the definition above) and $\hat \epsilon_t$ the corresponding residuals.

ML estimator: $\hat \Sigma = \frac{1}{T} \sum_{t=1}^T \hat \epsilon_t\hat \epsilon_t'$

OLS estimator: $\hat \Sigma = \frac{1}{T-kp-1} \sum_{t=1}^T \hat \epsilon_t\hat \epsilon_t'$ for a model with a constant, k variables and p lags.

In matrix notation, the OLS estimator is:

\hat \Sigma = \frac{1}{T-kp-1} (Y-\hat{B}Z)(Y-\hat{B}Z)'.
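
The two estimators differ only in the divisor, as the following NumPy sketch shows. It again uses data simulated from an assumed VAR(1); the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Assumed VAR(1) used to generate example data.
k, p, T = 2, 1, 300
c = np.array([0.5, -0.2])
A = np.array([[0.6, 0.1],
              [0.2, 0.3]])
data = np.zeros((T + p, k))
for t in range(p, T + p):
    data[t] = c + A @ data[t - 1] + rng.normal(size=k)

Y = data[p:].T
Z = np.vstack([np.ones((1, T)), data[:-1].T])

# Least squares estimate of B, computed with a linear solve.
B_hat = np.linalg.solve(Z @ Z.T, Z @ Y.T).T

residuals = Y - B_hat @ Z                   # k x T
cross_products = residuals @ residuals.T    # sum over t of the outer products

Sigma_ml = cross_products / T                   # ML estimator
Sigma_ols = cross_products / (T - k * p - 1)    # degrees-of-freedom correction
```

For large T the two estimates are close, since their ratio T / (T − kp − 1) tends to 1.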

Estimation of the estimator's covariance matrix

The covariance matrix of the parameters can be estimated as

\widehat{\operatorname{Cov}}(\operatorname{Vec}(\hat B)) = (ZZ')^{-1} \otimes \hat \Sigma.\,
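
A sketch of this computation in NumPy follows, including standard errors arranged in the shape of the coefficient matrix. The simulated data, the choice of the OLS covariance estimator and the name `std_errors` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)

# Assumed VAR(1) used to generate example data.
k, p, T = 2, 1, 300
c = np.array([0.5, -0.2])
A = np.array([[0.6, 0.1],
              [0.2, 0.3]])
data = np.zeros((T + p, k))
for t in range(p, T + p):
    data[t] = c + A @ data[t - 1] + rng.normal(size=k)

Y = data[p:].T
Z = np.vstack([np.ones((1, T)), data[:-1].T])

ZZt_inv = np.linalg.inv(Z @ Z.T)
B_hat = Y @ Z.T @ ZZt_inv
residuals = Y - B_hat @ Z
Sigma_hat = residuals @ residuals.T / (T - k * p - 1)

# Estimated covariance of Vec(B_hat): (Z Z')^{-1} kron Sigma_hat.
cov_vec_B = np.kron(ZZt_inv, Sigma_hat)

# Standard errors, reshaped column by column to match B_hat.
std_errors = np.sqrt(np.diag(cov_vec_B)).reshape(B_hat.shape, order="F")
```

Entry (i, j) of `std_errors` is the estimated standard error of entry (i, j) of `B_hat`.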

Interpretation of estimated model

Main article: Variance decomposition of forecast errors

The properties of an estimated VAR model are usually summarized through structural analysis: Granger causality tests, impulse responses, and forecast error variance decompositions.

Forecasting using an estimated VAR model

Main articles: Autoregressive model § n-step-ahead forecasting and Autoregressive model § Evaluating the quality of forecasts

An estimated VAR model can be used for forecasting, and the quality of the forecasts can be judged, in ways that are completely analogous to the methods used in univariate autoregressive modelling.
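
As in the univariate case, multi-step forecasts can be obtained by iterating the estimated equation forward with future error terms set to zero. The NumPy sketch below does this for a VAR(1); the coefficient values stand in for estimates and, like the function name `forecast`, are assumptions made for illustration.

```python
import numpy as np

# Assumed (illustrative) estimates of a VAR(1).
c_hat = np.array([0.5, -0.2])
A_hat = np.array([[0.6, 0.1],
                  [0.2, 0.3]])
y_last = np.array([1.0, 0.0])      # last observed vector y_T

def forecast(c, A, y_T, steps):
    """Iterate y_hat_{T+h} = c + A y_hat_{T+h-1}, with future errors set to zero."""
    path = []
    y_prev = y_T
    for _ in range(steps):
        y_prev = c + A @ y_prev
        path.append(y_prev)
    return np.array(path)

forecasts = forecast(c_hat, A_hat, y_last, steps=4)   # rows: h = 1, ..., 4
```

For a VAR(p) with p > 1, the same loop can be applied to the equivalent VAR(1) form.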

Applications

Christopher Sims advocated VAR models, criticizing the claims and performance of earlier modeling in macroeconomic econometrics.^[6] He recommended VAR models, which had previously appeared in time series statistics and in system identification, a statistical specialty in control theory. Sims advocated VAR models as providing a theory-free method to estimate economic relationships, thus being an alternative to the "incredible identification restrictions" in structural models.^[6]

Software

R: the vars package provides VAR models.^[7]
Python: PyFlux has support for VARs and Bayesian VARs.
SAS: VARMAX
Stata: "var"
EViews: "VAR"
Gretl: "var"
RATS (Regression Analysis of Time Series)

Notes

1. For multivariate tests for autocorrelation in VAR models, see Hatemi-J, A. (2004). "Multivariate tests for autocorrelation in the stable and unstable VAR models". Economic Modelling 21 (4): 661–683. doi:10.1016/j.econmod.2003.09.005.
2. Hacker, R. S.; Hatemi-J, A. (2008). "Optimal lag-length choice in stable and unstable VAR models under situations of homoscedasticity and ARCH". Journal of Applied Statistics 35 (6): 601–615. doi:10.1080/02664760801920473.
3. Hatemi-J, A.; Hacker, R. S. (2009). "Can the LR test be helpful in choosing the optimal lag order in the VAR model when information criteria suggest different lag orders?". Applied Economics 41 (9): 1489–1500.
4. Hamilton, James D. (1994). Time Series Analysis. Princeton University Press. p. 293.
5. Zellner, Arnold (1962). "An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests for Aggregation Bias". Journal of the American Statistical Association 57 (298): 348–368. doi:10.1080/01621459.1962.10480664.
6. Sims, Christopher (1980). "Macroeconomics and Reality". Econometrica 48 (1): 1–48. JSTOR 1912017.
7. Pfaff, Bernhard. "VAR, SVAR and SVEC Models: Implementation Within R Package vars".

Further reading

Asteriou, Dimitrios; Hall, Stephen G. (2011). "Vector Autoregressive (VAR) Models and Causality Tests". Applied Econometrics (Second ed.). London: Palgrave MacMillan. pp. 319–333.
Enders, Walter (2010). Applied Econometric Time Series (Third ed.). New York: John Wiley & Sons. pp. 272–355. ISBN 978-0-470-50539-7.
Favero, Carlo A. (2001). Applied Macroeconometrics. New York: Oxford University Press. pp. 162–213. ISBN 0-19-829685-1.
Lütkepohl, Helmut (2005). New Introduction to Multiple Time Series Analysis. Berlin: Springer. ISBN 3-540-40172-5.
Qin, Duo (2011). "Rise of VAR Modelling Approach". Journal of Economic Surveys 25 (1): 156–174. doi:10.1111/j.1467-6419.2010.00637.x.

This article is issued from Wikipedia - version of the Sunday, April 10, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.
