Random effects model

In statistics, a random effects model, also called a variance components model, is a kind of hierarchical linear model. It assumes that the data being analysed are drawn from a hierarchy of different populations whose differences relate to that hierarchy. In econometrics, random effects models are used in the analysis of hierarchical or panel data when one assumes no fixed effects (the model still allows for individual effects). The random effects model is a special case of the mixed model. Contrast this with the biostatistics definitions,[1][2][3][4] in which "fixed" and "random" effects refer to the population-average and subject-specific effects respectively (the latter are generally assumed to be unknown, latent variables).

Qualitative description

Random effects models assist in controlling for unobserved heterogeneity when this heterogeneity is constant over time and not correlated with the independent variables. This constant can be removed from longitudinal data through differencing: taking a first difference removes any time-invariant component of the model.
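The effect of differencing can be illustrated with a minimal sketch. The helper first_difference, the slope beta and all numbers below are hypothetical and chosen only for illustration; the noise term is omitted so that the cancellation is exact.

```python
def first_difference(series):
    """Return consecutive differences y_t - y_{t-1} of a time-ordered list."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Hypothetical unit observed over four periods (all numbers are made up).
beta = 2.0        # assumed slope on the regressor
alpha_i = 10.0    # time-invariant individual effect
x = [1.0, 3.0, 2.0, 5.0]
y = [beta * x_t + alpha_i for x_t in x]   # noise omitted for clarity

dx = first_difference(x)   # [2.0, -1.0, 3.0]
dy = first_difference(y)   # [4.0, -2.0, 6.0]: alpha_i no longer appears

# Each differenced outcome equals beta times the differenced regressor.
assert all(abs(dy_t - beta * dx_t) < 1e-12 for dx_t, dy_t in zip(dx, dy))
```

Whatever value alpha_i takes, the differenced series is unchanged, which is why any time-invariant component drops out.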

Two assumptions are commonly made about the individual-specific effect: the random effects assumption and the fixed effects assumption. The random effects assumption (made in a random effects model) is that the individual-specific effects are uncorrelated with the independent variables. The fixed effects assumption is that the individual-specific effect is correlated with the independent variables. If the random effects assumption holds, the random effects model is more efficient than the fixed effects model. If it does not hold, the random effects model is not consistent.

Simple example

Suppose m large elementary schools are chosen randomly from among thousands in a large country. Suppose also that n pupils of the same age are chosen randomly at each selected school. Their scores on a standard aptitude test are ascertained. Let Y_ij be the score of the jth pupil at the ith school. A simple way to model the relationships of these quantities is

Y_{ij} = \mu + U_i + W_{ij},

where μ is the average test score for the entire population. In this model U_i is the school-specific random effect: it measures the difference between the average score at school i and the average score in the entire country, and it is "random" because the school has been randomly selected from a larger population of schools. The term W_ij is the individual-specific error, that is, the deviation of the jth pupil's score from the average for the ith school. Again this is regarded as random because of the random selection of pupils within the school, even though it is a fixed quantity for any given pupil.
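As an illustration, data from this model can be simulated as follows. This is a minimal sketch: the function name simulate_scores and the parameter values are hypothetical, and the normal distributions for U_i and W_ij are an added assumption, since the model itself only requires the two terms to be random.

```python
import random

def simulate_scores(m, n, mu, tau, sigma, seed=0):
    """Simulate Y_ij = mu + U_i + W_ij for m schools with n pupils each.

    Assumption (not part of the model above): U_i ~ Normal(0, tau^2) and
    W_ij ~ Normal(0, sigma^2), all drawn independently.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(m):
        u_i = rng.gauss(0.0, tau)   # school effect, shared by all pupils of school i
        scores.append([mu + u_i + rng.gauss(0.0, sigma) for _ in range(n)])
    return scores

# Hypothetical values: 5 schools, 4 pupils per school.
scores = simulate_scores(m=5, n=4, mu=100.0, tau=10.0, sigma=15.0)
```

Each inner list holds the scores of one school; pupils in the same school share the same draw of U_i, which is what makes their scores correlated.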

The model can be augmented by including additional explanatory variables, which would capture differences in scores among different groups. For example:

Y_{ij} = \mu + \beta_1 \mathrm{Sex}_{ij} + \beta_2 \mathrm{Race}_{ij} + \beta_3 \mathrm{ParentsEduc}_{ij} + U_i + W_{ij},

where Sex_ij is a dummy variable for boys/girls, Race_ij is a dummy variable for white/black pupils, and ParentsEduc_ij records the average education level of the child's parents. This is a mixed model, not a purely random effects model.

Variance components

Assuming U_i and W_ij are uncorrelated, the variance of Y_ij is the sum of the variances τ² and σ² of U_i and W_ij respectively.
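The additivity can be checked with a short derivation, given here as a sketch under the assumptions that μ is a constant and that U_i and W_ij are uncorrelated:

\operatorname{Var}(Y_{ij}) = \operatorname{Var}(\mu + U_i + W_{ij}) = \operatorname{Var}(U_i) + \operatorname{Var}(W_{ij}) + 2\operatorname{Cov}(U_i, W_{ij}) = \tau^2 + \sigma^2 + 0 = \tau^2 + \sigma^2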

Let

\overline{Y}_{i\bullet} = \frac{1}{n}\sum_{j=1}^n Y_{ij}

be the average, not of all scores at the ith school, but of those at the ith school that are included in the random sample. Let

\overline{Y}_{\bullet\bullet} = \frac{1}{mn}\sum_{i=1}^m\sum_{j=1}^n Y_{ij}

be the "grand average".

Let

SSW = \sum_{i=1}^m\sum_{j=1}^n (Y_{ij} - \overline{Y}_{i\bullet})^2

SSB = n\sum_{i=1}^m (\overline{Y}_{i\bullet} - \overline{Y}_{\bullet\bullet})^2

be respectively the sum of squares due to differences within groups and the sum of squares due to differences between groups. Then it can be shown that

\frac{1}{m(n - 1)}E(SSW) = \sigma^2

and

\frac{1}{(m - 1)n}E(SSB) = \frac{\sigma^2}{n} + \tau^2.

These "expected mean squares" can be used as the basis for estimation of the "variance components" σ² and τ².
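One way to turn the expected mean squares into estimates is to replace the expectations with the observed sums of squares and solve for σ² and τ². The sketch below does this for a balanced design; the function name variance_components and the tiny data set are hypothetical.

```python
def variance_components(scores):
    """Moment estimates of (sigma^2, tau^2) from a balanced m-by-n layout.

    scores[i][j] is the score of pupil j at school i; every school must
    contribute the same number n >= 2 of pupils, and m must be >= 2.
    """
    m, n = len(scores), len(scores[0])
    school_means = [sum(row) / n for row in scores]
    grand_mean = sum(school_means) / m   # equals the mean of all m*n scores when balanced

    ssw = sum((y - mean_i) ** 2
              for row, mean_i in zip(scores, school_means) for y in row)
    ssb = n * sum((mean_i - grand_mean) ** 2 for mean_i in school_means)

    sigma2_hat = ssw / (m * (n - 1))                  # from E(SSW) / (m(n-1)) = sigma^2
    tau2_hat = ssb / ((m - 1) * n) - sigma2_hat / n   # from E(SSB) / ((m-1)n) = sigma^2/n + tau^2
    return sigma2_hat, tau2_hat

# Tiny made-up example: 2 schools, 2 pupils each.
print(variance_components([[1.0, 3.0], [5.0, 7.0]]))  # (2.0, 7.0)
```

The estimate of τ² obtained this way can come out negative in small samples, because it is a difference of two noisy quantities.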

Unbiasedness

In general, random effects estimation is efficient and should be preferred to fixed effects if the assumptions underlying it are believed to be satisfied. For random effects to work in the school example, the school-specific effects must be uncorrelated with the other covariates of the model. This can be tested by estimating the model with fixed effects and then with random effects, and performing a Hausman specification test. If the test rejects, random effects is biased and fixed effects is the correct estimation procedure.
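For a single coefficient, the Hausman statistic compares the two estimates relative to the difference of their variances and is referred to a chi-squared distribution with one degree of freedom. The sketch below covers only this scalar case; the function name hausman_scalar and the input numbers are hypothetical, and the estimates and variances are assumed to have been obtained elsewhere. With several coefficients the statistic becomes a quadratic form in the difference of the coefficient vectors, with degrees of freedom equal to the number of coefficients compared.

```python
import math

def hausman_scalar(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value for one coefficient.

    b_fe, b_re: fixed effects and random effects estimates of the same coefficient.
    var_fe, var_re: their estimated variances (var_fe should exceed var_re,
    since random effects is the more efficient estimator under the null).
    """
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("var_fe must exceed var_re for the statistic to be defined")
    h = (b_fe - b_re) ** 2 / diff_var
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value

# Hypothetical estimates, for illustration only.
h, p = hausman_scalar(b_fe=1.0, b_re=0.5, var_fe=0.05, var_re=0.01)
```

A small p-value is evidence against the random effects assumption; a large one is consistent with it but does not prove it.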

Notes

1. Diggle, Peter J.; Heagerty, Patrick; Liang, Kung-Yee; Zeger, Scott L. (2002). Analysis of Longitudinal Data (2nd ed.). Oxford University Press. pp. 169–171. ISBN 0-19-852484-6.
2. Fitzmaurice, Garrett M.; Laird, Nan M.; Ware, James H. (2004). Applied Longitudinal Analysis. Hoboken: John Wiley & Sons. pp. 326–328. ISBN 0-471-21487-6.
3. Laird, Nan M.; Ware, James H. (1982). "Random-Effects Models for Longitudinal Data". Biometrics 38 (4): 963–974. JSTOR 2529876.
4. Gardiner, Joseph C.; Luo, Zhehui; Roman, Lee Anne (2009). "Fixed effects, random effects and GEE: What are the differences?". Statistics in Medicine 28: 221–239. doi:10.1002/sim.3478.

External links

This article is issued from Wikipedia - version of the Sunday, April 17, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.