Repeated measures design

See also: longitudinal study and panel study

A repeated measures design uses the same subjects in every branch of the research, including the control.^[1] For instance, repeated measurements are collected in a longitudinal study in which change over time is assessed. Other repeated measures studies, which need not be longitudinal, compare the same measure under two or more different conditions. For instance, to test the effects of caffeine on cognitive function, a subject's math ability might be tested once after they consume caffeine and again after they consume a placebo.

Crossover studies

Main article: Crossover study

A popular repeated-measures design is the crossover study. A crossover study is a longitudinal study in which subjects receive a sequence of different treatments (or exposures). While crossover studies can be observational studies, many important crossover studies are controlled experiments. Crossover designs are common for experiments in many scientific disciplines, for example psychology, education, pharmaceutical science and health care, especially medicine.

Randomized, controlled, crossover experiments are especially important in health care. In a randomized clinical trial, the subjects are randomly assigned treatments. When such a trial is a repeated measures design, the subjects are randomly assigned to a sequence of treatments. A crossover clinical trial is a repeated-measures design in which each patient is randomly assigned to a sequence of treatments, including at least two treatments (of which one may be a standard treatment or a placebo). Thus each patient crosses over from one treatment to another.

Nearly all crossover designs have "balance", which means that all subjects should receive the same number of treatments and that all subjects participate for the same number of periods. In most crossover trials, each subject receives all treatments.

However, many repeated-measures designs are not crossovers. For example, a longitudinal study of the sequential effects of repeated treatments need not use any "crossover" (Vonesh & Chinchilli; Jones & Kenward).

Uses of a repeated measures design

Limited number of subjects: The repeated measures design reduces the variance of estimates of treatment effects, allowing statistical inference to be made with fewer subjects.^[2]
Efficiency: Repeated measures designs allow many experiments to be completed more quickly, as fewer groups need to be trained to complete an entire experiment. This matters, for example, in experiments in which each condition takes only a few minutes but the training to complete the tasks takes as much time, if not more.
Longitudinal analysis: Repeated measures designs allow researchers to monitor how participants change over time, in both long- and short-term situations.

Order effects

Order effects may occur when a participant in an experiment performs a task and then performs it again. Examples of order effects include improvement or decline in performance, which may be due to learning effects, boredom or fatigue. The impact of order effects may be smaller in long-term longitudinal studies, and it can be reduced by counterbalancing using a crossover design.

Counterbalancing

In this technique two groups each perform the same two tasks, but in reverse order. With two branches, four groups are formed.

Counterbalancing
	Condition 1	Condition 2	Remarks
Group A	Group A1	Group A2	A1 performs Condition 1 first
Group B	Group B2	Group B1	B2 performs Condition 1 first

Counterbalancing tries to take account of two important sources of systematic variation in this type of design: practice and boredom effects. Either might otherwise change participants' performance, through familiarity with the treatments or tiredness from them.
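The allocation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name counterbalance, the subject labels and the fixed seed are hypothetical and do not come from any particular study.

```python
import random

def counterbalance(subjects,
                   orders=(("Condition 1", "Condition 2"),
                           ("Condition 2", "Condition 1")),
                   seed=0):
    """Randomly split subjects as evenly as possible across condition orders."""
    rng = random.Random(seed)   # fixed seed so the allocation is reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Deal the shuffled subjects round-robin into the available orders.
    return {s: orders[i % len(orders)] for i, s in enumerate(shuffled)}

subjects = [f"S{i}" for i in range(1, 9)]   # eight hypothetical participants
allocation = counterbalance(subjects)
for subject in subjects:
    print(subject, "->", " then ".join(allocation[subject]))
```

With an even number of subjects, half of them perform Condition 1 first and half perform Condition 2 first, so practice or fatigue effects are spread equally over both conditions.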

Limitations

It may not be possible for each participant to be in all conditions of the experiment (e.g. because of time constraints or the location of the experiment). Severely diseased subjects tend to drop out of longitudinal studies, potentially biasing the results. In these cases mixed effects models are preferable, as they can deal with missing values.

Regression toward the mean may affect conditions with many repetitions. Maturation may affect studies that extend over time. Events outside the experiment may change the response between repetitions.

Repeated measures ANOVA

Repeated measures analysis of variance (rANOVA) is a commonly used statistical approach to repeated measure designs.^[3] With such designs, the repeated-measure factor (the qualitative independent variable) is the within-subjects factor, while the dependent quantitative variable on which each participant is measured is the dependent variable.

Partitioning of error

One of the greatest advantages to rANOVA, as is the case with repeated measures designs in general, is the ability to partition out variability due to individual differences. Consider the general structure of the F-statistic:

F = MS_Treatment / MS_Error = (SS_Treatment/df_Treatment)/(SS_Error/df_Error)

In a between-subjects design there is an element of variance due to individual difference that is combined with the treatment and error terms:

SS_Total = SS_Treatment + SS_Error

df_Total = N-1, where N is the total number of observations

In a repeated measures design it is possible to partition subject variability from the treatment and error terms. In such a case, variability can be broken down into between-treatments variability (or within-subjects effects, excluding individual differences) and within-treatments variability. The within-treatments variability can be further partitioned into between-subjects variability (individual differences) and error (excluding the individual differences).^[4]

SS_Total = SS_{Treatment (excluding individual difference)} + SS_Subjects + SS_Error

df_Total = df_{Treatment (within subjects)} + df_{between subjects} + df_error = (k-1) + (n-1) + (n-1)(k-1) = nk-1, where n is the number of subjects and k is the number of levels of the within-subjects factor

In reference to the general structure of the F-statistic, partitioning out the between-subjects variability makes the error sum of squares smaller, which tends to give a smaller MS_Error and hence a larger F-value. Partitioning also removes degrees of freedom from the error term of the F-test, so the between-subjects variability must be large enough to offset the loss in degrees of freedom. If between-subjects variability is small, this process may actually reduce the F-value.^[4]
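The partition can be checked numerically. The following is a minimal sketch in plain Python for a one-way design with n subjects (rows) and k treatment levels (columns). The function name rm_anova_partition and the scores are hypothetical, made up only for illustration.

```python
def rm_anova_partition(scores):
    """One-way repeated measures partition; rows = subjects, columns = levels."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of levels of the within-subjects factor
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    level_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subject_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_treatment = n * sum((m - grand_mean) ** 2 for m in level_means)
    ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_error = ss_total - ss_treatment - ss_subjects

    df_treatment, df_error = k - 1, (n - 1) * (k - 1)
    ms_treatment = ss_treatment / df_treatment
    f_within = ms_treatment / (ss_error / df_error)
    # For comparison only: treat the same numbers as a between-subjects
    # design, where subject variability stays inside the error term.
    f_between = ms_treatment / ((ss_total - ss_treatment) / (n * k - k))
    return {"ss_total": ss_total, "ss_treatment": ss_treatment,
            "ss_subjects": ss_subjects, "ss_error": ss_error,
            "f_within": f_within, "f_between": f_between}

# Hypothetical scores: 4 subjects measured under 3 conditions.
scores = [[3, 4, 5],
          [5, 6, 10],
          [6, 8, 10],
          [10, 10, 13]]
result = rm_anova_partition(scores)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this made-up data set most of the total variability comes from differences between subjects, so removing it from the error term gives a much larger F-value than the between-subjects calculation on the same numbers.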

Assumptions

As with all statistical analyses, specific assumptions should be met to justify the use of this test. Violations can moderately to severely affect results and often lead to an inflation of the Type I error rate. With the rANOVA, standard univariate and multivariate assumptions apply.^[5] The univariate assumptions are:

Normality—For each level of the within-subjects factor, the dependent variable must have a normal distribution.
Sphericity—Difference scores computed between two levels of a within-subjects factor must have the same variance for the comparison of any two levels. (This assumption only applies if there are more than 2 levels of the independent variable.)
Randomness—Cases should be derived from a random sample, and scores from different participants should be independent of each other.
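The sphericity assumption can be inspected informally by computing the variance of the difference scores for every pair of levels. The sketch below is descriptive only and is not a formal test of sphericity. The function name and the scores are hypothetical.

```python
from itertools import combinations
from statistics import variance

def difference_score_variances(scores):
    """Sample variance of the difference scores for each pair of levels.

    scores: one row per subject, one column per level of the factor.
    """
    k = len(scores[0])
    result = {}
    for a, b in combinations(range(k), 2):
        diffs = [row[a] - row[b] for row in scores]
        result[(a, b)] = variance(diffs)   # n - 1 in the denominator
    return result

# Hypothetical scores: 4 subjects measured under 3 conditions.
scores = [[3, 4, 5],
          [5, 6, 10],
          [6, 8, 10],
          [10, 10, 13]]
diff_vars = difference_score_variances(scores)
for (a, b), v in diff_vars.items():
    print(f"levels {a + 1} and {b + 1}: variance of differences = {v:.3f}")
```

Under sphericity these variances are equal in the population. Clearly unequal sample values suggest a possible violation, although small samples such as this one give very noisy estimates.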

The rANOVA also requires that certain multivariate assumptions be met, because a multivariate test is conducted on difference scores. These assumptions include:

Multivariate normality—The difference scores are multivariately normally distributed in the population.
Randomness—Individual cases should be derived from a random sample, and the difference scores for each participant are independent from those of another participant.

F test

As with other analysis of variance tests, the rANOVA makes use of an F statistic to determine significance. Depending on the number of within-subjects factors and assumption violations, it is necessary to select the most appropriate of three tests:^[5]

Standard Univariate ANOVA F test: This test is commonly used when the within-subjects factor has only two levels (e.g. time point 1 and time point 2). It is not recommended when the factor has more than two levels, because the assumption of sphericity is commonly violated in such cases.
Alternative Univariate tests^[6]: These tests account for violations of the assumption of sphericity and can be used when the within-subjects factor has more than two levels. The F statistic is the same as in the Standard Univariate ANOVA F test, but it is associated with a more accurate p-value. The correction is made by adjusting the degrees of freedom downward when determining the critical F value. Two corrections are commonly used: the Greenhouse-Geisser correction and the Huynh-Feldt correction. The Greenhouse-Geisser correction is more conservative, but addresses a common issue of increasing variability over time in a repeated-measures design.^[7] The Huynh-Feldt correction is less conservative, but does not address issues of increasing variability. It has been suggested that the Huynh-Feldt correction be used with smaller departures from sphericity, and the Greenhouse-Geisser correction when the departures are large.
Multivariate Test: This test does not assume sphericity, but is also highly conservative.
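As an illustration of how the degrees of freedom are adjusted downward, the sketch below estimates the Greenhouse-Geisser epsilon from the double-centred sample covariance matrix of the levels and multiplies both degrees of freedom by it. It assumes complete data with n subjects and k levels. The function name and the scores are hypothetical.

```python
def greenhouse_geisser_epsilon(scores):
    """Greenhouse-Geisser epsilon estimate; rows = subjects, columns = levels."""
    n, k = len(scores), len(scores[0])
    means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sample covariance matrix of the k levels (n - 1 in the denominator).
    cov = [[sum((row[a] - means[a]) * (row[b] - means[b]) for row in scores)
            / (n - 1) for b in range(k)] for a in range(k)]
    # Double-centre: subtract row and column means, add back the grand mean.
    # The matrix is symmetric, so row means and column means coincide.
    row_means = [sum(r) / k for r in cov]
    grand = sum(row_means) / k
    centred = [[cov[a][b] - row_means[a] - row_means[b] + grand
                for b in range(k)] for a in range(k)]
    trace = sum(centred[a][a] for a in range(k))
    sum_sq = sum(v * v for r in centred for v in r)
    return trace ** 2 / ((k - 1) * sum_sq)

# Hypothetical scores: 4 subjects measured under 3 conditions.
scores = [[3, 4, 5],
          [5, 6, 10],
          [6, 8, 10],
          [10, 10, 13]]
n, k = len(scores), len(scores[0])
eps = greenhouse_geisser_epsilon(scores)
df_treatment_adj = eps * (k - 1)            # corrected numerator df
df_error_adj = eps * (n - 1) * (k - 1)      # corrected denominator df
print(f"epsilon = {eps:.3f}, adjusted df = ({df_treatment_adj:.2f}, {df_error_adj:.2f})")
```

The estimate lies between 1/(k-1) and 1, with 1 corresponding to no correction. The F statistic itself is unchanged; only the degrees of freedom used to look up the p-value or critical value are reduced.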

Effect size

One of the most commonly reported effect size statistics for rANOVA is partial eta-squared (η_p²). It is also common to use the multivariate η² when the assumption of sphericity has been violated, and the multivariate test statistic is reported. A third effect size statistic that is reported is the generalized η², which is comparable to η_p² in a one-way repeated measures ANOVA. It has been shown to be a better estimate of effect size with other within-subjects tests.^[8]^[9]
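For a one-way repeated measures design, two of these statistics can be computed directly from the sums of squares. The sketch below uses the usual definition of partial eta-squared and the generalized eta-squared formula for this design given by Bakeman (2005). The function names and the sums of squares are hypothetical and serve only as an illustration.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta-squared: effect relative to effect plus its error term."""
    return ss_effect / (ss_effect + ss_error)

def generalized_eta_squared_one_way(ss_effect, ss_subjects, ss_error):
    """Generalized eta-squared for a one-way repeated measures design.

    Subject variability is kept in the denominator.
    """
    return ss_effect / (ss_effect + ss_subjects + ss_error)

# Hypothetical sums of squares from a one-way repeated measures ANOVA.
ss_treatment, ss_subjects, ss_error = 26.0, 75.0, 4.0
eta_p2 = partial_eta_squared(ss_treatment, ss_error)
eta_g2 = generalized_eta_squared_one_way(ss_treatment, ss_subjects, ss_error)
print(f"partial eta-squared     = {eta_p2:.3f}")
print(f"generalized eta-squared = {eta_g2:.3f}")
```

Because the generalized statistic keeps subject variability in the denominator, it is never larger than partial eta-squared for the same data.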

Cautions

rANOVA is not always the best statistical analysis for repeated measures designs. It is vulnerable to effects from missing values, imputation, non-equivalent time points between subjects and violations of sphericity.^[10] These issues can result in sampling bias and inflated rates of Type I error.^[11] In such cases it may be better to consider use of a linear mixed model.^[12]

Notes

1. Shuttleworth, Martyn (2009-11-26). "Repeated Measures Design". Experiment-resources.com. Retrieved 2013-09-02.
2. Barret, Julia R. (2013). "Particulate Matter and Cardiovascular Disease: Researchers Turn an Eye toward Microvascular Changes". Environmental Health Perspectives 121: a282. doi:10.1289/ehp.121-A282.
3. Gueorguieva; Krystal (2004). "Move Over ANOVA". Arch Gen Psychiatry 61: 310. doi:10.1001/archpsyc.61.3.310.
4. Howell, David C. (2010). Statistical methods for psychology (7th ed.). Belmont, CA: Thomson Wadsworth. ISBN 978-0-495-59784-1.
5. Green, Samuel B.; Salkind, Neil J. Using SPSS for Windows and Macintosh: analyzing and understanding data (6th ed.). Boston: Prentice Hall. ISBN 978-0-205-02040-9.
6. Vasey; Thayer (1987). "The Continuing Problem of False Positives in Repeated Measures ANOVA in Psychophysiology: A Multivariate Solution". Psychophysiology 24: 479–486. doi:10.1111/j.1469-8986.1987.tb00324.x.
7. Park (1993). "A comparison of the generalized estimating equation approach with the maximum likelihood approach for repeated measurements". Stat Med 12: 1723–1732. doi:10.1002/sim.4780121807.
8. Bakeman (2005). "Recommended effect size statistics for repeated measures designs". Behavior Research Methods 37 (3): 379–384. doi:10.3758/bf03192707.
9. Olejnik; Algina (2003). "Generalized eta and omega squared statistics: Measures of effect size for some common research designs". Psychological Methods 8: 434–447. doi:10.1037/1082-989x.8.4.434.
10. Gueorguieva; Krystal (2004). "Move Over ANOVA". Arch Gen Psychiatry 61: 310–317. doi:10.1001/archpsyc.61.3.310.
11. Muller; Barton (1989). "Approximate Power for Repeated-Measures ANOVA lacking sphericity". Journal of the American Statistical Association 84 (406): 549–555. doi:10.1080/01621459.1989.10478802.
12. Kreuger; Tian (2004). "A comparison of the general linear mixed model and repeated measures ANOVA using a dataset with multiple missing data points". Biological Research for Nursing 6: 151–157. doi:10.1177/1099800404267682.

References

Design and analysis of experiments

Jones, Byron; Kenward, Michael G. (2003). Design and Analysis of Cross-Over Trials (Second ed.). London: Chapman and Hall.
Vonesh, Edward F. and Chinchilli, Vernon G. (1997). Linear and Nonlinear Models for the Analysis of Repeated Measurements. London: Chapman and Hall.

Exploration of longitudinal data

Davidian, Marie; David M. Giltinan (1995). Nonlinear Models for Repeated Measurement Data. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. ISBN 978-0-412-98341-2.
Fitzmaurice, Garrett, Davidian, Marie, Verbeke, Geert and Molenberghs, Geert, eds. (2008). Longitudinal Data Analysis. Boca Raton, FL: Chapman and Hall/CRC. ISBN 1-58488-658-7.
Jones, Byron; Kenward, Michael G. (2003). Design and Analysis of Cross-Over Trials (Second ed.). London: Chapman and Hall.
Kim, Kevin and Timm, Neil (2007). "Restricted MGLM and growth curve model" (Chapter 7). Univariate and multivariate general linear models: Theory and applications with SAS (with 1 CD-ROM for Windows and UNIX). Statistics: Textbooks and Monographs (Second ed.). Boca Raton, FL: Chapman & Hall/CRC. ISBN 978-1-58488-634-1.
Kollo, Tõnu and von Rosen, Dietrich (2005). "Multivariate linear models" (Chapter 4), especially "The Growth curve model and extensions" (Chapter 4.1). Advanced multivariate statistics with matrices. Mathematics and its applications 579. New York: Springer. ISBN 978-1-4020-3418-3.
Kshirsagar, Anant M. and Smith, William Boyce (1995). Growth curves. Statistics: Textbooks and Monographs 145. New York: Marcel Dekker, Inc. ISBN 0-8247-9341-2.
Pan, Jian-Xin and Fang, Kai-Tai (2002). Growth curve models and statistical diagnostics. Springer Series in Statistics. New York: Springer-Verlag. ISBN 0-387-95053-2.
Seber, G. A. F. and Wild, C. J. (1989). "Growth models" (Chapter 7). Nonlinear regression. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. New York: John Wiley & Sons, Inc. pp. 325–367. ISBN 0-471-61760-1.
Timm, Neil H. (2002). "The general MANOVA model (GMANOVA)" (Chapter 3.6.d). Applied multivariate analysis. Springer Texts in Statistics. New York: Springer-Verlag. ISBN 0-387-95347-7.
Vonesh, Edward F. and Chinchilli, Vernon G. (1997). Linear and Nonlinear Models for the Analysis of Repeated Measurements. London: Chapman and Hall. (Comprehensive treatment of theory and practice)
Conaway, M. (1999, October 11). Repeated Measures Design. Retrieved February 18, 2008, from http://biostat.mc.vanderbilt.edu/twiki/pub/Main/ClinStat/repmeas.PDF
Minke, A. (1997, January). Conducting Repeated Measures Analyses: Experimental Design Considerations. Retrieved February 18, 2008, from Ericae.net: http://ericae.net/ft/tamu/Rm.htm
Shaughnessy, J. J. (2006). Research Methods in Psychology. New York: McGraw-Hill.

External links

Examples of all ANOVA and ANCOVA models with up to three treatment factors, including randomized block, split plot, repeated measures, and Latin squares, and their analysis in R

This article is issued from Wikipedia - version of the Friday, April 08, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.