ABSTRACT
When using repeated measures linear regression models to make causal inferences in laboratory, clinical and environmental research, it is often assumed that the Within Subject association of differences (or changes) in predictor values across replicates with the outcome is the same as the Between Subject association of differences in those predictor values. But this is often false. For example, with body weight as the predictor and blood cholesterol as the outcome, i) a 10 pound weight increase in the same adult more strongly indicates a rise in that adult's cholesterol than ii) one adult weighing 10 pounds more than a second indicates higher cholesterol in the first adult, because the weight difference in i) more closely tracks higher body fat, while that in ii) is also influenced by heavier adults being taller. Hence, to make causal inferences, different Within and Between Subject slopes should be modeled separately. A related misconception, commonly made when using generalized estimating equations (GEE) and mixed models (MM) on repeated measures (i.e. for fitting Cross Sectional Regression), is that the working correlation structure only influences the variance of the model parameter estimates. But only an independence working correlation guarantees that the modeled parameters are interpretable. We illustrate this with an example where changing the working correlation from independence to equicorrelation qualitatively biases the parameters of GEE models, and show that this happens because the Between and Within Subject slopes for the predictor variables differ. We then describe several common mechanisms that cause Within and Between Subject slopes to differ: change effects, lag/reverse lag and spillover causality, shared within subject measurement bias or confounding, and predictor variable measurement error. The misconceptions noted here should be better publicized in laboratory, clinical and environmental research.
Repeated measures analyses should compare the Within and Between Subject slopes of predictors and, when they differ, investigate the reasons for the difference.
HIGHLIGHTS
When using repeated measures with time varying predictor variables in laboratory, clinical and environmental research:
Cross sectional regressions with any working correlation structure other than independence often give non-meaningful results
Between/Within subject decomposition of slopes should be undertaken when making causal inferences
Investigators should investigate the reasons Between and Within Subject slopes differ if this occurs
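The Between/Within decomposition described above can be sketched on simulated data. This is a minimal illustration, not the analysis used in the paper. The slopes `beta_bs` and `beta_ws`, the sample sizes, the noise level and the fixed working correlation `rho` are all hypothetical values chosen for the example. The equicorrelation fit is a generalized least squares fit with `rho` held fixed, whereas GEE software would estimate it from the data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_reps = 200, 5
beta_bs, beta_ws = 0.5, 2.0  # hypothetical Between and Within Subject slopes

# Balanced design: each subject contributes n_reps replicates.
subject = np.repeat(np.arange(n_subjects), n_reps)
x = rng.normal(size=n_subjects)[subject] + rng.normal(size=subject.size)


def subject_mean(values):
    """Return each observation's subject-level mean (balanced design)."""
    return (np.bincount(subject, weights=values) / n_reps)[subject]


x_mean = subject_mean(x)   # Between Subject component of the predictor
x_dev = x - x_mean         # Within Subject component of the predictor

# Outcome built so that the two components have different slopes.
y = beta_bs * x_mean + beta_ws * x_dev + rng.normal(scale=0.1, size=x.size)


def ols(columns, response):
    """Ordinary least squares with an intercept; returns all coefficients."""
    design = np.column_stack([np.ones_like(response)] + columns)
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


# Decomposed model: separate Between and Within Subject slopes.
_, bs_slope, ws_slope = ols([x_mean, x_dev], y)

# Cross sectional model under independence: a single pooled slope.
_, ind_slope = ols([x], y)

# Cross sectional model under equicorrelation with fixed rho (assumption).
# For balanced data, GLS equals OLS on "quasi-demeaned" variables.
rho = 0.5
theta = 1.0 - np.sqrt((1.0 - rho) / (1.0 + (n_reps - 1) * rho))
_, equi_slope = ols([x - theta * x_mean], y - theta * subject_mean(y))

print(f"Between Subject slope:  {bs_slope:.3f}")
print(f"Within Subject slope:   {ws_slope:.3f}")
print(f"Independence slope:     {ind_slope:.3f}")
print(f"Equicorrelation slope:  {equi_slope:.3f}")
```

In this sketch both cross sectional slopes are weighted averages of the Between and Within Subject slopes. The equicorrelation fit puts more weight on the Within Subject component, so changing the working correlation moves the single slope even though the data are unchanged.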
ABBREVIATIONS
- AR(1): Autoregressive Order 1
- BS: Between Subject
- BUN: Blood Urea Nitrogen
- Co-DOSE: Conditionally Dependent On Sibling Exposure
- CS: Cross Sectional
- E: Equicorrelation
- EGFR: Estimated Glomerular Filtration Rate
- GEE: Generalized Estimating Equations
- IND: Independent
- MA(2): Moving Average Order 2
- MM: Mixed Models
- WIHS: Women’s Interagency HIV Study
- WS: Within Subject