Abstract
One of the fundamental assumptions underlying linear regression models is that the errors have a constant variance (i.e., are homoscedastic). When this assumption is violated, standard errors from a regression can be biased and inconsistent, meaning that the associated p values and 95% confidence intervals cannot be trusted. The assumption of homoscedasticity is made for statistical reasons rather than biological reasons; in most real datasets, some form of heteroscedasticity is likely to exist. However, a survey of the behavioural ecology literature showed that only about 5% of articles explicitly mentioned heteroscedasticity, leaving 95% of articles in which heteroscedasticity was apparently absent. These results strongly indicate that the prevalence of heteroscedasticity is widely under-reported within behavioural ecology. The aim of this article is to raise awareness of heteroscedasticity amongst behavioural ecologists. Using topical examples from fields in behavioural ecology such as sexual dimorphism and animal personality, we highlight the biological importance of considering heteroscedasticity. We also emphasize that researchers should pay closer attention to the variance in their data and consider what factors could cause heteroscedasticity. In addition, we introduce some simple methods of dealing with heteroscedasticity. The two methods we focus on are: (1) incorporating variance functions within a generalised least squares (GLS) framework to model the functional form of heteroscedasticity; and (2) heteroscedasticity-consistent standard error (HCSE) estimators, which can be used when the functional form of heteroscedasticity is unknown. Using case studies, we show how both methods can influence the output from linear regression models. Finally, we hope that more researchers will consider heteroscedasticity as an important source of additional information about the particular biological process being studied, rather than as an impediment to statistical analysis.
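The two approaches named in the abstract can be illustrated with a minimal sketch for a regression with a single predictor. This is not the authors' code or their case-study analysis: the function names `ols_slope_se` and `wls_slope_se`, the simulated data and the choice of the HC3 estimator are all assumptions made for illustration. The weighted fit stands in for the simplest GLS case, in which the variance function is assumed to be known (here, residual standard deviation proportional to the predictor).

```python
import math
import random


def ols_slope_se(x, y):
    """Fit y = a + b*x by OLS; return (b, classical SE of b, HC3 SE of b)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]

    # Classical SE: assumes a single residual variance shared by all points.
    sigma2 = sum(e ** 2 for e in resid) / (n - 2)
    se_classical = math.sqrt(sigma2 / sxx)

    # HC3 SE: each squared residual is rescaled by its leverage h_i,
    # so no common variance is assumed.
    hc3_sum = 0.0
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - xbar) ** 2 / sxx
        hc3_sum += (xi - xbar) ** 2 * e ** 2 / (1.0 - h) ** 2
    se_hc3 = math.sqrt(hc3_sum) / sxx
    return b, se_classical, se_hc3


def wls_slope_se(x, y, w):
    """Weighted least squares; w_i is 1 / (assumed variance function at x_i)."""
    n = len(x)
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y)) / sxx
    a = ybar - b * xbar
    sigma2 = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y)) / (n - 2)
    return b, math.sqrt(sigma2 / sxx)


if __name__ == "__main__":
    # Simulated (hypothetical) data: residual SD grows in proportion to x.
    rng = random.Random(1)
    x = [1 + 9 * i / 199 for i in range(200)]
    y = [2 + 0.5 * xi + rng.gauss(0, 0.3 * xi) for xi in x]

    b, se_classical, se_hc3 = ols_slope_se(x, y)
    print("OLS slope:", b, "classical SE:", se_classical, "HC3 SE:", se_hc3)

    # Assumed variance function: variance proportional to x squared.
    b_w, se_w = wls_slope_se(x, y, [1 / xi ** 2 for xi in x])
    print("WLS slope:", b_w, "SE:", se_w)
```

The robust estimator leaves the slope estimate unchanged and alters only its standard error, whereas the weighted fit changes both, because it uses the assumed variance structure when estimating the slope. In practice the variance function would usually be estimated from the data rather than fixed in advance.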
Acknowledgements
We thank Barbara Morrissey, Eduardo Santos and Alistair Senior for commenting upon earlier versions of this manuscript. We are grateful to three anonymous reviewers for comments which improved the manuscript. We would also like to thank Terry Burke for his encouragement during the writing of this manuscript. S.N. is supported by the Marsden Fund.
Additional information
Communicated by L. Z. Garamszegi
Cite this article
Cleasby, I.R., Nakagawa, S. Neglected biological patterns in the residuals. Behav Ecol Sociobiol 65, 2361–2372 (2011). https://doi.org/10.1007/s00265-011-1254-7
DOI: https://doi.org/10.1007/s00265-011-1254-7