Neglected biological patterns in the residuals

A behavioural ecologist’s guide to co-operating with heteroscedasticity

  • Methods
Behavioral Ecology and Sociobiology

Abstract

One of the fundamental assumptions underlying linear regression models is that the errors have a constant variance (i.e., that they are homoscedastic). When this assumption is violated, standard errors from a regression can be biased and inconsistent, meaning that the associated p values and 95% confidence intervals cannot be trusted. The assumption of homoscedasticity is made for statistical reasons rather than biological reasons; in most real datasets, some form of heteroscedasticity is likely to exist. However, a survey of the behavioural ecology literature showed that only about 5% of articles explicitly mentioned heteroscedasticity, leaving 95% of articles in which heteroscedasticity was apparently absent. These results strongly indicate that the prevalence of heteroscedasticity is widely under-reported within behavioural ecology. The aim of this article is to raise awareness of heteroscedasticity amongst behavioural ecologists. Using topical examples from fields in behavioural ecology such as sexual dimorphism and animal personality, we highlight the biological importance of considering heteroscedasticity. We also emphasize that researchers should pay closer attention to the variance in their data and consider what factors could cause heteroscedasticity. In addition, we introduce some simple methods of dealing with heteroscedasticity. The two methods we focus on are: (1) incorporating variance functions within a generalised least squares (GLS) framework to model the functional form of heteroscedasticity; and (2) heteroscedasticity-consistent standard error (HCSE) estimators, which can be used when the functional form of heteroscedasticity is unknown. Using case studies, we show how both methods can influence the output from linear regression models. Finally, we hope that more researchers will consider heteroscedasticity as an important source of additional information about the particular biological process being studied, rather than an impediment to statistical analysis.
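To make the two approaches concrete, the following is a minimal sketch in plain Python, not the authors' own code or case-study data. It assumes a single predictor, simulated data whose residual standard deviation grows in proportion to x, and the HC3 variant of the HCSE estimators; the function names `ols_simple` and `slope_standard_errors` are hypothetical. The weighted fit stands in for the GLS approach under the simplifying assumption that the variance function is known (variance proportional to x squared), whereas a full GLS fit would estimate the variance function from the data.

```python
import math
import random


def ols_simple(x, y, weights=None):
    """Fit y = a + b*x by (optionally weighted) least squares; return (a, b)."""
    w = weights or [1.0] * len(x)
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def slope_standard_errors(x, y):
    """Return (classical SE, HC3 SE) for the unweighted OLS slope."""
    n = len(x)
    a, b = ols_simple(x, y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Classical SE: assumes one common residual variance for all points.
    sigma2 = sum(e ** 2 for e in resid) / (n - 2)
    se_classical = math.sqrt(sigma2 / sxx)
    # HC3 SE: each squared residual is kept separate and inflated
    # according to the leverage h_i of its observation.
    total = 0.0
    for xi, e in zip(x, resid):
        h = 1.0 / n + (xi - xbar) ** 2 / sxx
        total += (xi - xbar) ** 2 * e ** 2 / (1.0 - h) ** 2
    se_hc3 = math.sqrt(total) / sxx
    return se_classical, se_hc3


# Simulated (assumed) data: residual SD increases linearly with x.
rng = random.Random(1)
x = [rng.uniform(1.0, 10.0) for _ in range(200)]
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 0.3 * xi) for xi in x]

se_classical, se_hc3 = slope_standard_errors(x, y)
# Weighted fit with weights 1/x^2, i.e. assuming variance proportional to x^2.
a_wls, b_wls = ols_simple(x, y, weights=[1.0 / xi ** 2 for xi in x])

print("classical SE of slope:", se_classical)
print("HC3 SE of slope:      ", se_hc3)
print("weighted-fit slope:   ", b_wls)
```

The robust estimator leaves the slope estimate unchanged and only adjusts its standard error, whereas the weighted fit changes the estimate itself by down-weighting the noisier observations. This mirrors the distinction drawn above between using HCSE when the form of heteroscedasticity is unknown and modelling that form directly.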


Acknowledgements

We thank Barbara Morrissey, Eduardo Santos and Alistair Senior for commenting upon earlier versions of this manuscript. We are grateful to three anonymous reviewers for comments which improved the manuscript. We would also like to thank Terry Burke for his encouragement during the writing of this manuscript. S.N. is supported by the Marsden Fund.

Author information

Corresponding author

Correspondence to Ian R. Cleasby.

Additional information

Communicated by L. Z. Garamszegi

Electronic supplementary material

Below is the link to the electronic supplementary material.

  • ESM 1 (DOC 230 kb)
  • ESM 2 (DOC 3 kb)
  • ESM 3 (DOC 1 kb)
  • ESM 4 (CSV 3 kb)
  • ESM 5 (CSV 2 kb)

About this article

Cite this article

Cleasby, I.R., Nakagawa, S. Neglected biological patterns in the residuals. Behav Ecol Sociobiol 65, 2361–2372 (2011). https://doi.org/10.1007/s00265-011-1254-7

  • DOI: https://doi.org/10.1007/s00265-011-1254-7
