RT Journal Article
SR Electronic
T1 Modeling Zero-Inflated Count Data With glmmTMB
JF bioRxiv
FD Cold Spring Harbor Laboratory Press
DO 10.1101/132753
A1 Brooks, Mollie E.
A1 Kristensen, Kasper
A1 van Benthem, Koen J.
A1 Magnusson, Arni
A1 Berg, Casper W.
A1 Nielsen, Anders
A1 Skaug, Hans J.
A1 Maechler, Martin
A1 Bolker, Benjamin M.
YR 2017
UL http://biorxiv.org/content/early/2017/05/01/132753.abstract
AB Ecological phenomena are often measured in the form of count data. These data can be analyzed using generalized linear mixed models (GLMMs) when observations are correlated in ways that require random effects. However, count data are often zero-inflated, containing more zeros than would be expected from the standard error distributions used in GLMMs; e.g., parasite counts may be exactly zero for hosts with effective immune defenses but vary according to a negative binomial distribution for non-resistant hosts. We present a new R package, glmmTMB, that increases the range of models that can easily be fitted to count data using maximum likelihood estimation. The interface was developed to be familiar to users of the lme4 R package, a common tool for fitting GLMMs. To maximize speed and flexibility, estimation is done using Template Model Builder (TMB), utilizing automatic differentiation to estimate model gradients and the Laplace approximation for handling random effects. We demonstrate glmmTMB and compare it to other available methods using two ecological case studies. In general, glmmTMB is more flexible than other packages available for estimating zero-inflated models via maximum likelihood estimation and is faster than packages that use Markov chain Monte Carlo sampling for estimation; it is also more flexible for zero-inflated modeling than INLA, but speed comparisons vary with model and data structure. Our package can be used to fit GLMs and GLMMs with or without zero-inflation as well as hurdle models. By allowing ecologists to quickly estimate a wide variety of models using a single package, glmmTMB makes it easier to find appropriate models and test hypotheses to describe ecological processes.