RT Journal Article
SR Electronic
T1 Identifying and Interpreting Subgroups in Health Care Utilization Data with Count Mixture Regression Models
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 488924
DO 10.1101/488924
A1 Christoph F. Kurz
A1 Laura A. Hatfield
YR 2018
UL http://biorxiv.org/content/early/2018/12/07/488924.abstract
AB Inpatient care accounts for a large share of total health care spending, making analysis of inpatient utilization patterns an important part of understanding what drives health care spending growth. Common features of inpatient utilization measures such as length of stay and spending include zero inflation, over-dispersion, and skewness, all of which complicate statistical modeling. Moreover, latent subgroups of patients may have distinct patterns of utilization and relationships between that utilization and observed covariates. In this work, we apply and compare likelihood-based and parametric Bayesian mixtures of Negative Binomial and zero-inflated Negative Binomial regression models. In a simulation, we find that the Bayesian approach recovers the true number of mixture components more accurately than using information criteria to select among likelihood-based finite mixture models. When we apply the models to data on hospital lengths of stay for patients with lung cancer, we find distinct subgroups of patients with different means and variances of hospital days, health and treatment covariates, and relationships between covariates and length of stay.