RT Journal Article
SR Electronic
T1 Identifying and Interpreting Subgroups in Health Care Utilization Data with Count Mixture Regression Models
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 488924
DO 10.1101/488924
A1 Christoph F. Kurz
A1 Laura A. Hatfield
YR 2018
UL http://biorxiv.org/content/early/2018/12/07/488924.abstract
AB Inpatient care is a large share of total health care spending, making analysis of inpatient utilization patterns an important part of understanding what drives health care spending growth. Common features of inpatient utilization measures such as length of stay and spending include zero inflation, over-dispersion, and skewness, all of which complicate statistical modeling. Moreover, latent subgroups of patients may have distinct patterns of utilization and relationships between that utilization and observed covariates. In this work, we apply and compare likelihood-based and parametric Bayesian mixtures of Negative Binomial and zero-inflated Negative Binomial regression models. In a simulation, we find that the Bayesian approach finds the true number of mixture components more accurately than using information criteria to select among likelihood-based finite mixture models. When we apply the models to data on hospital lengths of stay for patients with lung cancer, we find distinct subgroups of patients with different means and variances of hospital days, health and treatment covariates, and relationships between covariates and length of stay.