Abstract
Background The human microbiome comprises the microorganisms that inhabit the various sites of the human body and plays a vital role in human health. The composition of a microbial population is often quantified through measures of species diversity, which summarize the number of species and their relative abundances in a single value. A microbiome sample will inevitably miss some species that are present in the target population, and these missing species affect the diversity estimates.
Methods We use the hierarchical Pitman-Yor (HPY) process to model the species abundance distributions across multiple microbiome populations. The model parameters are estimated using a Gibbs sampler. We also derive estimates of species diversity as functions of the HPY parameters.
Results We show in a simulation study that the Gibbs sampler for the HPY model performs well. We also show that the diversity estimates from the HPY model improve on naïve estimates when species are missing from the sample. Similarly, the HPY estimates tend to perform better than the naïve estimates when the number of individuals sampled from a population is small.
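To make the notion of a naïve diversity estimate concrete, the following minimal sketch computes plug-in Shannon and Gini-Simpson indices directly from observed counts. It is an illustration only, not the authors' implementation: the abstract does not say which diversity indices are used, so the choice of these two indices, the function names `naive_shannon` and `naive_simpson`, and the toy counts are all assumptions.

```python
import math


def naive_shannon(counts):
    """Plug-in Shannon index: -sum(p_i * ln p_i) over observed species.

    Hypothetical helper for illustration; species with zero count are
    skipped, so unobserved species contribute nothing to the estimate.
    """
    n = sum(counts)
    if n <= 0:
        raise ValueError("counts must contain at least one individual")
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def naive_simpson(counts):
    """Plug-in Gini-Simpson index: 1 - sum(p_i ** 2)."""
    n = sum(counts)
    if n <= 0:
        raise ValueError("counts must contain at least one individual")
    return 1.0 - sum((c / n) ** 2 for c in counts)


# Toy population (assumed): four equally abundant species.
population_counts = [25, 25, 25, 25]

# Toy small sample (assumed): only two of the four species were observed.
sample_counts = [3, 2, 0, 0]

population_shannon = naive_shannon(population_counts)
sample_shannon = naive_shannon(sample_counts)

print(f"population Shannon: {population_shannon:.3f}")
print(f"small-sample Shannon: {sample_shannon:.3f}")
```

In this toy case the small sample misses two species, so the plug-in value falls below the population value. This is the kind of gap that a model-based estimate is meant to reduce.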
Competing Interest Statement
The authors have declared no competing interest.