Abstract
Background One of the major challenges in microbial studies is to discover associations between microbial communities and a specific disease. A distinctive feature of microbiome count data is that intestinal bacterial communities form clusters, referred to as enterotypes, that are characterized by differences in specific bacterial taxa, which makes it difficult to analyze these data under health and disease conditions. Traditional probabilistic modeling cannot distinguish the dysbiosis of interest from individual differences.
Results We propose a new probabilistic model, called ENIGMA (Enterotype-like uNIGram mixture model for Microbial Association analysis), to address these problems. ENIGMA enables us to simultaneously estimate enterotype-like clusters characterized by the abundances of signature bacterial genera and environmental effects associated with the disease.
Conclusion We illustrate the performance of the proposed method through both simulation and clinical data analysis. ENIGMA is implemented in R and is available from GitHub (https://github.com/abikoushi/enigma).