Abstract
Background This paper presents a strategy for statistical analysis and interpretation of longitudinal intervention effects on bacterial communities. Data from such experiments often suffer from small sample sizes, a high degree of irrelevant variation, and missing data points. Our strategy combines multi-way decomposition methods, multivariate ANOVA, multi-block regression, hierarchical clustering and phylogenetic network graphs. The aim is to answer relevant research questions in a way that is both statistically valid and easy to interpret.
Results The strategy is illustrated by analysing an intervention design where two groups of mice were subjected to a treatment that caused inflammation in the intestines. Total microbiota in fecal samples was analysed at five time points, and the clinical end point was the load of colon cancer lesions. By using different combinations of the aforementioned methods, we were able to show that:
The treatment had a significant effect on the microbiota, and we identified clusters of bacterial groups with different time trajectories.
Individual differences in the initial microbiota had a large effect on the load of tumors, but not on the formation of early-stage lesions (flat ACFs).
The treatment resulted in an increase in Bacteroidaceae, Prevotellaceae and Paraprevotellaceae, and this increase could be associated with the formation of cancer lesions.
Conclusion The results show that by applying several data-analytical methods in combination, we are able to view the system from different angles and thereby answer different research questions. We believe that multiway methods and multivariate ANOVA should be used more frequently in bioinformatics, due to their ability to extract meaningful components from data sets with many collinear variables, few samples and a high degree of noise or irrelevant variation.