RT Journal Article SR Electronic T1 Multiclass Disease Classification from Microbial Whole-Community Metagenomes using Graph Convolutional Neural Networks JF bioRxiv FD Cold Spring Harbor Laboratory SP 726901 DO 10.1101/726901 A1 Saad Khan A1 Libusha Kelly YR 2019 UL http://biorxiv.org/content/early/2019/08/16/726901.abstract AB There is a wealth of information contained within one’s microbiome regarding their physiology and environment, and this is a promising avenue for developing non-invasive diagnostic tools. Here, we utilize 5643 aggregated, annotated whole-community metagenomes from 19 different diseases to implement the first multiclass microbiome disease classifier of this scale. We compared three different machine learning models: random forests, deep neural nets, and a novel graph convolutional architecture which exploits the graph structure of phylogenetic trees as its input. We show that the graph convolutional model outperforms deep neural nets in terms of accuracy (achieving 75% average test-set accuracy), receiver-operator-characteristics (92.1% average AUC), and precision-recall (50% average AUPR). Additionally, the convolutional net’s performance complements that of the random forest, achieving similar accuracy but better receiver-operator-characteristics and lower area under precision-recall. Lastly, we are able to achieve over 90% average top-3 accuracy across all of our models. Together, these results indicate that there are predictive, disease specific signatures across microbiomes which could potentially be used for diagnostic purposes.