RT Journal Article
SR Electronic
T1 Adapting macroecology to microbiology: using occupancy modelling to assess functional profiles across metagenomes
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2021.06.21.449349
DO 10.1101/2021.06.21.449349
A1 Angus S. Hilts
A1 Manjot S. Hunjan
A1 Laura A. Hug
YR 2021
UL http://biorxiv.org/content/early/2021/06/22/2021.06.21.449349.abstract
AB Metagenomic sequencing provides information on the metabolic capacities and taxonomic affiliations for members of a microbial community. When assessing metabolic functions in a community, missing genes in pathways can occur in two ways: the genes may legitimately be missing from the community whose DNA was sequenced, or the genes were missed during shotgun sequencing or failed to assemble, and thus the metabolic capacity of interest is wrongly absent from the sequence data. Here, we borrow and adapt occupancy modelling from macroecology to provide mathematical context to metabolic predictions from metagenomes. We review the five assumptions underlying occupancy modelling through the lens of microbial community sequence data. Using the methane cycle, we apply occupancy modelling to examine the presence and absence of methanogenesis and methanotrophy genes from nearly 10,000 metagenomes spanning global environments. We determine that methanogenesis and methanotrophy are positively correlated across environments, and note that the lack of available standardized metadata for most metagenomes is a significant hindrance to large-scale statistical analyses. We present this adaptation of macroecology’s occupancy modelling to metagenomics as a tool for assessing presence/absence of traits in environmental microbiological surveys.
We further initiate a call for stronger metadata standards to accompany metagenome deposition, to enable robust statistical approaches in the future. Abbreviations: IMG/M, Integrated Microbial Genomes and Microbiomes; JGI, Joint Genome Institute; KEGG, Kyoto Encyclopedia of Genes and Genomes; KO, KEGG Orthology; mcr, methyl-coenzyme M reductase (referring to the gene); MCR, methyl-coenzyme M reductase (referring to the enzyme); pmo, particulate methane monooxygenase (referring to the gene); pMMO, particulate methane monooxygenase (referring to the enzyme).