Abstract
Recent work on microbial communities from various environments has shown that coexisting microorganisms with similar metabolic functions can be combined into high-level “functional groups”, which explain a larger proportion of variance in abiotic parameters than any individual taxon. However, the general rules by which taxa should be aggregated into functional groups remain elusive. Here, we show that two conditions are required for species assemblages to explain a higher percentage of variance in abiotic factors than single taxa: 1) consistent taxon-environment correlations, and 2) weak or negative correlations (i.e. complementarity) between taxa. Applying this recipe to the ocean microbiome, we found that the best grouping of taxa partitions it into only two groups, a core and a flexible assemblage. The core assemblage is enriched in Cyanobacteria and oligotrophic heterotrophs, and is strongly correlated with the first principal component of the taxa-sample matrix. The flexible assemblage is instead enriched in metabolically versatile copiotrophs, abundant at greater depths. This simple core/flexible bipartition explained the most variance in abiotic parameters and outperformed annotation-based functional groups as well as individual taxa. It therefore represents the simplest and best grouping of taxa that can be extracted from current ocean microbiome surveys.
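
The two conditions can be illustrated with a small numerical sketch. This is not the authors' analysis. It assumes that each taxon's abundance is standardized to unit variance and that an assemblage is the unweighted sum of its taxa; the function names are hypothetical. Under those assumptions, the correlation between the assemblage and an environmental variable is the sum of the taxon-environment correlations divided by the square root of the sum of all entries of the taxon-taxon correlation matrix.

```python
import math


def assemblage_correlation(taxon_env_corr, taxon_taxon_corr):
    """Correlation between an environmental variable and a summed assemblage.

    Assumes standardized (unit-variance) taxa and an unweighted sum.
    taxon_env_corr: list of correlations r_i between taxon i and the environment.
    taxon_taxon_corr: square correlation matrix R (list of lists, ones on the diagonal).
    """
    covariance_with_env = sum(taxon_env_corr)              # cov(sum_i z_i, e) = sum_i r_i
    variance_of_sum = sum(sum(row) for row in taxon_taxon_corr)  # var(sum_i z_i) = sum_ij R_ij
    return covariance_with_env / math.sqrt(variance_of_sum)


def two_taxon_case(r1, r2, r12):
    """Convenience wrapper for an assemblage of two taxa."""
    return assemblage_correlation([r1, r2], [[1.0, r12], [r12, 1.0]])


if __name__ == "__main__":
    single_taxon_r2 = 0.5 ** 2  # variance explained by one taxon with r = 0.5
    scenarios = {
        "consistent, uncorrelated taxa": (0.5, 0.5, 0.0),
        "consistent, negatively correlated taxa": (0.5, 0.5, -0.3),
        "consistent, strongly correlated taxa": (0.5, 0.5, 0.9),
        "inconsistent signs": (0.5, -0.5, 0.0),
    }
    print(f"single taxon: R^2 = {single_taxon_r2:.3f}")
    for label, (r1, r2, r12) in scenarios.items():
        r = two_taxon_case(r1, r2, r12)
        print(f"{label}: R^2 = {r ** 2:.3f}")
```

In this toy setting, two taxa that each correlate with the environment at r = 0.5 explain half of its variance when they are mutually uncorrelated, more when they are negatively correlated, and barely more than a single taxon when they are nearly redundant. Opposite-signed taxon-environment correlations cancel in the sum.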