Abstract
An operon is a functional unit of DNA whose genes are co-transcribed on a polycistronic mRNA in a co-regulated fashion. Operons are a powerful mechanism for introducing functional complexity in bacteria, and are therefore of interest in microbial genetics, physiology, biochemistry, and evolution. While several methods have been developed to identify operons in whole genomes, few can identify them in metagenomes. Here we present the Pipeline for Operon Exploration in Metagenomes (POEM). At the heart of POEM lies a neural network that classifies genes as intra- or extra-operonic. POEM then looks for proximity associations between the identified intra-operonic genes to identify core operons in the metagenome. Core operons are operons that may exist in more than one species in the metagenome; their evolutionary conservation increases the probability of accurate prediction. We tested POEM using several different assemblers on a simulated metagenome, and we show it to be highly accurate. We also demonstrate its use on a human gut metagenome sample, and discover a putative new operon. We conclude that POEM is a useful tool for analyzing metagenomes beyond the genomic level, and for identifying multi-gene functionalities and possible neofunctionalization in metagenomes.
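The proximity-association step can be illustrated with a minimal sketch. This is not POEM's implementation; it assumes that each gene already carries a functional label, that an upstream classifier has already flagged adjacent gene pairs as intra-operonic, and that an association counts as conserved when it is seen on at least `min_support` distinct contigs (a rough stand-in for "more than one species"). The function name `core_operons`, its parameters, and the input layout are all hypothetical.

```python
from collections import defaultdict


def core_operons(adjacent_pairs, min_support=2):
    """Group functional labels into candidate core operons.

    adjacent_pairs: iterable of (contig_id, label_a, label_b), one entry per
    adjacent gene pair already classified as intra-operonic (hypothetical
    input format). Returns a sorted list of sorted label lists.
    """
    # Record which contigs support each unordered label-label association.
    support = defaultdict(set)
    for contig, a, b in adjacent_pairs:
        if a != b:  # ignore self-associations
            support[frozenset((a, b))].add(contig)

    # Keep only associations observed on enough distinct contigs.
    graph = defaultdict(set)
    for edge, contigs in support.items():
        if len(contigs) >= min_support:
            a, b = tuple(edge)
            graph[a].add(b)
            graph[b].add(a)

    # Each connected component of the filtered graph is one candidate.
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(sorted(component))
    return sorted(components)
```

In this sketch, an association seen on a single contig is discarded, so only label groups that recur across contigs survive as candidates. The threshold and the use of contigs as a proxy for species are illustrative choices, not values taken from POEM.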








