ABSTRACT
Genes in prokaryotic genomes are often organized into transcription units called operons. Detecting operons is important for inferring gene function and identifying regulatory networks. Several tools have been proposed to detect operons computationally. We propose a new method, which we name Operon Hunter, that uses visual representations of genomic fragments and a residual neural network architecture to make operon predictions. Our method uses a pretrained network via transfer learning to leverage large datasets.
To our knowledge, we report the highest accuracy in the literature on the standard datasets used across studies, E. coli and B. subtilis, with an F1 score of 0.83, outperforming previously reported state-of-the-art tools. Our method also demonstrates a clear advantage in detecting full operons rather than separate gene pairs.