Abstract
Motivation Bacteriophages are broadly classified into two distinct lifestyles: temperate (lysogenic) and virulent (lytic). Temperate phages can enter a latent phase of infection within a host cell, whereas virulent phages replicate immediately and lyse the host cell upon infection. Accurate lifestyle identification is critical for determining the role of individual phage species within ecosystems and their effect on host evolution.
Results Here, we present BACPHLIP, a BACterioPHage LIfestyle Predictor. BACPHLIP detects the presence of a set of conserved protein domains within an input genome and uses these data to predict lifestyle via a Random Forest classifier. The classifier was trained on 634 phage genomes. On an independent test set of 423 phages, BACPHLIP achieves an accuracy of 98%, greatly exceeding that of the best existing tool (79%).
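As an illustration of the feature representation described above, the following minimal sketch encodes a genome as a presence/absence vector over a fixed panel of protein domains. The domain names, the `DOMAIN_PANEL` constant, and the `presence_vector` helper are hypothetical and chosen only for illustration; they are not BACPHLIP's actual feature set or API. A vector of this kind is the sort of input a Random Forest classifier could be trained on.

```python
from typing import Iterable, List, Sequence

# Hypothetical domain panel, for illustration only;
# this is NOT the set of domains BACPHLIP actually uses.
DOMAIN_PANEL: Sequence[str] = ("integrase", "excisionase", "repressor", "recombinase")


def presence_vector(hits: Iterable[str], panel: Sequence[str] = DOMAIN_PANEL) -> List[int]:
    """Return 1 for each panel domain detected in the genome, else 0."""
    found = set(hits)  # duplicate hits collapse: only presence matters
    return [1 if domain in found else 0 for domain in panel]


# Assumed example: domain hits reported for one genome by some search step.
genome_hits = ["integrase", "repressor", "integrase"]
features = presence_vector(genome_hits)
print(features)  # [1, 0, 1, 0]
```

Encoding presence rather than hit counts keeps the features binary, so repeated hits to the same domain do not change the vector.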
Availability BACPHLIP is freely available on GitHub (https://github.com/adamhockenberry/bacphlip) and the code used to build and test the classifier is provided in a separate repository (https://github.com/adamhockenberry/bacphlip-model-dev).
Competing Interest Statement
The authors have declared no competing interest.