Abstract
Many major human pathogens are multi-host pathogens, able to infect other vertebrate species. Describing the general patterns of host-pathogen associations across pathogen taxa is therefore important to understand risk factors for human disease emergence. However, there is a lack of comprehensive curated databases for this purpose, with most previous efforts focusing on viruses. Here, we report the largest manually compiled host-pathogen association database, covering 2,595 bacteria and viruses infecting 2,656 vertebrate hosts. We also build a tree for host species using nine mitochondrial genes, giving a quantitative measure of the phylogenetic similarity of hosts. We find that the majority of bacteria and viruses are specialists infecting only a single host species, with bacteria having a significantly higher proportion of specialists compared to viruses. Conversely, multi-host viruses have a more restricted host range than multi-host bacteria. We perform multiple analyses of factors associated with pathogen richness per host species and the pathogen traits associated with greater host range and zoonotic potential. We show that factors previously identified as important for zoonotic potential in viruses—such as phylogenetic range, research effort, and being vector-borne—are also predictive in bacteria. We find that the fraction of pathogens shared between two hosts decreases with the phylogenetic distance between them. Our results suggest that host phylogenetic similarity is the primary factor for host-switching in pathogens.
Footnotes
a Co-first authors
DIFFERENCE TO v2: Minor additions, including small corrections of typos, more information in Methods and some extra Discussion.
DIFFERENCE TO ORIGINAL v1: Major revisions, mostly the addition of further statistical analysis: using generalized additive models (GAMs) including a proxy for research effort to model the effect of host traits on pathogen richness per species (Figure 4) and zoonotic potential of a pathogen (Figure 5); inclusion of two new figures related to GAMs (new Figure 4 and Figure 5); changes to pathogen sharing analysis between host orders (new Figure 6); further detail in Methods; link to updated code and data repository.