Abstract
The emergence of the novel coronavirus SARS-CoV-2, which in humans is highly infectious and leads to the potentially fatal disease COVID-19, has caused tens of thousands of deaths and huge global disruption. The viral infection may also represent an existential threat to our closest living relatives, the nonhuman primates, many of which have already been reduced to small and endangered populations. The virus engages the host cell receptor, angiotensin-converting enzyme-2 (ACE2), through the receptor binding domain (RBD) on the spike protein. The contact surface of ACE2 displays amino acid residues that are critical for virus recognition, and variations at these critical residues are likely to modulate infection susceptibility across species. While infection studies have shown that rhesus macaques exposed to the virus develop COVID-19-like symptoms, the susceptibility of other nonhuman primates is unknown. Here, we show that all apes, including chimpanzees, bonobos, gorillas, and orangutans, and all African and Asian monkeys, exhibit the same set of twelve key amino acid residues as human ACE2. Monkeys in the Americas, and some tarsiers, lemurs and lorisoids, differ at significant contact residues, and protein modeling predicts that these differences should greatly reduce the binding affinity of ACE2 for the virus, hence moderating their susceptibility to infection. Our study suggests that apes and African and Asian monkeys are all likely to be highly susceptible to SARS-CoV-2, representing a critical threat to their survival. Urgent actions may be necessary to limit their exposure to humans.
Introduction
In late 2019 a novel coronavirus SARS-CoV-2 emerged in China. In humans, this virus can lead to the respiratory disease COVID-19, which can be fatal1,2. Since then, SARS-CoV-2 has spread around the world, causing widespread mortality, and with major impacts on societies and economies. While the virus and its resulting disease represent a major humanitarian disaster, they also represent a potential existential risk to our closest living relatives, the nonhuman primates. Transmission of diseases such as Ebola and yellow fever from humans has previously caused mass mortality in wild populations of nonhuman primates3–6, raising concerns among the global conservation community with respect to the impact of the current pandemic7.
Infection studies of rhesus macaques as a biomedical model have made it clear that at least some nonhuman primate species can be infected with SARS-CoV-2; infected rhesus macaques developed symptoms8,9 that closely mimicked those of humans with COVID-19. Recognizing the potential danger of COVID-19 to nonhuman primates, the International Union for Conservation of Nature (IUCN), together with the Great Apes section of the Primate Specialist Group, released a joint statement on precautions that should be taken by researchers and caretakers when interacting with great apes10. However, the actual risk to great apes, and to other endangered primates globally, remains unknown. Here we begin to assess the potential likelihood that our closest living relatives are susceptible to SARS-CoV-2 infection.
While the biology underlying susceptibility to SARS-CoV-2 infection remains to be fully elucidated, the viral target is well established. The SARS-CoV-2 virus binds to the cellular receptor protein angiotensin-converting enzyme-2 (ACE2), which is expressed on the extracellular surface of endothelial cells of diverse bodily tissues, including the lungs, kidneys, small intestine and renal tubules11. Characterizations of the infection dynamics of SARS-CoV-2 have demonstrated that the binding affinity for the human ACE2 receptor is high, which is a key factor in determining susceptibility and transmission dynamics. When compared to SARS-CoV, which caused a serious global outbreak of disease12,13 in 2002-2003, the binding affinity between SARS-CoV-2 and ACE2 is estimated to be between 4-fold14–17 and 10- to 20-fold greater18. Recent reports describing structural characterization of ACE2 in complex with the SARS-CoV-2 spike protein receptor binding domain14–17 allow identification of the key binding residues that enable the host-pathogen protein-protein recognition. Approaches examining variation in ACE2 tissue expression and gene sequences19,20 can offer insight into variation in human susceptibility to COVID-19. Similarly, we can use such an approach to compare sequence variation across species, and hence try to predict the likely interspecific variation in susceptibility. Previous analysis of comparative variation at these sites enabled estimates of the affinity of the ACE2 receptor for SARS-CoV in nonhuman species21. Here, we undertake such an analysis for SARS-CoV-2 across the primate radiation.
We compiled ACE2 gene sequence data from 27 primate species for which genomes are publicly available, covering primate taxonomic breadth. For comparison, we assessed 4 species of other mammals that have been tested directly for SARS-CoV-2 susceptibility in laboratory infection studies22. We also included in our analysis the amino acid sequence variation at these sites for horseshoe bats, thought to be the original vector of the virus, and pangolins, a potential intermediate host23, where viral recombination may have led to the novel viral form SARS-CoV-2. We assessed the amino acid residues identified as critical for ACE2 recognition by the CoV RBD, and undertook protein modeling to gauge the likely effect of the differences. Our aim was to develop predictions about the susceptibility of our closest living relatives to SARS-CoV-2 to guide all stakeholders, including researchers, caretakers, practitioners, conservationists, and governmental and nongovernmental agencies.
Methods
Variation in ACE2 sequences
We compiled ACE2 gene sequences for 16 catarrhine primates: 4 species from all 3 genera of great ape (Gorilla, Pan, Pongo), 2 genera of gibbons (Hylobates, Nomascus), and 10 species of African and Asian monkeys in 7 genera (Cercocebus, Chlorocebus, Macaca, Mandrillus, Papio, Rhinopithecus, Piliocolobus, Theropithecus); 6 genera of platyrrhines (monkeys from the Americas: Alouatta, Aotus, Callithrix, Cebus, Saimiri, Sapajus); 1 species of tarsier (Carlito syrichta); and 3 genera of strepsirrhines (lemurs and lorisoids: Microcebus, Propithecus, Otolemur) (Suppl. Table S1). We also included 4 species of mammals that have been tested clinically for susceptibility to SARS-CoV-2 infection22: the domestic cat (Felis catus), dog (Canis lupus familiaris), pig (Sus scrofa), and ferret (Mustela putorius furo). Finally, we included the pangolin (Manis javanica) and several bat species, including horseshoe bats (Rhinolophus spp.), Hipposideros pratti, and Myotis daubentonii. Coding sequences were translated using Geneious Version 9.1.8 and we aligned the amino acid sequences with MAFFT. We manually inspected and corrected any misalignments, and verified the absence of indels and premature stop codons.
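The translation and stop-codon check described above can be illustrated with a minimal Python sketch. This is not the Geneious/MAFFT workflow used in the study; the function names (`translate`, `check_cds`) and the short input sequences are hypothetical toy examples, and only the standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; unknown codons become 'X'."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3)
    )


def check_cds(cds: str) -> dict:
    """Report basic quality flags for one coding sequence."""
    protein = translate(cds)
    # A single terminal stop is expected; any stop before it is premature.
    body = protein[:-1] if protein.endswith("*") else protein
    return {
        "protein": body,
        "length_multiple_of_3": len(cds) % 3 == 0,
        "premature_stop": "*" in body,
    }


if __name__ == "__main__":
    # Toy sequences for illustration only (hypothetical, not study data).
    print(check_cds("ATGTCAAGCTCTTAA"))
    print(check_cds("ATGTAAAGCTCTTAA"))
```

A sequence whose length is not a multiple of three, or whose translation contains an internal stop, would be a candidate for the manual inspection described in the text.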
To visualize patterns of gene conservation across taxa and identify the congruence of the ACE2 gene tree with currently accepted phylogenetic relationships among species, we reconstructed trees using both Bayesian (MrBayes24, version 3.2.6) and Maximum Likelihood (RAxML25, version 8.2.11) methods with 200,000 MCMC cycles and 1,000 bootstrap replicates, respectively (code available on GitHub26). Gene trees were compared to a current species phylogeny assembled using TimeTree27.
Identification of critical binding residues and species-specific ACE2–RBD interactions
Critical ACE2 protein contact sites for the viral spike protein receptor binding domain (RBD) have been identified using cryo-EM and X-ray crystallography structural analysis methods14–17. The ACE2-RBD complex is characteristic of protein-protein interactions (PPIs) that feature extended interfaces spanning a multitude of binding residues. Experimental and computational analyses of PPIs have shown that a handful of contact residues can dominate the binding energy landscape28. Alanine scanning mutagenesis provides an assessment of the contribution of each residue to complex formation29–31. Critical binding residues can be computationally identified by assessing the change in binding free energy of complex formation upon mutation of the particular residue to alanine, which is the smallest residue that may be incorporated without significantly impacting the protein backbone conformation32.
Critical residues are defined as those that upon mutation to alanine decrease the binding energy by a threshold value ΔΔGbind ≥ 1.0 kcal/mol. Nine residues involved in the ACE2-RBD complex met this criterion. There was large congruence between the sites identified by alanine scanning and those highlighted by other methods. Five of the sites we identified overlap with those that were implicated by cryo-EM14, which also identified an additional three sites as important. We chose to analyze amino acid variation in the 9 ACE2 sites we identified through alanine scanning, and the additional 3 sites identified by structural analyses14–17, for a total of 12 critical sites (Suppl. Table S2). All computational alanine scanning mutagenesis analyses were performed using Rosetta software32. The alanine mutagenesis approach has been extensively evaluated and used to analyze PPIs and design their inhibitors, including by members of the present authorship33,34.
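The thresholding step can be sketched in a few lines of Python. This is only an illustration of applying the ΔΔGbind ≥ 1.0 kcal/mol cutoff to a table of per-residue scan results; it does not reproduce the Rosetta analysis, the function name `critical_residues` is hypothetical, and the ΔΔG values (and the residue labels S100 and K200) are invented placeholders rather than results from the study.

```python
from typing import Dict, List

DDG_THRESHOLD = 1.0  # kcal/mol, the cutoff stated in the text


def critical_residues(
    ddg_by_residue: Dict[str, float], threshold: float = DDG_THRESHOLD
) -> List[str]:
    """Return residues whose alanine mutation costs at least `threshold`
    kcal/mol of binding energy, ordered by sequence position."""
    hits = [res for res, ddg in ddg_by_residue.items() if ddg >= threshold]
    # Labels look like "Y41": one-letter residue code followed by position.
    return sorted(hits, key=lambda res: int(res[1:]))


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT data from the study.
    example_scan = {"Y41": 2.1, "Q42": 1.3, "S100": 0.2, "K200": 1.0}
    print(critical_residues(example_scan))  # ['Y41', 'Q42', 'K200']
```

Because the criterion is inclusive (≥), a residue scoring exactly 1.0 kcal/mol is retained.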
We utilized the SSIPe program35 to predict how ACE2 amino acid differences in each species would affect the relative binding energy of the ACE2/SARS-CoV-2 interaction. Using human ACE2 bound to the SARS-CoV-2 RBD as a benchmark (PDB 6M0J), the program mutates selected residues and compares the binding energy to that of the original. Using this algorithm, we studied interactions of all primates across the full suite of amino acid changes occurring at critical binding sites for each species. To more thoroughly assess the impact of each amino acid substitution, we also examined the predicted effect of individual amino acid changes (in isolation) on protein-binding affinity.
Results
Variation in ACE2 sequences
ACE2 gene and translated protein sequences are strongly conserved across primates. Few amino acid substitutions are present, and gene trees closely recapitulate the currently accepted phylogeny of primates (Figure 1; Suppl. Fig. S2 a,b). In particular, the twelve sites in the ACE2 protein that are critical for binding of the SARS-CoV-2 virus are invariant across the Catarrhini, which includes great apes, gibbons, and monkeys of Africa and Asia (Figure 1). The other major radiation of monkeys, those found in the Americas (Platyrrhini), have less similar ACE2 sequences across the length of the protein. They share nine of twelve critical amino acid residues with catarrhine primates; the three sites that vary from catarrhines, H41, E42 and T82, are conserved within this clade. Strepsirrhine primates and tarsiers, with the exception of Propithecus coquereli, also have an H41 residue, and additionally have further amino acid variation inside and outside of the binding sites.
Figure 1. ACE2 protein sequence alignment and phylogeny of study species. Branch lengths represent evolutionary distance estimated from TimeTree.org27. We outline amino acid residues at critical binding sites for the SARS-CoV-2 spike receptor binding domain. Solid outlines highlight sites predicted to have the most substantial impact on viral binding affinity. Notably, protein sequences of catarrhine primates are highly conserved, including uniformity among amino acids at all binding sites. Primate species that have been successfully infected with SARS-CoV-2 are indicated in red. Predicted susceptibility to COVID-19 for other primates is additionally coded by terminal branch colors.
In non-primate mammals, an increasing number of substitutions are evident, including at critical binding sites. All species possess a different residue from primates at site 24. Bats are exceptionally variable, with the genus Rhinolophus alone encompassing all of the variation seen in the rest of the non-primate mammals. Where primates have glutamine (Q24), bats have glutamate (E24), lysine (K24), leucine (L24), or arginine (R24) (Figure 1). All fasta alignments of ACE2 gene and protein sequences are available on GitHub26 and in the supplemental materials.
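The site-by-site comparison reported in this section can be sketched as a small Python helper that lists substitutions relative to human ACE2. The residues Q24, Y41 and Q42 for humans, and H41, E42 and T82 for platyrrhines, are taken from the text; the human residue at site 82 (M) is not stated in this paper and is included here as an outside assumption. Only 4 of the 12 critical sites are shown, and the function name `substitutions` is hypothetical.

```python
from typing import Dict, List

# Residues at a subset of the critical ACE2 sites (position -> one-letter code).
# Human M82 is an assumption not stated in the text; the rest come from the text.
HUMAN: Dict[int, str] = {24: "Q", 41: "Y", 42: "Q", 82: "M"}
PLATYRRHINE: Dict[int, str] = {24: "Q", 41: "H", 42: "E", 82: "T"}


def substitutions(reference: Dict[int, str], other: Dict[int, str]) -> List[str]:
    """List differences as '<reference residue><position><other residue>'.

    Sites missing from `other` are reported with '-' as the residue."""
    diffs = []
    for site in sorted(reference):
        observed = other.get(site, "-")
        if observed != reference[site]:
            diffs.append(f"{reference[site]}{site}{observed}")
    return diffs


if __name__ == "__main__":
    print(substitutions(HUMAN, PLATYRRHINE))  # ['Y41H', 'Q42E', 'M82T']
    print(substitutions(HUMAN, HUMAN))        # []
```

Applied to the full set of 12 sites, the same comparison would return an empty list for every catarrhine, matching the invariance described above.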
Analysis of species-specific residues on ACE2–RBD interactions
The ACE2 receptors of all catarrhines have identical residues to humans across all 12 analyzed sites and are predicted to have similar binding affinity for SARS-CoV-2. Platyrrhines diverge from catarrhines at three of the twelve critical amino acid residues. Compared to catarrhine ACE2, platyrrhine ACE2 is predicted to bind the SARS-CoV-2 RBD with roughly 400-fold reduced affinity (ΔΔGbind = 3.5 kcal/mol) (Table 1a). In particular, the change at site 41 from Y to H found in monkeys in the Americas has the largest impact of any residue change examined (Table 1b), and alone is predicted to lead to a 25-fold decrease in the binding affinity for SARS-CoV-2 (Figure 2). This single mutation, combined with the additional substitutions found in platyrrhines, especially Q42E, is predicted to significantly reduce the likelihood of successful viral binding (Table 1b). Of the other primates modeled, two of the three strepsirrhines, and tarsiers, also have the H41 residue and furthermore have additional protein sequence differences leading to further decreases in predicted binding affinity. The predicted binding affinity of tarsier ACE2 is the most dissimilar to that of humans, and this primate might be the least susceptible of the species we examined. In contrast, Coquerel’s sifaka (Propithecus coquereli) shares the same residue as humans and other catarrhines at site 41 and has projected affinities that are near to those of humans (Table 1b).
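The relationship between a change in binding free energy and a fold change in affinity can be checked with the standard thermodynamic relation fold = exp(ΔΔG / RT). The sketch below assumes a temperature of 298.15 K, which is not stated in the text; the function names are hypothetical. Under that assumption, 3.5 kcal/mol corresponds to a reduction of a few hundred-fold, in line with the "roughly 400-fold" figure, and a 25-fold decrease corresponds to about 1.9 kcal/mol.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
DEFAULT_T = 298.15    # assumed temperature in kelvin (not given in the text)


def fold_change(ddg_kcal: float, temperature_k: float = DEFAULT_T) -> float:
    """Fold reduction in binding affinity implied by a ddG in kcal/mol."""
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))


def ddg_from_fold(fold: float, temperature_k: float = DEFAULT_T) -> float:
    """ddG in kcal/mol implied by a given fold reduction in affinity."""
    return R_KCAL * temperature_k * math.log(fold)


if __name__ == "__main__":
    print(f"3.5 kcal/mol -> {fold_change(3.5):.0f}-fold")
    print(f"25-fold      -> {ddg_from_fold(25):.2f} kcal/mol")
    print(f"400-fold     -> {ddg_from_fold(400):.2f} kcal/mol")
```

The result is sensitive to the temperature chosen: at body temperature (about 310 K) the same 3.5 kcal/mol gives a somewhat smaller fold change, so the conversion should be read as an order-of-magnitude guide.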
Table 1. Results of protein-protein interaction experiments predicting impact of amino acid changes, relative to human ACE2 residues, at critical binding sites with SARS-CoV-2 receptor binding domain. Impacts of changes across the full complement of critical binding sites are presented in (A); single residue replacements are presented in (B).
Figure 2. Model of human ACE2 in complex with SARS-CoV-2 RBD. Key ACE2 interfacial residues are highlighted (A). Interactions at critical binding sites 41 and 42 are shown for catarrhines (apes and monkeys in Africa and Asia) (B), and for platyrrhines (monkeys in the Americas) (C). The dashed lines indicate predicted hydrogen bonding interactions. Y41 participates in extensive van der Waals and hydrogen bonding interactions with RBD; these interactions are abrogated with histidine. The Q42 side chain amide serves as a hydrogen bond acceptor and donor to contact RBD; change to glutamic acid diminishes the hydrogen bonding interactions.
Other mammals included in our study - ferrets, cats, dogs, pigs, pangolin and two of the seven bat species (R. pusillus and R. macrotis) - show the same residue as humans (Y) at site 41, with accompanying strong affinities for SARS-CoV-2. The remaining five sister species of bats possess H41 and lower binding affinities (Table 1b). Variation in other critical sites across non-primate mammals is projected to impact binding, but more modestly.
Discussion
Our results strongly suggest that catarrhines (all apes, and all monkeys of Africa and Asia) are likely to be highly susceptible to SARS-CoV-2. The protein sequence of the target receptor, ACE2, is highly conserved, including uniformity at all identified and tested major binding sites. Consistent with our results, infection studies show that rhesus monkeys can be successfully infected with SARS-CoV-2, and go on to develop COVID-19-like symptoms8,9. Our results based on protein modeling offer better news for monkeys in the Americas (platyrrhines). There are three differences in amino acid residues between platyrrhines and catarrhines, and two of these, Y41H and Q42E, show strong evidence of being impactful changes. These amino acid changes are modeled to reduce the binding affinity between SARS-CoV-2 and ACE2 by ca. 400-fold. Similar reduced susceptibility is predicted for tarsiers, and two of the three lemurs and lorisoids (strepsirrhines). One of the analyzed lemurs, Coquerel’s sifaka, has fewer differences from catarrhines at important binding sites, including possessing the high-risk residue variant at site 41, and as such is also predicted to be highly susceptible. Nonetheless, these are only predicted results based on amino acid residues and protein-protein interaction models. We urge extreme caution in using our analyses as the basis for relaxing policies regarding the protection of platyrrhines and strepsirrhines. Experimental assessment of synthetic protein interactions can now occur in the laboratory (e.g.36), and confirmation of our model predictions should be sought before any firm conclusions are reached.
Emerging evidence in experimental mammalian models appears to support our results; dogs, ferrets, pigs, and cats have all shown some susceptibility to SARS-CoV-2 but have demonstrated variation in disease severity and presentation, including across studies22,37. Substitutions at binding sites might be at least partially protective against COVID-19 in these mammals. For example, the limited experimental evidence to date suggests that while cats - which have the same residue as humans at site 34 - are not strongly symptomatic, they present lung lesions, while dogs - which have a substitution at this site - do not22. The amino acid residue at site 24 differs from primates in all other mammalian species examined. However, our models suggest that the variant residues may confer relatively minor reductions in binding affinity. Other sources of variation, including residues affecting ACE2 protein stability20, may also play a role. Our results are also consistent with previous reports that ACE2 genetic diversity is greater among bats than that observed among mammals susceptible to SARS-CoV-type viruses. This variation has been suggested to indicate that bat species may act as a reservoir of SARS-CoV viruses or their progenitors21. Intriguingly, all but 2 bat species we examined have the putatively protective variant, H41. Further clinical and laboratory study is needed to fully understand infection dynamics.
There are a number of important caveats to our study. Firstly, all of our predictions are based on interpretations of gene and resultant amino acid sequences. Some of our results, such as the uniform conservation of ACE2 among catarrhines, backed up by the demonstrated high susceptibility of humans and rhesus macaques to SARS-CoV-2, should give a good degree of confidence in our prediction of high levels of risk. Given that other apes and monkeys in Asia and Africa have residues identical to those of humans at the target site, it seems unlikely that the ACE2 receptor and the SARS-CoV-2 proteins would not readily bind. Our results for other taxa are dependent on modeling, and hence should be treated more cautiously. This includes all interpretations of the susceptibility of platyrrhines and strepsirrhines, where the effects of residue differences on binding affinities have been estimated based on protein-protein interaction modeling. Another caveat is that we have modeled only interactions at binding sites, and not predictions based on full residue sequence variation. Residues that are not in direct contact may still affect binding allosterically. More generally, if adhering to the precautionary principle, then our results highlighting higher risks to some species should be taken with greater gravity than our results that predict potential lower risks to others. Another limitation of our study is that we have looked at only 27 primate species, albeit with broad taxonomic scope. Analysis of additional species is important, especially among strepsirrhine species, where our coverage is relatively scant. In particular, the almost identical residue sequences of Coquerel’s sifaka to those of catarrhines suggest a need to assess the residue sequences of a wider diversity of lemur species. It is also important to remember that our study assesses only the potential for initial binding of the virus to the target site.
Downstream consequences of infection may differ drastically based on species-specific proteases, genomic variants, metabolism, and immune system responses38,39. In humans, the development of COVID-19 can lead to a pro-inflammatory cytokine storm of hyperinflammation, which may lead to some of the more severe impacts of infection40,41.
Many endangered primate species now persist only in very small populations42. For example, there are believed to be only around 1000 mountain gorillas left in their entire range43. With such small populations, the introduction of a new highly infectious disease is a potential extinction-level event. Re-opening access to habituated great ape groups for tourism purposes, which may be critical to local economies44, may be fraught with issues. IUCN best practices recommend that tourists stay at least 7 metres away from great apes45, but in practice, almost all tourists get far closer than this - for example, the average distance that tourists get from mountain gorillas at the Bwindi Impenetrable National Park in Uganda is just 2.76 metres46. Concerted effort may be required by all stakeholders to try to avoid the introduction of SARS-CoV-2 into wild primate populations7. Recent measures suggested by the IUCN for researchers and caretakers of great ape populations include: ensuring that all individuals wear clean clothing and disinfected footwear; providing hand-washing facilities; requiring that a surgical face mask be worn by anyone coming within 10 metres of great apes; ensuring that individuals needing to cough or sneeze ideally leave the area, or at least cough/sneeze into the crook of their elbows; and imposing a 14-day quarantine for all people arriving into great ape areas who will come into frequent close proximity with them10. The IUCN’s ‘Best Practice Guidelines for Health Monitoring and Disease Control in Great Ape Populations’ should also be followed47.
Our results suggest that dozens of nonhuman primate species, including all of our closest relatives, are likely to be highly susceptible to SARS-CoV-2 infection, and vulnerable to its effects. Major actions may be needed to limit the exposure of many wild primate populations to humans. This is likely to require coordinated input from all stakeholders, including local communities, international and national governmental agencies, nongovernmental conservation and development organizations, and academics and researchers. While the focus of many at this time is rightly on mitigating the humanitarian devastation of COVID-19, we also have a duty to ensure that our closest living relatives do not suffer extinctions, or massive population declines, in response to yet another human-induced catastrophe.
Data Availability Statement
Nucleotide and protein sequences used in this study are available from NCBI and are also available as fasta files and alignments on GitHub (https://github.com/MareikeJaniak/ACE2). All code used in this project is available in the same repository.
Author Contributions
ADM, JPH, and MCJ designed the study. ADM and JPH wrote the paper with input and edits from MCJ and PSA. MCJ conducted genetic analyses with input from ADM. FM ran the protein substitution models with input from PSA. All authors have approved the final submission for publication.
Acknowledgements
MCJ was funded by a Natural Sciences and Engineering Research Council of Canada Discovery Accelerator Supplement to ADM and by a postdoctoral fellowship from the Alberta Children’s Hospital Research Institute. PSA thanks the National Institutes of Health (R35GM130333) for financial support.