Main

Viral attachment to receptors that are expressed on host cells initiates infection; viral receptors are therefore determinants of host range and govern host cell susceptibility. Various cell surface carbohydrates, including sialylated glycans1,2,3,4,5,6, glycosaminoglycans7,8,9,10 and human blood group antigens (HBGAs)11,12, function as host cell receptors for viral attachment and entry.

Although viruses have been known for some time to use cell surface carbohydrates to bind to host cells, recent advances in glycan array (also known as glycan microarray) screening technology have accelerated the identification of glycan receptors. Together with new structural information about how viruses bind to glycans, the interactions between viruses and glycans can now be analysed in unprecedented detail.

In this Review, we highlight the molecular and structural determinants of virus–sialylated glycan interactions and the influence of glycan binding on viral tropism, with an emphasis on well-studied examples, including influenza virus, reovirus, adenovirus and rotavirus (Table 1). Although this group of sialic acid-binding viruses is not exhaustive4,13,14, all four have stalk-like attachment proteins, which enables more direct comparisons of virus–glycan interactions to be made. Specifically, we examine how glycan array studies and structure determination, coupled with in vivo experiments to establish the function of sialic acid binding in pathogenesis, have provided insights into the remarkable complexity of virus–sialic acid relationships. In addition, we discuss how information that has been gained from studies of these viruses has yielded general principles of virus–glycan interactions that may aid in the design of antiviral drugs and viral vectors.

Table 1 Sialic acid-binding viruses*

Virus–sialic acid interactions

Sialic acids are derivatives of neuraminic acid, which is a nine-carbon monosaccharide that is ubiquitously expressed in higher vertebrates15. The C5 carbon is frequently modified with an N-acetyl group to form N-acetylneuraminic acid (Neu5Ac), which can be further hydroxylated to form N-glycolylneuraminic acid (Neu5Gc)15 (Fig. 1a). Additional modifications of neuraminic acid involve acetylation, methylation and sulphation of its hydroxyl groups. Sialic acids are often α-linked from the C2 carbon to carbohydrate chains on glycoproteins and glycolipids (Fig. 1b). In the host, sialic acids function in cell–cell adhesion, in cell signalling (especially within the immune system) and in development16,17. In addition, they are known to be key components of receptors for many viruses and bacterial toxins18,19,20,21. Virus interactions with sialylated glycans are usually of low affinity and are strengthened by the multivalency of the virus1.

Figure 1: Sialic acid types and glycosidic linkage.

a | Sialic acids are nine-carbon monosaccharide derivatives of neuraminic acid. The two most common sialic acids are N-acetylneuraminic acid (Neu5Ac) and N-glycolylneuraminic acid (Neu5Gc). The C5 carbon in Neu5Ac is modified with an N-acetyl group, which can be further hydroxylated to form Neu5Gc. The hydroxyl groups at C4, C7, C8 and C9 are subject to various modifications (not shown). Common substituents include O-acetyl, O-sulphate, O-lactyl, O-methyl and O-phosphate groups. b | Sialic acids are attached to carbohydrate chains on glycoproteins and glycolipids via different glycosidic linkages. The most common linkage types are α2,3-linkage to a galactose residue, α2,6-linkage to a galactose moiety or to an N-acetylgalactosamine moiety, and α2,8-linkage to another sialic acid moiety on a glycan.


Studying virus–sialic acid interactions

To gain a comprehensive understanding of virus–sialylated glycan interactions, it is crucial to identify the precise glycan receptor, define the molecular and structural basis of the interaction and establish the contribution of binding to sialylated glycan receptors in disease.

Identification of glycan receptors. The interaction between a virus and sialylated glycan is often first investigated using cell-based infectivity assays in which attachment is blocked by sialic acid-binding lectins or enzymatic removal of sialic acids by neuraminidases. However, neuraminidase does not efficiently remove sialic acid from branched gangliosides such as GM1 (Ref. 22). In addition, the specificity of neuraminidase is limited to the type of sialic acid linkage. Conversely, glycan arrays can be used to discern finer differences in virus–glycan binding preferences by enabling rapid, high-throughput screening of several glycans as potential virus receptors23,24,25,26. This technology has been used to identify glycan ligands of adenovirus21, influenza virus27,28, polyomavirus14, reovirus19 and rotavirus11,29, among other viruses.

Glycan array screening is analogous to the more familiar microarrays that are used to study gene expression. Glycans are immobilized on an array and then incubated with whole virus or the viral attachment protein to identify specific glycan receptors for viruses19,21,30 and compare glycan-binding preferences of different virus strains28,31. Binding is usually quantified using fluorescence-based detection systems. Different glycan arrays vary in glycan composition32 and the mode of glycan immobilization; for example, covalent binding of amine-terminating glycans to N-hydroxysuccinimide (NHS)-activated glass slides23 or glycan linkage to lipids that are printed on nitrocellulose-coated glass slides (known as neoglycolipid (NGL)-based arrays25,33,34). Although arrays from the different platforms vary in the composition of the glycans on the array, as well as the glycan-coupling method, both types of arrays have been useful in identifying glycan receptors for viruses. Different glycan array platforms have previously been compared in-depth32,35,36.
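As a rough illustration of how fluorescence readouts from such an array might be turned into a ranked list of candidate receptors, the sketch below averages replicate spots, subtracts a background value and flags glycans whose net signal exceeds a multiple of the background. The glycan names, signal values, background and threshold are all hypothetical and are not taken from any of the cited studies; real array platforms use their own normalization and quality-control procedures.

```python
from statistics import mean


def rank_glycans(replicate_signals, background, fold_threshold=3.0):
    """Rank glycans by mean background-subtracted fluorescence.

    replicate_signals: dict mapping glycan name -> list of replicate readings
    background: fluorescence of blank spots (hypothetical, same units)
    fold_threshold: a glycan is flagged as a candidate binder when its net
        signal is at least fold_threshold * background (an assumed cut-off)
    Returns a list of (glycan, net_signal, is_candidate), strongest first.
    """
    results = []
    for glycan, signals in replicate_signals.items():
        net = mean(signals) - background
        results.append((glycan, net, net >= fold_threshold * background))
    return sorted(results, key=lambda row: row[1], reverse=True)


# Made-up readings for three printed glycans, three replicate spots each.
example_signals = {
    "glycan_A": [5200.0, 4800.0, 5000.0],
    "glycan_B": [400.0, 350.0, 450.0],
    "glycan_C": [300.0, 310.0, 290.0],
}

ranked = rank_glycans(example_signals, background=200.0)
for glycan, net, is_candidate in ranked:
    print(f"{glycan}: net signal {net:.0f}, candidate binder: {is_candidate}")
```

A ranking of this kind only nominates candidates; as the text notes, structural and functional studies are still needed to confirm that a glycan acts as a receptor.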

Structural studies of virus–glycan interactions. Structural studies of virus–glycan interactions enable the identification of regions of the viral attachment protein and glycan that contribute to binding and facilitate the engineering of mutant viruses that can be used to investigate the physiological consequences of glycan engagement37,38,39,40. Although X-ray crystallography is not a new technique, advances in nearly every step of the crystallographic process have accelerated structural determination41. Protein purification techniques have also improved, and the use of robots in crystal screening reduces the amount of protein required42. Complementary methods, such as nuclear magnetic resonance (NMR) spectroscopy, are well suited for mapping protein–glycan interactions in solution13,19,43. Taken together, glycan array, crystallography and functional studies provide a more complete understanding of virus–sialylated glycan engagement.

Influenza virus

Influenza virus is a segmented, single-stranded RNA (ssRNA) virus in the Orthomyxoviridae family that infects mammals and birds; infections with influenza virus are common in humans. The trimeric viral haemagglutinin protein binds to sialic acid, commonly Neu5Ac, to adhere to host cells. Influenza viruses engage α2,3-linked and α2,6-linked sialic acid attached to a penultimate galactose of the glycan receptor. Avian influenza viruses primarily bind to α2,3-linked sialic acid, whereas human influenza viruses preferentially bind to α2,6-linked sialic acid5,20. The virus-encoded neuraminidase protein catalyses removal of Neu5Ac from the cell surface and viral glycoproteins to release newly formed virions.

Binding of influenza virus haemagglutinin to sialic acid. Influenza virus haemagglutinin is anchored in the viral envelope and projects away from the viral surface. The haemagglutinin trimer is composed of the globular HA1 domain, which engages sialic acid, and the stalk-like HA2 domain, which facilitates membrane fusion (Fig. 2a). The carbohydrate-binding site is conserved in all influenza subtypes and is located in a shallow groove in the HA1 domain44. The orientation of Neu5Ac and its interactions with HA1 are also mostly conserved among influenza virus strains44,45,46,47. In influenza virus haemagglutinin–sialic acid interactions, the Neu5Ac carboxylate inserts deeply into the carbohydrate-binding site of HA1 (Fig. 2b,c), where it forms two hydrogen bonds with adjacent residues, and the glycerol and N-acetyl chains form hydrogen bonds with additional residues in viral haemagglutinin. Moreover, the methyl group of the N-acetyl chain is inserted into a hydrophobic pocket in the virus-binding site (Fig. 2d), which is a common feature that is observed in virus–glycan interactions. Rotation around the glycosidic bond enables the galactose molecule to adopt either a cis or a trans position with respect to the N-acetyl group of Neu5Ac to accommodate different haemagglutinin molecules48. Avian influenza virus receptors are commonly bound in a trans conformation, whereas human influenza virus receptors are commonly bound in a cis conformation. Structural analysis of avian H5 and H7 strains showed that a point mutation that changes the conformation from trans to cis leads to an increase in affinity for α2,6-linked sialic acid49,50.

Figure 2: Influenza virus binding to differentially linked sialic acids.

a | Schematic of the trimeric influenza virus haemagglutinin, with the monomers depicted in purple, orange and grey. Haemagglutinin is a transmembrane protein that is composed of the globular HA1 domain and the stalk-like HA2 domain. Each HA1 domain in the trimer binds to sialic acid (commonly N-acetylneuraminic acid (Neu5Ac)), and the binding site is indicated in one monomer with a red circle. b | Avian influenza viruses preferentially bind to host cell receptors that contain α2,3-linked sialic acid, and human-adapted viruses bind to receptors that contain α2,6-linked sialic acid moieties. Schematics of an example of an avian influenza virus receptor (α2,3-sialyllactose) and a human influenza virus receptor (α2,6-sialyllactose) are shown. Glucose (Glc), galactose (Gal) and Neu5Ac are depicted as blocks. c | Surface representation of trimeric haemagglutinin (monomers are shown in purple, orange and grey) in complex with Neu5Ac (in yellow as a stick representation) (Protein Data Bank (PDB) accession 1HGG). Red circles indicate the glycan-binding site. d | Close-up view of the glycan-binding site of haemagglutinin. Selected crucial contacts between the haemagglutinin residues Ser136, Asn137 and Glu190 (purple) and Neu5Ac (yellow) are highlighted (grey dashes). The glycan receptors are shown in stick representation, with oxygen atoms in red and nitrogen atoms in blue. e | Superposition of an avian influenza virus haemagglutinin in complex with α2,3-sialyllactosamine (yellow) (PDB accession 2WR2) and the human receptor α2,6-sialyllactosamine (cyan) (PDB accession 2WR7). The avian receptor generally has a linear conformation, whereas the human receptor is more flexible and has an umbrella-like topology.


Determinants of influenza virus binding specificity. Although all influenza strains are thought to bind to sialic acids, the context of these monosaccharides in the recognized glycan structures varies. Glycans that contain α2,3-linked sialic acid have restricted conformational freedom and form a cone-like glycan structure. Conversely, glycans that contain α2,6-linked sialic acid have greater conformational flexibility51, and such glycans form umbrella-like shapes (Fig. 2e). The linkage between sialic acid and galactose in the receptor molecules thus determines the affinity of HA1 for a given oligosaccharide by defining the topology of the glycan.

The 1918, 1957 and 1968 pandemic influenza viruses were not of human origin but acquired receptor-binding specificity for glycans that contain α2,6-linked sialic acid27,52,53,54,55. Glycan arrays have helped to define the binding preferences of pandemic strains and are thus aiding in understanding mechanisms of the host jump. A conserved region of haemagglutinin of the 1918 pandemic H1N1 strain A/South Carolina/1/1918, which differs from the consensus amino acid sequence of the avian virus by E190D and G225D mutations, preferentially binds to α2,6-linked sialyl-oligosaccharides. Conversely, pandemic strain A/New York/1/1918, which differs from the avian influenza virus consensus sequence by an E190D substitution, binds to both α2,3-linked and α2,6-linked sialyl-oligosaccharides. The presence of a glycine at position 225 in either the avian or human strain enables binding to α2,3-linked sialic acids, whereas an aspartic acid at that position does not enable binding to this receptor28,56. In addition, mutation of residue 190 in A/New York/1/1918 to the avian consensus sequence results in exclusive binding to α2,3-linked sialyl-oligosaccharides like the avian counterpart28. Thus, these two residues are determinants of influenza virus receptor-binding specificity. Interestingly, the presence of a glycine at position 225 in some H1N1 isolates from the 2009 pandemic is also associated with increased binding to α2,3-linked sialic acids57. Glycan arrays showed that the binding preferences of the 2009 H1N1 pandemic influenza virus more closely resemble the binding preferences of the swine influenza virus isolates rather than those of seasonal strains27,58.
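The relationship between residues 190 and 225 and receptor preference described in this paragraph can be summarized as a small lookup table. The sketch below encodes only the three residue combinations that the text reports for H1 haemagglutinin; the table and function names are illustrative, the sketch is not a predictive model, and any other combination deliberately returns no answer.

```python
# Receptor preference reported in the text for H1 haemagglutinin, keyed by
# the amino acids at positions 190 and 225 (single-letter codes).
# Only combinations described in the text are listed.
REPORTED_PREFERENCE = {
    ("E", "G"): {"alpha2,3"},              # avian consensus sequence
    ("D", "G"): {"alpha2,3", "alpha2,6"},  # A/New York/1/1918
    ("D", "D"): {"alpha2,6"},              # A/South Carolina/1/1918 (preferential)
}


def reported_preference(res190, res225):
    """Return the set of sialic acid linkages reported for this residue pair.

    Returns None when the combination is not described in the text, rather
    than guessing at a phenotype.
    """
    return REPORTED_PREFERENCE.get((res190.upper(), res225.upper()))


print(reported_preference("D", "G"))  # A/New York/1/1918: both linkages
print(reported_preference("E", "D"))  # not described in the text: None
```

Returning None for unlisted combinations keeps the summary faithful to the reported data; as the following paragraphs make clear, linkage preference alone does not capture strain-specific binding.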

Despite this work, it is clear that the classification of influenza virus strains that are specific for either α2,3- or α2,6-linked sialic acids is too simplistic; for example, glycan array screening of seasonal H3N2 influenza virus strains did not identify a single moiety that all of the 45 strains tested bound59, and the preference of different strains for certain ligands changed with time. A study that investigated the binding specificity of human H3N2 influenza viruses that were isolated from 1968 to 2012 showed that early isolates preferentially bind to short and branched sialylated glycans, whereas more recent strains bind with high avidity to sialic acids that are attached to long polylactosamine chains59.

Although human influenza viruses bind to glycans that contain α2,6-linked sialic acid, the linkage type alone is not sufficient to explain strain-specific binding preferences. The affinity and avidity of haemagglutinin for α2,3- and α2,6-linked sialic acid, and not just the capacity to engage either or both ligands, influence influenza virus transmission48,60. Sialic acid modifications, including fucosylation and sulphation, also influence binding: influenza viruses that primarily bind to glycans that contain α2,3-linked sialic acids interact with greater avidity with glycans that contain a sulphate or sialic acid on position six of the penultimate N-acetylglucosamine (GlcNAc) than with glycans that are fucosylated at this site28,60. These studies suggest that the attachment of influenza virus to sialic acid is determined by the linkage of the sialic acid as well as other factors, such as the length, branching and sialic acid modifications of the glycan.

Glycan binding influences viral transmission. As described above, the substitution of a few amino acid residues in haemagglutinin can alter the receptor specificity of influenza virus. In addition, mutation of a few residues in haemagglutinin and other viral proteins also influences influenza virus transmission. Ferrets are a useful animal model to study influenza virus infection as they mimic the tropism and pathogenesis observed in humans61. Mutations in haemagglutinin from the H5 subtype influence the spread of influenza virus between ferrets50,62,63, as these mutations lead to a shift in the binding specificity from α2,3- to α2,6-linked sialic acid. This alteration in binding enables the virus to adhere to nasal turbinates, which are known to express α2,6-linked sialic acid63. In addition to mutations in the glycan-binding site that result in a shift in haemagglutinin-binding preference, mutations in viral proteins that regulate transcription and replication also contribute to the transmission phenotype63. Thus, although a shift in the binding specificity of haemagglutinin influences the transmission of influenza virus, it is not the only determinant. The role of glycans in influenza virus pathogenesis has recently been reviewed in depth64.

Reovirus

Reoviruses are non-enveloped viruses that belong to the Reoviridae family. They contain ten segments of double-stranded RNA (dsRNA), which are encapsidated within two concentric protein shells. Nearly all mammals serve as hosts for reovirus, but disease is restricted to the very young. Reovirus infections are common in humans, and most people have been exposed by adulthood65. Attachment of reovirus to host cells is mediated by the outer-capsid protein σ1, which is a trimeric fibre that protrudes from the surface of the virion. The σ1 attachment protein has three structurally distinct domains (the head, the body and the tail; Fig. 3a) and binds to both carbohydrate and protein receptors. Reovirus serotypes differentially bind to sialic acid19,66 in an initial adhesive step1 before serotype-independent engagement of junctional adhesion molecule A (JAM-A)67,68,69, which is expressed at tight junctions that link polarized cells as well as on some leukocytes70,71,72,73.

Figure 3: T1 and T3 reovirus σ1 proteins differentially bind to sialylated glycans.

a | Schematic showing the reovirus attachment protein σ1, which is a trimeric fibre that is composed of three structurally distinct domains: the head (purple), body (green) and tail (grey). The glycan-binding sites of serotypes 1 (T1) and 3 (T3) are located in different domains of σ1 (indicated with red circles). b | T1 reovirus binds to the GM2 glycan, which is composed of glucose (Glc) and galactose (Gal), with an α2,3-linked N-acetyl neuraminic acid (Neu5Ac), and β1,4-linked N-acetylgalactosamine (GalNAc). c | Close-up view of the T1 reovirus–GM2 interaction (Protein Data Bank (PDB) accession 4GU3). The protein surface is depicted in white, and the glycan-binding site is shown as a ribbon tracing in blue. Ser370 and Gln371 are crucial residues that are involved in GM2 binding and are shown in stick representation. The glycan receptor is shown in stick representation (yellow), with oxygen (red) and nitrogen atoms (blue). d | T3 reovirus binds to α2,3-, α2,6-, and α2,8-linked sialylated glycans. e | Close-up view of T3 reovirus σ1 in complex with α2,3-linked sialyllactose (PDB accession 3S6X). The protein surface is shown with the glycan-binding site depicted as a ribbon tracing in green. Arg202, which is required for the virus–sialic acid interaction, is shown in stick representation. Contacts between viral residues and sialic acid are depicted as grey dashes.


Reovirus–glycan interactions. Haemagglutination studies suggested that the T1 and T3 serotypes differentially bind glycans2,74. T1 reovirus agglutinates erythrocytes of human and non-human primates, whereas T3 reovirus agglutinates erythrocytes of various mammalian species. T3 reovirus, but not T1 reovirus, binds to glycophorin, which is a sialylated glycoprotein that is expressed on erythrocytes75,76,77. Glycan array screening provided new information on the specificity of different reovirus serotypes for distinct glycans. T1 reovirus σ1 specifically engages the GM2 glycan19 (Fig. 3b,c), whereas T3 reovirus σ1 binds to a range of sialylated glycans66 (Fig. 3d,e).

The structural basis for reovirus–glycan interactions. The σ1 proteins from T1 and T3 reovirus have been crystallized in complex with sialylated glycans19,66 (Fig. 3b–e). Interestingly, the glycan-binding sites of T1 and T3 reovirus are located in different domains of σ1. The carboxy terminal head domain of T1 σ1 binds to the GM2 glycan19, whereas the body domain is the glycan-binding region of T3 σ1 (Ref. 66). The T1 and T3 σ1 head domains also bind to JAM-A67, but the binding sites for GM2 and JAM-A in the T1 σ1 head domain are distinct, which suggests that T1 σ1 can interact with both receptors independently. The terminal Neu5Ac and N-acetylgalactosamine (GalNAc) moieties of the branched GM2 glycan contact the T1 σ1 head domain (Fig. 3c). The carboxyl group of Neu5Ac forms a hydrogen bond with the side chain of Gln371 in the attachment protein, whereas the Neu5Ac N-acetyl nitrogen and the glycerol chain form hydrogen bonds with residues in the σ1 backbone19. The finding that most of the interactions between T1 σ1 and Neu5Ac occur via backbone elements and not via amino acid side chains is rare in virus–glycan interactions and was confirmed by structure-guided mutagenesis studies19. Of note, the methyl group of the Neu5Ac N-acetyl chain inserts into a hydrophobic pocket, which is similar to the interaction that is observed for influenza virus haemagglutinin. The GalNAc moiety of the GM2 glycan is located in a surface-exposed shallow pocket of σ1 and provides contact via van der Waals interactions, which increase the specificity of T1 σ1 for GM2.

Although the precise glycan ligands for T3 reovirus σ1 are not known, T3 σ1 can bind to α2,3-, α2,6-, and α2,8-linked Neu5Ac (Fig. 3d) using a loop that connects β-spirals 2 and 3 in the body domain. Neu5Ac is anchored in the σ1 binding site by a bidentate salt bridge that is formed between the Neu5Ac carboxylate and Arg202 of T3 σ1 (Fig. 3e). This salt bridge is required for the interaction, as replacement of Arg202 with alanine or tryptophan abolishes the sialic acid-binding capacity of T3 reovirus66. Additional hydrogen bonds between the hydroxyl, acetyl and glycerol groups of Neu5Ac and the backbone carbonyl groups of T3 σ1 strengthen the interaction. Similarly to both influenza virus haemagglutinin and T1 reovirus σ1, the Neu5Ac N-acetyl methyl group inserts into a partially hydrophobic pocket of T3 reovirus σ1 (Ref. 66).

Glycan binding and reovirus tropism. Binding of the reovirus attachment protein to sialic acid is crucial for viral tropism and spread. Alteration of one or two residues in σ1 is sufficient to disrupt this interaction3,19,66. Binding of T3 reovirus to sialic acid promotes dissemination from the mouse intestines to sites of secondary replication, including the brain, heart and liver, and leads to infection of the bile duct epithelium, which results in biliary obstruction78. Moreover, sialic acid-binding T3 reoviruses replicate to higher titres in the mouse spinal cord and brain and are substantially more virulent than strains that do not bind to sialic acid79. Concordantly, sialic acid-binding T3 reoviruses infect primary cultures of cortical neurons more efficiently than strains that do not bind to sialic acid79, and infectivity of sialic acid-binding strains is reduced following neuraminidase treatment79,80. It is not established whether the sialic acid-binding capacity of T1 reovirus influences its pathogenesis, but preliminary findings suggest that this might be the case (J.E.S.-B. and T.S.D., unpublished observations).

Reovirus displays serotype-dependent pathology in the central nervous system (CNS) of newborn mice. The viral gene that encodes the σ1 attachment protein determines these serotype-dependent differences in neural tropism81,82,83,84, probably via the differential engagement of σ1 with cell surface receptors. Therefore, given that T1 and T3 reoviruses have distinct glycan-binding preferences, it is possible that differential glycan expression correlates with the serotype-dependent differences in the CNS tropism of reovirus. However, this model remains speculative and requires a comprehensive evaluation of the glycan expression profiles in vivo.

Adenovirus

Adenoviruses are non-enveloped double-stranded DNA (dsDNA) viruses in the Adenoviridae family that infect humans, other mammals and birds. Some adenovirus strains cause conjunctivitis or upper respiratory illness in humans, whereas others only rarely produce symptoms in immunocompetent individuals. Like reovirus, adenovirus serotypes differ in sialic acid binding; for example, although most adenoviruses use protein receptors85, the species D adenovirus 37 (Ad37) agglutinates human erythrocytes86 in a neuraminidase-sensitive manner87,88, which indicates that this adenovirus binds to sialic acid.

Structural basis of adenovirus–glycan binding. Similarly to reovirus σ1, adenovirus binds to host cells using a filamentous trimeric fibre that extends from the viral capsid at the twelve icosahedral vertices (Fig. 4a). The C-terminal region of the fibre folds into a globular structure (known as the knob), which binds to protein or carbohydrate receptors in a species-specific manner85. Interactions with sialic acid are strengthened by the presence of multiple attachment molecules per virion.

Figure 4: Interaction between adenovirus 37 and glycan.

a | Schematic representation of the trimeric adenovirus 37 (Ad37) fibre, which is the viral attachment protein that binds to the GD1a glycan. The monomers are depicted in purple, orange and grey. The glycan-binding site is located in the knob domain of two monomers (indicated by a red circle). b | Schematic showing the GD1a glycan, which is the Ad37 glycan receptor on host cells. GD1a is composed of glucose (Glc), galactose (Gal), a terminal α2,3-linked sialic acid and N-acetylgalactosamine (GalNAc), depicted as blocks. c | Surface representation of a top view of the Ad37 fibre knob in complex with the GD1a glycan (Protein Data Bank (PDB) accession 3N0I). The monomers are coloured as in part a, and the GD1a glycan is shown in stick representation (in yellow, with oxygen atoms in red and nitrogen atoms in blue). The two N-acetyl neuraminic acid (Neu5Ac)-binding sites that are occupied by the GD1a glycan are marked with red circles, and the third potential binding site (X) remains unoccupied. d | Surface representation of two knob monomers bound to the GD1a glycan, which is shown in stick representation (oxygen and nitrogen atoms as in part c). The interaction between the Ad37 fibre knob and sialic acid is mediated by several interactions (depicted as grey dashes), including a salt bridge between Lys345 and the Neu5Ac carboxylate, and hydrogen bonds between residues Tyr312 and Pro317 of the Ad37 knob and the N-acetyl chain of Neu5Ac of GD1a.


Glycan array screening assays showed that Ad37 binds specifically to the GD1a glycan, which is a branched hexasaccharide with two arms that terminate in α2,3-linked Neu5Ac21 (Fig. 4b). An initial structural analysis of the trimeric Ad37 fibre knob identified three equivalent binding sites for Neu5Ac89. However, another crystal structure of the knob–GD1a complex shows a stoichiometry 'mismatch', in which two knob monomers engage the two terminal Neu5Ac groups of GD1a in an identical manner and the third sialic acid-binding site on the knob remains unoccupied21 (Fig. 4c). Bivalent binding of GD1a increases the affinity of the interaction compared with the interaction of the fibre knob with monovalent sialyllactose alone21. The interaction between the Ad37 fibre knob and sialic acid involves a salt bridge between Lys345 and the Neu5Ac carboxylate. Hydrogen bonds between the Ad37 knob and additional Neu5Ac functional groups strengthen the interaction (Fig. 4d). A central salt bridge that anchors the Neu5Ac carboxylate group to the viral protein is required for binding89, similarly to binding of T3 reovirus σ1 to glycan66.

Glycan binding specificity and cell tropism. Sialic acid binding also influences the susceptibility of cells to infection by certain adenovirus types. Soluble GD1a diminishes attachment to human corneal epithelial cells of species D adenovirus serotypes Ad8, Ad19a, Ad19p and Ad37, but not species C adenovirus serotype Ad5 (Ref. 21), which suggests that different types have specific binding preferences for cellular receptors. Furthermore, this finding suggests that compounds that mimic GD1a might function as antiviral agents90. Ad8, Ad19a and Ad37 cause epidemic keratoconjunctivitis, whereas Ad19p does not21; however, the binding of all strains is GD1a-dependent, and therefore factors other than GD1a might contribute to serotype-dependent tropism21.

Rotavirus

Rotaviruses are non-enveloped dsRNA viruses that belong to the Reoviridae family. These viruses are a leading cause of childhood diarrhoea worldwide. Rotavirus attachment is dependent on glycans and is mediated by the trimeric outer-capsid protein VP4 (Ref. 91). Rotavirus infectivity is increased following proteolytic cleavage of the VP4 trimer into amino-terminal VP8* and C-terminal VP5* subunits. The VP8* subunit mediates attachment of the virus by binding to cell surface glycans92 (Fig. 5a), whereas the VP5* subunit facilitates membrane penetration93.

Figure 5: The VP8* domain of rotavirus VP4 differentially engages glycans.

a | The schematic depicts the rotavirus outer-capsid protein VP4, which is composed of VP5* and VP8*. The protein is a trimer, but only two of the three monomers are visible in some structures, and hence the third monomer is depicted in grey. The VP8* subunit binds to glycans (the binding site is indicated by the red circle), whereas the VP5* subunit facilitates membrane penetration. b | The glycan ligand for the human HAL1166 rotavirus, human blood group antigen (HBGA), comprises N-acetylgalactosamine (GalNAc), galactose (Gal) and fucose (Fuc). c | The crystal structure of rhesus rotavirus (RRV) VP8* in complex with N-acetyl neuraminic acid (Neu5Ac) (Protein Data Bank (PDB) accession 1KQR). The protein surface is shown in purple, and Neu5Ac is depicted in stick representation (yellow carbon with red oxygen atoms and blue nitrogen atoms). d | The crystal structure of human rotavirus strain HAL1166 VP8* (purple) in complex with HBGA (with orange carbons) (PDB accession 4DRV). The glycan receptors are shown as orange sticks with red oxygen atoms and blue nitrogen atoms. The HAL1166 VP8* binds to a completely different glycan at the same position at which RRV VP8* engages Neu5Ac.


Glycans that are bound by rotavirus. Many animal rotavirus strains, including rhesus rotavirus (RRV), bind to terminal sialic acid-containing receptors94,95,96,97,98,99, such as GM3 (Ref. 99). Some human rotaviruses, including strain Wa, bind to sialylated receptors in which the sialic acid is attached to one branch of biantennary glycans, such as ganglioside GM1 (Ref. 100), but other human rotavirus strains, such as HAL1166, do not. The combination of glycan array screening and crystallographic analysis of VP8* from the human strain HAL1166 (P[14] VP4 genotype) showed that this virus specifically binds to A-type HBGAs11 (Fig. 5b). HBGAs are oligosaccharides that are expressed on erythrocytes and epithelial cells and are also present in mucosal secretions. In addition, human P[11] rotavirus strains, which cause diarrhoea in neonates, bind to HBGA precursors12,29.

Structural basis of glycan-binding specificity. Remarkably, rotaviruses bind to sialylated and non-sialylated glycans using the same site in VP8* (Ref. 11). The crystal structure of RRV VP8* (P[3] VP4 genotype) in complex with sialic acid showed that VP8* assumes a galectin-like fold94. Galectins are glycan-binding proteins that usually bind ligands at a conserved binding site at the top of the galectin molecule. However, this site is blocked in VP8*, and the virus instead engages sialic acid via a different interface on the side of the spike-shaped VP8* protein94 (Fig. 5c).

The crystal structure of rotavirus strain HAL1166 VP8* in complex with A-type HBGA (Fig. 5d) shows subtle modifications in the binding site of VP8* that render it incapable of binding to sialic acid and instead enable binding to A-type HBGA. The change in specificity is due to the insertion of a single amino acid, Asn187, in the binding pocket, which reorients a neighbouring tyrosine, Tyr188, such that its side chain blocks binding to sialic acid via steric hindrance. At the same time, the reoriented tyrosine can form hydrophobic contacts with HBGA11. As the remaining residues in the binding site are mostly conserved among sialic acid-binding and non-sialic acid-binding rotaviruses, it is clear that a single amino acid substitution in the receptor-binding pocket has a substantial effect on glycan specificity. By contrast, minor amino acid changes in influenza virus lead to altered specificity for similar types of glycans, such as sialylated oligosaccharides with different linkages. Structurally, glycans such as GM1 and HBGAs have little in common and it is therefore remarkable that rotaviruses can switch between entirely different classes of glycans via such small changes in the VP8* receptor-binding pocket.

Glycan binding and cell tropism. Rotavirus pathogenesis varies between neonates and older children. Whereas a broad group of rotavirus strains cause disease in older children29,101, neonatal infection is commonly asymptomatic, and only a few select strains, including the P[11] VP4 genotype, preferentially infect and cause diarrhoea29 in neonates101. Glycan array screening showed that the P[11] genotype binds to glycan precursors of HBGAs12,29. These blood group precursor glycans are more commonly expressed in neonates12 than in older children or adults, which may explain the age restriction of rotavirus disease. Of note, the P[4] and P[8] genotypes that target older children bind to HBGAs but not to the precursors29,102,103. In addition, the VP8* of HAL1166, which engages A-type HBGA, agglutinates only type A erythrocytes. This suggests that human polymorphisms influence susceptibility to rotavirus infection, and individuals with blood group A may be at increased risk for infection with G8 P[14] rotavirus11.

RNA interference-mediated knockdown of genes that are involved in the synthesis of gangliosides decreases the capacity of human, porcine, bovine and simian rotaviruses to infect cells in vitro104. Moreover, ovine erythrocytes, which are naturally covered with sialic acid, interfere with rotavirus replication in mice by blocking Neu5Ac-binding sites on the virus and preventing attachment to other cells. Concordantly, neuraminidase treatment of these erythrocytes negates the therapeutic effect105. Taken together, these studies demonstrate a relationship between glycan-binding capacity and rotavirus pathogenesis.

Inhibition of virus–sialic acid interactions

Sialic acid-binding viruses include important human pathogens, such as adenovirus, influenza virus and rotavirus, as well as viruses with therapeutic applications, such as adenovirus and reovirus, which are being tested as gene-delivery vectors and oncolytic agents106,107. Therefore, manipulating the interactions of these viruses with sialic acid may improve therapeutic design and efficacy. Influenza virus attachment and release necessitate interactions with sialic acid and are important antiviral targets. Structure-based therapeutic design led to the development of oseltamivir and zanamivir, which are sialic acid derivatives that inhibit influenza virus neuraminidase and block the release of progeny virions108,109. Structural analysis shows that the sialic acid-binding pocket in group 1 neuraminidase proteins (N1, N4, N5 and N8) is larger than that observed in group 2 neuraminidase proteins (N2, N3, N6, N7 and N9), and group 2 neuraminidase proteins were used for the design of oseltamivir and zanamivir110. Therefore, it will probably be possible to generate group-specific neuraminidase inhibitors that fit more tightly in the active site110. A new class of neuraminidase inhibitor forms a stable covalent intermediate with neuraminidase, inhibits neuraminidase activity for extended intervals and has been shown to be effective in prophylaxis and therapy for influenza virus infection in mice111.

The influenza virus haemagglutinin is also an attractive drug target. However, it binds to sialylated glycans with low affinity, and it has been difficult to generate monovalent sialic acid derivatives that compete with native glycans112. Unfortunately, polyvalent sialic acid derivatives that target haemagglutinin are difficult to deliver into host cells and have considerable toxicity113. An interesting alternative approach to block haemagglutinin involves liposomes coated with lactoseries tetrasaccharide c (LSTc), which is an α2,6-linked sialic acid-bearing pentasaccharide114. This approach provides a framework to design a multivalent, but safe, delivery vehicle114.

Whereas most influenza antiviral therapeutic agents target the virus, DAS181 targets the host receptors. This drug is a fusion protein comprising an epithelial anchoring domain and a sialidase, which removes α2,3- and α2,6-linked sialic acid from respiratory epithelial cells115,116. DAS181 is effective against influenza A and B strains in vitro115 and protects mice from lethal challenge with H1N1 (Ref. 115) and H5N1 (Ref. 116) isolates. Phase II clinical trials showed that DAS181 reduced viral shedding in humans117. Thus, both virus and host determinants of sialic acid binding provide antiviral targets.

The crystal structure of Ad37 in complex with the GD1a glycan led to the development of trivalent sialic acid-based compounds that interact with all three binding pockets of the Ad37 fibre knob, thus engaging the knob with high avidity. Such compounds could be delivered topically, which bypasses potential problems of systemic drug delivery and could thus be useful for the treatment of epidemic keratoconjunctivitis. As it is unlikely that the Ad37–GD1a interaction is unique, multivalent sialic acid-based inhibitors form a template for the design of antiviral drugs in cases in which there are multiple sialic acid-binding sites in close proximity on multimeric viral attachment proteins.

Knowledge that has been gained from studies of virus–glycan interactions may be particularly useful to retarget viruses for use either as gene-delivery vehicles or as oncolytic agents. Reoviruses are naturally cytotoxic and preferentially infect transformed cells118,119,120,121. Targeting of transformed cells, coupled with the relative avirulence of these viruses in humans following the first few weeks of life, makes reovirus a suitable candidate for oncolytic therapy. Phase I and Phase II clinical trials have shown that the reovirus strain T3 Dearing (Reolysin; Oncolytics Biotech) is safe and non-toxic even at high doses122,123,124. T3 Dearing is now being tested in Phase III clinical trials for the treatment of head and neck cancer125.

The sialylation pattern in transformed cells is altered compared with that in untransformed cells126. Sialic acid abundance is increased in transformed cells owing to overexpression of sialyltransferases127. Understanding reovirus–glycan interactions could improve tumour targeting. In this regard, a T3 Dearing virus that lacks the σ1 head domain is less toxic in the host but retains its oncolytic potential128. This truncated T3 reovirus cannot bind to JAM-A, which indicates that the virus must adhere to cells using only sialic acids or using a receptor that has not been identified. It is possible that the altered glycan profile of cancer cells enables all three sialic acid-binding sites of the T3 σ1 trimer to be occupied, which increases the avidity of the binding interaction. Structural studies of reovirus σ1–sialic acid interactions19,66, coupled with structure-guided mutagenesis39,40, can also facilitate the generation of strains that have increased affinity for sialic acids and that may have increased tumour specificity and oncolytic potential.

Future directions

Structure–function studies of sialic acid-binding viruses with stalk-like attachment proteins show that these viruses primarily engage the sialic acid moiety using a small number of contacts. Additional residues confer specificity for a given linkage or glycan type. The location of the glycan-binding site is often conserved among attachment proteins of different strains of the same virus — for example, as seen for influenza virus and rotavirus. However, some viruses, such as reovirus, have evolved distinct glycan-binding regions in their attachment proteins, depending on viral serotype. A common feature of virus–glycan binding is the insertion of the methyl group of the N-acetyl chain of Neu5Ac into a hydrophobic pocket of the viral attachment protein. However, it is remarkable that even viruses with similarly structured, stalk-like attachment proteins, such as reovirus σ1 and adenovirus fibre, engage similar Neu5Ac-based glycans in an entirely different way. Even more remarkable is that the same protein from different serotypes of reovirus uses different binding sites for the same Neu5Ac.

The capacity to bind to sialyloligosaccharides contributes to host range, as exemplified by influenza virus, and influences tropism, as exemplified by adenovirus, influenza virus and reovirus. However, a comprehensive understanding of the role of glycan binding in viral tropism has been hindered by the lack of information about the specific glycans that are present on tissues that are targeted by viruses. Studies using plant lectins and immunohistochemistry suggest that the generalized binding preference of human influenza virus strains for α2,6-linked sialic acid and of avian strains for α2,3-linked sialic acid20,28 reflects the pattern of sialic acid expression of the target host129,130. However, these expression studies are limited in specificity to the sialic acid linkage type, which is insufficient to explain differences in glycan binding. A remaining challenge is to increase our understanding of glycan expression profiles in vivo. This knowledge gap currently presents the largest obstacle to attaining a comprehensive understanding of virus–glycan interactions and their functions in disease. Mass spectrometry131, microarray technology132 and shotgun glycomics are being used to define the glycome and tissue-specific glycan expression profiles. In shotgun glycomics, glycolipids and glycoproteins are extracted from organs, tissues or cells, and labelled. The identity and composition of these glycans are determined by high-performance liquid chromatography (HPLC). This approach133,134, coupled with glycan array screening, could provide a framework for studying organ- and cell type-specific glycan use by viruses, as shown for swine influenza virus134.

The characterization of glycan expression on the cell surface is required to synergize glycan array technology with pathogenesis studies. Unfortunately, none of the glycan array platforms fully represent the glycans that are found on the lung and bronchial epithelium131, which can lead to discrepancies between glycan array screening data and functional studies. For example, glycan array screening indicates that certain influenza virus strains bind similarly to specific glycans, whereas such strains differ in their capacity to bind to lung tissue explants. Thus, the physiologically relevant receptors are not known.

Shotgun glycomics could be extended to incorporate glycans onto arrays in their relative biological abundance. Virus–sialic acid interactions are usually of low affinity. Physiologically relevant glycan receptors are presumably expressed on the surface in moderate to high abundance to facilitate efficient attachment. Further studies investigating the avidity of haemagglutinin for glycans, as well as the tissue distribution of these carbohydrates, will improve our understanding of glycan receptors for influenza and other viruses. Future work in this field will determine how the intricate glycan-binding preferences that are displayed by viruses function in disease and provide new ideas for altering glycan use to improve therapeutic applications.