Summary
Varicella-zoster virus (VZV), a double-stranded DNA virus, causes varicella, establishes lifelong latency in ganglionic neurons, and reactivates later in life to cause herpes zoster, commonly associated with chronic pain. The VZV genome is densely packed and produces multitudes of overlapping transcripts deriving from both strands. While 71 distinct open reading frames (ORFs) have thus far been experimentally defined, the full coding potential of VZV remains unknown. Here, we integrated multiple short-read RNA sequencing approaches with long-read direct RNA sequencing on RNA isolated from VZV-infected cells to provide a comprehensive reannotation of the lytic VZV transcriptome architecture. Through precise mapping of transcription start sites, splice junctions, and polyadenylation sites, we identified 136 distinct polyadenylated VZV RNAs that encode canonical ORFs, non-canonical ORFs, and ORF fusions, as well as putative non-coding RNAs (ncRNAs). Furthermore, we determined the kinetic class of all VZV transcripts and observed, unexpectedly, that transcripts encoding the ORF62 protein, previously designated as immediate-early, were expressed with late kinetics. Our work showcases the complexity of the VZV transcriptome and provides a comprehensive resource that will facilitate future functional studies of coding RNAs, ncRNAs, and the biological mechanisms underlying the regulation of viral transcription and translation during lytic VZV infection.
Introduction
Varicella-zoster virus (VZV) is a ubiquitous human alphaherpesvirus and causative agent of both varicella (chickenpox) and herpes zoster (HZ or shingles) (Gershon et al., 2015). Varicella results from primary VZV infection and leads to the establishment of a lifelong latent infection in sensory neurons of the trigeminal and dorsal root ganglia (Depledge et al., 2018a; Gilden et al., 1983). In one-third of infected individuals, VZV reactivates from latency later in life to cause HZ (Gershon et al., 2015). Whereas varicella is generally experienced as a benign childhood disease, HZ is frequently associated with difficult-to-treat chronic pain (post-herpetic neuralgia) (Gilden et al., 2000; Johnson and Rice, 2014). Despite the recent availability of the highly effective HZ subunit vaccine (Shingrix), the health and societal burden of HZ and its complications remains high due to adverse side effects, the high cost of the vaccines, and changing demographics (Gater et al., 2015).
The 125 kb double-stranded DNA (dsDNA) genome of VZV, first sequenced in 1986, encodes at least 71 unique open reading frames (ORFs) that are expressed during lytic infection (Cohen, 2010; Davison and Scott, 1986). The current annotation of the VZV genome largely relies on both in silico ORF predictions and homologous ORFs in the closely related human alphaherpesvirus herpes simplex virus type 1 (HSV-1) (Cohen, 2010). For most VZV ORFs, the boundaries of transcription have not been accurately determined, meaning that transcription start sites and polyadenylation sites are poorly resolved. Moreover, in silico ORF prediction does not account for the possibility of spliced transcripts. Indeed, we recently applied ultra-deep short-read RNA-sequencing to define the latent viral transcriptome and discovered the spliced VZV latency-associated transcript (VLT) (Depledge et al., 2018b).
The compact nature of viral genomes, combined with their ability to encode overlapping RNAs, presents a significant challenge to studies that rely on interrogating genome sequences or that use viral mutants to probe the function(s) of viral proteins. Here, a single point mutation or frameshift may impact multiple distinct RNAs at the same time, a factor that may be further confounded by inaccurate or missing transcript annotations. The re-annotation of the human cytomegalovirus (HCMV) transcriptional and translational landscape and subsequent refinements of HSV-1, Kaposi’s sarcoma-associated herpesvirus (KSHV), and human herpesvirus 6 (HHV6) transcriptome architectures have all demonstrated that herpesviruses exhibit a complex transcriptional pattern of alternative splicing, opposing transcription, read-through transcription, fusion transcripts, 5’ untranslated region (5’ UTR) and 3’ UTR variations and previously unidentified non-coding RNAs (Arias et al., 2014; Bencun et al., 2018; Depledge et al., 2019; Finkel et al., 2020; O’Grady et al., 2016, 2019; Stern-Ginossar et al., 2012; Whisnant et al., 2019). Indeed, recent cDNA-based long-read sequencing also indicated that the lytic VZV transcriptome is substantially more complex than previously recognized (Prazsák et al., 2018).
By analogy to other herpesviruses and limited experimental data (Lenac Rovis et al., 2013; Reichelt et al., 2009), VZV transcripts and their encoded proteins have been divided into three kinetic classes: immediate-early (IE), early (E) and late (L). Expression of E and L transcripts is considered dependent on viral proteins of the preceding kinetic classes, while expression of IE transcripts occurs in the absence of viral protein synthesis (Honess and Roizman, 1974). Prior studies have defined four VZV proteins encoded by ORF4, ORF61, ORF62, and ORF63 as being transcriptional regulators that initiate lytic transcript expression (Defechereux et al., 1993; Kost et al., 1995; Moriuchi et al., 1993, 1994; Perera et al., 1993), whose corresponding transcripts have been classified as IE by analogy to their HSV-1 orthologues. L transcripts, such as VLTly, the lytic isoform of VLT, are either expressed at very low levels prior to or exclusively after viral DNA replication has commenced (Depledge et al., 2018b). However, the species specificity and highly cell-associated nature of VZV in vitro have hampered detailed analysis of VZV transcription. Improved protocols to obtain cell-free VZV and mass-spectrometry have provided some insight into the temporal pattern of viral protein expression (Ouwendijk et al., 2020), but lack sensitivity compared to RNA-sequencing and do not provide information on viral transcription.
To address this, we have decoded the architecture of the lytic VZV transcriptome in both human epithelial cells and neurons while contrasting discrete VZV strains. We subsequently established the kinetic class of all lytic viral transcripts and integrated these results to provide a comprehensive overview of the complexity and structure of the lytic VZV transcriptome as a rich resource that will enhance future functional studies of VZV biology.
Results
Decoding the complexity of lytic VZV gene expression
Standard methods for annotating viral transcriptomes require the integration of multiple types of Illumina RNA sequencing (RNA-Seq) data to identify transcription start sites (TSS), cleavage and polyadenylation sites (CPAS), splice sites, and transcript structures. The latter is particularly challenging to infer using conventional short-read sequencing approaches (Depledge et al., 2018c). By contrast, direct RNA sequencing (dRNA-Seq) using nanopore arrays offers the potential to capture all these distinct data points in a single sequencing run (Depledge et al., 2019; Garalde et al., 2018). To examine the structure of the lytic VZV transcriptome, ARPE-19 cells were infected with the VZV pOka wild-type strain and total RNA was extracted at 96 hours post infection (hpi, Figure 1A). The 96 hpi time-point was chosen to maximise the diversity of VZV transcripts likely to be present. Sequencing of the polyadenylated RNA fraction was performed using both RNA-Seq and dRNA-Seq (Figure 1A). Whereas short reads generated by standard Illumina RNA-Seq are not amenable to accurate isoform reconstruction in complex reads, they provide higher sequencing depth and detection of CPAS and, crucially, enable splice-site correction of dRNA-Seq reads via the junction-polishing package in the software FLAIR (Tang et al., 2020). By contrast, dRNA-Seq can sequence full-length RNAs and provides critical information on the presence of discrete RNA isoforms with regions of overlap, while also allowing mapping of TSS and CPAS. Finally, we performed Illumina Cap Analysis Gene Expression Sequencing (CAGE-Seq) (Murata et al., 2014) to map TSS by an orthogonal approach (Figure 1A).
TSS and CPAS estimates provided by dRNA-Seq data closely overlapped with those derived from our Illumina approaches (Figure 1B). TSS sensitivity was nearly four-fold higher in CAGE-Seq datasets compared to dRNA-Seq, with effectively all TSS uniquely found by CAGE-Seq being low abundance – likely reflecting artefacts derived from RNA processing (i.e. recapping of cleaved RNA) or a generalized dysregulation of transcription initiation accompanying the late stages of a viral infection. A total of 104 TSS overlapped between the dRNA-Seq and CAGE-Seq datasets, all of which were the most abundant TSS in both datasets (Table S1). Importantly, dRNA-Seq TSS estimates were located up to 20 nucleotides (nt) downstream (median 11 nt) of TSS derived via CAGE-Seq (Figure 1B). This difference is best explained by the presence of low-quality ends of dRNA-Seq reads that are not aligned when using local alignment strategies. CPAS sensitivity was higher in dRNA-Seq than RNA-Seq datasets (70 vs. 60 sites) with 53 sites detected by both approaches (Table S2). CPAS estimates provided by dRNA-Seq aligned closely with those derived from RNA-Seq data (median 1 nt difference, Figure 1B) due to the 3’ → 5’ direction of dRNA-Seq. Finally, we reconstructed the VZV transcriptome using TSS and CPAS to define transcript structures followed by visual confirmation of read data to identify splice sites, define alternatively spliced transcripts, and examine read-through transcription (Figure 1C).
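The strand-aware pairing of TSS estimates between methods can be illustrated with a short sketch. This is not the pipeline used in this study: the function name, the tuple representation of sites, and the coordinates are hypothetical, and the 20 nt window is simply taken from the maximum downstream offset reported above. "Downstream" is interpreted relative to the strand of transcription, so for reverse-strand transcripts a downstream dRNA-Seq TSS has a smaller genome coordinate than its CAGE-Seq counterpart.

```python
from statistics import median


def match_tss(cage_tss, drna_tss, max_offset=20):
    """Pair each dRNA-Seq TSS with the closest CAGE-Seq TSS on the same strand
    that lies 0..max_offset nt upstream of it.

    Sites are (position, strand) tuples with strand "+" or "-".
    Returns a list of (cage_pos, drna_pos, strand, offset) tuples.
    """
    pairs = []
    for d_pos, d_strand in drna_tss:
        best = None
        for c_pos, c_strand in cage_tss:
            if c_strand != d_strand:
                continue
            # Positive offset: the dRNA-Seq TSS is downstream of the CAGE-Seq TSS.
            offset = d_pos - c_pos if d_strand == "+" else c_pos - d_pos
            if 0 <= offset <= max_offset and (best is None or offset < best[3]):
                best = (c_pos, d_pos, d_strand, offset)
        if best is not None:
            pairs.append(best)
    return pairs


# Hypothetical coordinates for illustration only (not taken from Table S1).
cage = [(1000, "+"), (5000, "-"), (9000, "+")]
drna = [(1011, "+"), (4992, "-"), (9500, "+")]

pairs = match_tss(cage, drna)
offsets = [p[3] for p in pairs]
print(pairs)            # two matched sites; the TSS at 9500 has no partner
print(median(offsets))  # median downstream offset of the matched pairs
```

The same pairing logic, with a window allowing offsets in both directions, could be applied to CPAS estimates from dRNA-Seq and RNA-Seq.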
Reannotation of the VZV transcriptome reveals alternative transcript isoforms and putative non-coding RNAs
In VZV pOka-infected ARPE-19 cells, we identified 136 distinct VZV RNAs that were readily detectable at 96 hpi. Along with defining the UTRs of 96 RNAs encoding the 71 canonical VZV ORFs, we also identified 40 additional RNAs (Figure 2, Table S3). To reduce confusion, we numbered all RNAs according to the respective canonical ORFs and delineated transcript isoforms encoding at least part of the same ORF by number. For instance, three transcripts are transcribed from the ORF0 locus and these are here referred to as VZV RNA 0-1, 0-2, and 0-3. The identified RNAs included transcripts encoding 5’ extended ORFs, 5’ truncated ORFs, 3’ extended ORFs, 3’ truncated ORFs, internally spliced variants, and putative non-coding RNAs (ncRNAs). Importantly, our study confirmed several previously described RNA isoforms such as a genome-termini spanning RNA that encodes ORF0 (RNA 0-3) (Kemble et al., 2000) and identified an additional 5’ truncated ORF0 RNA of unknown function (RNA 0-2) (Figure 3A). Likewise, the extensive low-level internal splicing of ORF50 reported previously (Sadaoka et al., 2010) (RNAs 50-1 to 50-5) was also observed here, supplemented by two additional 5’ truncated isoforms (RNAs 50-6 and 50-7) expressed at relatively high abundance (Figure 3B). Examples of previously undocumented RNAs include two new spliced ORF9 transcript isoforms and two internally spliced transcript isoforms encoding N-terminal ORF24 and ORF48 coding sequences (CDS) domains that are spliced into novel C-terminal domains (Figure 2). Finally, we confirmed expression of transcripts encoding the novel ORF9 and ORF48 variants in VZV-infected ARPE-19 cells by RT-PCR (Figure S1A-B).
Fusion transcripts combine sequences from two or more distinct canonical viral transcripts, and most likely result from transcription termination occurring at an alternative CPAS downstream of the canonical CPAS, followed by internal splicing (Depledge et al., 2019). The resulting fusion transcripts are predicted to encode new proteins that contain fused domains from two or more distinct protein products. We identified seven VZV fusion transcripts in total. Four of these contain distinct fusions of transcripts encoding VLT and ORF63. Internal splicing of RNA 12-2 and RNA 12-3 transcripts yields two distinct splice variants that fuse parts of ORF12 and ORF13 CDS domains (Figure 3C). Additionally, we observed transcripts that fused the 5’ UTR of ORF31 transcripts to ORF32 transcripts, resulting in a transcript encoding pORF32 with an alternative 5’ UTR. We confirmed expression of transcripts encoding the ORF12-ORF13 and ORF31-ORF32 fusions in VZV-infected ARPE-19 cells by RT-PCR (Figure S1C-D).
Additionally, we discovered two novel polyadenylated VZV transcripts: RNA 13.5-1 and RNA 43-2. RNA 13.5-1 is 634 nt long, encodes two putative CDS domains (88 and 55 amino acids; aa), and is positioned antisense to RNA 14-1 (encoding ORF14). Like RNA 14-1, RNA 13.5-1 stretches across the R2 reiterative region, a short repeat region that exhibits length variations between viral strains and within viral populations and thus leads to length variations in the encoded transcripts (Jensen et al., 2020) (Figures 2 and S1E). We also identified a highly expressed 3’ truncated RNA (RNA 43-2) that overlaps with the 5’ end of RNA 43-1 (encoding ORF43, Figures 2 and S1F). RNA 43-2 is a spliced 590 nt transcript that encodes only a short CDS domain (21 aa). Expression of both RNA 13.5-1 and RNA 43-2 in VZV-infected ARPE-19 cells was confirmed by RT-PCR (Figures S1E-F).
Finally, we used an in silico approach to predict the coding potential of all 136 polyadenylated VZV RNAs (Table S3). The Coding Potential Calculator version 2 algorithm (CPC 2.0) (Kang et al., 2017) calculates the coding probability of a transcript based on its length, isoelectric point, and Fickett score of the longest CDS encoded. Of the 96 VZV RNAs encoding the 71 canonical ORFs, 89 were assigned a coding probability exceeding 90% with only two canonical RNAs – encoding the two smallest VZV proteins: pORF49 (81 aa) (Sadaoka et al., 2007) and pORF57 (71 aa) (Cox et al., 1998) – incorrectly predicted to be non-coding transcripts (Table S3). Of the 40 VZV RNAs encoding non-canonical products, 17 were predicted to be non-coding (Table S3), including two novel transcripts: RNA 13.5-1 (7%) and RNA 43-2 (13%) (Figure S2).
The lytic VZV transcriptome is not influenced by viral strain or cell type
VZV genome sequences are highly conserved (Norberg et al., 2015), suggesting that strain-specific differences in coding capacity are likely minimal. To test this hypothesis, we infected ARPE-19 cells with VZV EMC-1 for 96 hrs and sequenced the poly(A) fraction of RNA by dRNA-Seq (Table S4). This enabled a comparative analysis of datasets obtained from ARPE-19 cells lytically infected with either VZV pOka or EMC-1 to determine whether either strain encodes unique transcripts (Figure S3). No such transcripts were identified at the RNA level, although we note that nucleotide level changes may still impact encoded proteins – as is exemplified by the N-terminal extended pORF0 uniquely present in pOka (Figure S4).
As VZV is capable of infecting diverse cell types including epithelial cells and neurons, we also determined whether the VZV transcriptome remains similar between VZV pOka-infected ARPE-19 cells and human embryonic stem cell (hESC)-derived neurons (Sadaoka et al., 2016, 2017). Again, no cell-type specific novel VZV RNAs or variant VZV RNAs were identified (Figure S5), indicating that observed differences in VZV RNA expression levels and infectivity in distinct cell types (Baird et al., 2014; Sadaoka et al., 2017) are not due to the presence of cell-type specific VZV RNAs, but are likely driven by host cell factors.
Overexpression of RNA 43-2 does not impair VZV replication in epithelial cells
Based on the high relative expression of RNA 43-2, its low protein coding potential, and its genomic location, we hypothesized that it may function as a ncRNA involved in regulating the expression of the longer RNA 43-1 (encoding pORF43). ORF43 is an essential gene (Zhang et al., 2010), which putatively encodes capsid vertex component 1 and is postulated to be important for viral DNA encapsidation by analogy to HSV UL17 (Toropova et al., 2011). To test this hypothesis, we generated three stable ARPE-19 cell lines expressing RNA 43-2, or an empty vector as control, and analyzed VZV EMC-1 replication at 48 and 72 hpi by flow cytometry and plaque assay. We did not observe any significant differences in the number of VZV-infected cells, the number of plaques, or plaque sizes between RNA 43-2 expressing cells and vehicle control cells (Figures S2A-B). Additionally, RNA 43-2 expression did not significantly reduce expression of RNA 43-1 in VZV-infected cells (p = 0.08 by Student’s t-test) (Figure S2C).
Decoding the kinetic class of VZV transcripts in lytically infected cells
To determine the kinetic class of each lytic VZV RNA, cell-cycle synchronized ARPE-19 cells were infected with cell-free VZV EMC-1 and cultured for 12 or 24 hrs in the presence or absence of actinomycin D (ActD, transcription inhibitor), cycloheximide (CHX, translation inhibitor), or phosphonoacetic acid (PAA, inhibitor of the viral DNA polymerase), and viral RNAs were subsequently profiled using dRNA-seq and RNA-seq (Figure 4 & S6). Transcription of IE RNAs is not dependent on de novo viral protein production, whereas transcription of E RNAs depends on IE proteins. L RNAs are further subclassified into two kinetic classes, leaky-late (LL) and true-late (TL); LL RNAs are expressed at very low levels before, and TL RNAs exclusively after, viral DNA replication has commenced.
Given the very high sensitivity of our Illumina RNA-seq and Nanopore dRNA-seq analyses, many VZV transcripts were detected in all experimental conditions except ActD-treated cells, albeit at vastly different abundances (Table S3). Therefore, we first established objective criteria to classify VZV transcripts into distinct kinetic classes, based on their susceptibility to CHX treatment (to identify IE genes) and PAA treatment (to identify E and L genes). Taking advantage of the fact that one dRNA-Seq read equals one RNA, we counted the number of reads mapping unambiguously to each of the 136 VZV RNAs for each sample. We subsequently calculated the relative expression level for each transcript by expressing each count as a fraction of the total VZV transcript count for that sample (Figure 5). Reads that could not be unambiguously assigned were excluded from this analysis. We defined IE VZV transcripts as those which accounted for an equal or higher fraction of transcripts in the CHX-treated sample than in the PAA-treated or untreated samples (Figure 5). E RNAs were assigned as those which had a proportional distribution in PAA-treated samples of at least 50% of the untreated sample. LL transcripts were those which had a proportional distribution in PAA-treated samples of 5% – 50% of the untreated sample. TL transcripts were those with a proportional distribution in PAA-treated samples of less than 5% of the untreated sample and were effectively only detected in untreated samples.
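The classification criteria above can be expressed as a short decision procedure. The following is a minimal sketch under stated assumptions rather than the analysis code used in this study: the function names, the toy transcript labels, and the read counts are hypothetical, and "proportional distribution in PAA-treated samples of X% of the untreated sample" is interpreted as the ratio of the relative expression in the PAA-treated sample to that in the untreated sample. The handling of transcripts with no reads in any condition is likewise an assumption.

```python
def relative_expression(counts):
    """Express each unambiguous read count as a fraction of the sample total."""
    total = sum(counts.values())
    return {rna: n / total for rna, n in counts.items()}


def kinetic_class(chx, paa, untreated):
    """Assign a kinetic class from relative expression in three conditions.

    chx, paa, untreated: fraction of all VZV reads assigned to this RNA in the
    CHX-treated, PAA-treated, and untreated sample, respectively.
    """
    if chx == 0 and paa == 0 and untreated == 0:
        return "not detected"   # assumption: no call without reads
    if chx >= paa and chx >= untreated:
        return "IE"             # equal or higher fraction under CHX
    if paa >= 0.5 * untreated:
        return "E"              # at least 50% of the untreated fraction
    if paa >= 0.05 * untreated:
        return "LL"             # 5% to 50% of the untreated fraction
    return "TL"                 # below 5% of the untreated fraction


# Hypothetical read counts for four toy transcripts (not data from Table S3).
chx_counts = {"rna_a": 700, "rna_b": 200, "rna_c": 80, "rna_d": 20}
paa_counts = {"rna_a": 200, "rna_b": 700, "rna_c": 99, "rna_d": 1}
untreated_counts = {"rna_a": 100, "rna_b": 300, "rna_c": 300, "rna_d": 300}

chx = relative_expression(chx_counts)
paa = relative_expression(paa_counts)
unt = relative_expression(untreated_counts)

classes = {rna: kinetic_class(chx[rna], paa[rna], unt[rna]) for rna in chx}
print(classes)
```

Because the IE test is evaluated first, a transcript that satisfies it is never assigned to a later class, which mirrors the order in which the criteria are stated above.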
Collectively, our approach showed that two VZV transcripts (RNA 4-1 and RNA 61-1, encoding pORF4 and pORF61, respectively) were expressed at very high levels (accounting for 60% of all VZV transcripts) in the absence of de novo protein production and, in agreement with prior studies (Moriuchi et al., 1993, 1994), were classified as IE transcripts (Figure 5). Six additional transcripts were expressed at high levels in CHX-treated samples relative to other conditions and were also assigned IE status. These included RNA 63-1 (encoding pORF63), RNA 0-1 (pORF0), RNA 0-2 (putative ncRNA), RNA 61-2 (N’ terminal truncated pORF61), and RNA 43-2 (putative ncRNA). We also observed that RNA 9-1 (pORF9) is expressed at similar levels to the transcripts above and provisionally classified it as IE, but note that relative expression levels of this transcript increase throughout infection. pORF4, pORF61, and pORF63 are known transcriptional activators of VZV and canonical IE transcripts (Kost et al., 1995; Moriuchi et al., 1993, 1994), whereas the kinetic class of ORF0 has not been fully resolved (Koshizuka et al., 2010). Low-level transcription of other VZV transcripts was observed but attributed to low-level transactivation by viral tegument proteins delivered from incoming virions. We also sequenced samples treated with ActD to control for the potential presence of residual background transcripts in the virus preparations and confirmed only minimal amounts of VZV transcripts to be present (Figure S6).
In total, 69 transcripts were classified as viral DNA replication insensitive E RNAs, including the experimentally validated transcripts encoding pORF28 and pORF29 (Yang et al., 2004). A further 27 transcripts were classed as LL and 31 transcripts as TL, the latter including both RNA 14-1 (pORF14) and VLT (pVLT), both of which have been experimentally confirmed previously (Depledge et al., 2018b; Storlie et al., 2006). Notably, about half of the VZV RNAs originating from the same locus were of different transcriptional class (Figure S6). Typically, the shortest RNA isoforms were of earlier kinetic class, with alternative TSS and CPAS usage increasing transcript diversity by producing longer alternative RNA isoforms at later stages of infection, e.g. RNA 9-1, 9A-1, 9-2, and 9-3 (Figure S6B).
ORF62 transcripts are expressed with Late kinetics during lytic VZV infection
VZV transcripts RNA 62-1 and RNA 62-2 encode, respectively, the major viral transcriptional activator protein, pORF62, and a predicted N-terminal truncated pORF62 variant. Surprisingly, our data indicate that expression of the ORF62-encoding RNAs is dependent on both de novo (viral) protein synthesis and viral DNA replication, thereby classifying these RNA 62 transcripts as L. This contradicts the current classification of ORF62 as an IE gene, although we note this classification was obtained by analogy to the function of its HSV-1 orthologue infected cell polypeptide 4 (ICP4) (Felser et al., 1988). To confirm and substantiate our findings, we analyzed the impact of viral DNA replication on ORF62-encoding RNA and protein expression in multiple VZV-susceptible cell types at 24 hpi. RT-qPCR analysis showed that PAA treatment was associated with an approximately 10-fold decrease in expression of IE transcripts encoding ORF61 and ORF63, consistent with the absence of VZV DNA replication and spread in culture, and >10,000-fold decrease in expression of TL, viral DNA replication-dependent, RNA VLTly (Figure 6A). Notably, expression of RNAs encoding ORF62 was more severely affected (∼500-fold reduction) by PAA treatment than IE RNAs encoding ORF61 and ORF63. Similar results were obtained for VZV strain pOka infected epithelial ARPE-19 cells, hESC-derived neurons and lung fibroblast MRC-5 cells (Figure S7A to C).
Similarly, PAA treatment severely reduced the abundance and affected the cellular localization of pORF62 in VZV-infected ARPE-19 cells (Figure 6B). In the absence of PAA, both pORF61 and pORF62 were abundantly expressed in plaques of VZV-infected cells, with mostly diffuse nuclear pORF61 staining and pORF62 staining presenting as abundant globular nuclear and diffuse cytoplasmic staining (Figure 6B, left panels). Consistent with inhibition of VZV replication, no plaques were observed in PAA-treated cultures and infected cells were rare. The pORF61 staining pattern in infected cells was comparable between PAA-treated and untreated VZV-infected cells, whereas pORF62 staining intensity was severely reduced and showed weak, mostly diffuse nuclear staining with fewer intensely staining punctae (Figure 6B, right panels), possibly reflecting incoming pORF62 originating from VZV virions. Identical IF staining results were observed in ARPE-19, MRC-5 and melanoma MeWo cells (Figure S7D and E, respectively). Overall, our data demonstrate that RNA 62-1, as well as RNA 62-2 and RNA 62-3, are expressed at low levels prior to viral DNA replication, with robust expression occurring only after viral DNA replication is initiated, consistent with the expression of LL, but not IE transcripts.
Discussion
Understanding the full coding capacity of a given virus is crucial to understanding its biology. With the advent of new RNA-sequencing methodologies it has become clear that transcription of herpesvirus genomes is much more complex than previously anticipated. Here, we demonstrate that VZV is no exception and provide a comprehensive reannotation of the VZV transcriptome during lytic infection of human retinal pigment epithelial cells and hESC-derived neurons, incorporating data from two distinct VZV strains. By integrating RNA-Seq, CAGE-seq, and dRNA-seq, we have resolved the architecture of the lytic VZV transcriptome. Specifically, we report the TSS and CPAS for all annotated VZV transcripts – including refinement of the 5’ UTRs and 3’ UTRs in RNAs encoding canonical ORFs – and show that VZV further diversifies its transcription through the use of (1) additional TSS and CPAS, (2) disruption of transcription termination, and (3) alternative splicing. As a result, several transcript isoforms are expressed from the same locus and fusion RNAs are produced that modify UTRs and CDS of multiple transcripts. Given that 5’ UTR sequences influence the translational efficiency of the downstream CDS (Leppek et al., 2018), alternative UTR usage may provide the virus with an additional mechanism to regulate its protein expression throughout its infectious cycle. Collectively, this study defined 136 polyadenylated VZV RNAs that are expressed during lytic VZV infection, many of which are predicted to increase the diversity of the viral proteome.
Although the VZV genome is considered relatively stable, multiple strains currently co-circulate and recombine (Norberg et al., 2015; Tyler et al., 2007), potentially influencing the viral transcriptome. However, our comparison of the transcriptional landscape of a VZV clade 1 (strain EMC-1) and a clade 2 (strain pOka) virus revealed that no strain-specific lytic transcript isoforms exist in VZV-infected ARPE-19 cells. Inter-strain differences in repeat variations could nevertheless impact transcription of RNA 11-1 (containing R1), RNA 13.5-1 (R2), RNA 14-1 (R2), RNA 22-1 (R3), RNA 63-2 (R4), RNA 63-3 (R4) and all RNA 59 and RNA 60 isoforms (R5) (Jensen et al., 2020). Similarly, strain-specific polymorphisms may function to extend coding domains such as the N-terminal extended pORF0 (RNA 0-3) in VZV pOka in comparison with other VZV clades (Figure S4). Additionally, while previous studies suggested that the VZV transcriptome is generally similar across diverse cell types, none had sufficient resolution or used the methodologies required to disentangle transcript structures (Baird et al., 2014; Depledge et al., 2018b; Jones et al., 2014). Here, we demonstrated that identical transcript isoforms were detected during lytic VZV infection of human retinal pigment epithelial cells and hESC-derived neurons. Thus, while VZV strain-specific polymorphisms and/or cell type may influence viral gene expression, their impact on the lytic VZV transcriptome structure appears to be small.
Many herpesviruses express ncRNAs during lytic and latent infections (Hancock and Skalsky, 2018). Previously, we identified the putative dual-function polyadenylated VZV RNA VLT, which encodes a protein expressed during lytic infection, but is also functional as an RNA, inhibiting ORF61 RNA expression in overexpression experiments (Depledge et al., 2018b). Here, we identified 17 additional polyadenylated VZV RNAs that are predicted to be non-coding. Diverse functions have been attributed to human ncRNAs, including the modification of antisense or overlapping transcription events (Pelechano and Steinmetz, 2013; Saxena and Carninci, 2011). The generation of functional VZV mutant viruses with impaired expression of identified putative ncRNAs is challenging due to overlap with other viral transcripts. Therefore, we studied the function of VZV RNA 43-2 by means of RNA 43-2 overexpression followed by VZV superinfection. However, RNA 43-2 did not significantly reduce expression of the overlapping RNA 43-1 transcript nor influence VZV replication. These results most likely reflect the complexity of VZV transcript regulation during lytic infection, as putative ncRNA 43-2 is expressed earlier (IE kinetics) and at higher abundance than RNA 43-1 (E kinetics). Considering the multitude of ncRNAs expressed by other herpesviruses and their crucial roles during infection (Hancock and Skalsky, 2018), delineating the functional importance of VZV ncRNAs should be considered a research priority.
Twenty-eight of 136 VZV transcripts are (multiply) spliced. Strikingly, the majority of splicing events occur in a hypercomplex region of the VZV genome encoding both VLT and ORF61. We have previously shown this locus to be characterized by extensive alternative splicing (Depledge et al., 2018b) and here report the presence of multiple transcripts that variously encode a fusion protein of pVLT and pORF63 (pVLT-ORF63) or an N’ terminal extended pORF63 (pORF63-N+). The coding potential of pVLT-ORF63 or pORF63-N+ and the functional consequences of (and requirement for) these fusion transcripts during reactivation from latency are described in a separate study (Sadaoka et al. unpublished data), while their functional role(s) during lytic infection are under investigation. Except for VLTlyt63-1, all spliced VZV RNAs used the canonical splice donor (GT) site (Figure S8A). The splice donor sites were highly enriched for C/A (−3 position), A (−2), G (−1) and A (+1). All spliced VZV RNAs used the canonical splice acceptor site (AG), often flanked by C (−1) and G/A (+1) (Figure S8B). Consistent with the use of the cellular splicing machinery to process viral pre-mRNAs, VZV consensus splice donor and acceptor sites closely resemble those of the human transcriptome (Zhang et al., 2007).
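A check for canonical GT donor and AG acceptor dinucleotides at intron boundaries can be sketched as follows. This is an illustrative example only and not the procedure used to generate Figure S8: the function names, the toy sequence, and the 1-based inclusive intron coordinates are assumptions made for the sketch. For transcripts on the reverse strand, the intron sequence is reverse-complemented before the first and last two nucleotides are read.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def splice_dinucleotides(genome, intron_start, intron_end, strand="+"):
    """Return (donor, acceptor) dinucleotides of an intron.

    intron_start and intron_end are 1-based inclusive forward-strand
    coordinates with intron_start <= intron_end.
    """
    intron = genome[intron_start - 1:intron_end]
    if strand == "-":
        intron = reverse_complement(intron)
    return intron[:2].upper(), intron[-2:].upper()


def is_canonical(donor, acceptor):
    """True if the intron uses the canonical GT donor and AG acceptor."""
    return donor == "GT" and acceptor == "AG"


# Toy sequence: 6 nt exon, 16 nt intron (GT...AG), 6 nt exon. Not VZV sequence.
genome = "ATGCAG" + "GTAAGTCCCTTTTCAG" + "GATTAA"

donor, acceptor = splice_dinucleotides(genome, 7, 22, "+")
print(donor, acceptor, is_canonical(donor, acceptor))

# The same intron seen on the reverse strand of the reverse-complemented sequence.
rc_genome = reverse_complement(genome)
print(splice_dinucleotides(rc_genome, 7, 22, "-"))
```

Extending the slice by a few nucleotides on either side of each boundary would give the flanking positions (−3 to +1) needed to tabulate the enrichment described above.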
Herpesvirus transcripts are traditionally assigned kinetic classes based on their temporal expression pattern and dependence on de novo protein synthesis or viral DNA replication. Here, we provide a transcriptome-wide classification of VZV transcripts during lytic infection of epithelial cells and identified that multiple transcripts originating from the same locus could either share the same kinetic class (e.g. RNAs 15-1, 15-2, and 15-3) or be of different kinetic class (e.g. RNAs 9-1, 9-2, 9-3, and 9A-1) (Table S3 and Figure S6). While the biological impetus for this is not clear, it seems likely that TSS and CPAS usage are dynamically regulated during lytic infection. For example, RNA 43-1 (E, pORF43) and the putative ncRNA 43-2 (IE) diverge in CPAS usage and expression kinetics (Figure 5). Similarly, multiple transcripts encoding pORF63 utilize different TSS and are expressed in different temporal classes and at varying abundances, with canonical IE RNA 63-1 being most abundant, followed by E RNA VLTlyt63-1 and low quantities of two TL RNAs 63-2 and 63-3 (Figure 5).
Finally, our data provide novel insight into the expression of IE transcripts and the role of pORF62 during lytic VZV infection. The most abundantly expressed VZV IE transcripts, produced in the absence of new protein synthesis, encode pORF4 and pORF61, which are also the earliest proteins detected during lytic VZV infection (Ouwendijk et al., 2020). Interestingly, prior studies of the RNA 4-1 and RNA 61-1 promoter regions have shown that their efficient transactivation is dependent on pORF62 (Michael et al., 1998; Wang et al., 2009), a major component of the viral tegument (Kinchington et al., 1992). However, our data indicate that abundant expression of VZV RNA 62-1 (pORF62), as well as RNAs 62-2 and 62-3, is dependent on viral DNA replication (LL and TL kinetics), suggesting that tegument-derived pORF62, but not de novo pORF62, transactivates RNA 4-1 and RNA 61-1 expression, at least during establishment of the lytic infection cycle. Classification of RNA 62-1 (pORF62) as LL is also supported by prior observations that a marked increase in pORF62 abundance occurs only after DNA replication has commenced (Reichelt et al., 2009). Importantly, our findings do not exclude any of the previously assigned functions of pORF62, most notably its function as a major transcriptional regulator of VZV transcripts (Perera et al., 1992; Ruyechan et al., 2003; Yang et al., 2006). However, future studies aimed at better understanding the regulation of VZV transcript expression, and the distinct roles of newly produced RNA 62 isoforms, pORF62, and tegument-derived pORF62 are warranted.
In summary, this study describes the detailed analysis of the VZV transcriptome architecture and kinetic classification of viral transcripts in the context of lytic VZV infection. We provide these data as a comprehensive resource that will facilitate functional studies of coding RNAs and their protein products, the role of non-coding RNAs, and the regulation of VZV transcription and translation during lytic infection.
Author Contributions
W.J.D.O. and D.P.D. conceived of the project; W.J.D.O., D.P.D., and T.S. designed the experiments with additional input from G.M.G.M.V.; S.E.B., T.S., W.J.D.O., and D.P.D. performed the experiments and analyzed the data; S.E.B., T.S., W.J.D.O., and D.P.D. wrote the manuscript; all authors read, edited, and approved the final paper.
Declaration of Interests
The authors declare no competing interests, financial or otherwise.
Methods
Cells and viruses
Human retinal pigmented epithelium ARPE-19 cells [American Type Culture Collection (ATCC) CRL-2302] were grown in a 1:1 (v/v) mixture of DMEM (Lonza) and Ham’s F12 (Gibco) medium supplemented with 10% heat-inactivated fetal bovine serum (FBS; Lonza) and 0.6 mg/mL L-sodium glutamate (Lonza) or in DMEM/F-12+GlutaMAX-I (Thermo Fisher Scientific) supplemented with 8% heat-inactivated FBS (Sigma-Aldrich). Human neuroblastoma SH-SY5Y cells were grown in a 1:1 (v/v) mixture of EMEM with EBSS (Lonza) and Ham’s F12 (Gibco) medium supplemented with 15% FBS (Lonza), L-sodium glutamate, penicillin-streptomycin, non-essential amino acids (MP Biomedicals) and sodium bicarbonate (Lonza). Human embryonic lung fibroblasts MRC-5 and human melanoma MeWo cells were cultured in DMEM (Lonza), supplemented with 10% FBS, L-sodium glutamate and penicillin-streptomycin. Human embryonic stem cell (hESC; H9)-derived neural stem cells (NSC) (Thermo Fisher Scientific) were cultured, propagated and differentiated into neurons as described previously (Sadaoka et al., 2020). Cell cultures were maintained at 37°C in a humidified CO2 incubator. VZV strain pOka (parental Oka) was maintained in, and the cell-free virus was prepared from, ARPE-19 cells as described previously for MRC-5 cells (Sadaoka et al., 2007). VZV strain EMC-1, a low-passage clinical isolate, was cultured on ARPE-19 cells and cell-free VZV was extracted as described (Lenac Rovis et al., 2013; Ouwendijk et al., 2014). Cell-free EMC-1 was freshly harvested on the day of use and pretreated with DNase I, RNase T1 and RNase A (all: Thermo Fisher Scientific) for 30 min at 37°C prior to infection.
RNA extraction and cDNA synthesis
ARPE-19 cells were infected with cell-associated VZV EMC-1 by co-cultivation of uninfected and VZV EMC-1 infected ARPE-19 cells at an 8:1 cell ratio for 96 hrs. Alternatively, ARPE-19, SH-SY5Y, MRC-5 and MeWo cells were infected with cell-free VZV EMC-1 for the indicated times. Cells were harvested in 1 mL TRIzol (Thermo Fisher Scientific), mixed with 200 μL chloroform and centrifuged for 15 min at 12,000 × g at 4°C. RNA was isolated from the aqueous phase using the RNeasy Mini kit (Qiagen) according to the manufacturer’s instructions, including on-column DNase I treatment, as described (Ouwendijk et al., 2013). RNA concentration and integrity were analyzed using a Nanodrop spectrophotometer (Thermo Fisher Scientific), and RNA was subjected to a second round of DNase treatment using the TURBO DNA-free kit (Ambion) according to the manufacturer’s instructions. For cDNA synthesis, a maximum of 5 µg RNA was reverse transcribed using Superscript IV reverse transcriptase and oligo(dT) primers (Thermo Fisher Scientific) (RT+). As a control, the same reaction was performed without reverse transcriptase (RT-). Alternatively, RNA was isolated using the FavorPrep Blood/Cultured Cell Total RNA Mini Kit (Favorgen Biotech) in combination with the NucleoSpin RNA/DNA buffer set (Macherey-Nagel). DNA was first eluted from the column in 100 µL DNA elution buffer, the column was subsequently treated with recombinant DNase I (20 units/100 µL; Roche Diagnostics) for 30 min at 37°C, and finally RNA was eluted in 50 µL nuclease-free water. RNA was directly treated with Baseline-ZERO DNase (2.5 units/50 µL; Epicentre) for 30 min at 37°C. cDNA was synthesized from 12 µL of RNA with an anchored oligo(dT)18 primer in a 20 µL reaction using the Transcriptor First Strand cDNA synthesis kit (Roche Diagnostics), with the reverse transcription reaction performed at 55°C for 30 min.
PCR and sequence analysis
PCR was performed on RT+ and RT- cDNA reactions using Amplitaq Gold DNA Polymerase (Thermo Fisher Scientific) and primer pairs corresponding to each newly identified VZV transcript (Table S5). Primers were directed to the predicted 5’ and 3’ ends of each transcript so that newly identified transcripts were completely amplified from the 5’ to the 3’ end. For ORF9 variants and ORF48, additional reverse primers were used to confirm splice junctions (Table S5). PCR amplification was performed as follows: initial denaturation at 95°C for 10 min, followed by 40 cycles of denaturation (30 sec, 95°C), primer annealing (30 sec at the appropriate temperature; Table S5), and primer extension (1 min / 1,000 bp, 72°C; Table S5). A final extension step of 10 min at 72°C was included. To amplify ORF13.5, each dNTP, including equimolar amounts of dGTP and 7-deaza-GTP (New England Biolabs), was used at a concentration of 200 µM (Maertzdorf et al., 1999). PCR amplification of ORF13.5 was performed as follows: an initial touch-up PCR from 58°C→70°C using a transcript-specific forward primer and an anchored primer on the poly(A) tail. Subsequently, semi-nested PCR was performed using the same forward primer and a reverse primer within the transcript using the standard PCR protocol. Amplicons were purified from gel using the QIAquick Gel Extraction Kit (Qiagen) and sequenced using the BigDye v3.1 Cycle Sequencing Kit (Applied Biosystems) with the corresponding forward and reverse primers on the ABI Prism 3130 XL Genetic Analyser.
Plasmid construction and generation of stable cell lines
The RNA 43-2 transcript sequence (77,775 - 78,619 excluding the intron at 77,869 - 78,149; strain Dumas, NC_001348.1) was amplified using cDNA from VZV EMC-1 infected ARPE-19 cells and primers NheI_ORF43-5_Fw and XhoI_ORF43.5_Rv (Table S5). The amplicon was digested with NheI and XhoI and cloned into pcDNA3.1. Three independent batches of ARPE-19 cells were transfected with either pcDNA3.1/empty or pcDNA3.1/RNA 43-2 using polyethylenimine (PEI). After 2 days, cells were incubated with 1 mg/mL geneticin and cultured for at least 3 weeks to select for transfected cells. Subsequently, DNA was isolated using the QIAamp DNA Mini kit according to the manufacturer’s instructions and the presence of RNA 43-2 DNA was confirmed by PCR. Next, RNA was isolated as described above, and expression of RNA 43-2 was confirmed by RT-PCR.
Flow cytometry
ARPE-19 cells stably transfected with pcDNA3.1/empty or pcDNA3.1/RNA 43-2 were plated one day prior to infection in a 48-well plate. Cells were infected with cell-free VZV EMC-1 (multiplicity of infection, MOI = 0.01), harvested at 48 hpi or 72 hpi, fixed and permeabilized with BD Cytofix/Cytoperm, stained for VZV glycoprotein E (gE) (MAB8612, Millipore) in BD PermWash, and labeled with secondary APC-conjugated goat anti-mouse Ig antibody (BD Biosciences). The frequency of VZV-infected (i.e. APC-positive) cells was measured on a BD FACSLyric flow cytometer and analyzed using FlowJo software (BD Biosciences).
Plaque assay
Confluent monolayers of stable pcDNA3.1-empty or pcDNA3.1-RNA 43-2 cells grown in a 12-well plate were infected with 2,000 plaque forming units (PFU)/well VZV EMC-1. At 72 hrs post-infection, plates were washed and fixed with 4% paraformaldehyde (PFA) in PBS. Subsequently, plates were permeabilized using 0.1% Triton X-100 in PBS for 10 min, blocked with 5% normal goat serum in PBS-0.05% Tween-20 (PBS-T) for 30 min, and incubated with mouse anti-VZV gE antibody (MAB8612) diluted in PBS-T containing 0.1% BSA for 1 hr at room temperature. Cells were washed with PBS-T, incubated for 1 hr with polyclonal rabbit-anti-mouse Ig antibody (Dako) in PBS-T + 0.1% BSA, washed and stained with Alexa Fluor 488 (AF488)-conjugated goat anti-rabbit Ig (H+L) antibody (Thermo-Fisher) in PBS-T. Plates were measured using the Immunospot S6 Ultimate UV Image Analyzer and plaque size was determined using Immunospot software (Cellular Technology Limited).
Kinetic class of VZV genes
ARPE-19 cells were synchronized in the cell cycle using a double thymidine block approach (Ma and Poon, 2016). Briefly, ARPE-19 cells were seeded at semi-confluence in 12-well plates and the next day the medium was replaced with growth medium containing 2 mM thymidine (Sigma-Aldrich). After 24 hrs, the medium was replaced with normal growth medium for 8 hrs, after which it was changed back to thymidine-containing medium. Thirty minutes prior to infection, cells were released from thymidine by replacing the medium with regular culture medium. Cells were infected with freshly harvested cell-free VZV EMC-1 using spin-inoculation for 15 min at 1,000 × g (MOI after spin-inoculation = 0.1 - 0.2). Cells were incubated for 45 min at 37°C, after which the inoculum was replaced with fresh medium, medium with 10 µg/mL Actinomycin D (ActD; Sigma-Aldrich), medium with 50 μg/mL cycloheximide (CHX; C4859, Sigma-Aldrich) or medium with 400 µg/mL phosphonoacetic acid (PAA; Sigma-Aldrich). Infected cells were harvested in 500 μL TRIzol at 12 hpi (ActD and CHX) or 24 hpi (untreated and PAA) for RNA extraction. Alternatively, ARPE-19 cells, MRC-5 cells ($1 \times 10^{5}$ cells/well on a 24-well plate) and hESC-derived neurons ($1 \times 10^{5}$ cells/well on a 24-well plate as NSC and differentiated for 18 days) were infected with cell-free VZV pOka (20 µL of $4 \times 10^{4}$ PFU/mL) in the presence or absence of phosphonoformic acid (200 µg/mL) (Sigma-Aldrich) for 1 hr, after which the inoculum was replaced with fresh medium with or without phosphonoformic acid (200 µg/mL) and cultures were maintained for 24 hrs.
Quantitative PCR analysis
Quantitative TaqMan real-time PCR (qPCR) was performed in duplicate on RT- and RT+ cDNA using 4x TaqMan Fast Advanced Master mix (Applied Biosystems) on a 7500 TaqMan PCR system. Primer-probe sets directed to ORFs 61, 62, 63 and VLT have been described previously (Depledge et al., 2018b; Ouwendijk et al., 2012) and those directed to ORF43 are described in Table S7. Alternatively, cDNAs were subjected to qPCR using KOD SYBR qPCR Mix (TOYOBO) in the StepOnePlus Real-time PCR system (Thermo Fisher Scientific) (1 µL of cDNA per 10 µL reaction). All primer sets used for SYBR Green chemistry (Table S6) were first confirmed for amplification efficiency (98-100%) using 10 to $10^{6}$ copies (10-fold dilutions) of pOka-BAC genome or VLT plasmid (Depledge et al., 2018b) and for the lack of non-specific amplification using water. The qPCR program was as follows: 95°C for 2 min (1 cycle), 95°C for 10 sec and 60°C for 15 sec (40 cycles), and 60 to 95°C for a dissociation curve analysis. Data are presented as the VZV transcript level relative to cellular beta-actin, defined as $2^{-(\mathrm{Ct}_{\text{VZV gene}} - \mathrm{Ct}_{\beta\text{-actin}})}$.
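The relative level defined above can be illustrated with a minimal Python sketch. The function name and the Ct values are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
def relative_level(ct_target: float, ct_reference: float) -> float:
    """Return 2^-(Ct_target - Ct_reference): target level relative to the reference gene."""
    return 2.0 ** -(ct_target - ct_reference)


# Hypothetical Ct values, for illustration only (not measured data).
example = relative_level(ct_target=24.0, ct_reference=18.0)
print(example)  # 0.015625, i.e. 2^-6
```

A target that crosses threshold six cycles after beta-actin is therefore reported at 1/64 of the beta-actin level, under the implicit assumption of perfect doubling per cycle.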
Immunofluorescence staining
ARPE-19 cells were plated on glass coverslips in 24-well plates one day prior to infection. Cells were inoculated with freshly harvested cell-free VZV EMC-1 in medium with or without 400 µg/mL PAA and incubated for 24 hrs. Infected cells were fixed with 4% PFA, permeabilized for 10 min with 0.1% Triton X-100 in PBS, blocked with 5% goat serum diluted in 0.2% gelatin-PBS solution and incubated on 30 µL 0.2% gelatin-PBS droplets containing primary antibody overnight at 4°C. The following primary antibodies were used: anti-pORF61 antibody (1:1000, gift from Dr. P. Kinchington) and monoclonal mouse anti-pORF62 antibody (1:200) (Lenac Rovis et al., 2013). Cells were washed three times with 0.2% gelatin-PBS and incubated for 1 hr at room temperature with secondary antibodies diluted in 0.2% gelatin-PBS. The following secondary antibodies were used: AF488-conjugated goat anti-rabbit Ig (H+L) antibody (1:500; Thermo-Fisher) and AF594-conjugated goat anti-mouse Ig (H+L) antibody (1:500; Thermo-Fisher). Cells were washed once with 0.2% gelatin-PBS, washed once with PBS, incubated with a 1:1000 dilution of Hoechst 33342 (Life Technologies, 20 mM) in PBS for 5 min, washed with PBS and mounted using Prolong Gold Antifade Mounting medium (Thermo Fisher). Stained cells were analyzed using a Zeiss LSM 700 confocal laser scanning microscope (Zeiss) at a magnification of 400x or 1,000x. Photoshop CC 2019 software (Adobe) was used to adjust brightness and contrast.
Illumina RNA sequencing and analysis
Stranded RNA libraries were prepared from poly(A)-selected RNA using the NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina (New England Biolabs) and sequenced using a NextSeq 550. Sequence reads were trimmed using TrimGalore (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) (--paired --length 30 --quality 30) and aligned against the VZV reference genome (strain Dumas, NC_001348.1) using BBMAP (https://sourceforge.net/projects/bbmap/) with post-alignment processing performed using SAMtools (Li et al., 2009) and BEDtools (Quinlan and Hall, 2010) to generate BEDGRAPH and BED12 files. Candidate CPAS were identified using ContextMap2 (Bonfert et al., 2015) (-aligner_name bowtie --polyA --strand-specific) with sequence reads aligned to VZV Dumas under default parameters by BWA (Li and Durbin, 2009).
Cap analysis of gene expression (CAGE) sequencing and analysis
Two biological replicates of total RNA were extracted from ARPE-19 cells infected with VZV pOka for 96 hrs, and CAGE-Seq libraries were prepared by DNAFORM (Yokohama, Japan) and subsequently sequenced using an Illumina NextSeq 550, as previously described (Murata et al., 2014). Resulting sequence reads were trimmed (--length 30 --q 30 --clip_R1 1) using TrimGalore prior to alignment against the VZV Dumas genome using BBMAP. Post-alignment processing was performed using SAMtools and BEDtools. TSS were identified using the HOMER findPeaks module (-style tss -localSize 100 -size 15). Only TSS present in both biological replicates were retained for analysis.
Nanopore direct RNA sequencing
For each biological sample, up to 1,000 ng of poly(A) RNA was isolated from up to 50 µg of total RNA using the Dynabeads™ mRNA Purification Kit (Invitrogen, 61006). Isolated poly(A) RNA was subsequently spiked with 0.3 µL of a synthetic Enolase 2 (ENO2) calibration RNA (Oxford Nanopore Technologies Ltd.) and dRNA-Seq libraries were prepared as described previously (Depledge et al., 2019). Sequencing was performed on a MinION MkIb using R9.4.1 (rev D) flow cells (Oxford Nanopore Technologies Ltd.) for 18 – 44 hrs (one library per flow cell) and yielded between 720,000 and 1,290,000 reads per dataset (Table S4). Raw fast5 datasets were then basecalled using Guppy v3.2.2 (-f FLO-MIN106 -k SQK-RNA002) with only reads in the pass folder used for subsequent analyses. Sequence reads were aligned against the VZV Dumas genome using minimap2 (Li, 2018) (-ax splice -k14 -uf --secondary=no), with subsequent parsing through SAMtools and BEDtools. Here, sequence reads were filtered to retain only primary alignments (alignment flag 0 (top strand) or 16 (bottom strand)).
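The primary-alignment filter described above (SAM FLAG exactly 0 or 16) can be sketched in a few lines of Python. This is an illustrative stand-in that operates on SAM text records, not the SAMtools-based processing used in the study; the function name is hypothetical.

```python
from typing import Iterable, Iterator

PRIMARY_FLAGS = {0, 16}  # 0 = top strand, 16 = bottom strand


def primary_alignments(sam_lines: Iterable[str]) -> Iterator[str]:
    """Yield SAM records whose FLAG is exactly 0 or 16; header lines pass through."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Column 2 of a SAM record is the bitwise FLAG.
        if len(fields) > 1 and int(fields[1]) in PRIMARY_FLAGS:
            yield line
```

Requiring the FLAG to equal 0 or 16 discards unmapped reads (4), secondary alignments (256 and 272) and supplementary alignments (2048 and 2064) in a single test.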
Splice junction correction in dRNA-Seq alignments
Illumina-assisted correction of splice junctions in dRNA-Seq data was performed using FLAIR v1.3 (Tang et al., 2020) in a stranded manner. Briefly, Illumina reads aligning to the VZV Dumas genome were split according to orientation and mapping strand [-f83 & -f163 (forward) and -f99 & -f147 (reverse)] and used to produce strand-specific junction files that were filtered to remove junctions supported by fewer than 50 Illumina reads. Direct RNA-Seq reads were similarly aligned to the VZV Dumas genome and separated according to orientation [-F4095 (forward) and -f16 (reverse)] prior to correction using the FLAIR correct module (default parameters). FLAIR-corrected alignments were used for all subsequent downstream analyses.
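The strand binning and junction-support threshold described above can be sketched as follows. This is an illustration in plain Python rather than the SAMtools and FLAIR commands used in the study. The function names, the junction coordinate tuple and the choice to count each aligned read (rather than each read pair) as one unit of support are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

Junction = Tuple[int, int]  # (intron start, intron end); coordinate convention is an assumption


def transcript_strand(flag: int) -> Optional[str]:
    """Map a paired-end SAM FLAG to the strand bins named in the text (-f83/-f163 vs -f99/-f147)."""
    if flag & 83 == 83 or flag & 163 == 163:
        return "forward"
    if flag & 99 == 99 or flag & 147 == 147:
        return "reverse"
    return None


def supported_junctions(
    observations: Iterable[Tuple[int, Junction]], min_reads: int = 50
) -> Dict[str, Dict[Junction, int]]:
    """Count junction observations per strand and drop junctions seen fewer than min_reads times."""
    counts = {"forward": Counter(), "reverse": Counter()}
    for flag, junction in observations:
        strand = transcript_strand(flag)
        if strand is not None:
            counts[strand][junction] += 1
    return {
        strand: {j: n for j, n in c.items() if n >= min_reads}
        for strand, c in counts.items()
    }
```

Keeping the two strands in separate tables mirrors the strand-specific junction files described in the text, so that a junction well supported on one strand cannot rescue a poorly supported junction at the same coordinates on the other.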
TSS and CPAS identification in dRNA-Seq data
TSS and CPAS were identified by parsing SAM files to BED12 files in a strand-specific manner using BEDtools, and then truncating each aligned sequence read to its 5’ or 3’ terminus for TSS and CPAS identification, respectively. Peak regions containing TSS and CPAS were identified using the HOMER findPeaks module (-o auto -style tss) with -localSize values of 100 and 500 and -size values of 15 and 50 for TSS and CPAS, respectively. TSS peaks were compared against Illumina-annotated splice sites to identify and remove peak artefacts derived from local alignment errors around splice junctions. A total of 38 putative TSS identified in the dRNA-Seq dataset alone were flagged as artefacts and removed. Each of these TSS mapped precisely to a splice acceptor within a spliced RNA and closer inspection of the reads showed these to result from local alignment processes (Depledge et al., 2019).
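Truncating an aligned read to its 5’ or 3’ terminus depends on the strand, which the following Python sketch makes explicit. It assumes BED-style 0-based, half-open coordinates; the function name and the "5p"/"3p" labels are hypothetical.

```python
from typing import Tuple


def terminus(start: int, end: int, strand: str, which: str) -> Tuple[int, int]:
    """Return the 1-nt BED interval at the 5' ("5p") or 3' ("3p") end of an alignment.

    Coordinates are 0-based and half-open, as in BED. On the top strand the
    5' end is the leftmost base; on the bottom strand it is the rightmost base.
    """
    if strand not in ("+", "-") or which not in ("5p", "3p"):
        raise ValueError("strand must be '+' or '-', which must be '5p' or '3p'")
    use_left = (strand == "+") == (which == "5p")
    return (start, start + 1) if use_left else (end - 1, end)


# A bottom-strand alignment spanning [100, 250): its 5' end is the last base.
print(terminus(100, 250, "-", "5p"))  # (249, 250)
```

The resulting single-nucleotide intervals are the kind of input on which a peak caller can then search for pile-ups of read ends.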
Generating RNA abundance counts from dRNA-Seq data
Using the updated VZV strain Dumas annotation presented here, we generated a transcriptome database by parsing our GFF3 file to a BED12 file using the gff3ToGenePred and genePredToBed functions within UCSCutils (https://github.com/itsvenu/UCSC-Utils-Download) and subsequently extracting a fasta sequence for each annotated RNA using the getfasta function within BEDtools. dRNA-Seq reads were then aligned against the transcriptome database using parameters optimized for transcriptome-level alignment (minimap2 -ax map-ont -p 0.99). RNA counts were generated by counting alignments against a given RNA only if the alignment 5’ end was located within the first 50 nt of the RNA and the alignment was not marked as supplementary.
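The counting rule above can be sketched in Python on SAM text records. The function name is hypothetical, and two assumptions go beyond the text: dRNA-Seq reads are taken to align in the sense orientation to each transcript, so the 1-based SAM POS is treated as the alignment 5’ end, and reverse-strand alignments are skipped.

```python
from collections import Counter
from typing import Iterable

UNMAPPED = 0x4
REVERSE = 0x10  # skipped here as an assumption (sense-strand reads only)
SUPPLEMENTARY = 0x800


def count_rnas(sam_lines: Iterable[str], window: int = 50) -> Counter:
    """Count alignments per reference RNA whose 5' end lies within the first `window` nt."""
    counts: Counter = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue
        flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
        if flag & (UNMAPPED | REVERSE | SUPPLEMENTARY):
            continue
        if 1 <= pos <= window:  # SAM POS is 1-based
            counts[rname] += 1
    return counts
```

Anchoring on the 5’ end is what lets overlapping isoforms that share a 3’ end be told apart: a read that starts well downstream of an RNA's annotated start is not counted toward that RNA.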
In silico prediction of coding potential
CPC 2.0 (Kang et al., 2017) was used to examine the coding potential of all VZV RNAs defined in this study (Table S3). Note that RNAs were excluded from CPC 2.0 analysis and defined as putatively non-coding if no proteins greater than fifty amino acids in length were encoded.
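The length-based pre-filter described above can be illustrated with a simple ORF scan in Python. This is a sketch under stated assumptions, not the procedure used in the study: ORFs are taken to start at ATG, to end at an in-frame stop codon, and to lie on the sense strand of the RNA; the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(rna: str) -> int:
    """Length in amino acids of the longest ATG-to-stop ORF in the three sense frames."""
    seq = rna.upper().replace("U", "T")
    best = 0
    for frame in range(3):
        length = None  # None while scanning outside an ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if length is None:
                if codon == "ATG":
                    length = 1  # the initiating methionine
            elif codon in STOP_CODONS:
                best = max(best, length)
                length = None
            else:
                length += 1
    return best


def putatively_noncoding(rna: str, min_aa: int = 50) -> bool:
    """True if the RNA encodes no protein longer than `min_aa` amino acids."""
    return longest_orf_aa(rna) <= min_aa
```

Under these assumptions an RNA whose longest ORF is exactly fifty codons is still treated as putatively non-coding, because the stated threshold is "greater than fifty".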
Data visualization
Figures associated with this study were generated using the R packages Gviz (Hahne and Ivanek, 2016) and GenomicRanges (Lawrence et al., 2013).
Data availability
All sequencing datasets associated with this study are available via the European Nucleotide Archive under the accession PRJEB36978. Analysed datasets generated as part of this study, including a database of transcripts, BED12 alignment files, and GFF3 files describing our VZV annotation are freely available at https://github.com/dandepledge/vzv-2.0.
Supplemental Information titles and legends
Acknowledgements
We extend special thanks to Ian Mohr (New York University School of Medicine) for support of D.P.D. in part through National Institutes of Health (NIH) grants R01-AI073898 and R01-GM056927. S.E.B. was in part supported by the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (NWO) project 022.005.032. This work was supported in part by the Takeda Science Foundation, Daiichi Sankyo Foundation of Life Science, Japan Society for the Promotion of Science (JSPS KAKENHI JP17K008858, JP16H06429 and JP16K21723) and the Ministry of Education, Culture, Sports, Science and Technology (MEXT KAKENHI JP17H05816) (T.S.). J.B. acknowledges support from the National Institute for Health Research University College London Hospitals Biomedical Research Centre. The computational requirements for this work were supported in part by the NYU Langone High Performance Computing (HPC) Core’s resources and personnel.
Footnotes
* co-senior authors