ABSTRACT
Monosynaptic tracing using rabies virus is an important technique in neuroscience, allowing brain-wide labeling of neurons directly presynaptic to a targeted neuronal population. A 2017 article reported development of a noncytotoxic version – a major advance – based on attenuating the rabies virus by addition of a destabilization domain to the C-terminus of a viral protein. However, this modification did not appear to hinder the ability of the virus to spread between neurons. We analyzed two viruses provided by the authors and show here that both were mutants that had lost the intended modification, explaining the paper’s paradoxical results. We then made a virus that actually did have the intended modification and found that it did not spread under the conditions described in the original paper – namely, without an exogenous protease being expressed in order to remove the destabilization domain – but that it did spread, albeit with relatively low efficiency, if the protease was supplied. We conclude that the new approach is not robust but that it may become a viable technique given further optimization and validation.
SIGNIFICANCE STATEMENT
Rabies virus, which spreads between synaptically connected neurons, has been one of the primary tools used by neuroscientists to reveal the organization of the brain. A new modification to rabies virus was recently reported to allow the mapping of connected neurons without adverse effects on the cells’ health, unlike earlier versions. Here we show that the conclusions of that study were probably incorrect and based on having used viruses that had lost the intended modification because of mutations. We also show that a rabies virus that does retain the intended modification does not spread between neurons under the conditions reported previously; however, it does spread between neurons under different conditions, suggesting that the approach may be successful if refined further.
INTRODUCTION
Viruses have become important tools for neuroscience (1-16), and “monosynaptic tracing” based on rabies virus (5) has become the primary method of labeling neurons directly presynaptic to some targeted group of neurons (17-20). Its core components are, first, selective infection of the targeted neuronal group with a recombinant rabies virus with a deleted gene (which in all work published to date is the “G” gene encoding the glycoprotein that coats the viral envelope) and, second, complementation of the deletion in the targeted starting neurons, by expression of the deleted gene in trans. With all of its gene products therefore present in the starting cells, the virus can fully replicate within them and spreads, as wild-type rabies virus does, to cells directly presynaptic to the initially infected neurons. Assuming that G has not been provided in trans in these presynaptic cells too, the deletion-mutant (“ΔG”, denoting the deletion of G) virus is unable to spread beyond them, resulting in labeling of just the neurons in the initially targeted population and ones that are directly presynaptic to them (5).
A drawback of these ΔG (or “first-generation” (21)) rabies viruses is that they are cytotoxic (4, 21, 22), which has spurred several labs to develop less toxic versions. Reardon, Murray, and colleagues (22) showed that simply using ΔG rabies virus of a different parent strain — switching from the original SAD B19 strain to the more neuroinvasive CVS N2c strain (23) — decreased toxicity and increased the efficiency of transneuronal spread. Our own group has taken a more drastic approach, recently introducing “second-generation” rabies viruses from which both G and a second gene, “L”, encoding the viral polymerase, have been deleted (21). Although in our published work we have only shown that these second-generation, “ΔGL” viruses are efficient means of direct retrograde targeting of projection neurons, it is at least theoretically possible for ΔGL viruses to be used for monosynaptic tracing, if the second deleted gene were also expressed in trans.
Taking a quite different approach, Ciabatti et al. (24) introduced “self-inactivating rabies” (“SiR”) viruses, which differed from simple first-generation (ΔG, SAD B19 strain) ones by the addition of a destabilization domain to the C-terminus of one of the viral proteins, so that the protein would be rapidly degraded soon after it was produced. Because the protein in question (the nucleoprotein, encoded by the “N” gene) is essential for viral gene expression and replication, its destabilization was intended to “silence” viral gene expression and prevent replication, making the viruses nontoxic.
These SiR viruses were designed to be unable to replicate unless an exogenous protease (tobacco etch virus protease, TEVP) was expressed in infected cells in order to remove the destabilization (or “PEST” (25)) domain. However, the paper reported that they were able to spread between neurons – which requires replication – just as efficiently as unmodified first-generation viruses did, without the protease being provided at all.
We hypothesized that the viruses that were used for the reported transsynaptic tracing experiments (24) were mutants with premature stop codons at or near the end of the native nucleoprotein gene and before the sequence of the destabilization domain. Rhabdoviruses have high mutation rates (26-31), and production of high-titer rabies virus stocks for in vivo injection typically involves repeated passaging on complementing cell lines (32-34), which affords ample opportunity for accumulation of mutants with a selective replication advantage.
Here we show that, in both of the two SiR virus samples that we analyzed, the great majority of viral particles did have mutations in their genomes that caused the complete loss of the intended C-terminal addition to the nucleoprotein, so that they were effectively just ordinary first-generation ΔG rabies viral vectors. We also tested the SiR-CRE virus in vivo and found that it was rapidly cytotoxic.
We also show that a ΔG virus that does have the intended modification to the nucleoprotein does not spread transsynaptically in vivo in the absence of TEVP, but that it does spread when TEVP is provided, although with lower efficiency than a virus without the modification.
RESULTS
We analyzed samples of two viruses sent directly from the Tripodi lab to MIT two months after their publication (24) and given directly to the Wickersham lab soon afterward, still frozen and unopened, with express permission from Marco Tripodi by email on December 5, 2017: “EnvA/SiR-CRE” (made from genome plasmid Addgene 99608, pSAD-F3-NPEST-iCRE-2A-mCherryPEST) and “EnvA/SiR-FLPo” (made from genome plasmid Addgene 99609, pSAD-F3-NPEST-FLPo-2A-mCherryPEST). Both had the SAD B19 strain of rabies virus as their parent strain and had been packaged with the avian sarcoma and leukosis virus subgroup A envelope glycoprotein (“EnvA”) for targeted infection of cells expressing EnvA’s receptor, TVA (5).
For comparison with the two SiR viruses, we made five control viruses in our own laboratory: three first-generation vectors, RVΔG-4Cre (21), RVΔG-4FLPo (see Methods), and RVΔG-4mCherry (35), and two second-generation vectors, RVΔGL-4Cre and RVΔGL-4FLPo (21). All of these viruses are on the SAD B19 background, like the SiR viruses. For each of the four recombinase-expressing viruses from our laboratory, we made one preparation packaged with the EnvA envelope protein and one preparation packaged with the native rabies virus (SAD B19 strain) glycoprotein (denoted as “B19G”); RVΔG-4mCherry (used only as a control for the Sanger sequencing) was only packaged with the EnvA envelope protein.
Sequencing of viral genomes: Sanger sequencing
In order to directly test our hypothesis that the SiR viruses had developed premature stop codons removing the PEST domain in a majority of viral particles, we sequenced the genomes of a large number of individual viral particles using two different techniques.
First, we used ordinary Sanger sequencing to determine the sequence in the vicinity of the end of the nucleoprotein gene for 50 to 51 individual viral particles of each of the two SiR viruses and of a first-generation virus from our own laboratory, RVΔG-4mCherry (Figure 1 and Supplementary File S1). We ensured the isolation of individual viral genomes by using a primer with a random 8-base index for the reverse transcription step, so that the cDNA copy of each RNA viral genome would have a unique index. Following the reverse transcription step, we amplified the genomes by standard PCR, cloned the amplicons into a generic plasmid, transformed this library into E. coli, and sequenced plasmids purified from individual colonies.
(A) Schematic of the RT-PCR workflow. In the reverse transcription (RT) step, the RT primer, containing a random 8-nucleotide sequence, anneals to the 3’ rabies virus leader, adding a unique random index to the 5’ end of the cDNA molecule corresponding to each individual viral particle’s RNA genome. In the PCR step, the forward PCR primer anneals to the RT primer sequence and the reverse PCR primer anneals within the viral phosphoprotein gene P. Both PCR primers have 15-base sequences homologous to those flanking an insertion site in a plasmid used for sequencing, allowing the amplicons to be cloned into the plasmid using a seamless cloning method before transformation into bacteria. The resulting plasmid library consists of plasmids containing up to 4^8 (65,536) different index sequences, allowing confirmation that the sequences of plasmids purified from individual picked colonies correspond to the sequences of different individual rabies viral particles’ genomes.
(B) Representative Sanger sequencing data of the 8-bp index and the TEV-PEST sequence. Mutations are highlighted in red.
(C) Mutation variants and their frequencies in each viral vector sample based on Sanger sequencing data. No unmutated genomes were found in the SiR-CRE sample: 50 out of 51 had a substitution creating an opal stop codon just before the TEV cleavage site, and the 51st genome contained a frameshift which also removed the C-terminal addition. In the SiR-FLPo sample, only 4 out of 50 clones had an intact sequence of the C-terminal addition; the other 46 out of 50 had one of two de novo stop codons at the end of N or the beginning of the TEV cleavage site. In the sample of RVΔG-4mCherry, a virus from our laboratory included as a control to distinguish true mutations on the rabies genomes from mutations due to the RT-PCR process, none of the 51 clones analyzed had mutations in the sequenced region.
As shown in Figure 1, the results confirmed our hypothesis that SiR viruses are prone to loss of the 3’ addition to the nucleoprotein gene. Specifically, in the SiR-CRE sample, 100% of the 51 sequenced viral particles had lost the PEST domain. Fifty out of the 51 had the same point mutation in the linker between the end of the native nucleoprotein gene and the TEVP cleavage site, converting a glycine codon (GGA) to a stop codon (TGA) so that the only modification to the C-terminus of the nucleoprotein was the addition of two amino acids (a glycine and a serine). The one sequenced viral particle that did not have this point mutation had a single-base insertion in the second-to-last codon of the native nucleoprotein gene, frameshifting the rest of the sequence and resulting in 15 amino acids of nonsense followed by a stop codon before the start of the PEST domain sequence.
In the SiR-FLPo sample, the population was more heterogeneous: out of 50 sequenced viral particles, 18 had the same stop codon that was found in almost all genomes in the Cre sample, while another 28 had a different stop codon three amino acids upstream, immediately at the end of the native nucleoprotein gene (converting a serine codon (TCA) to a stop codon (TGA)). Four viral particles had no mutations in the sequenced region. Thus 46/50 (92%) of the SiR-FLPo viral particles sequenced had lost the PEST domain.
In contrast, in the first-generation virus from our own lab, RVΔG-4mCherry, none of the 51 viral particles sequenced had mutations in the sequenced region containing the end of the nucleoprotein gene.
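The random index in the RT primer only identifies individual genomes if two genomes rarely receive the same index by chance. The following minimal sketch estimates that chance for an 8-base index and 51 sequenced clones. It assumes that every index is drawn uniformly at random from the four bases, which is an idealization; the function names are illustrative and not part of the original analysis.

```python
from math import prod


def index_diversity(length):
    """Number of distinct indices of the given length over A, C, G, T."""
    return 4 ** length


def prob_all_unique(n_genomes, length):
    """Probability that n_genomes uniformly random indices are all different.

    Assumption: each index is an independent, uniform draw (idealized).
    """
    space = index_diversity(length)
    return prod((space - i) / space for i in range(n_genomes))


if __name__ == "__main__":
    print(index_diversity(8))                    # size of the 8-base index space
    print(round(prob_all_unique(51, 8), 3))      # chance that 51 clones share no index
```

Under this idealized model, the chance that all 51 indices differ is roughly 98%, so a repeated index among picked colonies most likely indicates the same cDNA molecule cloned twice rather than two different genomes.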
Sequencing of viral genomes: Single-molecule, real-time (SMRT) sequencing
As a second approach to analyzing the mutations present in the SiR viruses, we employed a large-scale sequencing technology: single-molecule, real-time (“SMRT”) sequencing, which provides independent sequences of tens of thousands of individual single molecules in a sample in parallel (Figure 2 and Supplementary File S2). The results from this advanced sequencing method were quite consistent with the results from the Sanger sequencing presented above. As with the sample preparation for Sanger sequencing, we included a random index (10 bases, in this case) in the reverse transcription primer, so that again the cDNA copy of each RNA viral genome molecule would be labeled with a unique index.
(A) Schematic of workflow for SMRT sequencing. An RT primer with a random 10-nucleotide sequence anneals to the leader sequence on the negative-sense single-stranded RNA genome. Forward and reverse PCR primers have distinct SMRT barcodes at their 5’ ends for the three different virus samples. After RT-PCR, each amplicon library consists of amplicons each containing a SMRT barcode to identify the sample of origin as well as the 10-nucleotide index (i.e., with a potential diversity of 4^10 (1,048,576) different indices) to uniquely label the individual genome of origin. SMRT “dumbbell” adaptors are then ligated to the amplicons’ ends, making circular templates which are then repeatedly traversed by a DNA polymerase, resulting in long polymerase reads each containing multiple reads of the positive and negative strands. The individual subreads for a given molecule are combined to form circular consensus sequence (CCS) reads.
(B) High-frequency (>2%) point mutations found in the rabies vector samples based on SMRT sequencing. Horizontal axis represents nucleotide position along the reference sequences (see text); vertical axis represents variant frequency. Total number of CCS3 reads (i.e., with at least 3 subreads for each position) are 22,205 for SiR-CRE, 17,086 reads for SiR-FLPo, and 17,978 reads for RVΔG-4Cre. The great majority of SiR-CRE and SiR-FLPo genomes have point mutations creating premature stop codons at or just after the end of N and before the C-terminal addition. The only frequent (>2%) mutation found in the control virus, RVΔG-4Cre, was a single amino acid substitution at position 419 in 9.49% of virions. Insertions and deletions are not shown here (see text).
(C) Summary of results. In the SiR virus samples, 99.22% of SiR-CRE virions and 83.85% of SiR-FLPo virions had point mutations creating premature stop codons that completely removed the intended C-terminal addition to the nucleoprotein, making them simply first-generation (ΔG) rabies viral vectors. This does not include any insertions or deletions causing frameshifts (see text), which would further increase the percentage of first-generation-type virions in these samples. In the RVΔG-4Cre sample, there were no premature stop codons at or near the end of the nucleoprotein gene.
SMRT sequencing entails circularization of the DNA amplicons and multiple consecutive passes around the resulting circular molecule, with the redundancy provided by this repeated sequencing of each position increasing the signal-to-noise ratio and statistical significance of the results. The numbers presented in Figure 2 and below use the default of including only clones that had at least three reads of each base (“circular consensus sequence 3”, or “CCS3” in Supplementary File S2). Using the increasingly stringent criteria of requiring either five or eight reads per base (CCS5 or CCS8) reduced the numbers of qualifying genomes in all cases and changed the percentages slightly but gave very similar results overall. Because read accuracy for SMRT sequencing is ≥98% for circular consensus sequencing with 3 passes (see https://www.mscience.com.au/upload/pages/pacbio/technical-note---experimental-design-for-targeted-sequencing.pdf), we used a conservative threshold of 2% frequency of any given point mutation position in order to screen out false positives. Also to be very conservative, for Figure 2 we ignored all apparent frame shifts caused by insertions and deletions, because insertions in particular are prone to false positives with SMRT sequencing (36). See Supplementary File S2 for details, including details of frameshifts due to insertions; Supplementary Files S3-S5 contain the sequences of the PCR amplicons that would be expected based on published sequences of the three viruses, but to summarize here:
As a control, we used a virus from our own laboratory, RVΔG-4Cre (21) (see Addgene #98034 for reference sequence). Out of 17,978 sequenced genomes of this virus, we found no mutations above threshold frequency at the end of N. We did find that 1,706 viral particles (9.49%) had a nonsynonymous mutation (TCT (Ser) → ACT (Thr)) farther up in N at amino acid position 419 (31 amino acids upstream of the end of the 450-aa native protein). We do not know if this mutation is functionally significant, although it is not present in CVS N2c (37), HEP-Flury (38), ERA (39), or Pasteur strains (Genbank GU992320), so these particles may effectively be N-knockouts that were propagated by coinfection with virions with intact N (see Discussion for more on such parasitic co-propagating mutants).
For the SiR-CRE virus, out of 22,205 viral genomes sequenced, 22,032 had the premature stop codon (GGA -> TGA) in the linker between the native nucleoprotein gene and the TEVP cleavage site sequence. In other words, even without including frameshifts, at least 99.22% of the individual viral particles in the SiR-CRE sample were essentially first-generation ΔG vectors, with the only modification of the nucleoprotein being an additional two amino acids at the C-terminus.
For the SiR-FLPo virus, out of 17,086 viral genomes sequenced, 5,979 had the stop codon (GGA -> TGA) in the linker, 8,624 had the stop codon (TCA -> TGA) at the end of N, and a further 28 had a different stop codon (TCA -> TAA) at the same position at the end of N. Of these, 305 viral particles had premature stop codons at both of these two positions, so that the total number of viral particles with one or both stop codons immediately before the PEST domain was 8,624 + 5,979 + 28 - 305 = 14,326. In other words, again even without including frameshifts, at least 83.85% of the individual viral particles in the SiR-FLPo sample were essentially first-generation ΔG vectors, with the only modification of the nucleoprotein being either two amino acids added to, or one amino acid lost from, the C-terminus.
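The percentages quoted above follow directly from the read counts. As a worked check, the short sketch below recomputes them, subtracting the doubly mutated SiR-FLPo genomes once so that they are not counted twice. All counts are taken from the text; the variable and function names are illustrative only.

```python
def percent(part, total):
    """Percentage of `part` in `total`, rounded to two decimals."""
    return round(100 * part / total, 2)


# SiR-CRE: genomes with the GGA -> TGA stop codon in the linker
sir_cre_total = 22_205
sir_cre_stop = 22_032

# SiR-FLPo: three stop-codon variants, with some genomes carrying two of them
sir_flpo_total = 17_086
linker_tga = 5_979       # GGA -> TGA in the linker
end_of_n_tga = 8_624     # TCA -> TGA at the end of N
end_of_n_taa = 28        # TCA -> TAA at the end of N
both_positions = 305     # stop codons at both positions, counted twice above
sir_flpo_stop = linker_tga + end_of_n_tga + end_of_n_taa - both_positions

# Control virus RVΔG-4Cre: Ser -> Thr substitution at position 419
control_total = 17_978
control_s419t = 1_706

if __name__ == "__main__":
    print(percent(sir_cre_stop, sir_cre_total))                   # SiR-CRE
    print(sir_flpo_stop, percent(sir_flpo_stop, sir_flpo_total))  # SiR-FLPo
    print(percent(control_s419t, control_total))                  # control
```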
Anti-nucleoprotein immunostaining
We infected reporter cell lines with serial dilutions of the two EnvA-enveloped SiR viruses as well as the eight recombinase-expressing ones from our own lab: ΔG vs. ΔGL, Cre vs. FLPo, EnvA vs. B19G envelopes. Three days later, we immunostained the cells for rabies virus nucleoprotein and imaged the cells with confocal microscopy.
As seen in Supplementary Figure S1, we found that the cells infected with the SiR viruses looked very similar to those infected with the first-generation, ΔG viruses. Notably, the viral nucleoprotein, which in the SiR viruses is intended to be destabilized and degrade rapidly in the absence of TEVP, accumulated in the SiR-infected cells in clumpy distributions that looked very similar to those in the cells infected with the first-generation, ΔG viruses. By contrast, the cells infected with the second-generation, ΔGL viruses, which we have shown to be noncytotoxic (21), did not show any such nucleoprotein accumulation, clumped or otherwise, only punctate labeling presumably indicating isolated viral particles or post-infection uncoated viral particles (ribonucleoprotein complexes) that are not replicating.
Longitudinal two-photon imaging in vivo
To see whether the SiR viruses kill neurons in the brain, we conducted longitudinal two-photon imaging in vivo of virus-labeled neurons in visual cortex of tdTomato reporter mice, as we had done previously to demonstrate the nontoxicity of second-generation rabies virus (21) (Figure 3). Because the SiR viruses were EnvA-enveloped, we first injected a lentivirus expressing EnvA’s receptor TVA, then one week later we injected either SiR-CRE or one of two EnvA-enveloped viruses made in our laboratory: the first-generation virus RVΔG-4Cre(EnvA) or the second-generation virus RVΔGL-4Cre(EnvA). Beginning one week after rabies virus injection, we imaged labeled neurons at the injection site every seven days for four weeks, so that we could track the fate of individual neurons over time.
A) Representative fields of view (FOVs) of visual cortical neurons labeled with RVΔG-4Cre (top row), RVΔGL-4Cre (middle row), or SiR-CRE (bottom row) in Ai14 mice (Cre-dependent expression of tdTomato). Images within each row are of the same FOV imaged at the four different time points in the same mouse. Circles indicate cells that are present at 7 days postinjection but no longer visible at a subsequent time point. Scale bar: 50 µm, applies to all images.
B-D) Numbers of cells present at week 1 that were still present in subsequent weeks. While very few cells labeled with RVΔGL-4Cre were lost, and RVΔG-4Cre killed a significant minority of cells, SiR-CRE killed the majority of labeled neurons within 14 days following injection.
E) Percentages of cells present at week 1 that were still present in subsequent imaging sessions. By 28 days postinjection, an average of only 20.5% of cells labeled by SiR-CRE remained.
As we found in our previous work (21), our second-generation virus RVΔGL-4Cre did not kill neurons to any appreciable degree: all but a tiny handful of the neurons labeled by this virus at seven days after injection were still present three weeks later in all mice. Also as we have found previously (21), our first-generation virus RVΔG-4Cre did kill neurons, but by no means all of them (see the Discussion for possible reasons for this).
However, we found that the putatively nontoxic SiR-CRE caused a steep loss of neurons much more pronounced than even our first-generation virus did. By 14 days after injection, 70% of cells seen at seven days were dead; by 28 days, 81% were.
There is a possible confound from our use of the tdTomato reporter line Ai14 (which we used primarily because we already had large numbers of mice of this line): because SiR-CRE is actually “SiR-iCRE-2A-mCherryPEST”, designed to coexpress mCherry (with an added C-terminal PEST domain intended to destabilize it, as for the nucleoprotein) along with Cre, it is conceivable that some of the SiR-CRE-labeled red cells at seven days were only expressing mCherry and not tdTomato. If the destabilized mCherry were expressed only transiently, as intended(24), and a significant fraction of SiR-CRE virions had mutations in the Cre gene so that they did not express functioning Cre, then it is possible that some of the red cells seen at seven days were labeled only with mCherry that stopped being visible by 14 days, so that it would only look like those cells had died.
We viewed this alternative explanation as unlikely, because the designers of SiR-CRE had injected it in an EYFP reporter line and found no cells labeled only with mCherry and not EYFP at six days and nine days postinjection (see Figure S4 in Ciabatti et al. (24)). Nevertheless, we addressed this potential objection in several ways.
First, we sequenced the transgene inserts (iCre-P2A-mCherryPEST) of 21 individual SiR-CRE viral particles (see Supplementary File S6) and found that only two out of 21 had mutations in the Cre gene, suggesting that there would not have been a large population of cells only labeled by mCherry and not by tdTomato.
Second, we repeated some of the SiR-CRE injections and imaging in a different reporter line: Ai35, expressing Arch-EGFP-ER2 after Cre recombination (40) (Jax 012735). Although we found that the membrane-localized green fluorescence from the Arch-EGFP-ER2 fusion protein was too dim and diffuse at seven days postinjection to be imaged clearly, we were able to obtain clear images of a number of cells at 11 days postinjection. We found that 46% of them had disappeared only three days later (see Supplementary Figure S2 and Supplementary Video S1), and 86% had disappeared by 28 days postinjection, consistent with a rapid die-off. Furthermore, we found that the red fluorescence in Ai35 mice, which was due only to the mCherry expressed by the virus, was much dimmer than the red fluorescence in Ai14 mice at the same time point of seven days postinjection and with the same imaging parameters (see Supplementary Figure S3): the mean intensity was 45.86 (arbitrary units, or “a.u.”) in Ai14 but only 16.29 a.u. in Ai35. This is consistent with the published findings that tdTomato is a much brighter fluorophore than mCherry (41), particularly with two-photon excitation (42), and it is also consistent with Ciabatti et al.’s addition of a destabilization domain to mCherry’s C-terminus. We therefore redid the counts of labeled cells in our Ai14 datasets to include only cells with fluorescence at seven days of more than 32.33 a.u., the midpoint of the mean intensities in Ai35 versus Ai14 mice, in order to exclude neurons that might have been labeled with mCherry alone. As seen in Supplementary Figure S4, restricting the analysis to the cells that were brightest at seven days (and therefore almost certainly not labeled with just mCherry instead of either just tdTomato or a combination of both mCherry and tdTomato) made no major difference: 70.0% of SiR-labeled neurons had disappeared by 14 days, and 80.8% were gone by 21 days.
Although in theory it is possible that the disappearance of the infected cells could be due to cessation of tdTomato or Arch-EGFP-ER2 expression rather than to the cells’ deaths, because of downregulation by rabies virus of host cell gene expression (43), we view this as highly unlikely. Downregulation of host cell gene expression by rabies virus is neither total (“cells with high expression of RbV transcripts retain sufficient transcriptional information for their classification into a specific cell type.” (43)) nor uniform (44); in practice, we saw no evidence of a decline in reporter expression in the infected cells but in fact found the exact opposite. As can be seen in a number of cells in Figures 3 and S4, the cells got brighter and brighter over time, unless they abruptly disappeared. In our experience, including in this case, cells infected with rabies virus increase in brightness until they die, often blebbing and coming apart into brightly labeled pieces, regardless of whether the fluorophore is expressed from a reporter allele (as in this case) or directly by the virus (see Chatterjee et al. 2018 for many more examples of this (21)).
A) Sequences of the region of the junction between the nucleoprotein gene N and its 3’ addition in SiR-CRE (intended sequence) as well as in our two new viruses RVΔG-NPEST-4Cre and RVΔG-N*PEST-4Cre. Apart from synonymous mutations we made in five codons (highlighted in green) to make them less likely to mutate to stop codons, RVΔG-NPEST-4Cre has exactly the same 3’ addition to N as was intended for SiR. RVΔG-N*PEST-4Cre has exactly the same sequence as RVΔG-NPEST-4Cre apart from a stop codon that we deliberately introduced at the same location as the nonsense mutation that we had found in almost every SiR-CRE virion, so that the only modification to the native nucleoprotein is an additional two amino acids (Gly-Ser) on its C-terminus. Sanger sequencing of 32 viral particles of each virus confirmed that the intended additions were still present in the final stocks.
B) Diagram of virus injections for monosynaptic tracing experiments. An AAV2-retro expressing FLPo was injected in dorsolateral striatum, and a FLP-dependent helper virus combination was injected into barrel cortex to express TVA and G in corticostriatal cells. Rabies virus was injected seven days later at the same location in barrel cortex, and mice were perfused either 12 days or 21 days after rabies virus injection.
C) At 12 days after injection, the NPEST virus (top row), with an intact PEST domain, shows no evidence of spread, whereas the N*PEST virus (bottom row), without the PEST domain, has labeled thousands of neurons in ipsilateral cortex (shown here) and also spread to contralateral cortex and thalamus (see Supplementary Figure S5 for images). For all panels in this figure as well as Figure 5 and the related supplementary figures, the red channel shows tdTomato, reporting Cre expression from the rabies viruses; the green channel shows EGFP, coexpressed with TVA (and tTA) by the first helper virus; the blue channel shows mTagBFP2, coexpressed with G by the second helper virus. Scale bar: 200 µm, applies to all images.
D) At 21 days after injection, the NPEST virus again shows no evidence of spread, while the N*PEST virus has labeled many thousands of neurons. Scale bar: 200 µm, applies to all images.
E) Counts of labeled neurons in ipsilateral cortex (left), contralateral cortex (center), and thalamus (right) at 12 days and 21 days for the two viruses. Each dot indicates the total number of labeled cells found in a given mouse brain when examining every sixth 50 µm section (see Methods). All differences in numbers of cells labeled by the two viruses are highly significant for all conditions (see Supplementary File S8 for all counts and statistical analyses), except for the numbers of contralateral cells at 12 days.
Construction and testing of a virus with an intact PEST domain
We decided to directly test whether a rabies virus with an intact PEST domain fused to its nucleoprotein can spread transsynaptically, with or without TEVP (Figure 4). Beginning with our lab’s first-generation virus RVΔG-4Cre(21), we constructed a “self-inactivating” version, “RVΔG-NPEST-4Cre”, by adding the coding sequence for the C-terminal addition from Ciabatti et al.(24) to the 3’ end of the nucleoprotein gene. To reduce the chance of the PEST domain being lost to nonsense mutations during production of the virus, we made synonymous changes to five codons near the junction of the end of the native nucleoprotein gene and the beginning of the addition, so that those codons were no longer a single mutation away from being stop codons; apart from these five synonymous changes, the nucleotide sequence of the addition was identical to that used by Ciabatti et al.(24) (Figure 4A). We also made a matched “revertant” version, “RVΔG-N*PEST-4Cre”, with exactly the same sequence as RVΔG-NPEST-4Cre except with a stop codon three codons into the linker, at the same location as the stop codon that we had found in the overwhelming majority of viral particles in the original SiR-CRE. While the genomes of these two new viruses only differed from each other by one codon, therefore, at the protein level one virus (RVΔG-NPEST-4Cre) had a full-length PEST domain on the end of its nucleoprotein, while the other (RVΔG-N*PEST-4Cre) had only a two-amino-acid (Gly-Ser) addition to the end of its nucleoprotein and otherwise was an ordinary first-generation ΔG virus.
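The design rule described above, choosing codons that are not a single mutation away from a stop codon, can be checked mechanically. The sketch below lists the stop codons reachable from a given codon by one nucleotide substitution. The two codons GGA and TCA are the ones reported as mutated in the SiR samples; the alternatives GGC and AGC are hypothetical examples of synonymous codons chosen for illustration and are not necessarily the substitutions made in RVΔG-NPEST-4Cre.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
BASES = "ACGT"


def stop_neighbors(codon):
    """Return the stop codons reachable from `codon` by one substitution."""
    codon = codon.upper()
    hits = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a mutation
            mutant = codon[:pos] + base + codon[pos + 1:]
            if mutant in STOP_CODONS:
                hits.add(mutant)
    return sorted(hits)


if __name__ == "__main__":
    # Codons found mutated to stop codons in the SiR samples
    print("GGA", stop_neighbors("GGA"))  # Gly codon in the linker
    print("TCA", stop_neighbors("TCA"))  # Ser codon at the end of N
    # Hypothetical synonymous alternatives (illustration only)
    print("GGC", stop_neighbors("GGC"))  # Gly
    print("AGC", stop_neighbors("AGC"))  # Ser
```

A codon for which the function returns an empty list needs at least two substitutions to become a stop codon, which is the property the synonymous changes were intended to provide.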
Following production of high-titer EnvA-enveloped virus (see Methods), we confirmed that the final stocks retained the intended 3’ additions to the nucleoprotein gene by extracting the genomic RNA and using Sanger sequencing on the genomes of 32 viral particles for each virus. All of the clones sequenced for each of the two viruses had the respective intended modifications (Figure 4, panel A; details in Supplementary File S7), except for one N*PEST clone that had a synonymous mutation in the PEST domain after the introduced stop codon (and that was therefore irrelevant). We also found three other incidental mutations in one clone each: one NPEST clone had a synonymous mutation in the nucleoprotein gene (at Ser437, from TCA to TCG), one N*PEST clone had a different synonymous mutation in the nucleoprotein gene (at Asn436, from AAC to AAT), and one NPEST clone had a single point mutation in the intergenic region between the N and P genes.
Having verified that the NPEST and N*PEST viruses retained their respective modifications, we tested their ability to spread transsynaptically in vivo in the absence of TEVP, using corticostriatal neurons as the starting cells (Figure 4, panel B). We injected an AAV2-retro(11) expressing FLPo into the dorsolateral striatum of three mice each of the tdTomato reporter line Ai14(45), with a cocktail of two helper viruses (AAV1)(46-48) injected into the primary somatosensory cortex (barrel field) in the same surgery. Corticostriatal neurons were therefore coinfected with all three viruses for expression of TVA (to allow infection by EnvA-enveloped RV) and G (to allow spread of the ΔG viruses to presynaptic cells), as well as the fluorophores EGFP (marking TVA expression) and mTagBFP2 (marking G expression). One week after AAV injection, we injected either the NPEST or the N*PEST virus at equalized titers, then perfused the mice either 12 days or 3 weeks after the rabies virus injections. Figure 4 and Supplementary Figure S5 show the results.
The N*PEST virus, which had only an additional two amino acids on the C-terminus of its nucleoprotein, spread very efficiently (bottom rows of images in panels C and D, and rightmost bar in each pair in the charts in panel E). We counted tdTomato-labeled neurons in ipsilateral cortex as well as in thalamus and in contralateral cortex (note that we counted neurons only in every sixth 50 µm section (see Methods), so that the numbers of labeled neurons in the entire brain of each mouse would be approximately six times the numbers given below and in the figures). At 12 days, we found an average of 2,772 tdTomato-labeled neurons in ipsilateral cortex; we also found an average of 12 labeled neurons in contralateral cortex and 40 in thalamus. At 3 weeks after rabies virus injection, a time point which is unusually long for a first-generation virus(21) but which we included to match the duration used by Ciabatti et al.(24), the N*PEST virus had spread to vastly more neurons: on average 13,070 in ipsilateral cortex, 253 in contralateral cortex, and 507 in thalamus. Importantly, this labeling was not simply due to leaky TVA expression or residual RVG-enveloped virus, as control experiments without G resulted in very minimal labeling (Supplementary Figure S7), but instead indicates efficient transsynaptic spread of the N*PEST virus, consistent with our lab’s prior experience with the parent virus RVΔG-4Cre(EnvA) as well as with other first-generation RVΔG vectors.
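The scaling from counted sections to whole-brain numbers can be written out as a short Python sketch. It uses only the mean ipsilateral cortex counts quoted above and assumes that the counted series (every sixth section) is representative of the uncounted sections; the function name `estimate_whole_brain` is a hypothetical helper.

```python
SAMPLING_INTERVAL = 6  # only every sixth 50 µm section was counted

# Mean counts of tdTomato-labeled neurons in ipsilateral cortex
# for the N*PEST virus without TEVP, as quoted in the text.
counted_means = {"12 days": 2772, "3 weeks": 13070}


def estimate_whole_brain(count, interval=SAMPLING_INTERVAL):
    """Rough whole-brain estimate, assuming counted sections are representative."""
    return count * interval


estimates = {time: estimate_whole_brain(n) for time, n in counted_means.items()}
for time, total in estimates.items():
    print(f"{time}: approximately {total:,} labeled neurons in ipsilateral cortex")
```

These are order-of-magnitude estimates only, since the simple multiplication ignores any unevenness in how labeled cells are distributed across sections.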
In contrast, the NPEST virus, which had the intact PEST domain fused to its nucleoprotein, showed no clear evidence of transsynaptic spread without TEVP at either time point examined (top rows of images in panels C and D, and leftmost bar in each pair in the charts in panel E). At 12 days, we found averages of 10 labeled cells at the injection site, 2 in contralateral cortex, and none at all in thalamus. At 3 weeks, the situation was not much different: we found averages of 30 labeled cells at the injection site, 0.333 in contralateral cortex, and still zero in thalamus. These numbers were not significantly different from those in the matched controls without G, for which the means at 12 days were 4.667, 0.667, and 0 (injection site, contralateral cortex, and thalamus, respectively) and the means at 3 weeks were 12.333, 0, and 0. Single-factor ANOVAs were used for all comparisons; all cell counts and results of statistical analyses are given in Supplementary File S8.
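For readers unfamiliar with the test, the F statistic of a single-factor ANOVA comparing two groups of three animals can be sketched in plain Python as follows. The input values are toy placeholders chosen for easy arithmetic and are not data from this study (the per-animal counts are in Supplementary File S8); the function name `one_way_anova_f` is hypothetical.

```python
def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA.

    Assumes at least two groups and nonzero within-group variance.
    """
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy placeholder values (NOT cell counts from this study), n = 3 per group.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
f_stat, df_b, df_w = one_way_anova_f(group_a, group_b)
print(f_stat, df_b, df_w)
```

Converting F to a p-value requires the F distribution; a library routine such as `scipy.stats.f_oneway` returns both values directly. With only three animals per group, the within-group degrees of freedom are small (4 for a two-group comparison), which limits the power of such comparisons.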
We then tested the ability of the two viruses to spread when both TEVP and G are supplied (Figure 5 and Supplementary Figure S6). These experiments were done in exactly the same way as the ones without TEVP but with a third virus, AAV1-TREtight-H2b-emiRFP670-TEVP, included in the helper virus mixture. This third helper virus was of the same design as the G-expressing AAV (AAV1-TREtight-mTagBFP2-B19G) but expressed TEVP (S219V mutant) instead of G and an H2b-fused near-infrared fluorescent protein (emiRFP670) instead of the blue fluorophore mTagBFP2.
Figure 5. With the experimental design exactly the same as shown in Figure 4, panel B but with a TEVP-expressing AAV included in the helper virus mixture, the NPEST virus did spread transsynaptically, but to a more limited degree than the N*PEST one.
A, B) At both 12 (A) and 21 (B) days after injection, both NPEST (top row) and N*PEST (bottom row) viruses have labeled many cells in ipsilateral cortex (shown here), although many more cells are labeled by the N*PEST virus. Both viruses have also spread to contralateral cortex and thalamus (see Supplementary Figure S6 for images).
C - D) Counts of tdTomato+ cells (C) and ratios (D) of number of tdTomato+ cells to number of “starter cells”, defined here as cells coexpressing tdTomato, mTagBFP2, and emiRFP670, with TEVP supplied for both viruses (in cases in which TEVP is not supplied (see below), starter cells are defined as cells coexpressing tdTomato and mTagBFP2). Both the absolute numbers and the ratios are higher (in all cases except for contralateral cortex at 12 days) for the N*PEST virus than for the NPEST one, although the differences were not significant with the small size of each group (see text). Note that the inclusion of TEVP reduced the spread of the N*PEST virus considerably (cf. Figure 4; all counts and statistical comparisons are in Supplementary File S8).
E - F) Comparison of the best condition for the NPEST virus (with TEVP; same data as in panels C and D above) to the best condition for the N*PEST one (without TEVP; same data as in Figure 4) gives highly significant differences for most comparisons, with the advantage of the N*PEST virus over the NPEST version increasing between 12 days and 3 weeks.
With both TEVP and G provided, the NPEST virus did show evidence of spread: at 12 days, there were on average 469 labeled cells in ipsilateral cortex, 5 in contralateral cortex, and 3 in thalamus; at 3 weeks, there were on average 1,257 in ipsilateral cortex, 4 in contralateral cortex, and 18 in thalamus.
The N*PEST virus still labeled many more cells than the NPEST one did under these conditions (panels C and D). At 12 days, the N*PEST virus had labeled on average 4.5 times as many cells in ipsilateral cortex (2,090 cells), twice as many in contralateral cortex (10 cells), and 13.3 times as many in thalamus (40 cells); at 3 weeks, it had labeled 2.7 times as many cells in ipsilateral cortex (3,397 cells), 9.3 times as many in contralateral cortex (37 cells), and 2.4 times as many in thalamus (43 cells). While these differences were not found to be statistically significant, the comparisons were presumably underpowered, due to the very low number of subjects (n=3) we used per group, which was dictated by the limited availability of Ai14 mice.
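The fold differences quoted above follow directly from the group means. A minimal Python sketch of the arithmetic, using the rounded means given in the text for the condition with TEVP supplied, is shown below; because the inputs are rounded, a computed ratio can differ in the last digit from the value in the text, which was presumably derived from unrounded means.

```python
# Mean counts of labeled cells with both TEVP and G supplied, as quoted in the text.
npest = {
    "12 days": {"ipsilateral cortex": 469, "contralateral cortex": 5, "thalamus": 3},
    "3 weeks": {"ipsilateral cortex": 1257, "contralateral cortex": 4, "thalamus": 18},
}
nstar_pest = {
    "12 days": {"ipsilateral cortex": 2090, "contralateral cortex": 10, "thalamus": 40},
    "3 weeks": {"ipsilateral cortex": 3397, "contralateral cortex": 37, "thalamus": 43},
}

# Ratio of N*PEST to NPEST mean counts for each time point and region.
fold = {
    time: {
        region: nstar_pest[time][region] / npest[time][region]
        for region in npest[time]
    }
    for time in npest
}

for time, regions in fold.items():
    for region, ratio in regions.items():
        print(f"{time}, {region}: {ratio:.2f}-fold")
```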
Comparing the best conditions for the NPEST virus (that is, with TEVP provided) to the best conditions for the N*PEST one (that is, without TEVP provided) gave very large and highly significant differences for almost all comparisons (panels E and F). At 3 weeks, for example, the N*PEST virus had labeled 10.3 times as many cells in ipsilateral cortex (averages of 13,070 vs 1,257; p=0.000119), 14.3 times as many in thalamus as the NPEST virus (252.6 vs 17.6; p=0.00433), and 117.8 times as many cells in contralateral cortex (506.6 vs 4.3; p=0.00550).
DISCUSSION
Our transsynaptic tracing results using virus with an intact PEST domain contradict the earlier claim that such a virus can spread between neurons in the absence of TEVP(24). Our findings from sequencing the SiR viruses from the originating laboratory suggest that the reason for that claim was that the viruses used in the original study had lost the intended modification.
While it is possible that the escape of the two SiR virus samples from the modification intended to attenuate them was a fluke due to bad luck with those two batches, we view this as unlikely, for three reasons. First, the reported finding that viral replication and spread occurred in the absence of TEVP is difficult to understand in the absence of mutations but is easily explained if the viral preparations used for those experiments harbored the kind of mutations that we found in the two preparations to which we had access. Second, the two SiR virus samples that we analyzed had independently developed mutations causing loss of the intended C-terminal addition to the nucleoprotein; we know that the mutations were independent because the two samples were of different viruses so did not both derive from a single compromised parental stock. Third, the mutation profiles of the two viruses were very different: whereas the SiR-CRE sample had the same point mutation in nearly 100% of its viral particles, only a minority of the SiR-FLPo particles had that particular mutation, with the majority having a different point mutation three codons away that had the same result. This suggests that any of the many opportunities for removing the C-terminal addition — creation of a premature stop codon at any one of a number of sites, or a frameshift mutation anywhere in the vicinity — can be exploited by a given batch of virus, greatly increasing the probability of such mutants arising.
While it is clearly possible to make virus with the intended modification to the nucleoprotein, because we have done so here (and the authors of the original paper also report doing so, in a recent preprint (49) in response to our own preprint of an earlier version of this paper), our findings suggest that the approach in its current form is vulnerable to being undermined by viral mutation. Although a number of groups have made recombinant rabies viruses — as well as other rhabdoviruses and other nonsegmented negative-strand RNA viruses — encoding fusions of exogenous proteins to viral proteins (50-60), most of these groups have found that the additions significantly impaired function, and some have found that the viruses rapidly lost C-terminal additions to viral proteins. For example, an attempt to make SAD B19 rabies virus with EGFP fused to the C-terminus of the nucleoprotein was unsuccessful, suggesting that large C-terminal additions make the nucleoprotein dysfunctional; the authors of that paper resorted instead to making virus encoding the fusion protein in addition to the wild-type nucleoprotein (53). A vesicular stomatitis virus with GFP fused to the C-terminus of the glycoprotein gene lost the modification within a single passage of the virus because of a point mutation creating a premature stop codon (50). Relatedly, a VSV with its glycoprotein gene replaced with that of a different virus was found to quickly develop a premature stop codon causing loss of the last 21 amino acids of the exogenous glycoprotein, conferring a marked replication advantage to the mutants bearing the truncated version (61). Generalizing from these prior examples as well as our findings here, we suggest that any attempt to attenuate a virus by addition to the C-terminus of a viral protein will be vulnerable to loss of the modification, and that any such virus will therefore need to be monitored very carefully.
It is unclear whether further improvements to the design of the viruses can be made to make loss of the PEST domain less likely. Although the synonymous mutations that we made to five codons near the junction of N and the 3’ addition are a good start, there are numerous other codons in the immediate vicinity that are also one point mutation away from being stop codons but that do not have synonyms that are not. Furthermore, no such changes would protect against frameshifts.
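The limitation described above can be made concrete with a small Python sketch that lists the amino acids for which every codon is a single substitution away from a stop codon. It assumes the standard genetic code in DNA notation; the variable names are hypothetical, and the sketch says nothing about which of these amino acids actually occur near the junction in the viral genome.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order, last position varying fastest.
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
STOPS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}


def neighbors(codon):
    """All codons that differ from `codon` by exactly one nucleotide."""
    return {
        codon[:i] + base + codon[i + 1:]
        for i in range(3)
        for base in BASES
        if base != codon[i]
    }


# Sense codons that a single substitution can convert into a stop codon.
at_risk = {c for c in CODON_TABLE if c not in STOPS and neighbors(c) & STOPS}

codons_by_amino_acid = {}
for codon, aa in CODON_TABLE.items():
    if aa != "*":
        codons_by_amino_acid.setdefault(aa, []).append(codon)

# Amino acids for which no synonymous codon avoids the risk.
no_safe_synonym = sorted(
    aa
    for aa, codons in codons_by_amino_acid.items()
    if all(codon in at_risk for codon in codons)
)
print(len(at_risk), no_safe_synonym)
```

Under the standard code, this analysis identifies cysteine, glutamate, lysine, glutamine, tryptophan, and tyrosine as residues that cannot be protected by synonymous recoding. As noted in the text, frameshift mutations are outside the scope of any such recoding.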
If the viruses used for the transsynaptic tracing experiments in the original paper were actually de facto first-generation, ΔG viruses like the SiR samples that we analyzed, how could the authors have found, in postmortem tissue, cells labeled by SiR-CRE that had survived for weeks? The answer may simply be that, as we have shown in Chatterjee et al. (21) and again here (Figure 3), and as the original authors also report in their new preprint (49), a preparation of first-generation rabies viral vector expressing Cre can leave a large fraction of labeled cells alive for at least months, in contrast to similar ones encoding tdTomato (21) or EGFP (4). Similarly, Gomme et al. found long-term survival of some neurons following infection by a replication-competent rabies virus expressing Cre (62). Our results with the “revertant” control virus RVΔG-N*PEST-4Cre (Figures 4 and 5, and Supplementary Figures S5-S7), showing thousands of labeled neurons three weeks after rabies virus injection, are consistent with this as well.
One reason, in turn, why a preparation of a simple ΔG rabies virus encoding Cre can leave many cells alive may be that not all the virions are in fact first-generation viral particles, because of the high mutation rate that we have highlighted in this paper. We have shown in Chatterjee et al. (21) that a second-generation (ΔGL) rabies virus, which has both its glycoprotein gene G and its polymerase gene L deleted, leaves cells alive for the entire four months that we followed them. However, any first-generation (ΔG) virus that contains a frameshift or point mutation knocking out L will in practice be a ΔGL virus. Indeed, a stop codon or frameshift mutation in any of several other viral genes is likely to have a similar effect as one in L (and it might be that the Ser419Thr mutation that we found in 9.49% of our RVΔG-4Cre virions is just such a knockout mutation of N). Together with the high mutation rate of rabies virus, this means that, within every preparation of first-generation rabies virus, there is almost guaranteed to be a population of de facto second-generation variants mixed in with the intended first-generation population and propagated in the producer cells by complementation by the first-generation virions. Any rabies virus preparation (whether made in the laboratory or occurring naturally) can be expected to contain a population of such knockout (whether by substitution, frameshift, or deletion) mutants (related to the classic phenomenon of “defective interfering particles”, or mutants with a marked replication advantage (63-65)), and the higher the multiplicity of infection when passaging the virus, the higher the proportion of such freeloading viral particles typically will be. This would not necessarily be noticed in the case of a virus encoding a more common transgene product such as a fluorophore, because the expression levels of these by the knockout mutants would be too low to label cells clearly (see Figure 1 in Chatterjee et al. (21)).
However, with Cre as the payload, any “second-generation” particles would be able to label neurons but not kill them, because second-generation rabies viral vectors do not kill cells for at least months (21). This explanation would predict that the percentage of neurons surviving infection with a rabies virus encoding Cre will depend on the particular viral preparation that is injected, with some having a greater fraction of knockout particles than others.
This could explain why the SiR-CRE virus sample killed cells faster than our own RVΔG-4Cre (Figure 3). This analysis would also presumably apply to first-generation (ΔG) viruses expressing FLPo: while we found that the FLPo-expressing version that we made did not leave as many cells alive as the Cre-expressing version did (Supplementary Figure S8), that preparation may simply have had fewer mutants with knockout of genes essential for replication.
On a positive note, we found that virus with an intact PEST domain does spread between neurons when TEVP is supplied. This suggests that such viruses could in fact become the basis for monosynaptic tracing systems with reduced toxicity. Our finding is complementary to those of the original authors in their recent preprint (49) that an intact SiR virus did not kill labeled neurons for five months: that study was of neurons directly infected by B19G-enveloped virus (i.e., without using EnvA-enveloped virus, expression of TVA to mediate its selective infection, expression of G to complement the ΔG virus, or expression of TEVP to remove the C-terminal addition to the nucleoprotein); they did not show that their intact virus could spread between neurons or that it is nontoxic as it does so. Conversely, we showed here that an intact NPEST virus can spread between neurons, but we did not assay toxicity during this process. It remains to be seen to what degree an NPEST or SiR virus that spreads transsynaptically is cytotoxic, not only to transsynaptically-labeled cells but also to the starting cells, which need to express both G and TEVP to allow replication and spread of the virus. This is not a trivial point, as G is toxic when overexpressed (66), and the original authors’ finding that cultured cells rapidly lost TEVP activity (49) suggests that TEVP expressed at sufficient levels may be toxic as well.
Although we found that the “intact” NPEST virus spread considerably less than the “revertant” control virus, there may be room for improvement: the TEVP expression levels that we engineered are unlikely to be optimal, and the numbers of labeled neurons could increase further with longer survival times. On the other hand, if overexpression of TEVP causes toxicity (see above), increasing it beyond what we achieved could be deleterious. Another consideration is the nine additional amino acids that are unavoidably left on the C-terminus of the nucleoprotein after the rest of the addition is removed by TEVP: while these additional amino acids may not much impair the function of the protein, they are unlikely to help it.
It is also unclear to what degree a minority population of revertant mutants that did arise in stocks of otherwise-intact virus would pose a problem for monosynaptic tracing studies. If, for example, ∼5% of the virions in a given preparation were revertant mutants (e.g., as the original authors report obtaining after six passages in cells highly expressing TEVP (49)), one might expect the same percentage of labeled presynaptic neurons to be labeled by those mutants and therefore to experience the toxicity of infection by a first-generation, ΔG virus. However, because the process of infecting the starter cells, replicating within them (if provided with G), and spreading to other cells is comparable to an additional passage in cell culture, the percentage of presynaptic cells labeled by the revertant mutants could be higher, and would presumably depend on the level of TEVP expression in each starter cell.
In summary, our results suggest that rabies virus with a PEST domain added to its nucleoprotein only spreads between neurons if the PEST domain is removed, whether by expression of TEVP or by mutation. Our finding that virus with an intact PEST domain can spread when TEVP is provided, as the designers had presumably originally intended, raises the possibility that further optimization and validation could make the SiR approach a viable option for monosynaptic tracing with reduced toxicity.
SUMMARY OF METHODS (see Supplementary Methods for details)
Cloning
The following novel plasmids were made using standard cloning techniques (see Supplementary Methods): pLV-CAG-FLEX-BFP-(mCherry)’ (Addgene 115234), pLV-CAG-F14F15S-BFP-(mCherry)’ (Addgene 115235), pLV-U-TVA950 (Addgene 115236), pRVΔG-4FLPo (Addgene 122050), pAAV-synP-F14F15S-splitTVA-EGFP-tTA (Addgene 136917), pB-CAG-TEVP-IRES-mCherry (Addgene 174377), pAAV-synP-FLPo (Addgene 174378), pAAV-TREtight-H2b-emiRFP670-TEVP (Addgene 174379), pRVΔG-NPEST-4Cre (Addgene 174380), pRVΔG-N*PEST-4Cre (Addgene 174381), pCAG-hypBase. All of the above novel plasmids have been deposited with Addgene and can be obtained from there, except for pCAG-hypBase, the distribution of which is not permitted due to intellectual property constraints.
Production of lentiviral and adeno-associated viral vectors
Lentiviral vectors were made as described (67) but using the vesicular stomatitis virus envelope expression plasmid pMD2.G for all vectors except for LV-U-TVA950(B19G), which was made using the rabies virus envelope expression plasmid pCAG-B19GVSVGCD (67).
AAV1-synP-F14F15S-splitTVA-EGFP-tTA was packaged as serotype 1 by the UNC vector core (and can be purchased from there as well as from Addgene (catalog # 136917)).
AAV1-TREtight-mTagBFP2-B19G (which we have described previously (46, 47)) was packaged as serotype 1 by Addgene (catalog # 100798-AAV1).
AAV1-TREtight-H2b-emiRFP670-P2A-TEVP and AAV2-retro-synP-FLPo were made by standard techniques (see Supplementary Methods).
Production of titering cell lines
Reporter cell lines 293T-FLEX-BC and 293T-F14F15S-BC were made using lentiviral vectors made from pLV-CAG-FLEX-BFP-(mCherry)’ and pLV-CAG-F14F15S-BFP-(mCherry)’, described above. TVA-expressing versions, 293T-FLEX-BC-TVA and 293T-F14F15S-BC-TVA, were made by infecting the above lines with LV-U-TVA950(VSVG) (described above).
Production of TEVP-expressing cell line
293T-TEVP was made by transfecting HEK 293T/17 cells with pCAG-hypBase and pB-CAG-TEVP-IRES-mCherry, then sorting.
Production and titering of rabies viruses
RVΔG-4Cre, RVΔGL-4Cre, RVΔG-NPEST-4Cre, and RVΔG-N*PEST-4Cre were produced mostly as described (21, 34) (see Supplementary Methods); titering was as described (32) but with the 293T-FLEX-BC and 293T-F14F15S-BC lines used for B19G-enveloped viruses and the 293T-FLEX-BC-TVA and 293T-F14F15S-BC-TVA lines used for the EnvA-enveloped viruses.
Extraction of viral genomic RNA and preparation for Sanger sequencing
RNA viral genomes were extracted from virus samples using a Nucleospin RNA kit (Macherey-Nagel, Germany), then converted to cDNA by RT-PCR (Agilent Technologies, USA) with a barcoded primer. cDNA sequences were amplified using Platinum SuperFi Green Master Mix (Invitrogen (Thermo Fisher), USA) and cloned into pEX-A (Eurofins Genomics, USA) using an In-Fusion HD Cloning Kit (Takara Bio, Japan). Sequencing data was collected for over fifty clones per sample.
Single-molecule, real-time (SMRT) sequencing
Double-stranded DNA samples for SMRT sequencing were prepared similarly to the above, except that the clones generated from each of the three virus samples were tagged with one of the standard PacBio barcode sequences to allow identification of each clone’s sample of origin following multiplex sequencing. This was in addition to the random index (10 nucleotides in this case) that was again included in the RT primers in order to uniquely tag each individual genome.
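The two-level tagging scheme (a sample barcode plus a 10-nucleotide random index per genome) can be illustrated with a minimal Python sketch. The read records below are made-up toy strings rather than data from this study, the labels `sample_A` and `sample_B` are hypothetical stand-ins for the PacBio barcodes, and the sketch is not the analysis pipeline that was actually used.

```python
from collections import defaultdict

INDEX_LENGTH = 10
# Number of distinct random indices available with four bases at ten positions.
possible_indices = 4 ** INDEX_LENGTH


def group_reads(reads):
    """Group reads by (sample barcode, random index); each key represents one genome."""
    genomes = defaultdict(list)
    for sample_barcode, random_index, sequence in reads:
        genomes[(sample_barcode, random_index)].append(sequence)
    return genomes


# Toy reads for illustration only: (sample barcode label, random index, read sequence).
toy_reads = [
    ("sample_A", "ACGTACGTAC", "ATGGAT"),
    ("sample_A", "ACGTACGTAC", "ATGGAT"),  # second read of the same genome
    ("sample_A", "TTGCAAGGCT", "ATGGAC"),
    ("sample_B", "ACGTACGTAC", "ATGGAT"),  # same index, but a different sample
]

genomes = group_reads(toy_reads)

# Count distinct genomes (unique random indices) recovered per sample.
genomes_per_sample = defaultdict(int)
for sample_barcode, _random_index in genomes:
    genomes_per_sample[sample_barcode] += 1

print(possible_indices, dict(genomes_per_sample))
```

Keying on both tags means that reads sharing a random index are pooled only when they also come from the same sample, so each group can be treated as repeated observations of one original genome.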
Surgeries and virus injections for two-photon imaging
All experimental procedures using mice were conducted according to NIH guidelines and were approved by the MIT Committee for Animal Care (CAC). Mice were housed 1-4 per cage under a normal light/dark cycle for all experiments.
Adult mice of the Cre-dependent reporter strains Ai14 (68) (Jackson Laboratory #007908) or Ai35D (40) (Jackson Laboratory #012735) were injected in V1 with LV-U-TVA950(B19G), then implanted with a glass window. Seven days later, windows were removed and one of the three EnvA-enveloped rabies viral vectors (with equalized titers) was injected at the same coordinates, then coverslips were reapplied.
In vivo two-photon imaging and image analysis
Beginning seven days after injection of each rabies virus and continuing every seven days up to a maximum of four weeks following rabies virus injection, the injection sites were imaged on a two-photon microscope. One field of view was chosen in each mouse in the area of maximal fluorescent labelling. Cell counting was performed with the ImageJ Cell Counter plugin.
Monosynaptic tracing experiments: surgeries and virus injections
The three helper AAV1s were combined at final titers of 3.6E10 gc/ml for AAV1-synP-F14F15S-splitTVA-EGFP-tTA and 6.60E11 gc/ml for AAV1-TREtight-mTagBFP2-B19G and/or AAV1-TREtight-H2b-emiRFP670-P2A-TEVP. 250 nl of helper virus mixture was injected into layer 5 of barrel cortex of Ai14 mice; in the same surgery, 300 nl of AAV2-retro-synP-FLPo (1.16E13 gc/ml) was injected into dorsolateral striatum. Seven days after AAV injection, 300 nl of RVΔG-NPEST-4Cre(EnvA) (1.86E9 iu/ml) or RVΔG-N*PEST-4Cre(EnvA) (diluted to 1.86E9 iu/ml) was injected in barrel cortex at the same site as the helper AAV mixtures.
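The nominal number of viral particles delivered in each injection follows from the titer and the injected volume. The Python sketch below works through this arithmetic for the values given above; the function name `particles_delivered` and the dictionary labels are hypothetical, and the results are nominal doses that ignore any loss of virus during loading or injection.

```python
NL_PER_ML = 1e6  # nanoliters per milliliter


def particles_delivered(titer_per_ml, volume_nl):
    """Nominal genome copies (gc) or infectious units (iu) in an injected volume."""
    return titer_per_ml * volume_nl / NL_PER_ML


# Titers and volumes as stated in the text.
doses = {
    "TVA helper AAV (gc)": particles_delivered(3.6e10, 250),
    "G or TEVP helper AAV (gc)": particles_delivered(6.60e11, 250),
    "AAV2-retro-synP-FLPo (gc)": particles_delivered(1.16e13, 300),
    "rabies virus (iu)": particles_delivered(1.86e9, 300),
}

for name, dose in doses.items():
    print(f"{name}: {dose:.3g}")
```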
Monosynaptic tracing experiments: perfusions and histology
12 days or 3 weeks after injection of rabies virus, mice were perfused. Brains were postfixed overnight and cut into 50 µm coronal sections on a vibrating microtome. Sections were immunostained as described(47) with a chicken anti-GFP (Aves Labs GFP-1020) 1:500 and donkey anti-chicken Alexa Fluor 488 (Jackson Immuno 703-545-155) 1:200.
Monosynaptic tracing experiments: cell counts and microscopy
tdTomato-labeled neurons in contralateral cortex and thalamus were counted manually with the Cell Counter plugin in ImageJ. Cells at the injection site were counted either manually or (when dense) using the Analyze Particles function in ImageJ. Only one of the six series of sections (i.e., every sixth section: see above) was counted for each mouse. Images for figures were taken on a confocal microscope (Zeiss, LSM 900).
SUPPLEMENTARY METHODS
Cloning
Lentiviral transfer plasmids were made by cloning, into pCSC-SP-PW-GFP (1) (Addgene #12337), the following components:
the CAG promoter (2) and a Cre-dependent “FLEX” (3) construct consisting of pairs of orthogonal lox sites flanking a back-to-back fusion of the gene for mTagBFP2 (4) immediately followed by the reverse-complemented gene for mCherry (5), to make the Cre reporter construct pLV-CAG-FLEX-BFP-(mCherry)’ (Addgene 115234);
the CAG promoter (2) and a Flp-dependent “FLEX” (3) construct consisting of pairs of orthogonal FRT sites (6) flanking a back-to-back fusion of the gene for mTagBFP2 (4) immediately followed by the reverse-complemented gene for mCherry (5), to make the Flp reporter construct pLV-CAG-F14F15S-BFP-(mCherry)’ (Addgene 115235);
the ubiquitin C promoter from pUB-GFP (7) (Addgene 11155) and the long isoform of TVA (8) to make the TVA expression vector pLV-U-TVA950 (Addgene 115236).
The first-generation vector genome plasmid pRVΔG-4FLPo (Addgene 122050) was made by cloning the FLPo gene (9) into pRVΔG-4Cre.
pAAV-synP-F14F15S-splitTVA-EGFP-tTA (Addgene 136917) is a FLP-dependent version of the Cre-dependent helper virus genome plasmid pAAV-synP-FLEX-splitTVA-EGFP-tTA (Addgene 52473) with orthogonal FRT sites(6) instead of orthogonal lox sites.
pCAG-hypBase was made by synthesizing the 1785-bp gene for an improved version of piggyBac transposase (10) and cloning it into the EcoRI and NotI sites of pCAG-GFP (7) (Addgene 11150).
pB-CAG-TEVP-IRES-mCherry (Addgene 174377) was made by cloning the CAG promoter (2), a mammalian codon-optimized version (11, 12) of the TEVP gene (S219V mutant) (13), the EMCV IRES (14), and the mCherry gene (5) into pB-CMV-MCS-EF1-Puro (System Biosciences #PB510B-1).
pAAV-synP-FLPo (Addgene 174378) was made by cloning the FLPo gene(9) into the EcoRI and AccIII sites of pAAV-synP-FLEX-EGFP-B19G (Addgene 59333).
pAAV-TREtight-H2b-emiRFP670-TEVP (Addgene 174379) was made by cloning an H2b-emiRFP670 fusion gene (15) (sequence from Addgene 136571 but with the internal kozak sequence replaced by a short GSG linker to prevent translation of fluorophore unfused to H2b), followed by a P2A sequence and the above-described TEVP gene, into the EcoRI and NheI sites of pAAV-TREtight-mTagBFP2-B19G (16) (Addgene 100799).
pRVΔG-NPEST-4Cre (Addgene 174380) was made by cloning a synthesized fragment containing the 3’ addition from Ciabatti et al. ’17, with synonymous changes to 5 codons (17) in the immediate vicinity of the junction between the nucleoprotein gene and the 3’ addition so that more than a single point mutation would be required to convert them to stop codons, into the PmlI and BstI sites of pRVΔG-4Cre(18) (Addgene 98034).
pRVΔG-N*PEST-4Cre (Addgene 174381) was made identically to the above except that the glycine codon in position 453 was replaced with a stop codon (TGA).
All of the above novel plasmids have been deposited with Addgene, with the accession numbers given above, and can be purchased from there except for pCAG-hypBase, the distribution of which is not permitted due to intellectual property constraints.
Production of lentiviral and adeno-associated viral vectors
Lentiviral vectors were made by transfection of HEK-293T/17 cells (ATCC 11268) as described (19) but using the vesicular stomatitis virus envelope expression plasmid pMD2.G (Addgene 12259) for all vectors except for LV-U-TVA950(B19G), which was made using the rabies virus envelope expression plasmid pCAG-B19GVSVGCD (19). Lentiviral vectors expressing fluorophores were titered as described (20); titers of LV-U-TVA950(VSVG) and LV-U-TVA950(B19G) were assumed to be approximately the same as those of the fluorophore-expressing lentiviral vectors produced in parallel.
AAV1-synP-F14F15S-splitTVA-EGFP-tTA was packaged as serotype 1 by the UNC vector core (and can be purchased from there as well as from Addgene (catalog # 136917)).
AAV1-TREtight-mTagBFP2-B19G (which we have described previously(16, 21)) was packaged as serotype 1 by Addgene (catalog # 100798-AAV1).
AAV1-TREtight-H2b-emiRFP670-P2A-TEVP and AAV2-retro-synP-FLPo were made by transfecting HEK 293T/17 cells with the respective genome plasmid along with pHelper (Cellbiolabs VPK-421) and either pAAV-RC1 (Cellbiolabs VPK-421) or “rAAV2-retro helper” (Addgene 81070)(22), using Xfect Transfection Reagent (Takara 631318) according to the manufacturer’s protocol, with collection of supernatant (and replacement with fresh media) at 3 days after transfection and collection of supernatant as well as the transfected cells at 5 days after transfection. Virus was pelleted from supernatants using PEG 8000; cells were lysed by four freeze-thaw cycles. Pelleted virus and cell lysate were pooled and treated with benzonase, purified on an iodixanol gradient, then concentrated in an Amicon Ultra-15 centrifugal filter unit (Millipore Sigma UFC9100).
Production of titering cell lines
To make reporter cell lines, HEK-293T/17 cells were infected with lentiviral vectors made from either pLV-CAG-FLEX-BFP-(mCherry)’ or pLV-CAG-F14F15S-BFP-(mCherry)’ at a multiplicity of infection of 100 in one 24-well plate well each. Cells were expanded to two 15 cm plates each, then sorted on a FACS Aria to retain the top 10% most brightly blue fluorescent cells. After sorting, cells were expanded again to produce the cell lines 293T-FLEX-BC and 293T-F14F15S-BC, reporter lines for Cre and FLPo activity, respectively. TVA-expressing versions of these two cell lines were made by infecting one 24-well plate well each with LV-U-TVA950(VSVG) at an MOI of approximately 100; these cells were expanded to produce the cell lines 293T-FLEX-BC-TVA and 293T-F14F15S-BC-TVA.
Production of TEVP-expressing cell line
The 293T-TEVP cell line was made by transfecting HEK 293T/17 (ATCC CRL-11268) cells with pCAG-hypBase and pB-CAG-TEVP-IRES-mCherry in a 15 cm plate (293T/17) or one well each of a 24 well plate (BHK-B19G2 and BHK-EnvA2) using Lipofectamine 2000 (Thermo Fisher 11668019) according to the manufacturer’s instructions, then expanding and sorting the cells on a FACSAria (BD Biosciences) for the brightest 20% of red fluorescent cells, then expanding and freezing the sorted cells.
Production and titering of rabies viruses
RVΔG-4Cre and RVΔGL-4Cre were produced as described (18, 23), with EnvA-enveloped viruses made by using cells expressing EnvA instead of G for the last passage. Titering and infection of cell lines with serial dilutions of viruses were as described (24), with the 293T-FLEX-BC and 293T-F14F15S-BC lines used for B19G-enveloped viruses and the 293T-FLEX-BC-TVA and 293T-F14F15S-BC-TVA lines used for the EnvA-enveloped viruses. For the in vivo injections, the three EnvA-enveloped, Cre-encoding viruses were titered side by side, and the two higher-titer viruses were diluted so that the final titers of the injected stocks of all three viruses were approximately equal at 1.39E9 infectious units per milliliter.
The two new rabies viruses RVΔG-NPEST-4Cre and RVΔG-N*PEST-4Cre were produced as described (18, 23) but with only one amplification passage (P1) between the rescue transfection step and the final passage on EnvA-expressing cells. For the NPEST version, the following additional modifications were made: for the rescue transfection, the new cell line 293T-TEVP (see above) was used instead of HEK 293T/17, and pB-CAG-TEVP-IRES-mCherry (10 µg per 15 cm plate) was included in the plasmid mix. Because pilot testing showed no clear advantage of the 293T-TEVP line over transfected 293Ts for passaging, the P1 passage was on HEK 293T/17 cells transfected with equal amounts of pCAG-B19G and pB-CAG-TEVP-IRES-mCherry (32 µg of each plasmid per 15 cm plate). The final passage was on BHK-EnvA2 cells (24) transfected with pB-CAG-TEVP-IRES-mCherry. Lipofectamine 2000 (Thermo Fisher 11668019) was used for all transfections, following the manufacturer’s protocol.
The final stocks of RVΔG-NPEST-4Cre(EnvA) and RVΔG-N*PEST-4Cre(EnvA) were titered side by side as described (24) on 293T-FLEX-BC-TVA cells (see above) in two-fold dilution series. Final titers of the viruses were determined to be 1.86e9 iu/ml for RVΔG-NPEST-4Cre(EnvA) and 2.41e10 iu/ml (normalized) for RVΔG-N*PEST-4Cre(EnvA). Prior to injection in vivo (see below), the N*PEST virus was diluted 12.96-fold in DPBS to match the titer of the NPEST version.
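The titer matching described above reduces to a ratio of the two measured titers. The following is a minimal sketch in Python of that arithmetic, using the titers reported in the text; the function names are hypothetical and are not part of any protocol used in this study.

```python
def dilution_factor(stock_titer, target_titer):
    """Fold dilution needed to bring a stock down to a target titer."""
    if stock_titer <= 0 or target_titer <= 0:
        raise ValueError("titers must be positive")
    if target_titer > stock_titer:
        raise ValueError("cannot dilute up to a higher titer")
    return stock_titer / target_titer


def diluent_per_unit_stock(fold):
    """Volumes of diluent (e.g. DPBS) to add per one volume of stock."""
    return fold - 1.0


# Titers reported in the text (infectious units per ml).
n_star_pest_titer = 2.41e10  # RVΔG-N*PEST-4Cre(EnvA)
npest_titer = 1.86e9         # RVΔG-NPEST-4Cre(EnvA)

fold = dilution_factor(n_star_pest_titer, npest_titer)
diluent = diluent_per_unit_stock(fold)
print(f"dilute {fold:.2f}-fold: 1 volume stock + {diluent:.2f} volumes diluent")
```

Under these assumptions, one volume of the higher-titer stock is combined with roughly twelve volumes of diluent to reach the lower titer.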
Immunostaining and imaging of cultured cells
Reporter cells (see above) plated on coverslips coated in poly-L-lysine (Sigma) were infected with serial dilutions of RVΔG-4Cre and RVΔGL-4Cre as described (24). Three days after infection, cells were fixed with 2% paraformaldehyde, washed repeatedly with blocking/permeabilization buffer (0.1% Triton-X (Sigma) and 1% bovine serum albumin (Sigma) in PBS), then labeled with a blend of three FITC-conjugated anti-nucleoprotein monoclonal antibodies (Light Diagnostics Rabies DFA Reagent, EMD Millipore 5100) diluted 1:100 in blocking buffer for 30 minutes, followed by further washes in blocking buffer, then finally briefly rinsed with distilled water and air-dried before mounting the coverslips onto microscope slides with Prolong Diamond Antifade (Thermo P36970) mounting medium. Images of wells at comparable multiplicities of infection (∼0.1) were collected on a Zeiss 710 confocal microscope.
Extraction of viral genomic RNA and preparation for Sanger sequencing
RNA viral genomes were extracted from two Tripodi lab (EnvA/SiR-CRE and EnvA/SiR-FLPo) and three Wickersham lab (RVΔG-4mCherry, RVΔG-NPEST-4Cre(EnvA), and RVΔG-N*PEST-4Cre(EnvA)) rabies virus samples using a Nucleospin RNA kit (Macherey-Nagel, Germany) and treated with DNase I (37°C for 1 hour, followed by 70°C for 5 minutes). Extracted RNA genomes were converted to complementary DNA using an AccuScript PfuUltra II RT-PCR kit (Agilent Technologies, USA) at 42°C for 2 hours with the following barcoded primer (barcoded so that individual viral particles’ genomes would be marked with distinct barcodes), which anneals to the rabies virus leader sequence:
Adapter_N8_leader_fp: TCAGACGATGCGTCATGCNNNNNNNNACGCTTAACAACCAGATC
cDNA sequences from the leader through the first half of the rabies virus P gene were amplified using Platinum SuperFi Green Master Mix (Invitrogen (Thermo Fisher), USA) with cycling conditions as follows: denaturation at 98°C for 30 seconds, followed by 25 cycles of amplification (denaturation at 98°C for 5 seconds and extension at 72°C for 75 seconds), with a final extension at 72°C for 5 minutes, using the following primers:
pEX_adapter_fp: CAGCTCAGACGATGCGTCATGC
Barcode2_P_rp: GCAGAGTCATGTATAGCTTCTTGAGCTCTCGGCCAG
The ∼2 kb PCR amplicons were extracted from an agarose gel, purified with Nucleospin Gel and PCR Clean-up (Macherey-Nagel, Germany), and cloned into pEX-A (Eurofins Genomics, USA) using an In-Fusion HD Cloning Kit (Takara Bio, Japan). The cloned plasmids were transformed into Stellar competent cells (Takara Bio, Japan), and 200 clones per rabies virus sample were isolated and purified for sequencing. For each clone, the index and the 3’ end of the N gene were sequenced until sequencing data had been collected for at least fifty clones per sample: 51 clones from SiR-CRE(EnvA), 50 from SiR-FLPo(EnvA), and 51 from RVΔG-4mCherry(EnvA). Although viral samples may contain plasmid DNA, viral mRNA, and positive-sense anti-genomic RNA, this RT-PCR procedure can amplify only the negative-sense RNA genome: the reverse transcription primer is designed to anneal to the leader sequence of the negative-strand genome, so that cDNA synthesis can start only from the negative-sense RNA genome and from no other template. Additionally, the PCR amplifies the cDNA and not any plasmids that were transfected into producer cell lines during viral vector production, because the forward PCR primer anneals to the adapter portion of the primer used in the reverse transcription rather than to any viral sequence. This RT-PCR protocol therefore ensures that only negative-sense RNA rabies viral genomes are sequenced.
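The primer logic described above can be checked directly from the two primer sequences given in the text. The following Python sketch is illustrative only: the function names, the example read, its 8-nt index, and the trailing filler bases are hypothetical and are not data from this study.

```python
# Primer sequences copied from the Methods text.
RT_PRIMER = "TCAGACGATGCGTCATGCNNNNNNNNACGCTTAACAACCAGATC"  # Adapter_N8_leader_fp
PCR_FWD = "CAGCTCAGACGATGCGTCATGC"  # pEX_adapter_fp


def split_rt_primer(primer):
    """Split an RT primer into (adapter, random index, genome-annealing part)."""
    start = primer.index("N")
    end = primer.rindex("N") + 1
    return primer[:start], primer[start:end], primer[end:]


def extract_index(read, adapter, index_length):
    """Return the random index that follows the adapter in a read, or None."""
    pos = read.find(adapter)
    if pos < 0:
        return None
    begin = pos + len(adapter)
    index = read[begin:begin + index_length]
    return index if len(index) == index_length else None


adapter, index_template, leader_part = split_rt_primer(RT_PRIMER)

# The PCR forward primer ends with the adapter, so it can only prime on
# molecules that already carry the adapter added during reverse transcription.
pcr_primer_targets_adapter = PCR_FWD.endswith(adapter)
# It shares no sequence with the genome-annealing part of the RT primer.
pcr_primer_in_leader_part = PCR_FWD in leader_part

# Hypothetical clone read: forward primer, made-up index, leader, filler.
example_read = PCR_FWD + "ACGTTGCA" + leader_part + "GATTACA"
example_index = extract_index(example_read, adapter, len(index_template))
print(pcr_primer_targets_adapter, pcr_primer_in_leader_part, example_index)
```

Clones that return the same index would be treated as copies of a single cDNA molecule, which is the purpose of the random index in the RT primer.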
Sanger sequencing of transgenes in SiR viruses
The procedure for sequencing the transgene inserts was the same as above, but with the RT primer being Adaptor_N8_M_fp (see below), annealing to the M gene and again with a random 8-nucleotide index to tag each clone, and with PCR primers pEX_adapter_fp (see above) and Barcode2_L_rp (see below), to amplify the sequences from the 3’ end of the M gene to the 5’ end of the L gene, covering the iCre-P2A-mCherryPEST (or FLPo-P2A-mCherryPEST) sequence.
Primers for RT and PCR for Sanger sequencing were as follows:
Adaptor_N8_M_fp: TCAGACGATGCGTCATGCNNNNNNNNCAACTCCAACCCTTGGGAGCA
Barcode2_L_rp: GCAGAGTCATGTATAGTTGGGGACAATGGGGGTTCC
Sanger sequencing analysis of PEST region in SiR and control viruses
Alignment and mutation detection were performed using SnapGene 4.1.9 (GSL Biotech LLC, USA). Reference sequences of the viral samples used in this study were based on deposited plasmids in Addgene: pSAD-F3-NPEST-iCRE-2A-mCherryPEST (Addgene #99608), pSAD-F3-NPEST-FLPo-2A-mCherryPEST (Addgene #99609), and pRVΔG-4mCherry (Addgene #52488). Traces corresponding to indices and mutations listed in Figure 1 and Supplementary File S1 were also manually inspected and confirmed.
Single-molecule, real-time (SMRT) sequencing
Double-stranded DNA samples for SMRT sequencing were prepared similarly to the above, except that the clones generated from each of the three virus samples were tagged with one of the standard PacBio barcode sequences to allow identification of each clone’s sample of origin following multiplex sequencing (see https://www.pacb.com/wp-content/uploads/multiplex-target-enrichment-barcoded-multi-kilobase-fragments-probe-based-capture-technologies.pdf and https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/Barcoding-with-SMRT-Analysis-2.3). This was in addition to the random index (10 nucleotides in this case) that was again included in the RT primers in order to uniquely tag each individual genome.
RNA viral genomes were extracted from two Tripodi lab (SiR-CRE and SiR-FLPo) and one Wickersham lab (RVΔG-4Cre (18); see Addgene #98034 for reference sequence) virus samples using a Nucleospin RNA kit (Macherey-Nagel, Germany) and treated with DNase I (37°C for 1 hour, followed by 70°C for 5 minutes). Primers for RT and PCR are listed below. PCR cycling conditions were as follows: denaturation at 98°C for 30 seconds, followed by 20 cycles of amplification (denaturation at 98°C for 5 seconds and extension at 72°C for 75 seconds), with a final extension at 72°C for 5 minutes. This left each amplicon with a 16 bp barcode at each of its ends that indicated which virus sample it was derived from, in addition to a 10-nt index sequence that was unique to each genome molecule.
Primers for RT and PCR for SMRT sequencing were as follows:
RVΔG-4Cre:
RT:
Barcode1_cagc_N10_leader_fp:
TCAGACGATGCGTCATCAGCNNNNNNNNNNACGCTTAACAACCAGATC
PCR:
Barcode1_cagc_fp: TCAGACGATGCGTCAT-CAGC
Barcode2_P_rp (see above)
SiR-CRE:
RT:
Barcode5_cagc_N10_leader_fp:
ACACGCATGACACACTCAGCNNNNNNNNNNACGCTTAACAACCAGATC
PCR:
Barcode5_cagc_fp: ACACGCATGACACACT-CAGC
Barcode3_P_rp: GAGTGCTACTCTAGTACTTCTTGAGCTCTCGGCCAG
SiR-FLPo:
RT:
Barcode9_cagc_N10_leader_fp:
CTGCGTGCTCTACGACCAGCNNNNNNNNNNACGCTTAACAACCAGATC
PCR:
Barcode9_cagc_fp: CTGCGTGCTCTACGAC-CAGC
Barcode4_P_rp: CATGTACTGATACACACTTCTTGAGCTCTCGGCCAG
After the amplicons were extracted and purified from an agarose gel, the three were mixed together at a 1:1:1 molar ratio. The amplicons’ sizes were confirmed on the Fragment Analyzer (Agilent Technologies, USA), then hairpin loops were ligated to both ends of the mixed amplicons to make circular SMRTbell templates for PacBio Sequel sequencing. SMRTbell library preparation used the PacBio Template Preparation Kit v1.0 and Sequel Chemistry v3. Samples were then sequenced on a PacBio Sequel system running Sequel System v.6.0 (Pacific Biosciences, USA), with a 10-hour movie time.
Bioinformatics for PacBio sequence analysis
For the ∼2 kb template, the DNA polymerase with a strand displacement function can circle around the template and hairpins multiple times; the consensus sequence of multiple passes yields a CCS (circular consensus sequence) read for each molecule. Raw sequences were initially processed using SMRT Link v.6.0 (Pacific Biosciences, USA). Sequences were filtered for a minimum read length of 10 bp, a minimum of 3 passes, and a minimum read score of 65. 127,178 CCS reads passed the filter at 3 passes and Q10; 89,188 CCS reads at 5 passes and Q20; and 29,924 CCS reads at 8 passes and Q30. Downstream bioinformatics analysis was performed using BLASR v.5.3.2 for alignment and bcftools v.1.6 for variant calling. Mutations listed in Figure 2 and Supplementary File S2 were also manually inspected and confirmed using Integrative Genomics Viewer 2.3.32 (software.broadinstitute.org/software/igv/). Analysis steps included the following: 1. Exclude CCS reads under 1000 bases, which may have been derived from non-specific reverse transcription or PCR reactions. 2. Assign each CCS read to one of the three samples according to the PacBio barcode on its 5’ end. 3. For any CCS reads that contain the same 10-nucleotide random index, select only one of them, to avoid double-counting of clones derived from the same cDNA molecule. 4. Align the reads to the corresponding reference sequence (see Supplementary Files S3-S5). 5. Count the number of mutations at each nucleotide position of the reference sequences.
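Steps 1 to 3 of the analysis above can be sketched as follows. This is an illustrative Python outline rather than the pipeline actually used: it assumes that every read is already oriented with the sample barcode at its 5’ end and that barcodes and indices are free of sequencing errors, and the function names and the toy reads are hypothetical. The barcode sequences are taken from the primer list above.

```python
MIN_LENGTH = 1000   # step 1: discard CCS reads shorter than this
LINKER = "CAGC"     # bases between the sample barcode and the random index
INDEX_LENGTH = 10   # length of the random index in the RT primers

# 16-nt sample barcodes from the RT/PCR primers listed above.
SAMPLE_BARCODES = {
    "RVΔG-4Cre": "TCAGACGATGCGTCAT",
    "SiR-CRE": "ACACGCATGACACACT",
    "SiR-FLPo": "CTGCGTGCTCTACGAC",
}


def classify(read):
    """Step 2: return the sample whose barcode starts the read, else None."""
    for sample, barcode in SAMPLE_BARCODES.items():
        if read.startswith(barcode + LINKER):
            return sample
    return None


def preprocess(reads):
    """Apply steps 1-3 and return {sample: [one read per unique index]}."""
    kept = {sample: {} for sample in SAMPLE_BARCODES}
    for read in reads:
        if len(read) < MIN_LENGTH:
            continue
        sample = classify(read)
        if sample is None:
            continue
        start = len(SAMPLE_BARCODES[sample]) + len(LINKER)
        index = read[start:start + INDEX_LENGTH]
        # Step 3: keep only the first read seen for each random index.
        kept[sample].setdefault(index, read)
    return {sample: list(by_index.values()) for sample, by_index in kept.items()}


# Toy reads with made-up indices and filler sequence.
cre = SAMPLE_BARCODES["SiR-CRE"] + LINKER
flpo = SAMPLE_BARCODES["SiR-FLPo"] + LINKER
toy_reads = [
    cre + "AAAAAAAAAA" + "A" * 1000,   # kept
    cre + "AAAAAAAAAA" + "C" * 1000,   # same index: dropped as a duplicate
    flpo + "CCCCCCCCCC" + "A" * 1000,  # kept
    cre + "GGGGGGGGGG" + "A" * 100,    # too short: dropped
    "G" * 1100,                        # no known barcode: dropped
]
result = preprocess(toy_reads)
print({sample: len(reads) for sample, reads in result.items()})
```

The deduplicated reads for each sample would then be passed to alignment and variant calling (steps 4 and 5), which this sketch does not attempt to reproduce.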
Surgeries and virus injections for two-photon imaging
All experimental procedures using mice were conducted according to NIH guidelines and were approved by the MIT Committee for Animal Care (CAC). Mice were housed 1-4 per cage under a normal light/dark cycle for all experiments.
Adult (>9 weeks, male and female) Cre-dependent tdTomato reporter Ai14 (25) (Jackson Laboratory #007908) or Arch-EGFP reporter Ai35D (26) (Jackson Laboratory #012735) mice were anesthetized with isoflurane (4% in oxygen) and ketamine/xylazine (100 mg/kg and 10 mg/kg respectively, i.p.). Mice were given buprenorphine (0.1 mg/kg s.q.) and meloxicam (2 mg/kg s.q.) as preemptive analgesics, as well as eye ointment (Puralube); the scalp was then shaved, depilated with Nair, and thoroughly rinsed before the mice were mounted on a stereotaxic instrument (Stoelting Co.) with a hand warmer (Heat Factory) underneath the animal to maintain body temperature. The scalp was then disinfected with povidone-iodine, and an incision was made at the appropriate location (see below).
A 3 mm craniotomy was opened over primary visual cortex (V1). 300 nl of LV-U-TVA950(B19G) (see above) was injected into V1 (-2.70 mm AP, 2.50 mm LM, -0.26 mm DV; AP and LM stereotaxic coordinates are with respect to bregma; DV coordinate is with respect to brain surface) using a custom injection apparatus comprised of a hydraulic manipulator (MO-10, Narishige) with headstage coupled via custom adaptors to a wire plunger advanced through pulled glass capillaries (Wiretrol II, Drummond) back-filled with mineral oil and front-filled with virus solution. Glass windows composed of a 3mm-diameter glass coverslip (Warner Instruments CS-3R) glued (Optical Adhesive 61, Norland Products) to a 5mm-diameter glass coverslip (Warner Instruments CS-5R) were then affixed over the craniotomy with Metabond (Parkell). Seven days after injection of the lentiviral vector, the coverslips were removed and 300 nl of one of the three EnvA-enveloped rabies viral vectors (with equalized titers as described above) was injected at the same stereotaxic coordinates. Coverslips were reapplied and custom stainless steel headplates (eMachineShop) were affixed to the skulls around the windows.
In vivo two-photon imaging and image analysis
Beginning seven days after injection of each rabies virus and continuing every seven days up to a maximum of four weeks following rabies virus injection, the injection sites were imaged on a Prairie/Bruker Ultima IV In Vivo two-photon microscope driven by a Spectra-Physics Mai Tai DeepSee mode-locked Ti:sapphire laser emitting at a wavelength of 1020 nm for tdTomato and mCherry or 920 nm for EGFP. Mice were reanesthetized and mounted via their headplates to a custom frame, again with ointment applied to protect their eyes and with a hand warmer maintaining body temperature. One field of view was chosen in each mouse in the area of maximal fluorescent labeling. The imaging parameters were as follows: image size 512 x 512 pixels (282.6 μm x 282.6 μm), 0.782 Hz frame rate, dwell time 4.0 μs, 2x optical zoom, Z-stack step size 1 μm. Image acquisition was controlled with Prairie View 5.4 software. Laser power exiting the 20x water-immersion objective (Zeiss, W plan-apochromat, NA 1.0) varied between 20 and 65 mW depending on focal plane depth (the Pockels cell value was automatically increased from 450 at the top section of each stack to 750 at the bottom section). For the example images of labeled cells, maximum intensity projections (stacks of 150-400 μm) were made with Fiji software. Cell counting was performed with the ImageJ Cell Counter plugin. For cell counting, the tdTomato-labeled cells visible at week 1 were defined as the reference population; “remaining week 1 cells” at a later time point were those cells that aligned with week 1 reference cells, whereas reference cells that were no longer visible were counted as dead. Plots of cell counts were made with Origin 7.0 software (OriginLab, Northampton, MA).
For the thresholded version of this analysis (Supplementary Figure S4), in order to exclude cells that could possibly have been labeled only with mCherry in the SiR-CRE group, only cells with fluorescence intensity greater than the average of the mean red fluorescence intensities of cells imaged in Ai35 versus Ai14 mice at the same laser power at 1020 nm at 7 days postinjection (32.33 a.u.) were included in the population of cells tracked from 7 days onward.
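The cutoff used above is the midpoint of the two group means. A minimal Python sketch follows, assuming the group means of 48.97 a.u. (Ai14) and 15.69 a.u. (Ai35) given in the legend of Supplementary Figure S3; the function names and the per-cell intensities are hypothetical illustrations, not measured data.

```python
def midpoint_threshold(mean_a, mean_b):
    """Average of two group means, used as an intensity cutoff."""
    return (mean_a + mean_b) / 2.0


def suprathreshold(intensities, threshold):
    """Return ids of cells whose intensity is strictly above the cutoff."""
    return [cell for cell, value in intensities.items() if value > threshold]


def fraction_remaining(tracked, visible_later):
    """Fraction of tracked cells that are still visible at a later session."""
    if not tracked:
        raise ValueError("no cells to track")
    return sum(1 for cell in tracked if cell in visible_later) / len(tracked)


# Group means reported in the legend of Supplementary Figure S3 (a.u.).
threshold = midpoint_threshold(48.97, 15.69)

# Made-up day-7 intensities for four cells, for illustration only.
day7 = {"cell_1": 50.2, "cell_2": 20.0, "cell_3": 41.7, "cell_4": 32.0}
tracked = suprathreshold(day7, threshold)
remaining = fraction_remaining(tracked, visible_later={"cell_3"})
print(round(threshold, 2), tracked, remaining)
```

Only the cells that pass the cutoff at 7 days enter the tracked population, so dim cells that might be labeled with mCherry alone never contribute to the survival fractions.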
Monosynaptic tracing experiments: Sanger sequencing of viral genomic RNA
Sanger sequencing of RVΔG-NPEST-4Cre(EnvA) and RVΔG-N*PEST-4Cre(EnvA) was as described above but with the following modifications. The following barcoded primer was used for RT-PCR.
Adaptor-UMI-N_fp_57: ACACTCTTTCCCTACACGACGCTCTTCCGATCTNNNNNNNNNNNNNNNNNNNNAGAAGTCCGGAGGCTGTTTAT
cDNA sequences from the nucleoprotein gene through the first half of the rabies virus P gene were amplified using Platinum SuperFi Green Master Mix (Invitrogen (Thermo Fisher), USA) with cycling conditions as follows: denaturation at 98°C for 30 seconds, followed by 21 cycles of amplification (denaturation at 98°C for 5 seconds, annealing at 60°C for 10 seconds and extension at 72°C for 21 seconds), with a final extension at 72°C for 5 minutes, using the following primers:
i5-anchor_CTAGCGCT_fp_56:
AATGATACGGCGACCACCGAGATCTACACCTAGCGCTACACTCTTTCCCTACACGAC
P_Sanger_rp_56:
CAAGCAGAAGACGGCACGATTTTCCATCATCCAGGTG
The 700 bp PCR amplicons for RVΔG-NPEST-4Cre(EnvA) or RVΔG-N*PEST-4Cre(EnvA) virus were cloned into pEX-A (Eurofins Genomics, USA) using an In-Fusion HD Cloning Kit (Takara Bio, Japan) as above. The plasmids were transformed into Stellar competent cells (Takara Bio, Japan), and 32 clones from each rabies virus sample were isolated and purified for sequencing.
Monosynaptic tracing experiments: surgeries and virus injections
Following preparation of mice as described above (“Surgeries and virus injections for two-photon imaging”), the three helper AAV1s were combined at final titers of 3.6E10 gc/ml for AAV1-synP-F14F15S-splitTVA-EGFP-tTA and 6.60E11 gc/ml for AAV1-TREtight-mTagBFP2-B19G (when included) and AAV1-TREtight-H2b-emiRFP670-P2A-TEVP (when included) in DPBS (Fisher, 14-190-250). 250 nl of helper virus mixture was injected into layer 5 of barrel cortex (AP -1.55 mm w.r.t. bregma, LM 3.00 mm w.r.t. bregma, DV -0.75 mm w.r.t. brain surface) of Ai14 mice; in the same surgery, 300 nl of AAV2-retro-synP-FLPo (1.16E13 gc/ml) was injected into dorsolateral striatum (AP 0.74 mm w.r.t. bregma, LM 2.25 mm w.r.t. bregma, DV -2.30 mm w.r.t. brain surface). Seven days after AAV injection, 300 nl of RVΔG-NPEST-4Cre(EnvA) (1.86E9 iu/ml) or RVΔG-N*PEST-4Cre(EnvA) (diluted 12.96-fold in DPBS from 2.41E10 iu/ml to 1.86E9 iu/ml) was injected in barrel cortex at the same site as the helper AAV mixtures.
Monosynaptic tracing experiments: perfusions and histology
12 days or 3 weeks (depending on experiment; see main text) after injection of rabies virus, mice were transcardially perfused with 4% paraformaldehyde in phosphate-buffered saline. Brains were postfixed overnight in 4% paraformaldehyde in PBS on a shaker at 4°C and cut into 50 µm coronal sections on a vibrating microtome (Leica, VT-1000S). Sections were collected from anterior to posterior into 6 tubes containing cryoprotectant. Collection continued for 15 rounds, so that each tube contained a sixth of the collected tissue (15 sections in each tube). Sections were immunostained as described (21) with a chicken anti-GFP primary antibody (Aves Labs GFP-1020) 1:500 and donkey anti-chicken Alexa Fluor 488 secondary antibody (Jackson Immuno 703-545-155) 1:200. Sections were mounted with Prolong Diamond Antifade mounting medium (Thermo Fisher P36970).
Monosynaptic tracing experiments: cell counts and microscopy
Coronal sections between 1.2 mm and -3.3 mm relative to bregma were examined under an epifluorescence microscope (Zeiss, Imager.Z2). When necessary due to high density of labeled cells, images were taken with the same microscope for cell counting. tdTomato-labeled neurons in contralateral cortex and thalamus were counted manually with the Cell Counter plugin in ImageJ. For cells at the injection site, when tdTomato-expressing cells were few and sparse (usually fewer than 100 per section), cells coexpressing mTagBFP2 or H2b-emiRFP670 were counted manually by adding separate labels for each fluorophore and then looking for overlapping cells. When tdTomato-expressing cells were dense, tdTomato-labeled cells were first counted using the Analyze Particles function in ImageJ (size in µm²: 20-400; circularity: 0.20-1.00). The outlines of these cells were then overlaid on images of mTagBFP2- and H2b-emiRFP670-labeled cells to count the overlapping cells. Only one of the six series of sections (i.e., every sixth section: see above) was counted for each mouse.
Images for figures were taken on a confocal microscope (Zeiss, LSM 900). So that the confocal images of brain tissue included in the figures in this paper are representative of each group, the images were taken after the counts were conducted (see above), in each case using the mouse with the middle number of labeled neurons in that group (i.e., neither the highest nor the lowest in the group of 3 mice used for each condition).
Supplementary File S1: Sanger sequencing data of all clones shown in Figure 1.
51 clones derived from SiR-CRE, 50 from SiR-FLPo, and 51 from RVΔG-4mCherry are identified by their unique indices. All of the indices as well as the sequences corresponding to the 3’ end of the nucleoprotein gene are shown.
Supplementary File S2: Summary tables of SMRT sequencing data. These tables show all mutations occurring at positions mutated at greater than 2% frequency in the three virus samples analyzed. Position numbers in these tables refer to the sequences in the three Genbank files below (Supplementary Files S3-S5).
SINGLE-MOLECULE, REAL-TIME SEQUENCING RESULTS
Frameshifts and insertions
Position numbers in this file refer to the reference sequences included as Supplementary Files S3-S5. A “frameshift” is included in Tables 1a-2b if the number of deleted bases in positions 1439-1492 (the vicinity of the junction of the end of the N gene and the intended 3’ addition) is not an integer multiple of 3, with insertions ignored. “Any error” includes either the apparent frameshifts, or the new TAA/TAG/TGA stop codons, or both, with insertions ignored. The number of “frameshifts” increases considerably if insertion mutations are included in the calculation, indicating that insertions are called much more frequently than deletions; however, previous studies have found that spurious insertions are common in SMRT sequencing (see main text), so we ignore insertions in this paper apart from summarizing the data below.
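The frameshift criterion defined above can be written compactly. The Python sketch below is an illustration under simplifying assumptions: each read is represented only by the list of reference positions it deletes, stop codons are detected by a codon-by-codon comparison of equal-length in-frame sequences (substitutions only), and all names and example inputs are hypothetical.

```python
WINDOW_START, WINDOW_END = 1439, 1492  # positions given in the text (inclusive)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_frameshift(deleted_positions):
    """True if the number of deleted bases inside the window is not a multiple of 3.

    Insertions are ignored, as in the text, so they are simply not passed in.
    """
    count = sum(1 for pos in deleted_positions if WINDOW_START <= pos <= WINDOW_END)
    return count % 3 != 0


def new_stop_codons(reference, read):
    """Indices of codons that are stops in the read but not in the reference.

    Both sequences must be in frame and of equal length (substitutions only).
    """
    if len(reference) != len(read):
        raise ValueError("sequences must have equal length")
    hits = []
    for i in range(0, len(reference) - 2, 3):
        ref_codon, read_codon = reference[i:i + 3], read[i:i + 3]
        if read_codon in STOP_CODONS and ref_codon not in STOP_CODONS:
            hits.append(i // 3)
    return hits


def any_error(deleted_positions, reference, read):
    """Frameshift, new stop codon, or both."""
    return is_frameshift(deleted_positions) or bool(new_stop_codons(reference, read))


# Made-up examples.
print(is_frameshift([1450]))                   # one deleted base in the window
print(is_frameshift([1450, 1451, 1452]))       # in-frame deletion of one codon
print(new_stop_codons("AAATCAGGG", "AAATGAGGG"))
```

A read with no deletions in the window is never scored as a frameshift, because zero is a multiple of 3.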
All mutations in the SiR-CRE sample at positions mutated at >2% frequency at all stringencies (CCS3, CCS5, CCS8), as well as frameshift mutations found in the C-terminal region of N.
Frameshift mutations in the C-terminal region of N in the SiR-CRE sample at positions mutated at >2% frequency at all stringencies (CCS3, CCS5, CCS8).
All mutations in the SiR-FLPo sample at positions mutated at >2% frequency at all stringencies (CCS3, CCS5, CCS8), as well as frameshift mutations found in the C-terminal region of N.
Frameshift mutations in the C-terminal region of N in the SiR-FLPo sample at positions mutated at >2% frequency at all stringencies (CCS3, CCS5, CCS8).
All mutations in the RVΔG-4Cre sample at positions mutated at >2% frequency at all stringencies (CCS3, CCS5, CCS8).
Tables of all mutations above 2% frequency threshold
Tables 4a to 4c list all single-nucleotide substitutions and deletions at positions mutated at >2% threshold frequency. The percentage of mutations at each position is calculated as the total number of single-nucleotide substitutions and deletions divided by the total number of reads aligned, with insertion mutations ignored. Deletion mutations dominate in the medium-frequency range between 2% and 5%.
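The per-position frequency calculation described above can be sketched as follows. This Python outline is hypothetical: it assumes that variant calls have already been reduced to (position, kind) pairs with at most one event per read at each position, and the example events and read count are invented for illustration.

```python
from collections import Counter

THRESHOLD = 0.02  # the 2% frequency cutoff used for the tables


def mutation_frequencies(events, total_reads):
    """Per-position frequency of substitutions and deletions.

    events: iterable of (position, kind), kind in {"sub", "del", "ins"};
    insertions are ignored, as in the text.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    counts = Counter(pos for pos, kind in events if kind != "ins")
    return {pos: n / total_reads for pos, n in counts.items()}


def positions_above(frequencies, threshold=THRESHOLD):
    """Sorted positions whose mutation frequency exceeds the threshold."""
    return sorted(pos for pos, freq in frequencies.items() if freq > threshold)


# Made-up events across 100 aligned reads.
events = [
    (10, "sub"), (10, "sub"), (10, "del"),  # 3% at position 10
    (20, "del"),                            # 1% at position 20
    (30, "ins"), (30, "ins"), (30, "ins"),  # insertions: not counted
]
freqs = mutation_frequencies(events, total_reads=100)
print(positions_above(freqs))
```

Because insertions are dropped before counting, a position that carries only insertion calls never appears in the output, whatever its apparent frequency.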
SiR-CRE: substitutions and deletions at positions mutated at >2% frequency at all stringencies (CCS3, CCS5, CCS8).
SiR-FLPo: substitutions and deletions at positions mutated at >2% frequency at all stringencies (CCS3, CCS5, CCS8).
RVΔG-4Cre: substitutions and deletions at positions mutated at >2% frequency at all stringencies (CCS3, CCS5, CCS8).
Supplementary File S3: SiR-CRE amplicon reference sequence. This Genbank-format file contains the expected (i.e., based on the published sequence: Addgene #99608) sequence of amplicons obtained from SiR-CRE for SMRT sequencing.
Supplementary File S4: SiR-FLPo amplicon reference sequence. This Genbank-format file contains the expected (i.e., based on the published sequence: Addgene #99609) sequence of amplicons obtained from SiR-FLPo for SMRT sequencing.
Supplementary File S5: RVΔG-4Cre amplicon reference sequence. This Genbank-format file contains the expected (i.e., based on the published sequence: see Addgene #98034) sequence of amplicons obtained from RVΔG-4Cre for SMRT sequencing.
(A-D) Reporter cells infected with first-generation, ΔG viruses show characteristic bright, clumpy anti-nucleoprotein staining (green), indicating high nucleoprotein expression and active viral replication. Red is mCherry expression, reporting expression of Cre or FLPo; blue is mTagBFP2, constitutively expressed by these reporter cell lines.
(E-H) Reporter cells infected with second-generation, ΔGL viruses show only punctate staining for nucleoprotein, indicating isolated individual viral particles or ribonucleoprotein complexes; these viruses do not replicate intracellularly (21). Reporter cassette activation takes longer because of the lower recombinase expression levels of these viruses, so mCherry expression is dimmer than in cells infected with ΔG viruses at the same time point.
(I-J) Reporter cells infected with SiR viruses show clumps of nucleoprotein and rapid reporter expression indicating high expression of recombinases, similarly to cells infected with ΔG viruses. Scale bar: 100 µm, applies to all panels.
Supplementary File S6: >90% of SiR-CRE viral particles with the mCherry gene intact also have the Cre gene intact, suggesting that most of the SiR-CRE-infected cells that disappear over time in tdTomato reporter mice are dying rather than simply stopping expression of mCherry.
We sequenced the transgene inserts for 21 individual SiR-CRE clones (see Methods). 19 out of 21 had no mutations in the Cre gene, and two had one point mutation each (Ala88Val and Arg189Ile). All 21 had an intact mCherry gene. The lack of a large proportion of Cre-knockout mutants is one indication that the majority of red fluorescent neurons in SiR-CRE-injected Ai14 (tdTomato reporter) mice are not labeled only with mCherry, providing evidence that their disappearance is equivalent to their death.
A) Maximum intensity projections of the two-photon FOV shown in Supplementary Video S1 of visual cortical neurons labeled with SiR-CRE in an Ai35 mouse, 11-28 days postinjection. Images are from the same FOV at four different time points. All cells clearly visible on day 11 are circled. In this example, 18 out of 19 cells (red circles) disappeared by a subsequent imaging session. Only one cell (white circle) is still visible on day 28. Numbers below four of the circles mark the cells for which intensity profiles are shown in panel B. Scale bar: 50 µm, applies to all images.
B) Green fluorescence intensity versus depth for the four representative neurons numbered in panel A at the four different time points, showing disappearance of three of them over time.
C) Fraction of cells visibly EGFP-labeled at day 11 still visible at later time points, from four different FOVs in two Ai35 mice. Connected sets of markers indicate cells from the same FOV. 86% of SiR-CRE-labeled neurons had disappeared by 4 weeks postinjection.
Supplementary Video S1 (separate file): Video of 95% of SiR-CRE-labeled neurons in an Arch-EGFP-ER2 reporter mouse disappearing between 11 days and 28 days postinjection. Two-photon image stacks of a single FOV of visual cortical neurons in an Ai35 mouse imaged at four different time points; time in the video represents depth of focus. Large blobs are glia. 18 out of the 19 neurons visibly labeled with Arch-EGFP-ER2 at 11 days following injection of SiR-CRE are no longer visible 17 days later. White circles indicate cells present at both 11 days and all subsequent imaging sessions; red circles indicate cells present at 11 days but gone by 28 days.
A) Representative images of red fluorescence in SiR-CRE-labeled cells in Ai14 (Cre-dependent expression of tdTomato, top row) and Ai35 (Cre-dependent expression of Arch-EGFP-ER2, bottom row). The three images for each mouse line are from 3 different mice of each line, imaged 7 days following SiR-CRE injection (see Methods), all with the same laser intensity and wavelength (1020 nm). Red fluorescence due only to mCherry (i.e., in Ai35 mice) is obviously much dimmer than that due to tdTomato (i.e., in Ai14 mice). Scale bar: 50 µm, applies to all images.
B) Intensity of red fluorescence of SiR-CRE-labeled cells in Ai14 (left) and Ai35 (right) mice. Data points indicate intensity of individual cells in arbitrary units at the same laser and microscope settings (see Methods). Box plots indicate median, 25th–75th percentiles (boxes), and full range (whiskers) of intensities for each mouse. The average of the mean red fluorescence intensity in each mouse was 48.97 in Ai14 and 15.69 in Ai35 (p=0.00283 < 0.01, one-way ANOVA); the midpoint of these means, 32.33, was used as the cutoff for the reanalysis of the data in Ai14 mice to exclude neurons that could have been labeled with mCherry alone.
A) Same representative fields of view as in Figure 3 but with circles now marking only cells with intensity at 7 days of greater than 32.33 a.u. (see text and Supplementary Figure S3) that are no longer visible at a subsequent time point. Scale bar: 50 µm, applies to all images.
B-D) Numbers of cells above threshold fluorescence intensity at week 1 that were still present in subsequent weeks. The conclusions from Figure 3 are unchanged: few cells labeled with RVΔGL-4Cre were lost, RVΔG-4Cre killed a significant minority of cells, and SiR-CRE killed the majority of labeled neurons within two weeks following injection.
E) Percentages of cells above threshold at week 1 that were still present in subsequent imaging sessions. By 28 days postinjection, an average of only 19.2% of suprathreshold SiR-CRE-labeled cells remained.
A-B) Closeup images of virus injection sites in barrel cortex for NPEST (A) and N*PEST (B) viruses, without TEVP. Images are from the same sections as Figure 4C. The circled cells are “starter cells”, defined as expressing both tdTomato (reporting Cre expressed by the rabies virus) and mTagBFP2 (coexpressed with G). Scale bar: 50 μm, applies to all images in each panel.
C-D) Images of regions presynaptic to barrel cortex for NPEST (C) and N*PEST (D) viruses, without TEVP. Red indicates tdTomato fluorescence reporting activity of the Cre-expressing rabies viruses. Scale bar: 200 µm, applies to all images in each panel.
Supplementary File S7 (separate file): Sequencing results for RVΔG-NPEST-4Cre(EnvA), showing that the intended 3’ addition to the nucleoprotein was intact in the final stocks used for monosynaptic tracing experiments.
We sequenced the complete region of the 3’ addition to the nucleoprotein region from 32 viral particles for each of the two high-titer, EnvA-enveloped preparations of RVΔG-NPEST-4Cre(EnvA) and RVΔG-N*PEST-4Cre(EnvA) that we made for our in vivo experiments. For RVΔG-NPEST-4Cre(EnvA), none of the 32 clones had any mutations in the 3’ addition; one clone had a point mutation in the intergenic region between the N and P genes (from TGTATA to TTTATA), and one other clone had a synonymous mutation in the N gene (Ser437, from TCA to TCG). For RVΔG-N*PEST-4Cre(EnvA), 30 of the 32 clones had no mutations in the sequenced region; one clone had a synonymous mutation in the N gene (Asn436, from AAC to AAT), and one other clone had a synonymous mutation in the PEST region (Pro5, from CCG to CCA) after the stop codon.
Supplementary File S8 (separate file): Counts and statistical analyses of labeled neurons in transsynaptic tracing experiments.
Sheet 1: Detailed cell counts of every (6th) brain section for all mice, including tdTomato+ cell numbers from ipsilateral cortex, contralateral cortex, and thalamus, as well as counts of starter cells for each animal;
Sheet 2: Total counts of labeled cells for each mouse in each condition, as well as the ratios of tdTomato+ cells to starter cells and the average numbers for each condition;
Sheet 3: P-values of comparisons using single factor ANOVAs.
A-B) Closeup images of virus injection sites in barrel cortex using NPEST (A) and N*PEST (B) viruses, with TEVP supplied in starter cells. Images are from the same sections as Figure 5C. The circled cells are “starter cells”, defined as expressing both tdTomato (reporting Cre expressed by the rabies virus) and mTagBFP2 (coexpressed with G); many of these cells also express emiRFP670, coexpressed with TEVP from the third helper virus included in these cases. Scale bar: 50 μm, applies to all images in each panel.
C-D) Images of regions presynaptic to barrel cortex for NPEST (C) and N*PEST (D) viruses, with TEVP supplied in starter cells. The NPEST virus has spread to the same input regions as has N*PEST, although to a more limited degree. Scale bar: 200 µm, applies to all images in each panel.
A-B) Images of injection sites for control experiments in which the helper virus expressing G was omitted (but in which the helper virus expressing TEVP was included), using NPEST (A) and N*PEST (B) viruses. Neither virus spreads to any detectable degree without G. Scale bar: 200 µm, applies to all images in each panel.
C) Counts of tdTomato-labeled neurons in ipsilateral cortex for the controls without G.
D) Counts of “starter cells” (defined as cells coexpressing tdTomato and mTagBFP2 when TEVP was omitted, and cells coexpressing tdTomato, mTagBFP2, and emiRFP670 when TEVP was supplied) for all conditions.
Although we did not rigorously quantify the effect, our FLPo-encoding RVΔG-4Flpo appears to kill neurons more quickly than does RVΔG-4Cre (cf. Figure 3 and Chatterjee et al. (21)). In this example field of view, most neurons clearly visible at earlier time points have disappeared by 28 days postinjection, leaving degenerating cellular debris. See Discussion for possible reasons why a preparation of a first-generation vector encoding a recombinase may or may not preserve a large percentage of infected neurons. Scale bar: 50 µm, applies to all panels.
ACKNOWLEDGEMENTS
We thank Ernesto Ciabatti and Marco Tripodi for sharing samples of EnvA/SiR-CRE and EnvA/SiR-FLPo and for comments on the manuscript. We thank Ed Callaway, Sean Whelan, Ayano Matsushima, and Kim Ritola for helpful discussion and Jun Zhuang, Soumya Chatterjee, and Ali Cetin for helpful discussion and sharing their own results with SiR viruses. We thank Stuart Levine, Noelani Kamelamela, and Huiming Ding of the MIT BioMicro Center for assistance with SMRT sequencing and bioinformatic data analysis and Sara Beach for helpful feedback on the manuscript. Research reported in this publication was supported by the following BRAIN Initiative awards from the National Institute of Mental Health: U01MH106018 (Wickersham), RF1MH120017 (Wickersham), U01MH114829 (Dong), and U19MH114830 (Zeng).
Footnotes
This version includes the results of a new set of experiments, in which we made a rabies virus with an intact PEST domain fused to its nucleoprotein and tested the ability of the new virus, along with a "revertant" control virus, to spread transsynaptically with and without TEVP.