Generation of viral vectors specific to neuronal subtypes of targeted brain regions by Enhancer-Driven Gene Expression (EDGE)

Understanding brain function requires understanding neural circuits at the level of specificity at which they operate. While recent years have seen the development of a variety of remarkable molecular tools for the study of neural circuits, their utility is currently limited by the inability to deploy them in specific elements of native neural circuits, i.e. particular neuronal subtypes. One can obtain a degree of specificity with neuron-specific promoters, but native promoters are almost never sufficiently specific, restricting this approach to transgenic animals. We recently showed that one can obtain transgenic mice with augmented anatomical specificity in targeted brain regions by identifying cis-regulatory elements (i.e. enhancers) uniquely active in those brain regions and combining them with a heterologous promoter, an approach we call EDGE (Enhancer-Driven Gene Expression). Here we extend this strategy to the generation of viral (rAAV) vectors, showing that when combined with the right minimal promoter they largely recapitulate the specificity seen in the corresponding transgenic lines in wildtype animals, even of another species. Because active enhancers can be identified in any tissue sample, this approach promises to enable this kind of circuit-specific manipulation in any species. This should not only greatly enhance our understanding of brain function, but may one day even provide novel therapeutic avenues to correct the imbalances in neural circuits underlying many disorders of the brain.


Introduction
The mammalian brain is the most complex biological structure known, with innumerable distinct cell types differing in cytoarchitecture, electrophysiological properties, gene expression and connectivity 1,2 . Recent years have seen the development of truly revolutionary molecular tools that allow neuroscientists to elucidate precise neural connectivity 3 and monitor 4 and manipulate [5][6][7] neural activity. However, optimal use of these tools to examine the functional circuitry of the brain requires the ability to deliver them specifically to particular elements of neural circuits (i.e. neuronal cell types), rather than as a nonspecific bolus affecting all of the neurons in a brain area. The use of molecular genetics is the only method by which one can perform truly cell-type specific manipulations, as evidenced by a variety of studies using transgenic animals expressing transgenes from neuronal promoters (genomic regions just upstream of the transcriptional start site) 8, 9 . However, such approaches are limited by the fact that because genes are expressed in a variety of cell types in the brain, promoters are not specific to a single cell type. While estimates vary 10 , there are many more cis-regulatory elements (i.e. enhancers and repressors, distal genomic regions which help regulate where and when promoters transcribe RNA) than promoters, suggesting that enhancers may be more specific. This led us to take an approach to the generation of molecular genetic tools with augmented specificity that we call Enhancer-Driven Gene Expression (EDGE). It is based upon identifying the cis-regulatory elements uniquely active in particular brain regions and combining them with a heterologous minimal promoter. When we used this strategy to make transgenic mice, they were indeed significantly more specific than the presumed parent gene, often driving expression in particular sets of neurons in the brain region they were derived from 11 . 
However, while transgenic animals are powerful tools for the analysis of neural circuits, they do have some serious drawbacks. They are costly in both time and resources, can be subject to insertional effects 12,13 , and are most practical in a limited number of species. Moreover, while they are often excellent models of disease, transgenic technologies are far from therapeutic applications. Recombinant adeno-associated viral vectors (rAAVs) can overcome many of the above issues. They can be made relatively quickly, generally do not insert into the genome, and can be used in a variety of species 14 including humans and therefore have clinical potential as well [15][16][17][18] . However, efforts to generate cell-specific viral vectors have been largely unsuccessful to date [19][20][21] , with a few notable exceptions 22,23 . This is due in large part to the fact that one can add relatively little genetic material to the relatively small genomes of rAAVs, putting most native promoters out of reach. However, most enhancers are much smaller than promoters, raising the intriguing possibility of targeting specific neuronal cell types in any species by adapting the EDGE strategy to viral vectors. Towards this end, we present results showing the generation of EDGE viral vectors specifically expressing in particular neurons of the entorhinal cortex (EC) in two different species of wildtype animals.

Optimization of rAAV design for Enhancer-Driven Gene Expression
Because one can obtain some degree of apparent specificity with rAAVs by means other than transcriptional regulation (e.g. serotype, precise injections of minute quantities), we took steps to ensure that any observed specificity comes from the enhancer element used. First and foremost, we used an enhancer specific to a known subset of neurons in the entorhinal cortex 11 in transgenic animals and analyzed whether the rAAV construct could match this specificity. Figure 1A shows the expression pattern obtained from crossing one of the MEC13-53 tTA driver lines to a payload line expressing the helper transgenes for the ΔG-rabies monosynaptic tracing system 11 . Expression in this cross was limited to reelin-positive (RE+), calbindin-negative (CB-) excitatory projection neurons in LII of the medial and lateral EC (i.e. stellate and fan cells, respectively) [24][25][26]. Second, because the distinct tropisms of the various serotypes of AAV 14,27 are a potential confound, we used a single serotype with a wide tropism for neurons (AAV2/1, which has a mosaic capsid of serotypes 1 and 2) 28 for all viral constructs.
To ensure that the expression pattern was not due to the specifics of the injection, we injected multiple animals with the same large volume for each virus, and always used GFP-expressing rAAVs of similar titer (see Table S1) and the same coordinates in every case (see Methods for details). Figure 1B shows the result of injecting 400 nl of a virus expressing GFP from a ubiquitous CMV promoter (CMV-rAAV) into the MEC; note the widespread strong expression throughout the various layers of the entorhinal cortex, as well as subiculum and parasubiculum.
In order to obtain viruses capable of driving expression as specific as the EDGE transgenic animals in wildtype brains, one must first find a minimal viral promoter which is capable of robust expression only when paired with a heterologous enhancer. This is complicated by the fact that the viral ITRs themselves have transcriptional activity [29][30][31] , as can be seen by the weak nonspecific expression obtained from a GFP construct with neither a promoter nor an enhancer (Figure 1C). Note that the expression levels in 1C are far below those seen with other viruses: each panel in Figure 1 has been differentially processed to aid visualization; see Figure S1 for details. To minimize this issue, we reversed the orientation of the expression cassette relative to the ITRs such that the sense strand was under the influence of the 3' ITR, which we attenuated by putting WPRE 32 between the 3' ITR and the enhancer (see schematics in 1C, D). This revised design substantially reduced the background expression in other layers, enabling us to recapitulate MEC LII-specific expression in a wildtype mouse (Fig 1D) with a mutated minimal CMV promoter (CMV*) 33 . Roughly similar results were obtained with three different minimal promoters (Figure S2), but we selected CMV* for all subsequent experiments (and hereafter refer to each construct simply by its enhancer) as it was the smallest minimal promoter that yielded layer-specific EDGE with low background expression. The specificity of the expression of this virus as compared to a nonspecific CMV-rAAV virus is quantified in Figure 1E. While still clearly much more specific than the CMV-rAAV, the quantification of MEC13-53 rAAV does not look as specific as the figure panels because our conservative manual quantification for this initial characterization (see Methods) did not distinguish between weak "background" label (such as that seen without a promoter) and strong specific labelling.

MEC13-53 EDGE rAAVs express specifically in layer II stellate cells in wildtype mice
The precise anatomical boundaries of the various layers of EC can be easily visualized by using the neuron-specific stain NeuN 34 , confirming the robust layer-specific expression of the MEC13-53 rAAV (Fig 2A). All the GFP+ cells were also NeuN positive, confirming the specificity of the virus to neurons (data not shown). Much less intense background GFP expression was observed in other layers as well in both the MEC13-53 rAAV (Figure 2A, inset) and in the rAAV backbone with the same design and minimal promoter but lacking the enhancer (Fig 2B, inset), which did not strongly label any cells. As for cell specificity, within LII of MEC there are two major classes of excitatory principal neurons, RE+ stellate cells and CB+ pyramidal cells 26,35 , with RE label providing a sharp boundary between MEC and parasubiculum 25,26 (see arrows in Figure 2C, inset). We therefore performed immunohistochemical analysis comparing these markers to viral GFP and found that for the MEC13-53 rAAV, 100% (594/594) of GFP+ neurons in layer II were RE+ (Figure 2C, E), while less than 1% (5/655) of them were CB+ (Figure 2D, E).
In contrast, with the ubiquitous CMV-rAAV, only 34% (319/929) of GFP+ LII neurons were RE+ while 10.5% (142/1353) were CB+. Thus, the MEC13-53 rAAV drives transgene expression specifically in a particular subset of excitatory neurons in EC of wildtype mice, i.e. RE+ EC LII neurons (stellate cells in MEC), avoiding the adjacent CB+ pyramidal cells, just as in the transgenic lines based upon the same enhancer.
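The co-labelling fractions quoted above follow directly from the reported cell counts. As a minimal sketch, the Python snippet below recomputes each percentage from the counts given in the text. The Wilson score interval is an addition for illustration only: it is not part of the analysis reported here, and the function names are hypothetical.

```python
import math

# Counts of GFP+ layer II neurons co-labelled with each marker in mouse,
# copied from the text as (co-labelled cells, GFP+ cells counted).
MOUSE_COUNTS = {
    ("MEC13-53 rAAV", "RE+"): (594, 594),
    ("MEC13-53 rAAV", "CB+"): (5, 655),
    ("CMV-rAAV", "RE+"): (319, 929),
    ("CMV-rAAV", "CB+"): (142, 1353),
}


def percent(positive, total):
    """Return the percentage of counted cells that are marker-positive."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * positive / total


def wilson_interval(positive, total, z=1.96):
    """Wilson score interval for a proportion (assumption: not used in the paper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = positive / total
    denom = 1.0 + z * z / total
    centre = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    # Clamp to [0, 1] to absorb floating-point overshoot at p = 0 or p = 1.
    return max(0.0, centre - half), min(1.0, centre + half)


for (virus, marker), (positive, total) in MOUSE_COUNTS.items():
    low, high = wilson_interval(positive, total)
    print(
        f"{virus:14s} {marker}: {percent(positive, total):5.1f}% "
        f"({positive}/{total}), approx. 95% interval "
        f"{100 * low:.1f}-{100 * high:.1f}%"
    )
```

An interval of this kind is one way to make clear that a proportion of 100% from a finite sample still leaves room for a small fraction of non-RE+ cells.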

EDGE rAAVs drive neuronal subtype-specific expression across species
While recapitulation of the expression pattern of the corresponding transgenic mouse line nicely illustrates the specificity of EDGE rAAVs, perhaps the greatest utility of viral vectors is that they can conceivably be used in wildtype animals of any species. Because evolutionary conservation is one of the hallmarks of enhancers 36 , we posited that the murine MEC13-53 enhancer might be similarly cell-specific in the rat MEC. As seen in Figure 3A, stereotactic injections of 1000 nl of the MEC13-53 rAAV into rat MEC lead to expression at least as cell type-specific as that seen in the mouse (if anything, it is more specific). Figure 3A shows the MEC13-53 rAAV counterstained with NeuN, demonstrating that GFP expression is almost exclusive to MEC LII (as quantified in Figure 3E), while the few labelled neurons seen in the virus without the enhancer have no layer II specificity (Figure 3B), much as was the case in mouse (Figure 2B). Similarly, 100% (1803/1803) of GFP+ neurons in rats injected with MEC13-53 rAAVs were RE+ (Figure 3C, F), while only 1.8% (29/1589) were CB+ (Figure 3D, F), demonstrating that in both species this virus drives expression exclusively in one (RE+) subtype of EC LII excitatory neurons but not the other (CB+), even though the two subtypes are intermingled 26 . This further demonstrates that the observed specificity results from the enhancer rather than the injection site, as does the fact that this specific expression pattern is seen throughout the dorso-ventral and medio-lateral axes of the MEC (Figure S3 C). It is interesting to note that while these two markers are largely mutually exclusive, there are reports of a very small subpopulation of RE+ neurons that are also CB+ 25,37 .

EDGE rAAVs recapitulate the expression pattern of their respective transgenic lines
To examine whether all EDGE rAAVs can specify gene expression in particular subsets of cells, we created EDGE rAAVs with several other enhancers. While not all enhancers that worked as transgenic lines worked in rAAVs, roughly half (Figure 4, left column) did indeed appear to recapitulate the specificity of the corresponding EDGE lines (Figure 4, right column). Remarkably, the MEC13-104 rAAV (Fig 4A) recapitulates even the sparse labeling of a small subset of LIII neurons (arrows) seen in MEC13-104, a mainly LII-specific enhancer line (Fig 4B), and the converse is true for LEC13-8, a mainly EC LIII-specific line (compare 4C to 4D). This suggests that the sparse label in the minor layer seen in the EDGE transgenic lines is specifically driven by the enhancer, rather than by random integration site-dependent mosaicism. Thus, the relative densities of the layer-specific label appear to be enhancer-specific, suggesting that the minority of cells which express outside of their primary layer may not be "noise". Ongoing experiments seek to verify whether there are any functional distinctions between the cells labeled by the various enhancers, which often express in different subsets of what has been thought of as a single neuronal cell type, e.g. stellate cells.

Discussion
Our prior work showed that identification of cis-regulatory elements uniquely active in finely dissected cortical subregions allows one to generate genetic tools specific to cells in that subregion, an approach we call EDGE (Enhancer-Driven Gene Expression) 11 . Here we show that one can use the same approach to make rAAVs with similar specificity in both mouse and rat, provided the vector and minimal promoter's innate transcriptional activity is minimized.
This clearly cross-validates the initial identification of enhancers in our prior work 11 : while transgenic lines might show highly specific expression patterns purely due to insertional effects (though not the same patterns multiple times, as we saw), rAAVs typically do not insert into the genome 38 , so they cannot show such effects. In other words, while the precise functional significance of the enhancers presented here remains unknown, they clearly are "true" enhancers, reflecting some genetic subgroup of excitatory neurons in the entorhinal cortex of wildtype mice and rats. Taken together, these data lead to two very interesting conclusions: 1) the number of genetically-defined subgroups of neurons may be far greater than generally assumed (though this assumption has recently been challenged 39,40 ), and 2) this approach conceivably provides a path towards neuronal subtype-specific transgene expression in any species.
That enhancers drive expression similarly in both transgenic lines and viruses is not a particularly surprising result. It has been known for decades that enhancers drive cell-specific expression [41][42][43] . For instance, enhancers related to the 6 homeobox genes related to the fly distal-less gene 44 (Dll in fly, Dlx in vertebrates) have been shown to play a crucial role in morphogenesis in a variety of species 45 . Due to their central role in development, such homeobox genes have been under intense scrutiny by geneticists working in a variety of systems [46][47][48][49] for decades, leading to a highly detailed understanding of these loci and the cis-regulatory elements controlling their transcription. One such enhancer in the Dlx5/6 gene cluster has been shown to be critical to the development of interneurons in particular 50 . A recent paper 22 used this enhancer element in a viral vector to obtain interneuron-specific expression in a variety of species, nicely showing that enhancers can be used to drive expression in viral vectors. However, as is true for most genetically-defined enhancers active early in development, Dlx5/6 drives expression across broad classes of neurons (e.g. interneurons in general) throughout the brain, rather than in particular interneuronal subclasses and/or subregions. In addition, there are several (mostly unpublished) reports of using cis-regulatory elements identified by various means to drive expression in neuronal subtypes 23,43,51,52 .
Thus, the most important aspect of these data is not that enhancers can work in viral vectors; it is the illustration of the promise of applying modern genomic techniques to the study of the precise neural circuitry of the vertebrate brain. The striking diversity of enhancers found in these tiny subregions of cortex (the numbers of unique enhancers in each was comparable to those found for entire organs) may indicate a similar diversity of neuronal cell types in the brain. However, the relationship between enhancers and cell types remains unclear. Indeed, the expression patterns we obtain are arguably more specific than our current understanding of neuronal cell type 1, 2 . For instance, stellate cells are a generally-accepted excitatory neuronal cell type of the medial entorhinal cortex 53 . They are characterized mainly by their stellate morphology, the fact they project to the hippocampus, and their expression of RE but not CB 25,26 . However, we show that two distinct enhancers reproducibly drive expression in certain percentages of stellate cells in LII of EC, even in rAAV. The question becomes whether these enhancer-driven expression patterns reflect functionally distinct subtypes of stellate cells, or random subsets of the same indivisible cell type. In the specific case of stellate cells, a recent paper used optogenetic tagging to show that stellate cells of the MEC exhibit a variety of quite distinct receptive field properties (i.e. they can be grid cells or spatial cells or border cells, etc), suggesting that there are many functional subtypes of stellate cells 35 . More generally, the relationship between differential enhancer usage and neuronal cell types is a highly non-trivial question, not least because there is not even complete agreement as to how to define neuronal cell types (though there are notable exceptions) 39,40,54 , let alone how many there are.
There are several other interesting explanations for differential enhancer usage beyond cell type; for instance, it could dictate distinct states of a single cell type. In support of this, neural activity drastically changes the chromatin landscape of the brain, including which enhancers are active 55,56 . It will likely take years of anatomical, molecular, and physiological characterization of these tools to disentangle such questions, so for our current purposes the most important consideration is that these enhancer-based molecular genetic tools remain true to type, as appears to be the case. For instance, both the It should be noted, however, that specificity is never absolute, especially with viral vectors.
While we obtain neuronal subtype-specific results with large injections into the entorhinal cortex (Figure S3), it is likely that any cell type in other brain regions that expresses the transcription factor(s) appropriate for a particular enhancer would be labeled as well. Moreover, presumably many more cells are infected than show strong GFP label, and there is a baseline level of transcription from other elements in the viral construct (i.e. the minimal promoter and the ITRs). This implies that multiple infections of an EDGE rAAV in any cell could lead to discernible nonspecific transgene expression without involvement of the enhancer at all (Figure 1C, 2B). Viral expression is thus not all-or-nothing, but the difference between background and enhancer-driven expression levels can be quite marked (Figure S1). This background expression can be quite problematic with enzymes such as recombinases, or when complementing replication-competent viruses (e.g. ∆G-rabies 57 ), but is likely not an issue with transgenes whose effects vary roughly linearly with their expression levels, such as chemogenetic 6 and/or optogenetic tools 7 .
Thus, identification of the active enhancers of a mere four cortical subregions of the mouse brain has led to a variety of viral tools for circuit analysis that appear to work across species, at least in rodents. Since one can do this in any species with a reasonably well-annotated genome, one could conceivably develop tools for anatomically specific "circuit-breaking" experiments in any species. Not only will circuit-specific tools greatly facilitate our understanding of normal and pathological brain function, they could in time possibly provide novel circuit-specific therapeutic avenues. For example, it has been known for decades that preclinical stages of Alzheimer's disease (AD) are characterized by neuronal loss and accumulation of neurofibrillary tangles in the superficial layers of trans-entorhinal cortex 58 , a region roughly equivalent to rodent MEC layer II. In addition, intracellular amyloid-β is found specifically in MEC layer II reelin-positive neurons in human AD pathology and rodent disease models 59 . Given the emerging consensus that AD may progress trans-synaptically 60, 61 , it is conceivable that one could use something like a MEC13-53 rAAV to deliver therapeutic agents directly to the presumed pre-α cells. More generally, it is possible that the reason that many neurological and neuropsychiatric disorders are resistant to drug therapy is that they are imbalances in particular neural circuits, not diseases of the entire brain. A drug having tropism for multiple circuits (as most do) would then by definition produce unwanted side effects: it may do the right thing in the right circuit, but it does the wrong thing to normal circuits.
Results like those presented here allow us to hope that we may one day be able to design interventions with the required specificity to match the complexity of diseases of the brain.

Legends
Schematic summary of region-specific transgene expression by EDGE-rAAV viruses. (A) Biological replicates of regions of interest are microdissected from C57BL/6J mouse brain.

in multiple horizontal sections in dorso-ventral axis from a mouse brain injected with CMV-rAAV. Label is throughout the layers of EC and also in subiculum. Stereotaxic coordinates were identified based on anatomical features using Paxinos G & Franklin K (for mouse brain) and Paxinos G & Watson C (for rat brain). Scale bar = 100 µm.

Methods
EDGE rAAVs were packaged in AAV serotype 2/1 (having a mosaic of capsid 1 and 2) 28 using heparin column affinity purification 63 . Specifically, a pAAV construct generated as described above, together with AAV helper plasmids encoding the structural elements, was transfected into the AAV-293 cell line (Agilent, USA). The day before transfection, 7 x 10^6 AAV-293 cells were seeded into 150 mm cell culture plates in DMEM containing 10% fetal bovine serum (ThermoFisher, USA) and penicillin/streptomycin. Co-transfection of the plasmids, i.e. pAAV containing the transgene, pHelper, pRC (Agilent, USA) and pXR1 (NGVB, IU, USA), was carried out the next day. After 7 hours, the medium was replaced with fresh 10% FBS-containing DMEM. The AAV-293 cells were cultured for two days following transfection to allow rAAV synthesis

Finally, the wound was rinsed with saline and the skin was sutured. The animals were left to recover in a heating chamber before being returned to their home cage. The next day, Metacam was administered orally, and their health was checked daily.

Figure S1). For the post-acquisition processing, confocal .czi images were opened in Zen 2012 and visualized with the intensity range indicator. The intensity histogram for the green channel was altered until the optimum intensity was visualized (as displayed by the range indicator). Identical changes in the intensity levels were applied to the other confocal images captured using the same settings for comparison (Figure S1).
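The reason for applying identical intensity changes to all images in a comparison can be illustrated with a short sketch. The Python snippet below is not the Zen 2012 procedure itself; it is a hypothetical linear display-range mapping, with made-up pixel values and a made-up function name, showing that when the same low and high limits are used for two images, differences in displayed brightness reflect differences in the raw signal.

```python
def apply_display_range(image, low, high, out_max=255):
    """Linearly map raw intensities in [low, high] onto [0, out_max].

    Values below `low` or above `high` are clipped. `image` is a list of
    rows of raw pixel intensities (hypothetical data, not from the paper).
    """
    if high <= low:
        raise ValueError("high must exceed low")
    scaled = []
    for row in image:
        scaled.append(
            [
                round((min(max(value, low), high) - low) * out_max / (high - low))
                for value in row
            ]
        )
    return scaled


# Hypothetical raw intensities from two images acquired with the same settings.
strong_label = [[200, 4000], [1800, 2650]]
weak_label = [[150, 300], [250, 400]]

# The same display limits are applied to both images, so they stay comparable.
LOW, HIGH = 100, 2650
strong_display = apply_display_range(strong_label, LOW, HIGH)
weak_display = apply_display_range(weak_label, LOW, HIGH)

print(strong_display)
print(weak_display)
```

Had each image been scaled to its own minimum and maximum instead, the weakly labelled image would appear as bright as the strongly labelled one, which is the kind of distortion that shared settings avoid.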