Summary
Neurons transmit information to distant brain regions via long-range axonal projections. In the mouse, area-to-area connections have only been systematically mapped using bulk labeling techniques, which obscure the diverse projections of intermingled single neurons. Here we describe MAPseq (Multiplexed Analysis of Projections by Sequencing), a technique that can map the projections of thousands or even millions of single neurons by labeling large sets of neurons with random RNA sequences ("barcodes"). Axons are filled with barcode mRNA, each putative projection area isdissected, and the barcode mRNA is extracted and sequenced. Applying MAPseq to the locus coeruleus (LC), we find that individual LC neurons have preferred cortical targets. By recasting neuroanatomy, which is traditionallyviewed as a problem of microscopy, as a problem of sequencing, MAPseq harnesses advances in sequencing technology to permit high-throughput interrogation of brain circuits.
Introduction
Neurons transmit information to distant brain regions via long-range axonal projections. In some cases, functionally distinct populations of neurons are intermingled within a small region. For example, nearby hypothalamic nuclei regulate basic drives including appetite, aggression, and sexualattraction (Kennedy et al., 2014; Sternson, 2013), and neurons from these nuclei project to distinct targets. In visual cortical area V1, responses to visual stimuli are matched to the properties of the higher visual areasto which the neurons project (Glickfeld et al., 2013; Movshon and Newsome, 1996). Findings such as these suggest that the information transmitted byindividual neurons may be tailored to their targets. Such selective routing of information requires an anatomical substrate, but there is currently no high-throughput method for determining the diverse projection patterns of individual neurons.
At present, there is a steep tradeoff between throughput and resolutionin anatomical approaches to mapping long-range connections. In conventional anterograde brain mapping studies, a fluorescent or enzymatic label is used to enable visualization of cell bodies and distal projections by lightmicroscopy. Bulk techniques query the projections of many neurons labeled at a single injection site and thus sample the aggregate architecture of an entire neuronal population. There have been several large-scale efforts,including the Allen Brain Projection Atlas (Oh et al., 2014) and the iConnectome (Zingg et al., 2014), to systematically map mesoscopic connectivity. Although fast, such bulk methods obscure the diversity of the many projection neurons labeled in any one experiment. Consider, for example, a single source area that projects to three downstream areas (Fig 1a). This projection pattern enables neurons in the source area to send information to the three downstream areas. However, identical bulk projection patterns could arise in multiple ways: (i) from a one-to-one architecture, in which each neuron targets only a single downstream area (left); (ii) from an all-to-all architecture, in which each neuron targets every downstream area (middle); or (iii) from a host of more complicated architectures (right). With conventional bulk labeling these three projection patterns, which have different functional implications, are indistinguishable without further experimentation.
Several alternative methods have been developed to complement conventional anterograde bulk-labeling approaches. For example, genetically definedsubpopulations of neurons within an area can be targeted by expressing a marker such as Cre recombinase (Gong et al., 2007; Harris et al., 2014; Huang, 2014). Similarly, subpopulations can be defined using retrograde (Lima et al., 2009; Oyibo et al., 2014; Wickersham et al., 2007a) or trans-synaptic viruses (DeFalco et al., 2001; Wickersham et al., 2007b). However, because such approaches rely on positing defined subpopulations, they cannoteasily be used to screen for the myriad possible complex projection patterns neurons might exhibit.
The most general and unbiased approach to distinguishing among the architectures in Fig. 1a relies on single neuron anterograde tracing. Current methods for achieving single neuronresolution require individually labeling one or, at most, a few cells per brain (Economo et al., 2016), a labor-intensive approach that affords highresolution at the cost of low-throughput. Although single-neuron tracing can be multiplexed by labeling individual neurons with different coloredfluorophores (Ghosh et al., 2011; Livet et al., 2007), in practice the extentof multiplexing is limited by the number of colors — at most 5-10 — that can be resolved by microscopy.
Here we describe MAPseq, a novel approach in which the speed and parallelization of high-throughput sequencing is exploited for brain mapping. MAPseq achieves multiplexing by using short, random RNA barcodes to uniquelylabel individual neurons (Mayer et al., 2015; Walsh and Cepko, 1992; Zador et al., 2012) (Fig 1b). The key advantage of using barcodes is that their diversity grows exponentially withthe length of the sequence, overcoming the limited diversity of the resolvable color space. The pool of unique barcode identifiers is effectively infinite; even a 30 nucleotide (nt)-sequence has a potential diversity of 43≈1018 barcodes, far surpassing the ~108 neurons in the mouse brain (Herculano-Houzel et al., 2006). Because high-throughput sequencing can quickly and inexpensively distinguish these barcodes, with MAPseq we can potentially read out the projections of thousands or even millions of individual neurons in parallel within a single brain (Fig. 1c).
In MAPseq, we uniquely label neurons in a source region by injecting a viral library encoding a diverse collection of barcode sequences. The barcode mRNA is expressed at high levels and transported into the axon terminals at distal target projection regions (Fig. 1d). To read out single neuron projection patterns, we then extract and sequence barcode mRNA from the injection site, as well as from each target region of interest. Spatial resolution of MAPseq is limited mainly by the precision of target dissection. Although MAPseq, like GFP tracing, does not distinguish fibers of passage, we minimize their contribution by avoiding large fiber bundles during the dissection of target areas. Using this procedure, the brain-wide map of projections from a given area can be determined in less than a week. By reformulating projection mapping as a problem of sequencing, MAPseq harnesses advances in high-throughput sequencing to permit efficient single neuron circuit tracing.
Results
As a proof-of-principle, we applied MAPseq to LC, a small nucleus in the brainstem that is the sole source of noradrenaline to the neocortex (Foote and Morrison, 1987). Early bulk tracing experiments revealed that the LC projects broadly throughout the ipsilateral hemisphere, leading to the view that the LC broadcasts a generalized signal that modulates overall behavioral state (Foote and Morrison, 1987; Foote et al., 1983; Loughlin et al., 1982; Waterhouse et al., 1983). This view has recently been supported by more sophisticated retrograde bulk tracing experiments, which reinforcethe idea that LC neurons project largely indiscriminately throughout the entire ipsilateral hemisphere (Schwarz et al., 2015). However, other reports have challenged this model. Using double retrograde labeling methods, these experiments uncovered separate populations of LC neurons projecting to different areas of cortex (Chandler et al., 2014; Chandler and Waterhouse, 2012), raising the possibility that the LC exerts differential control over different cortical areas. To address this controversy, we applied MAPseq to LC to obtain a large number of projection patterns at single neuron resolution.
In what follows we first show that long-range projections of neurons can be efficiently and reliably determined using barcode mRNAs, the abundance of which we interpret, like GFP intensity, as a quantitative measure of projection strength. Next, we establish the theoretical and practical foundations of randomly labeling large numbers of neurons with a viral barcodelibrary, critical for ensuring single cell resolution for MAPseq. We then apply MAPseq to the LC, and find that individual neurons have a variety ofidiosyncratic projection patterns. Some neurons project almost exclusivelyto a single preferred target in the cortex or olfactory bulb, whereas others project more broadly. Our findings are consistent with, and reconcile, previous seemingly contradictory results about LC projections. Finally we show that MAPseq can be multiplexed to two and potentially many injectionsin the same animal, which will allow the projection patterns from many brain areas to be determined efficiently and in the same brain without the need for registration across animals.
Using RNA to trace neurons
Conventional neuroanatomical tracing methods rely on filling neurons with dyes or proteins, so that neural processes can be resolved by microscopy. An implicit assumption of these techniques—albeit one that has rarely been rigorously tested—is that the tracer fills the neuron abundantly and uniformly, so that the strength of the signal corresponds to the quantity of labeled neural process, independent of distance from the soma. For barcode mRNAs to act as a comparable label in MAPseq, we sought to maximize the abundance and uniformity of barcode mRNA in distal processes. We used two strategies to achieve this goal.
First, we expressed a modified presynaptic protein that was designed tospecifically bind to and transport barcode mRNA into axon terminals. We engineered this protein, which we denote MAPP-nλ, as part of a larger project aiming to read out synaptic connectivity using mRNA. To generate MAPP-nλ, we began with pre-mGRASP, a protein engineered to localize at the presynaptic terminal due to fusion with trafficking signals from the endogenous presynaptic protein NRXN1β (Kim et al., 2011). We then inserted four copies of the nλ RNA binding domain (Daigle and Ellenberg, 2007) into the cytoplasmic domain of the protein. Derived from the λ phage λN protein, the nλ domain is a 22 amino acid peptide which strongly and specifically binds to a 15-nt RNA hairpin, termed boxB. We added four copies of the boxB hairpin to the barcode mRNA, ensuring coupling of MAPP-nλ to the barcode mRNA and thus transport of barcode mRNA into axon terminals (Daigle and Ellenberg, 2007). Second, we delivered the barcode sequence using recombinant Sindbis virus, a virus which can rapidly achieve very high expression levels (Ehrengruber, 2002). We used a novel Sindbis packaging system which, unlike previous systems, is both neurotropic and propagation incompetent (Kebschull et al., 2015) (Supplementary Fig 1a-g; Supplementary Note 1). All components necessary for MAPseq are expressed from a dual promoter virus that generates two subgenomic RNAs (Fig. 2a). The first encodes the MAPP-nλ protein. The second RNA encodes a random 30-nt barcode, as well as the boxB sequence, downstream of a GFP marker (Supplementary Fig. 1h,i). We reasoned that combining these two strategies would maximize our ability to reliably detect barcode mRNA in distal processes.
We injected barcoded virus into right LC (Supplementary Fig. 1j,k,l) and examined barcode localization by in situ RNA hybridization 44 hours after injection. We observed robust barcode mRNA localization in the soma and neuronal processes, in a pattern similar to that of coexpressed GFP (Fig. 2b). This suggested that barcode mRNA could effectively fill local neuronal processes.
To determine whether the barcode mRNA fills distal neuronal processes uniformly, we exploited the particular anatomy of LC projection neurons. LCneurons that project to cortex send their processes all the way to the rostral end of the brain, before changing direction and moving caudally to innervate cortical areas (Fig. 2c). Axons that project to visual cortex are therefore approximately twice as long as those that project to frontal cortices. From bulk tracing studies it is known that LC innervation is homogeneous along the rostro-caudal axis (Schwarz et al., 2015; Waterhouse et al., 1983). Thus if barcode mRNA were not efficiently transported to distal processes, we would expect to find more barcode mRNA in rostral regions of cortex. To assess this, we injected barcoded virus into LC, cut 300μm coronal slices of the entire cortex (Fig. 2d), and quantified the amount of barcode mRNA from each ipsilateral and contralateral slice. Consistent with previous results using GFP and other tracing methods (Schwarz et al., 2015; Waterhouse et al., 1983) we found approximately uniform projections throughout the ipsilateral cortex (p=0.972 F-statistic vs. constant model; Fig. 2e); in particular, we found no evidence that distal processes were more weakly labeled than proximal processes. Also consistent with previous results, weobserved much weaker (30.6-fold less; p=4×10-31 paired student’s t-test) projections to the contralateral cortex (Fig. 2f). These results suggest that barcode mRNA fills distal and proximal processes with about equal efficacy, so that the barcode mRNA can be interpreted in the same way as the fluorophores and dyes used in conventional tracing studies.
Unique labeling of neurons with barcodes
In conventional single-neuron tracing, the main challenge to multiplexing is the low diversity of labels (fluorophores or enzymes) available to disambiguate individual neurons. To overcome this challenge, MAPseq labels neurons with short, random RNA barcodes delivered by infection with a diverse viral library. Ideally, each labeled neuron should have exactly one unique barcode. Here we consider the factors that could lead to deviations from this ideal scenario: (i) more than one barcode per neuron (multiple labeling); and (ii) more than one neuron per barcode (nonunique or degenerate labeling). As discussed below, deviations resulting from multiple labeling are much less of a concern than those resulting from degenerate labeling.
A neuron may express more than one barcode if it is infected by more than one viral particle. Such multiple labeling will lead to an overestimateof the number of neurons identified, but will not distort the projection patterns recorded for individual neurons (Fig. 3a). Furthermore, even estimates of the relative abundances of different neuronal classes will, on average, be accurate. Assume for example that two neurons A and B are each labeled with 10 different barcodes. In this scenario, MAPseq will discover 10 instances of neuron A and 10 of B; but even though the absolute number of neurons is incorrect, the facts that neurons A and B have distinct projection patterns, and that these patterns occur in a 1:1 ratio, are accurately inferred. Thus multiple labelingwill not, on average, lead to mischaracterization of neuronal classes or of their relative frequency in the population.
Nevertheless, to simplify the interpretation of MAPseq results we sought to minimize the multiplicity of infection (MOI) by titrating the concentration and volume of virus injected. To estimate the MOI, we isolated a total of 45 individual neurons from 3 animals injected with MAPseq virus into the right LC and sequenced the barcodes within each neuron. On average, infected LC neurons contained 1.2+/-0.1 barcodes each, implying MOIof 0.43 (Fig. 3b). Only 21+/-11% of neurons contained more than one barcode, with most of these neurons carrying two barcode sequences and only 1.7+/-2.9% of neurons containing three barcode sequences.
The second deviation from the ideal scenario is non-unique labeling. Iftwo neurons share the same barcode, then MAPseq misinterprets this as a single neuron whose projection pattern is given by the union of the projection patterns of the two infected neurons (Fig. 3c). The probability that two neurons are infected by the same barcode depends on the number of infected cells relative to the number of available barcodes. Trivially, if the number of infected cells is larger than the number of available barcodes, unique labeling of all neurons cannot be achieved. Conversely, if the number of available barcodes is much higher than the number of infected cells, every neuron will be labeled with a different barcode purely by chance.
To determine whether our barcode diversity was sufficient to ensure unique labeling, we first quantified the number of neurons. We counted 1985+/-132 (N=6 animals) neurons expressing tyrosine hydroxylase, a noradrenergic marker. The size of this neuronal population is approximately 15 orders of magnitude smaller than the theoretical diversity (43≈1018) of a library of 30-nucleotide barcodes, so in theory unique labeling would be virtually certain. In practice, however, the actual diversity of a virus library is limited by bottlenecks in plasmid and virus generation (Fig. 3d), so further analysis was required to determine whether the viral library was sufficiently diverse.
We therefore computed how the fraction uniquely labeled neurons scaled with the diversity of the viral library. This problem is formally equivalent to a generalization of the classic “birthday problem”, which concerns the probability that, in a group of k people (neurons), some pair will have the same birthday (barcode). Assuming that all barcodes are equally abundant in the library, we can express the expected fraction F of uniquely labeled neurons as: where k is the number of infected neurons (assuming one barcode per cell), and N is the barcode diversity (see Supplementary Note 2). Thus if k=1000 LC neurons were infected with a library of diversity of N=106, on average 99.9% of all neurons would be labeled uniquely. This expression holds only for a library of equally abundant barcodes; if some barcodes are overrepresented the fraction F of uniquely labeled neurons decreases (in the same way that if birthdays tend to fall on a particular day the probability of finding a shared birthday in a group increases (Munford, 1977)). However, analysis of the actual viral barcode library determined by sequencing (Fig. 3e) revealed that in practice these deviations from uniformity had only a minor effect (Supplementary Fig. 2a). Thus, under our conditions, the vast majority (>99%) of neurons will be uniquely labeled, even taking into account the uneven barcode distribution in the viral library (Fig. 3f; Supplementary Note 2).
We also used a second, more empirical approach, to estimate the extent of degenerate labeling. Since we used the same viral library to infect neurons in different animals, barcode sequences found in more than one animalrepresent degeneracy. We therefore looked for overlap in the recovered barcodes from four independent injections of the same virus library. Out of the 992 unique barcodes that we recovered from traced neurons, only three barcodes were present in more than one animal, and no barcode was present in more than two animals. This empirically measured rate of degenerate labeling is in close agreement with our expectations based on the theoretical considerations above. Moreover, two of the three repeated barcodes were among the most abundant barcodes in the virus library (Supplementary Fig. 2b), and would thus be expected a priori to have the highest probability of double labeling.This analysis provides an independent confirmation that the error rate due to non-unique labeling by the barcode library is very low in our experiments.
In addition to non-unique barcode labeling, MAPseq is subject to other errors that differ from those associated with conventional tracing approaches. We used several approaches to quantify these errors, and find that the overall MAPseq error rate was low both for false positives (1.4+/ 0. 8% (mean+/-std error)) and false negatives (8.6+/-6%) (see Supplementary Note 3). MAPseq thus provides a reliable measure of axon projections.
Sequencing of barcode mRNAs reveals diverse single neuron projection patterns
The goal of MAPseq is to quantify the projection patterns of large populations of neurons in parallel. We therefore developed a method to determine the amount of each barcode in each dissected target (Fig. 4a). Forty-four hours after injection of barcoded virus into right LC, we performed reverse transcription on barcode mRNA extracted from dissected target regions. To overcome distortions introduced during amplification (Kebschull and Zador, 2015), and to allow a precisecount of barcode cDNA molecules, we designed reverse transcription primersto tag each individual barcode mRNA molecule with a random 12-nt unique molecular identifier (UMI). We also added a 6-nt slice-specific identifier (SSI), to allow multiplexing of samples within a single high-throughput sequencing flow cell. We then amplified, pooled and sequenced these SSI-UMI-barcode cDNAs (Supplementary Fig. 3). We developed a conservative computational pipeline to minimize noise due to RNA contamination and to correct for sequencing and other errors (Supplementary Note 4). Finally, we converted barcode abundance in the target areas to a matrix of single neuron projection patterns.
We used MAPseq to determine the projection patterns of a total of 995 barcodes labeled in four animals (249+/-103 barcodes per animal), roughly corresponding to an equal number of LC neurons. For each animal, we analyzed the barcode mRNAs extracted and amplified from the olfactory bulband from 22 coronal slices (300μm) taken from the cortex ipsilateral to the LC injection (Fig. 4b,c).Although like conventional GFP tracing, MAPseq does not distinguish betweensynaptic connections and fibers of passage, we minimized the contribution of large fiber tracts in the white matter during dissection, so most of the mRNA barcode signal was likely generated from axons terminating in theregions of interest. Dissection of coronal slices allowed us to probe the organization of projections along the rostral-caudal axis, but we were insensitive to any additional structure along the medio-lateral axis. Becauseindividual barcode mRNA molecules are tagged with a UMI before amplification, we obtained a precise quantification (subject to Poisson counting statistics; see Supplementary Note 5) of the amount of each barcode sequence in each dissected target. In this way we could infer the projection strength—the density of axon per tissue volume—of each neuron to each coronal target area. For example, we recovered 223 copies of BC28 in slice 5, but none in slice 20, indicating that the projection strength to slice 5 is at least a factor of 200 higher than our detection floor (Fig. 4d).
Inspection of the projection patterns revealed that, in contrast to thesimplest prediction from conventional bulk tracing, single neurons did notproject uniformly throughout the ipsilateral hemisphere. Instead, neurons projected in diverse and idiosyncratic ways to specific targets, innervating some areas hundreds of times more strongly than others (Fig. 4d). Some neurons (e.g. BC28) showed specificprojections to only a small part of cortex, whereas others (e.g. BC79) projected more broadly, or projected to multiple areas (e.g. BC51 or BC235). Projections to the olfactory bulb appeared independent of projections to cortex, with some neurons projecting exclusively to the olfactory bulb (e.g. BC302), some projecting exclusively to cortex (e.g. BC79) and others projecting to both (e.g. BC108).
The small fraction of multiply infected neurons revealed by single cellsequencing (21+/-11%; Fig. 3b) provided a convenient internal measure of the reliability of MAPseq. Within each animal, we found pairs of very similar projection patterns, as would be expected if they arose from double labeling of the same neuron (Fig. 5a; Supplementary Fig. 4a). By comparing the similarity of projection patterns within and across animals, we estimated the total number of barcode pairs arising from doubly labeled neurons (18.6+/-2%; Fig. 5b; Supplementary Fig. 4b). The fact that this estimate is in such close agreement with the independent estimate of the number of doubly labeled neurons from single cell sequencing supports the view that MAPseq provides a robust measure of single neuron projection profiles.
To assess the structure of the LC projection to cortex and olfactory bulb, we sorted all traced neurons by their maximum projection (Fig. 6a). The maximum projections of individual LC neurons tile the entire cortex. To compare across the population we normalized the total barcode content of each neuron to one, although interestingly there was no correlation between expression level in the LC and themaximal projection strength to cortex, as would have been expected if differences across neurons were dominated by expression level (R=-0.06; p=0.09; Supplementary Fig. 5a). Only in the aggregate do these projections re-create the apparently homogeneous LC innervation of cortex previously described by bulk methods (Fig. 2e). Consistent with previous results (Shipley et al., 1985), a considerable fraction (23+/-4.7%) of all mapped neurons projected to the olfactory bulb.
We then asked if we could find structure, or even stereotypic projection cell types, in the single cell data set. We investigated the cortex-wideprojection patterns of LC by reducing the dimensionality of the projectiondataset using Euclidean distance based t-SNE (t-distributed stochastic neighbor embedding) (Van der Maaten and Hinton, 2008). Neurons with maximum projections close to each other in physical space along the rostro-caudal axis also fall closely together in t-SNE space (Supplementary Fig. 5b), indicating that the location of the maximum projection target at least partially describes the individual neuron projection patterns. However, hierarchical clustering of the projection profiles of neurons that project to cortex did not uncover distinct cell classes (Supplementary Fig. 5c). Although we cannot rule out the possibility that there is further structure in the projection patterns that would be revealed by higher-resolution dissection, our data suggest the intriguing possibility that LC projections are equipotential for projecting to all targets, and the choice is arbitrary for each neuron. How the circuit might exploit such random connectivity raises interesting computational challenges.
Although many LC axons projected very strongly to a narrow target, these axons often sent minor collaterals to a much broader area, like a plant with a single main stalk and many minor growths. The average number of projection peaks per LC neuron was 1.6+/-0.8 (Supplementary Fig. 5d), and the fall-off to half the maximum projection strength of individual neurons occurred on average in <300μm (Fig. 6b). Nonetheless, every cortically-projecting neuron innervated on average 65+/-23% of cortex at a detectable level (see cumulative distribution of projection widths in Fig. 6c). Importantly, this broad, weak innervation of cortex cannot be explained simply by contamination of our dataset by fibers of passage. LC axons innervate cortex starting in the rostral end and moving caudally (Fig. 3c). We would therefore expect signals from accidentally dissected fibers of passage to only precede strong projection targets of individual neurons along the rostro-caudal axis. However, we find that strong projection targets are both preceded and followed by low level projections (see e.g. Fig 4d BC79 and BC51). We therefore conclude that the observed weaker signals are not the result of fibers of passage but represent real, but weak, projections. The fact that many neurons had a strong preferred target in cortex or olfactory bulb, but also projected weakly to a much broader area, provides a way to reconcile apparently conflicting results about the specificity of LC projections. Recent experiments, in which retrograde viral labeling was combined with anterograde tracing of axons, concluded that as a population LC neurons project largely indiscriminately throughout cortex and the rest of the brain examined (Schwarz et al., 2015). However, using this approach, a neuron labeled retrogradely from a weak projection is indistinguishable from one labeled from a strong projection, so at the level of the population (i.e. after summing the projection patterns of strongly and weakly projecting neurons) it may appear that projections are nonspecific. Thus although the results of this study may appear to contradict those obtained by MAPseq at single neuron resolution, simulations of retrograde labeling in combination with anterograde bulk tracing based on our MAPseq data set demonstrate that there is no contradiction (Supplementary Fig. 5e,f) at the bulk level.
MAPseq scales to several injection sites
MAPseq can readily be extended to determine the projections of two or more regions in a single animal. As a proof-of-principle, we injected the same library of MAPseq virus bilaterally into left and right LC, and dissected coronal slices of left and right cortex and the olfactory bulbs, as well as both injection sites (Fig. 7a). Each barcode was expressed predominantly in either the left or right LC (Fig. 7b); barcode expression at the site contralateral to the injection, due to contralaterally projecting fibers and/or contamination, is much lower. Thus each barcode can be reliably assigned to the appropriate injection site. As expected, parallel injections recapitulated the projection pattern observed with single injections (Fig. 7c). Multiplexing MAPseq todozens of injections per animal may befeasible, reducing the labor and cost of brain-wide projection mapping efforts, and eliminating the need to map data from multiple animals to an average reference brain (Oh et al., 2014; Zingg et al., 2014).
Discussion
We have described MAPseq, the first application of sequencing to neuroanatomical tracing. MAPseq is a simple, rapid, and inexpensive approach to determining the projection patterns of myriad single neurons at one or more injection sites in a single animal. As a proof-of-principle, we applied MAPseq to the LC. In contrast to previous bulk labeling studies that reported diffuse and non-specific projections from the LC, this single neuron resolution analysis reveals that individual LC neurons have idiosyncratic projection patterns with preferred cortical targets, and reconciles a controversy about the specificity of LC projection patterns. MAPseq, which complements rather than replaces conventional approaches, can readily be applied to other brain regions and organisms, and with further development my be combined with information about gene expression and neural activity.
High-throughput sequencing and neuroanatomy
The cost of sequencing the human genome was several billion dollars in 2003, but today it is less than one thousand dollars—a decrease of over six orders of magnitude in just over a dozen years (Hayden, 2014; Sheridan, 2014). This precipitous drop in sequencing costs continues unabated, at a rate faster even than Moore’s law (the rate at which computers improve). At the same time, DNA sequencing has evolved from a specialized tool for determining the sequences of genomes into a versatile technology for determining gene expression levels, discovering new species, tracking cell fates, and understanding cancer growth, among many other applications (Reuter et al., 2015). With advances of technology and novel assays, DNA sequencing has revolutionized disparate areas of biology. By harnessingsequencing for neuroanatomical tracing, MAPseq may accelerate our understanding of neural circuits.
Information about single neuron projection patterns obtained by MAPseq may be combined with information about other dimensions of neuronal function. For example, single neuron projection patterns obtained by MAPseq could be associated with information about gene expression. One approach exploits transgenic mice that mark defined neuron classes with Cre recombinase.Although expression of barcodes delivered using the RNA virus Sindbis cannot readily be controlled with Cre, barcodes could be delivered using a DNAvirus like AAV or a retrovirus like lentivirus. More general approaches, such as single cell isolation (Supplementary Fig. 6), might associate several genes or even a whole transcriptome with projection patterns. MAPseq data could also be combined with recordings of neural activity obtained bycalcium imaging. Taken together, thecombination of connectivity, gene expression, and activity could provide aricher picture of neuronal function than any of these alone (Marblestone et al., 2014).
MAPseq resolution
The factors limiting MAPseq can be understood by analogy with those limiting optical microscopy. Just as spatial resolution — effective pixel size — in microscopy is limited by the optics and detectors, sothe spatial resolution of MAPseq is limited by the spatial precision with which brain regions are defined. In the present work we used gross dissection, which affords only relatively crude spatial resolution, perhaps to the level of cortical areas (we have successfully dissected individual cortical areas, including the primary auditory cortex; Supplementary Fig. 7a,b). The spatial resolution can be increased by other methods including laser-capture microdissection (Espina et al., 2006), transcriptome in vivo analysis (TIVA) tagging (Lovatt et al., 2014), or fluorescent in situ sequencing (Lee et al., 2014) to determine the location of barcode mRNAs in tissue.
The effective spatial resolution of MAPseq may also be limited by the amount of barcode mRNA in each spatially defined region, analogous to the low-light photon-counting limit in microscopy. The precise spatial scale atwhich the barcode mRNA “shot noise” limit is reached in MAPseq is determined by the interplay of several factors, including the diameter of the axonal projections, expression level of the barcode, and the efficiency with which barcodes can be recovered and amplified. Because mRNA shot noise did not appear limiting in the present experiments, we did not invest significant effort in optimizing these parameters. However, the factthat barcodes can be detected even in fine LC axons suggests that MAPseq may achieve relatively high spatial resolution.
Uniformity vs. specificity of LC projections
The LC sends projections to most ipsilateral brain areas, with the notable exception of the striatum. However, how broadly individual neurons innervate those target areas is subject to debate. Classical retrograde tracing studies suggested a topographic organization of neocortical (Waterhouse et al., 1983) and brain-wide (Loughlin et al., 1986) projection neurons in LC. Consistent with this, double retrograde labeling studies reported that the LC projections to frontal and motor cortices (Chandler et al., 2014; Chandler and Waterhouse, 2012) overlap minimally. In contrast, other double retrograde studies found overlap between neurons projecting to separate structures along the same processing stream (Simpson et al., 1997), or structures as different as forebrain and cerebellum (Steindler, 1981). Morerecent work using retrograde viral labeling combined with anterograde tracing concluded that LC neurons project largely indiscriminately throughout both cortex and the rest of the brain (Schwarz et al., 2015).
Our single cell resolution data reconcile these conflicting datasets. We find that individual LC neurons have very specific projection targets incortex and olfactory bulb, but are not limited to a single target. We further observe that many LC neurons that project to cortex innervate a large fraction of the cortex at least weakly, in addition to having preferred projection targets. To the extent that retrograde viral tracers may not distinguish between strong and weak projections, it may be this feature of single neuron projections in combination with bulk tracing that lead Schwarz and colleagues (Schwarz et al., 2015) to conclude that LC neurons largely indiscriminately project throughout cortex and the bulb, and indeed we canreplicate their results by simulating retrograde tracing on our single cell dataset.
The LC is the sole source of noradrenaline to the cortex. Noradrenalineexerts a powerful influence on an animal’s behavioral state. Noradrenaline levels control the overall level of vigilance; they are lowest during sleep, and are increased in response to stimuli such as pain. Noradrenaline gates attention, enhances formation of long-term memory and is thought to regulate the exploration-exploitation balance (Aston-Jones and Cohen, 2005; Sara, 2009). Traditionally, it has been assumed that the levels of neuromodulators such as noradrenaline represent a global signal, broadcast indiscriminately throughout the cortex. However, the specificity of the single neuron projections patterns uncovered by MAPseq suggests that different brain regions could be subject to differential control. Whether this potential for differential control is actually realized, and what functional role it plays, remain to be determined.
Conclusion
Applying MAPseq to LC, we discovered unexpected structure that could not have been resolved by previous methods lacking single neuron resolution. MAPseq also lays the foundation for using sequencing to decipher local neuron-to-neuron connectivity (Zador et al., 2012). Using DNA sequencing technology, experimenters have gained unprecedented insight into the heterogeneity of cell populations at the single cell level (Navin et al., 2012). By leveraging this sequencing technology, MAPseq empowers neuroscience researchers to efficiently do the same for populations of projection neurons examined at the single neuron level.
Experimental procedures
MAPseq
Forty-four hours after injection of MAPseq virus into LC, we flash froze the brain and cut it into 300μm coronal sections using a cryostat. We dissected the cortical regions on dry ice and extracted total RNA from each sample. We then preformed gene specific reverse transcription for the barcode mRNA, produced double-stranded cDNA and PCR amplified it to produce an Illumina sequencing library which we sequenced at paired end 36 onan Illumina NextSeq sequencing machine.
Data analysis
We processed the sequencing data to determine the exact copy number of each barcode sequence in each target area and in the injection site, an produced a barcode matrix, where each row corresponds to one specific barcode sequence, each column corresponds to a target area or the injection siteand each entry is the copy number of that barcode mRNA in the respective area. All data are expressed as mean +/- standard deviation unless otherwise stated.
For full details on the experimental procedures please refer to Extended Experimental Procedure section in the supplement.
Author contributions
I. D.P and A.M.Z conceived the study. J.M.K. and P.G.S. performed the experiments. J.M.K and A.P.R. performed the single cell isolation. J.M.K. and A.M.Z. analyzed the data. J.M.K. and A.M.Z. wrote the paper. A.M.Z. andD.F.A. supervised the project.
Extended Experimental Procedures
MAPP-nX
MAPP-nλ is a modified version of pre-mGRASP (Kim et al., 2011). We stripped the pre-mGRASP protein of the 2A-cerulean fusion and added four repeats of the nλ RNA binding domain (Daigle and Ellenberg, 2007)in thecytoplasmic tail after amino acid 287 of the original pre-mGRASP sequence. We also added a Myc epitope tag followed by the CLIP-tag domain (Gautier et al., 2008) after amino acid 59 of the original pre-mGRASP protein.
Sindbis virus barcode library
The virus used in this study is based on a dual promoter pSinEGdsp construct (Kawamura et al., 2003). We inserted MAPP-nλ after the first subgenomic promoter. Downstream of the second subgenomic promoter, we inserted the GFP coding region followed by closely spaced NotI and MluI restriction sites and four repeats of the boxB motif (Daigle and Ellenberg, 2007). Using this construct, we produced a high diversity plasmid library by inserting a diverse pool of double stranded ultramers (Integrated DNA Technologies) with sequence 5’-AAG TAA ACG CGT AAT GAT ACG GCG ACC ACC GAG ATC TAC ACT CTT TCC CTA CAC GAC GCT CTT CCG ATC TNN NNN NNN NNN NNN NNNNNN NNN NNN NNN NYY GTA CTG CGG CCG CTA CCT A-3’ between the NotI and MluI sites. We then produced Sindbis virus as previously described (Kebschull et al., 2015) using either the conventional DH(26S)5’SIN helper (Bredenbeek et al., 1993) or the new DH-BB(5’SIN;TE12) (Kebschull et al., 2015) helper. We determined the titer of the resulting virus by qPCR as previously described (Kebschull et al., 2015) and determined the viral library diversity by Illumina sequencing of the RNaseI protected genomic virus RNA.
Injections
Animal procedures were approved by the Cold Spring Harbor Laboratory Animal Care and Use Committee and carried out in accordance with National Institutes of Health standards.
We pressure injected 180nl of 2×1010 GC/ml barcoded Sindbis virus uni-or bilaterally into LC of 8-10 week old C57BL/6 males (Jackson Labs) as described (Cetin et al., 2007). We leveled the animal skulls on two axes using lambda and bregma for the AP axis and 2mm laterally from the midpoint between lambda and bregma for the lateral axis. We used coordinates AP=-5.4mm, ML=0.8mm, DV=2.9mm and 3.1mm forLC and measured depth from the surface of the brain. We injected each DV coordinate with 90nl of virus, waiting ten minutes in between each depth. We sacrificed animals 44 hours post injection. For immunofluorescence, RNA in situ and histology, we transcardially perfused animalswith ice cold saline (9g/l) followed by 4% paraformaldehyde (Electron Microscopy Sciences) in 0.1M Phosphate buffer. For RNA work we extracted the fresh brain and flash froze it on dry ice.
For measurements of MAPseq efficiency, we injected red retrobeads (Lumafluor) into the right olfactory bulb of 8-12 week old C57BL/6 males (Jackson Labs). Briefly, we roughly determined the center of the right olfactorybulb, and measured +/-1mm from the center in the AP axis and performed two craniotomies 2mm apart. We sonicated the beads for 20 minutes prior to injection to homogenize the solution and injected 210nl of stock concentration of beads at three different depths (0.3mm, 0.6mm and 0.9mm DV from the surface of the olfactory bulb) as described (Cetin et al., 2007). Twenty-four hours later, we injected barcoded Sindbis virus into right LC as above and sacrificed the animals 44-48 hours after Sindbis injection. Immunofluorescence and ISH
We performed anti-GFP staining and RNA in situ hybridization on 6μm thick paraffin sections. For immunofluorescence, we used a rabbit anti-GFP antibody ab290 (Abcam) after heat induced antigen retrieval. We performed RNA in situ hybridization using thePanomics ViewRNA ISH Tissue kit (Affymetrix) using anti-GFP probe VF1-10141 according to the manufacturer’s protocol (10 minutes boiling and 10 minutes protease treatment). We performed anti-TH staining on floating 70μm vibratome sections using rabbit anti-TH antibody SAB4300675 (Sigma-Aldrich).
Spike-in RNA
To produce spike-in RNA, we double stranded an ultramer (Integrated DNATechnologies) with sequence 5’-GTC ATG ATC ATA ATA CGA CTC ACT ATA GGG GAC GAG CTG TAC AAG TAA ACG CGT AAT GAT ACG GCG ACC ACC GAG ATC TAC ACT CTT TCC CTA CAC GAC GCT CTT CCG ATC TNN NNN NNN NNN NNN NNN NNN NNN NAT CAG TCA TCG GAG CGG CCG CTA CCT AAT TGC CGT CGT GAG GTA CGA CCA CCG CTA GCT GTA CA-3’ and then in vitro transcribed the resulting dsDNA using the mMessage mMachine T7 in vitro transcription kit (Thermo Fisher) according to the manufacturer’s instructions.
qPCR
We reverse transcribed total RNA using oligodT primers and Superscript III reverse transcriptase (Thermo Fisher) according to the manufacturer's instructions. We then quantified the amount of barcode and β-actin cDNA by qPCR in SYBR green power master mix (Thermo Fisher) according to the manufacturer's instructions using primers 5’-GAC GAC GGC AAC TAC AAG AC-3’ and 5’-TAG TTG TAC TCC AGC TTG TGC-3’ for barcode cDNA and 5’-CGG TTC CGA TGC CCT GAG GCT CTT-3’ and 5’-CGT CAC ACT TCA TGA TGG AAT TGA-3’ for p-actin cDNA.
MAPseq
We cut 300μm thick coronal sections of fresh frozen brains using a Leica CM 3050S cryostat at −12°C chamber temperature and −10°C object temperature. To avoid cross-contamination between samples, wetook care to cut each section with a fresh, unused part of the blade. We melted each section onto a clean microscope slide and rapidly froze the section again on dry ice before dissecting out the cortex on dry ice using acold scalpel blade. During dissection, we aimed to avoid known fiber tractsto minimize the contamination of our dataset with fibers of passage. Aftersample collection, we processed all samples out of order to avoid potential sample crosscontamination from impacting interpretation of MAPseq results.
We extracted total RNA from tissue samples using Trizol reagent (ThermoFisher) according to the manufacturer's instructions. We mixed the total RNA from the tissue samples with spike-in RNA. We then produced ds cDNA as previously described (Morris et al., 2011) using a gene specific primer of from 5’-CTT GGC ACC CGA GAA TTC CAN NNN NNN NNN NNX XXX XXTGTA CAG CTA GCG GTG GTC G-3’, where XXXXXX is one of 65 trueseq like SSI and N12 is the UMI. We then cleaned the reaction using the Qiagen MinElute PCR purification kit according to the manufacturer’s instructions and treated the eluted ds cDNA with ExonucleaseI (New England Biolabs) to remove remaining primers. We amplified the barcode amplicons by nested PCR using primers 5’-CTC GGC ATG GAC GAG CTG TA-3’ and 5’-CAA GCA GAA GAC GGC ATA CGA GAT CGT GAT GTG ACT GGA GTT CCT TGG CAC CC GAG AAT TCC A-3’ for the first PCR and primers 5’-AAT GAT ACG GCG ACC ACC GA-3’ and 5’-CAA GCA GAA GAC GGC ATA CGA-3’ for the second PCR in Accuprime Pfx Supermix (Thermo Fisher). We then gel extracted the amplicons using the Qiagen MinElute Gel extraction kit according to the manufacturer's instructions and pooled the individual sequencing libraries based on qPCR quantification using primers 5’-AAT GAT ACG GCG ACC ACC GA-3’ and 5’-CAA GCA GAA GAC GGC ATA CGA-3’. We then sequenced the pooled libraries on an Illumina NextSeq500 high output run at paired end 36 using the SBS3T sequencing primer for paired end 1 and the Illumina small RNA sequencing primer 2 for paired end 2.
Efficiency measurements and single cell isolation
After transcardial perfusion with ice-cold artificial cerebrospinal fluid (127mM NaCl, 25mM NaHCO3, 1.25mM NaPO4, 2.5mM KCl, 2mM CaCl2, 1mM MgCl2, and 25mM D-glucose), we extracted the unfixed brain and flash froze the bead-injected olfactory bulb on dry ice before processing it for sequencing as described above. We cut 400μm thick acute sagittal slices of the remaining right hemisphere in dissection solution (110mM choline chloride, 11.6mM ascorbic acid, 3.1mM Na pyruvic acid, 25mM NaHCO3, 1.25mM NaPO4, 2.5mM KCl, 0.5mM CaCl2, 7mM MgCl2, and 25mM D-glucose) using a Microm HM650V vibratome. We incubated sections containing LC in artificial cerebrospinal fluid (126mM NaCl, 20mM NaHCO3, 3mM KCl, 1.25mM NaH2PO4, 2mM CaCl2, 2mM MgSO4, and 20mM D-glucose) containing synaptic blockers (0.05mM APV, 0.02mM DNQX and 0.1 μM TTX) for 20 minutes at room temperature. We then digested the slices in artificial cerebrospinal fluid with streptomyces griseus protease (Sigma P5147) at 1mg/ml at room temperature for 30 min. After washing in artificial cerebrospinal fluid with synaptic blockers, we dissected LC from the digested section and triturated the tissue to produce a single cell suspension. Using an inverted fluorescent microscope (Zeiss Observer), we picked individual cells by hand, deposited the cells directly into lysis buffer (2.4μl 0.2% triton, 1μl 10mM dNTPs, 1μl 10mM RT primer per cell) and proceeded to preparing sequencing libraries from the cells as described above for tissue samples.
Acknowledgements
The authors would like to acknowledge Vasily Vagin, Fred Marbach, Alex Koulakov, and Brittany Cazakoff for useful discussions and assistance, and Barry Burbach and Diana Gizatulliana for technical support. We would further like to thank Anirban Paul for help with single cell isolations. This work was supported by the following funding sources: National Institutes of Health [5RO1NS073129 to A.M.Z., 5RO1DA036913 to A.M.Z.]; Brain Research Foundation [BRF-SIA-2014-03 to A.M.Z.]; Simons Foundation [382793/SIMONS to A.M.Z.]; PhD fellowship from the Boehringer Ingelheim Fonds to J.M.K.; PhD fellowship from the Genentech Foundation to J.M.K.; PhD fellowship fromthe Fundagao para a Ciencia e Tecnologia, Portugal to P.G.S.; Pew Scholarship and CSHL startup funds to D.F.A. This work was performed with assistance from CSHL Shared Resources, which are funded, in part, by the Cancer Center Support Grant 5P30CA045508.
Data deposition
All high throughput sequencing datasets are publicly available under SRA accession codes SRS1204613 (library ZL067; virus library), SRS1204589 (libraries ZL068, ZL070, ZL071, ZL072; unilateral MAPseq datasets), SRS1204614 (libraries ZL073, ZL074; bilateral MAPseq datasets), and SRS1204626 (libraries ZL075 and ZL078; single cell dataset).
Conflict of interests
The authors declare no conflict of interests.