Abstract
Targeted DamID (TaDa) allows highly efficient cell-type-specific profiling of protein-DNA interactions. Cell-type-specificity, however, is governed by the GAL4/UAS system, which can exhibit differences in expression patterns depending upon the genomic insertion site and the UAS promoter strength. The TaDa system uses a bicistronic transcript to reduce the translation rates of Dam-fusion proteins, presenting the possibility of using the primary ORF within the transcript to label expression domains and precisely identify the profiled cell populations in experimental samples. Here, we describe two TaDa vectors, pTaDaG and pTaDaG2, that use myristoylated GFP as the primary ORF. Differing lengths of the myristoylation sequence between the plasmids allow additional translational control. Fly lines created with this system allow easy visualisation of expression domains under both fluorescent dissecting and confocal microscopes without the use of antibody staining, whilst faithfully profiling protein-DNA interactions via Targeted DamID.
Introduction
Targeted DamID (TaDa) is a recently developed technique that generates in vivo cell-type-specific binding profiles of DNA-binding, chromatin-modifying, or DNA-associated proteins in Drosophila melanogaster [1, 2]. The technique is highly reproducible and extremely sensitive, generating binding profiles from as few as 10,000 cells in living organisms [2]. The technique is a variant of DamID, in which a protein of interest is fused to DNA Adenine Methylase (Dam) from Escherichia coli, leading to the enriched methylation of GATC sites in close proximity to where the protein of interest binds.
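Because Dam methylates adenine within GATC motifs, the resolution of any DamID profile is set by where GATC sites fall in the genome. The following Python sketch illustrates this by locating GATC sites in a sequence and listing the fragments between consecutive sites. The function names and the example sequence are hypothetical and chosen only for illustration; this is not part of any published DamID software.

```python
import re


def gatc_sites(seq: str) -> list[int]:
    """Return the 0-based start position of every GATC motif in seq."""
    # GATC cannot overlap itself, so a plain (non-overlapping) search suffices.
    return [m.start() for m in re.finditer("GATC", seq.upper())]


def gatc_fragments(seq: str) -> list[tuple[int, int]]:
    """Return (start, end) intervals between consecutive GATC sites."""
    sites = gatc_sites(seq)
    return list(zip(sites, sites[1:]))


# Hypothetical toy sequence with three GATC sites.
example = "ttGATCaaacccGATCggGATCtt"
print(gatc_sites(example))
print(gatc_fragments(example))
```

Each interval corresponds to one GATC fragment, the unit over which methylation signal is usually summarised.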
In TaDa, cell-type-specificity is accomplished via the GAL4/UAS system [3]. High levels of cellular Dam, however, are toxic, and a signature feature of TaDa is a significant reduction in the translation levels of Dam-fusion proteins from highly expressed GAL4-driven transcripts [1]. This lowering of translation levels is accomplished via a bicistronic transcript, where an upstream ORF is separated from the Dam-fusion ORF by two stop codons and a frameshift. The Dam-fusion ORF is thus only translated via spontaneous ribosome re-initiation, in which the rate of translation of a secondary ORF is inversely proportional to the length of the primary ORF [4]. The original TaDa system used full-length mCherry as the primary ORF. In theory, this presents the possibility of visualising the profiled cell population via microscopy of dissected experimental tissue; however, the low brightness of mCherry and the lack of localisation to a particular cellular compartment make visualisation of expression domains in TaDa experimental samples challenging in practice.
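As a rough illustration of the inverse relationship between primary ORF length and secondary ORF translation described above, the following Python sketch compares two hypothetical primary ORFs built from an 85 aa or a 14 aa tag fused to GFP. It is a toy model only: it assumes strict inverse proportionality, a GFP length of about 238 aa, and no linker residues, and the function name is made up for this example. The resulting numbers are not measured translation rates.

```python
def relative_secondary_translation(primary_len_aa: int, reference_len_aa: int) -> float:
    """Toy model: secondary ORF translation scales as 1 / primary ORF length.

    Returns the expected translation relative to a reference primary ORF.
    """
    if primary_len_aa <= 0 or reference_len_aa <= 0:
        raise ValueError("ORF lengths must be positive")
    return reference_len_aa / primary_len_aa


GFP_AA = 238  # assumed approximate GFP length; linkers are ignored

# Hypothetical primary ORF lengths: tag length + GFP.
primary_orfs = {"85 aa tag + GFP": 85 + GFP_AA, "14 aa tag + GFP": 14 + GFP_AA}
reference = primary_orfs["14 aa tag + GFP"]

for name, length in primary_orfs.items():
    ratio = relative_secondary_translation(length, reference)
    print(f"{name}: {length} aa, relative secondary ORF translation {ratio:.2f}")
```

Under these assumptions the longer primary ORF yields proportionally less translation of the downstream ORF, which is the lever that primary ORF length provides.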
Although broadly defined expression patterns for GAL4 drivers are known, the precise expression pattern and the amount of driver background from other tissues depend upon the targeted insertion site and UAS promoter strength [5]. Given the sensitivity of the TaDa technique, and that only a small subset of the cells within an isolated tissue are typically profiled within a TaDa experiment, knowing exactly which cells are profiled with a GAL4 driver in the TaDa system is critically important to data interpretation.
Here, we describe two new TaDa vectors that use membrane-targeted myristoylated-GFP as a primary ORF. The TaDaG vector uses an 85 amino acid (aa) myristoylation sequence; TaDaG2 uses a 14 aa minimal myristoylation sequence. The difference in primary ORF length allows differing secondary ORF translation levels, but the vectors otherwise behave identically. Importantly, these primary ORFs allow easy fluorescent identification of profiled cell populations within the experimental sample. The vectors also incorporate a StuI restriction site upstream of Dam to facilitate the creation of C-terminal Dam-fusion proteins.
Methods
Expression constructs
The pTaDaG vector was constructed by cutting pUASTattB [6] with EcoRI and XbaI, and inserting a 1969 bp custom gBlock (IDT) containing EcoRI-myrGFP-StuI-Dam-MCS-XbaI via NEB HiFi assembly (NEB). The myrGFP sequence comprises the first 85 aa of D. melanogaster Src64B fused to a D. melanogaster-codon-optimised GFP(F64L,S65T,H231L) [5]. The pTaDaG2 vector was created by cutting pTaDaG with EcoRI and NdeI, and inserting a 350 bp gBlock (IDT) containing the minimal 14 aa myristoylation sequence MGSSKSKPKDPSQR from p60src [7] and the 5’ portion of GFP, again via NEB HiFi assembly. pTaDaG-Pc was generated by cutting pTaDaG with BglII/XhoI and inserting a 1233 bp gBlock (IDT) containing the Pc-RA ORF. All plasmids were sequence-verified via Sanger sequencing (ABI). Plasmid maps were generated using SnapGene software (Insightful Science).
Fly lines
GAL4 driver lines used were worniu-GAL4 [8] for neural stem cells and R13F02-GAL4 [9] for mushroom body neurons. The worniu-GAL4 line was crossed to a tub-GAL80ts stock to generate a worniu-GAL4;tub-GAL80ts line.
TaDaG-Dam, TaDaG2-Dam and TaDaG-Pc fly lines were generated by BestGene, Inc (CA), through phiC31-integrase-mediated insertion of the appropriate expression vectors into attP2 on chromosome 3L.
Confocal microscopy
Larval brains (3rd instar, 96 hrs ALH) were dissected in PBS and fixed in PBS + 0.3% Triton X-100 (PBST) with 4% (v/v) paraformaldehyde (ProSciTech) for 20 min at 4°C, followed by three 10 min washes in PBST. Brains were mounted in Vectashield + DAPI, and imaged under an Olympus FV3000 confocal microscope at 20x magnification.
Targeted DamID
TaDaG-Dam or TaDaG-Pc males were crossed to worniu-GAL4;tub-GAL80ts virgin females in cages. Embryos were collected on apple juice agar plates with yeast over a 4-hour collection window at 25°C and grown at 18°C for two days. Newly hatched larvae were transferred to food plates for a further five days at 18°C, before shifting to 29°C for 24 hours. Larval brains were dissected in PBS and processed for DamID-seq as previously described [2, 10], with the following modifications. Briefly, DNA was extracted using a Quick-DNA Miniprep Plus kit (Zymo), digested with DpnI (NEB) overnight and cleaned up with a PCR purification kit (Macherey-Nagel). DamID adaptors were then ligated, and the DNA was digested with DpnII (NEB) for 2 hours and amplified via PCR using MyTaq DNA polymerase (Bioline). Following amplification, 2 µg of DNA was sonicated in a Bioruptor Plus (Diagenode) and the DamID adaptors were removed by AlwI digestion. 500 ng of the resulting fragments was end-repaired with a mix of enzymes (T4 DNA polymerase (NEB) + Klenow Fragment (NEB) + T4 polynucleotide kinase (NEB)), A-tailed with Klenow 3’ to 5’ exo- (NEB), ligated to Illumina TruSeq LT adaptors using Quick Ligase enzyme (NEB) and amplified via PCR with NEBNext Hi-fidelity enzyme (NEB).
The resulting next-generation sequencing libraries were sequenced on a HiSeq2500 and reads were processed with damidseq_pipeline [11].
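DamID binding profiles are conventionally expressed as the log2 ratio of Dam-fusion signal over Dam-only signal, so that non-specific methylation by free Dam is factored out. The Python sketch below shows that calculation on per-fragment read counts. It is a simplified illustration and not the implementation used by damidseq_pipeline, which also handles alignment, binning and a more careful normalisation; the function name, the library-size scaling, the pseudocount and the example counts are all assumptions.

```python
import math


def log2_ratio(fusion_counts, dam_counts, pseudocount=1.0):
    """Per-fragment log2(Dam-fusion / Dam-only) after library-size scaling."""
    if len(fusion_counts) != len(dam_counts):
        raise ValueError("count vectors must have the same length")
    fusion_total = sum(fusion_counts)
    dam_total = sum(dam_counts)
    if fusion_total == 0 or dam_total == 0:
        raise ValueError("each sample needs at least one read")
    # Scale the Dam-only sample to the sequencing depth of the Dam-fusion sample.
    scale = fusion_total / dam_total
    return [
        math.log2((f + pseudocount) / (d * scale + pseudocount))
        for f, d in zip(fusion_counts, dam_counts)
    ]


# Hypothetical read counts over four GATC fragments.
fusion = [30, 10, 0, 60]   # e.g. a Dam-fusion sample
dam_only = [10, 10, 10, 70]  # e.g. a Dam-only control
ratios = log2_ratio(fusion, dam_only)
print([round(r, 2) for r in ratios])
```

Positive values indicate fragments enriched in the Dam-fusion sample relative to the Dam-only control; negative values indicate depletion.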
Results and Discussion
Design of the TaDaG and TaDaG2 vectors
The pTaDaG and pTaDaG2 plasmids were generated from pUASTattB using synthetic DNA. In designing the myrGFP insert, we combined Drosophila codon-optimised GFP(F64L,S65T,H231L) [5] with either the 85 aa myristoylation sequence from Src64B [5] (pTaDaG) or a 14 aa minimal p60src myristoylation sequence [7] (pTaDaG2) (Fig. 1A). Following the primary ORF, we incorporated a double stop codon / single base frameshift linker, as per the original TaDa vector (pUAST-attB-mCherry-NDam) [1], together with a StuI restriction enzyme site upstream of, and in frame with, Dam to allow easy generation of C-terminal Dam fusion proteins (Fig. 1B). Spacing between ORFs is reported to have little effect on translation rates of the secondary ORF [4], allowing the incorporation of the StuI site with no translational penalty. The MCS region is separated from Dam using the same Myc-tag+linker sequence as the original vector, allowing cloning compatibility between vectors (Fig. 1C).
TaDaG and TaDaG2 label GAL4-driven cell populations in vivo
In order to test the labelling capacity of the myrGFP primary ORFs in the pTaDaG and pTaDaG2 constructs, we crossed TaDaG-Dam and TaDaG2-Dam flies to the mushroom body neuron driver R13F02-GAL4 (Fig. 2). Both constructs exhibited clear and specific membrane-bound GFP labelling of mushroom body neurons that was detectable through native fluorescence (without antibody labelling) and was also visible under a fluorescent dissecting microscope (not shown).
The two myrGFP-labelled variants of the TaDa system allow simple verification of the expression pattern of Dam-fusion proteins under experimental conditions. The system also allows experimental crosses to be checked for correct GFP labelling during tissue collection.
The TaDaG system faithfully profiles Polycomb binding domains in neural stem cells
To determine whether the TaDaG system could generate cell-type-specific Targeted DamID profiles, we profiled Polycomb binding in neural stem cells (NSCs) using the NSC-specific driver worniu-GAL4, inducing expression for 24 hours in 30 brains (∼9000 total profiled neural stem cells) in early (96 hrs ALH) 3rd instar larvae (Fig. 3A). Two independent biological replicates were highly correlated (Pearson’s correlation between TaDaG replicates: 0.92), indicating excellent reproducibility even from very small sample sizes, and no loss of sensitivity when compared to the original TaDa system.
We compared the binding profiles to our previously published Polycomb binding data in NSCs obtained through the original TaDa system [2] (generated using a 16 hour induction timeframe rather than 24 hours in the current study). We observed a high correlation (minimum Pearson’s correlation: 0.77) between the TaDa and TaDaG-generated profiles (Fig. 3A,B) and clear concordant binding over canonical Polycomb foci (Fig. 3C), indicating that the TaDaG system functions indistinguishably from the original TaDa constructs.
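The Pearson correlations reported above compare binding scores fragment by fragment between two profiles. A minimal Python sketch of that calculation follows; the function name and the example values are hypothetical, and how the profiles were binned or filtered before correlation is not specified here, so this shows only the statistic itself.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log2 binding scores over the same fragments in two replicates.
replicate_1 = [1.2, -0.4, 0.8, -1.1, 2.0]
replicate_2 = [1.0, -0.2, 0.9, -1.3, 1.7]
print(round(pearson(replicate_1, replicate_2), 3))
```

Values close to 1 indicate that the two profiles rise and fall together across the genome, as reported for the replicates and for the comparison with the original TaDa data.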
Given that GAL4-driver expression patterns are dependent upon both the insertion site and the UAS promoter sequence [5], the ability to verify driver expression in situ is vital for determining the profiled cell population in Targeted DamID and interpreting subsequent binding data. We anticipate that these vectors will prove highly useful to the community.
Acknowledgments
We thank G. Jefferies for technical assistance. This work was supported by an NHMRC grant APP1128784 to OJM and an Ian Potter Foundation equipment grant (20190091) to OJM.