Abstract
Despite growing knowledge of the functions of individual human transcriptional effector domains, much less is understood about how multiple effector domains within the same protein combine to regulate gene expression. Here, we measure transcriptional activity for 8,400 effector domain combinations by recruiting them to reporter genes in human cells. In our assay, weak and moderate activation domains synergize to drive strong gene expression, while combining strong activators often results in weaker activation. In contrast, repressors combine linearly and produce full gene silencing, and repressor domains often overpower activation domains. We use this information to build a synthetic transcription factor whose function can be tuned between repression and activation independent of recruitment to target genes by using a small molecule drug. Altogether, we outline the basic principles of how effector domains combine to regulate gene expression and demonstrate their value in building precise and flexible synthetic biology tools.
Introduction
Transcription factors (TFs) and chromatin regulators (CRs) contain short effector domains that can act as repressors or activators when recruited at a target gene1–5. Site-specific recruitment assays of effector domains and full-length TFs at reporter genes have long been used to understand their effects on gene expression and develop better tools for gene regulation6–15.
Although recruitment assays have historically focused on recruiting only one transcriptional effector per cell, combinatorial function is a key property of both chromatin-mediated gene regulation16 and transcription factor-mediated gene regulation17–19. Transcription factors frequently contain multiple effector domains with potentially opposing functions, with studies reporting up to 40% of transcription factors having at least 2 distinct effector domains2,20. For example, the chromatin regulator MGA was recently shown to feature two repressive domains with different rates of silencing and amounts of memory21, while an earlier study showed that the transcription factor NIZP1 features an activating KRAB domain that is dominated by a repressive C2HR domain22. Combinations of effector domains have a long history in the context of synthetic biology, where the well-known transcriptional activators VP6423 and VPR24 were built via combining multiple known activation domains. Similarly, repressor combinations featuring the KRAB domain of ZNF10 and DNA methyltransferases have been shown to produce a robust combination of rapid gene silencing and long-term epigenetic memory9,25.
A systematic understanding of how combinations of transcriptional effector domains function in human cells would expand the range of compact tools available for epigenetic perturbations and therapy26,27. Additionally, composing effector domains can serve as a useful strategy to design synthetic TFs capable of implementing gene regulatory functions not achievable by fusing individual effectors to DNA-binding domains. These TFs could be used for many applications, including more efficient reprogramming of cell lineage specification or high-throughput screening of the noncoding genome28,29.
In order to systematically test combinations of effector domains, we need high-throughput methods: even testing all possible combinations resulting from pairing 100 effector domains requires 10,000 measurements. Recently, pooled screens have been developed for high-throughput characterization of individual transcription factors and effector domains in yeast30,31, Drosophila32, and human cells20,21,33,34. Arrayed high-throughput measurements of combinations of chromatin regulators in concert with VP16 have been performed in yeast to test 223 combinations7,14 and low-throughput measurements have been performed in human cells35. However, we are not aware of any systematic high-throughput studies to date in mammalian cells that measure how effector domains act in combination to regulate gene expression. As such, it remains unclear whether pairs of activator domains synergize when combined, whether pairs of repressor domains do the same, and how activator and repressor domains affect each other when recruited simultaneously. Here, we modify a recently developed pooled high-throughput method21 to test thousands of combinations of previously characterized protein domains that can activate and repress gene expression in human cells in order to start addressing these questions.
Developing a workflow for combinatorial screening of transcriptional effector domains
We began by selecting a panel of effector domains and controls from our previous screen of single domains21: 44 repressors, 30 activators, and 20 control domains (Fig. 1A, Materials & Methods, Supplementary Table 1). These effectors were chosen to span a wide range of individual activation and repression strengths when recruited at the same reporter gene (Fig. S1A). The control domains were chosen to be either random sequences or fragments of the DMD protein that were shown to have no effect on gene expression when recruited to a reporter individually21. To avoid problems with protein stability, we chose only well-expressed, stable domains as measured by FLAG-staining21. We generated combinations of these individual domains using a 2-step cloning strategy to build a pool of domain-linker-domain concatenations that were then cloned into a lentiviral backbone vector as fusions to the reverse TetR (rTetR) inducible DNA binding domain (Fig. 1A, Fig. S1B, Materials & Methods). We delivered this pool into K562 reporter cells using lentivirus with a low multiplicity of infection (MOI=0.24), such that most cells expressed a single library element. We measured the ability of each pair in this library to activate or repress a reporter gene using a high-throughput pooled method we recently developed: HT-recruit21. Briefly, by adding doxycycline (dox) we recruited the rTetR-concatenated fusions to either a minimal promoter to measure activation, or to a highly expressed constitutive promoter to measure repression (Fig. 1B). We separated cells into populations with high (ON) and low (OFF) reporter gene expression using magnetic separation (Materials & Methods, Fig. 1C, Fig. S1C), and computed the relative enrichment of individual effector combinations in each population (Fig. 1D). We recovered the large majority of concatenations that were present in our combinatorial library with sufficient sequencing depth (Fig. S1D), and found good correlation across replicates for both our activation (Fig. 1E, Pearson R=0.803, p<2.2×10⁻¹⁶) and repression (Fig. 1F, Pearson R=0.795, p<2.2×10⁻¹⁶) screens. To identify effector combinations for which we could reliably measure activity, we used the distribution of negative control-only combinations: we considered a pair to be activating and/or repressing if it had a score at least 2 standard deviations away from the mean of the negative control combinations.
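The enrichment score and the detection threshold described above can be illustrated with a minimal sketch. The function names, the pseudocount, the two-sided test, and the example control scores are assumptions made for illustration; the actual normalization and direction of the cutoff are described in the Materials & Methods and may differ.

```python
import math
import statistics

def log2_on_off(on_reads, off_reads, pseudocount=1.0):
    """Enrichment score: log2 ratio of ON to OFF read counts.

    The pseudocount is an assumption added here to avoid division by zero;
    normalization for sequencing depth is omitted for brevity.
    """
    return math.log2((on_reads + pseudocount) / (off_reads + pseudocount))

def hit_thresholds(control_scores, n_sd=2.0):
    """Return (low, high) cutoffs n_sd standard deviations from the control mean."""
    mean = statistics.mean(control_scores)
    sd = statistics.stdev(control_scores)
    return mean - n_sd * sd, mean + n_sd * sd

def is_hit(score, control_scores, n_sd=2.0):
    """True if the score lies at least n_sd standard deviations from the control mean."""
    low, high = hit_thresholds(control_scores, n_sd)
    return score <= low or score >= high

# Hypothetical control-only scores, for illustration only.
controls = [-0.2, 0.1, 0.0, 0.3, -0.1]
print(hit_thresholds(controls))
print(is_hit(2.0, controls), is_hit(0.1, controls))
```

In this sketch a pair is flagged when its score falls outside the band defined by the negative control-only combinations, so the stringency of the screen can be adjusted through the single `n_sd` parameter.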
To validate these high-throughput measurements and our detection threshold, we measured activation for 61 domain concatenations and repression for 38 domain concatenations by recruiting them individually to our synthetic reporters and measuring gene expression via flow cytometry. We measured the fraction of cells activated at the weak minCMV promoter after 2 days of recruitment and found strong correlations between our screen measurements and the fraction of cells activated (Fig. 1G, R²=0.88, p=7.4×10⁻⁵⁰). We found a similarly strong correlation between repressive screen scores and the fraction of cells silenced after 5 days of recruitment to the strong pEF promoter (Fig. 1H, R²=0.84, p=4.6×10⁻¹³). We concluded that our screen accurately measured these concatenations’ ability to either activate or repress gene expression.
We began our analysis of the screen data by examining the behavior of effector domains in our library in combination with the negative control domains. We identified 4 out of 20 negative control domains that significantly affected either activation or repression scores across all effector domains and filtered them from downstream analyses (Fig. S1E-F). We then sought to determine if the order of the domains within concatenations altered their effect on gene expression. While prior literature has demonstrated that the ordering of effector domains in synthetic TFs can significantly affect the function of fusions24,25, we found that few of the domains in our library changed effect size significantly when switched from the first to the second position across all of their fusions with negative controls (Fig. S1G). We excluded concatenations featuring any domain in the orientation that ablated its function from downstream analysis (Fig. S1H).
To check to what extent effectors maintained their function when fused to a control sequence, we examined the distribution of scores for each effector domain when paired with all negative controls (Fig. 1I). While for some domains all scores clustered together in one quadrant (e.g., the strong KRAB repressor domain from ZNF10, Fig. 1I, blue), we found that other domains dropped under the detection threshold when paired with certain negative controls (e.g., weak CRTC2 activation domain, Fig. 1I, yellow). Some domains, such as the FOXO3 activation domain, have been previously reported to act as dual effectors that both activate a minimal reporter and repress a constitutive one when recruited individually20. We indeed found that most FOXO3-control fusions acted as dual activator-repressors, though a minority of fusions only acted as activators (Fig. 1I, green). In order to classify each domain, we calculated the number of effector-control pairs that met our hit threshold for activation and for repression (Fig. 1J). We used these data, along with the magnitude of activation and repression scores for each effector when paired with negative controls, to label effectors as activators, repressors, dual-functional effectors (both activators and repressors), or non-hits (Fig. 1K, Materials & Methods). We found that the median activation or repression score of each effector domain when paired with negative control domains correlated well with prior screen measurements of activation (Fig. 1L, Pearson R=0.81, p<2.2×10⁻¹⁶) and repression (Fig. 1M, Pearson R=0.85, p<2.2×10⁻¹⁶). Convinced that we could appropriately characterize how individual effectors behave when paired with negative controls, we proceeded to analyze the behavior of effector domain combinations with each other.
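The labeling step above can be sketched as a simple rule on the fraction of effector-control pairs that pass the hit threshold in each screen. The majority cutoff (`min_fraction`) and the function name are hypothetical; the classification in the Materials & Methods also uses score magnitudes, which this sketch leaves out.

```python
def classify_effector(activation_hits, repression_hits, min_fraction=0.5):
    """Label an effector from its control-paired hit calls.

    activation_hits / repression_hits: lists of booleans, one entry per
    effector-control pair, True when that pair passed the hit threshold.
    min_fraction is an assumed cutoff, not a value taken from the paper.
    """
    frac_act = sum(activation_hits) / len(activation_hits)
    frac_rep = sum(repression_hits) / len(repression_hits)
    activates = frac_act >= min_fraction
    represses = frac_rep >= min_fraction
    if activates and represses:
        return "dual"
    if activates:
        return "activator"
    if represses:
        return "repressor"
    return "non-hit"

# Hypothetical hit calls across 16 retained control domains.
print(classify_effector([True] * 14 + [False] * 2, [True] * 12 + [False] * 4))
print(classify_effector([False] * 16, [True] * 16))
```

Counting hits across many control partners, rather than relying on a single fusion, makes the label robust to the occasional control that pushes a weak effector under the detection threshold.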
Weak activator domain pairs synergistically drive gene expression from weak promoters
The high-throughput measurements for transcriptional activation identified a large number of activator-activator pairs whose activation scores exceeded the sum of each individual activator’s scores when paired with negative controls (Fig. 2A, left). We proceeded to validate individual examples of activator pairs to verify that the scores in the screen correlate well with individual flow cytometry measurements (Fig. S2A-B, Fig. 1G), and that synergy could be replicated in low-throughput (Fig. 2B top). Notable examples of synergy included the pair of ANM2’s SH3 domain and KIBRA’s WW-1 domain, as well as ANM2’s SH3 domain and NOTC2’s LNR-2 domain. While the full role of the ANM2 SH3 domain is not yet clear36, it is known to regulate PRMT1 activity in a methylation-dependent fashion37, is required for the function of the actin nucleator Cobl38, and modulates alternative splicing of BCL-X39. KIBRA’s WW-1 domain binds PPxY motifs in other proteins40, and is essential for KIBRA-mediated regulation of Hippo signaling via interactions with LATS1/241, while NOTC2’s second Lin-12/Notch repeat (LNR) domain sits within its negative regulatory region42 and is cleaved off during ligand binding43.
Surprisingly, we found that the strongest activators at minCMV, which were generally dual-functional effectors that also repressed gene expression at pEF, resulted in lower activation scores at minCMV when paired together than when either activator was paired with negative controls (Fig. 2A, upper right quadrant). To verify that the antagonism of dual-dual effector pairs was real and not simply a result of promoter saturation, we individually tested the MYBA-ZN473 pair and found that, indeed, the combination did not activate as many cells as either domain when paired with a negative control (Fig. 2B bottom). Using the previously computed functions connecting screen scores with individual flow cytometry measurements of gene activation (Fig. 1G), we compared the estimated fraction of cells activated for each combination of activators with the sum of each activator’s control-paired estimated fraction activated. We found that activator-activator pairs tended to synergize, activator-dual pairs were additive, and dual-dual pairs acted antagonistically with low activation scores (Fig. 2A, Fig. 2C top). We computed an estimated quantity of synergy for each combination by taking the difference between that combination’s estimated fraction on and the sum of its individual domains’ fractions on. Doing so, we found that activator-activator pairs tended to feature more synergy than activator-dual pairs, which in turn tended to feature more synergy than dual-dual pairs (Fig. 2C bottom).
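The synergy estimate described above is a difference between the fraction of cells ON for a combination and the sum of the control-paired fractions of its two domains. A minimal sketch follows; the `tolerance` band used to call a pair additive and the example fractions are assumptions for illustration, and the conversion from screen score to fraction ON (the calibration in Fig. 1G) is taken as already done.

```python
def synergy(pair_fraction_on, fraction_on_a, fraction_on_b):
    """Estimated synergy: pair fraction ON minus the sum of the individual fractions."""
    return pair_fraction_on - (fraction_on_a + fraction_on_b)

def interaction_type(pair_fraction_on, fraction_on_a, fraction_on_b, tolerance=0.05):
    """Label a pair by the sign of its synergy; tolerance is an assumed noise band."""
    s = synergy(pair_fraction_on, fraction_on_a, fraction_on_b)
    if s > tolerance:
        return "synergistic"
    if s < -tolerance:
        return "antagonistic"
    return "additive"

# Hypothetical fractions of cells ON, for illustration only.
print(interaction_type(0.80, 0.10, 0.15))  # two weak activators
print(interaction_type(0.62, 0.10, 0.50))  # activator with a dual effector
print(interaction_type(0.30, 0.70, 0.75))  # two strong dual effectors
```

Because the sum of two individual fractions can exceed 1, strong pairs are expected to score as sub-additive under this definition, which is why the individual validation of a dual-dual pair matters for distinguishing true antagonism from saturation.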
For each activator or dual-functional domain we then tracked the activity of all combinations containing that domain (including combinations with repressors and controls) and examined how the strength of the combination varied with the strength of the partner’s activation score. We fit these data using a sigmoidal Hill function (Fig. 2D, Fig. S2C). We found that the strength of the partner domain at the half-maximal point of the Hill function decreased as the activation domain’s strength increased, meaning that strong activators were able to reach half-maximal activation with less help from their partner domain than weak activators (Fig. 2E). We also found that the Hill coefficient of these functions went up as the activation domain’s strength increased, indicating that the increase from minimal to substantial gene activation happened more rapidly in stronger activation domains as the strength of their partners increased (Fig. 2F).
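To make the two fitted quantities concrete, the sketch below writes out one common form of the Hill function and shows how the half-maximal point and the Hill coefficient shape the response. The parameterization, the requirement that partner strength be on a non-negative scale, and all numbers are assumptions; the fitting procedure used for Fig. 2D is not reproduced here.

```python
def hill(partner_strength, half_max, coefficient, bottom=0.0, top=1.0):
    """Sigmoidal Hill response to partner strength.

    half_max: partner strength at which the response is halfway between
              bottom and top.
    coefficient: Hill coefficient; larger values give a sharper transition.
    Partner strength is assumed to be shifted onto a non-negative scale.
    """
    if partner_strength <= 0:
        return bottom
    x_n = partner_strength ** coefficient
    k_n = half_max ** coefficient
    return bottom + (top - bottom) * x_n / (k_n + x_n)

# At the half-maximal point the response is midway between bottom and top.
print(hill(2.0, half_max=2.0, coefficient=3.0))

# A higher Hill coefficient concentrates the rise around the half-maximal point.
for n in (1.0, 4.0):
    rise = hill(4.0, 2.0, n) - hill(1.0, 2.0, n)
    print(n, round(rise, 3))
```

Under this form, a strong activator corresponds to a small `half_max` (little help needed from the partner) and a large `coefficient` (a rapid switch from minimal to substantial activation), matching the trends reported in Fig. 2E-F.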
Across all individual validations, we found a strong correlation between the fraction of cells activated and the mean fluorescence intensity (MFI) of activated cells for our effector domain pairs (Fig. 2G). This correlation and the fluorescence distributions of reporter expression (Fig. 2C, Fig. S2A) are consistent with activation domains modulating transcriptional bursting kinetics at the minCMV promoter: either increasing burst frequency or burst magnitude44–48. Since the degradation rates of our reporter mRNA and protein are slow, taking multiple days to dilute out10,21, reporter molecules in our cells persist long after the cessation of a transcriptional burst. As such, it would be difficult to determine whether reporter protein molecules were made all at once in one big burst or bit by bit (in many small bursts) over a longer period of time. Thus, in this regime of slow degradation rates, increasing either the duration of a transcriptional burst (burst size) or the frequency of such bursts (burst frequency) would lead to both a higher fraction of cells measured to be on and a greater MFI of those on cells49. Our findings are consistent with recent work showing that changes in burst kinetics can combine to produce transcriptional synergy50.
Repressor domain combinations generate robust promoter silencing
Our measurements of transcriptional repression suggested a linear relationship between the repressive strength of each domain and the repressive strength of the pair (Fig. 3A, points on the diagonal in the left bottom quadrant). In this case, the points that are above the diagonal come from saturation of repression at our reporter; effector pairs with scores ≤-3 are predicted to silence all cells according to our validation curve (Fig. 1H, Fig. S3A). Indeed, when we tested a number of domain pairs in low throughput, we found that pairs of strong repressors that each fully repressed the reporter when paired with controls (e.g. ZNF10 and CBX1) also repressed 100% of the cells when recruited together (Fig. 3B).
Using the best-fit curve for low-throughput validations mapping the repression log2(ON:OFF) scores to the fraction of cells silenced at day 5 (Fig. 1H), we estimated the fraction of cells silenced at day 5 for each repressor or dual effector in our screen based on its median control-paired repression log2(ON:OFF) score. We found that while the strength of effectors that could repress gene expression spanned a wide range, the majority of such domains were estimated to silence ≥75% of cells after 5 days of recruitment (Fig. 3C). As a result, when comparing the estimated fraction of cells silenced at day 5 for a combination versus the sum of the individual domains’ respective fractions, we found that most concatenations were expected to and did silence virtually all cells after 5 days, with stronger repressors more consistently producing more repressive combinations than weaker repressors (Fig. 3D, Fig. S3B-D).
Our high-throughput measurements indicated that the repression log2(ON:OFF) score of a repressor domain pair was a linear function of the two individual domains, as opposed to the nonlinear and sigmoidal behavior of activation domain pairs. To determine if this feature held true for individual repressor domains, we plotted the strength of every combination featuring each domain versus the strength of the domain’s partner in that combination. We found that the trendline for repressors was linear, with lower slope for stronger repressors (Fig. 3E, Fig. S3E). For each repressor domain, we determined the slope and y-intercept of its corresponding trendline, and found that stronger repressors featured both a flatter slope closer to 0 (Fig. S3F, Pearson R=0.87, p<2.2×10⁻¹⁶) and a lower y-intercept (Fig. S3G, Pearson R=0.92, p<2.2×10⁻¹⁶). These results suggest that weak repressors are tunable in that the strength of a pair featuring a weak repressor can vary over a wide range depending on the strength of the partner; however, pairs featuring a strong repressor will generally only act as strong repressors.
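The per-domain trendline analysis above amounts to an ordinary least-squares fit of pair score against partner score. The sketch below shows one way to obtain the slope and y-intercept; the function name and the example scores are hypothetical and chosen only to mirror the reported trend.

```python
def fit_line(partner_scores, pair_scores):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(partner_scores)
    mean_x = sum(partner_scores) / n
    mean_y = sum(pair_scores) / n
    s_xx = sum((x - mean_x) ** 2 for x in partner_scores)
    if s_xx == 0:
        raise ValueError("partner scores must not all be identical")
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(partner_scores, pair_scores))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x

# Hypothetical repression log2(ON:OFF) scores, for illustration only.
partners = [-4.0, -2.0, 0.0]
weak_repressor_pairs = [-4.5, -2.5, -0.5]    # pair strength tracks the partner
strong_repressor_pairs = [-5.2, -5.0, -4.8]  # pair is strong regardless of partner

print(fit_line(partners, weak_repressor_pairs))
print(fit_line(partners, strong_repressor_pairs))
```

In this illustration the weak repressor has a slope near 1 and an intercept near 0, whereas the strong repressor has a slope near 0 and a strongly negative intercept, which is the pattern summarized in Fig. S3F-G.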
When examining the fluorescence distributions of our reporter throughout silencing, we found that most domain combinations silenced either all or none of the cells by day 5 (Fig. 3F, top 2 rows), as expected from previous experiments on a small number of chromatin regulators10. However, a minority of combinations either silenced only a fraction of cells or reduced gene expression without fully silencing cells (Fig. 3F, bottom 2 rows). We attempted to see if these results were consistent with prior models of stochastic gene silencing by extending a mathematical model of gene expression we have used before to understand chromatin regulation49. We did so by incorporating parameters for background silenced cells at the beginning of the timecourse, basal gene expression from the pEF promoter, lag time prior to silencing, the rate of protein decay upon silencer recruitment, and the rates of gene silencing and reactivation (Fig. S4A, Materials & Methods). After fitting this model to our experimental data, we found that it was able to accurately represent the dynamics of gene silencing for both strong silencers such as ZNF10-CBX1 (Fig. S4B) and weaker silencers such as SMCA2-U2AF4 that only silenced the reporter in a fraction of cells (Fig. S4C). For these types of domains, we extracted the rates of silencing predicted by this extended telegraph model for the individual validations where the model was able to accurately fit the data, and found that they correlated well with both the screen data (Fig. S4D, R²=0.73, p=1.20×10⁻⁹) and the fraction of cells silenced at day 5 (Fig. S4E, R²=0.79, p=2.94×10⁻¹¹). Silencing rates predicted by the model suggested that the rate of silencing of a combination may be linear in the sum of the silencing rates of the individual domains (Fig. S4F), although the correlation between screen data and silencing rates was not strong enough for a more definitive conclusion.
However, we found that our model of all-or-none gene silencing could not capture the dynamics of silencers that reduced but did not fully ablate gene expression by day 5, such as DPY30-HXA13 (Fig. S4G), which featured a domain from DPY30 that worked as an activator on its own in prior screens and a repressor homeodomain from HXA1321. For these silencers, their effect on gene expression is better explained by a decrease in the production rate in the active state, rather than transitioning to a fully silent state. Altogether, these results suggest that while strong silencers in combinations will rapidly and fully silence gene expression in most cases, weaker repressors can produce more complex dynamic patterns of gene expression.
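The all-or-none picture of silencing can be illustrated with a reduced two-state sketch that keeps only the background silenced fraction, the lag time, and the rates of silencing and reactivation. This is a simplification written for illustration, not the extended model from the Materials & Methods: basal expression and reporter protein decay are omitted, and all parameter names and values are assumptions.

```python
import math

def fraction_silenced(t_days, k_silence, k_reactivate, lag_days=0.0, background=0.0):
    """Fraction of cells in the silent state after t_days of recruitment.

    After the lag, the active fraction A obeys
        dA/dt = -k_silence * A + k_reactivate * (1 - A),
    starting from A = 1 - background.
    """
    active_start = 1.0 - background
    if t_days <= lag_days:
        return 1.0 - active_start
    total_rate = k_silence + k_reactivate
    if total_rate == 0:
        return 1.0 - active_start
    active_steady = k_reactivate / total_rate
    decay = math.exp(-total_rate * (t_days - lag_days))
    active = active_steady + (active_start - active_steady) * decay
    return 1.0 - active

# Hypothetical rates (per day): a strong silencer and a partial silencer.
print(fraction_silenced(5, k_silence=2.0, k_reactivate=0.0, lag_days=0.5, background=0.02))
print(fraction_silenced(5, k_silence=0.3, k_reactivate=0.3, lag_days=0.5, background=0.02))
```

In this sketch a silencer with appreciable reactivation plateaus at a partial silenced fraction, whereas a silencer that lowers expression in every cell without switching cells OFF cannot be represented at all, consistent with the need for a reduced production rate in the active state.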
High-throughput profiling of activator-repressor interactions
To identify principles underlying activator-repressor interactions, we looked at all pairs featuring an activator domain in combination with a repressor domain and determined if that pair functioned as an activator only pair, repressor only pair, dual-functional pair, or non-hit pair (Fig. 4A, Fig. S5A). We found that almost all pairs featuring moderate to strong repressors tended to behave exclusively as repressors, reducing gene expression at the strong pEF promoter while failing to drive transcription at the weak minCMV promoter (blue right region, Fig. 4A). Domains that were dual-functional when paired with negative controls (e.g., MYBA, FOXO3, and SERTAD2) tended to be dual-functional in combinations with either weak repressors or other dual effectors (green region, Fig. 4A), although they were dominated by strong repressors (e.g. ZNF10).
For most domains, their activator-repressor and repressor-activator orientations had similar strengths for both activation (Fig. S5B, Pearson R=0.82, p<2.2×10⁻¹⁶) and repression (Fig. S5C, Pearson R=0.79, p<2.2×10⁻¹⁶), indicating that these results were not the result of position-specific effects of the protein domains. Activator-repressor combinations with stronger activators were significantly more likely to be able to activate the minimal promoter than combinations with weaker activators (Fig. 4B, Pearson R=0.79, p=1.6×10⁻⁴). For repressors, this relationship was significantly less pronounced: weak repressors were only slightly less effective than strong repressors when paired with activator domains (Fig. 4C, Pearson R=-0.52, p=1.6×10⁻⁴). Testing pairs in low throughput validated that the combination of the weak repressive SH3 domain from BIN1 and FOXO3’s TAD could both activate gene expression at the weak minCMV promoter (Fig. 4D) and repress gene expression when recruited to the strong pEF promoter (Fig. 4E). As expected, adding the weak repressor domain from BIN1 decreased activation and increased repression compared to FOXO3 alone (Fig. S2A, S3A).
While in general we found that strong repressors overpowered virtually any non-repressor effector domain they were paired with, we did find a few examples of activators that could prevent dual-functional domains from repressing strong promoters (Fig. 4A). These activators include the activator KRAB from ZN597, a variant KRAB domain that functions as an activator; the SH3 domain from ANM2 previously described in this work; and the activating SH3 domain from BTK. While BTK’s SH3 domain is known to regulate BTK kinase activity upon autophosphorylation51, and while BTK can translocate to the nucleus52, we are unaware of a previously described role for the SH3 domain in regulating gene expression of BTK targets.
We were excited to find a number of non-hit domains that could not alter gene expression when paired with negative controls but did affect the function of other effector domains (gray labels, Fig. 4A, Fig. S5A). For example, we found that the aforementioned domain from DPY30, a core subunit of the SET1/MLL methyltransferase complex that interacts with ASH2L to establish H3K4me353, was able to ablate repressor function when paired with not only dual-functional domains but even weak to moderate strength repressors. Additionally, the N-terminal domain of DPF1, which links the NF-κB RelA/p52 heterodimer with SWI/SNF complex subunits to drive transcription54, was able to prevent repressor function from strong repressors, such as HERC2’s Cyt-b5 domain. We also found non-hit domains that prevented activation when paired with other effectors, including a tile from the N-terminal disordered region of DNMT3B thought to be part of a broader region mediating interactions with the methyltransferases DNMT1 and DNMT3A55. The C2HR domain from the zinc finger ZNF496, which has been shown to overpower the variant activator KRAB present on the same transcription factor22, was similarly able to prevent all activators and all but the strongest dual-functional domains from driving gene expression. Altogether, these effectors did not themselves activate or repress transcription when paired with negative controls, but did modulate the activity of partner effectors on the same molecule in a manner consistent with the function of their native proteins.
We wondered how the distribution of effector combinations along both repression and activation log2(ON:OFF) scores varied between repressor-dual, activator-dual, and repressor-activator combinations. We found that while most repressor-dual combinations functioned as pure repressors, some dual-functional domains were able to combine with repressors to produce overall dual-functional combinations (Fig. 4F). In contrast, combining activators with repressors produced a larger number of effector pairs that neither activated the weak promoter nor repressed the strong promoter, with relatively fewer dual-functional combinations (Fig. 4G). Pairing pure activators with dual-functional domains produced combinations that mostly acted as dual activator-repressors, with a smaller fraction of combinations acting as pure activators without maintaining the repressor effect of the dual-functional domain (Fig. 4H).
Systematic characterization of domains that influence KRAB-mediated repression
Our screen data so far indicated that virtually any concatenation including the ZNF10 KRAB domain functioned as a strong repressor (~180 KRAB-containing pairs). We were interested in investigating this KRAB domain in more detail, as it is widely used in the well-known CRISPRi system56, has been harnessed in conjunction with DNA methyltransferases to produce more durable epigenetic silencing9,25, and has been deeply characterized21. We decided to test whether this pattern of KRAB dominance held when using a larger panel of partner domains, and generated a lentiviral library encoding a set of concatenations fusing ZNF10 KRAB with a library of ~5000 80-amino acid protein sequences consisting of Pfam-annotated domains from nuclear-localized proteins and a set of negative controls21. We transduced this library into K562 cells expressing a reporter gene driven by the strong pEF promoter (Fig. 5A) and recruited the concatenations to the reporter gene for 5 days by adding doxycycline. We used a lower doxycycline concentration than in our previous screens (100 ng/mL vs 1000 ng/mL) to slow down KRAB silencing (Fig. S6A) and thus allow for a wider dynamic range for measuring both decrease and increase of function. We measured the enrichment of domain pairs in the ON versus OFF populations (Fig. S6B) at day 2 of recruitment to determine the speed of silencing; we opted not to take a measurement on day 5 at the end of recruitment, as the population was virtually entirely silent (Fig. 5B).
On day 2, the majority of pairs featuring well-expressed domains (as measured by FLAG staining before21) scored similarly to concatenations of KRAB with a negative control domain (Fig. 5C-D), consistent with the flow cytometry results showing that the majority of the cells were silenced (Fig. 5B). This matched our expectations from the prior screens, where concatenations featuring KRAB silenced similarly to KRAB on its own or with negative control domains. In this case, the negative controls consisted of a large set of random sequences and DMD fragments21, and contained both well-expressed and poorly expressed proteins. We found that a number of domains that were poorly expressed on their own ablated KRAB function when paired with it (Fig. 5C-D). For example, we verified that the poorly expressed DHX16 OB_NTP and BAZ1A DDT domains inhibited KRAB function at 100 ng/mL dox (Fig. 5E, top). Interestingly, increasing the dox concentration to 1000 ng/mL permitted some silencing for KRAB-DHX16 and full silencing for KRAB-BAZ1A (Fig. 5E, bottom), consistent with the loss of function of these KRAB fusions coming from decreased protein abundance.
We also observed loss of KRAB function when fused to certain well-expressed domains from proteins that are part of the basic transcriptional machinery, namely the 2nd WD40 domain from TAF5L and the fork domain of RPB2. Consistent with the high-throughput measurements, individual validations at 100 ng/mL dox show a complete loss of KRAB silencing for the fusions with these domains (Fig. 5E, top row). The annotated RPB2 domain is smaller than the 80 aa sequence we used in our screen, and the 64 aa trimmed version had a lower capability of opposing KRAB, allowing for more silencing (Fig. S6E). Some amount of silencing was also restored at saturating dox concentrations for both KRAB-TAF5L and KRAB-RPB2 (Fig. 5E, bottom row), showing that KRAB can still dominate if enough of it is recruited at the locus. These results were corroborated by an overall increase in the rate of gene silencing when KRAB concatenations were recruited at 1000 ng/mL dox as compared to recruitment at 100 ng/mL dox (Fig. S6F, p=0.0239, paired t-test).
In the high-throughput measurements we also identified a small number of domains that increased KRAB silencing (Fig. 5C, bottom left). We validated that the library tile containing the homeodomain from GSX2 increases KRAB silencing at day 2 when recruited at 100 and 1000 ng/mL doxycycline (Fig. 5E, left), consistent with its behavior as a repressor when recruited on its own in our previous Pfam screen21. Interestingly, the trimmed version that contains only the annotated homeodomain does not enhance KRAB silencing (Fig. S6C). In contrast, the NHR2 domain from MTG8R, on its own a slightly weaker repressor than GSX2’s homeodomain in our prior screens, was unable to modify KRAB-mediated silencing at both 100 ng/mL and 1000 ng/mL dox (Fig. 5E).
Composing effector domains to generate multifunctional synthetic transcription factors
We wanted to take advantage of the fact that the KRAB repressor is dominant over activators to build a more versatile transcription factor that can switch between repressor and activator based on addition or removal of a second drug. We engineered a version of our rTetR-FOXO3-ZNF10 KRAB concatenation plasmid where the two effector domains were separated by a StaPL domain, which cleaves itself in the absence of asunaprevir (ASV)57 (Fig. 6A). Thus, in the absence of ASV, the KRAB domain would be cleaved, leaving behind the dual-functional rTetR-FOXO3, while upon addition of the ASV inhibitor, the KRAB domain would act as a dominant repressor (Fig. 6B).
We first verified that the construct worked as expected when driving gene activation at the minCMV promoter (Fig. 6C) and gene repression at the pEF promoter (Fig. 6D, Fig. S7A) at maximum recruitment with a saturating dose of dox (1000 ng/mL). At minCMV, the rTetR-FOXO3-StaPL-KRAB drove strong gene activation in the absence of ASV (Fig. 6C, bottom right green) to a similar level as rTetR-FOXO3 only (Fig. 6C, top right). At the same promoter, addition of ASV reduced activation by the rTetR-FOXO3-StaPL-KRAB to a minimum (Fig. 6C, bottom right blue), comparable to rTetR-KRAB recruitment, and consistent with KRAB dominating over FOXO3. At pEF, the rTetR-FOXO3-StaPL-KRAB repressed virtually all cells upon ASV addition (Fig. 6D, bottom right blue), the same as rTetR-KRAB alone. Without ASV, this fusion still repressed 50% of the cells (Fig. 6D, bottom right green), consistent with rTetR-FOXO3 alone being able to silence pEF in about 54-57% of cells (Fig. 6D, top right). As expected, the negative controls rTetR alone and rTetR-StaPL did not change gene expression at either promoter (Fig. 6C-D). Altogether, the construct behaved as expected: adding ASV changed its behavior from FOXO3-like to KRAB-like.
The ability to toggle the FOXO3-StaPL-KRAB TF from activator to repressor and then back to activator again allowed us to test whether the activator behaves differently at a promoter before and after KRAB-induced repression. In order to do this, we added dox to cells containing this fusion for a period of 20 days to induce recruitment, and within this interval we varied ASV to toggle between FOXO3 (-ASV) and KRAB dominant (+ASV) (Fig. 6E-F). At the minCMV reporter, we saw rapid gene activation upon addition of doxycycline that was lost when ASV was added and KRAB recruited to the promoter (Fig. 6E). Removal of ASV, leading to recruitment of FOXO3 alone at this KRAB-silenced minCMV promoter produced much slower gene activation than at the unsilenced minCMV promoter (Fig. 6E, days 0-2 vs. 7-20). The slow reactivation upon ASV washout suggests that while FOXO3 may be able to drive gene expression from a minimal promoter at a permissive locus, it is less efficient at reactivating that promoter after it is silenced with potentially methylated and/or compacted chromatin. At the pEF promoter (Fig. 6F), we observed a slow reduction in gene expression without ASV during days 0-2, consistent with the dual FOXO3 acting as a weak repressor at pEF. The percentage of cells silenced rapidly increased upon the addition of ASV and recruitment of KRAB culminating in complete silencing by day 7. After removal of ASV, we saw slow but measurable reactivation that increased dramatically upon doxycycline removal (Fig. 6F, days 7-20 vs. 20-27), suggesting that while FOXO3 can behave as an activator and drive gene expression at minCMV, its repressive capacity may inhibit proper reactivation of the pEF promoter.
In order to characterize the range of behaviors achievable with this inducible synthetic TF, we varied the dosing of both ASV and doxycycline while recruiting the rTetR-FOXO3-StaPL-KRAB construct at both promoters. At 100 ng/mL dox, increasing doses of ASV led to decreased gene activation at the minCMV promoter (Fig. 6G left, S7B), consistent with the expectation that increasing doses of ASV lead to an increase in the species containing KRAB relative to FOXO3 only. We also found that increasing the dose of dox at 0 ASV led to increasing gene activation at the minCMV promoter, consistent with a model in which the absence of ASV leads to minimal KRAB recruitment (Fig. 6G right, Fig. S7B). Low to moderate doses of ASV, 0.01uM and 0.1uM, produced intermediate profiles between FOXO3 and KRAB (Fig. S7B-C). Altogether, these results suggest a model where increasing the dose of ASV titrates between FOXO3-like and KRAB-like behavior, while increasing the dose of dox increases the degree of gene activation and/or repression.
We wished to understand what new profiles of gene expression the combined FOXO3-StaPL-KRAB transcription factor could generate that neither FOXO3 nor KRAB could produce on their own. For each experiment performed with the transcription factor, we plotted for each timepoint the fraction of cells actively expressing the pEF reporter versus the fraction expressing the minCMV promoter (Fig. 6I). We found a large number of states that the synthetic inducible TF was able to access at intermediate ASV doses (Fig. 6I, purple) that could not be achieved at 0uM ASV (FOXO3-like behavior) or 1uM ASV (KRAB-like behavior). These states were principally composed of situations where both minCMV and pEF were expressed in only a fraction of cells; FOXO3-like behavior mainly produced high levels of cells with minCMV on (Fig. 6I, green), while KRAB-like behavior did not permit minCMV expression at all (Fig. 6I, blue). In conclusion, we found that composing FOXO3 and KRAB in this manner and varying the dose of ASV produced distinct profiles of gene expression not achievable with either effector domain individually.
Discussion
Despite considerable efforts to parse the combinatorial logic of TF binding and gene regulation19,58–60, our understanding of how multiple distinct functional transcriptional effector domains work together within a single TF remains limited. Improved characterization of the combined function of effector domains is important for understanding the function of natural TFs featuring multiple effector domains2, for building complex synthetic biology tools to manipulate gene expression61, and for building cell therapies to detect and treat diseases26,62–65. Here, we present the results of screening thousands of effector domain combinations, another step towards uncovering the basic principles of combinatorial gene regulation.
We found that weak activators can synergize to drive robust gene expression even when the individual activators being paired were not particularly strong. This is in agreement with previous results showing that multiple activation domains can act synergistically in yeast when recruited at a synthetic reporter7,14, and in human cells when recruited at reporters or endogenous genes using dCas924,66. At the other extreme, we found that some of the strongest activators, which in our system acted as dual-functional domains that could also repress a constitutive promoter, were often antagonistic: when paired with each other, they produced less gene activation from a minimal promoter than when recruited on their own. Antagonism between transcriptional activators has been reported in other contexts: TFs have been shown to interfere with each other’s ability to activate genes via direct interactions with each other67 or via squelching mechanisms involving the sequestration of coactivator proteins68. However, our findings suggest a more general negative feedback mechanism triggered by high levels of activators at a promoter; more thorough confirmation of this phenomenon and subsequent investigations into its molecular underpinnings are still needed.
Our activation data showed a tight coupling between the fraction of actively transcribing cells and the degree of transcription in those cells, consistent with manipulation of transcriptional bursting dynamics49,69. Recently, certain TFs were shown to increase only burst size or only burst frequency, depending on their molecular mechanism of action70. While our long-lived protein reporter does not allow us to differentiate between changes in burst size or frequency for our combinations, it would be interesting to repeat these high-throughput measurements coupled with RNA FISH or a destabilized reporter that allows extraction of burst parameters.
We found that repressors generally overpowered activators, although select activator domains could weaken the repressor function of dual-functional or weakly repressive domains. We confirmed this result more thoroughly via an in-depth screen measuring KRAB function when paired with a panel of 5,000 domains from nuclearly localized proteins, and found that KRAB maintained its repressive function when paired with most domains except ones that had decreased expression and a few domains from general TAFs and RNA polymerase. We used these insights to build a synthetic transcription factor whose function could be switched from a dual-functional FOXO3-like profile (that activates a minimal promoter and weakly represses a constitutive one) to a repressive KRAB-like profile via the addition of the small molecule drug ASV. We used this switchable TF to show that both the activation and repression functions of the FOXO3 effector domain changed when the target promoter was first silenced by KRAB. While activation from a minimal promoter was impaired by prior silencing as expected, recruitment of FOXO3 increased KRAB-mediated epigenetic memory and reduced reporter gene reactivation at the pEF promoter. Importantly, since ASV tuned the behavior of this synthetic TF independent of dox-mediated recruitment, we could generate gene expression profiles that were not accessible with either individual effector domain alone. Looking forward, this synthetic switchable TF can be used to split a cell population into well-defined percentages that express the desired combination of two target genes, e.g., 50% of the cells expressing gene 1 and 50% expressing gene 2. Such flexibility could allow for more sophisticated and complex gene circuits and the engineering of “higher-order” cell behaviors and programs71.
When we started this study, we had a limited number of validated activation and repressive domains to test in combination. Recently, however, we have identified a much larger set of effector domains from human TFs20. Prior efforts have characterized individual examples of multiple effector domains being combined within a larger TF22,72; it would be instructive to design a library that systematically tests combinations of domains that come from the same TF and compare these results with recent ORFeome screens measuring activation and repression of full-length TFs33. Moreover, emerging work has begun to connect TF and effector function with measurements of their affinities for transcriptional coactivators and corepressors13,33,73. In addition, while our work characterized effector combinations at two promoters in one human cell line, we are excited for future efforts to broaden these settings to include other DNA-binding domains beyond rTetR, other promoters, and other cell types. Such efforts will undoubtedly aid projects aiming to build robust tools for cell engineering that can function across multiple contexts74–76.
Supplementary Tables
Supplementary Table 1 – Base Oligo Library
Sequences for domains used in the combinatorial recruitment screen related to Figs. 1-4 are attached in a CSV file.
Supplementary Table 2 – Corecruit screen CSV
Domain combinations and their corresponding measurements in the combinatorial recruitment screens for activation are attached in a CSV file.
Supplementary Table 3 – KRAB+Pfam screen CSV
Domains fused to KRAB and corresponding measurements in the repression screen related to Fig. 5 are attached in a CSV file.
Materials & Methods
Library design
Library members were chosen from the Nuclear Pfam library described in21. A total of 20 negative control domains, 10 randomers and 10 tiles of the DMD protein; 30 activator domains; and 50 repressor domains were chosen for cloning. Of the 50 repressors, 6 were discovered to have errors introduced during design, and were discarded from downstream analysis. All domains, including those eventually discarded, are listed in Supplementary Table 1. After library assembly, DNAChisel77 was used to optimize coding sequences by removing duplicates, 7xC homopolymers, BsmBI and BbsI restriction sites, and rare codons. Codon usage was matched to human codon prevalence and GC content was restricted to be between 20% and 75% in any 50-nucleotide window and between 25% and 65% globally.
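The sequence constraints listed above can be checked after optimization with a short script. The following is a minimal sketch in plain Python, not the DNAChisel workflow used for the library; the function names are hypothetical, and scanning the reverse complement of each restriction site is an assumption about how the sites were excluded. Codon usage and duplicate removal are not covered.

```python
import re

# Recognition sites of the two Type IIS enzymes named in the text (forward strand).
SITES = {"BsmBI": "CGTCTC", "BbsI": "GAAGAC"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_sequence(seq, window=50):
    """Return a list of constraint violations for one non-empty coding sequence."""
    seq = seq.upper()
    problems = []
    # Runs of seven or more C nucleotides.
    if re.search(r"C{7,}", seq):
        problems.append("7xC homopolymer")
    # Restriction sites on either strand (assumption: both orientations matter).
    for name, site in SITES.items():
        if site in seq or reverse_complement(site) in seq:
            problems.append(f"{name} site")
    # Global GC content between 25% and 65%.
    if not 0.25 <= gc_fraction(seq) <= 0.65:
        problems.append("global GC out of range")
    # GC content between 20% and 75% in every 50-nucleotide window.
    for start in range(0, max(len(seq) - window, 0) + 1):
        if not 0.20 <= gc_fraction(seq[start:start + window]) <= 0.75:
            problems.append("windowed GC out of range")
            break
    return problems
```

A sequence that passes every check returns an empty list, so the function can be used as a filter over the designed oligo pool.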
Cell culture
Cell culture was performed as described in20. Briefly, all experiments presented here were carried out in K562 cells (ATCC, CCL-243, female). Cells were cultured in a controlled humidified incubator at 37C and 5% CO2, in RPMI 1640 (Gibco, 11-875-119) media supplemented with 10% FBS (Omega Scientific, 20014T), and 1% Penicillin-Streptomycin-Glutamine (Gibco, 10378016). HEK293T-LentiX (Takara Bio, 632180, female) cells, used to produce lentivirus, as described below, were grown in DMEM (Gibco, 10569069) media supplemented with 10% FBS (Omega Scientific, 20014T) and 1% Penicillin Streptomycin Glutamine (Gibco, 10378016). minCMV and pEF reporter cell line generation is described in21. pEF and minCMV promoter reporter cell lines were generated by TALEN-mediated homology-directed repair to integrate donor constructs (pEF promoter: Addgene #161927, minCMV promoter: Addgene #161928) into the AAVS1 locus by electroporation of K562 cells with 1000 ng of reporter donor plasmid and 500 ng of each TALEN-L (Addgene #35431) and TALEN-R (Addgene #35432) plasmid (targeting upstream and downstream the intended DNA cleavage site, respectively). After 7 days, the cells were treated with 1000 ng/mL puromycin antibiotic for 5 days to select for a population where the donor was stably integrated in the intended locus. Fluorescent reporter expression was measured by flow cytometry.
Co-recruit cloning
Cloning for the combinatorial screen proceeded in two stages; in the first, domain-linker-domain concatenations were assembled, and in the second, concatenations were placed into a lentiviral backbone. Domains in the N-terminal and C-terminal position were synthesized as 2 separate oligonucleotide pools (Twist Biosciences). Each pool was PCR amplified in a clean PCR hood to avoid DNA contamination. Each pool was split into 6x 50 uL reactions that were PCR amplified for 21 cycles with 5 ng template, 1 uL of each 10mM primer, 1uL of Herculase II polymerase (Agilent), 1 uL of DMSO, 1 uL of 10mM dNTPs, and 10 uL of 5x Herculase buffer (Agilent). Reaction mixes were thermocycled at 98C for 3m; then 21x cycles of 98C for 20s, 61C for 20s, and 72C for 30s; and, finally, 72C for 3m. Reaction products were pooled and gel extracted by loading a 1% TAE gel, excising the 300 bp band, and purifying using a Zymo Research gel extraction kit. Each of the 2 extraction products (N-terminal and C-terminal) was then amplified for 23x cycles using the same protocol in order to generate sufficient DNA for downstream reactions. To generate concatenations, 5 uL of 15 ng/uL product from each product, N-terminal and C-terminal, were mixed with 4 uL T4 buffer (NEB B0202S), 4 uL BbsI-HF (NEB R3539L), 1 uL T4 ligase (NEB M0202M), and 1uL of an XTEN linker-encoding DNA amplicon. This linker was encoded with variable nucleotides in codon wobble positions to permit variable DNA sequence in between domains and prevent lentiviral recombination without varying the amino acid sequence of the linker. To complete the GoldenGate reaction, the mix was thermocycled for 65x cycles at 37C for 5m, then 16C for 5m, before a series of incubations at 37C for 1h, 65C for 20m, and 16C for 1h. The reaction product was run on a 1% TAE gel and the 600bp band of domain pair concatenations was extracted with a Zymo gel extraction kit.
The concatenation product was then PCR amplified as before, using the outer primers, for 15x cycles. Amplification products were then split into 12x GoldenGate reactions, each consisting of 75ng (1.33 uL) BsmBI-v2-digested pJT039 lentiviral backbone, 10 ng (0.64 uL) concatenation amplicon, 2 uL T4 buffer, 1 uL BsmBI-v2 Golden Gate Assembly Kit (NEB E1602L), and 15.03 uL nuclease-free H2O. Reactions were thermocycled with 65x cycles at 42C for 5m and 16C for 5m, before a final incubation at 42C for 5m and 70C for 20m. The reactions were then pooled and purified with a QIAgen MinElute column, eluting in 6 uL ddH2O. 2 uL of eluent was then electroporated into each of 2 tubes of 50 uL Endura electrocompetent cells (Lucigen, Cat#60242-2) as per manufacturer’s instructions. Cells were then plated onto 4x large 10” x 10” LB plates with carbenicillin. After overnight growth, colonies were scraped into a collection bottle and plasmid pools were extracted using a plasmid maxiprep kit (Qiagen #12662). 2 smaller plates of LB with carbenicillin were also prepared with 1:20 diluted cells to count colonies and confirm transformation efficiency. Approximately 50 colonies were Sanger sequenced (Quintara) to estimate cloning efficiencies and the proportion of empty backbone plasmids in the pool.
Co-recruit screen
We plated 15 × 10^6 HEK293T cells on each of 10x 15-cm tissue culture plates in 30 mL DMEM, grew them overnight, and then transfected each with 8 ug of an equimolar mixture of three third-generation packaging plasmids (pMD2.G, psPAX2, pMDLg/pRRE) and 8 ug of rTetR-concatenation library vectors using 50 uL of polyethylenimine (PEI, Polysciences #23966). Packaging plasmids were gifts from Didier Trono. Lentivirus was harvested after 48 and 72 hours of incubation and filtered through a 0.45 um PVDF filter (Millipore) to remove debris. K562 cells were infected with lentivirus via spinfection for 2 hours, with 2 replicates each spinfected separately. After 48 hours of growth, infected cells were selected with 10 ug/mL blasticidin (Gibco). Infection and selection efficiency were monitored each day with flow cytometry using a Biorad ZE5 flow cytometer. Infection coverage was approximately 600x for each replicate, while maintenance coverage was maintained between 10,000 and 20,000x cells per library element. On day 8 post-infection, cells were treated with 1000 ng/mL doxycycline (Fisher Scientific) for 2 days for activation or 5 days for repression.
Co-recruit library prep and sequencing
Genomic DNA was extracted with the QIAgen Blood Maxi Kit following the manufacturer’s instructions with up to 1 × 10^8 cells per column. Domain sequences were amplified by PCR with primers containing Illumina adapters as extensions. 27-55x 100uL PCRs were set up on ice, with the genomic DNA available for each experiment dictating the number of reactions. 10 ug of genomic DNA, 0.5 uL of each 100 uM primer, and 50uL of NEBnext Ultra 2x Master Mix (NEB) were used in each reaction. Samples were thermocycled at 98C for 3m; 33 cycles of 98C for 10s, 63C for 1m, 72C for 30s; and 72C for 2m. PCR reactions were pooled and run on a 1% TAE gel, the 600bp band was excised, and DNA was purified using QIAquick gel extraction kit (Qiagen) and eluted into nonstick tubes (Ambion). Samples were sequenced on an Illumina HiSeq (2×150bp, Admera Health).
Computing co-recruit enrichment scores
Demultiplexed sequencing reads were provided by Admera. Domain 1 and domain 2 reads were aligned separately using bowtie2, trimming 30 base pairs from both the 5’ and 3’ ends, and each read pair was then identified as a particular library member using a Python script iterating over each read pair. The total number of reads for each library member was then summed up using another Python script. Library members were required to have at least 5 ON reads or 5 OFF reads as well as 50 total reads across both ON and OFF subpopulations to be considered for downstream analysis. For every qualifying library member, the ON or OFF read count was then set to 5 if it was less than 5. Then, ON read counts were normalized by dividing each library member’s read count by the total number of ON reads, and the same was done for OFF counts. Overall enrichment scores were then computed by taking the log2 of the ratio of normalized ON read counts to normalized OFF read counts.
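The filtering, flooring, and normalization steps described above can be sketched as follows. This is an illustrative implementation rather than the analysis script deposited with the study: the function name and dictionary inputs are hypothetical, and it assumes the total ON and OFF read counts are taken over all library members before filtering and flooring.

```python
import math


def enrichment_scores(on_counts, off_counts, min_reads=5, min_total=50):
    """Compute log2(ON:OFF) per library member from raw read counts.

    on_counts / off_counts: dicts mapping library member -> read count.
    """
    # Assumption: totals are computed from the raw counts of all members.
    total_on = sum(on_counts.values())
    total_off = sum(off_counts.values())
    scores = {}
    for member in set(on_counts) | set(off_counts):
        on = on_counts.get(member, 0)
        off = off_counts.get(member, 0)
        # Keep members with >= 5 ON or >= 5 OFF reads and >= 50 reads overall.
        if max(on, off) < min_reads or on + off < min_total:
            continue
        # Floor low counts at 5 so the ratio and its log stay finite.
        on, off = max(on, min_reads), max(off, min_reads)
        scores[member] = math.log2((on / total_on) / (off / total_off))
    return scores
```

For example, a member with 200 ON and 50 OFF reads in samples of equal depth scores log2(4) = 2, while a member with 0 ON reads is scored as if it had 5.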
Labeling domains as activators, repressors, and duals
We labeled a domain as an activator if the following 2 conditions held for all concatenations including that domain and a negative control: 1) at least 19% functioned as activators (i.e. for each, activation log2(ON:OFF) > −0.0236); 2) the average activation log2(ON:OFF) for all pairs was greater than or equal to −0.5, or the activation log2(ON:OFF) for all pairs that functioned as activators was greater than or equal to 0.95. We labeled a domain as a repressor if the following 2 conditions held for all concatenations including that domain and a negative control: 1) at least 19% functioned as repressors (i.e. repression log2(ON:OFF) < 0.771); 2) the average repression log2(ON:OFF) for all pairs was less than or equal to −0.3, or the repression log2(ON:OFF) for all pairs that functioned as repressors was less than or equal to 0.20. Domains that functioned as both activators and repressors were labeled as duals. Domains that functioned as neither were labeled as non-hits.
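The labeling rules above can be written as a small classifier. The sketch below is one reading of the criteria, not the authors' code: the function names are hypothetical, and it assumes that the second condition for pairs "that functioned as" activators or repressors refers to the mean score over those pairs.

```python
from statistics import mean


def is_activator(scores, hit_cutoff=-0.0236, min_frac=0.19,
                 mean_all_cutoff=-0.5, mean_hits_cutoff=0.95):
    """scores: activation log2(ON:OFF) for each domain + negative-control pair."""
    hits = [s for s in scores if s > hit_cutoff]
    # Condition 1: at least 19% of pairs function as activators.
    if not scores or len(hits) / len(scores) < min_frac:
        return False
    # Condition 2: mean over all pairs, or mean over activating pairs (assumed).
    return mean(scores) >= mean_all_cutoff or mean(hits) >= mean_hits_cutoff


def is_repressor(scores, hit_cutoff=0.771, min_frac=0.19,
                 mean_all_cutoff=-0.3, mean_hits_cutoff=0.20):
    """scores: repression log2(ON:OFF) for each domain + negative-control pair."""
    hits = [s for s in scores if s < hit_cutoff]
    # Condition 1: at least 19% of pairs function as repressors.
    if not scores or len(hits) / len(scores) < min_frac:
        return False
    # Condition 2: mean over all pairs, or mean over repressing pairs (assumed).
    return mean(scores) <= mean_all_cutoff or mean(hits) <= mean_hits_cutoff


def label_domain(activation_scores, repression_scores):
    """Combine both calls into activator, repressor, dual, or non-hit."""
    act = is_activator(activation_scores)
    rep = is_repressor(repression_scores)
    if act and rep:
        return "dual"
    if act:
        return "activator"
    if rep:
        return "repressor"
    return "non-hit"
```

Each score list holds the measurements for one domain paired with every negative control, in either orientation if both are pooled.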
KRAB-Nuclear Pfam screen cloning and cell culture
An initial backbone was generated by digesting pJT126 (AddGene #161926) with BsmBI, then ligating in a sequence corresponding to the ZNF10 KRAB domain. Then, the nuclear Pfam library was cloned into the downstream GoldenGate cloning site, and used to produce lentivirus identically as in21. K562 cells were then spinfected in 2 separate replicates with this KRAB+nuclear Pfam library lentivirus at an average MOI of 0.015, corresponding to an infection coverage of 85x per replicate. 72 hours post-infection, 10 ug/mL blasticidin (Gibco) was added. 48 hours after blasticidin addition, 100 ng/mL doxycycline was added and cells were maintained in both blast + dox for 5 more days. Cells were sampled at day 2 of doxycycline recruitment for magnetic separation and downstream sequencing. After 5 days of recruitment, cells were spun down, washed with PBS, and resuspended in dox-free/blast-free RPMI. 5 days after doxycycline release, cells were again sampled for magnetic separation and downstream sequencing. Cells were maintained at a maintenance coverage of 3,000-30,000 per library element on average. Library preparation and sequencing of the variable effector domain were performed identically as in21.
Magnetic Separation
Cells to be separated were spun down at 300 x g for 5m and media was aspirated. Cells were then resuspended in PBS (Gibco), spun down again, and PBS was aspirated in order to wash the cells. Dynabeads Protein G were resuspended by vortexing for 30 seconds. 50mL of blocking buffer was prepared per 2 × 10^8 cells by adding 1 g biotin-free BSA (Sigma Aldrich) and 200 uL of 0.5 M pH 8.0 EDTA into DPBS (GIBCO), vacuum filtering with a 0.22-um filter (Millipore), and keeping on ice. For all screens 60 uL of beads were used per 10 million cells. Magnetic separation was otherwise performed as in21.
Low-throughput validations
Plasmids for each validation were produced using GoldenGate cloning as described in21. Then, 750ng plasmid and 750ng pack mix were mixed in 200uL serum-free OptiMEM (Gibco) along with 2-8uL PEI. Meanwhile, HEK293T cells were plated in 5mL DMEM supplemented with 10% FBS and 1% PSG in one well of a 6-well plate (Fisher Scientific) for each validation. The plasmid-pack mix-OptiMEM mixture was added to the HEK293T cells to initiate lentivirus production, and virus was harvested 48 and 72 hours post incubation and filtered through a 0.45um PDMS filter. For each validation, 2.50 × 10^5 K562 cells were spinfected with 1mL virus for 2 hours, and 10 ug/mL blasticidin was added after 48 hours. After 7 days of blasticidin selection, cells were spun down and resuspended in RPMI with doxycycline for recruitment.
Model of transcriptional repression
In our model, λ defines the amount of mRNA in Active (ON) cells. We used the experimental data to estimate these parameters for our synthetic reporter. We directly fit the mean and standard deviation of fluorescence of Silent (OFF) cells from experimental data. We assumed that a population of cells each containing m mRNA molecules produced a distribution of log10 mCitrine-A fluorescence values centered at (m/600) + 6.5, and directly fit the standard deviation of this population from experimental data. We constrained mRNA levels to range between 0 and 1500 molecules per cell. Thus, the probability of seeing a cell produce a log10 mCitrine-A fluorescence level of c given parameters σON, μOFF, σOFF, and λ is given by the equations: where PActive and PSilent are the probability that a given cell is active or silent, respectively. Parameters were fit directly to data using scipy.optimize.curve_fit. Note that PActive + PSilent = 1 and that when t ≤ tlag, PSilent = 0.
To measure silencing, for every day of recruitment, 2000 cells were sampled to ensure equal sampling of log10 mCitrine-A fluorescence intensity across recruitment timepoints. The fraction of background silenced cells was directly computed using the rTetR-only recruitment data, with cells being labeled as Silent if their mCitrine-A fluorescence value was less than 10^7. Then, parameters μOFF and σOFF were directly estimated by examining the distribution of cells with log fluorescence values less than 7. λ was determined by estimating the mean amount of mRNA produced in ON cells at day 0, using the above mapping between mRNA counts and fluorescence values. ks and tlag were fit by examining the fraction of cells silenced over time; we assumed that the fraction of Active cells decreased with rate ks after tlag time had elapsed10,49. Finally, γ was determined by fitting a 2-state Gaussian Mixture model to fluorescence data for each day and fitting a line to the locations of the higher-fluorescence peak; we assumed λ decays linearly after silencing with slope γ.
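The pieces of the model named in this section can be combined into a two-component mixture over log10 fluorescence. The sketch below is an illustration under explicit assumptions and should not be read as the fitted model: it assumes Gaussian Active and Silent components, an exponential decline of the Active fraction with rate ks after tlag, and a constant λ (the linear decay of λ with slope γ and the background-silenced fraction are left out). All function names are hypothetical.

```python
import math


def log10_fluorescence_mean(m):
    """Centre of the log10 mCitrine-A distribution for m mRNA molecules per cell."""
    return m / 600 + 6.5


def p_silent(t, k_s, t_lag):
    """Fraction of Silent cells at time t (assumed exponential silencing)."""
    if t <= t_lag:
        return 0.0
    return 1.0 - math.exp(-k_s * (t - t_lag))


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def fluorescence_density(c, t, lam, sigma_on, mu_off, sigma_off, k_s, t_lag):
    """Density of log10 fluorescence c at time t as an Active/Silent mixture."""
    silent = p_silent(t, k_s, t_lag)
    active = 1.0 - silent  # P_Active + P_Silent = 1
    on_part = normal_pdf(c, log10_fluorescence_mean(lam), sigma_on)
    off_part = normal_pdf(c, mu_off, sigma_off)
    return active * on_part + silent * off_part
```

A function of this shape could be handed to a curve-fitting routine with the directly estimated parameters held fixed, leaving ks and tlag to be fit from the time course.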
Flow cytometry analysis
Flow cytometry analysis was performed in Python using Cytoflow78. For all flow experiments, live cells were gated using Cytoflow’s DensityGateOp, keeping 90% of cells, and then thresholded for mCherry+ cells using the ThresholdGateOp. To determine the fraction of cells that were on, we computed the fraction of cells with log10 mCitrine-A fluorescence greater than 7 as measured on a BioRad ZE5 flow cytometer. Normalization of mCitrine levels to the no-dox condition was performed as follows: foff,norm = (foff,dox – foff,no dox) ÷ (1 – foff,no dox), where foff denotes the fraction of cells off for any given condition. Cytometry analysis was otherwise performed identically as in20,21.
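The ON-fraction threshold and the no-dox normalization described above reduce to two small functions. This is a minimal sketch in plain Python with hypothetical function names; it takes already-gated log10 mCitrine-A values as input and does not use Cytoflow.

```python
def fraction_on(log10_fluorescence, threshold=7.0):
    """Fraction of cells with log10 mCitrine-A fluorescence above the threshold."""
    values = list(log10_fluorescence)
    return sum(v > threshold for v in values) / len(values)


def normalize_fraction_off(f_off_dox, f_off_no_dox):
    """Rescale the OFF fraction so that the no-dox background maps to 0."""
    return (f_off_dox - f_off_no_dox) / (1.0 - f_off_no_dox)
```

For example, if 10% of cells are off without dox and 55% are off with dox, the normalized OFF fraction is (0.55 - 0.10) / (1 - 0.10) = 0.5.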
Data and code availability
All raw NGS data and associated processed data generated in this study will be deposited in the NCBI GEO database upon publication. All data besides raw NGS data used for this manuscript, and all code, are available at Zenodo at DOI 10.5281/zenodo.7453682.
Author Contributions
AXM and LB designed the study with significant intellectual contributions from JT. AXM and JT designed the combinatorial recruitment library and performed the KRAB-Pfam screen. KS cloned the KRAB-Pfam library. AXM performed the combinatorial recruitment screen with help from CL. AXM, SA, SAR, and CA all performed individual recruitment assay experiments. AXM performed data analysis with input from LB, JT, and MCB. AXM and LB wrote the manuscript, with contributions and feedback from all authors. LB and MCB supervised the project.
CRediT Statement
AXM: conceptualization, methodology, software, formal analysis, investigation, data curation, writing - original draft, writing - review & editing, visualization. JT: conceptualization, methodology, investigation, writing - review & editing. SA: investigation, writing - review & editing. SR: investigation, writing - review & editing. CA: investigation, writing - review & editing. CL: investigation, writing - review & editing. MCB: conceptualization, resources, writing - review & editing, supervision, project administration. LB: conceptualization, methodology, resources, writing - review & editing, supervision, project administration, funding acquisition.
Competing Interests
LB, MCB, and JT acknowledge outside interest in Stylus Medicine. All other authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional Information
Supplementary information is available for this paper. Correspondence and requests for materials should be addressed to lbintu{at}stanford.edu.
Acknowledgements
We would like to thank Michaela Hinks, Jennifer Gucwa, Nicole DelRosso, Abby Thurm, David Yao, Priyanka Shrestha, Zeppelin Cat, Ophelia Hinks, Shivam Verma, Ben Doughty, Mary Frances Gallagher, Kaushik Ragunathan, Zara Weinberg, and all members of the Bintu lab for helpful conversations and assistance. We would like to thank Anshul Kundaje and Will Greenleaf for valuable feedback throughout the project. AXM was supported by Stanford University Medical Scientist Training Program grants T32-GM007365 and T32-GM145402, and a Stanford Bio-X SIGF. JT is supported by the F99/K00 fellowship of the National Institutes of Health (NIH-1F99DK126120-01; NIH-4K00DK126120-03). SAR is supported by an NSF GRFP (DGE-1656518). MCB is supported by a grant from Stanford ChEM-H and an NIH Director’s New Innovator Award (1DP2HD08406901). This work was supported by BWF CASI (LB), NIH-NIGMS R35M128947 (LB), and NIH-NIGRI R01HG011866 (LB and MCB).