An essential mycolate remodeling program for mycobacterial adaptation in host cells

The success of Mycobacterium tuberculosis (MTB) stems from its ability to remain hidden from the immune system within macrophages. Here, we report a new technology (Path-seq) to sequence minuscule amounts of MTB transcripts within up to a million-fold excess of host RNA. Using Path-seq, we have discovered a novel transcriptional program for in vivo mycobacterial cell wall remodeling when the pathogen infects alveolar macrophages in mice. We have discovered that MadR transcriptionally modulates two mycolic acid desaturases, desA1/A2, to initially promote cell wall remodeling upon in vitro macrophage infection and subsequently reduce mycolate biosynthesis upon entering dormancy. We demonstrate that disrupting the MadR program is lethal to diverse mycobacteria, making this evolutionarily conserved regulator a prime antitubercular target for both early and late stages of infection.

One Sentence Summary: Novel technology (Path-seq) discovers cell wall remodeling program during Mycobacterium tuberculosis infection of macrophages

Moreover, we recently demonstrated the essentiality of the desA1 homolog (MSMEG_5773) in Mycobacterium smegmatis (MSM) (17). We showed that depletion of DesA1 in a conditional mutant caused loss of mycolic acid biosynthesis, reduced cell viability, and accumulation of mono-unsaturated mycolic acid species, consistent with DesA1's role in the desaturation of mycolic acids (17). Crystallography studies were unable to obtain soluble DesA1 but revealed that DesA2 is structurally related to plant fatty acid desaturases and that unordered properties of the protein indicate a specialized role for DesA2 (18). More biochemical characterization of the desaturases is needed, but the significant up-regulation of desA1/A2 following in vivo infection was interesting and deserved further investigation of their transcriptional control.

Genome-wide expression analysis during in vitro macrophage infection using Path-seq

Several genome-wide expression studies of MTB challenged with various stresses, such as nutrient starvation (19), hypoxia (20), and in vitro infection (21), have shown that genes involved in mycolic acid biosynthesis are generally downregulated. Granted, none of these studies specifically addressed the regulation of the mycolic acid desaturases, but this general downregulation contrasts with what we observed in the in vivo infection data. Therefore, we used the Path-seq method to study the dynamics of desA1/A2 expression at higher resolution following MTB infection of bone marrow-derived macrophages (BMDMs). We isolated BMDMs and infected them with MTB at an MOI of 10. Infected cells were collected at 2, 8 and 24 h after infection, along with extracellular MTB grown in standard media as a control. Total RNA was extracted, depleted of rRNA and handled as described above (Fig. 1A). All extracellular MTB samples were processed by Path-seq as well.
For the infection samples, we again split each sample into RNA-seq and Path-seq fractions to evaluate the enrichment efficiency and to simultaneously obtain both host and pathogen transcriptomes (Fig. S2). We evaluated the percentage of reads that aligned to MTB, and found a consistent 100-fold increase in the enriched vs non-enriched samples across replicates and time points (Table 1) [...] to generate a tSNE network. This resulted in a network map with extracellular samples clustered closely together, distinct from the intracellular samples and according to their time post infection (Fig. S3A). Biological replicates fell into related groups and demonstrated strong correlation in pairwise comparison of RPKMs (Fig. S3B). Differential expression of intracellular MTB was calculated relative to extracellular MTB at each time point using DESeq2. Overall, there were 746, 945, and 412 significantly differentially expressed (log2 fold change < -1.0 or > 1.0 and multiple hypothesis adjusted P-value < 0.01) transcripts at the 2, 8, and 24 h post infection time points, respectively (Data S2). The most upregulated genes at all time points included icl1, Rv1129c, prpD, prpC, and fadD19. The induced expression of these genes is consistent with known alteration in lipid metabolism during infection, enhanced activity of the methylcitrate cycle (22), and genetic evidence that MTB utilizes cholesterol from the host during infection (23). These carbon metabolizing genes were also found to be up-regulated in microarray analysis of MTB-infected BMDMs (21), along with a significant overlap of other differentially expressed genes between the datasets (multiple hypothesis corrected P-value = 6.6 × 10^-24 at 2 h and adjusted P-value = 1.5 × 10^-12 at 24 h). These data demonstrate that the Path-seq method yielded data consistent with published transcriptional studies of in vitro infected host cells.
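The thresholding and dataset-overlap steps described above can be sketched in a few lines of Python. This is only an illustration: the gene IDs and values are made up, the function names are hypothetical, and the hypergeometric test is an assumption, since the text does not state which test produced the reported overlap P-values.

```python
from math import comb


def significant_genes(results, lfc_cutoff=1.0, padj_cutoff=0.01):
    """Return gene IDs with |log2 fold change| > cutoff and adjusted P < cutoff.

    `results` maps gene ID -> (log2 fold change, adjusted P-value), e.g. as
    exported from a DESeq2 results table.
    """
    return {
        gene
        for gene, (log2fc, padj) in results.items()
        if abs(log2fc) > lfc_cutoff and padj < padj_cutoff
    }


def overlap_pvalue(set_a, set_b, universe_size):
    """Upper-tail hypergeometric P-value for the overlap of two gene sets.

    Assumes both sets are drawn from the same universe of `universe_size`
    genes. Exact integer arithmetic avoids floating-point underflow.
    """
    k = len(set_a & set_b)
    n_a, n_b = len(set_a), len(set_b)
    total = comb(universe_size, n_b)
    tail = sum(
        comb(n_a, i) * comb(universe_size - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return tail / total


# Toy input with invented values (not data from this study).
toy_results = {
    "geneA": (3.2, 1e-10),
    "geneB": (2.5, 1e-6),
    "geneC": (0.4, 0.5),     # fails both cutoffs
    "geneD": (-1.8, 0.003),  # significantly repressed
    "geneE": (1.5, 0.2),     # large change but not significant
}
print(sorted(significant_genes(toy_results)))
print(overlap_pvalue({"geneA", "geneB"}, {"geneA", "geneB"}, universe_size=4))
```

Applying the same cutoffs to each time point separately would give one gene set per time point, which can then be compared against a published gene list with `overlap_pvalue`.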
Importantly, the Path-seq method allows for simultaneous expression analysis of host transcripts and additional pathogen features that are not possible in microarray studies.

desA1 and desA2 are induced early during in vitro macrophage infection

Interestingly, desA1 and desA2 were transiently up-regulated at 2 h following MTB infection of BMDMs, followed by return to levels similar to extracellular MTB at 8 h and 24 h (Fig. 2A). This is earlier than the Path-seq data from AMs, which showed induced expression of the desaturases at 24 h after in vivo infection. In fact, we compared all significantly differentially expressed genes between the two infection models at 24 h and found only a small subset of common genes (59 genes), most of which were up-regulated in both models (Fig. S4) [...] P-value from differential expression analysis between intracellular and extracellular conditions. We calculated a relative score for each TF in conditions simulating deletion or overexpression of the TF. These simulations prioritized TF activities (low or high) yielding a transcriptome most similar to the infected state, compared to the control (see Methods and summary schematic in Fig. 3A). We performed this analysis for each time point and infection model to identify highly ranked TFs (Fig. 3B).
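To make the idea of scoring simulated TF deletion or overexpression concrete, here is a deliberately simplified toy in Python. It is not the NetSurgeon-based scoring used in this study: the concordance score, the function names, the network, and all values are hypothetical, and only the minimum of five targets per TF is taken from the text.

```python
def tf_score(targets, observed_lfc, perturbation):
    """Fraction of a TF's targets whose observed change matches the prediction.

    targets: gene -> +1 if the TF activates it, -1 if the TF represses it.
    observed_lfc: gene -> observed log2 fold change (infected vs control).
    perturbation: +1 simulates overexpression, -1 simulates deletion.
    """
    matches = 0
    scored = 0
    for gene, sign in targets.items():
        lfc = observed_lfc.get(gene)
        if lfc is None or lfc == 0:
            continue  # skip genes without a measured change
        scored += 1
        predicted_up = sign * perturbation > 0
        if (lfc > 0) == predicted_up:
            matches += 1
    return matches / scored if scored else 0.0


def rank_tfs(network, observed_lfc, min_targets=5):
    """Score each TF under both simulated perturbations, best first."""
    rows = []
    for tf, targets in network.items():
        if len(targets) < min_targets:
            continue  # regulon size threshold to limit false positives
        for label, perturbation in (("overexpression", 1), ("deletion", -1)):
            rows.append((tf_score(targets, observed_lfc, perturbation), tf, label))
    return sorted(rows, reverse=True)


# Hypothetical network: one repressor with five targets, one TF with too few.
toy_network = {
    "repressorR": {f"g{i}": -1 for i in range(1, 6)},
    "smallTF": {"g1": 1},
}
toy_lfc = {f"g{i}": 1.5 for i in range(1, 6)}  # all targets induced

for score, tf, state in rank_tfs(toy_network, toy_lfc):
    print(f"{tf}\t{state}\t{score:.2f}")
```

In this toy, induced targets of a repressor score highest under simulated deletion, which is the same qualitative reading as a "low predicted activity" call for a repressor.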
From the in vitro macrophage infection, many of the TFs had distinct temporal activity, while others were highly ranked across all time points (Fig. 3B). These sustained regulons include DosR, which is known to contain a set of ~50 genes that are induced in response to multiple signals including hypoxia, nitrosative stress, and carbon monoxide (28-31). While DosR regulon induction is typically associated with hypoxic conditions and reactive nitrogen intermediates (RNIs), we observed activation as early as 2 h post infection. Encouragingly, this 2 h induction was also found in the microarray study of MTB-infected macrophages, where high DosR regulon expression was sustained until a striking down-regulation at Day 8 (32). This indicates that the known cues of this regulatory network are present almost immediately during in vitro infection. In addition to DosR, two other TFs had high activity across all time points, Rv0681 (Fig. 3C) and Rv0691c. Interestingly, both are TetR family transcriptional regulators and conserved across all mycobacterial genomes (33), including the drastically reduced Mycobacterium leprae. The function of these transcriptional networks is unknown, but their conservation suggests that their activity is important for survival in both environmental and intracellular niches.

Among the TFs with low predicted activity, KstR and KstR2 were found across all time points of the in vitro infection and are known to repress genes required for cholesterol utilization (34). Our analysis indicates that loss of their repressive activity, and the resulting increased expression of their target genes, is important for driving the in vitro intracellular transcriptional state. This is consistent with the highly expressed cholesterol utilization and methylcitrate cycle genes that we and others have observed (21,35,36).
Moreover, this emphasizes the importance of altered carbon metabolism and utilization of host-derived nutrients as key to MTB in vitro intracellular adaptation. Another repressor, Zur (previously FurB), had low predicted activity across all time points (Fig. 3D). Zur downregulates genes involved in zinc transport (37). During MTB infection, macrophages overload the phagosome with copper and zinc as a strategy to poison the pathogen (38). However, through multi-faceted resistance mechanisms we do not fully appreciate, MTB is able to protect itself against metal toxicity. Our analysis suggests that reduced Zur activity results in increased expression of zinc transport genes, which could help regulate zinc levels in MTB during in vitro macrophage infection. Interestingly, other regulators of metal content (TFs, uptake and export) were recently found to be required for in vitro intracellular growth by high-content imaging of an MTB transposon mutant library (39). Leveraging our Path-seq data, we developed a systems-level approach that recapitulates known in vitro intracellular regulatory networks and prioritizes others for further experimental testing.

We also applied the same analysis to the in vivo expression data (using differentially expressed genes in Data S1) to identify transcriptional networks involved in MTB's response to animal infection. Interestingly, we observed very few networks that were active in both infection models. Only Rv0691c was highly ranked at 24 h from both AMs and BMDMs (Fig. 3E). In our regulatory network, Rv0691c has ~50 target genes, a subset of which are up- and down-regulated during in vitro and in vivo infection.

The genes in the regulon do not fall into a single pathway, but our unbiased analysis suggests the Rv0691c regulon deserves further study for its role in establishing MTB infection both in vitro and in vivo.

Identification of desA1 and desA2 transcriptional regulator, Rv0472c, and conserved regulation in M. smegmatis

Our systems analysis revealed novel and infection-specific regulatory networks. However, none of the identified regulons included desA1 or desA2. We believe this to be a result of the regulon size threshold that we implemented to reduce false positives (only TFs with at least five targets were considered in this analysis). Therefore, we used the environment and gene regulatory influence network [...] predicted regulation by Rv0472c (Fig. 4A). Module 276 also contains other genes associated with PDIM biosynthesis and transport. Furthermore, the expression of module genes was found to be significantly correlated with Rv0472c expression under conditions related to oxidative stress and re-aeration (40). Rv0472c is a TetR-type TF with homology across all mycobacteria, including M. leprae (33). When overexpressed in MTB, the TF led to significant repression of 15 genes, but only desA1 and desA2 had significant binding of Rv0472c in their promoter region from ChIP-seq analysis (26, 27).

Given the conservation of Rv0472c across mycobacteria, we hypothesized that overexpression of the MSM homolog should also repress the desaturases in MSM (Fig. 4A). We cloned MSMEG_0916 into an anhydrotetracycline (ATc)-inducible Gateway shuttle vector as previously described for MTB (20,26), and transformed it into MSM. We induced expression of MSMEG_0916 for 4 h and harvested chromatin samples for ChIP-seq as well as RNA for transcriptional profiling by RNA-seq. Overexpression of MSMEG_0916 resulted in 9 significant ChIP peaks (P-value < 0.01) with a peak score higher than 0.7, as analyzed by the DuffyNGS ChIP peak calling method (see Methods).
Among these were peaks located in the promoters of the MSM desA1 and desA2 (Fig. 4B). Additionally, MSMEG_0916 overexpression resulted in significant repression of desA1 and desA2, with a log2 fold change of -1.32 and -1.72, respectively, compared to uninduced (Fig. 4C). The DNA consensus motifs, generated using MEME and DNA-binding data from ChIP-seq, also had significant alignment between Rv0472c and MSMEG_0916 (Fig. S5). The conserved consensus motif is particularly interesting, given that desA1 and desA2 are the only regulatory targets shared by the TFs.

Inducible overexpression of MSMEG_0916 or Rv0472c causes loss of mycobacterial viability and reduction in mycolate biosynthesis

Motivated by our identification of Rv0472c and MSMEG_0916 as controlling the expression of desA1 and desA2, we hypothesized that overexpression-mediated repression of the desaturases should also have phenotypes similar to the desA1 knockout that we previously characterized (reduced viability following loss of mycolate biosynthesis) (17). We tested the viability of the TF overexpression strains by spotting serial ten-fold dilutions of cultures on agar plates with or without ATc. Plates with MSMEG_0916 were incubated for 1 week, and growth patterns indicated that the presence of ATc resulted in a 2-log reduction in CFU counts (Fig. 5A). In comparison, plates containing the parental MSM strain showed no change in CFUs with the presence or absence of ATc (Fig. 5A). We also observed very limited growth in broth culture when MSMEG_0916 was induced with ATc (Fig. S6). Similar experiments done in Mycobacterium bovis BCG (BCG) and MTB with Rv0472c overexpression also resulted in 3-log (Fig. 5B) and 4-log (Fig. S7) reductions in viability, respectively.

Overexpression of the conserved TF resulted in a loss of mycobacterial viability, due to repression of desA1 and desA2 and ensuing decrease in mycolic acid biosynthesis.
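For readers less used to log scales, the two quantities reported above can be converted with a short Python sketch. The function names are hypothetical, and the CFU values in the last line are illustrative placeholders rather than measured counts.

```python
import math


def fold_from_log2(log2fc):
    """Convert a log2 fold change into a linear fold change."""
    return 2.0 ** log2fc


def log10_reduction(cfu_control, cfu_treated):
    """Log10 reduction in viable counts (e.g. without vs with ATc)."""
    return math.log10(cfu_control / cfu_treated)


# Reported repression of desA1 and desA2 on MSMEG_0916 overexpression:
print(round(fold_from_log2(-1.32), 2))  # about 0.40 of the uninduced level
print(round(fold_from_log2(-1.72), 2))  # about 0.30 of the uninduced level

# A 100-fold drop in CFU corresponds to a 2-log reduction (placeholder counts).
print(log10_reduction(1e7, 1e5))
```

Read this way, the reported log2 fold changes correspond to roughly 60% and 70% lower transcript levels, and each additional "log" of reduction in the spot assay is one more ten-fold dilution step at which growth is lost.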
Conditional depletion of DesA1 in MSM leads to an intermediate decrease in desaturation prior to complete loss of mycolic acids (17). To test for a decrease in mycolic acid biosynthesis, we labeled cultures of the MSMEG_0916 overexpression strain with 14C-acetic acid following growth in the presence or absence of ATc. Thin layer chromatography (TLC) analysis of apolar lipids demonstrated that overexpression of MSMEG_0916 reduced the levels of trehalose dimycolates (TDMs) (Fig. S8). We also analyzed methyl esters of mycolic acids (MAMEs) obtained from apolar lipids using 2D-argentation TLC analysis, designed to separate each subclass of mycolic acid based on saturation levels. MAMEs analysis revealed an accumulation of products that migrate identically to our previous observation with DesA1 depletion (17) (Fig. 5C) and most likely correspond to mono-unsaturated mycolates. Similarly, 1D TLC separation of fatty acid methyl esters (FAMEs) and MAMEs from apolar lipids confirmed the general decrease of MAMEs and an accumulation of FAMEs when Rv0472c is overexpressed in BCG (Fig. 5D and densitometric analysis in Fig. S9). This characteristic profile of total MAMEs inhibition and FAMEs accumulation mirrors what is seen with fatty acid synthase (FAS)-II inhibitors, such as isoniazid (43) and thiolactomycin (44), and confirms the involvement of DesA1 and DesA2 in the biosynthesis of mycolic acids and more specifically with the FAS-II system. Interestingly, in BCG there was no accumulation of mono-unsaturated mycolates as found in MSM upon detailed analysis of MAMEs by 2D-argentation TLC (Fig. S10A and S10B). This could be due to key differences in mycolate subclasses between MSM and MTB, particularly cyclopropane ring formation, which is abundant in MTB but not MSM and requires a precursory desaturation event.

CONCLUSIONS
During MTB infection, the bacterium utilizes various mechanisms to ensure its own survival and persistence in the host. Mycolates, essential for mycobacterial cell wall rigidity and stability, have been prime candidates for such virulence factors (45). Mycolic acids not only make up a lipid-rich barrier in the mycobacterial cell envelope, they also act as potent immunomodulators, driving the pathogenesis of MTB, primarily as part of the cord factor (TDM) (46, 47). Here, we present evidence that mycolate biosynthesis is tightly regulated in response to the intracellular environment. Using our novel Path-seq method, we observed significantly induced expression of the fatty acid desaturases, desA1 and desA2, 24 h after MTB infection of mice. We further demonstrated that desA1 and desA2 are regulated by Rv0472c (MSMEG_0916) and that Rv0472c-mediated repression leads to reduced mycolate biosynthesis and loss of mycobacterial viability. As DesA1 and DesA2 have been shown to be involved in mycolic acid biosynthesis via desaturation of the merochain, we have therefore named their transcriptional regulatory protein MadR (for Mycolic acid desaturase Regulator).
Not much is known about the regulation of mycolic acid biosynthesis apart from two transcription factors shown to regulate distinct operons, both containing genes encoding core FAS-II proteins (19,48 [...] synthesized, but also altered cyclopropane ring formation by varying desaturation levels, thus affecting virulence and persistence.

Surprisingly, we observed early (2 h post infection) induced expression of desA1 and desA2 during MTB infection of BMDMs, followed by return to basal levels by 8 h. This is consistent with the reported increased production of TDM within the first 30 min after in vitro phagocytosis (53) and suggests that the desaturases play a role in cell wall modifications that occur in response to intracellular cues. However, the presence of these cues appears to be different or delayed in AMs from infected mice. The overall disparity in the transcriptional profile of MTB from BMDMs and AMs is both intriguing and disturbing. The active MTB networks we identified from BMDMs imply the presence of early and sustained bacterial stress. However, the induction of these stress-related genes is absent in the transcriptomes of MTB from AMs, suggesting the bacteria are not experiencing the same type or amount of stimuli in AMs. These data support recent observations using fluorescent MTB reporter strains, demonstrating that bacilli in AMs exhibit lower stress and higher bacterial replication than those in interstitial macrophages (24). Similarly, we hypothesize that MTB responds divergently to macrophages of different lineages and that AMs present fewer stresses and possibly a more permissive environment compared to BMDMs. It is also worth mentioning that the data raise some concerns with respect to the use of BMDMs as an appropriate infection model.

The expression dynamics of desA1 and desA2 during MTB infection of BMDMs are also mirrored in RNA-seq data of MTB entering and exiting hypoxia over a 5-day time course (Fig. 6A).
In this experiment, we used mass flow controllers to regulate the amount of air and nitrogen (N2) gas streaming into cultures of MTB and achieve a gradual depletion of oxygen over 2 days (Fig. 6A). The cultures were maintained in hypoxia for 2 days by streaming only N2, then reaerated over 1 day by a controlled increase in air flow. During the 2-day oxygen depletion, the expression levels of desA1 and desA2 did not change significantly. However, as soon as the cultures reached hypoxia, the expression of the desaturases increased for ~5 h, followed by a dramatic repression after ~30 h of being in hypoxia. Subsequently, reaeration of the culture returned desA1 and desA2 to basal expression levels (Fig. 6B). These data, along with the results described above, lead us to propose a model for MadR regulation of desA1 and desA2 transcription as summarized in Fig. 6C. Under normal growth conditions, MadR exists in equilibrium between the free and DNA-bound forms, thus maintaining basal levels of desA1 and desA2 transcripts. Upon macrophage infection and early hypoxia, equilibrium favors unbound MadR, which de-represses desA1 and desA2 transcription and increases mRNA levels. As infection progresses and reaches later stages of hypoxia, MadR has increased binding affinity for the promoters of desA1 and desA2 and represses their transcription to below basal levels. Ultimately, the MadR regulatory system enables mycobacteria to efficiently alter mycolate biosynthesis and composition in response to environmental signals. We suspect the early response to infection (desA1 and desA2 up-regulation) increases desaturation events and allows MTB to fine-tune cyclopropanation and other merochain modifications that contribute to the establishment of infection. However, mycolate biosynthesis is energetically expensive and MadR-mediated repression occurs in later stages of infection.
The reduction in mycolate biosynthesis allows MTB to enter dormancy and facilitates long-term persistence.

The question remains how MadR is able to differentially bind to DNA in response to environmental changes. In mycobacteria, the other TFs regulating mycolate biosynthesis are modulated by long chain acyl-CoAs (54-56), suggesting a role for these molecules in the modulation of MadR as well.
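The qualitative behavior of the proposed model (basal expression at equilibrium, de-repression when unbound MadR is favored, and repression below basal when binding affinity increases) can be illustrated with a textbook single-site occupancy calculation in Python. This is only a sketch under strong assumptions: MadR binding is treated as a simple one-site equilibrium, the concentration and dissociation constants are arbitrary invented numbers in arbitrary units, and none of the values are measurements from this study.

```python
def fraction_bound(tf_conc, kd):
    """Single-site promoter occupancy: [TF] / ([TF] + Kd)."""
    return tf_conc / (tf_conc + kd)


def relative_transcription(tf_conc, kd, max_rate=1.0):
    """Repressor model: transcription scales with the unoccupied fraction."""
    return max_rate * (1.0 - fraction_bound(tf_conc, kd))


# Hypothetical apparent dissociation constants for the three proposed states.
# A larger Kd means weaker binding, so less repression.
states = {
    "normal growth (equilibrium)": 1.0,
    "early infection / early hypoxia (weaker binding)": 10.0,
    "late hypoxia (stronger binding)": 0.1,
}

madr_conc = 1.0  # arbitrary units, held constant across states
for label, kd in states.items():
    rate = relative_transcription(madr_conc, kd)
    print(f"{label}: relative desA1/desA2 transcription = {rate:.2f}")
```

With the repressor concentration held fixed, changing only the apparent affinity is enough to move expression above or below the basal level, which is the behavior a ligand-dependent change in MadR binding would be expected to produce.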

Similarly, a MadR homolog in Pseudomonas aeruginosa, DesT, was shown to have enhanced DNA binding in the presence of unsaturated acyl-CoAs (57). These studies support the notion that a select acyl-CoA ligand may control MadR DNA binding affinity (as shown in Fig. 6C), and thus the expression of desA1 and desA2.

The characterization of the MadR regulon provides valuable insight for understanding the evolution of MTB. While we have shown the regulation by MadR is conserved from MSM to MTB, our results also suggest the fatty acid desaturation events and resulting mycolate subclasses have evolved, specializing for bacterial survival in the host environment. These findings suggest that mycobacterial evolution from saprophyte to pathogen has occurred through the adaptation of ancestral genes and regulatory networks to function in the host environment. Ultimately, this study demonstrates the in vivo significance of the desaturases and their regulation by MadR. We believe the Path-seq method, described and employed here, offers a sensitive and tractable approach to elucidate the molecular mechanisms used by MTB during host infection. Our detailed characterization of one such mechanism has revealed that modulation of MadR activity can affect mycolate composition as well as mycobacterial viability. Accordingly, we have established Path-seq as a powerful tool for uncovering the minimally studied in vivo biology of this pathogen and revealed the essentiality of the MadR-encoded program for cell wall remodeling and biosynthesis.

[...] should be addressed to NSB (nitin.baliga@systemsbiology.org).

Figures S1-S10
Table S1
Data S1-S2
References 58-69 are only cited in the Supplementary Materials.

[...] with a yet unknown co-factor that leads to repression of the desaturases.

The approaches used in this study include both computational and biological methods. Algorithms developed for the EGRIN model and the NetSurgeon approach were implemented in the R programming language and Python, respectively. Plots were generated using R and images prepared using Adobe Illustrator CS5.

Culturing conditions

Mycobacterial strains were cultured in Middlebrook 7H9 with the ADC supplement (Difco) and 0.05% Tween 80 at 37°C under aerobic conditions with constant agitation. Strains containing the anhydrotetracycline (ATc)-inducible expression vector were grown with the addition of 50 μg/mL hygromycin B to maintain the plasmid. Growth was monitored by OD600 and colony forming units (CFUs [...] using 485 nm and 520 nm as excitation and emission wavelengths, respectively. When growing BCG or MSM for lipid extraction, cultures were grown to an OD600 of 0.5. They were then induced with ATc (Sigma-Aldrich) at a final concentration of 100 ng/ml and labelled with acetic acid [1-14C] 1 mCi/ml (PerkinElmer) if radiolabeled ("hot") lipid analysis was to be performed. MSM samples were collected the following day, while BCG cultures were collected at 4 and 8 hours post-induction.

Tissue Culture

BMDMs were cultured in RPMI (RPMI containing 10% (v/v) FBS, 2 mM L-glutamine) with recombinant human CSF-1 (50 ng/mL) for 6 days and then replated. BMDMs were infected on day 7 with MTB H37Rv strain (MOI 10), followed by washing 3x with RPMI at 2 h post infection. MTB-infected BMDMs were lysed with TRIzol (Invitrogen) and total RNA was isolated from the mixed host-pathogen sample.

Strains

To investigate the growth properties of MadR overexpression, we used strains containing an ATc-inducible expression vector of the gene, as described previously (20,26,27,58 [...]

Mice

C57BL/6 mice were purchased from the Jackson Laboratory. All mice were housed and bred under specific pathogen-free conditions at the Center for Infectious Disease Research (CID Research). All