Single molecule imaging and modelling of mRNA decay dynamics in the Drosophila embryo

Regulation of mRNA degradation is critical for a diverse array of cellular processes and developmental cell fate decisions. Many methods for determining mRNA half-lives rely on transcriptional inhibition or metabolic labelling. Here we use a non-invasive method for estimating half-lives for hundreds of mRNAs in the early Drosophila embryo. This approach uses the intronic and exonic reads from a total RNA-seq time series and Gaussian process regression to model the dynamics of premature and mature mRNAs. We show how regulation of mRNA stability is used to establish a range of mature mRNA dynamics during embryogenesis, despite shared transcription profiles. Using single molecule imaging we provide evidence that, for the mRNAs tested, there is a correlation between short half-life and mRNA association with P-bodies. Moreover, we detect an enrichment of mRNA 3’ ends in P-bodies in the early embryo, consistent with 5’ to 3’ degradation occurring in P-bodies for at least a subset of mRNAs. We discuss our findings in relation to recently published data suggesting that the primary function of P-bodies in other biological contexts is mRNA storage.

including binding sites for RNA binding proteins or miRNAs [5]. While the two major decay pathways 23 are responsible for general turnover of cytoplasmic mRNAs, there are also mRNA surveillance pathways 24 that degrade aberrant mRNAs. These include mRNAs carrying a premature stop codon, lacking a stop 25 codon, or mRNAs with paused ribosomes [6]. 26 Regulation of mRNA degradation is essential for diverse cellular processes including 27 proliferation, differentiation, apoptosis and immune responses [1,[7][8][9][10]. Control of mRNA stability is also 28 important for cellular decisions and behaviour during development. For example, regulation of myc 29 mRNA stability fine-tunes the proliferation rate of neuroblasts in the Drosophila larval brain [11], an fgf8 30 mRNA gradient generated by mRNA decay couples differentiation to posterior elongation of the 31 vertebrate embryonic axis [12] and Hes1 mRNA instability is integral to the Hes1 protein ultradian 32 oscillations that may act as a timer for vertebrate neuronal differentiation [13]. In addition, a conserved 33 feature of early embryogenesis is that there is bulk degradation of maternal mRNAs around the time of 34 zygotic genome activation [14,15]. Consistent with the key roles of mRNA stability in cell biology, 35 mutations in many components of the degradation pathways are associated with human diseases [9,16]. 36 While the half-lives of strictly maternal mRNAs during embryogenesis can be readily measured 37 genome-wide [17], measuring the decay of zygotic mRNAs is more difficult due to ongoing transcription. 38 One approach is to inhibit transcription and then follow the decline in mRNA levels over time [18][19][20]. 39 Other methods involve metabolic labelling of the RNA, for example in pulse-chase or approach-to-40 equilibrium experiments [19][20][21]. Related approaches use computational models to estimate 41 transcription and degradation rates by sequencing both the total and labelled RNA following the pulse 42 [19]. Single molecule fluorescent in situ hybridisation (smFISH) imaging based methods for estimating 43 mRNA half-lives have also been described. However, these methods are not high throughput and 44 require either steady-state transcript levels [22] or a natural shut off of transcription [23]. 45 In this study we generate a high-resolution total RNA-seq time series across early 46 embryogenesis that we use to estimate half-lives and assign mRNAs into different stability classes. Our 47 data suggest that some mRNAs can be degraded in P-bodies, as the unstable mRNAs we have 48 investigated are more highly colocalised with P-bodies and we can detect 3' mRNA fragments in P-49 bodies. Overall, our data reveal the contribution of mRNA stability to shaping mRNA levels during early 50 embryogenesis and provide insight into how mRNA stability is regulated. 51 52 53 Results 54

Isolation of mRNA from early embryos captures high resolution transcriptional dynamics 55
To investigate mRNA accumulation dynamics during early Drosophila embryogenesis, we first 56 generated a total RNA-seq time series. The early Drosophila embryo undergoes a series of 14 nuclear 57 cycles within a common cytoplasm (nc1-14). RNA was isolated from single early Drosophila embryos at 58 10 time points, starting at nc11, ~ 90 minutes after egg lay (AEL) and prior to the onset of bulk zygotic 59 transcription through to the beginning of gastrulation ( Figure 1A). Embryos were collected from a 60 His2avRFP transgenic line and precisely staged at nc11, nc12, nc13 and nc14 by calculating an 61 internuclear distance ratio ( Figure S1A). Single embryos were collected in triplicate 5 min after the nc11 62 and nc12 divisions, both 5 and 15 min after the nc13 division, then at 15 min intervals during the long 63 nc14 interval, with the final time point corresponding to the appearance of the cephalic furrow (Figure 64 1A, Table 1). As male and female embryos have differences in X chromosome transcription due to 65 dosage compensation [24], we used PCR to determine the sex of each embryo and select female 66 embryos for analysis. We sequenced total RNA following rRNA depletion, rather than selecting for 67 polyadenylated RNA, allowing us to capture intronic reads and other non-coding RNA species. The 68 intronic reads allow quantification of nascent, unspliced transcripts and also detection of early zygotic 69 expression by distinguishing zygotic transcripts containing introns from maternally loaded spliced 70 mRNAs. 71 We detected a total of 18159 transcripts during early embryogenesis representing 9026 unique 72 genes. Using principal component analysis (PCA) we observed that the first two principal components 73 represented 44% and 18% of the variation respectively and the replicates at each time point clustered 74 together ( Figure 1B). This suggests the biological age of the embryos explains the majority of variation 75 within the data rather than differences between replicates, indicating the high quality of the libraries. 76 Transcript levels across embryogenesis were visualised as a heatmap, with the transcripts ordered 77 based on the time point of peak expression ( Figure 1C). We classified 4897 early peaking transcripts 78 (at the 95 or 105 min time points) as maternal, and 13262 transcripts peaking after 105 min as zygotic. 79 Of the zygotic transcripts, 23% show peak expression early in nc13 or the start of nc14 (between 115 80 and 160 min inclusive) and the remainder show late peak expression after 160 min. Analysis of different 81 dynamically expressed genes showed that our dataset included well characterised maternal (nos and 82 bcd), maternal and zygotic (Neu3 and da), early zygotic (upd1 and dpp) and late zygotic mRNAs (wg 83 and hnt) ( Figure 1D). 84 As we sequenced total RNA, we determined the number of reads that mapped to introns as well 85 as exons and transcripts. Analysis of the distribution of intronic reads shows an even read coverage 86 across introns over all time points ( Figure S2A). Only a very small proportion of transcripts at time points 87 105-125 min have intronic reads ( Figure 1E), suggesting there is only minor zygotic transcription of 88 intron-containing genes during these early stages. Previous studies have shown that the earliest zygotic 89 activation of the Drosophila genome is biased towards expression of short intronless genes [25,26], 90 which we cannot distinguish from maternally deposited transcripts at the early time points in our data. 91 In addition, the early nuclear cycles are short, limiting the time period of active transcription. 92 Nonetheless, eight genes have detectable levels of intron signal at nc12 and nc13A, suggesting early 93 zygotic transcription ( Figure S2C-D We next addressed how post-transcriptional regulation contributes to the range of mature mRNA 199 dynamics seen in our data, by combining clustering analysis with our modelling of transcript half-lives. 200 The pre-mRNA data were clustered using GPclust (Figure S5 As the clustering data indicated that half-life contributes to the shape of the mature mRNA profile, 228 we further investigated the relationship between the relative timing of the peak of the pre-mRNA and 229 mature mRNA. Visualisation of the gene-level pre-mRNA and mature mRNA data from the zygotic 230 subset as heatmaps, reveals that for a given pre-mRNA peak time, there are a range of mature mRNA 231 peak times with different delays ( Figure 3B). Delay is defined as the time difference at which the peak 232 is observed for the pre-and mature mRNA. The pre-mRNA and mature mRNA data were modelled 233 using a Gaussian process which was then sampled with n = 100, so that the time delay between the 234 peaks could be determined and the uncertainty in the estimate quantified ( Figure 3C). The relationship 235 between delay and half-life, for each transcript that has been modelled, is shown in Figure 3D. There is 236 a moderately positive yet significant correlation between the two variables. Figure 3E shows the data 237 as a confusion matrix in order to assess whether delay is predictive of half-life. Enrichment along the 238 diagonal supports this; 63% of short delay genes have short half-lives; 67% of medium delay genes 239 have medium half-lives and 72% of long delay genes have long half-lives. Together, these results reveal 240 how post-transcriptional regulation is able to shape mature mRNA dynamics through regulation of 241 mRNA half-lives and that the time delay between maximum expression of the pre-mRNA and mature 242 mRNA can be used as an indicator of mRNA stability. 243

244
The short half-life mRNAs tested are more compact 245 The degradation of an mRNA in the cytoplasm can be closely linked to its translation [1]. We therefore 246 investigated how mRNA half-lives are shaped by both structural and sequence features known to 247 influence translation. Regulatory sequences controlling mRNA degradation, translation and localisation 248 are frequently located in the 3'UTR [40]. We found that 3'UTR length does not have any significant 249 correlation with our inferred half-lives (Figure S7A Figure 4A). To extend this analysis, we also determined the codon stabilisation coefficient (CSC) for 258 each codon which is a measure of the correlation between codon usage and stability of mRNAs. We 259 plotted the CSC of each codon ordered by this value from highest to lowest (Figure 4B negative CSC groups (33% vs. 39%, p = 0.79, Figure 4B). There is also no significant difference in the 263 proportion of optimal codons for transcripts within each of the different categories of half-life ( Figure 4C) 264 and clustering mRNAs based on codon usage showed that different clusters had similar half-lives 265 ( Figure S7C). 266 We next used imaging to analyse mRNA compaction in the context of stability. A more open 267 conformation has been detected for specific mRNAs when they are being translated [44][45][46], raising the 268 possibility that a particular conformation may also influence mRNA stability. We therefore selected a set 269 of 11 zygotic mRNAs, 4 each from the medium and long half-life categories and 3 from the short half-270 life category ( Figure S8A). A 4 th short half-life mRNA, Neu2, is too short to separate the probe sets for 271 compaction analysis but is included in further analysis (see later). We used dual-colour smFISH probes 272 to visualise their 5' and 3' ends, and quantitate the distance between them, in fixed embryos (Figure 273 4D). A representative smFISH image for one of the mRNAs, lov, is shown in Figure 4E, images for the 274 other mRNAs tested are shown in Figure S8B. 5' ends, 3' ends and colocalised ends labelled by magenta, yellow and white arrowheads, respectively. 287 Scale bars: 5 µm. In the uncropped image from this embryo there are 5668, 1645 and 3620 intact, lone 288 5' and lone 3' signals, respectively. For absolute numbers of intact mRNAs and lone ends for all mRNAs, 289 see Figure S9A. (F) Graph shows the end-to-end distances of mRNAs with different stabilities, n = 3 290 embryos per mRNA. Data are mean distances across all colocalised mRNAs in each embryo (n > 220 291 whole RNAs for all images). Genes are grouped by their half-life category and the hue in each category 292 corresponds to the order of the half-lives (lighter colour refers to shorter half-life). Short half-life mRNAs 293 are more compact than both medium (p = 3.1 x 10 -3 ) and long (p = 1.6 x 10 -3 ) half-life mRNAs. No 294 significant difference in end-to-end distance was seen between medium and long half-life transcripts (p 295 = 8.9 x 10 -1 ). 296 For each image, the number and position of the 5' and 3' signals were collected and pairs were 297 identified by solving a paired assignment problem (Figure 4Dii). For each pair, the distance between the 298 5' and 3' signals was then measured; only ends with a distance less than 300 nm were assigned as 299 same mRNA [46]. First, we estimated our smFISH detection efficiencies using alternating fluorophores 300 for the otd and lov mRNAs ( Figure S9Bi). These data reveal mean detection efficiencies of ~70% (Figure 301 S9Bii), which is in the 70-90% range reported from other smFISH studies [47-51]. However, we note 302 that our detection of the 670 labelled probe sets is generally slightly poorer than that of the 570 probes 303 due to a lower signal to noise, consistent with findings from a previous study that used 670 labelled 304 The distributions of end-to-end distances for each of the mRNAs tested reveal that short half-life 306 mRNAs are significantly more compact, based on a smaller end-to-end distance, than mRNAs in the 307 medium and long half-life categories ( Figure 4F). Considering the lower detection limit of the imaging 308 setup we used is ~120nm (see Methods), we found that for our alternating probe sets and the 5' and 3' 309 compaction data, the otd and lov short half-life mRNAs had an end to end distance that is very compact 310 and close to this limit ( Figure 4F). No significant difference is observed in the end-to-end distance for 311 mRNAs in the medium and long half-life categories however we did find some mRNAs in the long 312 category were in a more open conformation than those in the medium category ( Figure 4F). We also 313 identified unpaired mRNA ends (see later), which were further apart than the 300 nm distance threshold 314 used. Finally, quantitation of additional control smFISH experiments for some of the test mRNAs, in 315 which the fluorophore dyes on each set of probes were switched (Figures S9C) in order to control for 316 detection differences between the channels mentioned above, also revealed significantly shorter end-317 to-end distances for short half-life mRNAs ( Figure S9D). 318 We find no correlation between compaction and mRNA length, as brachyenteron (byn) and 319 echinus (ec) are the shortest and longest mRNAs tested, respectively. Taken together these results 320 suggest that within early Drosophila development, the decay of zygotically expressed genes is not 321 strongly correlated with translation efficiency or codon optimality, but unstable mRNAs tend to be slightly 322 more compact than medium and long half-life transcripts. 323 324

Embryonic P-bodies are associated with unstable mRNAs and enriched in 3' decay fragments 325
Cytoplasmic P-bodies have been implicated in mRNA degradation and storage in Drosophila [52]. 326 Therefore, we investigated whether mRNAs with distinct stabilities are differentially localised to P-327 bodies. We visualised P-bodies using Me31B, a marker of P-bodies, including in Drosophila For each mRNA tested, a proportion of the individual mRNA signals colocalise with P-bodies 351 ( Figure 5A, Figure S10A). As seen in Figure 5A, orthodenticle (otd) (also called ocelliless) mRNAs 352 appear more highly colocalised with P-bodies than ltl mRNAs. As otd has a much shorter half-life than 353 ltl (3 mins and 249 mins respectively), we examined whether this was a trend across the set of test 354 mRNAs. To quantitate colocalisation, we used a colocalisation index that controls for variation in mRNA 355 and P-body numbers between embryos ( Figure 5B). This analysis reveals that the both the short and 356 medium half-life mRNAs tested are significantly more colocalised with P-bodies than the long half-life 357 mRNAs tested ( Figure 5C). While the mean colocalisation index value for short half-life mRNAs is higher 358 than that of the medium half-life mRNAs tested, this difference is not significant, due to higher variance 359 in the colocalisation index of short half-life mRNAs as lov, a short half-life mRNA, has a particularly low 360 colocalisation index. 361 Given the difference in P-body colocalisation observed for some of the test mRNAs, we extended 362 this analysis by using published Me31B RIP-seq data from the early Drosophila embryo [52]. This 363 analysis reveals a relatively weak but significant negative correlation between Me31B interaction and 364 mRNA half-life in 1-2 hr embryos ( Figure 5D) and 2-3 hr embryos ( Figure S10Ci). This negative 365 correlation between the Me31B RIP-seq data and our model half-lives is no longer significant when the 366 RIP-seq data from 3-4 hr embryos are used ( Figure S10Cii), a later stage than we have imaged. These 367 data are consistent with a previously reported negative correlation between Me31B binding and mRNA 368 stability in the Drosophila embryo, when fold change in mRNA abundance was used as a proxy for 369 mRNA stability [52]. Together, our imaging data and the negative correlation between RIP-seq 370 interaction and mRNA half-life suggest that in the Drosophila embryo P-bodies may be sites of mRNA 371 degradation for at least a subset of mRNAs. 372 In our dual-colour smFISH images we observed a proportion of unpaired 5' and 3' mRNA ends 373 suggestive of degradation intermediates ( Figure 4E). We detect more lone ends when we use 5' and 3' 374 otd compaction probes, compared to alternating probes ( Figure S9E), providing further support that 375 some of the lone signals are due to mRNA degradation, as detection with alternating probes is more 376 resistant to loss of mRNA 5' and 3' sequences. In addition, due to the short length of Drosophila mRNAs, 377 we are using 24-30 probes in each detection set. Therefore, it is likely that loss of binding of only a small 378 number of probes from the 5' or 3' set is enough to take the signal below the detection threshold, 379 facilitating our detection of partly degraded mRNAs. 380 In order to determine if these 5' and 3' fragments co-localised with P-bodies we assessed 381 whether the 5' and 3' probe sets colocalised with the GFP-Me31B P-body marker. An image of an early 382 nc14 embryo is shown for the Dfd mRNA in Figure 6A, revealing that some complete mRNAs (orange 383 arrowhead) and lone 3' ends (yellow arrowhead) are colocalised with the P-body marker Me31B. 384 However, colocalisation of lone 5' ends with Me31B is less evident. For clarity, an equivalent region of 385 an early nc14 embryo is shown as 3 colour images with only either the 5' or 3' end of Dfd mRNAs, 386 Me31B and DAPI ( Figure 6B, Figure S11). For the analysis, we identified unpaired 5' and 3' ends as 387 described above and assessed if there is an enrichment of either end in P-bodies ( Figure S12A). In 388 general, we do not see an excess of lone 3' ends compared to 5' ends across the mRNAs we tested 389 ( Figure S9). However, quantitation of the proportion of single 5' and 3' signals that localise to P-bodies 390 reveals a general trend of more unpaired 3' ends in P-bodies, which is significant for over half the 391 mRNAs investigated ( Figure 6C). Similar results are obtained when the fluorophores on the otd, Dfd 392 and cv-2 5' and 3' probes are reversed ( Figure S12Bi). Furthermore, this trend is lost when we use 393 alternating probes for otd (Figure S12Bii). Taken together these results suggest that the lone 3' signals 394 detected in P-bodies are consistent with them being 5' to 3' mRNA decay intermediates. Additionally, 395 comparison of the proportion of lone ends versus intact mRNAs in P-bodies (relative to the total number 396 of each) reveals that in general the proportion of intact mRNAs and lone 3' ends in P-bodies is similar 397 but there are more intact mRNAs that lone 5' ends ( Figure S12C). This detection of intact mRNAs in P-398 bodies may support a storage role in addition to 5' to 3' decay (see Discussion). Together, these data 399 suggest mRNA degradation can occur within P-bodies for at least some mRNAs in the early Drosophila Quantification of the percentage of unpaired mRNA 5' and 3' ends with P-bodies relative to the total 417 number of lone 5' or 3' ends. n = 3 embryos, paired t-test used to determine significance with  = 0.05. 418 For absolute numbers of intact mRNAs and lone ends, see Figure S9A and S9B. 419 420

Discussion 421
Here, using total RNA-seq time series data and Gaussian process regression, half-lives of Disadvantages of our approach are firstly that it is not global as ~20% of Drosophila genes 441 expressed in our dataset do not contain introns. Secondly, even for mRNAs with introns, we only derived 442 a proportion of transcript half-lives from the dataset due to strict filtering to ensure that there is signal in 443 both the intron and transcript expression time-series meaning genes with small introns and therefore 444 poor signal would also be excluded. Thirdly, for genes with high degradation rates, there may be high 445 uncertainty in the inferred degradation rate since the splicing rate and degradation rate estimates 446 become difficult to disentangle (for simulations demonstrating the reach of the model see 447 Supplementary Methods). Fourthly, the modelling requires some computational expertise in this area to 448 implement on a new dataset. Potential solutions to overcoming these issues would be to generate a Pol 449 II ChIP-seq time series for the transcription profiles that would allow intronless genes to be studied. 450 Looser filtering could be applied to provide half-lives for more mRNAs, although this would potentially 451 lower confidence in the estimates. Finally, the delay between the peak of the pre-mRNA and mature 452 mRNA could be measured as a simpler approach for categorising stability, as we have shown that the 453 stability of a transcript can be classified using this delay. Our imaging data on the test set of mRNAs show that those with short half-lives tend to be more 496 colocalised with the P-body marker Me31B than more stable mRNAs in the early embryo. Consistent 497 with this, using published Me31B RIP-seq data from the early Drosophila embryo, we find a significant 498 correlation between Me31B interaction and mRNA half-life across the set of ~260 mRNAs for which we 499 estimated half-lives. The stronger association of short half-life mRNAs with P-bodies and our ability to 500 detect mRNAs lacking their 5' end in P-bodies suggests that 5' to 3' mRNA decay can occur in P-bodies 501 in the early Drosophila embryo. However, the majority of the lone 3' ends we detect are in the cytosol, 502 suggesting that mRNAs can also undergo 5' to 3' decay outside of P-bodies. We also note that the 503 localisation of mRNAs with P-bodies is variable as, within the short and medium half-life categories, the 504 lov and byn mRNAs are less colocalised with P-bodies. This suggests that for these mRNAs in particular, 505 degradation in P-bodies may only have a minor contribution to their turnover. 506 Although we generally detect similar proportions of 5' and 3' end fragments of a particular mRNA 507 in the cytoplasm, there is weaker colocalisation of 5' end fragments with P-bodies. This observation 508 suggests that 3' to 5' mRNA degradation by the exosome does not occur in P-bodies, consistent with 509 components of the exosome being largely absent [4]. We also detect a similar proportion of intact 510 mRNAs in P-bodies (relative to the total number in the cytoplasm), as we find for lone 3' ends. The 511 presence of intact mRNAs in P-bodies may reflect an mRNA storage role. Therefore, we speculate that 512 in the Drosophila embryo mRNAs enter P-bodies where they can undergo either: 1) 5' to 3' degradation 513 (hence the lone 3' ends detected), or 2) transient storage before exit back into the cytoplasm for 514 translation. 515 A role for P-bodies in 5' to 3' decay is consistent with early studies in yeast following the discovery 516 of P-bodies [70] and with later work in Drosophila suggesting that Me31B is involved in mRNA 517 degradation in the embryo following zygotic genome activation [52] and P-bodies are sites of mRNA 518 degradation in intestinal stem cells [71]. In addition, the Xrn1 exonuclease localises to P-bodies in yeast, 519 Drosophila and mammalian cells [72]. However, P-bodies have been implicated in mRNA storage and 520 translational repression in mature Drosophila oocytes [55] and Me31B represses translation of maternal 521 mRNAs in Drosophila embryos prior to zygotic genome activation [52]. Moreover, many lines of evidence 522 from other systems argue against a role for P-bodies in mRNA degradation. These include an absence 523 of detectable mRNA decay intermediates either following purification of P-bodies [73] or based on a live 524 imaging approach [47], mRNA degradation when P-body formation is disrupted [73] and the ability of P-525 body mRNAs to re-enter translation [74,75]. Although the sequencing data following P-body purification 526 from human tissue culture cells provided evidence for mRNA storage, and do not support a role for P-527 bodies in bulk mRNA degradation [73], we note that two pieces of data are potentially consistent with 528 some degradation occurring in P-bodies. Firstly, there is a weak correlation between mRNA P-body 529 enrichment and half-life, and secondly a 3-fold difference in the median half-lives of the most strongly 530 enriched versus depleted P-body mRNAs was observed [73]. 531 We speculate that P-bodies are involved in both storage and degradation in an mRNA dependent 532 manner, with features of an individual mRNA as well as the proteins present in P-bodies at a particular 533 developmental time influencing which function dominates. In support of this, it is known that there are 534 changes in P-bodies during Drosophila development, for example from being large and viscous in the 535 oocyte to smaller, more dynamic structures in the early embryo [55]. Moreover, at the maternal-to-536 zygotic transition some P-body proteins are degraded, including the Cup translational repressor protein, 537 which may increase the prevalence of mRNA decay in P-bodies [52]. Our data suggest that the

Fly stocks 550
All stocks were grown at 25°C and maintained at 20°C for experiments on standard fly food media (yeast 551 50g/L, glucose 78g/L, maize 72g/L, agar 8g/L, 10% nipagen in EtOH 27mL/L and proprionic acid 3mL/L). 552 The following fly lines were used in this study, y 1 w* (BDSC Stock #6599), y 1 w*; P{His2Av-mRFP1}II. were visualised and imaged with a Leica optigrid microscope at 20X magnification using a Texas red 562 filter. Embryos were timed following the observation of a nuclear division, an image was taken and the 563 embryo was immediately picked out of the oil droplet with a pipette tip and transferred to Eppendorf 564 tubes containing 50µL TRIzol Reagent (Invitrogen). Single embryos were crushed and homogenised 565 using a pipette tip and an additional 450uL Trizol added. Samples were immediately snap frozen in 566 liquid nitrogen and stored at -80°C until processing for nucleic acid extraction. 567 Ten timepoints were collected spanning early Drosophila embryonic development from nc11 568 through to cephalic furrow formation (Table 1). Embryos were collected 5 minutes after nuclear division 569 for nc11 and nc12, 5 and 15 minutes following the nc13 nuclear division and every 15 minutes following 570 the nc14 nuclear division as well as embryos that showed clear cephalic furrow formation. This yielded 571 samples covering every 10-15 minutes through development from nc11 to cephalic furrow formation. 572 The internuclear distance of 15-20 nuclei pairs per embryo was measured in Fiji and normalised to the 573 whole embryo length to obtain an average internuclear distance per embryo ( Figure S1A and B). This 574 was compared to the internuclear distance of embryos of known stages to accurately confirm the nuclear 575 cleavage stage and age of embryos. All embryos were collected at 20°C with approximate time after 576 egg lay in minutes shown in Table 1. 577 578  All images were collected sequentially and optical stacks were acquired at system optimised 632 spacing. Imaging of the membrane using brightfield or anti-Spectrin antibody at the mid-sagittal plane 633 of the embryo with 40x objective at 0.75X zoom and 1024 X 1024 format was used to measure the 634 average length of membrane invagination from at least 5 cells. These measurements were used to 635 select embryos of a similar age in early nuclear cycle 14 (10 m membrane invagination). For all 636 analysis, 3 separate embryos were imaged and quantified as independent replicates. smFISH images 637 were deconvolved using Huygens professional deconvolution software by SVI (Scientific Volume 638 Imaging). By deconvolving images taken on a Leica SP8 confocal we estimate that our lower detection 639 limit is ~120nm. 640 641

Image analysis 642
The spot detection algorithm Airlocalize [79] was used to detect and quantify TSs, single mRNAs and 643 P-bodies within confocal microscopy images. This software fits a 3D gaussian mask and gives the 644 coordinates in X, Y and Z of each spot and its intensity. Z stack images were first subsetted to detect 645 TSs within the range of Z slices around the nuclei. Images were then processed again to detect single 646 mRNAs in the full image. The TS data was then used to remove these high intensity spots from the 647 single mRNA data. Detection of 5' and 3' single mRNA ends and P-bodies was performed separately 648 on each corresponding channel image as appropriate. 649 The whole-embryo total RNA-seq dataset was also processed at the gene level in order to 671 quantify the intronic reads, by aligning data to BDGP6 (dm6) using STAR with default parameters. 672 FeatureCounts was used to get the counts data for exons and introns, respectively. Modified RPKM 673 (reads per kilobase of transcript per million reads mapped) normalisation was applied to exon and intron 674 counts data, where the total mapped reads for each library were used to address the sequencing depth 675 for exon and intron counts from the same sample yielding 11,587 genes with a detectable level of 676 expression (RPKM > 0). 677 To model the pre-mRNA dynamics, any genes without introns, or with zero intronic reads across 678 all timepoints were removed to give a set of 5035 genes and the intron sizes were then used to obtain where ( ), is assumed to be a Gaussian process with RBF kernel [31,32]. This differential equation 703 can be solved in closed form and it can be shown that ( ) is also a Gaussian process with a certain 704 kernel. For more details and specification of this kernel we refer to Supplementary Methods. As the 705 results, ( ) and ( ) can be modelled jointly as a Gaussian process regression with a block kernel 706 which depends on biologically interpretable parameters such as S and D. It is assumed that we observe 707 ( ) and ( ) at discrete times with measurement noise terms which have variances 2 and 2 for 708 mRNA and pre-mRNA respectively. Thus, we have six parameters which we estimate: two parameters 709 of RBF kernel (lengthscale, 2variance, which correspondingly define smoothness and amplitude 710 of possible functions underlying pre-mRNA dynamics), two parameters and , which describe the 711 relationship between mRNA and pre-mRNA, and two measurement noise variances 2 and 2 . 712 We assign priors to these six parameters and use the L-BFGS-B algorithm to find maximum a sites (see Image analysis), the spot position data was analysed with a custom Python script to find 775 optimal spot pairs by solving a paired assignment problem. The distance between n 5′ spots and m 3′ 776 spots are computed and stored as a distance matrix. The optimal assignment of 5' and 3' pairs is then 777 found by minimising this distance matrix to give a set of paired spots with a minimum total distance 778 between all pairs. Spot pairs are then filtered for distances less than 300 nm where the ends are 779 considered to be colocalised and belonging to the same RNA. This 300 nm upper threshold was selected 780 as described in a previous study [46]. For all colocalised 5' and 3' spots the distribution of distances was 781 then analysed and summary statistics extracted. 782 783 Analysis of mRNA colocalisation with Processing bodies 784 mRNA localisation within Processing bodies (P-bodies) was determined from confocal images using a 785 custom script in Python. This script uses the position data for the mRNAs and P-bodies output from 786 Airlocalize and calculates the distance between a given mRNA and every P-body. The minimum 787 distance is then selected so that an mRNA is assigned to its closest P-body. If this distance is less than 788 200 nm (a typical radius of a P-body) then the RNA is considered to be colocalised with the P-body. The 789 proportion of mRNAs located within and outside of P-bodies is then analysed to determine whether a 790 given gene is enriched within P-bodies in the cytoplasm. In order to do this, we derived the P-body 791 colocalisation index, a measure of the degree of colocalisation with P-bodies of an mRNA of interest: 792 Where CP is the P-body colocalisation index, mcoloc is the number of mRNAs colocalised with P-bodies, 797 mtotal is the total number of mRNAs and NP is the number of P-bodies. 798 For analysis of unpaired ends, any 5' or 3' spot which was unpaired from the optimal assignment, 799 or was more than 300 nm away from its assigned pair, was considered. The colocalisation of these with 800 P-bodies was then analysed using a more conservative threshold of 150 nm, to ensure a sufficient 801 proportion of the mRNA was located inside the P-body. The enrichment of unpaired ends in P-bodies 802 was then derived by dividing the number of unpaired ends in P-bodies by the total number of unpaired 803 ends for each channel. 804 805 Acknowledgements 806 We thank Jing Yang for processing the RNA-seq data, Lijing Lin for help with the intronic read coverage, 807 Nuha BinTayyash for filtering the data for dynamics, Mark Ashe for helpful discussions, the University 808