Abstract
The endoplasmic reticulum (ER) is a structurally complex, membrane-enclosed compartment that stretches from the nuclear envelope to the extreme periphery of eukaryotic cells. The organelle is crucial for numerous distinct cellular processes, but how these processes are spatially regulated within the structure is unclear. Traditional imaging-based approaches to understanding protein dynamics within the organelle are limited by the convoluted structure and rapid movement of molecular components. Here, we introduce a combinatorial imaging and machine learning-assisted image analysis approach to track the motion of photoactivated proteins within the ER of live cells. We find that simultaneous knowledge of the underlying ER structure is required to accurately analyze fluorescently-tagged protein redistribution, and after appropriate structural calibration we see all proteins assayed show signatures of Brownian diffusion-dominated motion over micron spatial scales. Remarkably, we find that in some cells the ER structure can be explored in a highly asymmetric manner, likely as a result of uneven connectivity within the organelle. This remains true independently of the size or folding state of the fluorescently-tagged molecules, suggesting a potential role for ER connectivity in driving spatially regulated biology in eukaryotes.
I. INTRODUCTION
One of the most notable aspects of eukaryotic life is the development of membrane-enclosed organelles [6]. These compartments provide two important advantages to cells. First, they segregate unique chemical environments that allow incompatible biochemical reactions to occur simultaneously, and second, they provide extensive two-dimensional surfaces (membranes) to facilitate the efficiency of biological reactions [2]. The largest organelle in eukaryotes is the endoplasmic reticulum (ER). Topologically, the ER is a single, immense compartment that divides the nucleoplasm from the cytoplasm and stretches to the furthest reaches of the cell as a structurally complex membrane-enclosed network [25]. It is a major site of cellular translation, protein folding, and quality control [60, 62], the major cellular calcium store and sink in most cells [8], and the primary site of lipid synthesis and regulation [77]. Thus, understanding how molecules are spatially distributed over time within the ER network is important for understanding how their functions are achieved.
The development of technologies for fluorescent tagging of molecules has opened the possibility to probe the dynamic nature of proteins within the ER [17, 75]. In particular, fluorescence recovery after photobleaching (FRAP) and fluorescence loss in photobleaching (FLIP) [39] have provided significant insight into the way many components of the compartment behave [41]. These studies have suggested that the composition of the organelle is strikingly dynamic, equilibrating on a timescale of seconds [11, 50]. However, these studies are limited to single structurally heterogeneous spatial regions (FRAP) or time scales of tens of seconds (FLIP), and the ability to simultaneously observe high-speed ensemble dynamics across the entire cell has proven elusive. The development of photoactivatable fluorophores [57] has enabled this in other systems [30, 52], but most photoactivatable fluorophores do not fold correctly or provide low SNR in the high redox environment of the ER lumen [12, 33].
An additional limitation to understanding protein redistribution within the ER is provided by the convoluted structure of the organelle itself [73]. The small size of ER substructures necessitates that molecules are not free to move uniformly in any dimension. Previous work has attempted to account for this in FRAP data with a number of clever modeling-based approaches, but the inability to directly observe the structure in question during fluorescent recovery has led to inconsistent results from varying model assumptions [55, 66, 67, 71].
In this article, we develop a combined experimental and analysis pipeline to surmount these two issues. We have used photoactivation of protein-linked organic dyes that work well within the ER lumen to track the dynamic distribution of ER-localized molecules. During photoactivation, we simultaneously image the underlying structure of the organelle with an independent camera and orthogonal fluorescent label. This approach supports high-speed imaging while also capturing entire cells, and allows us to place the photoactivated material back in the structure it is localized to in real time. We then coupled this to a novel, machine learning-assisted analysis pipeline to provide a first-order approximation of the complex organelle structure, allowing estimation of the potential paths taken by the fluorescent components. Collectively, this pipeline allows the extraction of quantitative information about the molecular behaviour within the ER at the whole-cell scale.
II. RESULTS
In order to track ER protein dynamics across entire cells, we utilized photoactivatable organic dyes that can be directly coupled to a protein of interest by genetic fusion to a HaloTag [18, 26, 42]. These dyes are functional in the ER lumen and are not thought to induce the misfolding and ER stress-related pathways common with fluorescent proteins not optimized for the high redox environment within this compartment [13]. Once labelled with the dye, proteins of interest can be fluorescently marked at specific locations within the ER structure by stimulation with 405 nm light, and their distribution over time can be monitored by high-speed microscopy. We performed experiments in cells also expressing a non-perturbative GFP derivative (moxGFP) targeted to the ER lumen [12], to provide a simultaneously collected map of the underlying structure of the ER for downstream analysis purposes.
Briefly, COS-7 cells were cotransfected with moxGFP and a HaloTag fused to the protein construct of interest. We then used high-speed spinning disk microscopy (at 10 Hz acquisition rate) with dual cameras to simultaneously collect the distribution of the photoactivated species of the dye and the fluorescence of the GFP label. This resulted in two stacks of paired 2D images collected at continuous 100 ms intervals. Photoactivation was carried out continuously at a single, diffraction-limited spot, and the diffusion of molecules away from that spot was visible as the distribution of the converted dye molecule (fig. 1).
A The ER structure of a COS-7 cell. The red dot indicates the location of photoactivation. Green regions correspond to ER domains that are equidistant (d = 15 μm) from the photoactivation region (distance calculated on the underlying ER morphology). The blue dashed line denotes the Euclidean distance of 15 μm from the photoactivation region when the ER morphology is neglected. B1 Data and analytical fit of the fluorescence curve inside the photoactivation region. B2 Average fluorescence of the target regions at distance 15 μm from the photoactivation source and fit of the effective diffusion model (D = 11 μm² s⁻¹). C Effective diffusion coefficient as a function of the distance d from the photoactivation region, obtained from separate fits at each distance, for the morphology-aware model (orange) and the standard model neglecting ER morphology (light blue). The error bars represent the approximate confidence interval estimated from the observed Fisher information (± 5 standard deviations). D Effective mean-squared displacement (MSD) obtained by ensemble estimate. E Distribution of the effective MSD exponent α of each cell for the morphology-aware and the standard model (N = 33 and N = 37 respectively), obtained by ensemble estimate. F Distribution of the median cellular diffusion coefficients obtained by ensemble estimate for HaloTag-KDEL in the ER lumen, comparing the morphology-aware model (green) and the standard model neglecting morphology (light blue). G Example of ER in a COS-7 cell. The photoactivation region is denoted by the red dot. H Structure of the ER network of the cell in G with colour scale denoting distances from the photoactivation source. I Kymograph of the average fluorescence in time of the cell in G, binned over the distance from the photoactivation source through the structure. The white dashed line denotes the square root of the effective MSD curve of the fitted diffusion model.
A. Ensemble diffusion governs dynamics of ER luminal content
We first set out to evaluate the dynamics of lumen-resident proteins that are freely floating (i.e. without membrane anchors or known specific binding partners). We assumed a globally isotropic but not necessarily diffusive motion, since local directional flows may be present [31]. To accomplish this, we used a HaloTag targeted to the ER lumen with a signal sequence and a C-terminal KDEL retention motif as a model probe for free-floating luminal content (HaloTag-KDEL).
Proteins within the ER lumen are constrained to the compartment by the bounding membrane, so they are not free to move in any arbitrary direction. Thus, analysis of the distribution patterns must account for the underlying organelle structure, or model fitting will be prone to artefacts (e.g., [73]). To address this, the ER was first segmented by analysing the signal from the moxGFP to extract the network structure (see Methods, section IV A and supplementary figure S1). Regions of the ER were characterized by their distance from the PA source as measured through the underlying structure of the ER (fig. 1A, inset), as opposed to Euclidean distance (fig. 1A, dashed line).
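To illustrate the difference between through-structure and Euclidean distance, the following is a minimal sketch that assumes the segmentation is available as a binary pixel mask. The function name `geodesic_distances`, the 8-connected neighbourhood, and the use of Dijkstra's algorithm are illustrative assumptions, not the procedure described in Methods.

```python
import heapq
import math

def geodesic_distances(mask, source, pixel_size=1.0):
    """Shortest path length from `source` to every reachable mask pixel.

    mask: list of rows, truthy where a pixel belongs to the segmented ER.
    source: (row, col) of the photoactivation spot, assumed to lie on the mask.
    Returns {(row, col): distance}; pixels not connected to the source
    through the mask are simply absent from the result.
    """
    rows, cols = len(mask), len(mask[0])
    # 8-connected moves; diagonal steps cost sqrt(2) pixels.
    steps = [(dr, dc, math.hypot(dr, dc))
             for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist.get((r, c), math.inf):
            continue  # stale queue entry
        for dr, dc, w in steps:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc]:
                nd = d + w * pixel_size
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist

# A U-shaped toy structure: the two bottom corners are 2 pixels apart in
# straight-line terms, but the path through the structure is much longer.
toy_mask = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 0, 1],
]
through_structure = geodesic_distances(toy_mask, (2, 0))[(2, 2)]
straight_line = math.hypot(0, 2)
```

In this toy case the through-structure distance is more than twice the straight-line distance, which is the kind of discrepancy that would bias a fit based on Euclidean distance alone.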
Using this approach, the estimated arrival time of fluorescent molecules can be coupled with the distance from the PA source and the amount of photoactivation to produce an effective diffusion model. In principle, this model depends on the rate of photoconversion by the photoactivation laser, which cannot be easily determined. To circumvent this issue, we take a hybrid data-driven modelling approach. We first quantified the photoactivated fluorescence over time at the PA source and fitted it with an arbitrary parametric function ϕ(t). In our case, we found that a function ϕ(t) in the form of a sum of saturating exponentials could well describe the data (fig. 1B1, and see Methods). We then fixed the effective diffusion model, conditioning it on ϕ(t) at the source region, effectively constraining the model based on the measured intensity of the PA source. Once the source parameters were determined, we performed a maximum-likelihood fit of the fluorescence intensity model to extract the effective diffusion coefficients corresponding to the average fluorescence measured in the equidistant points (fig. 1B2). We constrained the model parameters based on the density of the ER structure, which we estimated directly from the moxGFP channel (Methods, section IV C). The fluorescence intensity model also considers the possible background photoactivation (fig. 1B2, dashed line) due to inefficient photoactivation of the dye by the laser used to image the GFP. Note that this approach can be modified to deal with sources of anisotropy if the model fit is not good, as would be predicted by synchronous directed flows or other micron-scale sources of anisotropy (see section E below). We then repeated this procedure for every distance value in the structure, obtaining distance-dependent estimations of the effective diffusion coefficient.
In fig. 1C we compare, on an example cell, the results of this morphology-aware model (orange) to a standard maximum-likelihood fit that only considers Euclidean distances and that does not impose constraints based on ER density (light blue). We note that the approach that does not take ER morphology into account produces estimates of the diffusion coefficient that increase with the distance from the PA source (an indication of super-diffusive motion), while this effect disappears when considering a morphology-aware model.
In the immediate proximity of the PA source both approaches are highly sensitive to variations in fluorescence and can badly estimate the effective diffusion coefficient. Similarly, the quality of the fit degrades in regions that are too far away from the PA source, since only a relatively small fraction of molecules can reach those regions over the considered timescale, thus decreasing the signal-to-noise ratio of the measured fluorescence. To mitigate these issues, we estimate the uncertainty of the diffusion coefficient fit based on observed Fisher information (see Methods, section IV C), discarding data points with standard deviation above a threshold of 0.1 μm² s⁻¹.
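For concreteness, a sum of saturating exponentials can be written in the generic form below. The number of terms K, the amplitudes a_k, the time constants τ_k, and the sign constraints are notation assumed here for illustration only; the parametrization actually used is the one given in Methods.

```latex
\phi(t) = \sum_{k=1}^{K} a_k \left( 1 - e^{-t/\tau_k} \right),
\qquad a_k \ge 0,\quad \tau_k > 0
```

Written this way, ϕ(0) = 0 and ϕ(t) rises monotonically towards the plateau Σ a_k, matching the qualitative behaviour expected of a source that is photoactivated continuously.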
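The uncertainty filter can be sketched as follows for a fit with a single free parameter. The observed Fisher information is taken as the curvature of the negative log-likelihood at its minimum, and its inverse square root gives an approximate standard deviation. The function names, the finite-difference step, and the toy Gaussian likelihood are assumptions made for illustration; the likelihood actually used is defined in Methods.

```python
import math

def observed_fisher_sd(neg_log_lik, theta_hat, h=1e-3):
    """Approximate SD of a scalar maximum-likelihood estimate.

    The observed Fisher information is the second derivative of the
    negative log-likelihood at the optimum, estimated here with a
    central finite difference of step h.
    """
    curvature = (neg_log_lik(theta_hat + h)
                 - 2.0 * neg_log_lik(theta_hat)
                 + neg_log_lik(theta_hat - h)) / h ** 2
    if curvature <= 0:
        return math.inf  # flat or ill-conditioned optimum: treat as unusable
    return 1.0 / math.sqrt(curvature)

def keep_fit(sd, threshold=0.1):
    """Keep a fit only if its SD does not exceed the threshold (μm² s⁻¹)."""
    return sd <= threshold

# Toy example (hypothetical numbers): four noisy readings of one quantity
# with known Gaussian noise; the maximum-likelihood estimate is their mean.
samples = [10.9, 11.2, 11.0, 10.9]
sigma = 0.3
nll = lambda mu: sum((x - mu) ** 2 for x in samples) / (2 * sigma ** 2)
mu_hat = sum(samples) / len(samples)
sd = observed_fisher_sd(nll, mu_hat)
```

For this toy likelihood the curvature is n/σ², so the estimated SD is close to σ/√n = 0.15 and the point would be discarded under the 0.1 threshold.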
To evaluate the validity of a diffusion model more generally, we represented the fit results for each cell in the form of effective diffusion timescales, creating an effective MSD curve describing the time evolution of the average distance travelled by a protein released from the PA source (fig. 1D). The exponent α of the MSD curve allowed us to evaluate possible deviations from standard diffusion (α = 1), such as subdiffusion (α < 1) or superdiffusion (α > 1). For each cell, we determined the MSD exponent α by fitting the effective MSD data with a function of the form c·t^α (fig. 1D), comparing the proposed morphology-aware model (black dashed line) with the MSD obtained from a standard model that neglects ER morphology (blue dotted line). We found that population statistics of cells expressing HaloTag-KDEL had an average MSD exponent α ≈ 1, thus suggesting that luminal dynamics can be well described by standard diffusion over this spatiotemporal scale (fig. 1E, green). We also note how neglecting ER morphology resulted in apparent super-diffusive behaviour with α > 1 (fig. 1E, light blue). This is a surprising contrast to modeling performed in the peripheral ER, where erroneous model fits as a result of structural confinement generally manifest as subdiffusive scaling of the MSD exponent (likely the result of lower Renyi dimensionality and competing molecular populations, see Discussion and Supplementary Text and Discussion, Sections 1 and 5) [55, 66, 67, 73]. However, in agreement with these studies, model fits that account for the structure of the ER show significantly faster diffusion rates than those that neglect them (fig. 1F).
For organelles with less complex morphology, changes in the distribution of fluorescent molecules over time are often effectively represented with a kymograph, i.e. a time-space plot of the fluorescence observed along one dimension over time [32, 45, 85]. In the ER, this is challenging due to the circuitous and dynamic nature of paths through the structure and the relatively low number of fluorescent molecules present in any isolated structural component. As a visualization tool, we used the distances calculated between the PA source and every point of the segmented ER structure (fig. 1G–H), and averaged the fluorescence values at each distance to create an equivalent kymograph describing the time-evolution of the fluorescence as a function of the distance from the PA source (fig. 1I). This representation shows concentrated photoactivated signal at the bottom (close to the PA source) and a gradual spreading through the structure moving upwards (towards distant regions) as the time of photoactivation increases. As a qualitative evaluation of the diffusion model fit to this cell, the time evolution of the fluorescence is continuous and well contained within the bounds predicted by pure diffusion at the effective diffusion coefficient observed, both notable differences from what we have theoretically predicted for active flows on visible spatial and temporal scales [16].
Specifically, our previous work [16, 31] predicts that active flows in luminal content form fluorescent “packets” when they occur on observable scales. Although we did not observe such packets in the spatially averaged kymographs (fig. 1I), we wondered whether the large degree of complexity in the ER may be masking such phenomena based on heterogeneity of packet distances from the PA source. To address this, we chose an example cell that showed significant local heterogeneity in the photoactivated channel (fig. 2A) and simulated purely diffusive motion directly on the experimentally-obtained ER morphology. Briefly, the proposed effective diffusion model was implemented with constant diffusion coefficient defined as the median of the maximum-likelihood fit for the cell (see Methods, section IV C).
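One simple way to estimate α from an effective MSD curve is a straight-line fit in log-log space, since log(MSD) = log(c) + α·log(t) for a power law. The sketch below assumes this approach and the helper name `fit_power_law`; the fitting procedure actually used may differ (for example, a nonlinear fit on the untransformed data). The synthetic curve uses the two-dimensional relation MSD = 4Dt, which is also an assumption made only to generate example data.

```python
import math

def fit_power_law(times, msd):
    """Least-squares fit of msd = c * t**alpha in log-log space.

    Returns (c, alpha). All times and MSD values must be positive.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(m) for m in msd]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    c = math.exp(mean_y - alpha * mean_x)
    return c, alpha

# Synthetic, noise-free Brownian curve built from a luminal diffusion
# coefficient of 16.84 μm² s⁻¹ (assumed 2D relation MSD = 4 D t).
D = 16.84
times = [0.5, 1.0, 2.0, 4.0, 8.0]
msd = [4 * D * t for t in times]
c, alpha = fit_power_law(times, msd)
```

On this synthetic curve the fit recovers α = 1 up to floating-point error; values consistently above or below 1 on real data would point to super- or subdiffusive scaling respectively.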
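The distance-binned kymograph can be sketched as below. The data layout (a list of 2D frames plus a dictionary of per-pixel through-structure distances) and the function name are assumptions for illustration, and the sketch uses one fixed distance map for all frames, whereas the ER structure in live cells moves over time.

```python
def distance_kymograph(frames, distances, bin_width):
    """Average fluorescence per distance bin for every frame.

    frames: list of 2D lists of fluorescence values, one per time point.
    distances: {(row, col): distance from the PA source} for ER pixels.
    Returns a list (time) of lists (distance bins); empty bins hold None.
    """
    n_bins = int(max(distances.values()) // bin_width) + 1
    kymograph = []
    for frame in frames:
        sums = [0.0] * n_bins
        counts = [0] * n_bins
        for (r, c), d in distances.items():
            b = int(d // bin_width)
            sums[b] += frame[r][c]
            counts[b] += 1
        kymograph.append([s / n if n else None for s, n in zip(sums, counts)])
    return kymograph

# Toy data: four ER pixels at distances 0 to 3, grouped into two bins.
toy_distances = {(0, 0): 0.0, (0, 1): 1.0, (0, 2): 2.0, (1, 2): 3.0}
toy_frames = [
    [[4, 2, 0], [0, 0, 0]],  # early: signal only near the source
    [[4, 4, 2], [0, 0, 2]],  # later: signal has reached the far bin
]
toy_kymograph = distance_kymograph(toy_frames, toy_distances, bin_width=2.0)
```

Each row of the result is one time point and each column one distance bin, so plotting it as an image gives the time-distance view described in the text.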
A Experimental snapshots at different time intervals from the beginning of photoactivation. B Simulated snapshots built by evaluating the effective diffusion model on the ER shape obtained by experimental data, using the median diffusion coefficient obtained by the morphology-aware maximum-likelihood fit. C Comparison of experimental and simulated fluorescence curves at the photoactivation source (C1) and at two ROIs (C2 and C3) located at different distances from the PA source.
The data was then graphically applied to the underlying ER structure, weighted by the underlying moxGFP fluorescence (fig. 2B, see Supplementary Text and Discussion, Section 2). Remarkably, the simulated distribution was nearly indistinguishable from the experimentally-observed photoactivation channel, correctly predicting fluorescence arrival times in both bright and dim regions of the structure (fig. 2C). Notably, even the fluctuations in fluorescence signal were largely predicted by the model, suggesting that the majority of the heterogeneity observed in the photoactivated channel is due to the varying density and motion of the underlying ER structure and not to uneven mixing of the molecules.
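As a rough illustration of what purely diffusive spreading on a fixed morphology looks like, the sketch below swaps in a different and much simpler technique than the one used here: an explicit finite-difference scheme on a binary mask, with reflecting walls at the mask boundary and the source pixel clamped to a prescribed ϕ(t). It does not weight by moxGFP fluorescence or follow a moving structure, and all names and parameters are hypothetical.

```python
def diffuse_on_mask(mask, source, phi, D, dx, dt, n_steps):
    """Explicit finite-difference diffusion restricted to mask pixels.

    Flux is exchanged only between 4-connected neighbours that are both on
    the mask, so nothing leaks across the boundary. The source pixel is
    reset to phi(t) before every step and once more at the end.
    """
    k = D * dt / dx ** 2
    if k > 0.25:
        raise ValueError("unstable time step: need D*dt/dx**2 <= 0.25")
    rows, cols = len(mask), len(mask[0])
    u = [[0.0] * cols for _ in range(rows)]
    for step in range(n_steps):
        u[source[0]][source[1]] = phi(step * dt)
        new = [row[:] for row in u]
        for r in range(rows):
            for c in range(cols):
                if not mask[r][c]:
                    continue
                flux = 0.0
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc]:
                        flux += u[nr][nc] - u[r][c]
                new[r][c] = u[r][c] + k * flux
        u = new
    u[source[0]][source[1]] = phi(n_steps * dt)
    return u

# U-shaped toy structure with a constant source in the bottom-left corner.
toy_mask = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 0, 1],
]
field = diffuse_on_mask(toy_mask, (2, 0), lambda t: 1.0,
                        D=1.0, dx=1.0, dt=0.25, n_steps=50)
```

In the toy structure the pixel next to the source fills faster than the opposite corner, even though the two corners are close in straight-line terms, because material can only travel around the U.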
B. Effects of membrane anchoring on ER protein diffusion
Classic work in the field using numerous methods has established that viscous drag on transmembrane domains is a dominating factor on the diffusive properties of membrane-associated proteins (reviewed in [14]). However, the degree to which this drag slows motion compared to freely diffusing molecules has been inconsistently reported, likely as a result of varying model assumptions and tools [22, 58, 66, 71]. Since our approach uses data to directly estimate the parameters that are taken as boundary conditions in these previous studies, we reexamined the phenomenon with knowledge of the underlying structure and dynamics of the ER.
To achieve this, the freely floating HaloTag in the ER lumen (HaloTag-KDEL, introduced in fig. 1) was compared to the same HaloTag anchored to the ER membrane. The membrane anchoring was achieved by fusing HaloTag genetically to a targeting domain from Sec61β that is known not to interact with the translocon in living cells [53]. Photoactivation experiments were then carried out exactly as described above, monitoring the arrival times of photoactivated HaloTag-Sec61β throughout the structure of the ER.
In agreement with the literature, membrane-anchored HaloTag showed significantly reduced distances traveled over the same time window, compared to HaloTag free in the ER lumen (fig. 3A-B). This resulted in a smaller fraction of the cell falling within the quality threshold for analysis, but the relatively larger fraction of molecules within that space caused significantly more uniform and stable signal in each cell (fig. 3C). Estimation of the isotropic diffusion characteristics was consistent with published literature [35, 73] and the previous results with HaloTag-KDEL (fig. 3C), showing that membrane dynamics were also well described by standard Brownian diffusion (MSD exponent α = 1.02, fig. 3D). Population level statistics for the effective diffusion coefficient (fig. 3E) confirmed that membrane proteins are diffusing significantly slower than luminal proteins in our system (membrane D_Sec61β = 5.39 μm² s⁻¹, compared to luminal D_KDEL = 16.84 μm² s⁻¹, p-value < 1 × 10⁻³). Note that as with HaloTag diffusion within the lumen, accounting for the micron-scale ER structure results in estimates of ER membrane protein diffusion that are significantly elevated compared to literature values. This is consistent with our published work using single molecule tracking to look at HaloTag-Sec61β in the ER periphery, which also suggested morphology-unaware approaches dramatically underestimate the true diffusion of membrane proteins in convoluted structures like the ER [73] (see Discussion and Supplementary Text and Discussion, Section 3).
A-B Examples showing the spatial distribution of photoactivated HaloTag at various time points post-activation when anchored to the membrane (Sec61β) (A) or freely diffusing in the lumen (KDEL) (B). C Two representative examples of the isotropic diffusion fit in a cell (averaged over all targets in the cell at a fixed distance) for HaloTag-KDEL and HaloTag-Sec61β, plotted over the median value for all cells observed. Note the increased heterogeneity of effective diffusion coefficients for luminal content. D Cell population statistics of the effective MSD exponent for HaloTag when linked to Sec61β or free in the ER lumen (KDEL). E Population statistics of the effective diffusion coefficient (D_eff) of HaloTag-KDEL and HaloTag-Sec61β conditions.
C. Effect of protein size on mixing dynamics
Transmembrane proteins show significantly reduced diffusion as a result of viscous drag in the membrane, but classical FRAP and fluorescence correlation spectroscopy (FCS) experiments suggest that even the ER lumen is itself significantly more viscous than the cytoplasm (reviewed in [40]). The Stokes–Einstein formula predicts that the mean diffusion coefficient of freely diffusing molecules should be inversely proportional to the radius of the molecule. We tested our ability to resolve this relationship by generating a larger version of a free-floating molecule in the ER lumen. Briefly, we genetically fused a signal sequence to three codon-optimized concatemers of the HaloTag linked by flexible linkers and C-terminally fused to a KDEL retention sequence (3×HaloTag-KDEL).
As predicted for a diffusion-dominated system, the photoactivated 3×HaloTag-KDEL explored a slightly smaller proportion of the cell in a defined time window than the single HaloTag-KDEL construct (fig. 4A-B). Accordingly, it also showed much more stable mean arrival times over distance, similar to what was seen when HaloTag was slowed by a membrane anchor (fig. 4C), suggesting this effect is a result of slower motion leading to more dense sampling in defined time windows and not inherent differences in the properties of the lumen as opposed to the membrane. Again, we found that the dispersion of the 3×HaloTag was most effectively fit with a model for simple Brownian motion (effective MSD exponent α_3× = 1.01, fig. 4D). In agreement with physical characteristics of diffusion, we observed significantly decreased mixing speed for the 3×HaloTag-KDEL compared to single HaloTag-KDEL (D_3× = 8.53 μm² s⁻¹, D_1× = 16.84 μm² s⁻¹, p-value < 1 × 10⁻³, see fig. 4E). Notably, as a control, fusing the same 3×HaloTag to the membrane targeting domain of Sec61β had no statistically significant effect on its diffusion characteristics, since the viscous drag of the membrane on the membrane anchor was dominating and the 3× tag faced the cytoplasm (fig. 4E). In agreement with this, we note that the mean effective diffusion coefficient of 3×HaloTag-KDEL is still nearly double that of the single HaloTag when it is anchored in the membrane (HaloTag-Sec61β, fig. 4E). Thus, in agreement with other approaches in the literature, viscosity in the ER lumen is significant and can be detected with this method, but is still qualitatively dominated by the drag of membrane anchors.
A-B Examples of the spatial distribution of photoactivated HaloTag-KDEL (A) or 3×HaloTag-KDEL (B) at defined time points after the start of photoactivation. C Results for the isotropic diffusion fit (averaged over all targets at fixed distance) for two example cells expressing 3×HaloTag-KDEL, compared to an example of HaloTag-KDEL. Note the reduced variance in effective diffusion coefficients with the larger luminal protein. All results are plotted over lines indicating the median value of all cells in the dataset. D Population statistics of the effective MSD exponent for HaloTag-Sec61β, 3×HaloTag-Sec61β (control), HaloTag-KDEL, and 3×HaloTag-KDEL. E Population statistics of the effective diffusion coefficient D_eff for the same conditions.
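As a worked example of the inverse relationship between diffusion coefficient and radius, the Stokes–Einstein relation D = k_B·T / (6π·η·r) implies that, at equal temperature and viscosity, the ratio of two hydrodynamic radii equals the inverse ratio of the diffusion coefficients. The sketch below applies this to the two luminal values reported above. Treating the effective diffusion coefficients as Stokes–Einstein diffusion coefficients is an assumption made only for this arithmetic, and the function name is hypothetical.

```python
def radius_ratio(d_reference, d_other):
    """r_other / r_reference implied by Stokes-Einstein.

    Assumes both molecules diffuse in the same medium at the same
    temperature, so that D is inversely proportional to radius.
    """
    return d_reference / d_other

d_single = 16.84  # μm² s⁻¹, HaloTag-KDEL (from the text)
d_triple = 8.53   # μm² s⁻¹, 3×HaloTag-KDEL (from the text)

implied = radius_ratio(d_single, d_triple)
# Reference point: a compact sphere of three times the volume would have a
# radius larger by the cube root of 3.
compact_sphere = 3 ** (1 / 3)
```

The implied radius ratio is a little under 2, compared with roughly 1.44 for a compact sphere of triple volume. The comparison is offered only as arithmetic context, not as a structural claim about the construct.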
D. Dynamics of misfolded proteins in the ER lumen
One of the major functions of the ER is to serve as the major site of translation, folding, and quality control for membrane and secreted proteins [60, 62]. Consequently, much of the ER lumen is dominated by protein-folding machinery, whose interactions with ER-localized cargo are likely to have significant effects on the diffusive properties of the protein species within. Studies using FRAP and FLIP have established dramatic changes in the diffusive properties of ER-resident protein folding machinery under conditions of dysregulated folding [36, 37, 69, 72], but they have generally been less effective in resolving changes in the diffusive properties of the misfolded proteins themselves [50], despite evidence from FCS that such changes should be present in the same systems [44] (see Supplementary Text and Discussion, Section 4). Additionally, literature examining the diffusive properties of a number of unrelated targets has yielded a confusing combination of gains and losses of motility in the unfolded state [49, 59, 61, 64], suggesting a need for a standardized, full-cell approach to characterizing the redistribution of misfolded proteins over time.
To test if the increased resolving power of our approach could address this in principle, we took advantage of the fact that the protein CD3δ is obligately misfolded and degraded in normal tissue culture cells when the other components of the CD3 complex are not present [7, 9]. Truncation experiments have shown that the luminal domain alone is sufficient for misfolding and degradation when expressed as a soluble construct [7, 46], so we tested its effect on the motility of freely diffusing HaloTag in the ER lumen. Briefly, we fused the signal sequence and soluble domain of CD3δ to HaloTag to generate a photoactivatable version of a model misfolded protein (CD3δΔ-HaloTag). The resulting protein is a good client for ER-associated degradation through the Hrd1 pathway, indicating it is correctly recognized as misfolded by the ER quality control system and eventually removed from the ER. As such, we reasoned that it should at the very least switch between diffusive motion in the lumen and motion associated with Hrd1, which is a membrane-embedded protein. We then performed our full analysis pipeline, and analysed the dispersion of CD3δΔ-HaloTag from the PA source over time.
Examination of cells expressing the misfolded construct showed that a significant fraction of the protein was still dynamic within the system, allowing us to perform the full analysis pipeline (fig. 5A-B). Notably, CD3δΔ-HaloTag shows significantly delayed arrival times for photoactivated protein compared to a HaloTag-KDEL control (fig. 5C), suggesting this approach can resolve slowed diffusive properties of proteins as a result of interactions with luminal protein folding machinery. However, like the other constructs, the majority of the data was well described by a simple diffusion model once the ER structure was accounted for (fig. 5D), suggesting the sources of anomalous diffusion observed in FCS are either at spatiotemporal scales beneath the resolution of this technique or only exist within subpopulations of molecules that do not diffuse across the micron scales visible in these experiments. Nevertheless, the mean effective diffusion coefficient is significantly reduced for the misfolded protein compared to the HaloTag-KDEL control (D_CD3δΔ = 6.19 μm² s⁻¹, D_KDEL = 16.84 μm² s⁻¹, p-value < 1 × 10⁻³, see fig. 5E), suggesting interactions with ER quality control machinery are frequent enough to create a reduced effective diffusion coefficient across the population (see Supplementary Text and Discussion, Section 5). We note that this reduced effective diffusion coefficient is closer to that for membrane-anchored proteins than it is to other luminal content, even though CD3δΔ-HaloTag is significantly smaller than the 3×HaloTag-KDEL construct introduced previously (see fig. 4), suggesting our approach can identify the known significant interactions with misfolding machinery, even if they occur over spatiotemporal scales too small to resolve them directly.
A-B Representative examples of the spatial distribution of photoactivated, misfolded protein (CD3δΔ-HaloTag) (A) or freely diffusing folded protein (HaloTag-KDEL) (B). C Isotropic diffusion fit (averaged over all points at a fixed distance) for two example cells expressing CD3δΔ-HaloTag overlaid against the median of the population. The median value for cells expressing HaloTag-KDEL is shown for comparison. D Distribution of the effective MSD exponent for the cells expressing CD3δΔ-HaloTag. E Distribution of effective diffusion coefficient D_eff for CD3δΔ-HaloTag (lavender). The distribution of mean D_eff for HaloTag-KDEL is reproduced from Figure 1 for comparison.
E. Spatial heterogeneity in molecular mixing in the ER
One limitation of existing technologies for understanding the dynamics of ER content is their difficulty in identifying spatially varying mixing frequencies or diffusion speeds.The data presented thus far is analysed assuming a reasonable degree of isotropy in the system, since locations at similar distances from the PA source are averaged together. While this approach generally fits the data well, we wondered if the high resolution afforded by the approach might allow us to resolve heterogeneity across the structure or locally varying mixing frequencies.
To address this, we divided each cell into spatially uniform regions (in our case, a square grid) and local regions of the ER within each grid square were binned together and collectively fit for a diffusion model with the PA source signal imposed as a boundary condition (fig 6A). In an unmodified form, this local approach suers from low signal to noise due to background and moving ER structure [71], but our high-speed, two-channel approach and machine learning-assisted structural estimation provided a way to surmount this problem. Within each square on the grid, only pixels within the segmented structure from the moxGFP signal were analyzed (fig 6B), and the pixels used for analysis were updated in real time as the structure moved. Bins that did not contain ER structure for at least half of the frames in the timelapse were discarded. The resulting time evolution curve for each non-empty grid bin was fit as described for averaged distances (e.g., fig 6C), and by fixing the PA source parameters as described above, we estimated an effective diffusion coefficient for each grid bin (see Methods, section IV C).The resulting spatially-defined effective diffusion coefficients provided a map of regions in the cellular landscape where diffusion is faster or slower than predicted by the median diffusion coefficient in the cell (fig 6D). Regions characterized by higher effective diffusion coffiefficients are associated with faster arrival of luminal proteins from the PA source.
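The binning step can be sketched as follows, assuming that each time point comes with its own segmentation mask so that the set of analysed pixels follows the moving structure. The function name, the data layout, and the use of `None` for frames in which a bin contains no ER are illustrative assumptions; the default of 3 pixels per bin side follows the 3×3 camera-pixel bins quoted in the figure legend.

```python
def bin_traces(frames, masks, bin_px=3, min_fraction=0.5):
    """Mean fluorescence per grid bin and frame, using only ER pixels.

    frames, masks: lists of 2D lists with identical shapes, one pair per
    time point; mask entries are truthy where the pixel is segmented ER.
    Returns {(bin_row, bin_col): [mean or None per frame]}, keeping only
    bins that contain ER in at least `min_fraction` of the frames.
    """
    n_frames = len(frames)
    traces = {}
    for t, (frame, mask) in enumerate(zip(frames, masks)):
        sums, counts = {}, {}
        for r, row in enumerate(mask):
            for c, on_er in enumerate(row):
                if on_er:
                    key = (r // bin_px, c // bin_px)
                    sums[key] = sums.get(key, 0.0) + frame[r][c]
                    counts[key] = counts.get(key, 0) + 1
        for key, total in sums.items():
            traces.setdefault(key, [None] * n_frames)[t] = total / counts[key]
    return {
        key: trace for key, trace in traces.items()
        if sum(v is not None for v in trace) >= min_fraction * n_frames
    }

# Toy timelapse: one row of six pixels (two bins), three frames. The second
# bin contains ER in only one frame out of three and is therefore dropped.
toy_masks = [[[1, 1, 0, 0, 0, 1]], [[1, 0, 0, 0, 0, 0]], [[1, 1, 0, 0, 0, 0]]]
toy_frames = [[[2, 4, 9, 9, 9, 6]], [[8, 9, 9, 9, 9, 9]], [[1, 3, 9, 9, 9, 9]]]
kept = bin_traces(toy_frames, toy_masks)
```

Each surviving trace is the per-bin time evolution curve that would then be passed to the diffusion fit, with the PA source parameters held fixed.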
A Subdivision of the ER using a grid of square bins. (Note: a larger bin size is used for easier visualization in the figure than was used for analysis.) B An example of ER-associated, segmented pixels belonging to a given bin as they are tracked through time. C Average fluorescence over time for segmented pixels belonging to the bin shown in B. Dotted line shows the model fit. D Example of a spatial map describing the observed mean arrival time of HaloTag-KDEL in a representative cell, visualized by a spatially-dependent effective diffusion coefficient (higher is faster).The red dot indicates the PA source. The grid bin size is 400 nm (3×3 camera pixels). Note the relatively uniform diffusion throughout the structure. E-H Examples of diffusion maps showing asymmetric arrival times in subregions of the ER network. Diffusion maps are derived from cells expressing HaloTag-KDEL (E), 3xHaloTag-KDEL (F), and CD3δΔ-HaloTag (G).
In contrast to the majority of cells, which were largely uniform (e.g., fig 6D), in a few of the cells in each condition we observed cases of strong local heterogeneity in the dispersal of fluorescent proteins, represented by regions exhibiting faster arrival time with respect to the cell median mixing time (e.g., fig 6E-G, yellow regions). Such regions seem to be most frequently localized in denser perinuclear regions of the ER, close to the nuclear envelope. To ascertain whether this heterogeneity could be explained by an increased reshaping of the ER morphology in the affected regions, we quantified the morphology variation and tubule motion (see Methods, section IV B and supplementary figure S2). There was no association between higher morphological reshaping and regions with increased mixing dynamics, suggesting that motion of the ER structure is unlikely to be the source of the observed anisotropy in the redistribution timescales.
Although relatively rare, we found examples of such anisotropic photoactivated protein arrival in all of the luminal constructs (fig 6E-G), but not in the membrane-anchored constructs. The location and size of these regions of faster and slower protein arrival are not consistent with the predicted spatial scales for active transport [16, 31], which is thought to occur over much smaller scales. However, we note that the regions of faster arrival time are in regions of the cell where we have previously shown an elevated level of ER connectivity [84], so they may represent bias in local mixing as a result of the connectivity of the ER network, as has been demonstrated with correlated FRAP and single molecule tracking in systems with more predictable cross sectional geometries [4]. Such a system could potentially give rise to the frequently observed but poorly understood phenomena of locally regulated ER functions (see Supplementary Text and Discussion, Section 6).
III. DISCUSSION
In this article, we have combined high speed imaging with point-based photoactivation to examine the dynamic properties of proteins within the ER. We have implemented this approach with a machine learning-based segmentation pipeline and a spatially-defined modeling procedure that can extract quantitative information about molecular motion within the network. This approach supports evaluation of the primary mode of molecular motion at high speed across entire cells, and using this tool we have made a variety of observations about the physical factors that govern molecular behaviour within the ER. Most notably, we find that at the micron-scale, transport phenomena within the ER lumen and membrane are well described by a simple diffusion model when a first degree approximation of the ER structure is accounted for, without needing more complex active flow or anomalous models for diffusion. This model dictates that diffusion within both the membrane and the lumen is likely significantly faster than predicted by previous models that do not account for the ER structure, but it does produce effective diffusion coefficients consistent with other structure-based analyses [67, 73] or experiments performed over spatial scales where ER structural contributions are thought to be negligible [49, 78] (see Supplementary Text and Discussion, Section 1).
While this result is apparently at odds with models of locally directed flows [16, 31] or anomalous diffusion as a result of local binding [44], we caution that this approach only observes processes visible at the spatial and temporal scales of diffraction-limited imaging over hundreds of milliseconds. Local, nonlinear transport phenomena at the nanoscale can still show linear MSD scaling over larger spatial and temporal scales if the phenomena are locally isotropic [16, 44], a phenomenon we observe directly in the predominantly diffusive motion of misfolded proteins that are known to have local binding interactions with folding machinery (see Supplementary Text and Discussion, Section 5). Thus, we conclude that if local active flows are present, they must exist over spatial and temporal scales that average out to net diffusive behavior by the micron scale, as has been shown for sources of anomality in ER protein diffusion with other models [44, 49].
We also note from this a necessary caution about models in complex structures like the ER. It has been well established by theory [66, 68] and experiment [15, 67, 73] that confinement of diffusing particles to networks can result in the appearance of subdiffusive motion. In this work, which uses full cells that are significantly less uniform in their connectivity than the examples in the literature, we see examples where confinement to the structure has more dramatic effects at short distances and time scales than at larger ones, resulting in the appearance of superdiffusive motion (see Supplementary Text and Discussion, Sections 1 and 6 for full discussion). Thus, depending on the complexity and connectivity of the underlying structure, MSD scaling can be misleading in either direction about the true nature of the motion within.
The combinatorial approach introduced here promises to be broadly implementable for dealing with complex structured scaffolds on which transport phenomena occur even beyond the ER, and will undoubtedly be improved by integration with developing technologies that can further resolve the complex structures that exist in cells [29, 54, 79]. In particular, one area of interest for further technology development is the integration of this approach with tools that have complementary spatial and temporal scales. For example, traditionally the region-specific nature of FRAP has made it difficult to multiplex in the same cells or regions as FCS or single molecule tracking, which can in principle be performed simultaneously with our approach. This type of integration across scales has been effective even when applied at the cell population level [4, 44, 49], and promises to elucidate many of the open questions about folding and interaction dynamics within organelles when calibrated inside individual cells.
IV. METHODS
Fluorescence microscopy images are acquired in low-light conditions, since the power of the excitation laser and exposure interval are constrained by the fluorescence saturation rate, the risk of photodamaging the sample, and the acquisition rate for dynamical imaging. Fluorescence images are thus characterized by a high level of noise. In particular, the main source of disturbance is represented by Poisson noise (or shot noise) originating from the discrete number of photons hitting the sensor [21]. This is combined with other signal-independent noise sources, due to thermal vibration (e.g. dark current) or caused by the electronics (i.e. noise due to the amplification, conversion, and transmission process, which is usually collectively indicated as readout noise). These last noise sources do not depend on the photon flux, and can be modelled effectively as a single aggregated white noise source. Thus, denoising of fluorescence images requires handling Poisson-Gaussian noise. Various techniques have been developed to solve this problem, following mainly three approaches: variance stabilizing transformation followed by additive white noise removal [21], algorithms specifically designed for Poisson-Gaussian mixtures such as PURE-LET [43], and deep learning models based on CNNs [38, 80]. A recent work, benchmarking the most popular methods on a dataset of fluorescence images, highlighted the superior performance of CNN-based models [82]. The main drawback of CNNs is that they usually require training on a set comprising both noisy and clean images. However, in the last few years various deep learning models have been proposed which allow training without clean data [5, 38, 76]. Another drawback of CNNs is the phenomenon of pareidolia, i.e. the tendency to impose a meaningful interpretation on random or ambiguous patterns. CNNs are, by their nature, biased towards certain structures which can be intrinsic or derived from the training data [47, 76].
While this is the very reason of their effectiveness, it can also be the cause of unwanted visual artefacts: a CNN model trained on ER images will develop a tendency to see ER even where there is none. For these reasons, in our analysis of photoactivation data, we have taken a mixed approach: we relied on a deep-learning model to facilitate the segmentation of the ER morphology, but followed a more conservative approach when extracting quantitative measures from the photoactivation curves, considering a statistical model of the raw data.
A. Pre-processing and segmentation of ER morphology
To ensure reliable segmentation of the ER structure, we preprocess the data of the morphology channel (GFP channel) in several steps (see supplementary fig S1). These preprocessing steps are needed because the ER structure and density can vary significantly. Regions of the ER close to the nuclear envelope are characterized by denser structures (membrane sheets or dense matrices of tubules [53, 74]) which are not resolvable with confocal microscopy and thus appear as continuous membrane arrangements. At the cell periphery, the ER is formed by a network of sparse tubules with diameter 60–100 nm [53, 70]. These structural differences determine the quantity of fluorescent proteins and worsen the image contrast, making it challenging to segment the entire ER structure simultaneously [56].
1. Frame denoising
We first denoised the morphology data to reduce the level of noise and improve the detection of finer structures. To this aim, we adapted the Noise2Noise technique [38] to train a U-Net convolutional neural network [63] for noise removal of single frames (see supplementary fig. S1A). We trained the U-Net model using pairs of noisy realizations generated by extracting 128×128 px tiles from time-adjacent frames. One noisy realization is kept as a reference while the other is processed by the U-Net model. The loss is then calculated as the MSE between the reference noisy frame and the model output. This approach relies on the assumption that the acquisition rate is fast enough to make changes between consecutive frames negligible. The assumption is reasonable as most of the ER structure is steady for the acquisition interval used (100 ms). To minimize the bias created by the time evolution of the ER network, both the order of the training samples and the order of the two images forming a pair were randomized in the repeated training epochs. We kept 10% of the samples for validation, separating them from training data. The training was performed on batches of 4 images, using the Adam optimization algorithm [34] with parameters β1 = 0.9, β2 = 0.99, no weight decay and an initial learning rate of 2.75 × 10−4. The learning rate was then gradually reduced by a factor of 0.75 when the MSE loss calculated on the validation set did not improve over 5 epochs. To prevent overfitting, training was stopped when the loss over the validation set did not improve for 15 consecutive epochs, selecting the model with minimal loss.
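The learning-rate reduction and stopping rule can be illustrated by replaying a validation-loss history through a small function. This is a sketch under stated assumptions: the function name `plateau_schedule` is hypothetical, the exact bookkeeping of "no improvement over 5 epochs" (here, every fifth consecutive non-improving epoch) is our reading of the text, and the network training itself is not shown.

```python
def plateau_schedule(val_losses, lr=2.75e-4, factor=0.75,
                     lr_patience=5, stop_patience=15):
    """Replay validation losses through plateau-based LR decay and early stopping.

    Returns (best_epoch, stop_epoch, final_lr), where best_epoch indexes the
    model that would be selected (minimal validation loss).
    """
    best_loss = float("inf")
    best_epoch = None
    stale = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
            continue
        stale += 1
        if stale >= stop_patience:
            # Early stopping: keep the model from best_epoch.
            return best_epoch, epoch, lr
        if stale % lr_patience == 0:
            # Assumed rule: decay after every lr_patience stale epochs.
            lr *= factor
    return best_epoch, len(val_losses) - 1, lr
```

With a history that improves for two epochs and then stalls, the learning rate is reduced twice before training stops at the fifteenth stale epoch.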
2. Sharpening and structure enhancement
To enhance the resolution of the finer ER structure, we deconvolved the morphology frames to create sharper images. Image deconvolution is an inverse problem of the type y = Ax, consisting of recovering the sharp image x from a blurred version y, with A representing the PSF operator associated with the optical system. Since the problem is ill-posed, the solution is found by adding regularization constraints on x such as total variation [23]. Recently, a sparsity constraint in the deconvolution procedure has been shown to successfully increase the spatio-temporal resolution in fluorescence microscopy [83]. Based on these approaches, we defined a spatio-temporal deconvolution problem with total variation and sparsity regularization terms, to provide good reconstruction of edges, and a smoothness constraint in the time dimension to account for the gradual structural movements. We define the optimization problem as
where y represents a sequence of frames, A is the PSF operator, $\lVert\cdot\rVert_1$ and $\lVert\cdot\rVert_2$ represent respectively the ℓ1 and ℓ2 norms, $\partial_{v_n}$ is the directional derivative along vn, and $\partial_{tt}$ is the second derivative in the time dimension. Multiple spatial directions vn = (sin(2πn/N), cos(2πn/N), 0) in the space (x, y, t) were used to reduce staircase artefacts. To ensure correct enforcement of the sparsity constraint, each frame in y was preprocessed by taking a 2D discrete wavelet transform and dampening the approximation coefficients by a factor 0.5. The operator A was assumed to be a Gaussian PSF with appropriate standard deviation (1.27 px). The solution x was computed with the split Bregman method [24]. Parameter values used in the optimization are reported in supplementary table S1.
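For orientation, an objective containing the ingredients listed above could take the following form. This is a sketch under stated assumptions, not a transcription of the expression used in this work: the weights $\lambda_{\mathrm{TV}}$, $\lambda_{\mathrm{S}}$ and $\lambda_t$ are placeholder names, the index range of n and the squared ℓ2 norms are assumed, and the actual parameter values are those reported in supplementary table S1.

```latex
\hat{x} = \arg\min_{x}\;
  \tfrac{1}{2}\,\lVert A x - y \rVert_2^2
  + \lambda_{\mathrm{TV}} \sum_{n=0}^{N-1} \lVert \partial_{v_n} x \rVert_1
  + \lambda_{\mathrm{S}}\, \lVert x \rVert_1
  + \tfrac{\lambda_t}{2}\, \lVert \partial_{tt} x \rVert_2^2
```

The first term is the data fidelity, the second is a multi-directional total variation penalty, the third promotes sparsity, and the last penalizes abrupt changes along the time axis.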
3. Segmentation of the network structure
The ER network was segmented from the sharpened stack as follows. First, the frames were clipped at zero and rescaled between 0 and 1, and local contrast was enhanced by contrast limited adaptive histogram equalization (CLAHE) [1] for five iterations. To detect tubular structures, the frames were first transformed with a Sato tubeness filter [65] with σ = 100 nm and then binarized by locally adaptive Niblack thresholding [51] with window size 127 px (≈ 17 µm) and weight 0.1. Empty background regions, which could create bad detections with purely local thresholding, were excluded by setting a minimum global threshold. Continuous surfaces corresponding to dense matrices and cisternae were obtained by a similar thresholding procedure, followed by a morphological opening operation with a disk of radius 250 nm. The thresholded tubules and sheets were then merged in a single frame sequence. To preserve details in very close tubular structures, local minima were identified beforehand with an h-minimum transform (with threshold 1 × 10−3) and set to zero [56]. To fix occasional flickering of isolated pixels, a morphological closing operation was applied along the time axis of the thresholded stack. Finally, the network structure was extracted by a thinning procedure [81]. To preserve dense matrices and cisternae, these regions were again extracted by morphological opening and added back to the skeletonized network structure.
B. Estimation of morphology reshaping
The amount of local morphological reshaping was quantified in two ways. First, we estimated the relative variance of the denoised morphology frames along the time stack, highlighting regions of the network undergoing more variation in time. Predictably, most of the variation was localized on the tubule edges, due to small oscillations (fig S2B, blue shade). Note that in our method, such small oscillations do not affect the estimate of the dynamics, as only the pixels clearly belonging to the ER structure are considered. Second, we estimated motion of tubules and other structures by optical flow using Farnebäck’s algorithm [19], then calculated the mean norm of the velocity vectors over the photoactivation time to reveal regions characterized by more active reshaping of the network structure (fig S2B, red shade).
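The first of these two measures can be sketched as a per-pixel statistic over the time stack. In this minimal example the relative variance is taken to be the temporal variance divided by the squared temporal mean, which is an assumption on our part, as are the function name `relative_variance_map` and the nested-list representation of the frames. The optical flow step is not shown.

```python
from statistics import fmean, pvariance


def relative_variance_map(stack, eps=1e-12):
    """Per-pixel temporal variance divided by the squared temporal mean.

    stack: list of 2D lists (denoised frames), all of the same shape.
    eps avoids division by zero in empty background pixels.
    """
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            series = [frame[i][j] for frame in stack]
            mean = fmean(series)
            out[i][j] = pvariance(series, mean) / (mean * mean + eps)
    return out
```

Pixels whose intensity fluctuates strongly relative to their average brightness, such as tubule edges, receive high values, while stable pixels stay close to zero.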
C. Statistical model of the photoactivatable channel
1. Estimation of the noise model parameters
We considered a Poisson-Gaussian noise model with clipping as described in [20, 21]. The original unclipped noise model which describes the observed pixel value z (x) at position x is defined by
$$ z(x) = y(x) + \sigma\bigl(y(x)\bigr)\,\xi(x) \tag{2} $$
where y (x) is the true pixel value, ξ (x) is zero-mean and unit variance random noise, and σ (y) is the y-dependent standard deviation function characterizing the Poisson-Gaussian noise. Note that this model assumes that the photon count y is large enough, such that the Poisson term can be approximated with a normally distributed noise with signal-dependent variance. Imaging sensors are also characterized by a bias level µ, i.e. an artificial offset added to the collected charge to reduce clipping. This offset value must be subtracted from the readout to correctly model the signal-dependent part of the noise, e.g. by considering the variance function σ (y − µ) or by redefining y←y−µ. If the sensor is not able to capture the full dynamic range of the image (particularly in the case of fluorescence imaging, where frames are intrinsically underexposed), one can define a clipped observation model as:
The goal is to reconstruct the model parameters a, b, µ from the video data, where a and b are the coefficients of the variance function $\sigma^2(y) = a y + b$ of [21]. We applied the technique described in [21] on all frames, segmenting the full video data in level sets, excluding the tubule boundaries obtained from the segmentation of the morphology channel, to estimate pairs $(\hat{y}_i, \hat{\sigma}_i)$ of local mean and standard deviation. We then performed a maximum likelihood fit of the model parameters using the L-BFGS algorithm [48, 86].
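To make the role of the parameters concrete, the following sketch fits the affine variance function to a set of (mean, variance) pairs. It deliberately swaps in an ordinary least-squares line fit for the maximum likelihood fit of the clipped model described above, and it treats the bias level as known rather than estimating it. The function names `noise_variance` and `fit_variance_line` are hypothetical.

```python
def noise_variance(y, a, b, bias):
    """Variance of the Poisson-Gaussian model at raw pixel value y."""
    return a * (y - bias) + b


def fit_variance_line(means, variances, bias):
    """Least-squares fit of variance = a * (mean - bias) + b.

    means, variances: local estimates taken from level sets of the image
    bias: sensor bias level, assumed known in this simplified sketch
    Returns (a, b).
    """
    xs = [m - bias for m in means]
    n = len(xs)
    x_bar = sum(xs) / n
    v_bar = sum(variances) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxv = sum((x - x_bar) * (v - v_bar) for x, v in zip(xs, variances))
    a = sxv / sxx          # slope: signal-dependent (Poisson) component
    b = v_bar - a * x_bar  # intercept: signal-independent (Gaussian) component
    return a, b
```

The slope captures the shot-noise contribution that grows with the signal, and the intercept captures the readout and thermal contributions that do not.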
2. Effective diffusion model
To extract quantitative features describing the protein dynamics, we consider an effective 2D diffusion model. The approximation with a 2D space is appropriate given that the ER structure in COS-7 cells can, for most of its surface, be described by a planar graph [53]. While it is possible to derive a reaction-diffusion system which models a constant-rate photoconversion in the source ROI, this approach requires knowledge (or a fit) of the photoconversion rate parameter. Moreover, due to the reshaping of the ER, the ROI exposed to the photoactivation laser may change constantly, altering the effectiveness of photoconversion. To avoid these problems, we used a hybrid approach which allows the diffusion model to be tuned in advance based on the data. To this aim, we considered a diffusion model in which we impose a time-dependent Dirichlet boundary condition corresponding to the fluorescence measured close to the photoactivation ROI. We denote by ϕ(t) the protein concentration measured at distance R from the centre of the photoactivation ROI. Given an analytical expression for ϕ(t), we can calculate the solution of the radial diffusion equation u(r, t) (see supplementary IV A, eq. S8):
$$ u(r,t) = \int_0^t \phi(\tau)\,\frac{\partial v}{\partial t}(r, t-\tau)\,d\tau \tag{4} $$
where v(r, t) is the solution of the auxiliary diffusion problem with a constant boundary condition (ϕ(t)= 1), which can be explicitly expressed in the Laplace domain as
$$ V(r,s) = \frac{K_0\!\left(r\sqrt{s/D}\right)}{s\,K_0\!\left(R\sqrt{s/D}\right)} \tag{5} $$
where K0 is the modified Bessel function of the second kind of order zero and D is the diffusion coefficient. It is convenient to express the convolution in eq. (4) as a product in the Laplace domain:
$$ U(r,s) = s\,\Phi(s)\,V(r,s) \tag{6} $$
where U, Φ and V denote the Laplace transforms of u, ϕ and v, and to calculate u(r, t) by inverting the Laplace transformation.
In principle, we can choose any analytical form of ϕ(t) that well describes the fluorescence curve at distance R from the centre of the photoactivation ROI observed in the experimental data. We considered here the particular case where ϕ (t) can be expressed as a sum of exponentially saturating functions:
Using this definition for ϕ combined with eq. 6 we can write
from which u(r, t) can be obtained by inverting the Laplace transform numerically.
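As a worked illustration of this step, suppose the boundary signal is written with amplitudes $A_k$ and rates $\lambda_k$. These symbols, and the capital letters $\Phi$, $V$ and $U$ used for the Laplace transforms of ϕ, v and u, are placeholders chosen for this sketch and are not fixed by the text. Using the standard transform of a saturating exponential, the constant-boundary solution for a region outside a disc of radius R, and the fact that $\phi(0) = 0$, the product form follows directly:

```latex
\begin{align}
\phi(t) &= \sum_{k} A_k \left(1 - e^{-\lambda_k t}\right),
&
\Phi(s) &= \sum_{k} \frac{A_k \lambda_k}{s\,(s + \lambda_k)}, \\
V(r,s) &= \frac{K_0\!\left(r\sqrt{s/D}\right)}{s\,K_0\!\left(R\sqrt{s/D}\right)},
&
U(r,s) &= s\,\Phi(s)\,V(r,s)
        = \frac{K_0\!\left(r\sqrt{s/D}\right)}{K_0\!\left(R\sqrt{s/D}\right)}
          \sum_{k} \frac{A_k \lambda_k}{s\,(s + \lambda_k)} .
\end{align}
```

At r = R the ratio of Bessel functions equals one, so $U(R, s) = \Phi(s)$ and the imposed boundary condition is recovered, which provides a quick consistency check before numerical inversion.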
3. Fluorescence intensity model
To properly describe the fluorescence data, we need to take into account variations which are not related to the diffusion process. These are mainly constituted by the background intensity and the random photoconversion of proteins outside the photoactivation region due to leaks from the other channel and acquisition laser. We describe these by an offset and a linear trend which can be observed before the photoactivation laser is activated. Thus, we describe the fluorescence intensity f(x, t) at position x as the sum of a linear polynomial c1 t+c0 and the diffusion process u(r, t):
$$ f(x,t) = c_0(x) + c_1(x)\,t + c_2(x)\,u\bigl(\lVert x - x_0 \rVert, t\bigr) \tag{10} $$
where x0 is the location of the PA ROI, c0(x) describes the background luminosity, c1(x) is the slope of the linear trend of random photoactivation, and c2(x) is a proportionality constant representing the density of photoactivated proteins. Note that, in general, the coefficients c0, c1, and c2 are space-dependent.
4. Maximum-likelihood fit of the diffusion model
We first performed image registration on the photoactivation channel to match the position of the GFP channel by cross-correlation [3, 27]. We then considered the intensity data obtained from the raw images in the form of points xi = (ri, zi, ti), where ri represents the distance from the photoactivation source and zi its measured fluorescence at time ti from the beginning of photoactivation. We used the noise model defined in eq. (2), with parameters determined according to the procedure described above, to estimate the likelihood of the fluorescence model of eq. (10) depending on the xi samples. The noise variance for a point xi = (ri, zi, ti) is given by
$$ \sigma_i^2 = a\,\bigl(f(r_i, t_i) - \mu\bigr) + b $$
where f(ri, ti) is the true fluorescence value and µ is the imaging sensor bias level. We consider each measurement zi to be normally distributed around f(ri, ti) with variance $\sigma_i^2$.
The log-likelihood is then
$$ \log \mathcal{L}(\theta) = -\frac{1}{2} \sum_i \left[ \log\bigl(2\pi\sigma_i^2\bigr) + \frac{\bigl(z_i - f(r_i, t_i)\bigr)^2}{\sigma_i^2} \right] \tag{12} $$
where θ is the set of parameters c0, c1, c2, D (see eq. (10)). The maximum likelihood parameter is obtained as
$$ \hat{\theta} = \arg\max_{\theta}\, \log \mathcal{L}(\theta) \tag{13} $$
which can be solved by numerical optimization. This maximum likelihood estimation can be extended to ensemble points obtained by averaging multiple pixel values in the photoactivation stack. The noise variance of such aggregated points is simply
$$ \bar{\sigma}^2 = \frac{a\,(\bar{f} - \mu) + b}{N} $$
where N is the number of averaged pixels and $\bar{f}$ is the average fluorescence value. By maximum-likelihood estimation it is then possible to recover the diffusion coefficient D and the coefficients $\bar{c}_0$, $\bar{c}_1$, and $\bar{c}_2$ corresponding respectively to the average background intensity, average slope of the linear trend, and average density over the ensemble. In practice, we fit the model parameters using L-BFGS-B [48, 86] in two separate steps. First we performed the maximum likelihood fit on the interval preceding photoactivation, recovering the trend parameters c1 and c0. Then, fixing c1 and c0, we performed the fit on the full time interval to recover D and c2. We initialized the parameter c2 based on the GFP-labeled morphology channel as follows. We considered the frames of the morphology channel preceding photoactivation and calculated the average intensities ρs, ρt and variances $\sigma_s^2$, $\sigma_t^2$ over the pixels corresponding to the PA source and the target regions respectively. We then defined the initial value for coefficient c2 as:
where µGFP is the sensor bias level for the GFP channel. In the morphology-aware approach, we constrained the coefficient c2 in the interval
, where v2 is the approximated variance of the ratio distribution obtained as
.
In the case of isotropic diffusion estimation, these ensemble points were built by binning pixels in each frame based on their distance from the PA source (with bin size of 250 nm). Only pixels belonging to the segmented ER structure were considered. Ensemble points characterized by the same distance r were grouped to build a time series. We note that, due to changes in the morphology of the ER, the pixels belonging to a given distance bin are not invariant across frames. This frame-by-frame adjustment cancels at least part of the effect of the morphology reshaping.
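The construction of these ensemble points for a single frame can be sketched as follows. This is an illustrative example only: the function name `radial_bin_means` and the nested-list image representation are hypothetical, and the default pixel size of about 133 nm is inferred from the 400 nm width of a 3×3 pixel bin and should be treated as an assumption.

```python
import math
from collections import defaultdict


def radial_bin_means(frame, mask, source, pixel_nm=133.0, bin_nm=250.0):
    """Mean intensity of ER pixels grouped by distance from the PA source.

    frame:  2D list of intensities (photoactivation channel)
    mask:   2D list of bools, True where the pixel belongs to the ER
    source: (row, col) of the photoactivation source, in pixels
    Returns {bin_index: (bin_centre_nm, mean_intensity, n_pixels)}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i, row in enumerate(frame):
        for j, value in enumerate(row):
            if not mask[i][j]:
                continue  # skip pixels outside the segmented structure
            dist_nm = math.hypot(i - source[0], j - source[1]) * pixel_nm
            k = int(dist_nm // bin_nm)
            sums[k] += value
            counts[k] += 1
    return {
        k: ((k + 0.5) * bin_nm, sums[k] / counts[k], counts[k])
        for k in sorted(sums)
    }
```

Calling the function once per frame with that frame's mask yields, for each distance bin, a time series of mean intensities. The pixel count is returned as well because the variance of an aggregated point scales inversely with the number of averaged pixels.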
For the estimation of space-dependent effective diffusion, ensemble points were built by dividing the field of view into equal bins using a square grid (fig 6A). For each bin, average fluorescence intensity and median distance from the source were evaluated in each frame (considering only the pixels belonging to the segmented ER structure, fig 6B). The ensemble points obtained by this procedure were then used to estimate the effective diffusion by maximum likelihood (fig 6C), obtaining a diffusion coefficient for each bin of the grid (fig 6D).
We estimated the variance of the fitted parameter D using the observed Fisher information. In the limit of large number of samples, the distribution of the maximum likelihood estimate can be approximated as [28]:
$$ \hat{\theta} \sim \mathcal{N}\!\left(\theta,\ \mathcal{I}(\hat{\theta})^{-1}\right) $$
where θ is the true parameter value and $\mathcal{I}(\hat{\theta})$ is the observed Fisher information. In our case, the observed Fisher information can be evaluated as the Hessian of the negative of the log-likelihood defined in eq. (12), i.e. $\mathcal{I}(\hat{\theta}) = -\nabla_{\theta}^{2} \log \mathcal{L}(\theta)\big|_{\theta = \hat{\theta}}$. We thus estimated the variance of the parameter D by numerically evaluating the inverse Hessian of $-\log \mathcal{L}$ at its minimum [10], and taking the appropriate diagonal element. We note that such an estimate of the error is statistically meaningful only within the context of the optimization problem defined in eq. (13), as it does not take into account errors resulting from inappropriateness of the effective diffusion model or bad ER segmentation. Nevertheless, this approach provides a method to systematically rate the quality of the fits and exclude the unreliable datapoints in the estimation of the effective diffusion coefficient (fig 1C).
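The likelihood and the Hessian-based error estimate can be sketched with the standard library alone. The example below is a simplified illustration under stated assumptions: all function names are hypothetical, the Hessian is approximated by central finite differences, the matrix inverse uses plain Gauss-Jordan elimination, and the fluorescence model is passed in as a generic callable rather than the diffusion solution.

```python
import math


def neg_log_likelihood(theta, samples, model, a, b, bias):
    """Gaussian negative log-likelihood with signal-dependent variance.

    samples: iterable of (r, z, t); model(r, t, theta) returns the predicted
    fluorescence f, and the noise variance is a * (f - bias) + b.
    """
    total = 0.0
    for r, z, t in samples:
        f = model(r, t, theta)
        var = max(a * (f - bias) + b, 1e-12)  # guard against non-positive variance
        total += 0.5 * (math.log(2.0 * math.pi * var) + (z - f) ** 2 / var)
    return total


def numerical_hessian(func, theta, rel_step=1e-4):
    """Central finite-difference Hessian of func at theta."""
    n = len(theta)
    steps = [rel_step * max(abs(x), 1.0) for x in theta]

    def shifted(i, di, j, dj):
        point = list(theta)
        point[i] += di * steps[i]
        point[j] += dj * steps[j]
        return func(point)

    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            value = (shifted(i, 1, j, 1) - shifted(i, 1, j, -1)
                     - shifted(i, -1, j, 1) + shifted(i, -1, j, -1))
            hess[i][j] = hess[j][i] = value / (4.0 * steps[i] * steps[j])
    return hess


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices)."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def parameter_variances(nll, theta_hat):
    """Diagonal of the inverse Hessian of the negative log-likelihood."""
    cov = invert(numerical_hessian(nll, theta_hat))
    return [cov[i][i] for i in range(len(theta_hat))]
```

For a constant model with constant noise variance, the estimate reduces to the familiar variance of a sample mean, which offers a simple check of the procedure.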
Footnotes
Revised Figure 2-6 and SI
References
- [1]
- [2]
- [3]
- [4]
- [5]
- [6]
- [7]
- [8]
- [9]
- [10]
- [11]
- [12]
- [13]
- [14]
- [15]
- [16]
- [17]
- [18]
- [19]
- [20]
- [21]
- [22]
- [23]
- [24]
- [25]
- [26]
- [27]
- [28]
- [29]
- [30]
- [31]
- [32]
- [33]
- [34]
- [35]
- [36]
- [37]
- [38]
- [39]
- [40]
- [41]
- [42]
- [43]
- [44]
- [45]
- [46]
- [47]
- [48]
- [49]
- [50]
- [51]
- [52]
- [53]
- [54]
- [55]
- [56]
- [57]
- [58]
- [59]
- [60]
- [61]
- [62]
- [63]
- [64]
- [65]
- [66]
- [67]
- [68]
- [69]
- [70]
- [71]
- [72]
- [73]
- [74]
- [75]
- [76]
- [77]
- [78]
- [79]
- [80]
- [81]
- [82]
- [83]
- [84]
- [85]
- [86]