## ABSTRACT

Principles of genome folding and their relationship to function depend on understanding conformational changes of the chromatin fiber over time. However, analysis of bulk chromatin motion at high resolution has not yet been reported. We developed Hi-D, a method to quantitatively map DNA dynamics for every pixel simultaneously over the entire nucleus from real-time fluorescence images. Hi-D combines reconstruction of chromatin motion using computer vision and classification of local diffusion processes by Bayesian inference. We discovered that DNA dynamics in the nuclear interior are spatially partitioned into 0.3 - 3 μm domains in a mosaic-like manner which is uncoupled from chromatin compaction. This pattern was remodeled in response to transcriptional activity accompanied by a global reduction of DNA dynamics. Hi-D opens new perspectives towards understanding motion of nuclear molecules in the context of nuclear architecture.

## INTRODUCTION

Spatial organization and dynamics of chromatin correlate with cell function and fate [1]. In mammalian cells, chromosomes occupy territories whose localization is stable over the cell cycle, although their positioning relative to each other and to other nuclear bodies, and their degree of intermingling, vary between cells [2]. Each chromosome contains dense heterochromatin and open euchromatin areas whose extent is indicative of cellular activity [3]. Transitions within and between eu- and heterochromatin involve multiple levels of chromatin organization, from nucleosome density to long-range looping and domain folding, that adapt to and enable DNA processing [4]. Structural models derived from contact and crosslinking frequencies [5–7] are consistent with the view that the genome is partitioned into compartments and sub-compartments [8]. It is now becoming increasingly clear that such nuclear compartments are also dynamic entities whose formation needs to be understood. Tracking of labelled single DNA loci [9–13] or chromatin domains [14,15] demonstrated that chromatin motion is highly heterogeneous at short time intervals. However, sparse loci are difficult to place in the context of global chromatin organization, which is subject to highly complex dynamics [16]. Nor can locally restricted genomic processes be inferred from quantities averaged over the entire nucleus [17,18]. A first study analyzing bulk chromatin motion at nanoscale resolution revealed dynamic partitioning of chromatin into a number of nuclear sub-regions which show correlated motion of chromatin in the micrometer range [19]. However, to further understand mechanisms driving chromatin dynamics and giving rise to spatial coherence [16], an integrated and comprehensive characterization of diffusion processes acting on the chromatin fiber at the local and global scale is needed.

To decipher the relationship between genome dynamics, organization and activity, we developed Hi-D, a method to overcome the limitations of sparse and ensemble approaches for imaging dense structures such as chromatin and nuclear proteins. Hi-D combines dense optical flow reconstruction, which quantifies the local motion of chromatin at sub-pixel accuracy from live-cell imaging [19], with a Bayesian inference approach to precisely classify local types of diffusion. Biophysical properties such as diffusion coefficients and anomalous exponents were determined for each pixel, yielding two-dimensional maps of chromatin dynamics at sub-pixel resolution down to 65 nm over the entire nucleus in living single cells. We examine the capacity of Hi-D to study configurational changes of labelled chromatin in quiescent and stimulated cells. We find that, contrary to common belief, DNA compaction and dynamics do not necessarily correlate. This strengthens the hypothesis that, instead, the chromatin environment, modulated by activity-induced macromolecular crowding or association with the nuclear lamina, dictates chromatin dynamics at local scales.

## RESULTS

### Hi-D determines and classifies DNA dynamics at nanoscale

Motion observed in a series of conventional confocal fluorescence microscopy images was quantitatively reconstructed by a dense Optical Flow method [19] and trajectories were calculated by integration of the resulting flow fields (Figure 1a, Supplementary Note 1). In order to examine the accuracy of calculated trajectories and associated diffusion constants, we compared Hi-D results to those of a Single Particle Tracking (SPT) method which is commonly used for dense molecule tracking, dynamic multiple-target tracing (MTT) [20] (Supplementary Note 2). While the SPT method outperforms the Hi-D approach in scenarios of sparsely labelled molecules, Hi-D yields considerably more accurate estimates of local diffusion coefficients than SPT in scenarios of densely labelled molecules or structures of heterogeneous labelling density such as chromatin. Hi-D therefore constitutes an approach to extract dynamic information of biomolecules with dense labelling where SPT cannot be applied. To ensure that the calculated dynamics are not a consequence of imaging noise, we experimentally validated the sensitivity of the approach by calculating the Mean Squared Displacement (MSD) for formaldehyde-fixed and living U2OS cells labelled using the SiR-DNA kit, with and without supplying serum during growth. Diffusion coefficients derived from the MSD curves by Bayesian inference were about two orders of magnitude greater in living cells than in fixed cells (Supplementary Figure 5), confirming that Hi-D enables quantifying DNA dynamics well above the noise background. The type of diffusion characterizing each pixel’s chromatin motion was determined in an unbiased manner using Bayesian inference to simultaneously test a set of common models to fit each MSD curve [22] (Figure 1b, left panel). The five principal models are shown as a coloured map directly on the nucleus (Figure 1b, right panel) (Methods section). This classification of local diffusion processes revealed that diffusion of DNA is highly heterogeneous throughout the nucleus.

We found that only a small fraction of trajectories displayed directed diffusion (Figure 1b), while the bulk of chromatin exhibited sub-diffusive behaviour. By examining a wide range of parameters governing these types of diffusion, our results suggest that chromatin diffusion in human U2OS cells can be adequately described as anomalous (Supplementary Note 3). Biophysical parameters calculated for each pixel (diffusion constant *D*, anomalous exponent *α* and drift velocity *V*) are presented in color-coded 2D heat-maps (Figure 1c) (Methods section) and form islands of irregular shape and dimensions (Figure 1c). These parameter maps clearly demonstrate that chromatin exhibits spatially partitioned dynamics forming a mosaic of domains with similar values characterizing chromatin motion. This view is in line with current insights that chromatin dynamics are spatially correlated in the micrometer-range [17,19].

### Hi-D mapping resolves heterogeneous motion of chromatin

To spatially resolve and classify chromatin motion, we applied Hi-D to an image series of live U2OS nuclei labelled by SiR-DNA (Figure 2a). The selected model for each pixel is represented by colour as indicated by the legend (Figure 2b). Diffusion coefficients (D) were calculated for each pixel based on the model selected by Bayesian inference. The range of D values (0 < D < 8.2 ×10^{−3} μm^{2}/s; blue to yellow) and *α* ranged (0.24 < *α* < 1.5) were plotted in a 2D heat map of the nucleus (Figure 2c, d). Small values of D were prominently located at the nuclear envelope (dark blue) (Figure 2c). Plotting the average diffusion constants versus the distance from the nuclear periphery showed that the mobility of the first 1 μm from the periphery increases linearly before assuming a nearly constant value (D = 3.9 ×10^{−3} ± 0.3 ×10^{−3} μm^{2}/s) in the inner volume of the nucleus (Figure 2e, upper panel). A similar but more gradual trend was observed for chromatin mobility next to nucleolar boundaries (Figure 2e, lower panel).

In numerous sites across the remaining nuclear volume, fast diffusive areas of irregular dimensions spanning 0.3 – 3 μm in diameter (yellow areas in Figure 2c) are embedded in the bulk of moderately dynamic chromatin. Areas of different parameter values seamlessly transition into one another without clearly defined boundaries, indicative of spatially correlated chromatin dynamics. To further quantify this heterogeneous distribution, D values were plotted in a histogram and contributing subpopulations were deconvolved using a General Mixture model (GMM) (Figure 2f, Methods, and Supplementary Figure 6). We determined that DNA mobility tended to be composed of three populations described by Gaussians, which we refer to as slow, intermediate and fast (Methods, exemplary in Figure 2b), for both serum-starved and -stimulated cells irrespective of the parameter under consideration (diffusion coefficient or anomalous exponent) (Figure 2f, g). We found that directed motion involving a drift velocity (V) was present to a much lower extent than free and anomalous diffusion, resulting in much less data for V than for the other two parameters (Figure 1c, Figure 2j). Hence the drift velocity was not retained for further analysis.

The slowly diffusing population was present near the nuclear periphery in a 100 – 200 nm thick rim (Figure 2c,h). Further, for each population found by the GMM (Figure 2h, i), the relative fractions of pixels assigned to each model are shown (Figure 2j). Noticeably, the slow population fraction was dominated by free diffusion, while the two major populations were represented by a mixture of motion types (Figure 2j). In comparison to the two major populations, the slow chromatin population was distinct not only in its localized spatial distribution along the entire nuclear periphery but also in its homogeneity in the type of motion. The intermediate and fast diffusing populations were distributed in large irregular areas throughout the nuclear volume with selection of four motion models (Figure 2j).

### Transcription status modulates chromatin diffusion processes

We compared the effect of transcriptional activity on chromatin dynamics in serum-starved and -stimulated cells. Compared to serum-starved U2OS nuclei, the average diffusion constant decreased by nearly one order of magnitude for all three populations upon addition of serum (Figure 3a). In serum-stimulated cells, sub-diffusive and super-diffusive regimes were strengthened (Figure 3b). The presence of a population with *α* close to 0.5 in both conditions is indicative of a Rouse-like chromatin fiber architecture [23,24]. We hence hypothesize that processes involved in cell maintenance considerably reduce the diffusion of chromatin, while in a quiescent state only a fraction of nuclear processes is active, allowing chromatin to be more mobile.

In serum-starved cells, large variations of D are suggestive of a range of diffusion modes in the absence of DNA-related activities, and increased *α* (≈0.75) in a large fraction of DNA further corroborates particle-like behavior of the fiber in a crowded solution [25]. In contrast, upon serum stimulation anomalous diffusion was predominant and its value (*α* ≈ 0.33) is indicative of entangled polymers [26], where the entanglement could effectively be established by protein complexes binding to DNA, thereby restricting large displacements of the polymer.

### Single-cell biophysical property maps of genome conformation and behaviour

To concomitantly monitor position and distribution of the three mobility populations in quiescent and activated cells, we determined Hi-D maps for a single cell (Figure 4a, b). The low D population occupying ≈6% of the nuclear area was invariant to transcriptional changes (Figure 4c). In contrast, upon serum stimulation, a large fraction of the fast moving population in quiescent cells was re-classified as intermediate mobility population with moderate changes in the mode of diffusion (Figure 4d).

Anomalous diffusion dominated across the entire nucleus (0.3 ≤ *α* ≤ 0.73) forming a mosaic-like pattern which was reinforced in transcriptionally active nuclei (Figure 4e, f). Within this pattern, patches of super-diffusive (red: *α* > 1) motion segregated into distinct islands which became more fragmented upon serum stimulation. These observations suggest that the decreased mobility results from a change in the crowding of microenvironment. Hi-D thus reveals high-resolution spatial changes in anomaly of chromatin diffusion in single cells. Further investigation may tell us if all or a subset of these physical domains correspond to the ones determined using chromosome conformation capture (Hi-C).

### Chromatin dynamics is uncoupled from compaction

We next asked if chromatin dynamics is influenced by the compaction of chromatin since heterochromatin is widely believed to be less dynamic than euchromatin [15]. Eu- and heterochromatin domains were determined in serum-starved and -stimulated cells by the fluorescence intensity as explained in [27] (Figure 5a). Heterochromatin at the nuclear periphery overlapped with the peripheral slow motion domain (Figure 5b) consistent with previous findings [15]. Inside the nucleus however, we did not observe any tendency of heterochromatin being associated with a specific mobility population. Instead, we observe that mobility populations were distributed equally among euchromatin and heterochromatin regions (Figure 5c) with the exception of heterochromatin being significantly more attributed to the slow diffusion population in serum-starved cells at the nuclear periphery. We speculate that this might be a confinement effect exerted by the nuclear envelope on the peripheral chromatin in the absence of DNA-related processes. Furthermore, we find that populations of the anomalous exponent do not show a tendency to overlap with either eu- or heterochromatin (Figure 5d, e). These results also hold for MCF7 cells (Figure 5f) and suggest chromatin undergoes diffusion processes which are, in general, unrelated to the compaction level of chromatin. However, compact chromatin is characterized by an increased contact frequency of the chromatin fiber with itself which seems plausible to enhance the extent of coherent chromatin motion. To test this hypothesis, we calculate Moran’s Index of Spatial Autocorrelation [28] for the flow magnitude assessed at different time lags in either eu- or heterochromatin (Figure 5f). 
We find that heterochromatin exhibits enhanced spatial autocorrelation compared to euchromatin across all accessible time lags in line with our hypothesis and that spatial autocorrelation decreases with increasing time lag in serum-starved cells, while in serum-stimulated cells, autocorrelation is enhanced in the long-time limit (over 30 seconds). This points to active processes establishing spatial coherence in the long-term [17,19] while random processes such as thermal fluctuations decrease autocorrelation in the range of 10 seconds and over the whole time span in serum-starved cells.

## DISCUSSION

Hi-D is a single cell approach that enables tracking of dense structures such as chromatin directly without losing active fluorophore density and with no need for prior experience in sophisticated labelling preparations or advanced microscopy [29]. The information gain through image analysis afforded by Hi-D alleviates the incompatibility of conventional microscopy for nanoscale mapping of chromatin dynamic properties in living cells.

Hi-D analysis revealed three populations of DNA chromatin fibers diffusing in a comparable manner. The first, a slow mobility fraction prominently located at the nuclear rim, is reminiscent of lamina associated domains (LADs) [30] and the dynamic response to transcriptional stimuli supports the hypothesis that LADs play an important role in attenuating transcription activity and of gene expression patterns [31]. Although chromatin that is less mobile at the nuclear periphery largely overlapped with long known perinuclear heterochromatin [32,33], Hi-D analysis remarkably points to an overall absence of correlation between chromatin compaction and mobility.

Indeed, sub-regions displaying intermediate and highly diffusive regimes were distributed in a mosaic-like pattern throughout the nucleus, some of them also at the nuclear periphery. Heterochromatin therefore does not exhibit low mobility in nuclear space in general, but may be divided into a more viscous component and a more rigid elastic LAD chromatin with reduced mobility due to anchoring to a nuclear substructure [34]. The extent of the third, highly mobile fraction, which dominated chromatin behaviour in the quiescent state, decreased dramatically when cells were serum stimulated. This global switch in chromatin mobility is suggestive of dramatically altered molecular crowding, as a result of the increased concentration of players of all types in a highly transcribing nucleus, in particular those of ribosome biogenesis [19,35–37].

Heterogeneous chromatin motion arises due to irregular protein binding along the chromosomes and can lead to thermodynamic or electrostatic self-organisation of nuclear compartments [38,39]. Local patches of large anomalous exponents indicate super-diffusive behaviour of chromatin which may result, among others, from loop extrusion by structural maintenance of chromosome (SMC) complexes [40,41]. Strikingly, quiescent cells show an increase of chromatin condensation by accumulation of the SMC complex condensin during quiescence entry [42], which is indicative of dynamically active chromatin in quiescence. Furthermore, chromatin patches with *α* < 0.3 and *α* > 1, respectively, correspond in size to one or a few TADs [43]. These two types of patches are present as islands within the general chromatin fraction governed by an anomalous exponent 0.3 < *α* < 1. This organization is in good agreement with the chromosome territory - interchromatin compartment model [2]. In conclusion, the diffusion constant and anomalous exponent maps were complementary and together provide an integrated view on chromatin organization and dynamics during genomic processes.

We focus here on DNA, but Hi-D can be applied to real-time imaging of any fluorescent molecule to obtain comprehensive maps of their dynamic behavior in response to stimuli, inhibitors or disruptors of nuclear functions and integrity. Hi-D could be extended to 3D image analysis and, furthermore, combined with tracking of specific loci to zoom in on their environment. It will also be exciting to probe phase-separating condensates to gain a better understanding of their physical nature [44].

## METHODS

### Cell Culture

Human U2OS osteosarcoma and MCF-7 cells (ATCC) were maintained in phenol red-free Dulbecco’s modified Eagle’s medium (DMEM) and DMED-12 (Sigma-Aldrich), respectively. Medium was supplemented with Glutamax containing 50 μg/ml gentamicin (Sigma-Aldrich), 10% fetal bovine serum (FBS), 1 mM sodium pyruvate (Sigma-Aldrich) and G418 0.5 mg/ml (Sigma-Aldrich) at 37°C with 5% CO2. Cells were plated for 24 h on 35 mm petri dishes with a #1.5 coverslip-like bottom (μ-Dish, Ibidi, Biovalley) at a density of about 10^{5} cells/dish.

### Cell starvation and stimulation

For starvation mode, cells were incubated for 24 h at 37°C before imaging with serum-free medium (DMEM, Glutamax containing 50 μg/ml gentamicin, 1 mM sodium pyruvate, and G418 0.5 mg/ml). Just before imaging, cells were mounted in L-15 medium. For stimulation, 10% FBS was added to the L-15 medium for 10 minutes.

### Chromatin staining

U2OS and MCF-7 cell lines were labelled using the SiR-DNA (SiR-Hoechst) kit (Spirochrome AG). DNA was labelled as described in [45]. Briefly, we diluted the 1 mM stock solution in cell culture medium to a concentration of 2 μM and vortexed briefly. On the day of imaging, the culture medium was changed to medium containing SiR-fluorophores and cells were incubated at 37°C for 30-60 minutes. Before imaging, the medium was changed to L-15 medium (Leibovitz’s, Gibco) for live imaging.

### Cell fixation

U2OS cells were washed with pre-warmed (37 °C) phosphate buffered saline (PBS) and fixed with 4% (vol/vol) paraformaldehyde in PBS for 10-20 min at room temperature. Movies were recorded at room temperature in PBS after washing the cells with PBS (three times, 5 min each).

### Imaging

Cells were placed in a 37 °C humid incubator by controlling the temperature and CO2 flow using H201-couple with temperature and CO2 units. Live chromatin imaging was acquired using a DMI8 inverted automated microscope (Leica Microsystems) featuring a confocal spinning disk unit (CSU-X1-M1N, Yokogawa). Integrated laser engine (ILE 400, Andor) was used for excitation with a selected wavelength of 647 nm and 140mW as excitation power. A 100× oil immersion objective (Leica HCX-PL-APO) with a 1.4 NA was chosen for a high resolution imaging. Fluorescence emission of the SiR-Hoechst was filtered by a single-band bandpass filter (FF01-650/13-25, Semrock, Inc.). Image series of 150 frames (5 fps) were acquired using Metamorph software (Molecular Devices), and detected using sCMOS cameras (ORCA-Flash4.0 V2) and (1×1 binning), with sample pixel size of 65 nm. All series were recorded at 37°C.

### Image processing

#### Denoising

Raw images were denoised using non-iterative bilateral filtering [46]. While Gaussian blurring only accounts for the spatial distance of a pixel and its neighbourhood, bilateral filtering additionally takes the difference in intensity values into account and is therefore an edge-preserving method. Abrupt transitions from high- to low-intensity regions (e.g. heterochromatin to euchromatin) are not over-smoothed.
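To make the difference between Gaussian blurring and bilateral filtering concrete, the following is a minimal one-dimensional sketch in plain Python. It is not the implementation used for the images in this study: the function name `bilateral_filter_1d` and the parameters `radius`, `sigma_space` and `sigma_intensity` are illustrative assumptions, and real images would be filtered in two dimensions.

```python
import math

def bilateral_filter_1d(signal, radius=2, sigma_space=1.0, sigma_intensity=0.1):
    """Edge-preserving smoothing of a 1D intensity profile (toy example)."""
    filtered = []
    for i, center in enumerate(signal):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            # Spatial weight: same role as in a Gaussian blur.
            w_space = math.exp(-0.5 * ((j - i) / sigma_space) ** 2)
            # Intensity weight: suppresses neighbours across an intensity edge.
            w_int = math.exp(-0.5 * ((signal[j] - center) / sigma_intensity) ** 2)
            num += w_space * w_int * signal[j]
            den += w_space * w_int
        filtered.append(num / den)
    return filtered

# Noisy step profile, e.g. a low-intensity region next to a high-intensity one.
profile = [0.02, -0.01, 0.01, 0.0, 1.0, 0.98, 1.01, 1.0]
smoothed = bilateral_filter_1d(profile)
```

Because the intensity weight is close to zero across the step, values on either side are averaged only with their own side and the transition stays sharp.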

### MSD analysis and model selection by using Bayesian inference

In order to carry out an MSD analysis locally, the spatial dependency of the Mean Squared Displacement (MSD) can be written explicitly:

$$\mathrm{MSD}(\mathbf{r}_0, \tau) = \left\langle \left( \mathbf{r}_{\mathbf{r}_0}(t + \tau) - \mathbf{r}_{\mathbf{r}_0}(t) \right)^{2} \right\rangle_{t}$$

where $\mathbf{r}_{\mathbf{r}_0}(t)$ is the position at time *t* of a virtual particle with initial position $\mathbf{r}_0$, τ = {Δ*t*, 2Δ*t*, …, (*N* – 1)Δ*t*} are time lags where Δ*t* is the time difference between subsequent images and the average <·>_{t} is taken over time. The resulting MSD is a function of the initial position $\mathbf{r}_0$ and the time lag τ.
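As an illustration of the time-averaged MSD of a single virtual particle, the sketch below averages squared displacements over all start times for each time lag. The function name `time_averaged_msd` and the toy trajectory are assumptions made for this example; the frame interval of 0.2 s corresponds to the 5 fps acquisition rate.

```python
def time_averaged_msd(trajectory, dt):
    """Return (lags, msd) for one 2D trajectory given as a list of (x, y) positions."""
    n = len(trajectory)
    lags, msd = [], []
    for k in range(1, n):  # lag index, tau = k * dt
        squared = [
            (trajectory[i + k][0] - trajectory[i][0]) ** 2
            + (trajectory[i + k][1] - trajectory[i][1]) ** 2
            for i in range(n - k)  # average over all start times t
        ]
        lags.append(k * dt)
        msd.append(sum(squared) / len(squared))
    return lags, msd

# Toy trajectory: uniform drift of 0.01 um per frame along x, imaged at 5 fps.
traj = [(0.01 * i, 0.0) for i in range(10)]
lags, msd = time_averaged_msd(traj, dt=0.2)
```

In Hi-D one such curve is obtained per pixel, since every pixel is the starting point of one virtual particle.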

#### MSD models

The MSD can be expressed analytically for anomalous diffusion (DA), confined diffusion (DR) and directed motion (V) in two dimensions as

$$\mathrm{MSD}_{DA}(\tau) = 4 D_{\alpha} \tau^{\alpha} \tag{1}$$

$$\mathrm{MSD}_{DR}(\tau) = R_{c}^{2} \left( 1 - \exp\left( -\frac{4 D \tau}{R_{c}^{2}} \right) \right) \tag{2}$$

$$\mathrm{MSD}_{V}(\tau) = v^{2} \tau^{2} \tag{3}$$

where *D*_{α} is the diffusion coefficient in units of *μm*^{2}/*s*^{α}, *α* is its anomalous exponent, *v* [*μm/s*] its velocity and *R*_{c} [*μm*] is the radius of a sphere within which the particle is confined [21]. The case *α* = 1 is known as free diffusion, 0 < *α* < 1 corresponds to anomalous diffusion and 1 < *α* ≤ 2 corresponds to superdiffusion. Strictly speaking, each generalized diffusion coefficient *D*_{α} has different units, corresponding to the specific value of *α*. However, we refer to it as the diffusion coefficient *D* throughout the text for simplicity. In addition to eq. (1)-(3), different types of motion can be superimposed, resulting in a linear combination of the equations above. For example, anomalous motion can be superimposed on an underlying drift and the resulting *MSD* reads *MSD*_{DAV}(τ) = *MSD*_{DA}(τ) + *MSD*_{V}(τ). We found that anomalous and confined diffusion appear very similar in experimental data and therefore decided in favor of anomalous diffusion to describe our data (Supplementary Note 3). The abbreviations used in this study are summarized in Table 1. As experimental data is usually subject to noise, a constant offset *o* is added to every model.
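The model functions can be written down compactly. The sketch below assumes the common two-dimensional forms 4·D_α·τ^α (anomalous), R_c²·(1 − exp(−4Dτ/R_c²)) (confined) and v²τ² (directed), each with the constant noise offset; the function names are illustrative and not taken from the original analysis code.

```python
import math

def msd_anomalous(tau, d_alpha, alpha, offset=0.0):
    """Anomalous diffusion (DA); alpha = 1 gives free diffusion."""
    return 4.0 * d_alpha * tau ** alpha + offset

def msd_confined(tau, d, r_c, offset=0.0):
    """Confined diffusion (DR) within a radius r_c."""
    return r_c ** 2 * (1.0 - math.exp(-4.0 * d * tau / r_c ** 2)) + offset

def msd_directed(tau, v, offset=0.0):
    """Directed motion (V) with drift velocity v."""
    return (v * tau) ** 2 + offset

def msd_anomalous_with_drift(tau, d_alpha, alpha, v, offset=0.0):
    """Superposition DAV: anomalous diffusion on top of a drift."""
    return msd_anomalous(tau, d_alpha, alpha) + msd_directed(tau, v) + offset

# Example: D = 3.9e-3 um^2/s, free diffusion, evaluated at a lag of 1 s.
example = msd_anomalous(1.0, 3.9e-3, 1.0)
```

At short lags the confined model is indistinguishable from free diffusion (it approaches 4Dτ), while at long lags it saturates at R_c².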

#### MSD model selection

The MSD is calculated for every pixel independently, resulting in a space- and time lag-dependent MSD. It is known that living cells can behave highly heterogeneously [3,47]. A priori, it is not known which particle undergoes which kind of diffusion. Fitting an MSD curve with a wrong model might result in poor fits and highly inaccurate determination of the mentioned parameters. For this reason, we use a Bayesian inference approach to test different models for any given MSD curve as proposed by Monnier *et al.* [22]. Given the data *Y* = {*Y*_{1}, …, *Y*_{n}} and *K* model candidates *M* = {*M*_{1}, …, *M*_{K}}, each with its own (multidimensional) parameter set *θ* = {*θ*_{1}, …, *θ*_{K}}, we want to find the model *M*_{k}(*Y*, *θ*_{k}) such that the probability that *M*_{k}(*Y*, *θ*_{k}) describes the data, given the set of models to test, is maximal. By Bayes’ theorem, the probability for each model is given by

$$P(M_k \mid Y) = \frac{P(Y \mid M_k)\, P(M_k)}{P(Y)}$$

If there is no reason to prefer one model over the other, the prior probability of each model *P*(*M*_{k}) is equal. The parameter set which is used to describe the data, given a fixed model, strongly influences the probability. Therefore, it is crucial to estimate the optimal parameters for every model in order to calculate the model probabilities. The probability that the data *Y* is observed, given the model *M*_{k} described by the model function *M*_{k}(*x*; *θ*_{k}) and any parameter set *θ*_{k}, is approximated by a general multivariate Gaussian function [48]

$$P(Y \mid \theta_k, M_k) = \frac{1}{(2\pi)^{n/2} |C|^{1/2}} \exp\left( -\frac{1}{2} \left[ Y - M_k(x; \theta_k) \right]^{T} C^{-1} \left[ Y - M_k(x; \theta_k) \right] \right)$$

where *C* is the empirical covariance matrix of the data and the prefactor is a normalizing factor. This equation has an intuitive meaning. Assume we test a model *M*_{k} parametrized by *θ*_{k} to find out if it describes the data *Y*. The exponential function contains the term [*Y* – *M*_{k}(*x*; *θ*_{k})], i.e. the residuals between the data and the describing model. If the residuals are small, i.e. the model describes the data well, the magnitude of the exponent is small and the exponential term approaches 1, so that *P*(*Y* |*θ*_{k}, *M*_{k}) approaches its maximum. On the other hand, the worse the fit, the greater the resulting residuals and the probability tends asymptotically to 0. The factor *C*^{−1} accounts for the covariance in the data. The covariance matrix for a set of MSD curves normally shows large values for large time lags as the uncertainty increases and MSD curves diverge. The covariance matrix implicitly introduces a weight to the data, which is small for large variances and large where the data spreads little. This avoids cutting off the MSD curve after a specific number of time lags, and instead includes all available time lags weighted by the covariance matrix. The approach is illustrated in Supplementary Figure 6b, with an example covariance matrix shown in the inset. In case of uncorrelated errors, non-diagonal elements are zero, but the approach keeps its validity [49] and follows an ordinary least-squares regression.
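To illustrate how the covariance acts as a weight, the sketch below evaluates the Gaussian log-likelihood for the special case of uncorrelated errors, i.e. a diagonal covariance matrix given by a list of variances. This restriction, the function name `gaussian_log_likelihood` and all numerical values are assumptions made for the example; the full method uses the empirical covariance matrix.

```python
import math

def gaussian_log_likelihood(y, model_values, variances):
    """log P(Y | theta, M) for a multivariate Gaussian with diagonal covariance."""
    n = len(y)
    # Logarithm of the normalizing prefactor 1 / ((2 pi)^(n/2) |C|^(1/2)).
    log_norm = -0.5 * (n * math.log(2.0 * math.pi) + sum(math.log(v) for v in variances))
    # Residuals weighted by the inverse variance of each time lag.
    chi2 = sum((yi - mi) ** 2 / v for yi, mi, v in zip(y, model_values, variances))
    return log_norm - 0.5 * chi2

taus = [0.2, 0.4, 0.6, 0.8, 1.0]
# Synthetic MSD curve of free diffusion with D = 4e-3 um^2/s (hypothetical data).
y = [4.0 * 0.004 * t for t in taus]
# Variances that grow with the time lag, as is typical for MSD curves.
variances = [(0.0005 * k) ** 2 for k in range(1, 6)]

ll_free = gaussian_log_likelihood(y, [4.0 * 0.004 * t for t in taus], variances)
ll_directed = gaussian_log_likelihood(y, [(0.1 * t) ** 2 for t in taus], variances)
```

The model with the smaller weighted residuals (`ll_free`) receives the higher log-likelihood; long lags with large variance contribute less to the comparison.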

Given the best estimate of the parameter set for a model, the model and its corresponding parameters are chosen so that their probability to describe the data is maximal:

$$\hat{M} = \underset{M_k}{\arg\max}\; P(M_k \mid Y)$$
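The final selection step can be sketched as follows, assuming that a log-probability of the data log *P*(*Y*|*M*_{k}) is already available for each candidate model and that all models have equal priors. The model labels and the numerical values are hypothetical placeholders, not results from this study.

```python
import math

def model_posteriors(log_evidence):
    """Posterior P(M_k | Y) for equal priors, from log P(Y | M_k) per model."""
    peak = max(log_evidence.values())
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    weights = {name: math.exp(value - peak) for name, value in log_evidence.items()}
    total = sum(weights.values())  # plays the role of P(Y) up to a constant
    return {name: w / total for name, w in weights.items()}

# Hypothetical log-probabilities for one MSD curve and five candidate models.
log_evidence = {"D": -12.0, "DA": -10.5, "V": -25.0, "DV": -14.0, "DAV": -13.0}
posteriors = model_posteriors(log_evidence)
best_model = max(posteriors, key=posteriors.get)
```

With uniform priors the ranking of the posteriors equals the ranking of the likelihood terms, so the normalization only matters when posterior probabilities themselves are reported.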

It has to be stressed that values of the anomalous exponent scatter around 1, but do not assume the value 1 (e.g. Figure 1c, middle panel). This is due to the model selection procedure, which selects the simplest model consistent with the data. If the underlying motion is well described by free diffusion, *α* is inherently set to 1 and the motion is classified as free diffusion rather than anomalous diffusion. The descriptions of free diffusion or anomalous diffusion with *α* = 1 are equivalent, but the free diffusion model contains one parameter less and is therefore preferred, leading to an abundance of *α* values close to 1 in the parameter maps and histograms.

To carry out the MSD analysis locally, we choose to take the 3×3 neighborhood of a pixel, detect possible outliers therein by the interquartile range criterion [50] and calculate the error covariance matrix of the data within the pixel’s neighborhood. The restriction to a single pixel and its neighborhood allows us to carry out the MSD analysis of trajectories locally, in contrast to the ensemble MSD used in previous studies [17], which reveals only average information over many trajectories. The choice of a 3×3 window is reasonable with regard to the equivalently chosen filter size in the Optical Flow estimation. The flow field in this region is therefore assumed to be sufficiently smooth.

All calculations, except for the General Mixture Model analysis, were carried out using MATLAB (MATLAB Release 2017a, The MathWorks, Inc., Natick, Massachusetts, United States) on a 64-bit Intel Xeon CPU E5-2609 1.90 GHz workstation running Microsoft Windows 10 Professional with 64 GB RAM.
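The outlier detection within a 3×3 neighborhood can be illustrated with the commonly used form of the interquartile range criterion, in which values further than 1.5 × IQR from the quartiles are discarded. The factor 1.5, the function name `iqr_inliers` and the nine example values are assumptions for this sketch; the original analysis was carried out in MATLAB and may differ in detail.

```python
import statistics

def iqr_inliers(values, k=1.5):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR] (Tukey fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]

# Hypothetical MSD values (arbitrary units) at one time lag for the nine
# trajectories starting in a 3x3 pixel neighborhood; one is clearly aberrant.
neighborhood = [1.0, 1.1, 0.9, 1.05, 0.95, 1.02, 0.98, 1.0, 5.0]
kept = iqr_inliers(neighborhood)
```

Only the retained values would then enter the estimate of the local error covariance matrix.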

### Deconvolution of sub-populations

Regarding especially the distribution of the diffusion coefficient, an analytical expression can be found assuming that the diffusion coefficient was calculated from a freely diffusing particle (*α* = 1) [51]. However, we find anomalous diffusion to a large extent in our data (e.g. Figure 2b, d and Figure 4d) and, to our knowledge, an analytical expression cannot be found for distributions of anomalous exponent, radius of confinement and drift velocity. We therefore aim to deconvolve the parameter sets in a rather general manner, for which we use a General Mixture model (GMM), a probabilistic model composed of multiple distributions and corresponding weights. We describe each data point as a portion of a normal or log-normal distribution described by

$$f_{N}(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi\sigma^{2}}} \exp\left( -\frac{(x - \mu)^{2}}{2\sigma^{2}} \right) \quad \text{and} \quad f_{\log N}(x; m, s) = \frac{1}{x\, s \sqrt{2\pi}} \exp\left( -\frac{(\ln x - m)^{2}}{2 s^{2}} \right),$$

respectively. The logarithmic mean *m* and standard deviation *s* are related to the mean *μ* and standard deviation *σ* of the normal distribution via [52]

$$m = \ln\left( \frac{\mu^{2}}{\sqrt{\mu^{2} + \sigma^{2}}} \right), \qquad s = \sqrt{\ln\left( \frac{\sigma^{2}}{\mu^{2}} + 1 \right)}.$$
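The conversion between the two parametrizations can be checked numerically. The sketch below assumes the standard relations m = ln(μ²/√(μ² + σ²)) and s = √(ln(1 + σ²/μ²)), where μ and σ are taken to be the mean and standard deviation of the log-normally distributed quantity itself; function names and example values are illustrative.

```python
import math

def lognormal_params(mu, sigma):
    """Logarithmic mean m and standard deviation s from mean mu and std sigma."""
    m = math.log(mu ** 2 / math.sqrt(mu ** 2 + sigma ** 2))
    s = math.sqrt(math.log(1.0 + sigma ** 2 / mu ** 2))
    return m, s

def lognormal_moments(m, s):
    """Mean and standard deviation of a log-normal distribution with parameters m, s."""
    mean = math.exp(m + 0.5 * s ** 2)
    std = mean * math.sqrt(math.exp(s ** 2) - 1.0)
    return mean, std

# Hypothetical population: mean 3.9e-3 um^2/s, standard deviation 1.5e-3 um^2/s.
m, s = lognormal_params(3.9e-3, 1.5e-3)
mean_back, std_back = lognormal_moments(m, s)
```

Converting back with `lognormal_moments` recovers the input mean and standard deviation, which confirms that the two sets of parameters describe the same distribution.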

We consider up to three subpopulations to be found in our data and model the total density estimate as a superposition of one, two or three subpopulations, i.e. the Mixture Model reads

$$f(x) = \sum_{k=1}^{K} w_{k}\, f_{k}(x; \theta_{k})$$

for both normal and log-normal distributions, where *f*_{k} is the density of component *k* with parameters *θ*_{k} and the sum runs up to *K* = 1, 2 or 3. The variable *w*_{k} describes the weights for each population (or component), which satisfy 0 ≤ *w*_{k} ≤ 1 and sum up to unity. The weight of each component is directly proportional to the area of the histogram covered by this component and therefore to its prevalence in the data set.
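A mixture density of this kind can be evaluated in a few lines. The sketch below uses normal components only; the weights and component parameters are hypothetical values chosen to resemble a slow, an intermediate and a fast population (in units of 10⁻³ μm²/s), not fitted results.

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, components):
    """Weighted sum of normal densities; components is a list of (mu, sigma)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * normal_pdf(x, mu, sigma) for w, (mu, sigma) in zip(weights, components))

weights = [0.1, 0.5, 0.4]                          # slow, intermediate, fast
components = [(1.0, 0.3), (3.0, 0.6), (6.0, 1.0)]  # (mean, std) per population

# Riemann sum over a wide grid: the total area under the mixture should be ~1.
step = 0.01
area = sum(mixture_pdf(-5.0 + i * step, weights, components) * step for i in range(2001))
```

Since each component integrates to one, the area contributed by component *k* equals its weight, which is why the weights can be read as population fractions.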

### General Mixture Model analysis

Let *Y* = [*Y*_{1},…, *Y*_{n}] denote *n* data points. For the scope of this description, assume *Y* to be a one-dimensional variable. Further assume that the data cannot be described by a single distribution, but by a mixture of distributions. A deconvolution of the data into sub-populations faces the following problem: Given a label for each data point, denoting the affiliation to a population, one could group corresponding data points and find the parameters of each population separately using a maximum likelihood estimation or other methods. On the other hand, if we were given the model parameters for each population, labels could in principle be inferred from the likelihood of a data point being described by one population or another. The problem can be formulated by Bayes’ rule (*M* indicates model, *D* indicates data)

$$P(M \mid D) = \frac{P(D \mid M)\, P(M)}{P(D)}$$

Here, *P*(*M*|*D*) is the posterior probability of the model given the data, which we aim to calculate. We assign a data point to the component which maximizes *P*(*M*|*D*). The probability to observe the data given a model is described by *P*(*D*|*M*), i.e. the likelihood function. *P*(*M*) is the prior for the models to be chosen from. In our case, we have no prior beliefs on the models (all models are equally likely) such that *P*(*M*) is uniform. Lastly, the probability *P*(*D*) does not depend on the models and can therefore be dropped.

Unfortunately, neither labels, that is *P*(*M*|*D*), nor model parameters and weights are known a priori. The problem can be approached by an Expectation-Maximization (EM) scheme: Without any prior beliefs about the data distribution, one starts with a simple estimate of model parameters, e.g. a k-means clustering estimate and iterates subsequently between the two following steps until convergence:

#### Expectation step

Calculation of the probability that the component with the current parameter estimate generated the sample, i.e. *P*(*D*|*M*).

#### Maximization step

Update the current parameter estimate for each component by means of a weighted maximum likelihood estimate, where the weight is the probability that the component generated the sample.

We illustrate exemplary results of the EM algorithm in Supplementary Figure 8. From the input data (Supplementary Figure 8a), represented as a histogram, both the likelihood *P*(*D*|*M*) (Supplementary Figure 8b) and the posterior (Supplementary Figure 8c) are obtained. The sum of subpopulations corresponds to the overall probability distribution (shown in black), with model parameters and weights found by maximizing the likelihood function. The posterior describes the probability of data points to fall under each population, i.e. ∑_{k}*P*(*M*_{k}|*D*) = 1. Each data point is assigned to the population for which *P*(*M*_{k}|*D*) is maximum, resulting in labeled data. The labels are subsequently mapped in two dimensions, visualizing the spatial correspondence of slow, intermediate and fast sub-populations (Supplementary Figure 8d). The GMM analysis is carried out using the pomegranate machine learning package for probabilistic modeling in Python [53].
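The two EM steps can be sketched for a one-dimensional mixture of normal distributions as follows. This is a minimal illustration using only the Python standard library, not the pomegranate-based implementation used in this study; the function names, the fixed number of iterations (instead of a convergence test) and the hand-picked initial guess (instead of a k-means estimate) are assumptions made for brevity.

```python
import math
import random


def normal_pdf(y, mean, std):
    """Density of a normal distribution evaluated at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(data, means, stds, weights):
    """Expectation step: posterior probability of each component for each point."""
    result = []
    for y in data:
        joint = [w * normal_pdf(y, m, s) for m, s, w in zip(means, stds, weights)]
        total = sum(joint)
        result.append([j / total for j in joint])
    return result


def em_gaussian_mixture(data, means, stds, weights, n_iter=50):
    """Fit a 1D Gaussian mixture by EM, starting from the given initial guess."""
    means, stds, weights = list(means), list(stds), list(weights)
    n_comp = len(means)
    for _ in range(n_iter):
        resp = posteriors(data, means, stds, weights)
        # Maximization step: weighted maximum likelihood update per component.
        for k in range(n_comp):
            r_k = sum(r[k] for r in resp)
            means[k] = sum(r[k] * y for r, y in zip(resp, data)) / r_k
            var = sum(r[k] * (y - means[k]) ** 2 for r, y in zip(resp, data)) / r_k
            stds[k] = math.sqrt(max(var, 1e-12))
            weights[k] = r_k / len(data)
    # Label each point by the component with the largest posterior probability.
    resp = posteriors(data, means, stds, weights)
    labels = [max(range(n_comp), key=lambda k: r[k]) for r in resp]
    return means, stds, weights, labels


# Synthetic example: two well-separated populations (hypothetical data).
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(300)]
data += [rng.gauss(5.0, 1.0) for _ in range(300)]
means, stds, weights, labels = em_gaussian_mixture(
    data, means=[-1.0, 6.0], stds=[1.0, 1.0], weights=[0.5, 0.5]
)
```

The returned labels correspond to the assignment of each data point to the population with maximum posterior probability; a log-normal component could be handled in the same scheme by fitting the logarithm of the data.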

### Selection of subpopulations by the Bayesian Information Criterion (BIC)

A priori, it is not clear from how many populations the data is sampled and which form the subpopulations take. We therefore assess the suitability of each model by means of the Bayesian Information Criterion (BIC), which is calculated by[54]

$$BIC = -2 \ln \hat{L} + p \ln n \tag{3}$$

where $\hat{L}$ is the maximized likelihood obtained from the maximum likelihood estimation (MLE), *p* denotes the number of parameters in the model and *n* is the number of data points used for the fit. Among a family of models, the one with the lowest BIC is considered to describe the data best, taking into account the competing complexity of models. A large likelihood favors a model, while on the other hand the model is penalized by the second term in (3) if many parameters are involved. Therefore, the BIC prevents overfitting. In order to judge which model is appropriate for our data, we tested all considered models for each histogram and assessed the optimal model by means of the BIC. The fraction of all histograms which are described best by one of the six models considered is given in Table 2. Based on the objective judgement of the fit using the BIC, we chose for each parameter the model which best describes the largest fraction of histograms (Table 2, bold cells).
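As an illustration of the model selection step, the following sketch computes the BIC in its standard form, −2 ln(likelihood) + *p* ln(*n*), and picks the candidate with the lowest value. The log-likelihood values and parameter counts below are hypothetical numbers chosen for the example, not results from this study.

```python
import math


def bic(log_likelihood, n_params, n_points):
    """Bayesian Information Criterion: -2 ln(L) + p ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_points)


def select_model(candidates, n_points):
    """Return the name of the candidate with the lowest BIC and all scores.

    candidates maps a model name to (maximized log-likelihood, number of parameters).
    """
    scores = {name: bic(ll, p, n_points) for name, (ll, p) in candidates.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fits of mixtures with 1, 2 and 3 normal components to 1000 points.
# A k-component 1D normal mixture has 3k - 1 free parameters.
candidates = {
    "1 component": (-1520.0, 2),
    "2 components": (-1405.0, 5),
    "3 components": (-1402.0, 8),
}
best, scores = select_model(candidates, n_points=1000)
```

In this example the third component improves the likelihood only marginally, so the penalty term outweighs the gain and the two-component model is selected.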

## DATA AVAILABILITY

The code of the Hi-D method and raw data are available on request.

## AUTHOR CONTRIBUTIONS

H.A.S. conceived the method, performed the experimental work, developed the data processing algorithm and analyzed the data; R.B. wrote the code, developed the data processing algorithm and analyzed the data; H.A.S. and K.B. interpreted the results; H.A.S., R.B. and K.B. wrote the manuscript.

## CONFLICT OF INTEREST

The authors declare no competing financial interests.

## CONTENTS

### SUPPLEMENTARY NOTE 1

This note provides detailed information about the conversion of flow fields into trajectories.

### SUPPLEMENTARY NOTE 2

This note provides detailed information about the simulations carried out to compare Optical Flow and Single Particle Tracking reconstruction of trajectories in different imaging conditions and to illustrate the response of Optical Flow to experimental issues such as diffusion out of focus and heterogeneous labelling density in the field of view.

### SUPPLEMENTARY NOTE 3

This note provides a detailed view on the comparison of confined and anomalous diffusion, which is supported by exemplary simulations of the two types of diffusion.

## SUPPLEMENTARY NOTE 1

### Conversion of flow fields into trajectories

In high-density scenarios where Single Particle Tracking methods reach their limit, dense Optical Flow methods present a powerful tool to investigate local bulk motion of biological macromolecules. Here we determine flow fields of fluorescently labelled DNA and reconstruct virtual trajectories to extract motion at sub-pixel resolution and over long time intervals at the level of the whole nucleus [1]. Optical Flow algorithms estimate motion between frames as a field description (Eulerian description) of the underlying continuum motion, evaluated at fixed ‘stations’, i.e. the pixel positions in the Cartesian coordinate system, which is a powerful approach when the coordinates of single particles cannot be defined. In contrast, actual tracking of particles’ coordinates over time is referred to as the Lagrangian description. A continuum motion consisting of a finite number of particles can be described in both ways, according to continuum mechanics[2–4]. Because it is impossible to identify individual emitters (or particles) in densely labelled images, we start out with a limited number of *virtual* particles, which are assumed to be seeded on a regular grid defined by the pixels. Let $\mathbf{X}$ denote these (fixed) pixel positions. Let the coordinates of the particle in Cartesian space be $\mathbf{x}(t)$ at any time point *t*. Consider further that the Eulerian flow field $\mathbf{v}(\mathbf{X}, t)$ is known only at the positions $\mathbf{X}$, but can be evaluated at any position by interpolation to the coordinates of interest. Then, the particle’s (Lagrangian) velocity at position $\mathbf{x}(t)$ and time *t* is the same as the Eulerian velocity at $\mathbf{x}(t)$:

$$\frac{d\mathbf{x}(t)}{dt} = \mathbf{v}(\mathbf{x}(t), t) \tag{1}$$

Therefore, the trajectory, consisting of the consecutive positions of the particle, can be obtained by integration of Equation (1) using the fact that Eulerian and Lagrangian velocities are equal when evaluated at the same position. In situations where particle detection is impossible (e.g. due to high density of emitters), the Eulerian description of continuum motion can be translated to a Lagrangian description by considering *virtual* particles and using the flow field description to extract their hypothetical trajectories. We consider virtual particles with initial positions at the center of each image pixel, for which flow was estimated by Optical Flow (Supplementary Figure 1a, first flow field highlighted). Note that the flow fields describe the motion *between* frames, whereas particle coordinates are described at the imaging time of each frame. The time evolution, i.e. the trajectory of each virtual particle, is reconstructed as follows: From the particle’s initial position $\mathbf{x}(t_0)$, the flow field dictates the displacement from frame 1 to frame 2 (dark blue trajectory segment in Supplementary Figure 1d, e), i.e. $\mathbf{x}(t_1) = \mathbf{x}(t_0) + \mathbf{v}(\mathbf{x}(t_0), t_0)\,\Delta t$, where Δ*t* = *t*_{i+1} – *t*_{i} denotes the time between consecutive frames. The current particle coordinates at *t*_{1} do not necessarily coincide with the regular grid on which the flow field is evaluated. We therefore interpolate the flow field at time *t*_{1} to the particle coordinates (Supplementary Figure 1b, light blue flow field) and can evaluate the particle coordinates at *t*_{2} as $\mathbf{x}(t_2) = \mathbf{x}(t_1) + \tilde{\mathbf{v}}(\mathbf{x}(t_1), t_1)\,\Delta t$, where $\tilde{\mathbf{v}}(\mathbf{x}(t_1), t_1)$ denotes the flow field at time *t*_{1} interpolated to the particle coordinates (so that $\tilde{\mathbf{v}}\,\Delta t$ is the displacement vector). This procedure is repeated until all flow fields are processed (Supplementary Figure 1c) and the resulting virtual particle coordinates are connected to form trajectories (Supplementary Figure 1d, e). Note that extrapolation outside the nucleus and in nucleoli, where no signal intensity and therefore no flow field is given, is not considered.
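The reconstruction procedure can be sketched as a simple forward integration with bilinear interpolation of the flow field at the current particle position. This is a minimal illustration, not the implementation used in this study: it assumes that each flow field is stored as a pair of nested lists holding the x and y displacement per frame interval in pixels, and the function names are hypothetical. The choice of bilinear interpolation is likewise an assumption of this sketch.

```python
import math


def bilinear(field, x, y):
    """Bilinearly interpolate a 2D grid field[row][col] at position (x=col, y=row)."""
    n_rows, n_cols = len(field), len(field[0])
    x0 = min(int(math.floor(x)), n_cols - 2)
    y0 = min(int(math.floor(y)), n_rows - 2)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * field[y0][x0] + fx * field[y0][x0 + 1]
    bottom = (1 - fx) * field[y0 + 1][x0] + fx * field[y0 + 1][x0 + 1]
    return (1 - fy) * top + fy * bottom


def integrate_trajectory(flow_fields, x_start, y_start):
    """Follow one virtual particle through a sequence of displacement fields.

    flow_fields is a list of (u, v) pairs, one per frame interval, where
    u[row][col] and v[row][col] hold the x and y displacement in pixels.
    """
    n_rows, n_cols = len(flow_fields[0][0]), len(flow_fields[0][0][0])
    x, y = float(x_start), float(y_start)
    trajectory = [(x, y)]
    for u, v in flow_fields:
        # Stop if the particle has left the grid: no extrapolation is attempted.
        if not (0 <= x <= n_cols - 1 and 0 <= y <= n_rows - 1):
            break
        # Both components are interpolated at the position before the update.
        x, y = x + bilinear(u, x, y), y + bilinear(v, x, y)
        trajectory.append((x, y))
    return trajectory


# Example: a 5 x 5 grid whose x displacement grows linearly with the column index.
u = [[0.1 * col for col in range(5)] for _ in range(5)]
v = [[0.0 for _ in range(5)] for _ in range(5)]
trajectory = integrate_trajectory([(u, v), (u, v)], x_start=1, y_start=0)
```

In the example, the particle starts on a grid node and moves by 0.1 pixels; the second step is evaluated off-grid at x = 1.1, where the interpolated displacement is 0.11 pixels.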

Importantly, the concept of virtual particles does not correspond to particles associated with specific foci along the genome. These are impossible to detect with the given labelling density. Instead, virtual particles move along a trajectory which reflects the local bulk motion of many emitters as computed by Optical Flow. Indeed, it is highly likely that a single pixel may contain fluorescence signal from more than one chromatin fibre and thus calculated trajectories should not be interpreted in the sense of single particle tracking trajectories. Instead, trajectories across multiple pixels in a local neighbourhood have to be taken into account in order to interpret quantities derived from these trajectories.

## SUPPLEMENTARY NOTE 2

### Performance comparison of Single Particle Tracking and Optical Flow by simulation

In order to quantitatively evaluate the performance of Optical Flow (OF) and Single Particle Tracking (SPT) methods with respect to trajectory reconstruction, ground truth (GT) data is necessary, which is in general not available in experimental data. For this purpose, we simulated typical fluorescence microscopy images of particles undergoing realistic motion in three dimensions (Supplementary Figure 2). We consider three scenarios: low and high density of emitters as well as a high-density scenario with local patches of super-high density (Supplementary Figure 2, Supplementary Figure 4). After simulation of an image series, the complete series is input to the two trajectory reconstruction algorithms. In order to compare the dynamic properties of all considered trajectories, the MSD is calculated and the following three quantities are computed: cosine similarity, relative Euclidean distance and the relative error of the diffusion constant, which was derived from a regression of the MSD. These measures allow for a comparison of the MSD shape, a characteristic of the underlying type of motion revealed by the MSD, as well as systematic over- and underestimation of local dynamics. In the following, we describe the simulation procedure, the evaluation measures which were used to assess the accuracy of both methods, and the results.

### Simulation of microscopy images

#### Emitter density

The emitter density has a limiting influence on the performance of SPT methods, as shown in an extensive comparison study of SPT methods[5]. We consider two density levels: 0.001/*px*^{3} (low density) and 0.02/*px*^{3} (high density). In order to further mimic the high heterogeneity of chromatin density, we simulate local regions of even higher density (0.035/*px*^{3}), corresponding to eu- and heterochromatin domains. The density of heterochromatin was chosen such as to match a nucleosome density ratio of 0.58 determined previously[6]. The proportion of heterochromatin domains was empirically determined using a volume proportion of eu- and heterochromatin of about 12.5%[6]. This enables determining regions of heterochromatin within a fluorescence microscopy image as described previously by Wachsmuth *et al.*[6]. In brief, the image is blurred by a Gaussian filter and the 12.5% of pixels with the highest intensity values are extracted, indicating high chromatin density within these pixels. Supplementary Figure 3 shows an example nucleus with regions of heterochromatin as determined following Wachsmuth *et al.*; the heterochromatin regions are marked in white and individual areas are identified (Supplementary Figure 3b, c), from which the area of each domain is calculated for all nuclei used in this study. The area distribution was found to be well described by an exponential distribution with mean 1.9 ± 3.5 *μm*^{2} (Supplementary Figure 3d). This value was a guideline for the simulation of artificial heterochromatin domains, which were assumed to be circular for simplicity. The number of areas was adjusted such that the total area of heterochromatin domains corresponds to about 12.5% of the total area of the simulated volume (as projected to two dimensions).

A volume of 128 × 128 × 15 pixels is simulated, corresponding to approximately 8 × 8 × 1 *μm*^{3} (pixel size ~65 *nm*). Typically, the depth of the focal plane in a confocal microscopy image is about 500 *nm*; we therefore simulate twice this thickness in order to allow emitters to move in and out of the focal plane. The mentioned parameters result in an average number of about 240, 4770 and 5500 particles respectively for the three scenarios, which are seeded randomly in the simulated volume. Particles are not seeded on the image boundary to avoid boundary effects. An initial example configuration for the scenario of super-high density patches is shown in Supplementary Figure 2a.

#### Particle dynamics

We simulate Brownian motion in three dimensions such that particles are allowed to randomly move in and out of the imaged volume. A diffusion coefficient of *D* = 5 · 10^{−3} *μm*^{2}/*s* was used, matching previously determined values for the diffusion constant of chromatin[7,8]. A total of 20 frames were simulated, covering 4 s of experimental imaging (acquisition time τ = 200 *ms*). Therefore, particle displacements between subsequent images were drawn from a normal distribution with mean zero and variance $2D\tau$ in each dimension. It was previously shown[1] that chromatin motion shows a high degree of correlated motion, which allows us to empirically impose motion correlation on the purely random Brownian dynamics. We therefore displace particles within a domain of correlated motion collectively, i.e. several emitters along the same displacement vector, for each frame independently, as reported previously[1]. Furthermore, particle appearance and disappearance are regulated by random processes[5] due to several factors such as emitter transitions into a dark triplet state. Seeding *N* particles uniformly over the simulation volume, particle disappearance was modeled by a Bernoulli process with probability *α* = 0.05 for every particle to disappear at each time step, with the possibility of reappearance excluded. On the other hand, particles may appear at random locations at every time step and the number of appearing particles is drawn from a Poisson distribution with mean *Nα*, such that the numbers of appearing and disappearing particles balance each other. From all generated trajectories, those with a length of 4 or fewer frames were discarded.
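The core of the particle dynamics can be sketched as follows, using the parameter values quoted above. This is a simplified illustration rather than the simulation code used in this study: it assumes a per-coordinate step variance of 2*D*τ (the standard result for Brownian motion), starts all particles at the origin, and omits correlated domains and the Poisson appearance of new particles. The function name is hypothetical.

```python
import math
import random


def simulate_brownian(n_particles, n_frames, diff_coeff, tau, p_disappear, rng):
    """Simulate 3D Brownian trajectories with random particle disappearance."""
    # Standard deviation of the displacement per coordinate and time step.
    sigma = math.sqrt(2.0 * diff_coeff * tau)
    trajectories = []
    for _particle in range(n_particles):
        pos = (0.0, 0.0, 0.0)
        track = [pos]
        for _frame in range(n_frames - 1):
            # Bernoulli process: the particle disappears and never reappears.
            if rng.random() < p_disappear:
                break
            pos = tuple(c + rng.gauss(0.0, sigma) for c in pos)
            track.append(pos)
        trajectories.append(track)
    # Discard trajectories with a length of 4 or fewer frames.
    return [t for t in trajectories if len(t) > 4]


rng = random.Random(1)
tracks = simulate_brownian(
    n_particles=2000, n_frames=20, diff_coeff=5e-3, tau=0.2, p_disappear=0.05, rng=rng
)
```

With *D* = 5 · 10^{−3} μm^{2}/s and τ = 0.2 s, the expected step variance per coordinate is 0.002 μm^{2}, i.e. a typical step of about 45 nm.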

#### Imaging process

Due to a diffraction-limited optical microscopy setup, the imaging of fluorescent photons is modelled as the convolution of the light emission field and the point spread function (PSF) of a typical confocal microscope. The PSF of an optical system is the image of a point source and the pupil function is defined as the Fourier transform thereof. Therefore, the complex-valued amplitude point spread function *PSF*_{A} and the pupil function *P* form a Fourier transform pair:

$$PSF_A(x, y) = \iint P(k_x, k_y)\, e^{2\pi i (k_x x + k_y y)}\, dk_x\, dk_y \tag{2}$$

In other words, the PSF can easily be computed by the use of discrete Fourier transforms if the pupil function of the optical system is known. The complex pupil function in our case is simply described by a disk whose radius is defined by the ratio of the numerical aperture NA and the wavelength of the light *λ*:

$$P(k_x, k_y) = \begin{cases} 1, & \sqrt{k_x^2 + k_y^2} \le NA/\lambda \\ 0, & \text{otherwise} \end{cases}$$

The treatment above is carried out in scalar diffraction theory and is idealized, assuming equal excitation and emission wavelength, no aberration and a constant value inside the pupil function disk. Furthermore, the point source is assumed to be in focus. Emission outside the focal plane is included by adding a defocus to the PSF. This is done by expressing *k*_{z}(*k*_{x}, *k*_{y}) = $\sqrt{(n/\lambda)^2 - (k_x^2 + k_y^2)}$, where *n* is the refractive index of the immersion medium, and multiplying the integrand in eq. (2) by a ‘defocus phase function’ *exp*(*2πik*_{z}(*k*_{x}, *k*_{y})*z*) such that the expression for the amplitude point spread function in three dimensions reads[9]

$$PSF_A(x, y, z) = \iint P(k_x, k_y)\, e^{2\pi i k_z(k_x, k_y) z}\, e^{2\pi i (k_x x + k_y y)}\, dk_x\, dk_y$$

The observed PSF is the absolute square of the computed amplitude PSF, i.e. *PSF* = |*PSF*_{A}|^{2}. The parameters defining the shape of the PSF are set as follows: emission wavelength *λ* = 647 *nm*, numerical aperture *NA* = 1.4, refractive index of immersion medium *n* = 1.3 and pixel size 65 *nm*. The simulated PSF is shown in Supplementary Figure 2b for exemplary z-slices, and a convolved simulation volume is shown in Supplementary Figure 2c.

In addition to blurring, the imaging process is subject to unavoidable noise. In practice, two predominant sources of noise exist, namely signal-dependent Poisson noise and setup-dependent Gaussian white noise due to several factors such as camera gain and thermal noise. The signal-to-noise ratio (SNR) quantifies the presence of noise photons relative to signal photons and is defined as the ratio of squared signal intensity *I*^{2} and noise variance *σ*^{2} on a decimal logarithmic scale: *SNR* = 10log_{10}(*I*^{2}/*σ*^{2}). While Poisson noise depends on the number of observed photons, the Gaussian noise contribution can be varied to match the SNR of about 21 *dB* which we typically observe in our data. The final image is the projection of the whole convolved simulation volume in two dimensions with applied Poisson and Gaussian noise (Supplementary Figure 2d).
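The SNR definition can be inverted to obtain the noise standard deviation that yields a target SNR for a given signal intensity. The short sketch below illustrates this relation; it considers the Gaussian contribution only and ignores the Poisson noise, and the signal intensity of 100 (arbitrary units) is a hypothetical value.

```python
import math


def snr_db(signal, noise_sigma):
    """SNR in dB: 10 log10(I^2 / sigma^2)."""
    return 10.0 * math.log10(signal ** 2 / noise_sigma ** 2)


def gaussian_sigma_for_snr(signal, target_snr_db):
    """Noise standard deviation that gives the target SNR for a given signal."""
    return signal / (10.0 ** (target_snr_db / 20.0))


# Hypothetical signal intensity of 100 (arbitrary units) at an SNR of 21 dB.
sigma = gaussian_sigma_for_snr(signal=100.0, target_snr_db=21.0)
```

For an SNR of 21 dB the noise standard deviation is about 9% of the signal intensity.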

### Performance measures

#### Cosine Similarity

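The cosine similarity addresses the orientation similarity between two multidimensional vectors, or in our case between two MSD curves. Defining the two MSD curves to compare as arrays $\mathbf{a}$ and $\mathbf{b}$, the cosine similarity is defined as

$$\cos(\theta) = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}$$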

The cosine similarity returns 1 if the two curves have the same shape, regardless of their magnitude and values < 1 otherwise. The shape of the MSD curve has an important meaning as it is characteristic for the underlying type of motion. For instance, motion is considered Brownian if the MSD follows a linear relationship over time lag, whereas a deviation denotes constraint (sub-linear relationship) or active transport (super-linear). In case that multiple estimated trajectories are associated to a single GT trajectory, the mean MSD curve is compared to GT.
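A minimal sketch of the cosine similarity between two MSD curves sampled at the same time lags is given below; the function name and the example curves are illustrative only.

```python
import math


def cosine_similarity(msd_a, msd_b):
    """Cosine of the angle between two MSD curves sampled at the same time lags."""
    dot = sum(a * b for a, b in zip(msd_a, msd_b))
    norm_a = math.sqrt(sum(a * a for a in msd_a))
    norm_b = math.sqrt(sum(b * b for b in msd_b))
    return dot / (norm_a * norm_b)


linear = [1.0, 2.0, 3.0, 4.0]        # Brownian-like MSD
scaled = [2.0, 4.0, 6.0, 8.0]        # same shape, twice the magnitude
quadratic = [1.0, 4.0, 9.0, 16.0]    # super-linear MSD

same_shape = cosine_similarity(linear, scaled)
different_shape = cosine_similarity(linear, quadratic)
```

The first comparison yields 1 up to rounding because the curves differ only in magnitude, whereas the second yields a value below 1 because the shapes differ.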

#### Relative Euclidean distance

The relative Euclidean distance between two multidimensional vectors is defined as the ratio of the Euclidean distance between the vectors and the norm of the GT to compare to. Denoting the GT MSD by $\mathbf{m}^{*}$ and the estimated MSD by $\mathbf{m}$:

$$d = \frac{\lVert \mathbf{m} - \mathbf{m}^{*} \rVert}{\lVert \mathbf{m}^{*} \rVert}$$

The relative Euclidean distance is a measure of magnitude, yielding 0 if the GT and estimated MSD coincide perfectly. Otherwise, it returns the norm of the deviation between the two MSD curves with respect to the GT. In case that multiple estimated trajectories are associated to a single GT trajectory, the mean MSD curve is compared to GT.
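The relative Euclidean distance can be sketched in a few lines, assuming it is computed as the norm of the difference between the estimated and GT MSD curves divided by the norm of the GT curve; the function name is illustrative only.

```python
import math


def relative_euclidean_distance(msd_est, msd_gt):
    """Norm of the deviation between estimated and GT MSD, relative to the GT norm."""
    deviation = math.sqrt(sum((e - g) ** 2 for e, g in zip(msd_est, msd_gt)))
    gt_norm = math.sqrt(sum(g * g for g in msd_gt))
    return deviation / gt_norm


gt = [1.0, 2.0, 3.0]
perfect = relative_euclidean_distance(gt, gt)
doubled = relative_euclidean_distance([2.0, 4.0, 6.0], gt)
```

A perfect reconstruction gives 0, while an estimate that overestimates the MSD twofold at every time lag gives 1, i.e. 100% error, even though its cosine similarity to the GT is 1.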

#### Estimated relative diffusion coefficient

Another obvious measure to assess the accuracy of the reconstruction of particle motion is the diffusion coefficient of GT and estimated trajectories. Diffusion is a stochastic process, such that in general a single trajectory cannot display the average simulated behavior of many particles. Both SPT and OF determine the local motion of particles rather than an ensemble average. For this reason, it is appropriate to compare local quantities rather than comparing to an average value. We therefore face the problem of fitting a single GT MSD and possibly several estimated MSD curves, ultimately comparing the diffusion constants derived from the regressions. In case that only 1 MSD curve is to be fit, which is the case for the GT curve and most of the estimated SPT trajectories, a weighted nonlinear regression is used, with weights determined by the standard deviation of the MSD curve. For more than 1 but less than 5 trajectories, the mean MSD is used for fitting. In case that more than 5 trajectories have been found, an appropriate way of fitting is described in the main text (Methods). We consider only the free diffusion model in order to be able to compare the diffusion constant to the weighted non-linear regression. Similar to the relative Euclidean distance, we compute the relative error between estimated and simulated diffusion coefficients after regression as

$$\epsilon_D = \frac{\lvert D - D^{*} \rvert}{D^{*}}$$

where *D** is the GT and *D* is the reconstructed diffusion coefficient. The regression introduces further inaccuracy, but a comparison is nevertheless possible and an overall trend is apparent.
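The comparison of diffusion coefficients can be illustrated with a deliberately simplified fit. The sketch below swaps the weighted nonlinear regression for an unweighted least-squares fit through the origin, assumes the two-dimensional free diffusion form MSD = 4*D*τ, and takes the relative error as the absolute deviation normalized by the GT value; all three choices are assumptions of this example, and the function names are hypothetical.

```python
def fit_free_diffusion(lags, msd):
    """Unweighted least-squares fit of MSD = 4 D tau through the origin; returns D."""
    slope = sum(t * m for t, m in zip(lags, msd)) / sum(t * t for t in lags)
    return slope / 4.0


def relative_error(d_est, d_gt):
    """Absolute deviation of the estimated diffusion coefficient relative to GT."""
    return abs(d_est - d_gt) / d_gt


# Noise-free example: MSD generated with D = 5e-3 um^2/s at lags of 0.2 s.
lags = [0.2, 0.4, 0.6, 0.8]
msd_gt = [4.0 * 5e-3 * t for t in lags]
d_gt = fit_free_diffusion(lags, msd_gt)

# An estimate whose MSD is 20% too large at every lag.
d_est = fit_free_diffusion(lags, [1.2 * m for m in msd_gt])
error = relative_error(d_est, d_gt)
```

Because the fit is linear in the MSD values, a uniform 20% overestimation of the MSD translates directly into a 20% relative error in the diffusion coefficient.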

To see how these performance measures reflect the behaviour of individual trajectories and corresponding MSD curves, three example trajectories reflecting different cases of the motion reconstruction by OF and SPT compared to the GT trajectory are shown in Supplementary Figure 2e. The performance measures defined above are given in Supplementary Table for the three example trajectories. The directions of OF and SPT are similar for approximately the first half of the trajectory (panel i), whereas the GT trajectory propagates in the opposite direction. For longer time points the OF trajectory resembles the one of the GT, whereas the SPT trajectory deviates largely, possibly due to a reconnection error. The MSD curve reflects the similarity of GT and OF estimation despite the deviation in the direction of the trajectories’ propagation, leading to relatively accurate performance measures. The reconnection error in SPT leads to a tremendous increase in MSD values, especially for long time lags, yielding a considerably lower cosine similarity and more than 400% error in Euclidean distance and estimation of the diffusion constant. Disappearance of a GT emitter after 7 frames is shown in panel (ii). Both OF and SPT detect trajectories throughout the complete frame series using intensity information from surrounding emitters. Motion is accurately estimated by OF in the first 5 time lags but then deviates largely from the GT MSD until its disappearance. In this case, the shape of the MSD is well reflected by the SPT trajectory, leading to a considerably higher value in cosine similarity for SPT than for OF. However, SPT overestimated the motion, leading to large errors in the relative Euclidean distance and diffusion coefficient, whereas the resemblance of GT and OF in the first few time lags leads to a relatively accurate estimate for the diffusion coefficient due to a small standard deviation and therefore large weight of the OF MSD curves for small time lags. An example of a false negative of SPT, and therefore no dynamic information for the SPT data set, is shown in panel (iii).

### Performance evaluation

The simulated image series were analyzed using OF and SPT. The results from 10 independent simulations are summarized in Supplementary Figure 4 for the three scenarios. For a low density of emitters (Supplementary Figure 4a), MTT is able to detect and reconnect the majority of GT emitters (79 ± 4% true positives) and yields good performance in all three measures considering value and shape of the estimated MSD curves. In particular, estimated diffusion coefficients show a very low error (< 10%). The OF estimation for relatively sparse signals is difficult, leading to comparably lower performance than SPT (the relative error in diffusion constant is about 25%). For a high density of emitters (Supplementary Figure 4b), SPT performance in terms of the relative Euclidean distance drops, whereas OF performs better than in the low-density scenario. However, the relative error in diffusion constant increases for both methods (60% and 40% for SPT and OF respectively). When a density heterogeneity is introduced, the SPT performance stays unaffected, but the relative error in diffusion constant for the OF reconstruction is reduced to about 25%. At high density, OF estimation uses the bulk motion of many particles such that the vast majority of pixels carries information and the flow field does not have to be approximated by the smoothness assumption underlying OF, as it does in a low-density scenario. Moreover, additional structure in the image, such as a density heterogeneity, leads to considerably more accurate results than a uniform density.

These results confirm that the SPT method under consideration is especially accurate when comparably sparse signals are present. SPT is naturally limited to the detection of single emitters or aggregates of emitters [10,11] and therefore lacks accurate information when detection is impossible, e.g. due to high overlap of independent emitter signals. The Optical Flow algorithm in this study is used in high-density scenarios and additional structures in the image such as a heterogeneous density of chromatin enhance the performance considerably. Hi-D therefore constitutes a complementary approach to extract dynamic information of biomolecules with dense labeling where SPT cannot be applied.

It has to be stressed that the simulations carried out incorporate several important experimental aspects. Diffusion of emitters in and out of focus, appearance and disappearance of emitters, and heterogeneous chromatin structures (i.e. varying chromatin compaction levels) are considered in the simulations. Thus, the values provided are likely to reflect errors associated with experimental data.

## SUPPLEMENTARY NOTE 3

### Comparison of confined and anomalous diffusion

In order to test the ability of the Bayesian classification routine used in this study to resolve confined and anomalous diffusion (DR and DA respectively), we simulate particles undergoing either one of the two types of diffusion. Confined diffusion is characterized by a particle diffusing freely within a sphere of radius *R*_{c} with diffusion constant *D*. The space outside the sphere acts as an infinite potential that is impossible to overcome, resulting in confinement to the volume of the sphere. Exemplary simulated trajectories are shown in two dimensions in Supplementary Figure 7a-d (left column) for different values of diffusion constant and radius of confinement. Anomalous diffusion is characterized by an effective potential exerting a driving force towards the particle’s origin, whose source may be surrounding obstacles hindering free diffusion. We model the driving potential as a harmonic potential with the characteristic dimension *L*_{trap}. The particle feels a spring-like driving force whose spring constant is set by *L*_{trap}. Exemplary trajectories for anomalous diffusion are shown in Supplementary Figure 7a-d (middle column) for different values of *L*_{trap}. The potential strength is indicated by colour code. The theoretical MSD is given in equations 1 and 3 of the Methods section for anomalous and confined diffusion respectively.

For confined diffusion, one can approximate the expression for short and long time scales:

$\tau \ll R_c^2/(4D)$: The exponential can be expanded in a Taylor series yielding $MSD(\tau) \approx 4D\tau$, i.e. free diffusion in first order. Effectively, the particle does not feel any confinement for short time lags as the explored volume is much smaller than the confinement volume.

$\tau \gg R_c^2/(4D)$: The exponential term is small and therefore $MSD(\tau) \approx R_c^2$. The confinement is effectively a hard wall potential, impossible to overcome for the particle. For long time lags, the particle has therefore explored the whole available volume, but cannot reach any further, resulting in a constant MSD (see Supplementary Figure 6c).
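The two limits can be checked numerically. The sketch below assumes the commonly used two-dimensional confined diffusion form MSD(τ) = *R*_{c}^{2}(1 − exp(−4*D*τ/*R*_{c}^{2})); the exact expression used in this study is the one given in the Methods section, and the confinement radius of 0.2 μm is a hypothetical value chosen for illustration.

```python
import math


def msd_confined(tau, diff_coeff, radius):
    """Assumed confined-diffusion MSD: R^2 (1 - exp(-4 D tau / R^2))."""
    return radius ** 2 * (1.0 - math.exp(-4.0 * diff_coeff * tau / radius ** 2))


diff_coeff = 5e-3   # um^2/s, value used in the simulations of Supplementary Note 2
radius = 0.2        # um, hypothetical confinement radius
crossover = radius ** 2 / (4.0 * diff_coeff)   # time scale separating the limits (2 s)

# Short time lag: the MSD approaches free diffusion, 4 D tau.
short_ratio = msd_confined(0.01, diff_coeff, radius) / (4.0 * diff_coeff * 0.01)

# Long time lag: the MSD saturates at the squared confinement radius.
long_ratio = msd_confined(100.0, diff_coeff, radius) / radius ** 2
```

For these values the crossover time is 2 s, so a plateau only becomes visible if the recorded trajectory spans time lags well beyond this scale.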

For anomalous diffusion, the particles move freely for short times, but adopt sub-diffusive behaviour for longer times as the effect of the external potential becomes dominant. However, a particle undergoing anomalous diffusion is not confined in the sense of a hard wall potential and can in principle diffuse in all space [12]. This leads to a continuously rising MSD for *τ* → ∞. Apart from the analytical form of the MSD for confined and anomalous diffusion (exponential versus algebraic), the behaviour for long time lags is a main characteristic for the distinction of the two types of diffusion.

In order to illustrate the theoretical curve shape, exemplary scenarios are shown in Supplementary Figure 7a-d, where different levels of mobility and confinement / anomaly and the ability to resolve the correct type of motion by means of the MSD (right column) are explored. For scenarios a-c, the long time lag limit is not reached within a trajectory length of 150 steps and the shapes of the mean MSD for confined (green) and anomalous diffusion (red) are similar. Consequently, the exact type of diffusion could not be resolved. However, for strong confinement (Supplementary Figure 7d), the curves are sufficiently dissimilar and allow extracting the correct MSD model. Experimentally, only a finite trajectory length can be recorded and it is questionable whether the trajectory length is sufficient to observe the long time lag limit when no prior information about the particle environment is known. To test whether a reliable distinction can be made for experimentally observed trajectories, an example nucleus is analysed twice. First, the Bayesian classifier is given the choice between free, directed and confined diffusion (and combinations thereof: D, DR, V, DV, DRV). The model selection is shown in Supplementary Figure 7e (left). The majority of trajectories are classified as confined + directed diffusion (DRV), whereas only about 25% of trajectories are classified as purely confined diffusion. Next, the model for confined diffusion is replaced by anomalous diffusion and the analysis is carried out again. First of all, high agreement between the two modes of analysis is seen for trajectories classified as purely Brownian and directed as well as a combination thereof. However, the fraction of anomalous + directed diffusion (DAV) is small compared to purely anomalous diffusion. In particular, about 92% of trajectories classified as DAV were previously classified as DRV. From the remaining trajectories not classified as DAV, about 72% are classified as purely anomalous diffusion. These results suggest that only a subset of trajectories classified as a combination of confined and directed diffusion is consistent with the combination of anomalous and directed diffusion. The majority of trajectories classified as DRV is preferentially described by purely anomalous diffusion. A reason might be that in the case where only confined diffusion is allowed to describe experimental trajectories, an effective directed transport is needed to account for the continuous rise of the MSD even for large time lags. Finally, it remains unclear whether confined or anomalous diffusion is present in experimental trajectories as long as the plateau in the MSD is not reached. Even though strong experimental evidence for confined diffusion of proteins and molecules exists[13], for large biomolecules such as chromatin a confinement may have several forms such as anisotropic or temporally varying confinement radii with to date unknown sources. The idealized model of a hard wall potential defining the confinement volume may not be appropriate for most biological cases (except for membranes), resulting in a mixed state between confined and anomalous diffusion, which is hard or even impossible to resolve without high spatiotemporal resolution and long-time measurements. The exemplary results and reasoning above suggest that sub-diffusive behaviour observed for chromatin may be best described by anomalous diffusion rather than confined diffusion to prevent misclassification and misinterpretation.

## ACKNOWLEDGMENTS

We thank Raphael Mourad (CBI Toulouse) and Genevieve Fourel (ENS Lyon) for fruitful discussions, and Alain Kamgoué for assistance with computation. We acknowledge support from the LITC imaging platform, CBI Toulouse. This work was supported by grants (to KB) from the ANR, IDEX Toulouse strategic actions, the Foundation ARC and INSERM Cancer and Epigenetics.