Multi-exponential DNA Residence Behaviors of Transcription Factors Under The Discrete Affinity Model

The transcription process is regulated by temporal interactions of transcription factors with DNA. In the last decade, computational and experimental studies revealed that the residence times of transcription factors on DNA correlate with transcriptional output. Biochemical studies suggest that transcription factors exhibit bi-exponential dynamics, attributed to a binary affinity model composed of nonspecific and specific protein-DNA bindings. Recently, transcription factor residence times were shown to display a power-law pattern, implying that protein-DNA affinity levels are rather continuous. Elucidating the underlying mechanisms of transcription factor residence distributions, beyond protein-DNA interaction strength, is crucial to construct a more complete understanding of transcriptional regulation. Here, by using molecular dynamics simulations of DNA and dimeric proteins, we demonstrate that the residence time behaviors of generic homodimeric transcription factors follow a multi-exponential pattern even with single and binary affinity levels between DNA and proteins, indicating the existence of emergent behavior. Our simulations reveal that DNA-protein clusters of various sizes contribute to this multi-exponential behavior. These findings add another layer to transcriptional regulation and, consequently, to gene expression by connecting protein concentration, DNA-protein clusters, and DNA residence times of transcription factors.


Introduction
Gene expression dictates many aspects of cellular behavior, including response to extracellular factors and cellular identity. Among the many regulators of gene expression, transcription factors (TFs) play the most prominent role in inhibiting or activating the transcription of their target genes (Browning, 2004; Seshasayee et al., 2011; Bintu et al., 2005). Regulation of a gene by a TF protein starts at the binding site once the TF binds to its specific DNA sequence and forms protein-DNA complexes. Nevertheless, gene expression is a highly dynamic and regulated process. Thus, a TF's unbinding from the DNA binding site might not be a fully stochastic process (de Jonge et al., 2022; Haberle and Stark, 2018). Accordingly, recent studies in TF dynamics revealed that the residence time (RT) of a TF on its DNA binding site (i.e., its unbinding rate) is intertwined with transcriptional output (Clauß et al., 2017; Lickwar et al., 2012; de Jonge et al., 2020). Thus, TF residence times on DNA directly contribute to the regulation of gene expression. Therefore, understanding residence time patterns becomes a key to the accurate prediction of gene expression behavior.
The duration of a TF on its DNA binding site can depend on the TF protein's affinity to the DNA, temperature, and 3D DNA structure (Kim and Shendure, 2019; Inukai et al., 2017). Kinetic studies focusing on unbinding (or dissociation) kinetics of TFs from their single binding sites have demonstrated the concentration dependency of TF unbinding rates (Graham et al., 2011; Joshi et al., 2012). At scarce concentrations, a TF tends to stay bound to its target site for more extended periods of time. In contrast, the abundance of unbound TFs in solution leads to a competition for the binding sites on DNA, resulting in much higher unbinding rates (i.e., shorter residence times) via the process referred to as Facilitated Dissociation (FD) (Kamar et al., 2017; Koşar et al., 2022).
The impact of TF concentration is not limited to protein unbinding dynamics via FD. TFs can also contribute to 3D genome organization and form DNA-protein clusters in a concentration-dependent manner (Kim and Shendure, 2019; Noort et al., 2004; Skoko et al., 2006, 2004; Remesh et al., 2020; Koşar and Erbaş, 2022; Arold et al., 2010; Dame et al., 2000; Winardhi et al., 2015). Such structural effects are more pronounced with bacterial Nucleoid-Associated Proteins (NAPs), a class of DNA-binding proteins often with dual functionality. NAPs are involved in chromosome organization, and a number of them also function as transcription factors (Dillon and Dorman, 2010; Wang et al., 2011; Dorman, 2014). Due to their common multivalent nature (Lee, 1992), DNA-binding proteins can drive the bridging of DNA segments and thus the formation of DNA-protein clusters of various shapes and sizes and other forms of chromosome architectural effects (Skoko et al., 2004; Dillon and Dorman, 2010; Wang et al., 2011; Hammel et al., 2016; Dame, 2005; Verma et al., 2019). Molecular Dynamics (MD) simulations of model bacterial systems suggest that these events are highly dependent on protein concentration and on the nonspecific interaction potential, $U_{ns}$ (i.e., the affinity between the DNA-binding protein and nonspecific DNA). The affinity of a TF to its specific binding site is often stronger than its binding affinity towards a nonspecific DNA sequence. However, nonspecific binding can contribute to the global chromosome organization. Contrarily, specific protein-DNA interactions can govern the formation of small local DNA-protein complexes (i.e., clusters) around the specific binding sites (Koşar et al., 2022; Lin et al., 2012; Brackley et al., 2013; Agback et al., 1998).
DNA-protein complexes could significantly affect transcriptional regulation. Transcription factories are multiprotein complexes formed by TFs, RNA polymerase, coactivators, etc., within the eukaryotic nuclei. These factories not only carry out transcription but also organize nuclear architecture (Iborra et al., 1996; Cook, 2010; Melnik et al., 2011; Mitchell and Fraser, 2008). Therefore, multiprotein complexes or clusters are crucial regulators of transcriptional output. Moreover, eukaryotic euchromatin regions, which are not as densely packed as heterochromatin regions, are more accessible to transcription factors and thus more transcriptionally active (Amemiya et al., 2022; Elgin and Grewal, 2003; Penagos-Puig and Furlan-Magaril, 2020). This notion emphasizes the significance of local genome architectures in residence times. In other words, the contribution of DNA-binding proteins to gene regulation is not restricted to their functions as TFs since their activity in domain-specific and global genome organization is also crucial.
A recent experimental study employing single molecule tracking (SMT) demonstrated that the residence durations of several TFs and chromatin-associated proteins follow a power-law pattern (Garcia et al., 2021). These findings support a continuum model for TF dynamics and TF-DNA interaction affinities rather than a bi-exponential model attributed to binary specific and nonspecific affinities. In the bi-exponential model, TFs were considered to have longer residence times on their specific target DNA sequence and shorter residence times on nonspecific DNA, generating a bi-exponential distribution of residence durations (Chen et al., 2014; Ball et al., 2016; Morisaki et al., 2014). Contrary to this suggestion, even with a single affinity ($U_{ns} = U_{sp}$) and a binary affinity model ($U_{ns} < U_{sp}$), we observed an apparent multi-exponential pattern in our model, indicating an emergent behavior. In parallel with our findings, multi-exponential models were used to interpret the residence behaviors in several SMT studies (Hipp et al., 2019; Reisser et al., 2020; Agarwal et al., 2017). Consequently, we explored the driving factors of multi-exponential residence behaviors of TFs. In particular, high nonspecific affinity cases, which enabled the formation of much larger clusters, exhibited distinct RT distributions and required more exponents to match the decay curves. Investigation of TF-DNA cluster formations of different sizes revealed that cluster dissipation mean lifetimes are coupled to their sizes. Therefore, concentration- and affinity-dependent cluster formations could be the driving factors of the observed multi-exponential patterns. Finally, we explored the distributed affinity model to check whether the additional complexity would lead to power-law behavior. However, a limited discrete affinity distribution again forms a multi-exponential decay pattern and does not provide sufficient complexity to generate a power-law behavior.
Here, we demonstrate that TF dynamics follow multi-exponential patterns without the need for multiple TF-DNA affinity levels. We also show how these behaviors are impacted by cluster formations. Our model predicts that power-law behavior might be plausible even with discrete affinities, given some additional complexity. These findings may help establish a more advanced understanding of transcriptional regulation and thus of the regulation of gene expression.

Residence time behaviors are multifactorial
We employed multiple cases for nonspecific interaction affinities ranging from a very weak 1kT to a strong 4kT (i.e., $U_{ns} = U_{sp}$) per bead, where $U_{sp} = 4kT$ per bead in all cases. For each of these affinity levels, four different physiologically relevant (Verma et al., 2019; Azam et al., 1999; Ball et al., 1992) TF concentrations ranging from 10 to 60 µM were employed in the simulations. We obtained residence time patterns, as shown in Figure 1 and as described in methods, in the form of occurrence versus duration, where occurrence is the number of times a duration was observed. We then fitted several equations to describe TF residence patterns (Figure 1C). Fit equations included exponential decay with up to five exponents as well as a power-law equation (Figure 2). The projection of RT patterns on a log-log scale exhibited an apparent arching pattern, ruling out a good power-law fit, which requires a straight line on such a scale. The single exponential decay (ED) equation can only describe short-duration (< 10 a.u.) distributions regardless of affinity and concentration (Figure S2A, Figure S1), implying that the full-spectrum residence time behavior is not dependent on a single parameter. This behavior was indeed expected because distinct nonspecific (NS) and specific (SP) levels should lead to at least a bi-exponential behavior. For most of the cases, our simulations utilized binary values where, for each simulation, there was only one nonspecific and one specific affinity, with $U_{ns} < U_{sp}$. If there were no emergent factor affecting residence times, a double exponential decay (DED) would be sufficient to interpret the residence time patterns obtained from the simulations. Thus, we used a double exponential decay equation for characterizing RT patterns. DED provided much better fits compared to the ED equation but notably failed to generalize well for all durations (Figure 2). That is certainly noteworthy, considering that the simulations utilized a single type of TF and two distinct types of DNA sites.
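The log-log argument used above can be written out explicitly. The following is a short worked sketch using the same symbols as the fit equations in the methods ($c$ for the coefficient, $k$ for the decay rate or exponent); the occurrence function $N(t)$ is a notation introduced here only for illustration.

```latex
% Power law: linear in \log t, i.e. a straight line on a log-log plot
N(t) = c\,t^{-k}
\;\Longrightarrow\;
\log N(t) = \log c - k \log t

% Single exponential: with x = \log t, the curve bends downward on a log-log plot
N(t) = c\,e^{-k t}
\;\Longrightarrow\;
\log N(t) = \log c - k\,e^{x}, \qquad x = \log t
```

A straight line with slope $-k$ is therefore the signature of a power law, whereas an exponential term produces a downward-bending curve in the same projection.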
The strategy employed here was to increase the number of exponents to better characterize TF residence behaviors. We gradually increased the number of exponents in the decay equation up to five. In addition to visual inspection of the fits, we quantitatively analyzed their accuracy via the normalized Residual Sum of Squares (RSS), where a lower RSS indicates a better fit. The increases from single to double and from double to triple exponential decay each lowered the RSS by orders of magnitude (Figure 2B). Additional increments in the number of decay exponents also reduced the RSS of the fits, but the changes were not as drastic. These analyses demonstrate that RTs follow a multi-exponential decay pattern, suggesting that RT patterns are shaped by additional factors besides binary DNA-protein binding affinities.

Residence time behaviors are dependent on the emergent behavior of TF concentration and binding energies
Residence time behaviors (Figure 2A, Figure S1) suggest a pattern beyond a bi-exponential system that cannot be simply explained by or attributed to the dual model of short-lived TF-DNA interactions on nonspecific sites and longer-lived, more stable interactions on specific binding sites. Therefore, an emergent behavior is required to explain such multi-exponential patterns. Specifically, patterns obtained from elevated-concentration and high-affinity cases require more exponents, indicating that such behaviors emerge as a result of concentration and energy levels.
Relatively high binding energies and concentrations of DNA-binding proteins were shown to lead to cluster formations as well as local and global condensations of the chromosome (Koşar et al., 2022; Lin et al., 2012; Brackley et al., 2013; Agback et al., 1998). Unsurprisingly, higher concentrations yield larger clusters (Figure 3C). Remarkably, global compaction of the chromosome, cluster sizes, and even cluster conformations are mainly regulated by nonspecific interactions, and specific interactions have little effect. This behavior can be attributed to the abundance of nonspecific DNA sites over specific sites. In this work, we reduced the impact of global chromosome compaction via miniaturized TF binding domains (Figure 1A), minimizing the bridging of multiple DNA segments. Moreover, the use of adequate binding energies eliminated the possibility of high chromosomal compaction even at high concentrations (Figure 3A). This strategy allowed TFs to roam relatively freely within the cellular confinement.
Our model depicts TFs residing for increasingly longer durations on DNA, instead of freely roaming around, as the NS potential increases at fixed specific binding energy (Figure 3A-B). Consistent with previous findings, at a very low NS potential ($U_{ns} = 1kT$), bound proteins are sparsely distributed around the DNA polymer, mainly near binding sites, but do not form DNA-protein complexes (i.e., clusters) (Figure 3). Relatively higher NS potentials ($U_{ns} = 2.8kT$ and $U_{ns} = 3kT$) led to primarily small and globular clusters, whereas stronger NS affinities ($U_{ns} = 3.5kT$ and $U_{ns} = 4kT$) enabled the formation of much bigger clusters with filamentous conformation (Figure 3C). Therefore, we hypothesized that cluster sizes and conformations play significant roles in TF residence patterns.

Cluster formations drive multi-exponential residence time patterns
To unravel the relationship between DNA-protein clusters and multi-exponential patterns, we initially considered tracking and collecting residence times of TFs for each individual cluster, as described in Figure 1. However, clusters are dynamic formations, and it is not meaningful to follow TFs that were initially part of a cluster because, after the cluster dissipates, they are free to bind anywhere on DNA. Therefore, we instead investigated the dissipation mean lifetimes of individual clusters (see methods). As opposed to RT pattern acquisitions, partial unbinding events were also counted. This modification was needed simply because, at high nonspecific affinities, the time needed for obtaining cluster decay rates would exceed simulation lifetimes if only full unbinding events were counted. At relatively low nonspecific energies ($U_{ns} = 2.8kT$ and $U_{ns} = 3kT$), correlation analysis of cluster sizes and their dissipation durations did not reveal any relation. Contrarily, higher nonspecific interaction potentials ($U_{ns} = 3.5kT$ and $U_{ns} = 4kT$) led to correlations with a coefficient of r ≥ 0.5 (Figure 4B), where larger clusters had longer mean lifetimes. These results imply that differently sized clusters could dissipate at diverse rates, contributing to the multi-exponential behavior that is more apparent for high nonspecific affinity cases (Figure 2A). It should also be noted that at 2.8kT and 3kT nonspecific affinities, cluster sizes ranged from 20 to 45 and from 20 to 70 TFs, respectively (Figure 4B). The size ranges widened substantially with higher nonspecific interactions: the cases of 3.5kT and 4kT nonspecific affinities allowed the formation of clusters ranging in size from 50 to 250 TFs (Figure 4B). This wider range, of course, is advantageous for analyzing the relation between lifetimes and cluster sizes.

TF residence times depend on their position in clusters
Upon establishing the relationship between cluster sizes and their lifetimes, we considered analyzing the residence times of TFs that are located at different regions of the clusters. Cluster-associated TFs can be classified as surface TFs and core TFs. While the core TFs are rather trapped within the cluster, surface TFs are more exposed and could be expected to have shorter residence durations as they are free to unbind. Additionally, TFs that are not part of any cluster are classified as free TFs, which are useful as references against cluster-associated TFs (Figure S2B).
Visual investigation of individual proteins located at different positions with respect to the TF-DNA clusters did not show an apparent distinction in their residence times (Figure S2). However, in an analytic (i.e., t-test) and more comprehensive approach utilizing multiple time steps, core TFs exhibited significantly longer residence times compared to surface and free TFs (Figure 5A). This behavior was limited to the high nonspecific affinity cases and was not observed at lower nonspecific potentials; it may add another layer of complexity to residence behaviors and could therefore explain the more prominent multi-exponential residence patterns at such energy levels. In the $U_{ns} = 3.5kT$ case, surface TFs also had significantly longer residence times in comparison to free TFs (Figure 5A). This disparity indeed reveals that even when located at the exposed regions of clusters, TFs could behave differently compared to freely roaming TFs. Overall, the variations among core, surface, and free TF residence times could shape the overall residence distributions and further contribute to multi-exponential DNA-residence behaviors of TFs in single and binary affinity models.
[Figure caption fragment: [...] for all cases. The threshold was set as a minimum of 20 TFs per cluster for decay rate analysis for statistical reasons. 12 timesteps were used for sampling, and the fixed specific binding energy was 4kT for all simulations.]
Another factor that should be considered is the differing number of TFs of each type for distinct cases. Even though the concentration (60 µM) and specific binding potential ($U_{sp} = 4kT$) are fixed among the cases, the higher nonspecific energies drive clustering rather than scattering (Figure 3B), also leading to the formation of much bigger clusters (Figure 3C, Figure 4B). Thus, more of the TFs are located within the clusters for high nonspecific affinity cases (Figure 5B). Moreover, the larger the clusters, the higher the core TF percentages were (Figure 5B), as expected due to the decrease in the surface-to-volume ratio.

Distributed affinity models do not lead to power-law behavior
Our MD simulations revealed that even single and binary affinity models could lead to multi-exponential residence time patterns. The next step was to mimic the continuum affinity model with a distributed affinity model, where we employed 13 distinct DNA site types and assigned each a distinct binding affinity.
The affinities spanned either 1 to 4 kT or 2 to 5 kT (inclusive) in increments of 0.25kT, and the number of DNA sites at each affinity was assigned to give either a normal (i.e., Gaussian) or a uniform distribution. Contrary to our expectations, distributed affinity models did not produce power-law behaviors. Instead, the triple-exponential fit was the better predictor of the models (Figure 6). Further power-law fit attempts, in which weights were added to the power-law equation, provided better fits for part of the higher durations, yet failed to explain the overall residence time patterns (Figure 6). Therefore, we might speculate that discrete distribution models may not be enough for power-law behavior, and that such behavior would require much higher complexity than our MD simulations could provide to behave completely as a continuum model.
[Figure caption fragment: Distributions of TF residence times for the given TF types. 12 timesteps were used for sampling, and the fixed specific binding energy of 4kT was used for all simulations. A representative 12% of the data is shown to prevent over-crowding. Data were obtained from the 60 µM cases. Free TFs do not belong to any cluster. TFs on the surface and in the core of clusters were named accordingly.]

Discussion
Our findings reveal an apparent multi-exponential behavior for the discrete affinity model of TF-DNA bindings. Single and binary affinities leading to multi-exponential distributions go beyond the prior suggestion of bi-exponential behavior due to specific and nonspecific binding sites. Here, we demonstrate how this multi-exponential behavior is shaped by TF-DNA complexes, and how both these formations and the residence time behavior depend on nonspecific binding affinities.
One of the prominent features of our previous coarse-grained model was the global chromosome organization. Although that system allowed the modeling of the role of nonspecific affinities, the selected bead size for TF binding domains led to multiple interactions facilitating chromosomal collapse. With extreme compaction of the chromosome, the resulting residence times of TFs would exceed simulation durations, making it unlikely to obtain RT patterns. By lowering the bead size of TFs, we minimized multiple binding and over-compaction. That also enabled the decoupling of 3D genome organization from RT patterns, increasing the control over other variables.
Similar to any MD study, this work has considerable limitations. Most prominently, the simplification of the cell for coarse-grained MD simulations removes most of the complexity possessed by actual cellular systems. Additionally, we used a single type of TF for all MD simulations with discrete binding affinities. In combination, the reduced complexity may explain why power-law behavior, which was attributed to continuous TF-DNA affinities in experimental studies, was not observed. Of course, cellular complexity persists in the experimental setup. Therefore, other factors, such as fluctuating TF levels and a dynamic chromosomal landscape, should not be neglected. However, our simulations, even with discrete affinities, exhibited multi-exponential behaviors, suggesting that a power-law pattern could be possible with continuous affinities, which is beyond the scope of this study.
As their primary roles, TFs can inhibit and activate transcription. The formation of stable protein-DNA complexes extends the duration of a TF's residence on its target DNA site. That, in turn, could significantly enhance their inhibitory or activatory effects. We may also speculate that larger and more compact clusters may decrease the accessibility by RNA polymerase, such as in heterochromatin-like domains (Amemiya et al., 2022), effectively reducing the transcriptional output. Contrarily, similar protein-DNA complexes, but in the form of transcriptional machinery, could initiate or enhance transcription.
Even though we demonstrated that cluster size is an important factor for residence times, we also wanted to reveal any possible relationship between cluster shape and dissipation rates. However, investigation of the effect of cluster conformations on residence times was inconclusive because cluster sizes are highly dependent on nonspecific affinities. Thus, establishing a clear relation between cluster shape and RT requires eliminating affinity as a variable. In turn, that would require a separate MD model. Furthermore, clusters are not fixed formations; they are highly dynamic. Their sizes, conformations, orientations, and even the DNA segments they interact with change over time in the system. Moreover, clusters can merge and split, making it rather complicated to track individual clusters and make accurate analyses.
Additionally, the utilization of dimeric TF models may have contributed to the multi-exponential behavior in two ways. First, our definition of an unbound TF requires both binding domains not to be in contact with any DNA molecule. In other words, partial bindings to DNA (i.e., bindings with only one binding domain) are equivalent to full bindings in terms of being considered bound. The difference in stability between partial binding and full binding may indeed lead to distinct residence behaviors contributing to multi-exponential behavior. Second, divalent interactions are significantly more prone to facilitated dissociation (Kamar et al., 2017; Chen et al., 2018). Since FD is pronounced at high concentrations (Koşar et al., 2022), multi-exponential behavior could be partially attributed to FD as it can facilitate the unbinding of the exposed TFs (i.e., free or surface) more than the unexposed TFs (i.e., core).
Noise in gene expression is an important factor driving cellular heterogeneity (Liu et al., 2019). Gene expression noise could lead different cells in a homogeneous population to distinct phenotypes even under the same environmental conditions (Raser and O'Shea, 2005). This resulting cellular heterogeneity may provide evolutionary advantages to cells. The primary driving event of gene expression noise is considered to be TF binding (Parab et al., 2022). Also, infrequent or rare biochemical processes contribute to noise in gene expression (Raser and O'Shea, 2005). Such infrequency or lower occurrence could be seen for the longer residence durations in our systems, which are more apparent at high nonspecific affinity and high protein concentration cases. Therefore, such cases could lead to higher gene expression noise and yield a more heterogeneous population of cells. It should also be noted that longer residence times are more likely to cause transcriptional bursts (Raser and O'Shea, 2005), which is another notion that contributes to noise in gene expression. On the other hand, shorter DNA residence times are suggested to lower gene expression noise (Azpeitia and Wagner, 2020). Our study sheds light on DNA residence distributions and may explain the diverse noise in gene expression responses.
In this study, we demonstrate that TFs follow multi-exponential patterns with discrete affinities in our MD model. We discuss how binding affinity and concentration of TFs dictate these behaviors. There are indeed some implications of this behavior for biological systems. Isolated from cellular complexity and the continuum model, model homodimeric TFs exhibit multi-exponential patterns even with single and binary affinities. This type of behavior might be one of the underlying reasons for gene expression noise and consequent cellular heterogeneity and might contribute to cellular differentiation. Moreover, the distributions of TF-DNA residence times may help explain discrete transcriptional bursts. Lastly, DNA-protein clusters in bacterial chromosomes could drive the formation of Topologically Associated Domain-like domains and further affect the regulation of gene expression. Overall, our findings might contribute to a more comprehensive understanding of gene expression and regulation.

Modelling the system
We used a modified version of our previous coarse-grained model bacterial system mimicking an E. coli bacterium. The model system includes a fairly relaxed chromosome with uniformly distributed binding sites confined within cell wall-like boundaries resembling those of a rod-shaped bacterium. Generic homodimeric transcription factors at various concentrations (10 to 60 µM) were placed randomly in the volume created by the confinement.

Modelling the DNA polymer
For coarse-grained modeling, we used the 10 bp ≈ 1 bead "Kremer-Grest" (KG) model for DNA, where 1 bead has a diameter of 1σ in Lennard-Jones (LJ) units, corresponding to ∼3.4 nm. We set the persistence length to 15 beads, corresponding to 50 nm, consistent with a double-helix DNA molecule. We established an N = 12000 KG bead DNA model in which we maximized the number of binding sites and left sufficient spacing for DNA segmental flexibility to minimize the impact of bridging between binding sites. The initial circular DNA structure was compacted using self-attractive forces to reduce its size to fit within the available volume, similar to our previous work (Koşar et al., 2022). The binding sites are composed of three beads (as opposed to two beads in our previous work), increasing the likelihood of maximum interaction with the proteins. 150 binding sites are placed uniformly along the DNA with 80-bead spacing.
[...] has a volume of $2/3 \times \pi \times R^3$. The radius (r) was set to provide a 1% DNA volume fraction to match that of E. coli. Therefore, r corresponds to ∼30σ in LJ units (∼100 nm) for a DNA polymer length of N = 12000. The fixed beads of the cell wall are placed densely enough to provide effective boundaries for DNA and proteins. Nevertheless, there was a negligible number of TF leaks (< 2%) in extended simulations.

Modelling of the transcription factors
We employed a generic model of homodimeric TFs in which a TF has two identical binding domains and a hinge domain with no affinity to DNA. Binding domains were placed around the hinge domain at 90° angles in a semi-flexible fashion with a persistence length of 12 beads. Radii were set to 0.5σ and 0.21σ for hinge and binding domains, respectively (Figure 1A). We used relatively small binding domains to minimize multiple interactions by single binding domains and prevent over-condensation of the nucleoid. The exact sizes and angles were chosen to ensure that dimeric TFs fit well into three-bead binding sites. We employed four different concentration levels (i.e., 10 µM, 20 µM, 40 µM, and 60 µM). The corresponding number of TFs for the given concentrations was calculated using the volume provided by the confinement. TFs were distributed into the volume at random coordinates.
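The conversion from a molar concentration to a number of TF copies can be sketched as follows. This is a minimal illustration, not the procedure used in the simulations: the function name `tf_count` and the 1 femtolitre volume are assumptions chosen only to make the arithmetic concrete, and the actual confinement volume of the model is not reproduced here.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def tf_count(concentration_uM, volume_litres):
    """Number of molecules for a concentration given in micromolar and a volume in litres."""
    return round(concentration_uM * 1e-6 * AVOGADRO * volume_litres)

# Hypothetical volume (1 femtolitre), NOT the confinement volume of the model
volume = 1e-15
counts = {c: tf_count(c, volume) for c in (10, 20, 40, 60)}
```

Because the count scales linearly with both concentration and volume, the four concentration levels always yield copy numbers in a 1:2:4:6 ratio, up to rounding, for any fixed confinement volume.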

Modelling the TF-DNA affinities
We used a fixed specific interaction potential of $U_{sp} = 4kT$ per bead, an energy high enough for robust binding yet low enough to allow unbinding within the timeframe of our simulations, enabling us to extract residence time patterns. Varying nonspecific interaction potentials in the range of 1 to 4 kT allowed tracking of residence times at distinct local compaction levels and diverging cluster sizes, as well as differing protein distributions over DNA.
For the distributed affinity cases, there was no specific binding potential. Instead, affinities spanned the given ranges of 1 to 4 kT and 2 to 5 kT (inclusive) in increments of 0.25kT, resulting in 13 distinct affinity levels throughout the DNA, with site numbers assigned to ensure either Gaussian or uniform distributions.
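The discretization described above can be sketched in a few lines. This is an illustrative sketch only: the total number of sites (130), the Gaussian width `sigma`, the centring of the Gaussian on the middle of the range, and the rounding scheme are all assumptions, since the source does not state them.

```python
import math

def affinity_levels(low, high, step=0.25):
    """Inclusive list of affinity levels (in kT) from low to high."""
    n = round((high - low) / step) + 1
    return [low + i * step for i in range(n)]

def site_counts(levels, total_sites, kind="uniform", sigma=0.75):
    """Split total_sites among levels with uniform or Gaussian weights.

    Rounding uses the largest-remainder rule so the counts sum to total_sites.
    sigma (in kT) and the Gaussian centre (mid-range) are hypothetical choices.
    """
    if kind == "uniform":
        weights = [1.0] * len(levels)
    else:
        mu = (levels[0] + levels[-1]) / 2
        weights = [math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) for x in levels]
    total_w = sum(weights)
    raw = [total_sites * w / total_w for w in weights]
    counts = [math.floor(x) for x in raw]
    leftovers = total_sites - sum(counts)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftovers]:
        counts[i] += 1
    return counts

levels = affinity_levels(1.0, 4.0)               # 1.0, 1.25, ..., 4.0 kT
uniform = site_counts(levels, 130, "uniform")    # hypothetical 130 sites
gaussian = site_counts(levels, 130, "gaussian")
```

The inclusive range from 1 to 4 kT in 0.25 kT steps gives the 13 levels mentioned in the text; the same holds for the 2 to 5 kT range.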

Calculation of residence durations
Transcription factors are considered bound under the condition that at least one of the two binding domains is in direct contact with the DNA polymer. For each timestep, each protein is marked as either bound or unbound. Then, the duration of each uninterrupted bound state was calculated (Figure 1B). Each time a particular residence duration was encountered, the corresponding occurrence was incremented by 1. We then utilized pooled occurrences, containing the number of occurrences for each possible duration (1 to tmax), for analyzing residence patterns of transcription factors from simulations with distinct parameters. The obtained data were then minimized, by taking the median of durations, to ensure that each occurrence value has only one corresponding duration. This step was necessary for equation fits and visualizations.
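The counting procedure described above can be sketched as follows, assuming each TF trajectory has already been reduced to a per-timestep 0/1 bound flag. The function names are illustrative, and counting a bound stretch that is still open at the end of the trajectory is an assumption, since the text does not say how such stretches are handled.

```python
from collections import Counter
from statistics import median

def bound_run_lengths(bound_flags):
    """Lengths of uninterrupted bound stretches in a 0/1 time series."""
    runs, current = [], 0
    for flag in bound_flags:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:  # assumption: a stretch still open at the end is counted
        runs.append(current)
    return runs

def pooled_occurrences(trajectories):
    """Number of times each residence duration was observed across all TFs."""
    counts = Counter()
    for flags in trajectories:
        counts.update(bound_run_lengths(flags))
    return counts

def minimize_by_median(counts):
    """Keep one duration per occurrence value: the median of the durations sharing it."""
    by_occurrence = {}
    for duration, occurrence in counts.items():
        by_occurrence.setdefault(occurrence, []).append(duration)
    return {occ: median(ds) for occ, ds in by_occurrence.items()}

# Toy example: two TFs, 1 = bound (at least one domain in contact), 0 = unbound
trajectories = [
    [1, 1, 0, 1, 0, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 1, 1, 0, 1],
]
counts = pooled_occurrences(trajectories)
reduced = minimize_by_median(counts)
```

In this toy input, durations 1, 2, and 3 each occur twice, so the median step collapses them to a single point (occurrence 2, duration 2).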

Fitting equations to the residence time distributions
We fitted several equations to interpret the behavior of transcription factors. Initially, we used a single exponential decay, $c\,e^{-kt}$, and a power law, $c\,t^{-k}$, where $c$ is a coefficient and $k$ is the decay rate, i.e., $\tau^{-1}$.
Because these equations fit the data suboptimally, we also included double, triple, quadruple, and pentuple exponential decays of the form $c_1 e^{-k_1 t} + \dots + c_n e^{-k_n t}$. The resulting fits were plotted and used to calculate the normalized residual sum of squares (RSS). Normalization was achieved by dividing the RSS by the number of durations with non-zero occurrence, so that a larger number of observations does not by itself lead to a higher total RSS.
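The model family and the normalized RSS can be written compactly as below. This sketch covers only the model evaluation and the error metric; the parameter search itself (a nonlinear least-squares routine) is omitted, and the synthetic data, parameter values, and function names are illustrative assumptions. Restricting the residuals to durations with non-zero occurrence is also an assumption about how the normalization was applied.

```python
import math

def multi_exp(t, params):
    """Sum of n exponential decays; params = [c1, k1, ..., cn, kn]."""
    return sum(c * math.exp(-k * t) for c, k in zip(params[0::2], params[1::2]))

def power_law(t, c, k):
    """Power-law model c * t^(-k), defined for t > 0."""
    return c * t ** (-k)

def normalized_rss(durations, occurrences, model):
    """RSS divided by the number of durations with non-zero occurrence."""
    points = [(t, y) for t, y in zip(durations, occurrences) if y > 0]
    rss = sum((y - model(t)) ** 2 for t, y in points)
    return rss / len(points)

# Synthetic data drawn from a double exponential (illustrative values only)
true_params = [100.0, 0.5, 10.0, 0.05]
durations = list(range(1, 101))
occurrences = [multi_exp(t, true_params) for t in durations]

ded = normalized_rss(durations, occurrences, lambda t: multi_exp(t, true_params))
ed = normalized_rss(durations, occurrences, lambda t: multi_exp(t, [100.0, 0.5]))
print("DED:", ded, "ED:", ed)
```

On data with two decay components, the single exponential leaves a larger normalized RSS than the double exponential, which is the kind of comparison the normalized metric is meant to support across datasets of different sizes.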

Clustering the transcription factors
Transcription factors within the threshold distance of each other were considered part of the same cluster. We set the threshold distance to 2.1σ (in LJ units), in agreement with visual inspection. For the overview of the system, the minimum number of transcription factors required to form a cluster was 12; for the decay analysis, it was 20, ensuring a more accurate estimation of mean lifetimes.
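One way to realize this distance-threshold criterion is single-linkage grouping, in which TFs connected through a chain of neighbors within the threshold belong to the same cluster. The sketch below assumes that each TF is represented by a single 3D coordinate (for example its hinge bead) and that no periodic boundaries apply; both are assumptions, as is the function name.

```python
import math
from collections import deque

def find_clusters(positions, threshold=2.1, min_size=12):
    """Group points linked by chains of pairwise distances <= threshold.

    positions : list of (x, y, z) tuples in LJ units (one per TF, assumed)
    Returns clusters as sorted index lists with at least min_size members.
    """
    n = len(positions)
    visited = [False] * n
    clusters = []
    for seed in range(n):
        if visited[seed]:
            continue
        visited[seed] = True
        members, queue = [seed], deque([seed])
        while queue:  # breadth-first search over neighbors
            i = queue.popleft()
            for j in range(n):
                if not visited[j] and math.dist(positions[i], positions[j]) <= threshold:
                    visited[j] = True
                    members.append(j)
                    queue.append(j)
        if len(members) >= min_size:
            clusters.append(sorted(members))
    return clusters

# Toy example: a chain of three points and one distant point
points = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (100, 0, 0)]
print(find_clusters(points, threshold=2.1, min_size=3))
```

The pairwise search is quadratic in the number of TFs, which is adequate for a sketch; a cell list or spatial tree would be the usual choice for larger systems.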

Cluster analysis
Cluster analysis covered size, conformation, surface analysis, and decay rates. The size of a cluster is the number of transcription factors within it. To classify the conformation (or shape) of a cluster, we defined three possible formations, namely filamentous, globular, and semi-filamentous; we then constructed the radius of gyration (Rg) tensor for each cluster and evaluated its eigenvalues. To distinguish surface proteins from core proteins within a cluster, we employed the convex hull algorithm. Decay rates (the inverse of mean lifetimes) for clusters were calculated by fitting a single exponential decay equation to $N(t)/N_0$. Reliable lifetime analysis requires statistically meaningful numbers, which led us to select 20 as the minimum cluster size. We then used linear regression to analyze the correlation between cluster dissipation mean lifetimes and cluster sizes.
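The decay-rate and correlation steps can be illustrated as follows. Note the substitution: instead of a direct nonlinear fit of the exponential, this sketch uses a log-linear least-squares fit of $\ln(N(t)/N_0) = -kt$ through the origin, which is a simpler stand-in and not necessarily the procedure used in the study. The synthetic survival curve and the function names are hypothetical.

```python
import math

def decay_rate(times, survival):
    """Least-squares slope of ln(N(t)/N0) = -k t, constrained through the origin."""
    pts = [(t, math.log(s)) for t, s in zip(times, survival) if s > 0]
    num = sum(t * ln_s for t, ln_s in pts)
    den = sum(t * t for t, _ in pts)
    return -num / den

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Synthetic survival curve N(t)/N0 with a known decay rate of 0.1
times = list(range(0, 50))
survival = [math.exp(-0.1 * t) for t in times]
k = decay_rate(times, survival)
print("decay rate:", k, "mean lifetime:", 1 / k)
```

With one mean lifetime per cluster, `pearson_r(sizes, lifetimes)` would give the correlation coefficient between cluster size and dissipation time for a set of clusters.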

Figure 1. Graphical abstract of the study. (A) The coarse-grained model of a generic homodimeric transcription factor. The light blue parts represent the binding domains, and the dark blue part is the hinge (non-binding) domain. Radii of the coarse-grained beads are given in Lennard-Jones (LJ) units, and the angle between the binding domains is shown in degrees. (B) Possible binding states of a TF and calculation of residence times (RT). A TF can be fully or partially bound to DNA, or it can be in an unbound state in which it does not interact with the DNA. Red and dark blue regions represent specific and nonspecific binding sites of the DNA, respectively; binding to either type of site is treated as equivalent. (C) Collection, minimization, and analysis of TF residence patterns. First, the bound durations of each TF throughout the simulations are collected and then minimized to their medians. The resulting patterns are analyzed by fitting several exponential equations.

Figure 2. Analysis of TF residence time patterns. (A) Distribution of TF residence times in arbitrary units (a.u.) at various concentrations for distinct $U_{ns}$ levels, with fits of the exponential decay (ED), double exponential decay (DED), triple exponential decay (TED), quadruple exponential decay (QED), pentuple exponential decay (PED), and power-law equations to the residence time data. (B) Normalized residual sum of squares (RSS) of the fits, depicting the difference between the observations and the fits. For all equations, the overall RSS, the RSS by concentration, and the RSS by binding affinity are shown, respectively. Specific binding energies are 4 kT for all systems.

Figure 3. Visualizations of the system, protein distributions, and cluster formations of the coarse-grained bacterium model at various nonspecific binding energies. (A) Overview of the system: DNA (white-light blue) and TFs. (B) TF distributions within the confinement. (C) Cluster formations of TFs. Here, the coloring serves only to distinguish distinct protein clusters. Snapshots were obtained from systems with TF concentrations of 60 µM. Specific binding energies are 4 kT for all systems.

Figure 4. Correlation analysis of cluster sizes and dissipation times. (A) Snapshots of the TF clusters at the given nonspecific energies from random timesteps. Darker colors indicate higher mean lifetimes, as shown by the colorbars on the left. (B) Mean lifetimes in arbitrary units versus cluster sizes, given as the number of TFs in each cluster. Regression lines are used to determine correlations. Pearson correlation coefficients (r) are given on the upper-left side, and the number of clusters in the analysis is given on the bottom-right side. TF concentrations were 60 µM.

Figure 5. Residence time analysis based on TF localization with respect to clusters. (A) Differences among TF residence times for free, surface, and core TFs at the given nonspecific affinities. Welch's t-test was used for significance analysis. (B)

Figure 6. TF residence time behaviors at distributed affinities. (A) Uniform distribution and (B) normal (Gaussian) distribution affinity models. Pattern analysis via fitting of triple exponential, power-law, and weighted power-law equations, from left to right, respectively. TF concentrations were 60 µM for all cases. Black lines are shown as a reference from the fixed nonspecific affinity of 3.5 kT. Specific binding affinities are 4 kT for all systems.