Live imaging and biophysical modeling support a button-based mechanism of somatic homolog pairing in Drosophila

The spatial configuration of the eukaryotic genome is organized and dynamic, providing the structural basis for regulated gene expression in living cells. In Drosophila melanogaster, 3D genome organization is characterized by the phenomenon of somatic homolog pairing, where homologous chromosomes are intimately paired from end to end. While this organizational principle has been recognized for over 100 years, the process by which homologs identify one another and pair has remained mysterious. Recently, a model was proposed wherein specifically-interacting “buttons” encoded along the lengths of homologous chromosomes drive somatic homolog pairing. Here, we turn this hypothesis into a precise biophysical model to demonstrate that a button-based mechanism can lead to chromosome-wide pairing. We test our model and constrain its free parameters using live-imaging measurements of chromosomal loci tagged with the MS2 and PP7 nascent RNA labeling systems. Our analysis shows strong agreement between model predictions and experiments in the separation dynamics of tagged homologous loci as they transition from unpaired to paired states, and in the percentage of nuclei that become paired as embryonic development proceeds. In sum, our data strongly support a button-based mechanism of somatic homolog pairing in Drosophila and provide a theoretical framework for revealing the molecular identity and regulation of buttons.


Introduction
Eukaryotic genomes are highly organized within the three-dimensional volume of the nucleus. Advances over the past several decades have revealed a hierarchy of organizational principles, from the large scale of chromosome territories to the smaller-scale patterned folding of chromosomal segments called Topologically Associated Domains (TADs) and the association of active and inactive chromatin into separate compartments (Szabo, Bantignies, and Cavalli 2019) . Disruption of these organizational structures can have dramatic consequences for gene expression and genome stability (Lupiáñez et al. 2015;Kragesteen et al. 2018;Despang et al. 2019;Rosin et al. 2019) , emphasizing the importance of fully understanding underlying mechanisms of three-dimensional genome organization.
While many principles of genome organization are common among eukaryotes, differences have been noted among different organisms and cell types. For example, in Drosophila , an additional layer of nuclear organization exists in somatic cells wherein homologous chromosomes are closely juxtaposed from end to end, a phenomenon known as somatic homolog pairing (Joyce, Erceg, and Wu 2016;Stevens 1908) . While similar interchromosomal interactions occur transiently in somatic cells of other species and during early meiotic phases of most sexually-reproducing eukaryotes, the widespread and stable pairing of homologous chromosomes observed in somatic cells of Drosophila appears to be unique to Dipteran flies (King et al. 2019;Joyce, Erceg, and Wu 2016;McKee 2004) . Notably, the close juxtaposition of paired homologs can have a dramatic impact on gene expression through a process known as transvection, where regulatory elements on one chromosome influence chromatin and gene expression on a paired chromosome (Fukaya and Levine 2017;Duncan 2002) . However, although somatic homolog pairing was first described over 100 years ago (Stevens 1908) , the molecular mechanisms by which homologous chromosomes identify one another and pair have yet to be described.  Figure 1A) (Fung et al. 1998;Hiraoka et al. 1993) . This model is further supported by recent studies that converge on a "button" model for pairing, which postulates that pairing is initiated at discrete sites along the length of each chromosome ( Figure 1B) (Viets et al. 2019;Rowley et al. 2019) . However, the molecular nature of these hypothesized buttons is as yet unclear, nor is it clear whether this proposed model could lead to de novo pairing in the absence of some unknown active process that can identify and pair homologous loci. In light of these advances, our understanding of the initiation and maintenance of pairing would be greatly facilitated by the establishment of a biophysical model that defines parameters for the activities of pairing buttons, informed by observations of pairing dynamics in living cells.
In this paper, we turn the "button" mechanism for somatic homolog pairing into a precise biophysical model that can predict the behaviors of chromosomes over time.
Our simulations show that chromosome-wide pairing can be established through random encounters between specifically-interacting buttons that are dispersed across homologous chromosomes at various possible densities using a range of binding energies that are reasonable for protein-protein interactions. Importantly, we find that active processes are not necessary to explain pairing via our model, as all of the interactions necessary for stable pairing are initiated by reversible random encounters that are propagated chromosome-wide. We test our model and constrain its free parameters by assessing its ability to predict pairing dynamics using live-imaging. Our model can successfully predict that, once paired, homologous loci remain together in a highly stable state. Furthermore, the model can also predict the dynamics of pairing through the early development of the embryo as measured by the percentage of nuclei that become paired as development proceeds, and by the dynamic interaction of individual loci as they transition from unpaired to paired states. In sum, our analysis provides the necessary quantitative data to strongly support a button model as the underlying mechanism of somatic homolog pairing and establishes the conceptual infrastructure to uncover the molecular identity, functional underpinnings, and regulation of buttons.

1) FORMALIZING A BUTTON-BASED POLYMER MODEL OF HOMOLOGOUS PAIRING
Several prior studies have suggested that somatic homolog pairing in Drosophila may operate via a button mechanism between homologous loci (Viets et al. 2019;Rowley et al. 2019;Gemkow, Verveer, and Arndt-Jovin 1998;AlHaj Abed et al. 2019;Erceg et al. 2019;Fung et al. 1998) . In this model, discrete regions capable of pairing specifically with their corresponding homologous segments are interspersed throughout the chromosome. To quantitatively assess the feasibility of a button mechanism, we implemented a biophysical model of homologous pairing ( Fig. 2A). Briefly, we modeled homologous chromosome arms as polymers whose dynamics are driven by short-range, attractive, specific interactions between homologous loci, the so-called buttons, to account for pairing. These buttons are present at a density along the chromosome and bind specifically to each other with an energy . As E p described in detail in Materials and Methods, we included in our model short-range, non-specific interactions between (peri)centromeric regions to account for the the large-scale HP1-mediated clustering of centromeres that may also impact the global large-scale organization inside nuclei (Strom et al. 2017;Rosin, Nguyen, and Joyce 2018) and thus may interplay with pairing ( Fig. 2 -Supp. Fig. 2D) . As initial conditions for our simulations, we generated chromosome configurations with all centromeres at one pole of the nucleus (a 'Rabl' configuration) (see Supplementary Movies 1 and 2), typical of early embryonic fly nuclei (Csink and Henikoff 1998) .
Furthermore, to account for the potential steric hindrance of non-homologous chromosomes that could impede pairing, we simulated two pairs of homologous polymers.  Pairing between homologous chromosomes is assumed to be driven by specific, short-range attractive interactions of strength between certain homologous E p regions, named buttons. Each 10kb monomer in the simulation corresponds to one locus. (B) Kymograph of the time-evolution of the distances between homologous regions predicted by the model in one simulated stochastic trajectory for a button density of =65% , an interaction strength of =-1.6k B T, and E p an initial distance . See Figure  We systematically investigated the role of the button density along the genome , of the strength of the pairing interaction , and of the initial distance between homologous E p chromosomes in dictating pairing dynamics (Fig. 2D). For a given density, there is a critical d i value of below which no pairing event can be captured independently of the initial conditions E p Fig. 2A) since pairing imposes a huge entropic cost for the polymers and thus requires a sufficient amount of energy to be stabilized. Beyond this critical point, higher strengths of interactions and higher button densities lead to faster and stronger pairing (Fig. 2D left, center). The initial distance is also a crucial parameter. When homologous d i chromosomes are initially far apart, pairing is dramatically slowed down and impaired (Fig. 2D, right) due to the presence of the other simulated chromosomes between them ( Fig. 2 -Sup. Fig.   2B).
Altogether, these systematic analyses of model parameters support the view that the homologous button model is compatible with pairing. The key necessary mechanism is the specificity of preferential interactions between homologous regions. Indeed, our model suggests that non-specific interactions between buttons ( Fig. 2 -Sup. Fig. 2C,E) are not compatible with chromosome-wide pairing. These results are consistent with previous works where we showed that the weak, non-specific interactions between epigenomic domains that drive TAD and compartment formation in Drosophila (Ghosh and Jost 2018;Jost et al. 2014) cannot establish and maintain stable pairing (Pal et al. 2019) .

2) LIVE IMAGING REVEALS HOMOLOGOUS PAIRING DYNAMICS
The button model presented in Figure 2 makes precise predictions about pairing dynamics at single loci along the chromosome. To inform the parameters of the model and test its predictions, it is necessary to measure pairing dynamics in real time at individual loci of a living embryo. To accomplish this, we employed the MS2/MCP (Bertrand et al. 1998) and PP7/PCP (Chao et al. 2008) systems for labeling nascent transcripts. Here, each locus contains either MS2 or PP7 loops which can be visualized with different colors in living embryos (Fukaya, Lim, and Levine 2016;Lim et al. 2018;Chen et al. 2018;Garcia et al. 2013) . Specifically, we designed transgenes encoding MS2 or PP7 loops under the control of UAS (Brand and Perrimon 1993) , and integrated them at equivalent positions on homologous chromosomes (Fig.   3A). Activation of transcription via a source of GAL4 creates nascent transcripts encoding either MS2 or PP7 stem loops, each of which can be directly visualized by maternally providing  We focused on embryos that had completed the maternal-to-zygotic transition and began to undergo gastrulation at approximately 2.5 to 5 hours after embryo fertilization, a time in development when pairing begins to increase significantly (Fung et al. 1998 For each case, we imaged 30-60 minute time windows from multiple embryos, and used custom MATLAB scripts to determine the relative 3D distances between chromosomal loci over time. In embryos with both PP7 and MS2 transgenes integrated at polytene position 38F (Fig.   3B, top), the majority of nuclei could be qualitatively classified into one of two categories. In "unpaired" nuclei, homologous loci were typically separated by >1 µm with large and rapid changes in inter-homolog distances (e.g. Fig. 3C, blue), with a mean distance of 2.2 µm and standard deviation (SD) of 1.2 µm averaged over 30 separate nuclei. The measured mean distance between homologous loci was comparable within error, though systematically smaller, relative to the mean distance measured between loci in the negative control, where transgenes are integrated at nonhomologous positions ( Fig. 3C, red, mean distance = 4.0 µm, SD = 1.3 µm, n = 21 nuclei). In contrast, in "paired" nuclei, homologous loci remained consistently close to one another over time, with smaller dynamic changes in interhomolog distance (Fig. 3D, blue, mean distance = 0.4 µm, SD = 0.3 µm, n = 25). Interestingly, while the diffraction-limited signals produced from homologous loci can occasionally overlap in paired nuclei, their average separation was systematically larger than that of the embryos carrying interlaced MS2 and PP7 loops that served as a positive control for co-localization ( Fig. 3D, red, mean distance = 0.2 µm, SD = 0.1 µm, n=44). This control measurement also constitutes a baseline for the experimental error of our quantification of interhomolog distances (Chen et al. 2018) . Our measurements thus confirm previous observations of transgene pairing in the early embryo showing that signals from paired loci maintain close association but do not completely coincide with one another over time (Lim et al. 2018) . Notably, of 38 nuclei qualitatively scored as having paired homologs, we never observed a transition back to the unpaired state over a combined imaging time of more than 8 hours. Analysis of embryos with PP7 and MS2 transgenes integrated in homologous chromosomes at polytene position 53F showed comparable dynamics of interhomolog distances for nuclei in unpaired and paired states ( Fig. 3 -Sup. Fig. 1A, B). Thus, somatic homolog pairing represents a highly stable state characterized by small dynamic changes in the distance between homologous loci.
Our assessment thus far has been based on a qualitative definition of pairing. However, our live-imaging data affords us the ability to devise a stringent quantitative definition of homologous pairing. To make this possible, we measured inter-transgene distances for both homologous loci as well as for the unpaired and paired controls through gastrulation. We also included measurements from older embryos (~11-12 hours after embryo fertilization) using the driver R38A04-GAL4 (Jenett et al. 2012) to express the transgenes in epidermal cells, where pairing is expected to be widespread. In Figure 3E, we plotted the resulting mean distance vs.
standard deviation of the inter-transgene distance for each nucleus analyzed over the length of time each nucleus was tracked (~10-50 minutes). From these data, we established a quantitative and dynamic definition of somatic homolog pairing based on a mean distance <1.0 µm and a corresponding standard deviation <0.4 µm (Fig. 3E, shaded region). By this definition, we consider paired 100% of nuclei that we had qualitatively scored as such, but exclude all nuclei scored as unpaired. As expected, this definition also scores 100% (15/15) of the tracked nuclei from older embryos as paired. Further, data for paired nuclei from early vs. late embryos were in close agreement, demonstrating that pairing observed in early embryos is representative of later stages.
We next analyzed the progression of pairing over the first 6 hours of development in single embryos carrying MS2 and PP7 transgenes in homologous chromosomes at positions 38F and 53F. To accomplish this, we collected data for short (~10 minute) intervals every 30 minutes from 2.5 to 6 hours of development, and analyzed interhomolog distances as outlined above. We then plotted the mean distance as a function of standard deviation for each nucleus analyzed at each time point to create a dynamic assessment of somatic homolog pairing over developmental time. As expected, we see an overall decrease in mean interhomolog distance and its standard deviation as development progresses (Fig. 4A, Fig. 3 -Sup. Fig. 1C). To directly compare our analysis to prior studies, we then binned nuclei into paired and unpaired states based on their mean and standard deviation values as defined in Figure 3E, and plotted the percentage of paired nuclei at each developmental time point (Fig. 4B). Consistent with previous literature (Fung et al. 1998) , we observe a steady increase in the proportion of paired nuclei; however, by our dynamical definition of pairing, the percentage of nuclei that are paired is systematically lower at most timepoints when compared to results using DNA-FISH ( Fig. 4 -Sup. Fig. 1). This slight disagreement likely reflects differences between the classic, static definition of pairing based on overlapping DNA-FISH signals in the one frame accessible by fixed-tissue measurements as opposed to our definition that demands loci to be paired over several consecutive frames. In sum, we have demonstrated that our system captures the progression of somatic homolog pairing over developmental time, making it possible to contrast theoretical predictions and experimental measurements.

PAIRING PROBABILITY
Our button model predicts that the fraction of paired loci as a function of time depends on three parameters: the initial separation between homologous chromosomes d i , the density of buttons along the chromosome , and the button-button interaction energy (Fig. 2D) . As an E p initial test of our model, and to constrain the values of its parameters, we sought to compare model predictions to direct measurements of the fraction of paired loci over developmental time.
Due to the still unknown molecular identity of the buttons, it was impossible to perform a direct measurement of the button density and the button-button interaction energy. However, the initial separation between chromosomes d i can be directly estimated using chromosome painting (Ried et al. 1998;Beliveau et al. 2012) . To make this possible, we used Oligopaint probes (Beliveau et al. 2012)  we inferred, for each button density, the strength of interaction that best fits the data (Fig. 4C).
Interestingly, the goodness of fit is mainly independent of the button density ( Fig. 4 -Sup. Fig.   3C): denser buttons require less strength of interaction to reach the same best fit (black line in  Each data point represents a single nucleus over a ten-minute time window, with different colors indicating time after fertilization. Data are separated into three plots for ease of visualization. (B) Nuclei from each timepoint were scored as "paired" if they fell within the shaded box in (A). Data were taken from three different embryos each for transgenes at 38F (red) and 53F (blue). For each button density , we fitted the experimental pairing dynamics (Fig. 4 -Sup. Fig. 2A). Grey shading provides the envelope of the best predictions obtained for each (dark grey) and its standard deviation (light grey). (C) Phase diagram representing, as a function of , the value of (black line) that leads to the best fit between predicted and experimental E p developmental pairing dynamics. The predicted pairing strength is weaker than observed in the parameter space above the line, and stronger than observed below the line.
As illustrated in Figure 4B, we observed that the infe rred developmental dynamics quantitatively recapitulates the experimental observations for both investigated loci for any choice of parameters given by the curve shown in Figure 4C. Note, however, that our simulations do not predict the large increase of pairing observed for 38F between 5.5 and 6 hours (Fig. 4B). This disagreement may be a consequence of the proximity of 38F to the highly paired Histone Locus Body (Hiraoka et al. 1993;Fung et al. 1998) . In sum, the button model can recapitulate the observed average pairing dynamics for a wide range of possible button densities coupled with interaction energies that are consistent with protein-DNA interactions.

4) PARAMETER-FREE PREDICTION OF INDIVIDUAL PAIRING DYNAMICS
The fit of our button model to the fraction of paired loci during development in living embryos (Fig. 4B) revealed a dependency between the interaction strength and the button E p density (Fig. 4C). As a critical test of the model's predictive power, we sought to go beyond averaged pairing dynamics and use the model to compute the pairing dynamics of individual loci. As can be seen qualitatively in the kymographs predicted by the model (Fig. 2B and Fig. 2 -Sup. Fig. 1), pairing spreads rapidly within tens of minutes from the buttons that constitute the initial points of contact along the chromosome. As a result, the button-model predicts that homologous loci will undergo a rapid transition to the paired state as the zipping mechanism of pairing progression moves across the chromosome.  inter-homolog distances decreasing rapidly from 1-2 µm to below 0.65 µm at an accelerating rate over the course of 10-20 minutes. The independence of this first period of the pairing dynamics with respect to and suggests that these dynamics are mainly diffusion-limited.

E p
In contrast to the initial pairing dynamics, varying model parameter values had a clear effect on the distance dynamics that followed the pairing event. Specifically, simulations with a weak (Fig. 5C, red) led to a slow increase in inter-homolog distances as time progressed, E p consistent with unstable pairing events. Conversely, simulations with a strong (Fig. 5D, E p green) were associated with tight pairing of homologous loci following the pairing event, with inter-homolog distances stably maintained around 130 nm, close to the spatial resolution of the model. Notably, the values of and determined to best fit the averaged temporal evolution of E p the fraction of paired loci over development ( Fig. 4 and 5B) all led to similar predictions for the median interhomolog distance dynamics associated with pairing events. As shown in Figures 5E and F, these traces converged to a stable long-term median inter-homolog distance of around 0.5 µm, that is nearly identical to the experimentally-determined distance of around 0.44 µm between homologous loci in stably paired nuclei (compare the colored and black lines in Fig. 5 E,F). Our results thus suggest that the slow dynamics of the pairing probability observed during development (Fig. 4) and the dynamics of inter-homolog distance after a pairing event (Fig. 5) are strongly correlated.
We then compared our simulated traces to experimental observations of pairing events in living nuclei. Among the many movies that we monitored, we captured 14 pairing events matching the criteria of initial large inter-homolog distances that drop below 0.65 µm for at least same approach as with the simulated data described above, and calculated the median dynamics of inter-homolog distances around the pairing event ( Fig. 5C-F, black lines). As shown in the figures, we observed that the pre-pairing dynamics are fully compatible with model predictions, with a rapid decrease in inter-homolog distances over the course of 10-20 minutes.
Furthermore, Figure 5F shows that the experimental post-pairing dynamics in interhomolog distance are closely recapitulated by the predictions made using parameters that best fit the pairing probability over developmental time presented in Figure 5B. Two caveats may be considered in interpreting our analysis. First, our method of tracking homologous loci in living embryos relies on visualizing nascent RNAs generated from transgenes rather than direct observation of DNA or DNA-binding proteins. While nascent RNAs provide a robust and convenient signal for the position of the underlying DNA (Lim et al. 2018;Chen et al. 2018) , the method limits us to examining the behavior of transcriptionally active loci, which could behave differently from silent chromatin. In addition, it is possible that our analysis could overestimate interhomolog distances in paired nuclei if, for example, nascent RNA molecules from separate chromosomes are prevented from intermixing (Fay and Anderson 2018) . Secondly, our simulations do not account for complex behaviors of the genome that take place during development and that may also influence pairing dynamics and stability, including cell cycle progression and mitosis (Foe 1989) , establishment of chromatin types and associated nuclear compartments (Sexton et al. 2012;Yuan and O'Farrell 2016;Hug et al. 2017;Ogiyama et al. 2018) , and additional nuclear organelles such as the Histone Locus Body (White et al. 2011;Liu et al. 2006) . Further testing and refinement of our understanding of somatic homolog pairing will require new approaches to incorporate the potential influences of these genomic behaviors in a developmental context.
A previous analysis of pairing and transvection in living embryos focused on the blastoderm phase, coinciding with the earliest developmental timepoints in our analysis, and found that interhomolog interactions were generally unstable at that time (Lim et al. 2018) . Thus, the embryo appears to transition from an early state that antagonizes stable pairing prior to cellularization to one that supports stable pairing at later time points of development. Prior studies have postulated changes in cell cycle dynamics (Fung et al. 1998;Gemkow, Verveer, and Arndt-Jovin 1998)  In the future, we anticipate that our model will be instrumental in identifying and characterizing candidate button loci, and in determining how these parameters are modulated in the various mutant backgrounds that have been found to affect pairing Joyce et al. 2012;Hartl, Smith, and Bosco 2008;Gemkow, Verveer, and Arndt-Jovin 1998) . Thus, our study significantly advances our understanding of the century-old mystery of somatic homolog pairing and provides a theory-guided path for uncovering its molecular underpinnings.

DNA constructs and fly lines
Flies expressing a nuclear MCP-NLS-mCherry under the control of the nanos promoter were previously described (Bothma et al. 2018) . To create flies expressing PCP-NoNLS-GFP, the plasmid pCASPER4-pNOS-eGFP-PCP-ɑTub3'UTR was constructed by replacing the MCP coding region of pCASPER4-pNOS-NoNLS-eGFP-MCP-ɑTub3'UTR (Garcia et al. 2013) with the coding region of PCP (Larson et al. 2011) . Transgenic lines were established via standard P-element transgenesis (Spradling and Rubin 1982) . To create flies expressing MS2 or PP7 loops under the control of UAS, we started from plasmids piB-hbP2-P2P-lacZ-MS2-24x-αTub3'UTR (Garcia et al. 2013) and piB-hbP2-P2P-lacZ-PP7-24x-αTub3'UTR, the latter of which was created by replacing the MS2 sequence of the former with the PP7 stem loop sequence (Larson et al. 2011) . The hunchback P2P promoter was removed from these plasmids and replaced by 10 copies of the UAS upstream activator sequences and the Drosophila Synthetic Core Promoter (DSCP) (Pfeiffer et al. 2010) . Recombinase Mediated Cassette Exchange (RMCE) (Bateman, Lee, and Wu 2006) was then used to place each construct at two landing sites in polytene positions 38F and 53F (Bateman and Wu 2008a;Bateman, Johnson, and Locke 2012) . Flies carrying the GAL4 driver nullo-GAL4, which drives expression in all somatic cells during the cellular blastoderm stage of cell cycle 14, were a gift from Jason Palladino and Barbara Mellone. Flies carrying the GAL4 driver R38A04-GAL4 , which drives expression in epidermal cells in germband-extended embryos (Jenett et al. 2012) , were acquired from the Bloomington Drosophila Stock Center. Finally, the interlaced MS2 and PP7 loops under the control of the hunchback P2 enhancer and promoter (P2P-MS2/PP7-lacZ) were based on a previously described sequence (Wu, Chen, and Singer 2014) .
The resulting embryos are loaded with MCP-mCherry-NLS and PCP-GFP proteins due to maternal expression via the nanos promoter, and zygotic expression of nullo-GAL4 drives transcription of MS2 and PP7 loops in all somatic cells starting approximately 30 minutes into cell cycle 14 (cellular blastoderm.) For pairing analysis, both MS2 and PP7 transgenes were in the same genomic location, either position 38F or 53F, whereas for a negative control, MS2 loops were located at 38F and PP7 loops were located at 53F. To visualize pairing at later time in development, the mothers indicated above were instead crossed to males of genotype 10XUAS-DSCP-PP7; R38A04-GAL4 , where both MS2 and PP7 loops were located at position 38F. Finally, to visualize MS2 and PP7 loops derived from the same genomic location, mothers of genotype MCP-mCherry-NLS, PCP-GFP were crossed to P2P-MS2/PP7-lacZ located at position 38F.

Embryo preparation and image acquisition
Embryos were collected at 25 ० C on apple juice plates and prepared for imaging as previously described (Garcia et al. 2013) . Mounted embryos were imaged using a Leica SP8 confocal microscope, with fluorescence from mCherry and eGFP collected sequentially to minimize channel crosstalk. For each movie, the imaging window was 54.3 x 54.3 µm at a resolution of 768 x 768 pixels, with slices in each z-series separated by 0.4 µm. Z-stacks were collected through either 10 or 12 µm in the z plane (26 or 31 images per stack), resulting in a time resolution of approximately 27 or 31 seconds per stack using a scanning speed of 800 Hz and a bidirectional scan head with no averaging. For the pairing data described in Figure 3, the imaging window was centered on a dorsal view of the embryonic head region covering mitotic domains 18 and 20 (Foe 1989) , which show minimal movements during gastrulation and germ band extension relative to other regions of the embryo. We compared pairing levels in these cells at 6 hours of development to that of cells in a posterior abdominal segment at the same time point and found them to be near identical (75.0% paired, n=16 for anterior cells vs 73.7%, n=19 in posterior cells according to the definition of pairing established in Figure 3E), supporting that cells from different regions of the embryo are roughly equivalent for pairing dynamics at this stage. For positive control embryos with interlaced PP7 and MS2 loops driven by the hunchback promoter, embryos were imaged during cell cycle 13 and early cell cycle 14, and the imaging window was positioned laterally as previously described (Garcia et al. 2013) . To assess pairing in late stage embryos using the R38A04-GAL4 driver, embryos were aged to approximately 11-12 hours and the imaging window was positioned laterally over an abdominal segment. For the developmental time course movies described in Figure 3, imaging centered on mitotic domains 18 and 20 when these cells were in interphase. During timepoints when these domains were undergoing mitosis, an adjacent mitotic domain in interphase was imaged.

Image analysis
All images were first run through the ImageJ plug-in Trainable Weka Segmentation (Arganda-Carreras et al. 2017) and filtered with custom classifiers to generate two separate channels of 3D segmented images that isolated fluorescent spots. These segmented spots were then fitted to a Gaussian with a nonlinear least squares regression to find the 2D center. Image z-stacks were then searched for any spots tracked for 3 or more contiguous z-slices and the rest were discarded. Additional manual curation was employed to confirm the accuracy of segmented images and to add in any additional spots that were missed. An initial estimate of the center of each spot was set based on the z-slice in which the spot had the greatest maximum intensity within a predefined radius from its 2D center. These initial estimates were then used to seed a 3D Gaussian fit for each spot, the center of which was used for all distance calculation. This granted us not only sub-pixel resolution in x-y but also sub z-slice resolution, allowing for more precision in the z coordinate that would otherwise be limited by the 0.4 µm spacing between consecutive stacked images created by confocal imaging. Raw image z-stacks for each time frame were also maximum projected in the channel containing nuclearly localized MCP-mCherry to create 2D maps of all the nuclei in frame. These nuclear projections were then segmented and tracked in Matlab, followed by manual curation to ensure that each nucleus was consistently followed. One tracked particle lineage from each channel was then assigned a distinct nucleus based on its proximity to that nucleus in the 2D map and the particles in each channel were considered homologous chromosomes of one another. Since absolute coordinates of assigned particles were not possible to obtain due to cellular rotation and motion, all distance calculations were done with the relative coordinates of each locus from its homolog as any cellular rotation or motion was assumed to be conserved between loci in the same cell. For the data presented in Figure 3, we qualitatively scored each nucleus based on the measured distances between red and green signals over the time the signals were observed: "paired" nuclei showed small distances and little variation over time, and "unpaired" nuclei showed larger distances and greater variation over time. Nuclei that showed a transition from large distances and variation at earlier timepoints to smaller distances and variation at later timepoints were scored as "pairing" traces, and were not included in Figure 3 (see Figure 5). In assessing the stability of the paired state, we included both "paired" (n=25) and "pairing" (n=13) nuclei in the total number of nuclei (n=38) assessed. In this analysis, we conservatively only included the observation time of "paired" nuclei (> 8 hours of observation with no transition back to the unpaired state), although "pairing" nuclei also remained in the paired state throughout the remaining observation time once they became paired. To align the traces presented in Figure 5 based on a timepoint when the loci become paired, we manually aligned all traces that had been qualitatively assessed as "pairing" traces according to several different values of threshold distance and consecutive frames below that threshold. We then optimized this exploration for values that provided qualitatively good alignment of traces but that excluded as few traces as possible in order to maximize the data available for analysis. The same criteria were applied to identify and align pairing traces from simulations. All image analysis was done using custom scripts in Matlab 2019b unless otherwise stated.

Chromosome painting
Embryos of genotype w 1118 were aged to 2-3 hours after embryo deposition, fixed, and subjected to DNA Fluorescence In Situ Hybridization (DNA-FISH) using 400 pmol of Oligopaint probes (Beliveau et al. 2012) targeting 2L and 2R (200 pmol of each probe) as previously described (Bateman and Wu 2008b) . Oligopaint probes targeting chromosome arms 2L and 2R are described in (Rosin, Nguyen, and Joyce 2018) . Hybridized embryos were mounted in Vectashield mounting media with DAPI (Vector Laboratories), and three-dimensional images were collected using a Leica SP8 confocal microscope. To establish initial inter-homolog distances, an image from an embryo in early interphase 14 (as judged by nuclear elongation (Fung et al. 1998) ) and with high signal-to-noise was analyzed using the TANGO image analysis plugin for ImageJ (Ollion et al. 2013(Ollion et al. , 2015Belevich et al. 2016) . After segmentation and assignment of each painted territory to a parent nucleus, distances between territories were measured from centroid to centroid in 3D. Since homologous chromosomes are labeled with the same color, when territories produce a continuous region of fluorescence, a distance of zero was assigned. A total of 48 nuclei were analyzed for each of 2L and 2R.

The homologous button model
We modeled two pairs of homologous chromosome arms as semi-flexible self-avoiding polymers. Each chromosome consists of N =3,200 beads, with each bead containing 10 kbp and being of size b nm. The 4 polymers moved on a face-centered-cubic lattice of size L x x L y x L z under periodic boundary conditions to account for confinement by other chromosomes. Previously, we showed that TAD and compartment formation may be quantitatively explained by epigenetic-driven interactions between loci sharing the same local chromatin state (Jost et al. 2014;Ghosh and Jost 2018) . However, such weak interactions cannot lead to global homologous pairing (Pal et al. 2019). Here, to simplify our model, we neglect these types of interactions (whose effects are mainly at the TAD-scale) to focus on the effect of homolog -specific interactions. We do consider, however, HP1-mediated interactions between (peri)centromeric regions that are thought to impact the global large-scale organization inside nuclei (Fig. 2 -Supp. Fig. 2D) (Strom et al. 2017) . Homologous pairing was modeled as contact interactions between some homologous monomers, the so-called buttons ( Fig. 2A). For each pair of homologous chromosomes, positions along the genome were randomly selected as buttons with a probability . Each 10kbp bead i of chromosome chr is therefore characterized by a state p chr,i with p chr,i =1 if it is a button (=0 otherwise) and p chr,i = p chr',i =1 if chr and chr' are homologous. In addition, the first 1,000 monomers of each chromosome were modeled as self-attracting centromeric and pericentromeric regions, the rest as neutral euchromatic regions. The energy of a given configuration was then given by , (S1) H = ∑ The dynamics of the chains followed a simple kinetic Monte-Carlo scheme with local moves using a Metropolis criterion applied to H . The values of ) and L z ( =4 µm) were fixed using the coarse-graining and time-mapping strategies developed in (Ghosh and Jost 2018) for a 10-nm fiber model and a volumic density=0.009 bp/nm 3 typical of Drosophila nuclei. For every set of remaining parameters (the button density and the strength of pairing interaction E p ), 100 independent trajectories were simulated starting from compact, knot-free, Rabl-like initial configurations (Csink and Henikoff 1998)  To constrain model parameters, we compared the measured pairing dynamics (Fig. 4B) to the model prediction. Specifically, for each parameter set, we computed a chi2-score between the predicted dynamics and experimental timepoints ,  ) as a μm ≤ 1 function of the initial distance between homologous chromosomes at 1h, 4h and 10h for =65% and d i E p =-1.6kT in presence ( , full lines) or absence ( , dashed lines) of interactions between − .1kT E c = 0 kT E c = 0 (peri)centromeric regions. Even for large initial distances ( ), we observe a weak but significant .5μm d i ≥ 2 amount of pairing in the euchromatic regions . These predictions are consistent with DNA FISH experiments at different loci suggesting an average higher pairing probability in heterochromatic loci during embryogenesis (see Fig.1 and Sup. Fig 2 in   ) . (E) The non-specific button model. To verify whether a non-specific button model can lead to global pairing, we relaxed the homologous model including specific attraction only between homologous loci and generated models where buttons may interact with any other buttons in the nucleus (left). In this model, the energy of a given configuration was described by We varied from 0.1 to 1 and from -0.025 to -4kT, one realization of which is shown here (right), E p without observing any global pairing of homologous chromosomes. Other parameters were as in the homologous button model (see Materials and Methods of the main text). Contacts preferentially form between buttons belonging to the same chromosome, or more weakly, between buttons of different chromosomes but not necessarily between homologous loci.   Inter-homolog distances were determined by segmenting painted regions in 3D using ImageJ and measuring center-to-center distances (inset; see Materials and Methods for details). Chromosome arm 2R, carrying transgene location 53F, was sampled in the same field of cells using a different fluorescent tag (not shown). (B) Distribution of inter-homologous arm distances measured from 48 nuclei in the image in (A) (red curve). Measurements from chromosome arms 2L and 2R were completely overlapping and therefore were combined. The curve is nicely fitted by a Gaussian distribution (black curve, mean=1.9 , SD=0.9).   defined as in Equation S1 and the number of common binding sites between buttons (chr,i) and m chr,i;chr ,j ′ (chr',j) , and the strength of interaction between binding sites bound to the same architectural E a < 0 proteins. For example, the case n site =1, n archi =1 ( =1, ) corresponds to the non-specific m chr,i;chr ,j ′ E a = E p button model described in Fig. 4 -Sup.Fig. 2E. To simplify and avoid potential issues arising from periodic boundary conditions, we focused on favorable situations for pairing where homologs are initially aligned and close to each other (~640nm between their respective centers of mass) (see inset in A) and where the monomers evolve in a close box (rigid wall conditions). Using this model, we first investigated how specific a button should be in order to lead to pairing. We fixed n site =1 and varied n archi and for a button E a density of 60%. In these cases, n archi represents the number of different button types. In (B), we plotted, for each n archi , the time evolution of the average pairing probability between homologous sites (paired if distance ) for the -value that leads to maximal pairing. As a point of comparison, we also plotted μm ≤ 1 E a the corresponding curves in the absence of buttons (black line) and for the homologous button model investigated in the main text (red line) for the -value (-1.5kT) consistent with experiments at the E p corresponding button density. We observed that as n archi increases, i.e. as the buttons become more specific, the pairing efficiency increases. The full specific model (red) becomes well approximated by our combinatorial button model for n archi >200. This suggests that pairing needs a significant degree of specificity via a large number of button types but each button type may be present in a small amount. However, it is unlikely that there exist enough different architectural proteins to reach such single-site specificity. A possibility to increase specificity from a small number of proteins is to allow more than one binding site per button ( n site >1). The number of different buttons is then ( n site )!/[( n archi -n site )!( n site )!]. In (C), we fixed n archi =50, an upper maximal number of architectural proteins, and varied n site . For each n site , we plotted the pairing probability for the -value that leads to maximal pairing. We observed that there E a exists an optimal number of sites (here~5) where the pairing is close to the one obtained with the homologous button model. This corresponds to a value that leads to a large diversity of buttons while maintaining a low number of spurious interactions between non-homologous buttons which is of the order of n site / n archi~0 .1.

Supplementary Movies
Supplementary Movie 1 . Polymer simulations of homologous pairing. Example of a 4 hours numerical simulation of the homologous button model ( =60% , =-1.6 k B T) with frames taken every 30 seconds. The E p movie focuses on one pair of homologs (red and blue polymers). Orange and cyan parts of these chains represent their (peri)centromeric regions. Surrounding transparent light-grey chains represent the periodic boundary images of the simulated chains. The black bar measures 1 micron. Homologous loci in closed contact (distance <200nm) are colored in green.

Supplementary Movie 2 .
Polymer simulations of homologous pairing. Example of a 4 hours numerical simulation of the homologous button model ( =60% , =-1.6 k B T) with frames taken every 30 seconds (different E p from the Supplementary Movie 1). The two pairs of homologs are highlighted (red/blue for one pair; purple/dark blue). Orange/cyan and pink/light blue parts of these chains represent their (peri)centromeric regions. Surrounding transparent light-grey chains represent the periodic boundary images of the simulated chains. The black bar measures 1 micron. Homologous loci in closed contact (distance <200nm) are colored in green.

Supplementary Movie 3 .
Representative confocal movie of a live Drosophila embryo (cell cycle 14 to gastrulation) in which MS2 and PP7 loops are integrated at equivalent positions on homologous chromosomes. Examples of nuclei are highlighted whose loci display characteristic dynamics, including loci that do not pair ("Unpaired"), loci that are already paired ("Paired"), and loci that are observed transitioning from the unpaired to the paired state ("Pairing"). Image stacks were taken roughly every 30 seconds and max-projected for 2D viewing.

Supplementary Movie 4 .
Representative confocal movie of a live Drosophila embryo (roughly 4.5 hours old) in which MS2 and PP7 loops are integrated at equivalent positions on homologous chromosomes. Image stacks were taken roughly every 30 seconds and max-projected for 2D viewing.

Supplementary Movie 5 .
Representative confocal movie of a live Drosophila embryo (roughly 5.5 hours old) in which MS2 and PP7 loops are integrated at different positions on homologous chromosomes (MS2 at position 38F and PP7 at position 53F) where we expect no pairing between transgenes. Image stacks were taken roughly every 30 seconds and max-projected for 2D viewing.

Supplementary Movie 6 .
Representative confocal movie of a live Drosophila embryo (cell cycle 14) in which MS2 and PP7 loops were interlaced in a single transgene on one chromosome at polytene position 38F to act as a positive control for pairing. Both GFP and mCherry are co-localized to the same locus in all transcriptional loci. Image stacks were taken roughly every 30 seconds and max-projected for 2D viewing.