Abstract
Measurement of in-vivo chromosome conformations (structures) in single cells is a major technological goal of structural biology. If one could identify many genetic loci in a microscope image despite the limited palette of fluorescent colors used to label them, then the conformation could be solved at some resolution by ‘connecting the dots’. Computational tools for making this reconstruction are expected to produce near-perfect reconstructions when the number of fluorescent colors is high enough, irrespective of the number of loci assayed. Here we report the first experimental test of the performance of these reconstruction algorithms and check their ability to reconstruct experimentally-measured conformations. We also demonstrate the experimental metrics needed to assess reconstruction quality. Our results indicate that current sequential FISH experiments may be close to the point where the reconstructions are nearly flawless at some distance scales.
The in-vivo conformation of chromosomes plays an major role in controlling gene expression [1] and is therefore of great importance to structural biology. Unfortunately there is not yet a direct assay of chromosome conformation in single cells. Until recently, most conformational inferences have been based on contact frequencies between pairs of chromosomal loci as measured by various chromosome-capture methods [2–7]. There is some doubt as to how well contact-based inferences work [8, 9], particularly when the contact frequencies have been averaged over many cells owing to high cell-to-cell variability [10, 11] (single-cell chromosome-capture is possible but technically difficult). Alternatively, fluorescence microscopy can directly measure the locations of a set of chromosomal loci, and ‘connecting the dots’ produces a conformation at the resolution of the locus spacing. The microscopy approach is becoming popular due to several advantages: it can directly measure the spatial locations of fluorophore-labeled chromosomal loci; like chromosome-capture it can assay many chromosomal loci in parallel; and unlike chromosome-capture it is inherently a single-cell method. However, its major weakness is scalability in number of loci: the number of genetic loci that can be identified in an image is usually taken to be equal to the number of different fluorophores that can be distinguished by color, typically ∼ 3 – 5. Recent developments in sequential FISH techniques [12] have pushed the number of uniquely identifiable loci beyond the number of colors, to around 50 in a typical experiment. However, a full measurement of chromosome conformation that resolves kilobase-scale structures will require thousands of loci to be discriminated, which is not yet possible using purely experimental tools.
In order to scale further to the thousand-locus scale, one can use computational tools. This final boost involves labeling many times more loci than can be discriminated by color or round of hybridization, and then computationally disambiguating look-alike spots in the image in order to reconstruct the underlying conformation. This disambiguation works by requiring loci that are nearby on the genome to also be close in three-dimensional space. The nearness requirement is obvious on small scales: for example, two loci spaced by 100 bp can only be physically separated 30 nm or less without the DNA breaking. But such a relationship has also been observed over kilobase- to-megabase scales [13–15], implying that computational reconstructions should also work at these larger scales. Simple reconstructions can be done ‘by eye’ [16], but two computational tools have been developed to auto-mate this process and quantify the uncertainty in large reconstructions: align3d [17, 18] and ChromoTrace [19]. These tools have been tested using simulations but not yet using real-world experimental data.
A major advantage of reconstruction algorithms is that once the point is reached where adding more loci does not affect their density in the image, reconstruction performance should plateau. In other words, at this point reconstructions can be scaled up indefinitely, with little performance penalty. In this regime, reconstruction quality seems to be determined mainly by the number of what we term ‘pseudocolors’, which we define as groups of loci that can be distinguished purely experimentally. In a simple imaging experiment, the number of pseudocolors is simply the number of different fluorophores available: an experiment labeling some loci with red fluorophores and others with green fluorophores has 2 pseudocolors, since the red spots in an image can be discriminated from the green spots but not the other red spots, and vice versa. In a sequential FISH experiment the number of pseudocolors is the number of fluorophores times the number of rounds of hybridization: for example if 2 fluorophores are used over 5 rounds of hybridization, then labeled loci can be grouped into 10 categories (pseudocolors) when considering both color and round of hybridization. Simulations indicate that if labeling experiments use enough pseudocolors, reconstruction algorithms produce near-perfect results irrespective of the size of the reconstruction [18, 19].1
Here we describe the first tests of computational reconstruction algorithms using real-world labeling data, including both blind tests of the reconstructions (where the answer is genuinely unknown) and tests using published conformations where we have added ambiguity using a recoloring scheme (so that the reconstructions could be checked). Our immediate goal was to determine the performance on actual, as opposed to simulated or inferred, chromosomal conformations, and to validate as the accuracy of the reconstruction quality metrics that are produced by these analyses. In particular, the recolored data sets let us begin to address the question of whether current experiments have sufficient pseudocolors to solve the reconstruction problem; however recoloring produces fewer pseudocolors than can be attained experimentally, so we only directly test experiments that are less than state-of-the-art. As a secondary result, these recolored data sets allowed us to accurately model the distance function between pairs of loci, which itself gives clues to the biology of chromosome structure.
Results
10-locus experiment
For our first reconstruction experiment, we fluorescently labeled 10 genomic loci over a 4 Mb stretch of chromosome 4 in fixed human cells (GM12878), using one of 3 spectrally distinct fluorophores to hybridize each locus (see Figure 1A). The cells were then imaged to localize the fluorescent labels in 3 dimensions, as illustrated in Figure 1B. There were several identical-looking spots in each color channel of the image, and the only way to identify these spots with specific loci was to compare the distances between them with the likelihoods of finding such distances given the known spacing of genomic labels, a relationship we call the interlocus distance function. In this experiment, we did not measure the distance function directly, but rather fit a Gaussian chain model [20] to the relationship between mean spatial distance between loci and their genomic separation found in Ref. [13], as illustrated in Figure 1C. Using the labeling pattern, the spot locations in each image, and the distance function, the reconstruction algorithm align3d produced likelihoods of assigning each spot in an image to each possible labeled locus of the same color, shown schematically as circles in Figure 2A. These ‘mapping probabilities’ can be used to infer likely conformations (Figure 2B) in which the level of confidence in each spot assignment is known.
The true conformations were unknown, so the reconstruction quality could not be measured directly. However, we were able to test the null hypothesis of no significance by comparing each reconstruction with several ‘control’ reconstructions produced from identical inputs except that the imaged spot colors were randomly scrambled (see Figure 2C). Specifically, we compared two figures of merit between the true reconstruction and the controls: 1) the uncertainty in the spot assignments based on the mapping probabilities that we term ‘entropy’ (denoted by ‘S’), and 2) the total statistical weight over all possible conformations (denoted by Z). The expectation is that the entropy S is lower in the true reconstruction than the controls, and that Z (which roughly measures the ease with which the chromosome can be fit through the spots in the image) should be higher for the true mapping than the control mappings. We also looked at an adjusted Z that subtracts, in an approximate way, an unphysical contribution coming from the calculation (see Methods for details). The top row of Figure 2D shows the results of this test aggregated over all 42 chromosomes we measured. The bottom row shows the same tests applied to the 15 highest-quality chromosomes, selected as having 8-12 spots in the image and being more than 200 nm thick perpendicular to the slide (their width in the imaging plane was typically ∼ 1 µm). In all metrics the true reconstructions outperformed the controls, as expected. Z was a more sensitive measure of significance than S, particularly when adjusted for artifacts of the calculation. Despite the much smaller sample size of high-quality chromosomes, the overall significance of their reconstructions against the null hypothesis was higher. Chromosomal thickness was an important determinant of quality (see Supplementary Figure S1), showing the importance of maintaining cell thickness with non-dehydrating fixation protocols in future experiments.
Analysis of recolored published conformations
Our 10-spot experiments had two main limitations: 1) the distance function was not measured directly within the experiment, and 2) the reconstruction quality could not be checked directly since the true conformation was unknown (as would be the case in a practical use of our method, but ideally not in a method demonstration). To circumvent both these limitations, we reanalyzed two sequential FISH data sets that were published soon after our experiment, in which several tens of loci were uniquely identified and localized using variants of the protocol known as Oligopaints [12]. The first data set, which we refer to simply as the Oligopaints data [12], measured conformation at the megabase scale over whole chromosomes. The second data set, using the Oligopaints variant called ORCA [21], probed conformation at kilobase scales over small regions of chromosomes.
Oligopaints and ORCA are significant for us in three ways. First, these methods greatly increase the number of pseudocolors available to a reconstruction algorithm, potentially allowing the reconstruction of very large-scale conformations using hundreds or thousands of loci. Second, the interlocus distance functions could be measured precisely in Oligopaints data, since the locus positions were measured unambiguously. Third, by ‘recoloring’ the Oligopaints data to introduce ambiguity (see Figure 3), we could construct synthetic images requiring reconstruction, and then check the reconstruction quality using the known true conformations.
Reconstructions at megabase scales (Oligopaints)
The first recolored data set we analyzed was that from the Oligopaints paper [12], which labeled TADs in several chromosomes of human cells. We were most interested in label spacing in the ∼ 0.5 Mb range, because at that scale adjacent loci should be near enough for reconstructions to work, but far enough that one should not need special superresolution techniques even when labeling many loci per pseudocolor. To obtain this rough label spacing, we restricted our analysis to the most densely-labeled parts of the Oligopaints data sets: one region in chromosomes 21 having 28 loci, and another region in chromosome 22 having 9 loci. Although the tiny chromosome 22 recolored conformations are probably not representative of future Oligopaints experiments, we think those reconstructions may be relevant for future live-cell experiments with limited pseudocolors.
Our first step was to use the Oligopaints data set to determine the interlocus distance function for these experiments and locus spacings. Our baseline assumption was that the distribution of locus separations would be described by the diffusive Gaussian chain model. However, we found somewhat more pairs of loci at large separations than predicted by the Gaussian chain model. This excess at large separations was well modeled by an exponential distribution in the interlocus separation R. As shown in Supplementary Figures S2 and S3, our distance function model was a sum of a Gaussian chain whose width scaled with interlocus distance according to an empirically-determined power law, and a exponential function that did not change with interlocus distance.
Next, we randomly recolored the Oligopaints data sets using 3, 5 and (for chromosome 21) 10 pseudocolors, and then passed the recolored images to align3d to obtain mapping probabilities for assigning spots to labeled loci. We also reconstructed 9 color-scrambled control images per ‘true’ recolored image, to compare the rank of the true image relative to the controls using various statistics of the reconstruction process. Note that the align3d calculation can only be done approximately for large calculations, so to simulate that we used a ‘baseline’ approximate calculation for all runs described here (see Methods section for details). Two statistics (entropy S and the adjusted statistical weight Z) proved sensitive in identifying the true reconstructions, particularly when 5 or more pseudocolors were used (see Supplementary Figures S4 and S5).
We measured the quality of the probabilistic align3d output using the metric of ‘unrecovered information’ I, which is the Shannon information required to specify the correct assignments given the locus-to-spot mapping probabilities. In the recolored experiments the true spot assignments were known, so I could be determined from the mapping probabilities, but in a real experiment one would approximate as I ≈ S since entropy S is simply a weighted average of information I over the mapping probabilities. A mapping that confidently assigns spots to their correct loci will have S ≈ I ≈ 0; if the spot assignments are very uncertain then S ≈ I log(N/C) for N loci per C pseudocolors; and if the spot assignment is confident but wrong then S is small but I → ∞. Figure 4 (left-hand panels) plots unrecovered information I versus entropy S over the various recolored Oligopaints reconstructions. These plots show that when the true mapping has significantly lower entropy than the controls, then I ≈ S, which is critical because it means that an experimenter can blindly gauge the quality of a reconstruction. We note that poor mappings send I ≫ S, probably because unrecovered information is very sensitive to errors: a single mistaken assignment of zero probability to a correct locus assignment sends I → ∞ but does not affect S. Next, we used each set of mapping probabilities to guess a likely conformation, using a heuristic formula that selects the most likely mapping probabilities and iteratively resolves conflicts where two loci map to the same spot. The right-hand panels of Figure 4 show the quality of the inferred conformations as measured by the rate of locus assignment errors, and show that this error rate can also be predicted by S although the relationship is weaker.
Reconstructions at kilobase scales (ORCA)
Next, we tested our reconstruction performance using the fine-scale conformations found in the three ORCA data sets [21], whose labeled loci were spaced at 2, 3 and 10 kb respectively. Since the locus spacing was uniform we used the entire conformations for recoloring and reconstructions. First, we examined the distances between nearby loci to determine the interlocus distance function at small scales (see Supplementary Figures S6, S7 and S8). As before, the Gaussian chain model was adequate for the bulk of the distribution, but again a long tail indicated that nearby loci would take long excursions more often than a Gaussian chain would predict. As before, the residuals after subtracting the Gaussian chain distribution followed an exponential distribution. Thus we fit the 3 kb and 10 kb data sets using a Gaussian-plus-exponential distribution. At the smallest separations a second exponential was needed to provide a good fit.
We randomly recolored each of the three ORCA data sets using 3, 5, 10 and 20 pseudocolors, produced locus- to-spot mapping probabilities using align3d and inferred explicit conformations using those mapping probabilities, as before. The quality of the mapping probabilities and the locus assignments of the explicit conformations are shown in Figure 5, and their rankings relative to controls are shown in Supplementary Figures S9, S10 and S11. As before, when the true reconstruction outperforms controls, the entropy S accurately measures both unrecovered information I and locus-assignment errors, which are closely-related error measures (see Supplementary Figure S12). These reconstructions performed no-ticeably worse than the Oligopaints reconstructions, as measured by the greater values of unrecovered information I and error rates in the conformations for the same number of pseudocolors. There are several likely reasons for this. 1) The recolored ORCA data had more labeled loci than the recolored Oligopaints data. 2) The regular spacing of the ORCA labels made reconstruction more difficult. (In the Oligopaints data sets, irregularity in label spacing and color ordering are the two characteristics that help identify genetic loci). 3) The occasional long-distance excursions taken by loci (corresponding to the exponential tail in the distance function) are more significant at small scales. The farther these excursions go relative to a typical displacement between two adjacent loci, the more difficult reconstructions become, since the volume where neighboring loci may be found grows to en-compass many more competing loci. In support of this last explanation, we note that the 3 kb and 10 kb data sets gave better results than the 2 kb data set.
Discussion
Conformation capture experiments such as Hi-C provide much useful information about genome architecture, but they are not an ideal tool for measuring explicit chromosomal conformations, for several reasons. First, most Hi-C data comes from large cell populations whose conformations are likely very dissimilar: a ‘consensus conformation’ deduced from these experiments is mostly meaningless [8], especially given that the structures visible in population-averaged data only exist in a small fraction of cells [10, 11]. Single-cell chromosome-capture experiments are possible but they have lower resolution [10]. Second, these experiments require deep sequencing, which is expensive. Third, even single-cell contact-based structural inference has the overfitting problem where many continuous degrees of freedom are adjusted to fit noisy contact data: the resulting conformations may plausibly fit the data, but the uncertainty in these guesses is completely unknown. Finally, some important information about chromosomal positioning is inevitably absent from contact data, the most obvious examples being the location, orientation and ‘parity’ of the chromosomes.
By contrast, image-based chromosomal reconstructions avoid many of the problems of Hi-C reconstructions, as our results demonstrate. Fluorescence imaging uses cheap reagents, is widely available and is inherently a single-cell assay. The reconstruction method we demonstrate factors in the various types of experimental error. It produces definite conformations where the data is unambiguous, and gives the spectrum of possible conformations with accurate likelihoods where the reconstructions have un-certainty (even when the calculation is only approximate – see Supplementary Figure S14). The inference itself requires finding a finite set of discrete mapping variables, rather than a continuum of real-valued parameters as in the case of Hi-C reconstructions: thus it is far less underdetermined than Hi-C reconstructions, and its output is able to consider all possible solutions and properly weight their likelihoods. We believe our permutation tests are practically immune to known and unknown systematic effects when the label colors are chosen randomly. In turn, when a reconstruction is significant vis-á-vis the permutation tests, the specific error rate in the reconstruction can be accurately determined from the entropy S. For these reasons we believe that conformational reconstructions from microscopy have much better-understood error bounds than those from Hi-C data.
Refs. [18] and [19] argue that once experiments attain sufficient pseudocolors, the conformational inference problem should be essentially solved for reconstructions of any size, and that there is little purpose in improving experimental techniques further. The question is whether we have already reached the required number of pseudocolors with state-of-the-art sequential FISH protocols. We cannot yet give a complete answer from the recolored sequential FISH data, because while those methods are capable of ∼ 50 pseudocolors, the data sets localized only ∼ 50 loci so our recolored data sets necessarily used far fewer than 50 pseudocolors. A full experiment labeling hundreds or thousands of loci using the full 50 pseuodcolor palette could perform worse or better than our 20-pseudocolor results: the density of competing spots will increase (if for no other reason than the fraction of spots on the boundary will drop), but on the other hand the number of pseudocolors will be much higher. However, our 3- and 5-pseudocolor results should directly correspond to the reconstruction performance in live-cell experiments where only 3 − 5 pseudocolors are available. One result we are already confident in even for static experiments is that kilobase-scale reconstructions will require more pseuodcolors than megabase-scale reconstructions having the same number of loci, since the long tail in the distance function increases the search radius relatively much more when looking at small scales.
The interlocus distance functions themselves reflect interesting biology. Our models are consistent with chromosomal DNA taking a simple random walk polymers occasionally punctuated by long-distance excursions, having an exponentially-decaying likelihood with distance that reflects some unknown biology. One might expect that long stretches of DNA should undergo more excursions than short ones, but surprisingly the interlocus distance functions we fit show that the rate of excursions is constant no matter the length of intervening DNA. Our explanation is that the DNA is initially bundled as a random-walk polymer, and subsequent processes in the cell cause certain genomic regions to become stretched away from their initial position without affecting DNA on either side of a given affected region. Thus when measuring the positions of two loci, an excursion is only registered if one of the two loci happens to be in the part of DNA being stretched: interior stretches ‘leave and come back’ between the loci and are not registered. The presence of two exponential functions in the small-scale 2 kb ORCA data set is consistent with the fact that one or both endpoint loci can be undergoing an excursion, and the coarser data sets most likely only register excursions taken by both endpoints.
Several experimental lessons come out of our early results. First, having fully three-dimensional conformations is not only important for preserving genomic integrity, but also essential for obtaining accurate locus-to-spot inferences. In the flattest cells of our original experiment, the collapse of the out-of-plane dimension handicapped our analysis in the same way that simply ignoring the z-coordinate information in our spot data would. Second, it is important for the experimental color channels to be aligned, as this significantly affects the spacing between spots of different colors (and hence pseudocolors). Third, it is important to measure the interlocus distance function within each experiment, ideally by hybridizing set of probes labeling 1 locus per pseudocolor to a parallel preparation of cells. A good example showing why this is important is the difference in distance functions between the ORCA 2 kb and 10 kb data sets versus that of the ORCA 3 kb data set. Fourth, we find slight evidence that certain random labeling patterns are slightly better than others (see Supplementary Figure S13) though the effect is slight, but we did not look into nonrandom labeling patterns.
The computational protocols we developed here will be useful in the reconstruction of much longer and higher-resolution conformations. Computation time for all reconstructions in this paper including controls was only ∼ 3 processor-days, so if experiments can scale to assaying thousands of loci per cell, then the reconstructions easily can too. If the observed trend in quality versus number of pseudocolors extends to the regime of ∼ 50 pseudocolors, then large-scale conformational reconstructions are already possible using experimental techniques which are becoming routine.
Methods
Interphase nuclei were obtained from lymphoblast cell lines (GM12878). FISH experiments were performed using human fosmids containing non-duplicated region directly labeled by nick-translation with Cy3-dUTP (PerkinElmer), Cy5-dUTP (PerkinElmer), and fluorescein-dUTP (Enzo) as described by Ref. [22], with minor modifications. Briefly, 300 ng of labeled probe were used for the FISH experiments; hybridization was performed at 37 ° C in 2x SSC, 50% (v/v) formamide, 10% (w/v) dextran sulphate, and 3 mg sonicated salmon sperm DNA, in a volume of 10 ml. High stringency post-hybridization washing was performed at 60 ° C in 0.1x SSC three times.
Nuclei were simultaneously DAPI-stained. Digital images for 2D-FISH were obtained using a Leica DMRXA2 epifluorescence microscope equipped with a cooled CCD camera (Princeton Instruments). DAPI, Cy3, Cy5, and fluorescein fluorescence signals, detected with specific filters, were recorded separately as grayscale images. Coloring and merging of images were performed using ImageJ. In total, 21 interphase cells were scored.
Spots were located and their x/y centers were recorded manually from the deconvolved image stack. (The positional error introduced by deconvolution was found to be about 30 nm.) The z position of each spot was found to sub-stack-spacing resolution by fitting the intensity profile of each spot’s x/y pixel over the image stack to a Gaussian. Each spot on the color-calibration slide was localized in all color channels, and the difference of each channel from the bead was used to build a linearly-interpolated chromatic error model. This model was used to correct the localizations of the single-color fluorophores on the 3-spot and 10-spot slides. Lastly, spots were grouped into 2 chromosomes on each of the 10-spot slides.
Locus-to-spot mapping probabilities for each chromosome were produced using the align3d tool [17], which is available for download at https://github.com/heltilda/align3d. For the 10-spot experiment we computed the mapping probabilities exactly (to within the accuracy of the distance function we provided), by enumerating each possible conformation consistent with the data and weighing it according to the distance function. For the recolored experiments, we wanted to simulate the performance of a much larger reconstruction where such enumeration is impossible, so we used the lowest-order approximation (denoted Z in Ref. [17] or Z0 in Ref. [18]) that optimized spot penalties to approximate the true mapping probabilities.
In the original align3d paper the total statistical weight Z contains an unphysical contribution from spot penalties and skipped-spot penalties, which are part of the calculation and unrelated to the ease of fitting physical contours through the spots. For purposes of computing rankings of true reconstructions versus controls, we considered an ‘adjusted Z’ which removes those factors in an approximate way by calculating the following quantity:
Acknowledgments
We gratefully acknowledge support and helpful discussions from Andrew Laszlo, Joshua Vaughan, Leila Garcia, Min Fang, Julia Sidorova, Bob Kao, Jenny Mae Samson, Jens Gundlach, and Huy Nguyen. This work was supported by the Boettcher Foundation (J.C.), NIH grant 2T15LM009451 (B.R.), and a Cancer League of Colorado grant (B.R.).
Footnotes
↵* E-mail: brian.ross{at}ucdenver.edu
↵1 Ref. [18] argues that reconstruction quality depends on a competition between the number of pseudocolors and the density (not number) of spots in the image, and that the number of pseudocolors required for near-perfect results is on the order of the number of competing loci that are spatially as close to a given locus Li as either of its neighbors Li−1 or Li+1.