Abstract
Meiotic crossovers facilitate chromosome segregation and create new combinations of alleles in gametes. Crossover frequency varies along chromosomes and crossover interference limits the coincidence of closely spaced crossovers. Crossovers can be measured by observing the inheritance of linked transgenes expressing different colors of fluorescent protein in Arabidopsis pollen tetrads. Here we establish DeepTetrad, a deep learning-based image recognition package for pollen tetrad analysis that enables high-throughput measurements of crossover frequency and interference in individual plants. DeepTetrad will accelerate genetic dissection of mechanisms that control meiotic recombination.
Main
Meiosis consists of two consecutive nuclear divisions and produces four haploid gametes from a single diploid cell in sexually reproducing eukaryotes1. In Arabidopsis male meiosis, ∼200– 250 meiotic DNA double-strand breaks (DSBs) are induced in the genome by a DNA topoisomerase VI-like complex to initiate meiotic recombination2–4. Of these DSBs, only ∼8– 11 are repaired as crossovers (COs) using a homologous chromosome (homolog). Thus, male meiosis in the Arabidopsis genome, which comprises five chromosomes, results in an average of ∼1.8 crossovers between homologs. This low number suggests the existence of mechanisms that limit crossovers, a phenomenon that is observed in most eukaryotes2. Meiotic DSB and CO frequencies are controlled by genetic and epigenetic factors and are non-randomly distributed along chromosomes, with higher levels around gene promoters and terminators and lower levels across the centromeres5–7.
At least two pathways (Type I and Type II), contribute to CO formation2,3. The Type I pathway leads to interfering COs that prevent the coincident occurrence of closely spaced CO on the same pair of chromosomes2,8,9. In plants, interfering COs represent ∼80–85% of total COs and are dependent on the ZMM proteins (ZIP4, MSH4, MSH5, MER3, HEI10, SHOC1, PTD, MLH1, MLH3). The remaining ∼10–15% of COs are non-interfering and occur via the Type II pathway10. Non-interfering COs are resolved by the MUS81 endonuclease and are restricted by anti-recombination factors such as FANCM, RECQ4A, RECQ4B, and FIGL111–15. Disruption of anti-recombination factors can increase the number of Type II COs in plants, which has the potential to create new combinations of desirable alleles that can improve crop varieties13,16. Therefore, high-throughput detection and understanding of CO frequency and interference have important implications for our understanding of the control of meiotic recombination as well as for breeding.
In Arabidopsis, CO frequency and interference can be measured by pollen tetrad analysis using Fluorescent-Tagged Lines (FTLs) in the quartet1 (qrt1) background17,18. Mutation of the QRT1 gene encoding a pectin methylestrase results in the four pollen products of male meiosis remaining attached to one another, allowing classical tetrad analysis. Each FTL has a transgenes that expresses eYFP (Y), dsRed (R) or eCFP (C) fluorescent proteins in mature pollen using the post-meiotic LAT52 promoter. Genetic intervals bounded by transgenes expressing different colors (e.g. I1bc, I1fg, I2ab, I2fg, I3bc, CEN3, I5ab; Supplementary Fig. 1) can be created by crossing FTLs. Scoring the segregation of 2 or 3 linked markers enables CO frequency and interference to be measured. For example, plants that are hemizygous for the three markers (YRC/+++) that define the I1bc interval produce 12 pollen tetrad classes (A–L) depending on the number of COs between YR and RC (Supplementary Fig. 2). The relative segregation of any two markers can be used to place pollen tetrads into one of the three categories used for classic tetrad analysis: parental ditype (PD), tetratype (T) and non-parental ditype (NPD) (Supplementary Fig. 3). Tetrad analysis enables the calculation of map distances between pairs of markers, and measurement of CO interference between adjacent intervals.
T-DNA positions of FTLs originally generated in qrt1 Col-0 plants are displayed on the Arabidopsis genome. Each FTL of homozygous genotype for fluorescence T-DNAs was crossed to qrt1, and pollen tetrads of F1 plants were used to measure CO frequency and interference by DeepTetrad. Chr = chromosome.
Twelve tetrad classes (A-L) are generated from three-color tetrad assays18. According to the position and segregation of three T-DNAs expressing eYFP, dsRed and eCFP, non-crossover(A) and recombination tetrad classes (B-L) are determined in FTLs (YRC/+++). For each tetrad type, merged, red, yellow and cyan images are displayed. Bar graphs indicate the output of DeepTetrad classification. In the bar graphs, X axis labels indicate four pollens (1, 2, 3, 4) per tetrad and Y axis labels show the intensities of three-color fluorescence in the tetrad images.
Tetrad classes of PD (parental ditype), T (tetra type), and NPD (non-parental ditype) are generated from two-color tetrad assays of FTLs (RY/++). In the bar graphs, X axis labels indicate four pollens per tetrad and Y axis labels show the intensities of two-color fluorescence in the tetrad images.
Visual analysis of pollen tetrads is a powerful method for measuring genetic distance and crossover interference in Arabidopsis12,14,15,17,18. For example, manual analysis using fluorescence microscopy has been used to measure interference by comparing the map distances of two-color FTL intervals with and without COs in an adjacent interval18. However, manually scoring large numbers of tetrads is laborious and time consuming. Alternatively, FTLs in the qrt1/+ or QRT1 background can be analyzed by flow cytometry, which allows rapid measurement of CO frequency and interference of ∼10,000 single pollen grains per plant19–21. Unfortunately, the flow cytometric method is unable to detect double crossovers within single intervals, requires high purity pollen samples, and uses specialized equipment for three-color measurements. As an alternative, we have developed DeepTetrad (https://github.com/abysslover/deeptetrad), a deep learning-based image recognition package that enables quick, high-throughput, automated pollen tetrad analysis that can be used with existing FTL lines.
To develop DeepTetrad, we adapted the Mask Regional Convolutional Neural Network (Mask R-CNN), integrating a deep residual network (ResNet) backbone for image recognition to detect four-pollen tetrads with and without fluorescence22,23 (Fig. 1). First, DeepTetrad must precisely recognize pollen tetrads in bright-field pollen images, which include not only tetrads but also triads, dyads and monads (Fig. 1 a–c). DeepTetrad was assembled with two separate Mask R-CNN processes, using a ResNet-FPN backbone to generate masks of the bright-field pollen images (Fig. 1a)23,24. Then, DeepTetrad was trained to detect whole tetrad, triad, or dyad images via a Tetrad Segmentation Model with a ResNet depth of 101 layers. In parallel, we also trained DeepTetrad to detect every single pollen cell within tetrads, triads, dyads, and even monads via a Pollen Segmentation Model with a ResNet of depth 50 layers (Fig 1a). We used Keras and TensorFlow backends for training, with input bright-field images of pollen tetrads from FTLs25,26.
a, Masking and training of tetrad-like and pollen images by DeepTetrad. Two separate DeepTetrad segmentation models make masks of tetrad-like and single pollens, respectively. b, Generation of masks from tetrad-like and single pollen images by DeepTetrad. c, Recognition and selection of measurable tetrad masks by DeepTetrad. Black dots represent the centroid assigned to each pollen mask in monads, dyads, triads, and tetrads. d, Tetrad classification by DeepTetrad. Bright-field (1), red (2), yellow (3), cyan (4)-filtered tetrad images, tetrad mask (5), single-pollen masks (6), three-color merged tetrads (7) and DeepTetrad output are displayed. In the bar graphs of DeepTetrad output, X axis labels indicate four pollens per tetrad and Y axis labels show the intensities of three-color fluorescence in the tetrad image. Scale bar = 0.1 mm (a, b, d, left), 0.05 mm (d, right). The colors in the tetrad mask images (a, middle panel, b, right panel, d, (6)) do not correspond to fluorescence colors.
When trained, DeepTetrad can produce masks of both tetrad-like (tetrads, triads, dyads) and single pollen-like objects from bright-field images of pollen tetrads (Fig. 1b). Next, DeepTetrad assigns a centroid to each pollen mask. Based on the position and distance between centroids of pollen masks within each tetrad-like mask, DeepTetrad recognizes measurable tetrads comprising four detectable pollen grains in the bright-field images (Fig. 1c). DeepTetrad‘s tetrad classifier then determines a tetrad type from a choice of 12 classes (A–L) for three-color assays (Supplementary Fig. 4), or from a choice of three types (PD, T, NPD) for two-color assays (Supplementary Fig. 5), according to the segregation pattern and intensity of fluorescence (yellow, red, cyan) in the four pollen masks per tetrad mask (Fig. 1d). CO frequency and interference can then be calculated using the frequency of tetrads in each class18. Because DeepTetrad is able to recognize single pollen grains and classify their fluorescence in tetrads, triads, dyads and monads, we developed the DeepMonad package by subclassing DeepTetrad. Like flow cytometry analysis, DeepMonad can measure crossover frequency and interference in FTLs by analyzing images of single fluorescent pollen grains in the qrt1/+ and QRT1 backgrounds19. In addition, DeepMonad can analyze single grains within tetrads, which allows comparison of genetic distances and interference calculated by DeepTetrad and DeepMonad (Fig. 2).
DeepTetrad recognizes measurable tetrads from a large number of tetrads, and classifies the tetrads as A (non-crossover, NCO) or B–L (recombinant tetrad classes with one or two crossovers in interval 1 and 2 (i1 and i2)) of FTL (RYC/++). Tetrad classes are B, single crossover interval 1 SCO-i1, C, SCO-i2, D, two strand double crossover 2st DCO, E, 3st DCOa, F, 3st DCOb, G, 4st DCO, H, NPD-i1 NCO-i2, I, NCO-i1 NPD-i2, J, NPD-i1 SCO-i2, K, SCO-i1 NPD-i2, and L, NPD-i1 NPD-i218. Each panel (A-L) shows bright-field (upper left), single-pollen mask (upper middle), merged-fluorescent (upper right), and single-color fluorescent (lower three) images. The colors in the tetrad mask images do not correspond to fluorescence colors. In the bar graphs, X axis labels indicate four pollens per tetrad and Y axis labels show the intensities of three-color fluorescence in the tetrad images. Scale bar = 0.1 mm.
DeepTetrad recognizes measurable tetrads (left panels) and classifies them as PD (parental ditype), T (tetra type) and NPD (non-parental ditype) of tetrad types according to segregation of fluorescence in FTL-CEN3 (YR/++) (middle panels). Each panel in the middle shows bright-field (upper left), single-pollen mask (upper middle), tetrad mask (upper right), merged-fluorescent (lower left), yellow (lower middle) and red (lower right) fluorescent images. The colors in the tetrad mask images do not correspond to fluorescence colors. In the bar graphs, X axis labels indicate four pollens per tetrad and Y axis labels show the intensities of two-color fluorescence in the tetrad images. Scale bar = 0.1 mm.
a, A quick tetrad preparation method for high-throughput imaging of tetrads. The detail procedure was described in Methods. b, Plot showing measurement of CO frequencies (cM) in single intervals of FTLs. Genetic distances of single intervals were measured by DeepTetrad, manual counting, flow cytometry and DeepMonad. c, Plot showing measurement of CO frequencies (cM/Mb) in single intervals of FTLs. A horizontal red line indicates the male chromosome average crossover rate. d, Plot showing measurement of genetic distances in various sized tetrad images by DeepTetrad. Different magnifications were applied to the same tetrad samples for imaging (Supplementary Fig. 6). e, Measurement of CO interference by DeepTetrad. The CO interference ratio (σ= Xi1 without adjacent CO/ Xi1 with adjacent CO) was measured by DeepTetrad (DT) and manual counting (MC), highlighted in blue. Interference value (I=1-coefficenct of coincidence) in yellow, was calculated by flow cytometry (FC) and DeepMonad (DM). A value of 1 and 0 indicates no interference in σ and I, respectively. The values of interference in other FTLs were measured by DeepTetrad and DeepMonad. f, The CO interference ratio in recq4a recq4b figl1 plants. DeepTetrad shows that recq4a recq4b figl1 causes interference to be absent, increasing crossover frequency in FTL-I2ab.
Since DeepTetrad does not require specialized equipment (like flow cytometry), we developed a quick, simple method to prepare a large number of pollen tetrads for high-throughput imaging (Fig. 2a). This method allows extensive image sets of pollen tetrads (bright-field, red, yellow and cyan) to be obtained, which can then be analyzed quickly and simultaneously by DeepTetrad (Supplementary Table 1). We used this technique to measure genetic distances in two-color FTL intervals (CEN3, I1b, I1c, I1b-c, I1f, I1g, I1f-g, I2f, I2g, I2f-g, I3b, I3c, I3b-c, I5a, I5b, I5a-b) using DeepTetrad (Fig. 2b, Supplementary Fig. 1). The genetic distance values obtained this way were similar to those obtained using manual tetrad counting, flow cytometry and DeepMonad (Fig. 2b, Supplementary Table 2-4). Intriguingly, our DeepTetrad analysis showed higher crossover frequencies for long intervals (I1b, I1c, I1b-c, I3b-c, I5a, I5b, I5a-b) compared to DeepMonad single pollen analysis; this was because DeepTetrad, but not DeepMonad, detects double crossovers in long intervals (Fig. 2b, Supplementary Table 2-4). In addition, the CEN3 interval which spans the centromere on chromosome 3 had a lower CO rate (2.21 cM/Mb) than the overall male chromosome average CO frequencies (4.77 cM/Mb), and intervals I2f, I2g and I2f-g which are close to the telomere had higher CO frequencies (10.19 cM/Mb, 10.71 cM/Mb and 10.23 cM/Mb, respectively) (Fig. 2c, Supplementary Fig. 1), consistent with prior observations6,7,27. DeepTetrad can also recognize tetrad images taken at different magnifications, and produces consistent CO frequency irrespective of scale (Fig. 2d, Supplementary Fig. 6).
Tetrad images were taken using an epifluorescence microscope at different magnifications (a– e) (25x, 32x, 40x, 50x, 80x) under bright-field, RFP, YPF and CFP filters. Scale bar = 0.1 mm.
DeepTetrad involves the preparation of a large number of tetrads using a quick and simple method, and increases the speed of tetrad analysis to obtain data from many individual plants by analyzing all tetrad images simultaneously. Abbreviations: CO, crossover; DCO, double crossovers.
In FTLs with T-DNAs expressing eYFP (Y), dsRed (R) and eCFP (C), the physical distances and genetic distances are shown18. Genetic distances were measured by DeepTetrad, manual counting, flow cytometry6,19–21,31, and DeepMonad. The results of statistical analyses (mean of cM, 95% confidence interval, P-value) on the genetic distances measured by different methods are shown. The P-values of significantly different crossover frequency are marked by asterisks. A P-value is not calculated when assumptions for the statistical t-test are violated. Abbreviations: DT, DeepTetrad; MC, manual counting; FC, flow cytometry; DM, DeepMonad; FTL, fluorescent tagged line.
Abbreviations: CO, crossover; non-crossover, NCO; single crossover, SCO; double crossover, DCO; st*, stand; NPD, non-parental ditype; σ, interference ratio. σ= Xi1 (with adjacent CO)/ Xi1 (w/o adjacent CO). X i1 is the map distance of the first interval (i1) generated from the Perkins equation ((1/2*T)+3*(NPD)/total). A σ value of 1 indicates no interference. The letters A-L represent tetrad classification as described previously18.
Abbreviations: NPD, non-parental ditype; T, tetra type; FTL, fluorescence tagged line.
To demonstrate DeepTetrad‘s utility for measuring CO interference we analyzed tetrad images from the three-color (YRC/+++) FTL interval-I1bc (Fig. 2e, Supplementary Fig. 1). Previously, interference had been measured in manually counted tetrads by calculating the interference ratio (defined here as σ) of the map distance of an interval (i1) in tetrads that have a CO in an adjacent interval (i2) with the map distance of the same interval (i1) in tetrads that lack a CO in i218. Analysis of 18,584 tetrads with DeepTetrad resulted in a σ value of 0.33 for I1bc which is consistent with the σ value of 0.36 obtained by manually counting 923 tetrads (Fig. 2e, Supplementary Table 2-5). Flow cytometry has also been used to measure interference in fluorescent-tagged pollen monads by calculating the ratio of observed double COs to expected double COs (I=1-coefficient of coincidence, the ratio of DCOobs to DCOexp)19. An analysis of 74,336 monads (converted from tetrad images) using this method with DeepTetrad resulted in an interference value of 0.54 for I1bc (YRC/+++) which was consistent with a value of 0.56 obtained from flow sorting 135,789 monads (Fig. 2e)19. To provide baseline values for future studies we also used DeepTetrad and DeepMonad to measured σ and I values in 4 other 3-color FTL intervals (I1fg, I2fg, I3bc and I5ab; (Fig. 2e, Supplementary Table 2 and 3). Previously, it was shown that frequency of Type II COs increases in fancm single mutants, as well as recq4a recq4b figl1 triple mutants, leading to an absence of interference (I=0, σ =1)12,14,15,19. Using DeepTetrad, we found that the σ value of FTL-I2ab in recq4a recq4b figl1 plants is 1, indicating no detectable interference, consistent with the prior observations (Fig. 2f, Supplementary Table 2 and 3). Taken together, our data demonstrate that DeepTetrad is a useful deep learning-based image recognition package for high-throughput measurements of both CO frequency and interference.
Abbreviations: CO, crossover; non-crossover, NCO; single crossover, SCO; double crossover, DCO; st*, stand; NPD, non-parental ditype; σ, interference ratio. σ= Xi1 (with adjacent CO)/ Xi1 (w/o adjacent CO). X i1 is the map distance of the first interval (i1) generated from the Perkins equation ((1/2*T)+3*(NPD)/total). A σ value of 1 indicates no interference. The letters A-L represent tetrad classification as described previously18.
The FTL-based visual tetrad assay has been used extensively in studies of plant meiosis12,17,18. The application of flow cytometry to FTLs allowed rapid, high-throughput measurement of CO frequency and interference19,21. Here, we have extended the utility of FTLs further by developing DeepTetrad to enable quick, simple, and automated tetrad analysis. DeepTetrad will accelerate genetic analysis of meiotic recombination mechanisms as well as the influence of epigenetic and environmental effects.
Methods
DeepTetrad Network Architecture
DeepTetrad assembles two separate Mask Regional Convolutional Neural Network (Mask R-CNN) for the instance segmentation task23. They generate masks of pollen objects and tetrad objects, respectively, from input bright-field images. As backbone architectures, which are responsible for feature detection, we used deep residual networks (ResNet) of depths 50 and 101, with a feature pyramid network (FPN)28. We use the same terminology and definitions as those used in the Mask R-CNN article23 when describing the backbone; ResNet-50-FPN. Multi-task loss L is also defined in the same manner.
, where Lcls is classification loss. Lbox is loss of bounding box v=(vx, vy, vw, vh), which is a rectangle defined by coordinates of the upper-top vertex (x, y), and the dimension (width w, height h). Lmask is mask loss. The i-th mask of bounding box v, which is the core feature in DeepTetrad, is a grayscale image defined by the following logical predicate.
, in which the pixel(x, y) function returns the pixel value of given coordinates in an image.
Mask segmentation is a multi-label classification. Hence Lmask must be calculated independently for each class in a single image. It is achieved by applying a sigmoid function to each pixel, from which the mean of binary cross entropy loss is calculated.
Training
DeepTetrad is trained by nVidia TITAN X with 12 GB RAM using Keras26 and Tensorflow25 backends in the CUDA 10 platform. Transfer learning is performed with pre-trained weights on a Microsoft COCO dataset29. Input images are of fixed dimensions (1,920, 2,560). Zero-padding resizes each image to the exact dimension of (2,048, 2,560), which ensures that width and height are multiples of 512. The image is cropped at random positions with dimensions of (512, 512). For pollen objects, 919 training masks and 370 validation masks were used for data augmentation. For tetrad objects, 1,371 training masks and 617 validation masks were used. Masks were annotated using VGG Image Annotator (VIA)30.
Image augmentation with scaling, translation, rotation, and shearing operations is randomly triggered to each training session and validation mini-batch. During an epoch, a mini-batch of two images per Graphic Processing Unit (GPU) is fed to the backbone. Regions of interest (ROIs) or bounding boxes are sampled 128 times for the pollen model; 512 times for the tetrad model.
The pollen model, in which ResNet-50-FPN backbone is integrated, is trained by a single GPU for 10,000 iterations using the Stochastic Gradient Descent (SGD) optimizer with a learning rate of 0.001, momentum of 0.9, and weight decay of 0.0001. The tetrad model is trained with the same configuration, except that the backbone is replaced with ResNet-101-FPN and the number of iterations is increased to 20,000.
Inference
Each backbone applies non-maximum suppression to 6,000 ROI candidates, in turn yielding 1,000 ROIs. The number of detected masks is limited by the maximum number of detecting instances D, which is set to 200. Hence, only D-detected ROIs with the highest scores are selected to create masks. If D is increased, the model may fail to infer masks because of memory limitation, or the inference may be seriously prolonged. The model might also overlook a considerable number of masks, which would be a critical problem.
As a solution, DeepTetrad tries to infer masks from cropped images rather than directly gathering them from a whole image. In an image of dimension (2048, 2560), there are at least 1,600–5,000 ground truth pollen masks, and at least 400–1,250 ground truth tetrad masks. As well as tetrad masks, monad, dyad, and triad masks will also be reported by the tetrad model. Thus, the number of ground truth masks in the whole image is much larger than D in both cases. A total of 63 cropped images of dimension (512, 512) are generated from the whole image, thereby left images of dimension (256, 512), or top images of dimension (512, 256) are intersected with one another. Complete inference for the whole image involves predicting masks from a mini-batch of 21 cropped images in three epochs.
Mask refinement
All masks from the inference stage need to be refined. Masks that are produced at the edges of the cropped image are usually not overlapping in local coordinates, which represent the spatial location before translating to that in the whole image. However, they can become broken or overlapping when translated to global coordinates, which are the coordinates in the whole image. After removing broken or overlapping masks, contours and the number of tetrad and pollen masks can be more accurately determined. The procedure is as follows: i) all pixel coordinates of pollen masks, Mp, and tetrad masks, Mt in each cropped image are moved to global coordinates by translation (affine transformation). ii) All pixel coordinates of tetrad masks are stored in a k-d tree, Tt. Similarly, all pollen masks are kept in a k-d tree, Tp. iii) All tetrad masks are queried against Tt, and, similarly, all pollen masks are queried against Tp to collect refined tetrad masks, Ψt, and refined pollen masks, Ψp, which meet the predicate below:
, in which, dist(x, y) returns a Euclidean distance between x and y, n(x) returns the number of elements, and c is either t (tetrad) or p (pollen).
The measurable tetrad masks and pollen masks, Ω, are defined as follows:
, where centroid(x) yields the median of given mask coordinates.
Centroids are calculated to associate a tetrad mask Ψt, with the refined pollen masks Ψp. Each centroid of a pollen mask, Ψp, is queried against Tt, then Ψp is associated with a tetrad mask, Ψt if the Euclidean distance between them is 0. When the number of Ψp associated with Ψt is four, they are deemed to be measurable.
Tetrad classification
A tetrad mask can be classified into a representative type of crossover event according to the fluorescence intensity values of associated pollen masks. The signal intensity, Sc, is defined as the mean of pixel values in each fluorescence channel of measurable pollen masks, in which c can be a fluorescence channel of red (R), yellow (Y), or cyan (C). We assume c={R, Y} for two-channel images, and c={R, Y, C} for three-channel images. For measurable tetrad masks, Sc of four pollen masks can be calculated, yet those which have undergone silencing in fluorescent protein expression should be ignored. If silencing occurs, the difference between Sc of the second-highest and the third-highest would be smaller than a certain threshold, Θ. Second-highest Sc represents presence or on-state of fluorescence proteins, whereas the third-highest implies absence or off-state of the proteins. Z-scores are calculated for four Sc values before finding Θ. We explored a parameter space of Θ up to two decimal places using an adaptive grid search method, then Θ was set to 0.40, meaning that Sc differences between on-state and off-state must be bigger than but not equal to 0.40 in a normal distribution.
We determine if fluorescent proteins in a pollen grain are expressed by comparing individual Sc with the median of all four Sc values. With per-channel expressions, tetrad masks are classified as one of classical three tetrad types: parental ditype (PD), tetra type (T), and non-parental ditype (NPD), with two-color FTL intervals or 12 classes (A to L) with the three-color counterparts (Supplementary Fig. 2 and 3). In three-color FTL intervals that have two intervals (i1 and i2) with four chromatids (1-4), tetrad classes are non-crossover (A), single crossover interval 1 (B; SCO-i1), single crossover interval 2 (C; SCO-i2), two strand double crossover (D; 2stDCO), three strand double crossover a (E; 3st DCOa), three strand double crossover b (F; 3st DCOb), four strand double crossover (G; 4st DCO), non-parental ditype interval 1, non-crossover interval 2 (H; NPD-i1 NCO-i2), non-crossover interval 1, non-parental ditype interval 2 (I; NCO-i1 NPD-i2), non-parental ditype interval 1, single crossover interval 2 (J; NPD-i1 SCO-i2), single crossover interval 1, non-parental ditype interval 2 (K; SCO-i1 NPD-i2) and non-parental ditype interval 1, non-parental ditype interval 2 (L; NPD-i1 NPD-i2)18.
Calculation of interference
With two-color FTL intervals, we calculate crossover frequency following Perkin‘s equation:
With three-color FTL intervals, we can calculate the interference ratio σ, which is the ratio of the map distance with adjacent crossover Xγ to the map distance without adjacent crossover Xδ
, in which PD means the number of parental ditypes or no crossover event. T is the number of tetratype or single crossover events. NPD is the number of non-parental ditype or double crossover events. A-L letters represent 12 tetrad classes from three-color FTLs (Supplementary Fig. 2)18.
In the above equations, we assumed that two adjacent genetic intervals of i1 and i2 are defined by three separate fluorescent protein transgenes of red, yellow, and cyan in sequential order. The γ represents that at least a single crossover event occurred at i2, meanwhileδdenotes that no crossover events were found at i2. We highlight that the number of crossover event is counted at i1. Tγ is the tetratype tetrads for i1 that have a CO in i2, and Tδ is the tetratype tetrads for i1 that do not have a CO in i2. DeepTetrad maps the order of input color images (red-yellow-cyan) to the physical order of fluorescent protein transgenes of FTLs for calculating the interference ratio as well as genetic distance.
Pollen tetrad preparation
FTL plants were grown at 20°C under long-day condition (16 h light/8 h dark). Twenty open flowers of a primary shoot from 30-day old FTL plants were collected in a 1.5-ml tube, and 1 ml of pollen tetrad preparation solution (17% sucrose, 2 mM CaCl2, 1.625 mM boric acid, 0.1% Triton-X-100, pH 7.5) was added before incubating for 5 min at room temperature, with gentle rotation. Flowers and the solution were mixed by inverting the tube several times. The solution of pollen tetrads was pipetted and filtered into a new 1.5-ml tube through an 80-μm nylon mesh (30 × 30 mm). The filtered solution was centrifuged at 500g for 3 min to make a yellow pellet. The supernatant was removed and discarded by pipetting or vacuum aspiration. Four μl of pollen tetrad preparation solution was added to the yellow pellet. After pipetting gently five times, the 4-μl suspension of pollen tetrads was loaded on a glass microscope slide and covered with a small cover glass (9 × 9 mm). This resulted in ∼2,500 tetrads for imaging.
Microscopy and imaging
A set of four photographs for each pollen tetrad was taken using a Leica M165 FC dissecting stereomicroscope with bright-field, RFP, YFP and CFP filters (ArticleNo 10450224, 10447410, 10447409, respectively) in sequential order. Twelve image sets per cover glass were obtained from ∼20 flowers when a magnification of 50x was used to image tetrads. Information about the gain, gamma, saturation and exposure for each FTL for high quality imaging is available (Supplementary Table 6).
Information of imaging fluorescent pollen in FTLs.
The gain, saturation, gamma values except for the exposure value are applied across images: gain of x10.1, saturation of 1.00, gamma of 1.50. In particular, exposure for the YFP image should be adjusted if its fluorescence is weaker than those in the other channels. The table shows the value of exposure applied to each FTL (fluorescent tagged line).
Acknowledgements
We thank Raphael Mercier (MPI, Germany) for providing recq4 figl1 mutant seeds. We thank our colleagues for giving critical comments. This work is funded by the Suh Kyungbae Foundation, Republic of Korea, Next-Generation BioGreen 21 Program (PJ01337001), Rural Development Administration, Republic of Korea, and Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2017R1D1AB03028374). GPC is supported by a US National Science Foundation grant (IOS-1844264).