Dynamics of organelle DNA segregation in Arabidopsis development and reproduction revealed with tissue-specific heteroplasmy profiling and stochastic modelling

Organelle DNA (oDNA) in mitochondria and plastids is vital for plant (and eukaryotic) life. Selection against damaged oDNA is mediated in part by segregation, the sorting of different oDNA types into different cells in the germline. Plants segregate oDNA very rapidly, with the oDNA recombination protein MutS Homolog 1 (MSH1) a key driver of this segregation, but, in contrast to mammals, we have very limited knowledge of the dynamics of this segregation within plants and between generations. Here, we combine stochastic modelling with tissue-specific heteroplasmy measurements to reveal the trajectories of oDNA segregation in Arabidopsis thaliana development and reproduction. We obtain and use new experimental observations of oDNA through development to confirm and refine the predictions of the theory inferred from existing measurements. Ongoing segregation proceeds gradually but continually during plant development, with a more rapid increase between inflorescence formation and the establishment of the next generation. When MSH1 is compromised, we show that the majority of observed segregation could be achieved through partitioning at cell divisions. When MSH1 is functional, mtDNA segregation is far more rapid than can be achieved through cell divisions; we show that increased oDNA gene conversion is a plausible mechanism quantitatively explaining this acceleration. We also discuss the support for different models of the plant germline provided by these observations.


Mitochondria and plastids are essential sites of energy transduction across eukaryotes. Originally independent organisms, they retain their own genomes (organelle DNA or oDNA; mtDNA and ptDNA respectively) encoding essential aspects of bioenergetic machinery in plants (and other eukaryotes) [...] processes, it is essential to preserve the integrity of oDNA genes. This preservation necessitates a way of dealing with oDNA mutations and ensuring faithful inheritance of oDNA between generations. Mutations in oDNA can give rise to heteroplasmy, a mixture of several oDNA types within a cell [Wallace & Chalkia, 2013; Stewart & Chinnery, 2015]. Across eukaryotes, developmental and genetic processes exist to limit the inheritance of heteroplasmy [Edwards et al., 2021]. In several animals, mtDNA inheritance is shaped by the so-called developmental bottleneck [Johnston, 2019b; Stewart & Chinnery, 2015; Zhang et al., 2018]. Here, cell-to-cell variance in heteroplasmy is increased in the female germline, so that individual gametes have a wide range of heteroplasmy levels. Through this increase in variance (called segregation or "sorting out"), it is then possible for some gametes to inherit lower levels of damaging mutations than the mother's average. If gametes with high levels of such mutations are removed by selection, the mutational burden passed to the next generation is limited.

Previous work has characterised inheritance and vegetative sorting of heteroplasmy in carrot [Mandel et al., 2020]. Here, little evidence was described for segregation during plant development, with most observations involving a loss of heteroplasmy between generations. The heteroplasmy levels involved in this study were typically extreme (around 1% frequency of the minor allele), meaning that such segregation would be very hard to detect; and one notable instance was recorded of a 31% heteroplasmic offspring arising from a <1% heteroplasmic mother and father, suggesting that a mechanism for substantial amplification of minor alleles may be present. The loss of heteroplasmy upon inheritance agrees with results in Silene [Bentley et al., 2010], where only 17% of offspring retained heteroplasmy that was present in their mother. [...]

We analyzed bulk tissue samples, so cell-to-cell variability cannot be directly quantified; instead, we assume that the heteroplasmy mean in a tissue sample reflects the heteroplasmy of the single cell that was the developmental ancestor of the tissue [Burian et al., 2016; Furner & Pumfrey, 1992; Irish & Sussex, 1992]. This assumption allows for any amount of segregation to occur during the development of the tissue from the precursor cell but assumes there is no systematic shift due to selection for one oDNA type over another (compatible with evidence in this system [Broz et al., 2022] and others [Mandel et al., 2020]).
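
The bulk-sample assumption above can be illustrated with a small simulation. The sketch below is a toy model, not the inference pipeline used in this work: it assumes a fixed oDNA copy number per cell (`copies`), no selection, and that each daughter cell's mutant count is an independent binomial draw from its parent's heteroplasmy (random partitioning followed by re-amplification). The function names and parameter values are hypothetical and chosen for illustration only.

```python
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) using only the standard library."""
    return sum(rng.random() < p for _ in range(n))


def grow_tissue(h0, copies, generations, rng):
    """Grow a tissue from one precursor cell; return per-cell heteroplasmies.

    Assumed toy model: at every division each daughter receives
    Binomial(copies, parent heteroplasmy) mutant oDNA molecules out of
    `copies`, with no selective difference between oDNA types.
    """
    cells = [h0]
    for _ in range(generations):
        cells = [binomial(rng, copies, h) / copies
                 for h in cells for _daughter in range(2)]
    return cells


def bulk_means(h0, copies, generations, replicates, seed=1):
    """Bulk (mean) heteroplasmy of many independently grown tissues."""
    rng = random.Random(seed)
    means = []
    for _ in range(replicates):
        cells = grow_tissue(h0, copies, generations, rng)
        means.append(sum(cells) / len(cells))
    return means


if __name__ == "__main__":
    # Hypothetical values: precursor heteroplasmy 0.5, 50 oDNA copies per cell,
    # 7 rounds of division (128 cells per tissue), 100 replicate tissues.
    means = bulk_means(h0=0.5, copies=50, generations=7, replicates=100)
    print("average of bulk means:", sum(means) / len(means))
```

Under these assumptions the bulk means average out close to the precursor heteroplasmy, because neutral binomial partitioning introduces no systematic shift. Individual tissues still scatter around that value by an amount that depends on the assumed copy number.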

Given this picture, bulk heteroplasmy samples from different tissues are interpretable as readouts of single-cell heteroplasmy in the population of stem cell precursors to each tissue. For example, mean heteroplasmy samples from three leaves are interpreted as three single cell heteroplasmy values from the (earlier) population of stem cells that gave rise to those leaves. We can then construct a developmental model inspired by the "ontogenetic phylogeny" picture tracking the relationships between cells at different developmental stages [Wilton et al., 2018]. Here, the developmental history of a set of cells is accounted for by a "cell pedigree" or "lineage tree" [Stadler et al., 2021] [...]

The amount of segregation occurring between each developmental period is quantified in our model as "effective segregation events". This is the number n of binomial cell divisions (and associated oDNA reamplifications) that would generate the observed heteroplasmy variance, with an effective population size Ne. We use this variable rather than a "bottleneck size" or "drift parameter" [...]

To ask whether within-generation segregation was a genuinely continuous process, we next explored the probability that the magnitude of segregation increased sequentially through developmental stages (for example, whether the amount of segregation experienced by late leaves [...]

In the wildtype mtDNA, much more segregation is observed than can be accounted for by 34 cell [...] Fig. S4). This combined model provides predictions for heteroplasmy distributions at any given stage of plant development (Supplementary Fig. S5). We should note that [...]

The posterior distributions we have presented are integrated over all the model structures in Fig. 1A, so that they reflect "universal" behaviour regardless of the support for the individual models.

However, the RJMCMC process also quantifies this support for the different models of the plant germline. Interestingly, we initially observed some diversity in the posterior distributions over this model index (Fig. 4B). The mtDNA msh1 data show strong support for the "linear germline" model, while the mtDNA wildtype and ptDNA msh1 data provide strong support for the "all separate lineages" model (Supplementary Fig. S2).
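
The "effective segregation events" quantity n introduced earlier can be linked to heteroplasmy variance through a standard result for repeated binomial sampling: after n rounds of sampling Ne molecules, the normalised variance Var(h) / (h0 (1 - h0)) equals 1 - (1 - 1/Ne)^n. The sketch below states that relation, inverts it, and checks it against a direct simulation. It is an illustration under the simplest assumptions (constant Ne, no selection, single lineages); it is not necessarily the exact expression used in the model here, and all function names are hypothetical.

```python
import math
import random


def normalized_variance(n, ne):
    """Expected Var(h) / (h0 * (1 - h0)) after n binomial sampling rounds."""
    return 1.0 - (1.0 - 1.0 / ne) ** n


def effective_events(v_norm, ne):
    """Invert the relation: number of rounds n giving normalised variance v_norm."""
    return math.log(1.0 - v_norm) / math.log(1.0 - 1.0 / ne)


def simulate_normalized_variance(h0, n, ne, lineages=2000, seed=1):
    """Estimate the normalised variance by simulating independent lineages."""
    rng = random.Random(seed)
    finals = []
    for _lineage in range(lineages):
        h = h0
        for _round in range(n):
            # One binomial division plus re-amplification to ne copies.
            h = sum(rng.random() < h for _copy in range(ne)) / ne
        finals.append(h)
    mean = sum(finals) / lineages
    var = sum((x - mean) ** 2 for x in finals) / (lineages - 1)
    return var / (h0 * (1.0 - h0))


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    print("theory:    ", normalized_variance(10, 50))
    print("simulation:", simulate_normalized_variance(0.5, 10, 50))
    print("recovered n:", effective_events(normalized_variance(10, 50), 50))
```

One reason to work with n at fixed Ne is that the same observed variance can be produced by many pairs of n and Ne. Fixing Ne puts the segregation inferred for different developmental periods on a common scale.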

To interpret these findings, it helps to consider the behaviour of heteroplasmy statistics under the [...]

[...] and between generations. Allele-specific primers and probes were designed for each SNV, and droplet generation and reading were performed using the Bio-Rad QX200 system. This study used the specific loci plastid 26553, mitochondria 91017 and mitochondria 334038, which were retained after screening the original set of heteroplasmic variants for those present at moderate allele frequencies.

A correction factor was applied to mitochondrial data to account for the amplification of nuclear copies of the mitochondrial genome (numts) found in Arabidopsis. Specifically, the large numt on chromosome 2 is too similar to actual mtDNA to be distinguished with short reads or ddPCR markers. We therefore approximate the number of nuclear genome copies in the sample (which would inflate the number of apparent mitochondrial "wild type" alleles) and correct accordingly. All nupts have enough sequence divergence that nuclear and plastid copies can be unambiguously [...]

The derivation of this expression depends on a linear noise approximation, and the rates in the above argument will of course vary as segregation proceeds. To provide a more precise estimate, we implemented a simple stochastic simulation of binomial cell divisions, random re-amplification, and gene conversion in a model cellular population. We simulated these processes for various gene