## Abstract

Evolutionary dynamics driven out of equilibrium by growth, expansion or adaptation often generate a characteristically skewed distribution of descendant numbers: the earliest, the most advanced or the fittest ancestors have exceptionally large numbers of descendants, which Luria and Delbrück called “jackpot” events. Here, we show that recurrent jackpot events generate a deterministic bias favoring majority alleles, which is equivalent to an effective frequency-dependent selection (proportional to the log ratio of the frequencies of mutant and wild-type alleles). This “fictitious” selection force results from the fact that majority alleles tend to sample deeper into the tail of the descendant distribution. The flipside of this sampling effect is the rare occurrence of large frequency hikes in favor of minority alleles, which ensures that the allele frequency dynamics remain neutral overall unless genuine selection is present. The limiting allele frequency process is dual to the Bolthausen-Sznitman coalescent and has a particularly simple representation in terms of the logarithm of the mutant frequency. The resulting picture of a selection-like bias compensated by rare big jumps allows for an intuitive understanding of allele frequency trajectories and enables the exact calculation of transition densities for a range of important scenarios, including population size changes and different forms of selection. The fixation of unconditionally beneficial mutations is shown to be exponentially suppressed, and balancing selection can maintain diversity only if the population size is large enough. We briefly discuss analogous effects in disordered complex systems, where sampling-induced biases can be viewed as ergodicity-breaking driving forces.

One of the virtues of mathematizing Darwin’s theory of evolution is that one obtains quantitative predictions about the dynamics of allele frequencies that can be tested with increasing rigor as experimental techniques, sequencing methods and computational power advance. The Wright-Fisher model is arguably the simplest null model of how allele frequencies change across time [17]. Although, for modeling neutral genetic diversity, it is often replaced by equivalent backward-in-time models of the ensuing tree structures [23], forward-in-time approaches are still unrivaled in their ability to include the effects of natural selection. As such, the Wright-Fisher model has been instrumental in shaping the intuition of generations of population geneticists about the basic dynamics of neutral and selected variants. Transition densities derived from the Wright-Fisher model also find tangible application in scans for selection in time series data [4, 9, 22].

The Wright-Fisher model is remarkably versatile as it can be adjusted to many scenarios by the use of *effective* model parameters: an effective population size, an effective mutation rate and effective selection coefficients. But, crucially, these re-parameterizations cannot account for extremely skewed family size distributions. While remarkably skewed family size distributions occur in some natural populations [15], they routinely arise in microbial populations that combine exponential growth with recurrent mutations. This was first highlighted by Luria and Delbrück [26], who noticed that mutations that occur early in an exponential growth process will produce an exceptionally large number of descendants. The distribution of such mutational “jackpot” events has a particular power law tail in well-mixed populations, as is briefly explained in Fig. 1A. The simplest models of continual evolution [28] and related models of traveling waves [36] can be viewed, on a coarse-grained level, as repeatedly sampling from this jackpot distribution. (The number of draws and the characteristic resampling time scale vary with the model.) It is by now well established that the ensuing genealogies are described by a particular multiple-merger coalescent [7, 8, 13, 29, 33–35] first identified by Bolthausen and Sznitman [5].
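To make the origin of the power law tail concrete, the following minimal sketch samples clone sizes under a deliberately simplified caricature of the Luria-Delbrück setting. It is an illustration under stated assumptions, not the construction used in the paper: the population is assumed to grow deterministically as exp(t) with the growth rate normalized to 1, mutations are assumed to arise at a rate proportional to the current population size, and each mutant clone is assumed to grow deterministically after its birth. The function names `sample_clone_sizes` and `tail_fraction` and all parameter values are hypothetical. Under these assumptions, the fraction of clones larger than w should be close to 1/w for sizes well below the maximum, which corresponds to a density falling off as the inverse square of the clone size.

```python
import math
import random


def sample_clone_sizes(n_mutations, t_final, rng):
    """Draw final clone sizes for mutations arising during exponential growth.

    Illustrative assumptions (not taken from the paper):
      * the population grows deterministically as exp(t) on [0, t_final];
      * mutations arise at a rate proportional to the current population size;
      * a clone born at t_birth reaches size exp(t_final - t_birth).
    """
    sizes = []
    growth = math.exp(t_final) - 1.0
    for _ in range(n_mutations):
        u = rng.random()
        # Inverse-CDF sampling of the birth time, whose density is
        # proportional to exp(t) on [0, t_final].
        t_birth = math.log(1.0 + u * growth)
        sizes.append(math.exp(t_final - t_birth))
    return sizes


def tail_fraction(sizes, w):
    """Empirical fraction of clones with more than w descendants."""
    return sum(1 for s in sizes if s > w) / len(sizes)


rng = random.Random(0)  # fixed seed so the run is reproducible
sizes = sample_clone_sizes(200_000, 20.0, rng)

# Compare the empirical tail with the 1/w expectation of this toy model.
for w in (10, 100, 1000):
    print(f"w={w:5d}  empirical={tail_fraction(sizes, w):.5f}  1/w={1.0 / w:.5f}")
```

Early mutations are rare because the population is still small, but each one founds a very large clone. In this toy model, that trade-off alone is enough to produce the heavy tail.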

While extensions of the Wright-Fisher diffusion process to capture skewed offspring numbers have been formally constructed [2, 3, 14, 20], also including selection and mutations [1, 10, 11, 16, 18], we still lack explicit finite-time predictions for the probability distribution of allele frequency trajectories. Our goal here is to fill this gap for the particular case of the Luria-Delbrück jackpot distribution, by characterizing the allele frequency process in such a way that it can be easily generalized, intuitively understood and integrated in time.