## 1 Abstract

We propose a new method that is able to accurately infer major haplotypes and their frequencies just from multiple samples of allele frequency data. Our approach seems to be the first that is able to estimate more than one haplotype given such data. Even the accuracy of experimentally obtained allele frequencies can be improved by re-estimating them from our reconstructed haplotypes.

Reconstructing haplotypes from sequencing data is of interest to several areas of biological and medical research. In evolutionary genetics, for instance, haplotypes help to better understand the genetic architecture of adaptation. Here we consider genomic time series data from three evolve and re-sequence experiments as an application. However, the approach can in principle be used in a wider context, as only data from multiple samples are needed, not necessarily collected over time.

## 2 Introduction

Understanding the haplotype composition of populations can frequently provide crucial information in studies relying on genetic data. Such investigations can be on different topics, such as the identification of genetic associations with diseases [Tewhey et al., 2011], the imputation of missing genotype data [Marchini et al., 2007], the inference of demographic population histories [Tishkoff et al., 1996], and the detection of traces of selection [Sabeti et al., 2002]. Therefore several recently proposed methods aim at extracting haplotypes (e.g. [Delaneau et al., 2019], [Browning et al., 2018], [Loh et al., 2016]) from sequencing data. A review of further methods and applications can be found in [Browning and Browning, 2011].

For human populations, great efforts have been put into sequencing large numbers of individuals accompanied by further efforts in the development of fast algorithms to obtain haplotype information by phasing the read data.

In other fields of applications, however, resources for sequencing are more scarce. In studies on the genetic basis of adaptation of non-human populations, for instance, the available haplotype information is often very limited, or lacking entirely. To lower the cost of the experiment, populations are frequently sequenced as a pool [Burke et al., 2010], [Illingworth et al., 2012], [Barghi et al., 2019]. This approach provides genome-wide allele frequency data on a SNP level [Futschik and Schlötterer, 2010], [Schlötterer et al., 2014], but does not lead to any direct haplotype information. It is frequently used in Evolve and Resequence (E&R) experiments [Turner et al., 2011]. These experiments implement experimental evolution, with one or more (replicate) populations being followed for several generations in the lab under artificial selection and sequenced multiple times to obtain time series of allele frequency data.

Efforts have been made to infer haplotypes and their frequencies given allele frequency data. The methods by [Excoffier and Slatkin, 1995], [Pirinen, 2009], [Gasbarra et al., 2011], [Long et al., 2011], [Kessner et al., 2013] and [Cao and Sun, 2015] use known founder haplotypes to estimate the frequency trajectories of these haplotypes over time. The approach proposed in [Franssen et al., 2017], and optimized in [Otte and Schlötterer, 2019], on the other hand, assumes no other information than the allele frequency from pool sequencing (Pool-Seq) and aims to reconstruct selected haplotype blocks. This heuristic approach, however, infers only a small subset of SNPs on one of the haplotypes using allele frequency data from a sufficient number of replicate populations and generations. Nevertheless, it has been shown that even limited haplotype information is very helpful to infer selection and understand the genetic architecture of adaptation ([Michalak et al., 2019], [Mallard et al., 2018], [Karasov et al., 2010], [Barghi et al., 2019], [Burke, 2012]).

Here, we propose a new principled approach for situations where only allele frequency data are available, but no candidate haplotypes from other sources, as e.g. in [Griffin et al., 2017]. The approach builds on recent work by [Behr and Munk, 2017], and may be applied, for instance, to several Pool-Seq samples. Our focus is on samples collected over time, but data from multiple spatial locations would be another possible application. For the approach to work, several samples showing a sufficient fluctuation in haplotype frequencies are needed. For temporal data this condition is met when selection acts on the haplotypes, or when genetic drift is sufficiently large. For spatial data, samples from a sufficiently structured population would be needed. As the number of haplotypes that can be inferred reliably from allele frequency data is typically lower than the number of available samples, often only the most common haplotypes will be reconstructed.

Here we focus on time series from only one population. For experiments starting with a common pool of founder haplotypes, a simultaneous analysis of several replicate populations may lead to a larger number of sequenced samples and consequently to more accurate estimates from our method. Such a design also makes it possible to address additional biologically interesting questions about the genetic redundancy of adaptation, by looking at the consistency in the haplotype frequency changes across replicates.

## 3 Methods

### Notation

In the following, for an integer *N* we use the notation [*N*] := {1, …, *N*}. For a matrix *A* we let *A*_{i·} and *A*_{·i} denote its *i*th row and column vector, respectively. With ∥*A*∥ and *A*^{⊤} we denote the Frobenius norm and the transpose of a matrix *A*, respectively. For a vector *a*, we always assume it is a column vector. We denote by **1** = (1, …, 1)^{⊤} the vector with just ones.

Simultaneously reconstructing the structure of dominant haplotypes (a.k.a. major haplotypes) as well as their relative proportions in the population amounts to a matrix factorization problem with a finite alphabet constraint on one of the matrices, and positivity as well as unit column-sum constraints on the other. More specifically, assume we obtained relative allele frequencies *Y* ∈ [0, 1]^{N×T} from a pool sequencing experiment, at time points *t* ∈ [*T*] and SNP locations *n* ∈ [*N*], from a population that consists of *m*_{0} haplotypes. Then the underlying population allele frequencies *F* ∈ [0, 1]^{N×T} can be written as

$$F = SW, \qquad (1)$$

where *S*_{·i} ∈ {0, 1}^{N} for *i* ∈ [*m*_{0}] denotes the genotype structure of haplotype *i*, that is, *S*_{ni} = 1 if haplotype *i* takes the reference allele at location *n* and *S*_{ni} = 0 otherwise. The frequencies *W*_{it} denote the relative proportion of haplotype *i* at time point *t* (haplotype frequency). Ignoring any sequencing error, we have *E*(*Y* | *F*) = *F*. Our aim is to reconstruct both the matrices *S* and *W* from the measurement matrix *Y*. This amounts to a specific type of *finite alphabet blind separation* problem [Behr and Munk, 2017, Behr et al., 2018].
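As a small illustration of the mixing model, the following sketch builds a toy *S* and *W* and computes the resulting allele frequencies *F* = *SW*. All numbers, and the helper `matmul`, are made up for illustration and are not taken from the experiments discussed here.

```python
# Toy instance of the mixing model F = S W (all numbers are made up).
# S: N x m0 binary haplotype structure, W: m0 x T haplotype frequencies.

def matmul(a, b):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

S = [[0, 0],   # SNP 1: neither haplotype carries the reference allele
     [0, 1],   # SNP 2: only haplotype 2 carries it
     [1, 0],   # SNP 3: only haplotype 1 carries it
     [1, 1]]   # SNP 4: both haplotypes carry it
W = [[0.5, 0.7, 0.9],   # frequency of haplotype 1 at T = 3 time points
     [0.5, 0.3, 0.1]]   # frequency of haplotype 2

# Each column of W sums to one (unit column-sum constraint).
column_sums = [sum(W[i][t] for i in range(len(W))) for t in range(len(W[0]))]

F = matmul(S, W)  # N x T population allele frequencies
for row in F:
    print([round(x, 3) for x in row])
```

Each row of *F* is the sum of the frequency trajectories of those haplotypes that carry the reference allele at that SNP, which is what makes the rows informative about both *S* and *W*.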

In general, *m*_{0}, the overall number of haplotypes, can be very large, possibly *m*_{0} > *n*, which makes *S* and *W* non-identifiable, even from the noiseless allele frequencies *F*. However, when (for most of the time points *t*) the population is dominated by *m* ≪ *m*_{0} haplotypes, we denote the structure and frequency of the dominant haplotypes as *S^{d}* and *W^{d}* and obtain *F* = *S^{d}W^{d}* + *B*, with a bias *B* = *SW* – *S^{d}W^{d}*, which is the allele frequency component of the minor haplotypes. In the following, we will omit the superscript *d* and just write *S* and *W*, such that *m* ≪ *T*, *N* and ∥*B*∥ ≪ ∥*SW*_{i·}∥ for *i* ∈ [*m*]. With our considered simulation and real data scenarios, a bias term only makes a difference when there is a large number of minor haplotypes present at several time points. For our further analysis we assume that the minor haplotypes *B* and the major haplotypes *SW* are independent. Treating the bias term *B* in a Bayesian setting, we assume that *B*_{nt} is a random variable with mean *b*_{t}. In total, we obtain that

$$E(Y) = SW + \mathbf{1} b^{\top}, \qquad (2)$$

where *b* = (*b*_{1}, …, *b*_{T})^{⊤} ∈ [0, 1]^{T} is the bias term from the minor haplotype contribution.
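The mean structure with a time-dependent bias can be spelled out in one line. The sketch below treats *S* and *W* as fixed and assumes that the mean of *B*_{nt} depends only on the time point, E(*B*_{nt}) = *b*_{t}; it then combines this with *E*(*Y* | *F*) = *F* via the tower property.

```latex
\begin{aligned}
E(Y_{nt}) &= E\bigl(E(Y_{nt} \mid F)\bigr) = E(F_{nt}) \\
          &= (SW)_{nt} + E(B_{nt}) = (SW)_{nt} + b_t ,
\end{aligned}
\qquad \text{i.e.} \qquad
E(Y) = SW + \mathbf{1} b^{\top}.
```

Under these assumptions, the minor haplotypes enter the mean only through one offset per time point, shared by all SNPs.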

### 3.1 HaploSep algorithm

If one had directly observed the allele frequencies *F* = *SW* + **1***b*^{⊤}, one would be able to uniquely recover *S*, *W* and *b* by exploring the ordering structure of the rows of *F* (assuming some weak identifiability conditions on *S* and *W* as detailed in the SI). For example, the row vector *F*_{i·} with the smallest norm corresponds to a haplotype structure where *S*_{i·} = (0, …, 0) and thus *F*_{i·} = *b*, which allows one to recover *b*. Similarly, the second smallest row vector of *F* corresponds to the situation where *S*_{i·} = (0, …, 0, 1) and *F*_{i·} = *W*_{m·} + *b*, which allows one to recover *W*_{m·}. Proceeding in an analogous way, one can uniquely recover *S*, *W* and *b*. Details are given in the SI and pseudo-code is given in Algorithm 1 (SI).
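The first two steps of this noiseless recovery can be illustrated on toy data. The sketch below is not Algorithm 1 from the SI; it only sorts the distinct rows of a made-up *F* by norm and reads off *b* and *W*_{m·}, assuming that the rows with structure (0, 0) and (0, 1) are both present and that haplotype *m* has the smallest frequencies.

```python
import math

def row_norm(row):
    """Euclidean norm of a row vector."""
    return math.sqrt(sum(x * x for x in row))

# Noiseless toy data F = S W + 1 b^T with m = 2 (made-up numbers).
W = [[0.6, 0.7, 0.8],    # haplotype 1
     [0.3, 0.2, 0.1]]    # haplotype m = 2, the one with the smallest frequencies
b = [0.05, 0.05, 0.05]
S = [[1, 1], [0, 1], [0, 0], [1, 0], [0, 1]]
F = [[sum(S[n][i] * W[i][t] for i in range(2)) + b[t] for t in range(3)]
     for n in range(len(S))]

# Distinct rows of F, ordered by increasing norm.
rows = sorted({tuple(r) for r in F}, key=row_norm)

b_hat = list(rows[0])                                # row with S_n. = (0, 0)
W_m_hat = [x - y for x, y in zip(rows[1], b_hat)]    # row with S_n. = (0, 1)

print(b_hat, [round(x, 3) for x in W_m_hat])
```

With noisy data the rows no longer coincide exactly, which is why this exact ordering argument cannot be applied to *Y* directly.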

However, in practice, one only obtains the noisy pool sequencing data *Y* but not the population allele frequencies *F*. Therefore, a direct application of Algorithm 1 (SI) is impossible, and it would also not be reasonable, as further regularization is required to obtain statistically stable estimates of *S*, *W* and *b*. We therefore consider a relaxation of the exact solution to *Y* = *SW* + **1***b*^{⊤}, i.e. we seek to solve the least-squares optimization problem

$$\min_{S, W, b} \; \lVert Y - SW - \mathbf{1} b^{\top} \rVert^{2}, \qquad (3)$$

subject to the finite alphabet constraint *S*_{ni} ∈ {0, 1} and the linear constraints on *W*_{it} and *b*_{t}, for *i* ∈ [*m*], *t* ∈ [*T*], *n* ∈ [*N*]. If *Y* were normally distributed, the solution would be the maximum likelihood estimator. For Pool-Seq data, due to its discrete structure, *Y* is clearly not normally distributed. Therefore, in principle, one may try to model the noise distribution of *Y* more precisely, and use a more targeted loss function than the *L*^{2}-norm loss in (3). However, in our simulations we found that loss functions based on a binomial model for the pool sequencing procedure do not provide a significant improvement over the *L*^{2}-norm loss and are computationally more challenging. This may be caused in part by the unpredictable variation of the bias term *B*.

Due to the discrete nature of *S*, the optimization problem in (3) is highly non-convex and therefore hard to solve directly. However, conditioned on either (*W*, *b*) or *S*, the optimization in (3) becomes tractable. Indeed, minimizing (3) given (*W*, *b*) corresponds to a simple clustering problem with known centers

$$\{\, sW + b^{\top} : s \in \{0, 1\}^{1 \times m} \,\}. \qquad (4)$$
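The clustering step for given (*W*, *b*) can be sketched as a nearest-center assignment. In the sketch below, the function names `centers` and `update_S` and all numbers are hypothetical; the 2^m candidate centers are enumerated exhaustively, which is only feasible for a small number *m* of major haplotypes.

```python
from itertools import product

def centers(W, b):
    """All 2^m candidate rows s W + b^T for s in {0, 1}^m, keyed by s."""
    m, T = len(W), len(b)
    return {s: [sum(s[i] * W[i][t] for i in range(m)) + b[t] for t in range(T)]
            for s in product((0, 1), repeat=m)}

def update_S(Y, W, b):
    """Assign every row of Y to its closest center (squared Euclidean distance)."""
    cs = centers(W, b)
    S = []
    for y in Y:
        best = min(cs, key=lambda s: sum((a - c) ** 2 for a, c in zip(y, cs[s])))
        S.append(list(best))
    return S

# Made-up frequencies, bias and noisy observations (N = 4 SNPs, T = 3 samples).
W = [[0.6, 0.7, 0.8],
     [0.3, 0.2, 0.1]]
b = [0.05, 0.05, 0.05]
Y = [[0.93, 0.97, 0.94],
     [0.36, 0.24, 0.17],
     [0.04, 0.07, 0.05],
     [0.66, 0.73, 0.86]]

S_hat = update_S(Y, W, b)
print(S_hat)
```

Each SNP is treated independently here, so the cost of this step grows linearly in *N* but exponentially in *m*.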

Given *S*, on the other hand, minimizing (3) corresponds to a simple linear regression problem with linear constraints on *W* and *b*. Thus, a very natural approach to tackle the minimization problem in (3) is to employ an iterative Lloyd-type algorithm [Lu and Zhou, 2016]. That is, one initializes either *S* or (*W*, *b*) and updates them alternately until convergence.
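The regression step for given *S* can be sketched as follows. This is a simplification and not the constrained regression of the method itself: it solves the unconstrained least-squares problem for the design matrix [*S* | **1**] via the normal equations and then clips the result to [0, 1] as a crude stand-in for the linear constraints. The names `solve` and `update_W_b` are hypothetical, and the design matrix is assumed to have full column rank.

```python
def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(B[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [row[n:] for row in M]

def update_W_b(Y, S):
    """Least squares for Y ~ S W + 1 b^T, followed by clipping to [0, 1]."""
    X = [list(s) + [1] for s in S]                      # design matrix [S | 1]
    k, T = len(X[0]), len(Y[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    XtY = [[sum(x[i] * y[t] for x, y in zip(X, Y)) for t in range(T)]
           for i in range(k)]
    theta = [[min(1.0, max(0.0, v)) for v in row] for row in solve(XtX, XtY)]
    return theta[:-1], theta[-1]                        # W (m x T), b (length T)

# Noiseless toy check: Y is generated exactly from S, W and b.
S = [[1, 1], [0, 1], [0, 0], [1, 0]]
W = [[0.6, 0.7, 0.8], [0.3, 0.2, 0.1]]
b = [0.05, 0.05, 0.05]
Y = [[sum(S[n][i] * W[i][t] for i in range(2)) + b[t] for t in range(3)]
     for n in range(4)]

W_hat, b_hat = update_W_b(Y, S)
print([[round(v, 3) for v in row] for row in W_hat], [round(v, 3) for v in b_hat])
```

Alternating such a regression step with the assignment of rows to their nearest centers gives one iteration of a Lloyd-type scheme; a careful implementation would enforce the constraints inside the regression rather than by clipping afterwards.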

Recently, [Lu and Zhou, 2016] showed that for a sub-Gaussian error distribution and appropriate initialization of either labels or clusters, Lloyd’s algorithm converges to an exponentially small clustering error in log(*N*) iterations. For the generic Lloyd’s algorithm, they also show that spectral clustering provides an appropriate initialization. Here, however, we cannot initialize the centers directly, but rather have to initialize the frequencies *W* and the bias term *b*, which indirectly determine the centers via (4). For this, we propose to employ Algorithm 1 in the SI applied to the clustered observations *Y*_{n·}. Pseudo-code is given in Algorithm 3 in the SI.

Just as Algorithm 1 yields exact recovery of *W* in the noiseless population case *Y* = *F*, it can be shown that Algorithm 3 yields stable recovery of *W* whenever the estimated centers are close enough to the noiseless centers in (4). More precisely, whenever the estimated centers deviate from the noiseless centers by no more than a sufficiently small *ϵ*, the recovered *W* remains correspondingly close to the true frequencies; see [Behr et al., 2018, Behr, 2018] for details. The pseudo-code in Algorithm 2 in the SI summarizes our complete procedure for iteratively recovering the haplotype structure *S* and frequencies *W* from data *Y*, using Algorithm 3 as initialization. The only tuning parameter of our procedure in Algorithm 2 is the threshold *δ* used in our iteration stopping criterion. We found that *δ* = 0.001 works well in practice, with convergence usually reached within a couple of iterations.

Note that in practice, the number of major haplotypes *m* is not given and has to be estimated from the data *Y*. Cross validation and bootstrap methods could be employed for this purpose. In Section S1-4, we present a different approach which is (computationally) much simpler and based on singular value decomposition. As further information on the reliability of our estimates, we propose accuracy measures and explain their computation in Section S1-4.

## 4 Simulations

Our simulations are designed to mimic experimental evolution (see e.g. [Kawecki et al., 2012], [Long et al., 2015] and [Schlötterer et al., 2015] for reviews). These experiments make it possible to study evolutionary adaptation under controlled laboratory conditions, making it easier to disentangle adaptive responses from other factors such as demography or genetic drift. Typically, multiple populations of organisms are kept in the laboratory for several generations under stressful conditions chosen by the experimenter. DNA sequence information is commonly obtained at different time points to study the genetic basis of adaptive responses. However, the separate sequencing of individuals at high coverage will often be too time-consuming or costly for larger populations. As a consequence, the analysis is frequently carried out based on estimated population allele frequencies from pools of individuals sequenced together, although haplotype information would be helpful for a better understanding of the adaptive process.

With our simulations, we intend to illustrate how haplotypes and their relative frequencies can be reconstructed in such experiments. Furthermore, we show that it is possible to obtain improved population allele frequency estimates with pool sequencing experiments by using the reconstructed haplotypes and their estimated frequencies.

### 4.1 Reconstruction of haplotype structure and frequency

To illustrate our method, we discuss a viability selection model with a selected locus (selection coefficient *s* = 0.05) occurring on three of the haplotypes. Drift is simulated every generation via multinomial sampling from the haplotypes present in the previous generation. We consider an experiment with a constant population size over 150 generations. To mimic pool sequencing, the allele frequency data are obtained via binomial sampling at a Poisson (λ = 80) coverage from the population allele frequencies. We chose starting haplotypes from a set of founder haplotypes sequenced by [Barghi et al., 2019]. Due to the shared genealogy, such haplotypes share a large amount of similarity, making the reconstruction challenging. In this context, we consider three scenarios that differ with respect to the population size and the number of starting haplotypes. These parameters were chosen to mimic the experiments described in Section 5, namely, experiments with *C. elegans*, *D. simulans*, and the Longshank mice experiment. Further details concerning the simulation setup can be found in Section S2 (SI).
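The simulation steps described above (selection, multinomial drift, and binomial read sampling at Poisson coverage) can be sketched as follows. This is a simplified haploid illustration and not the simulation code used for the reported results: equal starting frequencies, the toy haplotypes, the sampling interval `step`, and all function names are assumptions made here.

```python
import math
import random

def poisson(lam, rng):
    """Poisson draw via Knuth's multiplication method (fine for moderate lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def pool_seq(f, lam, rng):
    """Observed allele frequency: binomial reads at Poisson(lam) coverage."""
    cov = max(1, poisson(lam, rng))          # avoid a coverage of zero
    reads = sum(rng.random() < f for _ in range(cov))
    return reads / cov

def simulate(haplotypes, selected, pop_size, generations, step=10,
             s=0.05, lam=80, seed=1):
    """Return haplotype frequencies W and Pool-Seq frequencies Y, one row per sample."""
    rng = random.Random(seed)
    k, n_snps = len(haplotypes), len(haplotypes[0])
    freqs = [1.0 / k] * k                    # assumption: equal starting frequencies
    W, Y = [], []
    for g in range(generations + 1):
        if g % step == 0:                    # sequence the pool every `step` generations
            W.append(freqs)
            Y.append([pool_seq(sum(h[n] * f for h, f in zip(haplotypes, freqs)),
                               lam, rng) for n in range(n_snps)])
        # Viability selection: selected haplotypes get relative fitness 1 + s.
        weights = [f * ((1 + s) if i in selected else 1.0)
                   for i, f in enumerate(freqs)]
        # Drift: multinomial sampling of the next generation.
        draws = rng.choices(range(k), weights=weights, k=pop_size)
        freqs = [draws.count(i) / pop_size for i in range(k)]
    return W, Y

haps = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
W, Y = simulate(haps, selected={0}, pop_size=200, generations=20, step=10)
print(len(W), len(Y), [round(w, 3) for w in W[-1]])
```

Note that `W` and `Y` are stored here with one row per sampled time point, i.e. transposed relative to the *N* × *T* convention used for *Y* in the Methods.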

To better understand the performance of our estimates, some exemplary situations are shown in Fig. 1 and the respective allele composition results in Fig. S1 (SI). Whereas the allelic composition of the major haplotypes is estimated with very low error in Fig. 1(a), the haplotype frequency estimates become accurate only at later stages of this experiment. In Fig. 1(b), on the other hand, the frequency estimates are very accurate except at the very beginning of the experiment, but only the composition of the dominating haplotype is accurately estimated. Finally, in Fig. 1(c) both frequencies and allelic composition of the dominating haplotypes are estimated very accurately. The explanation for these observed patterns is that several low frequency confounder haplotypes are present for a considerable time in the Longshank mice experiment, before most of them get eliminated by genetic drift. With our *C. elegans* example, on the other hand, genetic drift eliminates all except one haplotype very quickly. The remaining haplotype is easily estimated, but for the disappearing ones, only very few time points provide information to reconstruct their composition. Therefore the reconstruction error is between 24% and 37%. Finally, for *D. simulans*, there is sufficient information in the data to estimate the two dominating haplotypes more or less perfectly. Frequency and allelic composition are accurately inferred even for a third one.

For a more complete picture, we now report reconstruction errors for 100 simulation runs under the scenario mimicking the Longshank mice experiment in Fig. 2. The number of reconstructed haplotypes is estimated for each run via our model selection criterion explained in S1-3 (SI). The boxplots in (a) depict the errors in terms of the mismatch proportion for each of the reconstructed haplotypes, whereas (b) provides errors in terms of the mean absolute difference between the true and estimated frequencies at each time point. According to (a), the composition of the dominating haplotype (the one with the highest inferred frequency at the end of the experiment) is always estimated nearly perfectly. For the other haplotypes, the accuracy depends on whether the simulated trajectory provides enough information. As we simulated three of the haplotypes as selected, those haplotypes often (but not always) reached sufficiently high allele frequencies during the experiment and could therefore be estimated reliably. The frequency estimates in (b) clearly improve over time, illustrating again that accurate frequency estimates can be expected at time points where not too many haplotypes are present. Our simulations led to occasional outliers, i.e. situations where the accuracy is less satisfactory. For a practical application, we therefore recommend using the accuracy scores *R*^{2} proposed in Section S1-4 and the frequency change of the reconstructed haplotypes for assessing the reliability of our estimates. In Fig. 2, for instance, we filtered out scenarios where either *R*^{2} < 0.8 or the frequency change of the dominating haplotype (HapID1) is below 0.1. These criteria provide a reasonable threshold for filtering out a small proportion of problematic scenarios. For different experimental designs, we recommend validating the thresholds with simulations. In supplementary Section S6, we provide a more detailed discussion of situations that may lead to outlying estimates.
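The two error measures used in Fig. 2 can be computed as in the following sketch. The function names are hypothetical, and the sketch assumes that true and reconstructed haplotypes have already been matched and are listed in the same order.

```python
def mismatch_proportion(s_true, s_est):
    """Fraction of SNPs at which two 0/1 haplotypes differ."""
    return sum(a != b for a, b in zip(s_true, s_est)) / len(s_true)

def mean_abs_freq_error(W_true, W_est, t):
    """Mean absolute difference between true and estimated frequencies at time t."""
    return sum(abs(w[t] - v[t]) for w, v in zip(W_true, W_est)) / len(W_true)

# Made-up example: one haplotype over 4 SNPs, two haplotypes over 2 time points.
print(mismatch_proportion([0, 1, 1, 0], [0, 1, 0, 0]))
W_true = [[0.6, 0.8], [0.4, 0.2]]
W_est = [[0.5, 0.8], [0.4, 0.3]]
print([round(mean_abs_freq_error(W_true, W_est, t), 3) for t in range(2)])
```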

For results simulated under the other two experimental setups, see Figs. S2 and S3 (SI).

When planning an experiment, it can be useful to know under which design parameters haplotype reconstruction will tend to be reliable. For this purpose, we provide simulation results exploring the influence of E&R designs on the accuracy of our method and summarize the results in the supplement. See Section S2 for a summary of the simulated scenarios, and Section S3 for the results obtained under these scenarios. While our simulations suggest a good performance over a wide range of scenarios, we recommend that potential users perform additional simulations, if their experimental design deviates from our considered scenarios.

A general observation is that the more the haplotypes change in their frequencies during the experiment, the better the haplotype reconstruction gets. In E&R this can be achieved through a large enough selection pressure affecting the investigated genomic region, or through small population sizes such that genetic drift causes large frequency changes. In other applications, where samples may differ in location rather than time, a sufficient amount of population structure would be needed.

### 4.2 Improved allele frequency estimates

With known founder haplotypes, it has been shown in [Tilk et al., 2019] that allele frequency estimates from pool sequencing can often be improved by using haplotype information. Here we investigate whether this observation can be extended to the case of unknown founder haplotypes by using our estimates of important underlying haplotypes and their frequencies. Indeed, allele frequency estimates can be obtained by multiplying the matrix of the reconstructed haplotype structure with the matrix of the estimated haplotype frequencies and adding the estimated bias term. Using the simulated data in Section 4.1, we compared the estimates obtained in this way with the original allele frequencies from pool sequencing. As a measure of the difference in accuracy, we computed the ratio *α* of the error of the haplotype-based estimates to the error of the pool sequencing estimates, where *N* is the number of SNPs, *y*_{i} is the true allele frequency of SNP *i*, and the two estimates compared with it are the allele frequency of SNP *i* estimated using the reconstructed haplotypes and the one estimated by pool sequencing. If *α* is smaller than one, the haplotype-based estimate performs better.
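The comparison can be sketched as follows for a single time point. The haplotype-based estimate multiplies the reconstructed structure with the estimated frequencies and adds the bias, as described above; the use of the mean absolute error inside the ratio is an assumption of this sketch, as are the function names and all numbers.

```python
def haplotype_based_estimate(S_hat, w_hat_t, b_hat_t):
    """Allele frequencies at one time point: S_hat times w_hat_t, plus b_hat_t."""
    return [sum(s * w for s, w in zip(row, w_hat_t)) + b_hat_t for row in S_hat]

def alpha(y_true, y_hap, y_pool):
    """Ratio of mean absolute errors (absolute error is an assumed choice)."""
    n = len(y_true)
    err_hap = sum(abs(a - b) for a, b in zip(y_true, y_hap)) / n
    err_pool = sum(abs(a - b) for a, b in zip(y_true, y_pool)) / n
    return err_hap / err_pool

# Made-up values for three SNPs.
y_true = [0.5, 0.2, 0.8]
y_hap = [0.52, 0.2, 0.79]
y_pool = [0.45, 0.25, 0.9]
print(round(alpha(y_true, y_hap, y_pool), 3))

est = haplotype_based_estimate([[1, 0], [0, 1], [1, 1]], [0.7, 0.2], 0.1)
print([round(v, 3) for v in est])
```

A value below one, as in this toy example, means that the haplotype-based estimate is closer to the truth than the raw Pool-Seq frequencies under the chosen error measure.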

For each time point, we computed *α* based on all SNPs where the allele is not fixed or lost. To eliminate situations where the haplotype reconstruction does not work so well, we filter using the criterion in S1-4 to decide whether to use the haplotype based allele frequency estimates. Fig. 3 summarizes the relative performance for the Longshank mice experiment based on the filtered data. Analogous results for the other two experimental designs can be found in Fig. S14 (SI). Our results reveal that the use of the reconstructed haplotype information typically leads to an improved accuracy. Indeed, since haplotype frequency estimates combine information across many SNPs, they are less noisy than allele frequencies from pool sequencing for individual SNPs.

## 5 Application to real data

We applied our approach to two E&R data sets taken from [Barghi et al., 2019], and [Noble et al., 2019], and to the not yet published data set described in [Castro et al., 2019].

In each case, we looked at a small genomic region under selection. With the data set from [Barghi et al., 2019] and [Noble et al., 2019], we also compared our inferred haplotypes with reference haplotypes provided by the authors.

As a further validation of our approach, we compared our reconstructed haplotypes with paired end reads using the original sequencing data from [Barghi et al., 2019]. We found our reconstructed haplotypes to be largely concordant with the reads when interpreting them as very short haplotypes. For further details see Section S8 (SI).

### 5.1 Drosophila simulans (D. simulans)

We now consider the E&R experiment of [Barghi et al., 2019]. There the base population consists of 202 isofemale lines of *Drosophila simulans*. Ten replicate populations were kept for 60 generations, with sequencing data available every ten generations. Furthermore, a sample of 189 founder haplotypes was sequenced, as well as 100 additional ones from five evolved replicates.

For this data, we first identified interesting genomic regions by testing for signals of selection at the SNP level using the modified χ^{2} and Cochran–Mantel–Haenszel tests [Spitzer et al., 2020] that account for drift and sequencing noise in the data. We then applied our method to multiple regions showing statistically significant allele frequency changes. We considered positions 11.239636 to 11.591566 Mb on chromosome 2L for replicate 3 as an example. Fig. 4 provides the estimated haplotype trajectories, as well as a comparison between our estimated allelic composition and the best matching founder haplotype sequences.

Due to the presence of a large number of similar haplotypes in particular at the early generations, the reconstruction is quite challenging for this experiment. Nevertheless, the dominating haplotype is usually reconstructed almost without error.

### 5.2 Longshank experiment in mice

In the Longshank mice experiment, individuals from a mouse population were selected to produce offspring according to their tibia length/body mass ratio. The evolution of three populations (two Longshank lines and one control line) was followed over several generations. For details on the experimental design, we refer to [Castro et al., 2019] and [Marchini et al., 2014]. We received time series data from this experiment for the Nkx3-2 region (4395 SNPs and indels) of the Longshank 1 (LS1) line collected every generation, from generation 0 to 20 (still unpublished). The allele frequencies were missing for a large number of SNPs. Therefore we decided to remove the later generations (14-20) from our analysis, because of their particularly high proportion of missing values. For the generations 0-13, we only kept those SNPs for which all allele frequencies were available. Filtering the data this way, we reconstructed haplotypes from the remaining 561 SNPs in this region.
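The filtering described above can be sketched as follows. The data layout (a dictionary from SNP identifier to a per-generation list, with `None` marking a missing value), the function name and the toy values are assumptions made for illustration and do not reflect the actual file format.

```python
def filter_complete_snps(freqs, last_generation=13):
    """Keep generations 0..last_generation and drop SNPs with any missing value."""
    kept = {}
    for snp, series in freqs.items():
        window = series[:last_generation + 1]
        if all(v is not None for v in window):
            kept[snp] = window
    return kept

# Toy data with generations 0..3; only generations 0..2 are retained.
toy = {
    "snp_a": [0.10, 0.15, 0.20, None],   # missing only in a dropped generation
    "snp_b": [0.50, None, 0.55, 0.60],   # missing in a retained generation
    "snp_c": [0.90, 0.85, 0.80, 0.75],
}
filtered = filter_complete_snps(toy, last_generation=2)
print(sorted(filtered))
```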

The right panel of Fig. S22 displays our estimated haplotype trajectories and the corresponding accuracy scores. Unfortunately no founder haplotypes or read data for comparison purposes were available to us with this experiment.

### 5.3 Caenorhabditis elegans (C. elegans)

We next look at the experiment described in [Noble et al., 2019]. There, three replicate populations experienced an increasing quantity of NaCl during their evolution. The base population comprised 10^{4} individuals that originated from 16 founder inbred lines. Pool sequenced allele frequency data have been made available to us for the base population and for the evolved populations at generations 50 and 100. Additionally, sequence information for the 16 founder inbred lines, as well as low coverage sequencing of a few individuals from the base population and the two subsequent time points has been provided. Further details on the experimental design can be found in [Noble et al., 2017].

As in Section 5.1, we searched for genomic regions showing signatures of selection. As an example, we provide results for a genomic region containing 666 SNPs (chromosome 5, 14924777-15216613 bp). The upper panel of Fig. S23 (SI) illustrates the close match between the reconstructed haplotypes and the most similar sequenced founder haplotype. For each replicate line, the lower panel of Fig. S23 (SI) shows the reconstructed haplotype trajectories, together with the corresponding accuracy measures. Moreover, since the three replicates show similar evolutionary patterns, we decided to also apply our method to all of them simultaneously. The result seems consistent with our single replicate analysis in terms of the reconstructed haplotypes and their frequency trajectories. This suggests parallel evolution across the replicates (see Fig. S24 in the SI).

## 6 Discussion

We proposed a new principled approach that for the first time estimates multiple completely unknown haplotypes from allele frequency data only. Under a suitably chosen experimental design, and with sufficiently large allele frequency changes, the allelic composition of the unknown haplotypes can be recovered reliably. Strong enough selection provides one scenario that leads to sufficient fluctuations in allele frequency.

A good reconstruction of haplotype frequencies is achieved at times when a moderate number of haplotypes is present at a sufficiently high frequency, which may not be the case at early time points when an experiment starts with many founder haplotypes. Further important design parameters that affect the quality of our reconstruction are the population size and the number of samples with sequence information.

Many scientific studies use haplotype information to answer their research questions. By providing estimates of the most important underlying haplotypes, our method will help researchers who only have allele frequency data available.

Our estimated haplotype frequencies can also be used to obtain allele frequency estimates that are less noisy on average than the original frequency data obtained for instance via pool sequencing. Indeed, by combining information from several neighboring SNPs, the sampling variation introduced by sequencing a whole pool of individuals gets averaged out to some extent.

As our next step, we plan to extend our approach to data from locally structured populations, where samples are usually taken from multiple subpopulations. The reconstructed haplotypes may also be helpful to impute missing data from low coverage sequencing. Another plan for subsequent research is to use information from the paired end read data directly with our estimates, as these reads might be interpreted as very short observed haplotypes.

We implemented our method *haploSep* in an *R* package available on GitHub at https://github.com/MartaPelizzola/haploSep.

## Acknowledgement

We are grateful to the laboratories of Nick Barton, Christian Schlötterer, and Henrique Teotonio for providing us with their experimental data. This work has been supported by the Austrian Science Fund (FWF Doctoral Program “Vienna Graduate School of Population Genetics”, DK W1225-B20). MB was supported by Deutsche Forschungsgemeinschaft (DFG; German Research Foundation) Postdoctoral Fellowship BE 6805/1-1. Moreover, MB acknowledges funding of DFG-GRK 2088. This work benefited from a research stay that was partially supported by the Simons Foundation and by the Mathematisches Forschungsinstitut Oberwolfach. AM and MB acknowledge support of DFG-SFB 803 Z02. AM and HL are funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC 2067/1 - 390729940.

## References

- [Barghi et al., 2019]
- [Behr, 2018]
- [Behr et al., 2018]
- [Behr and Munk, 2017]
- [Browning et al., 2018]
- [Browning and Browning, 2011]
- [Burke, 2012]
- [Burke et al., 2010]
- [Cao and Sun, 2015]
- [Castro et al., 2019]
- [Delaneau et al., 2019]
- [Excoffier and Slatkin, 1995]
- [Franssen et al., 2017]
- [Futschik and Schlötterer, 2010]
- [Gasbarra et al., 2011]
- [Griffin et al., 2017]
- [Illingworth et al., 2012]
- [Karasov et al., 2010]
- [Kawecki et al., 2012]
- [Kessner et al., 2013]
- [Loh et al., 2016]
- [Long et al., 2015]
- [Long et al., 2011]
- [Lu and Zhou, 2016]
- [Mallard et al., 2018]
- [Marchini et al., 2007]
- [Marchini et al., 2014]
- [Michalak et al., 2019]
- [Noble et al., 2017]
- [Noble et al., 2019]
- [Otte and Schlötterer, 2019]
- [Pirinen, 2009]
- [Sabeti et al., 2002]
- [Schlötterer et al., 2015]
- [Schlötterer et al., 2014]
- [Spitzer et al., 2020]
- [Tewhey et al., 2011]
- [Tilk et al., 2019]
- [Tishkoff et al., 1996]
- [Turner et al., 2011]