Theoretical Population Biology

Volume 89, November 2013, Pages 64-74
Analysis and rejection sampling of Wright–Fisher diffusion bridges

https://doi.org/10.1016/j.tpb.2013.08.005

Abstract

We investigate the properties of a Wright–Fisher diffusion process starting at frequency x at time 0 and conditioned to be at frequency y at time T. Such a process is called a bridge. Bridges arise naturally in the analysis of selection acting on standing variation and in the inference of selection from allele frequency time series. We establish a number of results about the distribution of neutral Wright–Fisher bridges and develop a novel rejection-sampling scheme for bridges under selection that we use to study their behavior.

Introduction

The Wright–Fisher Markov chain is of central importance in population genetics, and it has contributed greatly to the understanding of the patterns of genetic variation seen in natural populations. Much recent work has focused on developing sampling theory for neutral sites linked to sites under selection (Smith and Haigh, 1974, Kaplan et al., 1989, Nielsen et al., 2005, Etheridge et al., 2006). Typically, the site under selection is assumed to have dynamics governed by the diffusion process limit of the Wright–Fisher chain, in which case the genealogy of linked neutral sites can be constructed using the framework of Hudson and Kaplan (1988). However, due to the complicated nature of this model, analytical theory is necessarily approximate, and the main focus is on simulation methods. In particular, a number of simulation programs, including mbs (Teshima and Innan, 2009) and msms (Ewing and Hermisson, 2010), have recently appeared to help facilitate the simulation of neutral genealogies linked to sites undergoing a Wright–Fisher diffusion with selection.

Simulations of Wright–Fisher paths under selection can be easily carried out using standard techniques for simulating diffusions. Frequently, however, it is necessary to simulate a Wright–Fisher path conditioned on some particular outcome. For example, to simulate the path of an allele under selection that is currently at frequency x, a time-reversal argument shows that it is possible to simulate a path starting at x conditioned to hit 0 eventually (Maruyama, 1974). However, more complicated scenarios, including the action of natural selection on standing genetic variation, require more elaborate simulation methods (Peter et al., 2012).

The stochastic process describing an allele that starts at frequency x at time 0 and is conditioned to end at frequency y at time T is called a bridge between x and y in time T, or a bridge between x and y over the time interval [0,T]. Wright–Fisher diffusion bridges appear naturally in the study of selection acting on standing variation because it is necessary to know the path taken by an allele at current frequency y that fell under the influence of natural selection at a time T generations in the past when it was segregating neutrally at frequency x. Wright–Fisher diffusion bridges are also of interest for their application to inference of selection from allele frequency time series (Bollback et al., 2008, Malaspinas et al., 2012, Mathieson and McVean, 2013, Feder et al., 2013). In particular, analysis of bridges can help determine the extent to which more signal is gained by adding further intermediate time points.

In addition to their applied interest, there are interesting theoretical questions surrounding Wright–Fisher diffusion bridges. For alleles conditioned to eventually fix, Maruyama (1974) showed that the distribution of the trajectory does not depend on the sign of the selection coefficient; that is, both positively and negatively selected alleles with the same absolute value of the selection coefficient exhibit the same dynamics conditioned on eventual fixation. It is natural to inquire whether the analogous result holds for a bridge between any two interior points. Moreover, the degree to which a Wright–Fisher bridge with selection will differ from a Wright–Fisher bridge under neutrality is not known (in connection with this question, we recall the well-known fact that the distribution of a bridge for a Brownian motion with drift does not depend on the drift parameter, and so it is conceivable that the presence of selection has little or no effect on the behavior of Wright–Fisher bridges). Lastly, the characteristics of the sample paths of the frequency of alleles destined to be lost in a fixed amount of time are not only interesting theoretically but may also have applications to geographically structured populations (Slatkin and Excoffier, 2012).
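
The Brownian fact recalled above can be checked in a few lines. The following is a sketch in notation that is not the paper's: $p_\mu(x,y;t)$ denotes the transition density of a unit-variance Brownian motion with drift $\mu$, and $u$ is the position of the bridge at an intermediate time $s$.

```latex
% Assumed notation (not from the paper): p_\mu is the transition density of
% Brownian motion with drift \mu and unit variance; 0 < s < T.
\[
p_\mu(x,y;t)
  = \frac{1}{\sqrt{2\pi t}}\exp\!\left(-\frac{(y-x-\mu t)^2}{2t}\right)
  = p_0(x,y;t)\,\exp\!\left(\mu(y-x)-\tfrac{1}{2}\mu^2 t\right).
\]
% The density at time s of a bridge from x to y over [0,T] is a ratio of
% transition densities, and the drift-dependent exponentials cancel:
\[
\frac{p_\mu(x,u;s)\,p_\mu(u,y;T-s)}{p_\mu(x,y;T)}
  = \frac{p_0(x,u;s)\,p_0(u,y;T-s)}{p_0(x,y;T)}
    \cdot
    \frac{e^{\mu(u-x)-\frac{1}{2}\mu^2 s}\;e^{\mu(y-u)-\frac{1}{2}\mu^2 (T-s)}}
         {e^{\mu(y-x)-\frac{1}{2}\mu^2 T}}
  = \frac{p_0(x,u;s)\,p_0(u,y;T-s)}{p_0(x,y;T)}.
\]
```

The cancellation relies on the drift entering the transition density only through a factor that depends on the endpoints and the elapsed time. Nothing in this calculation shows that the same holds for the Wright–Fisher diffusion, which is why the question is posed.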

Here, we investigate various features of Wright–Fisher diffusion bridges. The paper is structured as follows. First, we establish analytical results for neutral Wright–Fisher bridges. Then, we derive a novel rejection sampler for Wright–Fisher bridges with selection, and use it to study the properties of such processes. For example, we estimate the distribution of the maximum of a bridge from 0 to 0 under selection, and investigate how this distribution depends on the strength of selection.

Background

The Wright–Fisher diffusion with genic selection is a diffusion process $\{X_t, t \ge 0\}$ with state space $[0,1]$ and infinitesimal generator $L = \gamma x(1-x)\frac{\partial}{\partial x} + \frac{1}{2}x(1-x)\frac{\partial^2}{\partial x^2}$. When $\gamma = 0$, the diffusion is said to be neutral; otherwise, the drift term captures the strength and direction of natural selection.
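
As an illustration of what this generator means for simulation, the sketch below discretizes the stochastic differential equation $dX_t = \gamma X_t(1-X_t)\,dt + \sqrt{X_t(1-X_t)}\,dW_t$ that corresponds to it, using a plain Euler–Maruyama scheme. This is a generic unconditioned simulator, not the rejection sampler developed in the paper; the function name `simulate_wf_path`, the step count, and the clipping at the boundaries are assumptions made for the example.

```python
import math
import random


def simulate_wf_path(x0, gamma, T, n_steps=1000, rng=None):
    """Euler-Maruyama approximation of dX = gamma*X*(1-X) dt + sqrt(X*(1-X)) dW.

    Returns a list of n_steps + 1 frequencies on an even time grid over [0, T].
    The boundaries 0 and 1 are treated as absorbing (an assumption that matches
    a model without mutation).
    """
    rng = rng or random.Random()
    dt = T / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        if 0.0 < x < 1.0:
            drift = gamma * x * (1.0 - x)
            diffusion = math.sqrt(x * (1.0 - x))
            x += drift * dt + diffusion * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            # The discretized step can overshoot [0, 1]; clip back to the state space.
            x = min(1.0, max(0.0, x))
        path.append(x)
    return path


if __name__ == "__main__":
    path = simulate_wf_path(x0=0.3, gamma=2.0, T=1.0, n_steps=500,
                            rng=random.Random(1))
    print(len(path), path[0], path[-1])
```

The scheme is only approximate near 0 and 1, where the diffusion coefficient vanishes, so a small step size is advisable when paths spend time close to the boundaries.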

The corresponding Wright–Fisher diffusion bridge, $\{X_t^{x,z,[0,T]}, 0 \le t \le T\}$, is the stochastic process that results from conditioning the Wright–Fisher diffusion to start with value x at time 0 and end with value z at

Transition densities for the neutral Wright–Fisher diffusion

When there is no natural selection (i.e., $\gamma = 0$), the transition densities of the Wright–Fisher diffusion can be expressed as $f(x,y;t) = \sum_{l=2}^{\infty} q_l(t) \sum_{k=1}^{l-1} \binom{l}{k} x^k (1-x)^{l-k} B(y;k,l-k)$, where the $q_l(t)$ are the transition functions of a death process starting at infinity with death rate $\frac{1}{2}n(n-1)$ when $n$ individuals are left alive and $B(\cdot;\alpha,\beta)$ is the density of the Beta distribution with parameters $\alpha$ and $\beta$ (Ethier and Griffiths, 1993). That is, $q_l(t)$ is the probability that a Kingman coalescent tree with
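
The mixture structure of this series can be made concrete with a short sketch. It evaluates the double sum for interior points $0 < x, y < 1$, taking the weights $q_l(t)$ as an input rather than computing them. The dictionary `q` (mapping $l$ to a precomputed, truncated set of $q_l(t)$ values) and the function names are assumptions of this example, not part of the paper.

```python
import math


def beta_pdf(y, a, b):
    """Density of the Beta(a, b) distribution at an interior point 0 < y < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(y) + (b - 1) * math.log(1 - y))


def neutral_density(x, y, q):
    """Evaluate sum_l q_l * sum_{k=1}^{l-1} C(l,k) x^k (1-x)^(l-k) Beta(y; k, l-k).

    q is a dict {l: q_l(t)} with l >= 2, assumed to be supplied by the caller
    (for example from a truncated series for the death process).
    """
    total = 0.0
    for l, q_l in q.items():
        inner = 0.0
        for k in range(1, l):
            binom_weight = math.comb(l, k) * x**k * (1 - x) ** (l - k)
            inner += binom_weight * beta_pdf(y, k, l - k)
        total += q_l * inner
    return total


if __name__ == "__main__":
    # Toy weights chosen only to exercise the function; they are not real q_l(t).
    print(neutral_density(0.5, 0.3, {2: 0.6, 3: 0.4}))
```

Read this way, each term first draws the number of lineages $l$, then a Binomial($l$, $x$) count $k$ restricted to $1 \le k \le l-1$, and finally a Beta($k$, $l-k$) frequency. The terms $k = 0$ and $k = l$ are excluded because they correspond to loss and fixation, so the density on the interior need not integrate to one.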

General framework

When selection is incorporated into the Wright–Fisher model, there is no known series formula for the transition density akin to (3.1) (but see Kimura, 1955, Kimura, 1957 for attempts using perturbation theory, as well as Song and Steinrücken (2012) and Steinrücken et al. (2012) for methods of approximating an eigenfunction expansion computationally). Therefore, analytical results for distributions associated with the corresponding bridge like those we obtained in the neutral case are not

Discussion

We have examined the behavior of Wright–Fisher diffusion bridges under both neutral models and models with genic selection. Although various conditioned Wright–Fisher diffusions have been studied in the past, Wright–Fisher diffusions conditioned to attain a specific value at a predetermined time have not been studied extensively. We have elucidated some of the properties of Wright–Fisher bridges using a combination of analytical theory and simulations.

In contrast to Brownian motion with drift,

Acknowledgments

The authors thank M. Slatkin and B. Peter for initial discussions that led to our interest in this topic.

JGS was supported in part by NIH NRSA trainee appointment grant T32-HG00047 and by NIH grant R01-GM40282. RCG was supported by the Miller Institute for Basic Research in Science, University of California at Berkeley. SNE was supported in part by NSF grant DMS-0907630.

References

  • A. Beskos et al. Exact simulation of diffusions. Annals of Applied Probability (2005)
  • J.P. Bollback et al. Estimation of $2N_e s$ from temporal allele frequency data. Genetics (2008)
  • J.F. Crow et al. An Introduction to Population Genetics Theory (1970)
  • E. Csáki et al. On the joint distribution of the maximum and its location for a linear diffusion. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques (1987)
  • A. Etheridge et al. An approximate sampling formula under genetic hitchhiking. Annals of Applied Probability (2006)
  • S.N. Ethier et al. The transition function of a Fleming–Viot process. Annals of Probability (1993)
  • G. Ewing et al. MSMS: a coalescent simulation program including recombination, demographic structure and selection at a single locus. Bioinformatics (Oxford, England) (2010)
  • Feder, A., Kryazhimskiy, S., Plotkin, J.B., 2013. Identifying signatures of selection in genetic time series. arXiv...
  • R. Fisher. On the dominance ratio. Proceedings of the Royal Society of Edinburgh (1922)
  • R.C. Griffiths et al. Diffusion processes and coalescent trees
  • R.R. Hudson et al. The coalescent process in models with selection and recombination. Genetics (1988)
  • N. Ikeda et al.
  • N.L. Kaplan et al. The “hitchhiking effect” revisited. Genetics (1989)
  • M. Kimura. Some problems of stochastic processes in genetics. The Annals of Mathematical Statistics (1957)
  • M. Kimura. Stochastic processes and distribution of gene frequencies under natural selection
