## Abstract

Understanding past dispersal and breeding events can provide insight into ecology and evolution, and can help inform conservation strategies and the control of pest species. However, parent-offspring dispersal can be difficult to investigate in rare species and in small pest species such as mosquitoes. Here we develop a methodology for estimating parent-offspring dispersal from the spatial distribution of close kin, using pairwise kinship estimates derived from genome-wide single nucleotide polymorphisms (SNPs). SNPs were scored in 162 *Aedes aegypti* (yellow fever mosquito) collected from eight close-set, high-rise apartment buildings in an area of Malaysia with high dengue incidence. We used the SNPs to reconstruct kinship groups across three orders of kinship. We transformed the geographical distances between all kin pairs within each kinship category into axial standard deviations of these distances, then decomposed these into components representing past dispersal events. Using these components, we isolated the axial standard deviation of parent-offspring dispersal. From this, we estimated neighbourhood area (radius 129 m), median parent-offspring dispersal distance (75 m), and oviposition dispersal radius within a gonotrophic cycle (36 m). We also analysed genetic structure using distance-based redundancy analysis and linear regression, finding isolation by distance both within and between buildings, and estimating neighbourhood size at 268 individuals. These findings indicate the scale required to suppress local outbreaks of arboviral disease and to target releases of modified mosquitoes for mosquito and disease control. Our methodology is readily implementable for studies of other species, including pests and species of conservation significance.

## Introduction

Dispersal is a key trait in ecology and evolution, allowing species to evade stressful areas and locate favourable new areas, and determining levels of gene flow that influence the adaptability of species (Clobert, Baguette, Benton, & Bullock, 2012). Knowledge of dispersal can also be vital to applied research that seeks to determine the conservation status of threatened species or the biosecurity risk of pest species, or for managing outbreaks of pests and disease vectors once invasions have taken place (Killeen, Knols, & Gu, 2003; Ouborg, Piquot, & Van Groenendael, 1999). Dispersal has traditionally been investigated by marking and tracking individuals across a landscape. However, these methods are often laborious and do not capture past dispersal patterns that produce gene flow. New methods and tools in landscape genetics can help understand past dispersal and gene flow patterns, including potential environmental parameters that limit dispersal (Schmidt, Filipovic, Hoffmann, & Rasic, 2018; Watts et al., 2007). The advent of high-density sequencing technologies has also provided increased power for landscape genomic studies conducted at spatial scales fine enough to investigate dispersal discretely over generations, and to measure individual acts of movement (Schmidt, Rasic, et al., 2017).

Under Wright’s isolation by distance framework, the impact of dispersal on genetic variation in continuously distributed populations is described in terms of the neighbourhood size (*NS*), a product of parent-offspring dispersal variance and the ideal density of breeding adults within the dispersal area (Wright, 1946). Wright’s equation, *NS* = 4*πσ*^{2}*d*, represents a key bridge between local demographic processes and broader patterns of genetic differentiation. As regards the ecology and genomics of a species, this bridge can be crossed in both directions, either using genomic differentiation to understand processes of dispersal and density, or alternately using estimates of density and dispersal to gain an insight into likely processes of genomic differentiation.

The link between *NS* and dispersal has been investigated using composite methodologies incorporating both individual genetic distances and mark-release-recapture (MRR) (Sahlsten, Thorngren, & Hoglund, 2008; Watts et al., 2007). However, for many species it is difficult to estimate density or dispersal using MRR methods. For instance, the small size of mosquitoes necessitates the use of laboratory-reared mosquitoes that are released for recapture from a central point (Honorio et al., 2003; Reiter, Amador, Anderson, & Clark, 1995). If marking dyes/powders negatively impact the fitness of released individuals, or if rearing conditions do not adequately represent release conditions, this methodology risks overestimating or underestimating dispersal distances. For example, *Aedes aegypti* (yellow fever mosquito) reared in suboptimal larval conditions may settle for ovipositing in suboptimal sites more readily than those reared in favourable larval conditions (Kaur, Lai, & Giger, 2003). Thus, if rearing conditions are unfavourable compared to conditions at the release site, released mosquitoes may disperse less than mosquitoes from the wild population under investigation.

MRR techniques can also be expanded to incorporate information from genetic inferences of kinship (Bravington, Skaug, & Anderson, 2016). This framework of close-kin mark-recapture (CKMR) uses a genetic ‘mark’ that extends beyond the individual by considering individuals to have genetically ‘marked’ their close kin, such that any kin sampled constitute an extended ‘recapture’ of the original individual. Kinship estimation can also be useful for investigating migration between populations, particularly in cases where migration rates have changed too recently to have produced corresponding changes in population genetic structure (Palsbøll, Zachariah, & Bérubé, 2010), such as might be expected in invasive species. While studies using microsatellites can often only confidently identify first-order kin relations (parent-offspring or full-sibling), the use of high density, genome-wide molecular markers can enable reasonably accurate assignment of individuals to second-order (e.g. half-sibling) and third-order (e.g. first cousin) groupings (Phillips, García-Magariños, Salas, Carracedo, & Lareu, 2012).

In this paper, we develop a methodology that uses genome-wide single nucleotide polymorphisms (SNPs) to estimate relatedness across three orders of kinship, and then uses the spatial distribution of these kin to estimate parent-offspring dispersal and related parameters. Our method inherits from CKMR, but exploits the fact that the spatial separation distance between each kin pair can be characterised as a set of distances representing past parental dispersal events (Figure 1), provided that the population is sampled at a specific point in time and from a set of sampling locations distributed continuously through space. By separating the composite life-histories associated with each kinship category into discrete parent-offspring dispersal events, we derive estimates of parent-offspring dispersal distances. Also, as most populations sampled continuously through space should have more second-order than first-order relations, and more third-order than second-order relations, this approach provides a means of investigating dispersal using a greater number of data points than by using first-order relatives alone, as dispersal distance estimates from first, second, and third-order relatives can all be incorporated into a single analysis. By using thousands of SNPs to estimate kinship and by integrating three orders of kinship into a single analysis, our methodology can be conducted with smaller sample sizes than those often required for kinship studies, a requirement which may have prevented more widespread adoption of kinship-based methodologies (Palsbøll et al., 2010).

We use the above methodology to investigate dispersal in *Ae. aegypti*, the primary vector of arboviral diseases such as dengue, chikungunya, and Zika (Morrison, Zielinski-Gutierrez, Scott, & Rosenberg, 2008). This highly anthropophilic mosquito is generally considered a weak disperser by flight (Harrington et al., 2005), though some MRR studies have reported flights over long distances (Honorio et al., 2003; Reiter et al., 1995). Given a local abundance of human hosts, the primary driver of dispersal in females is the search for suitable oviposition sites (Edman et al., 1998), and when undertaking ‘skip’ oviposition a female may oviposit at several sites during a single gonotrophic cycle (Reiter, 2007). Typically, females mate only once in their lives, while males may have multiple partners (Christophers, 1960). Despite its limited active dispersal, *Ae. aegypti* has managed to invade much of the global tropics over the past several centuries, dispersing passively along human trade and transport routes (Powell & Tabachnick, 2013).

We focus on dispersal across a residential site in Kuala Lumpur, Malaysia, as part of preparation for active interventions involving releases of *Ae. aegypti* transinfected with the bacterium *Wolbachia*. These interventions aim to replace uninfected *Ae. aegypti* with *Wolbachia*-infected mosquitoes that have a reduced potential to transmit dengue (Hoffmann et al., 2011; O’Neill et al., 2018; Schmidt, Barton, et al., 2017). The long-term persistence and spread of *Wolbachia* in *Ae. aegypti* will strongly depend on local *Ae. aegypti* dispersal characteristics (Hoffmann et al., 2014; Schmidt, Barton, et al., 2017; Turelli & Barton, 2017). Understanding intergenerational movement will also help inform strategies for responding to newly-detected *Ae. aegypti* incursions globally or to localised dengue outbreaks by chemical suppression of populations. However, applications of this methodology extend well beyond *Ae. aegypti* and other pest vectors. For instance, a key question in insecticide resistance management in agricultural pests concerns local movement of resistance alleles once resistance first arises, to help reduce resistance spread (Maino, Binns, & Umina, 2018). And in threatened species, understanding the spread of individuals across metapopulations can be vital in managing the species (Szczys, Oswald, & Arnold, 2017). Our approach to estimating dispersal and neighbourhood size therefore has numerous applications for both threatened and threatening species.

## Materials and Methods

### Mosquito sampling

*Aedes aegypti* were collected from Mentari Court in Petaling Jaya, Malaysia, between September 19^{th} and October 9^{th}, 2017. Mentari Court consists of a set of 18-storey apartment blocks occupying a 500 × 250 m area, enclosing a vegetated area and a school (Figure 2). We deployed ovitraps on either the third or the fourth floor of each building, which was of sufficient height to avoid sampling *Ae. albopictus*, which are common at ground level in Mentari Court. We collected and stored *Ae. aegypti* larvae every week for three weeks. We divided the study region into nine sample sites across eight buildings (Figure 2) and sampled 18 mosquitoes from each site for DNA extraction, selecting six mosquitoes from different traps from each of the three weeks.

### DNA extraction and genotyping

DNA was extracted from the 162 *Ae. aegypti* using a Roche High Pure™ PCR Template Preparation Kit (Roche Molecular Systems, Inc., Pleasanton, CA, USA), with the addition of an RNase A digestion step. Extracted DNA was used to construct two RAD libraries containing 81 individuals each. We followed the double-digest Restriction-site Associated DNA sequencing protocol for *Ae. aegypti* developed by Rasic, Filipovic, Weeks, and Hoffmann (2014), but selected a smaller size range of DNA fragments (350 – 450 bp) to accommodate the larger number of individuals in each library. The libraries were sequenced on the Illumina Hiseq3000 platform at the Monash Health Translation Facility, obtaining 100 bp paired-end reads and using a 20% Phi-X spike-in to reduce the impact of low diversity RADtags.

Following sequencing, barcoded reads were filtered, truncated to 80 bp, and demultiplexed using the process_radtags program in Stacks v2.0 (Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013). Paired and single-end reads were concatenated, and aligned to the *Ae. aegypti* nuclear genome assembly AaegL4 (Dudchenko et al., 2017) using Bowtie 2 (Langmead & Salzberg, 2012), with --very-sensitive alignment settings. We used the program ref_map to build a Stacks catalog and the program Populations to select SNP loci that were scored in at least 70% of mosquitoes. We filtered further with VCFtools (Danecek et al., 2011), retaining SNPs in Hardy-Weinberg equilibrium (P = 1 × 10^{−5}) and with minor allele frequencies > 0.05, and thinned the remaining SNPs so that none was within 250 kbp of another. Thinning at this threshold in *Ae. aegypti* retains approximately 8 SNPs per map unit, a sampling density shown to largely eradicate linkage effects in SNPs (Cho & Dupuis, 2009). This set of 3,939 SNPs was used for all downstream analyses.

### Estimation of kinship categories and coefficients

To estimate the kinship category of each pair of mosquitoes, we calculated Loiselle’s kinship coefficient *k* (Loiselle, Sork, Nason, & Graham, 1995) in the program SPAGeDi (Hardy & Vekemans, 2002). Kinship coefficients represent the probability that any allele scored in both individuals is identical by descent (Wright, 1922), with theoretical mean *k* values for each kinship category as follows: full-siblings = 0.25, half-siblings = 0.125, full-cousins = 0.0625, half-cousins = 0.0313, second cousins = 0.0156, unrelated = 0. We ignored intergenerational kinship categories (e.g. uncle-niece), as the maximum of two weeks between the sampling of any pair of individuals would be insufficient time for the ontogeny of an additional generation (Christophers, 1960).

To assign pairs of individuals to relatedness categories across three orders of kinship (i.e. up to first cousins), we first used maximum-likelihood estimation in the program ML-Relate (Kalinowski, Wagner, & Taper, 2006) to identify first order (full-sibling) and second order (half-sibling) pairs. We used the *k* scores of pairs within the full-sibling and half-sibling datasets to calculate standard deviations for these categories.

However, ML-Relate is not configured to determine third order relationships (e.g. cousins). To determine cousins, we estimated a lower bound of *k* that separated first cousins from unrelated pairs and those of more distant kin groups. Here we define first cousins as including both full-cousins and half-cousins (Figure 1). We then produced simulated *k* scores for each kinship category, assuming that the *k* scores within each kinship category followed a normal distribution with a unique mean and standard deviation, and that these scores combined produced the entire distribution of *k* scores. Standard deviations for full-cousins, half-cousins, and second cousins were assumed to correspond to the standard deviation of the entire population after full-sibling and half-sibling pairs were removed.

Using the theoretical means and standard deviations of *k*, we randomly sampled 100,000 simulated *k* scores from each kinship category. In the initial pool of 13,041 empirical mosquito pairings, ML-Relate identified approximately 50 full-sibling and half-sibling pairs. Assuming that the data contained twice as many first cousin (full and half) pairings as sibling (full and half) pairings, and twice as many second cousin pairings as first cousin pairings, final sampling distributions were developed as follows: 100,000 unrelated, 2,000 second cousins, 500 full-cousins, 500 half-cousins, 250 half-siblings, and 250 full-siblings, giving a ratio of 400:8:2:2:1:1. This assumption is reasonable if the local population size is approximately constant; for a diploid population of constant size, an average of two offspring from any one individual will themselves have offspring. If we take *n* to be the number of larvae parented by one individual, the ratio of first cousin to sibling pairings would be approximately , which ranges from 1.1 – 4 for 2 ≤ *n* ≤ 10. If we allow a slight population expansion with an average of 2.4 reproducing offspring, ratios from 1 ≤ *n* ≤ 10 produce values from 2.2 – 2.4.

To analyse how closely this distribution approximates the field data, we randomly sampled 10,000 simulated *k* scores from the above sampling distribution and plotted a histogram of this combined distribution and a histogram of the unrelated distribution against a histogram of 10,000 *k* scores from the empirical data. As the combined distribution matched the empirical distribution much more closely than the unrelated distribution (Supplementary Figure 1), we adopted it for kinship inference. Note that this unrelated distribution is not completely reflective of a natural population, as it assumes total random mating and zero population structure. It is a convenient null from which various models including this one can be developed and compared.

To determine a lower threshold of *k* to define a kinship category for first cousins, we plotted histograms of *k* scores for each kinship category, following the ratio described above (Supplementary Figure 2). We assigned *k* = 0.06 as the lower threshold to describe first cousin relationships. Individual pairs of *k* > 0.06 that were neither full-siblings nor half-siblings were much more likely to be first cousins than any other category.
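The simulation described above can be sketched roughly as follows. This is an illustration only: the category means and the 400:8:2:2:1:1 pool sizes come from the text, but the single standard deviation `SD` is a placeholder assumption (the study derived its standard deviations from the data and they are not stated here), and the function names are hypothetical.

```python
import numpy as np

# Theoretical mean kinship coefficients per category (from the text).
MEANS = {
    "unrelated": 0.0,
    "second_cousin": 0.0156,
    "half_cousin": 0.0313,
    "full_cousin": 0.0625,
    "half_sibling": 0.125,
    "full_sibling": 0.25,
}
# Pool sizes giving the 400:8:2:2:1:1 ratio described in the text.
COUNTS = {
    "unrelated": 100_000,
    "second_cousin": 2_000,
    "half_cousin": 500,
    "full_cousin": 500,
    "half_sibling": 250,
    "full_sibling": 250,
}
# ASSUMPTION: placeholder standard deviation shared by all categories.
SD = 0.02

def simulate_pool(rng, sd=SD):
    """Draw normally distributed k scores for every category and pool them."""
    ks, labels = [], []
    for name, n in COUNTS.items():
        ks.append(rng.normal(MEANS[name], sd, size=n))
        labels.extend([name] * n)
    return np.concatenate(ks), np.array(labels)

def shares_above(ks, labels, threshold=0.06):
    """Category shares among non-sibling simulated pairs with k > threshold."""
    is_sibling = np.isin(labels, ["half_sibling", "full_sibling"])
    kept = labels[(ks > threshold) & ~is_sibling]
    names, counts = np.unique(kept, return_counts=True)
    return {str(n): c / counts.sum() for n, c in zip(names, counts)}

rng = np.random.default_rng(1)
ks, labels = simulate_pool(rng)
shares = shares_above(ks, labels)
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:>14s}: {share:.2f}")
```

Under this placeholder standard deviation, full-cousins form the largest share of non-sibling pairs above the threshold; the exact shares depend strongly on the standard deviations used, so the numbers printed here should not be read as results of the study.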

### Inference of dispersal distributions

We treated separation distances between pairs of sampled kin as representative of multiple past dispersal events, following Figure 1. Full-sibling separation distances represent the dispersal of the mother during oviposition. Half-sibling separation distances represent the breeding dispersal of the father between matings, plus the post-mating dispersal of the mother between mating and oviposition, plus the ovipositional dispersal of the mother. First cousin separation distances represent the ovipositional dispersal of the grandmother, plus the pre-mating dispersal of each parent, plus the post-mating dispersal and ovipositional dispersal of the mother. In all cases, we considered dispersal to operate across a two-dimensional plane, and ignored any differences in sampling altitude between buildings.

Axial standard deviations correspond in form to the dispersal component of *NS* (Wright, 1946). We calculated axial standard deviations from the distributions of separation distances for each kinship category. To do this, we (i) projected distances onto a polar coordinate system with a random angle of rotation, (ii) converted the distances to one-dimensional vectors by multiplying each distance by the cosine of its rotation angle and (iii) calculated the standard deviation of the resulting distribution. The final axial standard deviations were estimated by applying steps i) – iii) to each kinship category. See Supplementary Text for details.
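Steps (i) to (iii) can be sketched as below. This is a minimal illustration rather than the procedure from the Supplementary Text: the function name is hypothetical, and averaging the result over repeated random rotations (`n_rep`) is an assumption made here to stabilise the estimate.

```python
import numpy as np

def axial_sd(distances, rng, n_rep=200):
    """Estimate an axial standard deviation from kin separation distances.

    ASSUMPTION: the estimate is averaged over n_rep sets of random angles.
    """
    d = np.asarray(distances, dtype=float)
    sds = np.empty(n_rep)
    for i in range(n_rep):
        # (i) give every separation distance a random angle of rotation
        theta = rng.uniform(0.0, 2.0 * np.pi, size=d.size)
        # (ii) project each distance onto one axis
        axial = d * np.cos(theta)
        # (iii) standard deviation of the one-dimensional projections
        sds[i] = axial.std(ddof=1)
    return float(sds.mean())

# Hypothetical separation distances in metres, for illustration only.
rng = np.random.default_rng(0)
example = [5.0, 12.0, 20.0, 33.0, 18.0, 7.0, 41.0, 26.0]
print(round(axial_sd(example, rng), 1))
```

A useful sanity check is that when every separation distance equals the same value *D*, the projected values have variance close to *D*^{2}/2, so the axial standard deviation approaches *D*/√2.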

### Determination of parent-offspring dispersal

Parent-offspring dispersal is described by Wright (1946) as the distribution of parents at some phase of the life cycle relative to offspring at the same phase. These events serve to produce geographical distributions of not only full-siblings and half-siblings but full-cousins and half-cousins as well (Figure 1). Specifically, the full-cousin and half-cousin dispersal distributions are produced by combining the parent-offspring dispersal distributions with the full-sibling and half-sibling distributions respectively. As the variance of the normal distribution formed by the combination of two normal distributions is equal to the sum of the component variances, we can infer parent-offspring axial dispersal, σ_{PO}, with the following equations:

$$\sigma_{PO} = \sqrt{\sigma_{FC}^{2} - \sigma_{FS}^{2}} \tag{1}$$

$$\sigma_{PO} = \sqrt{\sigma_{HC}^{2} - \sigma_{HS}^{2}} \tag{2}$$

where σ_{FS}, σ_{HS}, σ_{FC}, and σ_{HC} are the axial standard deviations of the full-sibling, half-sibling, full-cousin, and half-cousin separation distances respectively.

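As we assume a mixed distribution of full-cousins and half-cousins in the kinship pairings, these represent the upper (full-cousin) and lower (half-cousin) bounds of parent-offspring dispersal estimates using the first cousin distribution. Assuming a 50:50 ratio of full-cousins and half-cousins in our data, the mean parent-offspring axial dispersal can be approximated by the following equation:

$$\sigma_{PO} \approx \sqrt{\sigma_{1C}^{2} - \frac{\sigma_{FS}^{2} + \sigma_{HS}^{2}}{2}} \tag{3}$$

where σ_{1C} is the axial standard deviation of the pooled first cousin separation distances, and σ_{FS} and σ_{HS} are the axial standard deviations of the full-sibling and half-sibling separation distances.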

We use equation (3) for all future calculations of parent-offspring dispersal. For each axial dispersal distribution (σ), 2σ represents the effective radius of dispersal, within which 86.5% of dispersed individuals are expected to be found (Wright, 1946). In the case of parent-offspring dispersal, this constitutes the radius of a circle defining Wright’s neighbourhood area, the spatial component of the *NS* estimate within an isolation by distance framework.
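The 86.5% figure can be checked with a short sketch, assuming dispersal follows a circular bivariate normal distribution with the same axial standard deviation σ on both axes. Under that assumption the radial distance follows a Rayleigh distribution, so the share of individuals within a radius of *c*σ is 1 − exp(−*c*^{2}/2). The function names below are hypothetical.

```python
import math
import numpy as np

def fraction_within(radius_in_sigmas):
    """Closed-form share of a circular bivariate normal within c * sigma."""
    return 1.0 - math.exp(-radius_in_sigmas**2 / 2.0)

def simulated_fraction(radius_in_sigmas, sigma=1.0, n=200_000, seed=0):
    """Monte Carlo check: share of simulated displacements within c * sigma."""
    rng = np.random.default_rng(seed)
    xy = rng.normal(0.0, sigma, size=(n, 2))
    r = np.hypot(xy[:, 0], xy[:, 1])
    return float(np.mean(r <= radius_in_sigmas * sigma))

print(round(fraction_within(2.0), 4))  # 0.8647
print(round(simulated_fraction(2.0), 3))
```

The closed form does not depend on the value of σ itself, only on the radius expressed in units of σ.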

### Geographical genetic structure and neighbourhood size

We performed distance-based redundancy analyses (dbRDA) to quantify the effects of geographical distance, sampling week, and building residency on patterns of genetic distance observed among *Ae. aegypti* at Mentari Court. For these tests, we resampled the dataset so that no full-siblings or half-siblings were included, retaining 130 individuals. This was necessary as the inclusion of sibling pairs can bias estimates of population structure (Goldberg & Waits, 2010). We were interested in whether genetic distances were spatially structured at this scale, and if so whether structure best fit a pattern of isolation by distance, isolation by time, isolation by building residency, or some combination of the three. All analyses were performed with the R package “vegan” (Oksanen et al., 2010), and all geographical distances were transformed with a natural logarithmic function.

We used dbRDA to build three models. The first (“dist”) tested for effects of only geographical distance on genetic distance. The other models tested for effects of geographical distance on genetic distance after conditioning for the effects of sampling week and building (“dist | *env*”), and for effects of sampling week and building on genetic distance after conditioning for the effects of geographical distance (“env | *dist*”). We quantified the marginal explanative power of each variable in each of the dbRDAs using ANOVA with 10,000 permutations. The functions *capscale* and *anova.cca* were used for dbRDA and ANOVA respectively.

As dbRDA requires independent variables to have one-dimensional inputs, we first transformed the matrix of pairwise geographical distances into eight principal components (PCs) using the function *pcnm*. Sampling time was treated as a continuous variable of integers corresponding to the first, second, and third week of sampling, while building residency was modelled categorically. We used a pairwise matrix of Rousset’s *a* scores (Rousset, 2000) as the dependent variable, calculated in SPAGeDi. For geographical distance in the dbRDAs “dist | *env*” and “env | *dist*”, we used only the PCs that were significant in the dbRDA “dist” (P < 0.006 following Bonferroni correction).

In a population experiencing isolation by distance, *NS* is estimated as the inverse of the slope of the regression of pairwise genetic distance against the natural logarithm of geographical distance (Rousset, 2000). That is, *NS* = *b*^{−1}, where *Genetic distance* = *a* + ln(*Geographical distance*) * *b*. We used the R function “lm” to perform three of these regressions. The first, using all non-sibling pairs of individuals, we used to estimate *NS* following the above. For the other two regressions we considered only within-building pairings and between-building pairings, to investigate isolation by distance within and between buildings respectively.
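The regression step can be sketched as follows. The study used the R function “lm”; this illustration swaps in NumPy's `polyfit` for the same ordinary least-squares fit, uses synthetic noise-free data, and the function name is hypothetical. The slope of 0.0037 is the rounded value reported in the Results, whose inverse (about 270) is close to the reported *NS* of 268.

```python
import numpy as np

def neighbourhood_size(geo_dist_m, gen_dist):
    """Fit gen_dist = a + b * ln(geo_dist) and return (NS, slope, intercept)."""
    log_dist = np.log(np.asarray(geo_dist_m, dtype=float))
    slope, intercept = np.polyfit(log_dist, np.asarray(gen_dist, dtype=float), deg=1)
    return 1.0 / slope, slope, intercept

# Synthetic, noise-free pairs generated with a known slope (illustration only).
geo = np.linspace(5.0, 400.0, 50)      # pairwise geographical distances (m)
gen = 0.01 + 0.0037 * np.log(geo)      # pairwise genetic distances
ns, b, a = neighbourhood_size(geo, gen)
print(round(ns, 1), round(b, 4))
```

With real data, pairwise distances are not independent observations, so the confidence interval for *NS* would need a resampling approach rather than the standard errors of the fit.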

## Results

### Kinship inference

Using the 3,939 SNPs retained after filtering, we identified 13 full-sibling pairs and 34 half-sibling pairs using ML-Relate. We also designated 51 pairs with *k* > 0.06 as first cousins. Mean separation distance for full-siblings was 18.1 m, for half-siblings it was 48.6 m, and for cousins it was 75.1 m (Table 1). Supplementary Figure 3 shows the spatial distribution of *k* scores across geographical distance for all kinship categories as scatterplots (3A) and density histograms (3B).

### Determination of parent-offspring dispersal via kinship distributions

For each kinship category, we used geographical distances of separation between kin to construct a bivariate normal distribution, from which we calculated the axial standard deviation of dispersal (Table 1). Using the full-sibling, half-sibling, and first cousin standard deviations, we estimated parent-offspring axial dispersal as 64 m (23 – 93 m; 95% C.I.). The radius of effective parent-offspring dispersal is equal to double the parent-offspring axial standard deviation, coming to 129 m (47 – 185 m; 95% C.I.). This corresponds to neighbourhood area, the circle within which 86.5% of parent-offspring dispersal occurs.

Following the process for parent-offspring dispersal, we generated dispersal radii for each kinship distribution. Referencing the axial standard deviations in Table 1, we derived a dispersal radius of 36 m (12 – 59 m; 95% C.I.) for full-siblings, representing ovipositional dispersal; a dispersal radius of 116 m (53 – 180 m; 95% C.I.) for half-siblings, representing breeding and ovipositional dispersal; and a dispersal radius of 154 m (104 – 202 m; 95% C.I.) for first cousins (see Figure 1). Additionally, through simulations of the parent-offspring axial standard deviation, we estimated the median parent-offspring dispersal distance as 75 m, which represents the range within which 50% of dispersal events lie.
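As a rough consistency check on these figures, the sketch below recomputes parent-offspring axial dispersal from the rounded radii above. It makes several assumptions: the axial standard deviations are taken as half of each reported radius (so they carry rounding error), parent-offspring variance is taken as the first cousin variance minus the mean of the full-sibling and half-sibling variances (the form implied by the Methods), and the median is computed from the closed-form Rayleigh median rather than by simulation as in the study. Function names are hypothetical.

```python
import math

def sigma_po(sd_first_cousin, sd_full_sib, sd_half_sib):
    """Parent-offspring axial SD, assuming a 50:50 mix of full- and half-cousins."""
    var = sd_first_cousin**2 - (sd_full_sib**2 + sd_half_sib**2) / 2.0
    if var < 0:
        raise ValueError("sibling variance exceeds first cousin variance")
    return math.sqrt(var)

def rayleigh_median(sigma):
    """Median radial distance for a circular bivariate normal with axial SD sigma."""
    return sigma * math.sqrt(2.0 * math.log(2.0))

# ASSUMPTION: axial SDs back-calculated as half the rounded radii (36, 116, 154 m).
sd_fs, sd_hs, sd_1c = 36 / 2, 116 / 2, 154 / 2

s_po = sigma_po(sd_1c, sd_fs, sd_hs)
print(f"sigma_PO ~ {s_po:.1f} m")                   # close to the reported 64 m
print(f"radius   ~ {2 * s_po:.0f} m")               # close to the reported 129 m
print(f"median   ~ {rayleigh_median(s_po):.0f} m")  # close to the reported 75 m
```

Small differences from the reported values are expected because the inputs here are rounded.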

Pairwise locations of all full-siblings, half-siblings, and first cousins are shown in Figure 2. All full-sibling pairs were found in the same building (Figure 2a). Half-sibling pairs tended to be found in the same building or an adjacent building, although two pairs were found between nonadjacent blocks (Figure 2b). First cousin pairs were mostly distributed between adjacent buildings, but were distributed between nonadjacent buildings more often than half-siblings (Figure 2c).

### Geographical genetic structure and neighbourhood size

The dbRDA evaluating the effects of geographical distance on genetic structure (“dist”) indicated that 3 of 8 PCs were within the significance threshold (Bonferroni-corrected critical value: P < 0.006). These PCs were used as the geographical distance covariate in the dbRDAs “env | *dist*” and “dist | *env*” (Table 2). After conditioning for the effects of geographical distance (Bonferroni-corrected critical value: P < 0.025), “env | *dist*” showed that building residency had a significant effect on patterns of genetic structure (P = 0.002; F = 1.67; 7 d.f.), while sampling week did not (P = 0.854; F = 0.690; 2 d.f.). After conditioning for the effects of building residency and sampling week, the 3 PCs in “dist | *env*” showed no significant association between geographical distance and genetic distance (P = 0.0901; F = 1.155; 3 d.f.). As building residency and geographical distance were necessarily correlated, these results indicate a significant effect of building residency independent of geographical distance, and an uncertain effect of geographical distance.

We explored the effects of geographical distance further using linear regressions. The three linear regressions calculated for all non-sibling pairs (P < 2 × 10^{−16}; R^{2} = 0.0090; slope = 0.0037), all non-sibling pairs within the same building (P = 0.012; R^{2} = 0.0062; slope = 0.0041), and all non-sibling pairs in different buildings (P = 1.3 × 10^{−9}; R^{2} = 0.0048; slope = 0.00484), all showed significant positive associations between geographical and genetic distances (Supplementary Figure 4). These isolation by distance patterns indicate that *Ae. aegypti* populations are spatially structured within buildings as well as between buildings, show slightly stronger structuring between buildings than within buildings, and show a clear effect of geographical distance independent of building residency. Considering the size of buildings at Mentari Court, this implies isolation by distance at spatial scales of less than 200 m. Using the slope of the regression for all non-sibling pairs, we calculated *NS* as 268 *Ae. aegypti* (222 – 345; 95% C.I.). This constitutes the effective number of breeding individuals inhabiting a circle of radius two times the axial standard deviation of parent-offspring dispersal (Wright, 1946), that is, a circle of radius 129 m (258 m in diameter).

## Discussion

This study has developed a methodology that uses genomic inferences of kinship to estimate parent-offspring dispersal and related population parameters, and applied this methodology to investigate an urban population of the invasive disease vector *Ae. aegypti*. From 162 sequenced individuals we identified 98 pairs of kin across three orders of kinship. By limiting the duration of sampling and by sampling only eggs, we were able to ignore intergenerational kin categories, and thus assigned the 98 pairs to intragenerational categories: full-sibling, half-sibling, full-cousin, and half-cousin. After transforming the spatial distances separating kin pairs within each category into axial standard deviations, we isolated the parent-offspring component of dispersal, which corresponds to the difference between the axial standard deviations of full-siblings and full-cousins and of half-siblings and half-cousins. Using our final estimate of parent-offspring dispersal, we calculated neighbourhood area (radius 129 m), median parent-offspring dispersal distance (75 m), and radius of ovipositional dispersal within a gonotrophic cycle (36 m). Our additional analyses with dbRDA and linear regression revealed isolation by distance both within and between buildings, and provided an estimate of *NS* (Wright, 1946) of 268 individuals. These results indicate the scale of intergenerational dispersal and population structuring in *Ae. aegypti*, and thus also the scale at which efforts must be focused to restrict new invasions, suppress or transform established populations, and respond strategically to arboviral outbreaks. With appropriate modifications where required, our methodology will be readily applicable to studies of other species of interest such as pests and those of conservation significance.

Our estimates of parent-offspring dispersal, combined with the results of the dbRDAs and linear regressions, suggest that the *Ae. aegypti* population of Mentari Court is hierarchically structured. The linear regressions indicate isolation by distance patterns both within (< 200 m) and between buildings, while the dbRDAs showed that building residency had an effect on structure even after accounting for geographical distance. These results suggest dispersal within buildings is less restricted than dispersal between buildings, and that dispersal between adjacent buildings is more common than between nonadjacent buildings. This interpretation is in accordance with kin observations, wherein most kin pairs found in nonadjacent buildings were cousins (Figure 2), which are a result of multiple parent-offspring dispersal events (Figure 1). This also accords with our estimate that 86.5% of parent-offspring dispersal occurs within a circle of radius 129 m, which corresponds in scale to movement within buildings and between adjacent buildings but not between nonadjacent buildings. Thus, dispersal of *Ae. aegypti* between nonadjacent buildings at Mentari Court is likely to be a multigenerational process. This equates to a matter of months for *Ae. aegypti*, which typically has 10 – 12 generations per year (Christophers, 1960).

These findings are broadly consistent with previous estimates from MRR studies which describe a tendency of *Ae. aegypti* to stay within the building of its release, and when dispersing farther rarely travel more than 150 m (Harrington et al., 2005; Maciel-de-Freitas, Codeco, & Lourenco-de-Oliveira, 2007; Ordonez-Gonzalez, Mercado-Hernandez, Flores-Suarez, & Fernandez-Salas, 2001; Russell, Webb, Williams, & Ritchie, 2005). Occasional long-distance dispersal > 500 m has also been recorded through MRR (Honorio et al., 2003; Reiter et al., 1995); while these dispersal distances are larger than the sampling scale of this study, they have been observed in genomic studies (Schmidt et al., 2018). Likewise, our finding that buildings or the space between them acts as a dispersal barrier is consistent with observations of fine-scale dispersal barriers in *Ae. aegypti* (Hemme, Thomas, Chadee, & Severson, 2010; Schmidt et al., 2018). The 18-storey high-rise blocks of Mentari Court are populated with *Ae. aegypti* up to the top floor, and are likely to provide adequate within-building opportunities for mating, human blood-feeding, and oviposition, with dispersal within buildings likely to be safer and easier than crossing the more ‘hostile’ open spaces between buildings. Future work could examine dispersal patterns in building complexes of fewer storeys to test if these lead to higher rates of movement between buildings, which could result from smaller buildings having fewer oviposition sites. Future work could also consider vertical dispersal between floors, which was not considered in this study.

If movement patterns in Mentari Court are representative of high-rise building complexes generally, we expect these findings to have general implications regarding *Ae. aegypti* as a disease vector, as an invasive species, and as a target of biological control initiatives. In the event of an arbovirus outbreak, the initial spread of the virus by female *Ae. aegypti* will likely be restricted to within buildings or between nearby buildings. Spread by females to more distant buildings would likely take multiple generations, or multiple gonotrophic cycles of a single female, and thus proceed more slowly. Following a new detection of invasive *Ae. aegypti*, we expect limited active dispersal from the point of introduction, though passive dispersal by human transport may spread the incursion more quickly. Finally, considering releases of *Wolbachia*-infected *Ae. aegypti* at Mentari Court, we expect that successful invasion of one building with *Wolbachia* is unlikely to result in the invasion of other buildings, as movement between buildings may be insufficiently common for the invasion to surpass a critical threshold frequency of 0.25 – 0.35 (Schmidt, Barton, et al., 2017; Turelli & Barton, 2017). However, limited dispersal between buildings is also likely to ensure that a successfully invaded building remains so, as few infected *Ae. aegypti* will leave and few uninfected *Ae. aegypti* will arrive.

Using the regression between individual genetic distance and the natural logarithm of geographical distance (Rousset, 2000), we estimated *Ae. aegypti* *NS* at 268 individuals. Combining this with our parameter for parent-offspring dispersal (σ) of 129 m, we can estimate the ideal density *d* of *Ae. aegypti* breeding adults as 1281 km^{−2}, using Wright’s equation for *NS* under isolation by distance, *NS* = 4*πσ*^{2}*d* (Wright, 1946). In interpreting these values, it is essential to be clear about the distinction between *NS* and effective population size (*N*_{e}). *NS* is a parameter representing breeding structure in an isolation by distance framework, and relates directly to dispersal, while *N*_{e} is dependent on local species abundance and thus habitat availability (Nunney, 2016). These parameters are therefore expected to estimate different numbers of individuals. An extensive study of *Ae. aegypti* estimated average *N*_{e} of 400 – 600 across different sites and timepoints (Saarman et al., 2017), while census size estimates range from 900 – 5,500 individuals (Carvalho et al., 2015; Lounibos, 2003; Sheppard, Macdonald, Tonn, & Grab, 1969). Relative census sizes of populations can be estimated using variations in *N*_{e} and may improve our ability to understand *Ae. aegypti* population characteristics. Census size estimates are often five to ten-fold larger than those of *N*_{e} (Luikart, Ryman, Tallmon, Schwartz, & Allendorf, 2010; Palstra & Fraser, 2012); were a similar ratio to hold for *NS*, we would expect to find between 1,340 and 2,680 *Ae. aegypti* within the dispersal radius, a figure consistent with typical adult census population sizes for this species. All of this emphasizes the benefits of combining kinship-based dispersal estimates with more widely-used, distance-based investigations of genetic structure to understand local population and breeding processes in *Ae. aegypti*.
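The arithmetic behind the density and census figures can be retraced with a short calculation. The following is a minimal sketch rather than the analysis code used in the study: it takes *NS* = 268 and σ = 129 m exactly as stated in the text, rearranges Wright's equation to *d* = *NS* / (4πσ^{2}), and applies the five- to ten-fold census ratio to *NS*. The function and variable names are hypothetical and chosen only for illustration.

```python
import math


def breeding_density_per_km2(neighbourhood_size, sigma_m):
    """Rearrange Wright's NS = 4 * pi * sigma^2 * d to solve for d.

    sigma_m is the dispersal parameter in metres; it is converted to
    kilometres so that the density is returned in individuals per km^2.
    """
    sigma_km = sigma_m / 1000.0
    return neighbourhood_size / (4.0 * math.pi * sigma_km ** 2)


# Values taken from the text (illustrative variable names).
NS = 268          # neighbourhood size, individuals
SIGMA_M = 129.0   # dispersal parameter as stated in the text, metres

density = breeding_density_per_km2(NS, SIGMA_M)

# Census sizes are often five to ten times larger than N_e; applying the
# same ratio to NS gives the range quoted in the text.
census_low = 5 * NS
census_high = 10 * NS

print(f"ideal breeding density: {density:.1f} per km^2")
print(f"expected census range: {census_low} to {census_high}")
```

Because σ enters the equation squared, the density estimate is sensitive to the value assumed for the dispersal parameter: halving σ would quadruple *d* for the same *NS*.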

Apart from the introduction of novel oviposition sites through ovitraps, our methodology of inferring dispersal through genetic relatedness leaves dispersal processes essentially undisturbed, an advantage over MRR studies. Accordingly, genomic inferences of dispersal may more accurately reflect the true biological processes occurring within the study site. An additional advantage of kinship-based methodologies is their capacity to investigate population processes such as dispersal among populations in which migration rates have changed too recently to have produced corresponding changes in population genetic structure (Palsbøll et al., 2010), such as might be expected in invasive species and those with habitats impacted by humans. Ongoing decreases in sequencing costs mean that the approach is becoming feasible for a wider range of species, particularly as our methodology parameterises dispersal using estimates from three orders of kinship, and does not depend on the large sample sizes required for many kinship-based methodologies. When stringent requirements on sample size are relaxed, more widespread adoption of kinship-based methodologies seems likely (Palsbøll et al., 2010), particularly since current genetic markers allow for more accurate estimation of kinship categories than previously-used markers (Hauser, Baird, Hilborn, Seeb, & Seeb, 2011). In this study we sequenced 162 *Ae. aegypti* on only two sequencing lanes, which provided sufficient read depth and breadth to identify 98 pairs of close kin.

Our methodology is applicable to a range of species, including those of medical, economic, or conservation significance. However, differences in breeding and dispersal ecologies between species will alter the way that the method is applied. For instance, for species where females mate more than once, additional steps will be required to separate half-sibling dispersal from maternal dispersal. Similarly, species with sex-biased dispersal will require additional steps when decomposing axial standard deviations into components. Also, while our method is well-suited to studying rare species of conservation significance, in which direct population manipulation through MRR may not be possible, these studies will likely have low sample size and involve inbred individuals (Hedrick & Kalinowski, 2000). These conditions may make it more difficult to estimate kinship using genomics, though we note that parts of our methodology can equally be applied using kinship estimates derived through other means such as visual tracking of relatives.

The kinship-based inference processes described here have close affinities with the work of Bravington et al. (2016), and constitute an important extension to CKMR. Using genome-wide SNPs, we have been able to extend the CKMR methodology to cover three orders of kinship, estimating relatedness to the level of half-cousin. Likewise, this study has developed an approach for estimating intergenerational dispersal through the genomic marking of unsampled parents and grandparents. This will be particularly helpful for studies of species with traits that make them unsuitable for standard MRR experiments, such as small size or high population density. For example, monitoring of rare mammals is often carried out by the scats they leave, with recent work using DNA from scats to infer *N*_{e} from kinship (Skrbinšek et al., 2012). As methods for retrieving DNA from scats improve (Schultz, Cristescu, Littleford-Colquhoun, Jaccoud, & Frère, 2018), investigating kin networks and intergenerational dispersal with high density markers may become increasingly viable for achieving conservation objectives.

## Conclusions

We developed and applied a methodology to investigate dispersal and breeding dynamics in *Ae. aegypti* from the spatial distribution of 98 pairs of kin across three orders of kinship. We also observed genetic structure within and between buildings at Mentari Court, with most movement between nonadjacent buildings taking place over multiple generations. These findings are being considered in the design of future *Wolbachia* releases at Mentari Court and other high-density urban sites, and they will also inform protocols for the effective deployment of resources in response to disease outbreaks. Our methodological approach for investigating dispersal and breeding dynamics will be readily adaptable to future studies of dispersal in a range of species.

## Data Accessibility

Sorted .bam files and geographical locations for 162 *Aedes aegypti* have been archived at the Sequence Read Archive at NCBI Genbank, with accession number PRJNA542421.

## Author Contributions

Moshe Jasper: Performed research, wrote the paper, analysed data, contributed new analytical tools

Thomas L. Schmidt: Performed research, wrote the paper, provided supervision

Nazni W. Ahmad: Designed research, performed research

Steven P. Sinkins: Designed research, provided funding

Ary A. Hoffmann: Designed research, provided funding, provided supervision

## Acknowledgements

This work was funded by the National Health and Medical Research Council (Program Grant no. 1132412; Fellowship Grant no. 1118640), and the Wellcome Trust (Grant no. 108508). We thank three anonymous reviewers for their comments.