Strict adherence to Mendel’s First Law across a large sample of human sperm genomes

Mendel’s Law of Segregation states that the offspring of a diploid, heterozygous parent will inherit either allele with equal probability. While the vast majority of loci adhere to this rule, research in model and non-model organisms has uncovered numerous exceptions whereby “selfish” alleles are disproportionately transmitted to the next generation. Evidence of such “transmission distortion” (TD) in humans remains equivocal in part because scans of human pedigrees have been under-powered to detect small effects. Recently published single-cell sequencing data from individual human sperm (n = 41,189; 969-3,377 cells from each of 25 donors) offer an opportunity to revisit this question with unprecedented statistical power, but require new methods tailored to extremely low-coverage data (∼0.01 × per cell). To this end, we developed a method, named rhapsodi, that leverages sparse gamete genotype data to phase the diploid genomes of the donor individuals, impute missing gamete genotypes, and discover meiotic recombination breakpoints, benchmarking its performance across a wide range of study designs. After applying rhapsodi to the sperm sequencing data, we then scanned the gametes for evidence of TD. Our results exhibited close concordance with binomial expectations under balanced transmission, in contrast to tenuous signals of TD that were previously reported in pedigree-based studies. Together, our work excludes the existence of even weak TD in this sample, while offering a powerful quantitative framework for testing this and related hypotheses in other cohorts and study systems.


Introduction
In typical diploid meiosis, each gamete randomly inherits one of two alleles from a heterozygous parent-a widely supported observation that forms the basis of Mendel's Law of Segregation. However, many previous studies have also uncovered notable exceptions, collectively termed "transmission distortion" (TD), whereby "selfish" alleles cheat this law to increase their frequencies in the next generation. Indeed, examples of TD have been characterized in nearly all of the classic genetic model organisms, as well as numerous other systems [1][2][3][4][5][6][7][8][9][10][11][12][13]. Mechanisms include meiotic drive [14], gamete competition or killing [15], embryonic lethality [16], and mobile element insertion [17]. Such phenomena are frequently associated with sterility or subfertility [18,19], but may spread through the population despite negative impacts on these components of fitness [20].
Previous attempts to study TD in humans have revealed intriguing global signals but have failed to iden-tify individual loci that achieve genome-wide significance and can be discerned from sequencing or analysis artifacts [21]. For example, using data from large human pedigrees, Zöllner et al. [22] reported a slight excess of allele sharing among siblings (50.43%)-a signature that was di↵use across the genome, with no individual locus exhibiting a strong signature. Meyer et al. [23] applied the transmission disequilibrium test (TDT) [24] to genotype data from three large datasets of human pedigrees. While multiple loci exhibited signatures suggestive of TD, the authors could not confidently exclude genotyping errors, and the signatures did not replicate in data from additional pedigrees. Similarly, Patterson et al. [25] applied the TDT to large-scale genotype data from the Framingham Heart Study but attributed the vast majority of observed signals to single nucleotide polymorphism (SNP) genotyping errors.
Genotyping of large samples of gametes o↵ers an alternative approach for discovering TD, albeit only for TD mechanisms operating prior to the timepoint at which the gametes are collected (e.g., meiotic drive or gamete killing). Meyer et al. [23], for example, used this approach in the attempted replication of their pedigreebased test, failing to validate their most promising TD signal. Wang et al. [26] and Odenthal et al. [27] similarly used sequencing and targeted genotyping, respectively, to scan samples of sperm for evidence of TD. While neither study observed the long tracts of TD signals that are expected under a classic model of meiotic drive, they did uncover short tracts suggestive of biased gene conversion. This observation is potentially consistent with the reported rapid evolutionary turnover in the landscape of meiotic recombination hotspots [28]. Such hotspots are associated with high rates of crossovers and non-crossovers, which produce short tracts of gene conversion.
Importantly, most previous studies have been limited in their statistical power for detecting weak TD. While pedigree-based studies have been constrained by the small size of human families, gamete-based studies were historically constrained by costs and technical challenges of single-cell genotyping, limiting analysis to relatively few gametes or small portions of the genome. The recent development of large-scale whole-genome sequencing of single sperm (termed "Sperm-seq") [29,30] o↵ers an opportunity to study the phenomenon of TD with unprecedented statistical power. Using a highly multiplexed droplet-based approach, Sperm-seq facilitates genotyping of tens of thousands of sperm from each of several donor individuals, in turn revealing detailed patterns of meiotic recombination and aneuploidy. However, the low sequencing coverage per cell (⇠0.01⇥) necessitates the development of tailored statistical methods for recovering gamete genotypes.
To this end, we developed a method called rhapsodi (R haploid sperm/oocyte data imputation) that uses lowcoverage single-cell DNA sequencing data from large samples of gametes to reconstruct phased donor haplotypes, impute gamete genotypes, and map meiotic recombination events (Fig. 1). Here we introduce this method and quantify its performance over a broad range of sample sizes and sequencing depths, demonstrating its e↵ectiveness across diverse study designs. Key improvements to the haplotype phasing and crossover calling methods in the Sperm-seq paper [29] include gauging model performance over a wide range of possible data inputs, directly comparing our method to an existing tool, and o↵ering a thoroughly documented and accessible software package.
We applied rhapsodi to published single-cell sequencing data from 41,189 human sperm (969-3,377 cells from each of 25 donors) [29,30], using the resulting imputed genotype data to scan for TD across the genome. After stringent filtering for segmental duplications and other sources of genotyping error, our results exhibited close concordance with null expectations under strict Mendelian inheritance, both with regard to individual loci and aggregate genome-wide signal. Together, our study underscores the extraordinary fidelity of human meiosis for ensuring balanced transmission of alleles to the gamete pool.

Results
A method for single-gamete sequencing analysis We developed a method to phase donor haplotypes, impute gamete genotypes, and discover meiotic recombination events using low-coverage single-cell DNA sequencing data obtained from multiple gametes from a given donor (see Methods) ( Fig. 1a). Briefly, chromosomes are segmented into overlapping windows, and within each window, the sparse gamete genotype observations at heterozygous SNPs (hetSNPs) are clustered to distinguish the two haplotypes (i.e., phase the genotypes) of the diploid donor individual (Fig. 1b). The sequences of alleles that compose these haplotypes are decoded based on majority votes within each cluster, and haplotypes from overlapping windows are then stitched together based on sequence identity, thereby achieving chromosome-scale phasing. A Hidden Markov Model (HMM), with transition probabilities determined by rates of meiotic crossover and emission probabilities determined by rates of genotyping error, is then used to infer the most likely path along the phased haplotypes for each gamete (Fig. 1c), thereby imputing missing genotype data (Fig. 1d). Points where such paths are inferred to switch from one donor haplotype to the other suggest the locations of meiotic recombination events.
Evaluating performance on simulated data In order to benchmark rhapsodi's performance, we developed a generative model to construct input data with varying sample sizes, rates of meiotic recombination, sequencing coverages, and genotyping error rates (Fig. S1). We then applied rhapsodi to the simulated data, matching input parameters to those used in the simulations (average of 1 recombination event per chromosome and genotyping error rate of 0.005). Phasing was assessed based on accuracy, completeness, switch error rate, and largest haplotype segment (Fig. 2a,  Fig. S2a). Across the range of study designs, we observed that phasing performance improved with increasing amounts of data (gamete sample size and coverage). For specific scenarios involving low coverages or small numbers of gametes, this relationship was not always monotonic. This suggests that other parameters that we currently hold fixed (e.g., window size used in phas- Figure 1 rhapsodi: R haploid sperm/oocyte data imputation. Low coverage data from individual gametes (A) is clustered to phase the diploid donor haplotypes (B). A Hidden Markov Model, with tunable rates for genotyping error and frequency of meiotic crossovers per chromosome, is applied to trace the most likely path along the phased haplotypes for each gamete (C) thereby imputing the missing gamete genotypes (D) which can be used to discover meiotic recombination events as switches from one donor haplotype to another (e.g., purple to turquoise in G4 SNPs 7 & 8 or turquoise to purple in Gm SNPs 4 & 5). ing) may interact and influence performance-an area for future methodological fine-tuning.
With the exception of very small gamete counts, we observed that phasing performance reaches a plateau at ⇠0.1⇥ coverage. Yet performance is also high at even lower coverages when compensated by large sample sizes of gametes. Qualitatively similar trends were observed for the tasks of imputing gamete genotypes and discovering meiotic recombination breakpoints ( Fig. 2b and Fig. 2c; Fig. S2b and Fig. S2c).
For study designs roughly mirroring the published Sperm-seq data [29] (1,000 gametes, 30 Discovery of meiotic recombination events exhibited the weakest relative performance among the three tasks, albeit still strong in absolute terms. This is likely due to a combination of this task's dependence on the successful completion of the previous two tasks, the simplifying assumptions employed within the generative model, and the inherently challenging nature of this task in datalimited scenarios. In relation to the last point, we observed that the resolution of inferred meiotic recombination breakpoints was strongly associated with the sequencing coverage of the input data (Fig. S3a), as well as the underlying density of hetSNPs across the genome. Assuming a pairwise nucleotide diversity of 1 per 1,000 bp [31], and given that the theoretical limit of median resolution is two hetSNPs, we found that a coverage of 2.3⇥ (missing genotype rate of 10%) was required to approach this theoretical limit. Meanwhile, for coverage in line with the Sperm-seq data [29] (⇠0.01⇥), we inferred a median resolution of 167.5 kbp, in line with empirical observations from the original study [29]. Through closer investigation of the locations of false positive (FP) and false negative (FN) recombination events (Fig. S3b), we identified three error modes that originate from various limitations of the data. Specifically, we attributed the vast majority of FNs to crossovers occurring near the ends of chromosomes or pairs of crossovers occurring in close proximity to one another, especially at low coverages ( Fig. S3c). In the second of such cases (co-occurring FNs), the genotype data may be too sparse to capture one or more informative markers that flank the recombination breakpoint(s).
Notably, such nearby crossovers should be rare in data from humans and many other species due to the phenomenon of crossover interference [32], which causes crossovers to be spaced farther apart than expected by chance. By simulating crossover locations under a uniform distribution, our benchmarking strategy is thus conservative, as it would overweight the importance of this error mode. A third mode of error, which manifests as pairs of FPs and FNs, owes to slight displacement of the crossover breakpoint ( Fig. S3c), which may arise as a consequence of premature or delayed switching behaviors of the HMM.
We next assessed rhapsodi's robustness to parameter mis-specification by altering the recombination and genotyping error rate parameters relative to the simulations. Only one parameter was mis-specified at a time, and the other was matched to that used in the simulation. While our results suggest overall robustness, we found that overestimating the genotyping error rate or average recombination rate ( Fig. S4 and Fig. S5) had less e↵ect on performance than underestimating either of these parameters ( Fig. S6 and Fig. S7). In practice, such parameters may be informed based on outside knowledge for a given species (recombination rate ⇡ 1 ⇥ 10 8 per bp for humans) and sequencing platform (error rate ⇡ 0.005 per bp).
Benchmarking against existing methods We next compared rhapsodi to an existing software tool, called Hapi, which was recently developed for the same task of gamete imputation and was demonstrated to outperform all similar methods [33]. We compared the performance of rhapsodi and Hapi using simulated data from a range of 3 to 150 gametes and coverages ranging from 0.001⇥ to 2.303⇥ (Fig. 3). Datasets with more than 150 gametes were not benchmarked because Hapi's runtime with larger sample sizes became intractable, taking up to 39 hours per simulated dataset. Of 2,916 simulated datasets, Hapi was able to successfully phase, impute, and detect crossovers in 1,902 datasets (65%), while rhapsodi was able to do so in 2,754 datasets (94%) (Fig. S8). For datasets that Hapi could not analyze, rhapsodi maintained high accuracy and completeness ( Fig. S9) with low cost to performance in comparison to datasets that both tools analyzed successfully. Hapi typically achieved high phasing accuracy, but often at the cost of low completeness (Fig. 3a). In contrast, rhapsodi exhibited relative balance between accuracy and completeness. Across a large range of study designs, rhapsodi phased and imputed a greater proportion of SNP genotypes than Hapi with little cost to accuracy (Fig. 3a  -Fig. 3b). This increase in completeness was most pronounced at the low coverages (<0.1⇥) that characterize the Sperm-seq data (Fig. 3).
Application to data from human sperm Given this confidence in the performance of our method, we proceeded to the analysis of published [29,30] singlecell DNA sequencing data from 41,189 human sperm (969-3,377 cells from each of 25 donors; ⇠0.01⇥ coverage per cell) (Fig. S10). Before applying rhapsodi, we performed stringent filtering to mitigate sequencing, alignment, and genotyping errors that could confound our downstream tests of TD. Specifically, we removed low-quality and aneuploid cells for the chromosome of interest; excluded regions of low mappability or extreme repeat content; restricted analysis to known SNPs from the 1000 Genomes Project [34]; and excluded regions harboring potential segmental duplications in each donor individual's genome (see Methods and Supplemental Text). While this stringent filtering removed a total of 15,138,461 (⇠30%) SNPs from the input data, we emphasize that true signatures of TD should be minimally a↵ected by such filtering, as such signatures are expected to extend across large genomic intervals due to the extreme nature of linkage disequilibrium (LD) among the sample of gametes (Fig. S11).
Applying rhapsodi to these filtered data, we phased donor haplotypes at an average of 99.9 (±0.16)% of SNP positions per donor and chromosome (i.e., leaving ⇠0.1% as unknown phase); imputed an average 99.3 (±1.23)% of genotypes per gamete and chromosome; and identified an average of 1.17 (median of 1) meiotic recombination events per gamete and chromosome, with a mode of 24 autosomal crossovers per gamete genome, broadly consistent with the reported length of the autosomal genetic map for males [35]. The average resolution of breakpoints was 664 kbp (±1.25 Mbp), with a median of 357 kbp. (Values are reported as the mean, plus or minus one standard deviation.) This median is in line with empirical observations from the original study [29]. As previously reported [29], the inferred crossover locations were concentrated in similar genomic regions across all 25 donors, with the highest densities occurring near the ends of chromosomes (Fig. S12). The overall distributions of crossover locations were qualitatively similar to the patterns observed in the prior analysis of a subset of these donors [29].

Strict adherence to Mendelian expectations across sperm genomes
The scale of the Sperm-seq dataset o↵ers strong statistical power to detect even slight deviations from Mendelian expectations, as supported by our simulations of TD across a range of gamete sample sizes and transmission rates (Fig. S13). For each of the 25 donors, we scanned across the imputed sperm genotypes, performing a binomial test for each SNP. Naive multiple testing approaches based on the nominal number of tests would be over-conservative in this context, given the extreme levels of LD across the sample of closely related sperm. We therefore applied a principal component analysis-based approach to infer the e↵ective number of independent statistical tests [36,37] and used this value as the basis of a Bonferroni correction (p-value threshold = 1.78 ⇥ 10 7 ; see Methods). After applying this correction, no individual SNP exhibited evidence of TD at the level of genome-wide significance (Fig. 4).
No global signal of biased transmission in human sperm In addition to the absence of locus-specific signatures achieving genome-wide significance, we observed that the combined distribution of p-values closely mirrors that which is expected under the null hypothesis of no TD (Fig. 5). This absence of global signal is intriguing given the strong statistical power underlying the analysis.
To more formally quantify global evidence of TD, we calculated the overall degree of allele sharing at hetSNPs among sperm from the same donor. After pruning the data to SNPs in approximate linkage equilibrium (see Methods) we found that across all pairwise comparisons of sperm from each of 25 donors, the average proportion of shared alleles was 0.499996, consistent with the distribution of this proportion under null simulations with no TD (n = 500 simulations; p-value = 0.47; Fig. 5).

Discussion
Past studies of TD in humans have been limited in statistical power, either due to the small size of human families or costs and technical limitations of gamete genotyping. Such limitations are compounded by the high burden of multiple testing in genome-wide scans.
Here we developed a genotype phasing and imputation method, termed rhapsodi, that allowed us to leverage low-coverage sequencing data from a large number of single gametes to impute gamete genotypes and test for TD with unprecedented statistical power.
In contrast to the tentative locus-specific and genomewide signals reported in previous studies, our results are consistent with strict adherence to Mendel's Law of Segregation across this sample of human sperm genomes. The signals that we observed are well explained based on the expected variance of the binomial process of Mendelian segregation under the null hypothesis of balanced transmission genome-wide. This negative result hinged on stringent filtering of the input data, as phenomena such as hidden structural variation and ensuing mapping and genotyping errors may otherwise generate false signatures of TD. It is important to emphasize that even stringent filtering is not expected to eliminate true signals, which should span large genomic intervals due to the extreme LD within the samples of sperm from a given donor.
Given the strong statistical power of our study, the absence of TD signal excludes the possibility of even weak TD. This negative result provides a striking contrast to the numerous validated examples uncovered in other species. However, our observations do not preclude the occurrence of TD in humans, for reasons that we enumerate below. First, our study was limited to sperm from 25 donor individuals. If true TD involves rare alleles or particular patient populations that were not sampled in the datasets we analyzed, our study design would miss such cases. A second and related consideration is that if not balanced by countervailing forces, alleles that are subject to TD may rapidly fix in the population, such that detecting such alleles during their ephemeral existence as polymorphisms may be exceedingly unlikely. Third, our analysis is solely sensitive to mechanisms of TD that would operate prior to the sampling of the sperm cells (e.g., sperm killing), and is therefore blind to mechanisms that may operate in female meiosis (e.g., meiotic drive) or at di↵erent stages of development (e.g., fertilization, maternal-fetal incompatibility, postzygotic selection, etc.). Notably, female meiosis may be particularly susceptible to meiotic drive given the asymmetric nature of the cell divisions that produce the oocyte and polar bodies [39]. Finally, our analysis does not exclude the existence of TD driven by biased gene conversion, which would generate short tracts be-low the resolution of our study. Indeed, the occurrence of this phenomenon is well documented based on both comparative evolutionary data and targeted analysis of human recombination hotspots, and is beyond the scope of our study [28].

Conclusion
In summary, we introduced and benchmarked a method to phase donor haplotypes, impute gamete genotypes, and infer meiotic recombination events using lowcoverage single-cell sequencing data from a large sample of gametes. We applied this method to scan a large sample of human sperm genomes for evidence of "selfish" alleles that violate Mendel's Law of Segregation. We observed no evidence of even weak e↵ects across this sample, either with respect to individual loci or aggregate genome-wide signal, in turn supporting a strict model of balanced transmission. Measuring the fidelity of Mendelian segregation and uncovering the mechanisms that safeguard or subvert this process remain important goals for human genetics. Given the documented fertility impacts of TD in other organisms, a thorough understanding of the loci driving such inheritance patterns could help reveal novel genetic causes of human infertility. More broadly, our method o↵ers a flexible and reproducible toolkit for testing this and related hypotheses in other cohorts and study systems, especially in light of the declining costs and expanding application of single-cell sequencing methods.

Data input and filtering
The main algorithmic steps of rhapsodi consist of (1) data preprocessing and filtering, (2) reconstruction of phased donor haplotypes (Fig. 1b), (3) imputation of gamete genotypes (Fig. 1c), and (4) discovery of meiotic recombination breakpoints. A table of sparse gamete genotypes (SNPs ⇥ gametes) with corresponding genome positions is required by rhapsodi as input (Fig. 1a). Genotypes may be encoded in one of two forms: 0/1/NA (denoting reference, alternative, or miss-ing genotypes, respectively) or A/C/G/T/NA (denoting nucleotides or missing genotypes, respectively). For A/C/G/T/NA encoding, reference and alternative alleles are also required as input. Data is then filtered to consider only SNP positions that are heterozygous for the given individual, with at least one reference and one alternative allele observed across gametes. Phased donor haplotypes, imputed genotype (and corresponding source haplotype) for each SNP in each gamete, and a list of locations of recombination breakpoints detected for each gamete is returned by rhapsodi as output (Fig. 1d). We detail the methods that produce each output below.
Reconstruction of phased donor haplotypes Our phasing approach leverages patterns of co-inheritance across gametes to reconstruct donor haplotypes from the sparse genotype matrix derived from low coverage sequencing (Fig. 1a). Chromosomes are divided into overlapping windows, and within each window, we compute pairwise binary (Jaccard) distances, cluster the gamete genotypes using Ward's method [40] (implemented with the "ward.D2" function in R), and cut the resulting tree into two groups to distinguish the haplotypes (Fig. 1b). For each cluster within a window, we use majority voting to decode the phased haplotype sequences. This approach depends on the reasonable assumption that recombination is su ciently rare, a↵ecting a small minority of gametes within each window. If no genotype achieves a majority at a given position, we designate the phase at that position as ambiguous (denoted as "NA"). Finally, haplotypes of adjacent (overlapping) windows are stitched together by considering the genotype consensus (i.e., rate of matching) for the overlapping SNPs. In the stringent stitching mode, if consensus is greater than a given threshold (default = 0.9), the haplotypes under consideration are inferred to be the same (i.e., to derive from the same donor homolog). If the consensus is less than 1 minus the threshold (default = 1 0.9 = 0.1), the haplotypes under consideration are inferred to be distinct (i.e., to derive from di↵erent donor homologs). Meanwhile, if the consensus lies between these upper and lower boundaries, the windows are considered too discordant to ascertain the relation of the haplotypes, and the method returns an error. However, rhapsodi can optionally be run in a lenient stitching mode which continues with phasing regardless of the level of consensus. In this case, the haplotypes with the highest consensus are assumed to derive from the same homolog. Imputation of gamete genotypes Our imputation approach uses a Hidden Markov Model (HMM) and a subsequent filling step to infer the genotypes of each gamete, e↵ectively tracing the most likely path through the donor haplotypes given the sparse observations for a given gamete. Our model assumes that (1) the hidden haplotype state at a given SNP depends only on the haplotype state at the SNP immediately prior; (2) adjacent genotypes originate from the same donor haplotype unless a meiotic recombination event occurred between the two; (3) the observed SNP alleles are either correct or in rare cases arise from genotyping errors (e.g., PCR errors, contaminant DNA co-encapsulated with the sperm genome during library preparation, sequencing errors, mapping errors, etc.). The HMM uses the phased donor haplotypes and accounts for probabilities of genotyping errors and meiotic crossovers: the probability of genotyping errors is reflected in the emission probabilities, while the probability of meiotic crossovers is reflected in the transition probabilities (Fig. 1c). We then apply the Viterbi algorithm [41] to determine the most likely sequence of hidden states, iterating over all gametes independently (Fig. 1d). To avoid potential over-smoothing behavior of the HMM, we optionally overwrite the imputed alleles with any observed alleles for rare sites where these genotypes conflict. The HMM only considers observed genotypes and makes no predictions for missing genotypes. To impute missing genotypes, we assume that tracts of unobserved sites that are bordered by SNPs originating from a single donor haplotype also originate from that same haplotype. We thus fill in the genotypes for these unobserved sites. Similarly, we assume that recombination does not occur between the last observed SNPs and the ends of the chromosomes, again filling genotypes at unobserved sites in these terminal regions. Tracts of unobserved sites that are bordered by SNPs assigned to di↵erent donor haplotypes remain unassigned (denoted as "NA").

Discovery of meiotic recombination breakpoints
Using the imputed gamete genotypes and the underlying phased donor haplotypes, we identified meiotic recombination breakpoints as sites where, for a given gamete chromosome, the inferred path switches from one donor haplotype to the other. This step should typically be run on the smoothed data (i.e., without superimposing the original observations), as even a low rate of genotyping error could otherwise manifest as false meiotic recombination. Recombination breakpoints are reported as intervals, defined by the last observed site inferred to derive from one haplotype and the first observed site inferred to derive from the other. These start and end points thus demarcate regions in which the meiotic recombination events likely occurred.
Assessing performance with simulation We developed a generative model to simulate input data and assess rhapsodi's performance (Fig. S1) while varying (a) the number of gametes, (b) the underlying number of hetSNPs, (c) depth of coverage, (d) recombination rate, and (e) genotyping error rate. Given these arguments, we construct a sparse matrix for input to the rhapsodi pipeline. The algorithmic steps of this generative model include (1) building phased diploid donor haplotypes (Fig. S1a), (2) randomly tracing through these haplotypes in the face of recombination to construct gamete genotypes (Fig. S1b), (3) masking genotype observations to mimic low sequencing coverage (Fig. S1c), (4) superimposing genotype errors at a given rate (Fig. S1d), and (5) filtering to only retain hetSNPs. We detail each step in turn below.
In the first step of the generative model, we construct phased haplotypes of the diploid donor by randomly generating a vector of 0's and 1's (with equal probabilities) at a length equal to the underlying number of hetSNPs. This binary vector is then inverted to generate the complementary haplotype (Fig. S1a).
We then use these simulated homologs to generate genotypes for each gamete (Fig. S1b). Specifically, we sample a count of recombination breakpoints from a Poisson distribution, where lambda reflects the average recombination rate. Each recombination breakpoint is randomly placed on the respective chromosome, and genotypes occurring after that breakpoint are inverted. In cases where a gamete chromosome does not recombine, the chromosome is identical to one of the parental chromosomes. By placing crossovers randomly, our method ignores heterogeneity in the landscape of meiotic recombination, as well as the phenomenon of crossover interference [32], which suppresses nearby crossovers.
After obtaining the gamete genotype matrix, we simulated the sparse coverage of single-cell DNA-seq by masking observed genotypes. The missing genotype rate (MGR) is obtained from the probability density function of a Poisson distribution with x = 0 and lambda equal to the coverage metric. We then multiplied the MGR by the total number of simulated SNP observations (number of underlying hetSNPs ⇥ number of gametes) to compute the total number of genotypes that should be masked (denoted as "NA") (Fig. S1c). Following a similar approach, we computed the number of genotyping errors by multiplying the error rate by the number of non-missing genotypes and randomly placed these errors at individual SNPs by inverting the respective alleles (0 to 1 or vice versa) (Fig. S1d).
Finally, SNPs or rows with at least one 0 genotype and one 1 genotype are retained, and the rest of the rows are filtered out, retaining only hetSNPs where both alleles were observed. In this way, the final number of hetSNPs in the generated output may be less than or equal to the specified input number of underlying hetSNPs.
We evaluated the quality of donor haplotype phasing by considering switch error rate, largest haplotype segment, completeness, and phasing accuracy (see Supplemental Text for definition of metrics). To assess the quality of imputation of gamete genotypes, we similarly quantified accuracy, completeness, and switch error rate. Finally, we assessed meiotic recombination discovery by computing the True Positive Rate (TPR), the False Discovery Rate (FDR), and the breakpoint resolution. Given the imbalance in class membership for the number of true recombination breakpoints compared to the number of false recombination breakpoints, we also considered precision and F1 Score.
Comparison to existing methods We compared our method to the software package Hapi (haplotyping with imperfect genotype data) [33], which was similarly developed for phasing and imputation based on single-cell DNA sequencing of gametes. Hapi was demonstrated to outperform the only other haploid-based algorithm, PHMM (pairwise Hidden Markov Model) [42], as well as two commonly applied diploid-based phasing methods, WhatsHap [43] and HapCUT2 [44] in terms of accuracy, reliability, completeness, and cost-e↵ectiveness. Because Hapi outperforms PHMM and the PHMM software was not readily available, we did not directly compare PHMM with rhapsodi.
In contrast to our method, Hapi was designed for datasets that contain relatively small numbers of gametes (⇠3-7) sequenced at individually higher coverages (> 1⇥). Given this distinction, the default behavior of Hapi's autorun function was not appropriate for our simulated datasets. In particular, Hapi's imputation function (hapiImpute) by default filtered out any SNPs that had a missing genotype observation for at least one gamete, which results in all data being removed from our low-coverage datasets. We thus disabled this filtering behavior prior to benchmarking against our method.
As part of Hapi's autorun function, the HMM used to filter errors uses a fixed set of transmission and emission probabilities. To more fairly compare the tools to each other, we instead provided Hapi with an HMM with parameters matching those of our simulation, thus closely mirroring the approach we used to evaluate rhapsodi in this analysis. After rhapsodi and Hapi were run on a dataset, the performances of both methods were evaluated using rhapsodi's assessment functions.
Genome-wide scan for TD To scan the genome for TD, single-cell genotype calls were generated from sperm sequencing data as described by Bell et al. [29]. Briefly, genomic variants were called for each donor using all sequence data from all sperm cells. After selection of heterozygous sites, the software GenotypeSperm (Drop-seq Tools v2.2, https://github.com/broadinstitute/ Drop-seq/releases) was used to determine which sites were captured by each sperm cell barcode and which allele(s) were present at each of these observed sites. Aneuploidy status and cell doublets were identified as previously described [29,30]. Only cell barcodes associated with single sperm cells (non-doublets) with even sequence coverage were included in the analysis, while aneuploid autosomes were excluded from the analysis.
Before applying rhapsodi for phasing and imputation, we applied several additional filtering criteria with the goal of restricting analysis to genuine SNPs and mitigating artifacts induced by sequencing, mapping, and other sources of genotyping error. Specifically, we excluded all technically challenging regions of the genome previously identified by the Genome in a Bottle Consortium (GRCh38 "union") [45] and/or the ENCODE Consortium [46]. We additionally required at least one observation of both the reference and alternative allele across sperm from a given donor. We also restricted to SNPs that were previously observed to be polymorphic in external data from the 1000 Genomes Project [34]. Finally, we removed any sites with evidence of segmental duplication (see Supplemental Text and Fig. S14).
We used a two-tailed binomial test (comparing counts of each allele under a null hypothesis of balanced transmission) to scan for TD at each hetSNP across sperm genomes from each of 25 donors. As input to this analysis, we used the imputed gamete data output by rhapsodi, applying the option to superimpose any observed genotypes (at rare sites of discordance with inferred HMM state).
Significance threshold for TD scan In our previous step, we applied a separate binomial test for TD at each hetSNP in each donor individual. Given the relative infrequency of recombination (per base pair per meiosis), sperm genomes are composed of large tracts of the donor's two parental haplotypes. By consequence, genotype data from the sample of sperm from a given donor are a↵ected by extreme LD, which typically extends across entire chromosomes. As such, our binomial tests are highly correlated across the genome, and naive multiple testing correction based on the number of SNPs would be overly conservative.
To account for this e↵ect, we applied the method sim-pleM [36,37], which uses a principal components analysis (PCA) approach to compute the e↵ective number of independent statistical tests. For the Sperm-seq dataset, simpleM estimated 281,368 e↵ective tests (compared to 34,799,282 nominal tests). We then used the e↵ective number of independent tests to apply a Bonferroni correction of 0.05 / 281,368 = 1.78 ⇥ 10 7 .
Calculation of global signal of TD We quantified global evidence of TD across the sperm genomes by calculating the rate of allele sharing all pairs of sperm from each donor. First, we applied PLINK to the fully imputed sperm genotype matrix output by rhapsodi to generate a list of the SNPs deemed to segregate in near linkage equilibrium (plink -file filename -no-fid -no-parents -no-sex -no-pheno -indep 50 5 2 -out outfilename) [47]. We then computed a Hamming distance matrix, e↵ectively counting mismatches between each pair of sperm genomes from each donor. After repeating the above steps for all donors, we reported the mean across all pairs of sperm cells.
Null simulations were conducted by applying a modified version of the generative model (without missing genotypes or errors), using gamete numbers, pruned SNP counts, and crossover counts identical to those inferred from the empirical data. We repeated the above procedure for computing pairwise distances on each simulated dataset, recording the mean proportion of shared alleles across all sperm from each of 25 donors. Repeating this entire procedure 500 times allowed the construc-tion of a null distribution to which we compared our allele sharing calculation from the Sperm-seq data. We then computed a one-tail p-value, defined as the proportion of null simulations where the average pairwise allele sharing was greater than or equal to the observed value.