Abstract
The lion (Panthera leo) is one of the most popular and iconic feline species on the planet, yet in spite of its popularity, the last century has seen massive declines for lion populations worldwide. Genomic resources for endangered species represent an important way forward for the field of conservation, enabling high-resolution studies of demography, disease, and population dynamics. Here, we present a chromosome-level assembly for the captive African lion from the Exotic Feline Rescue Center as a resource for current and subsequent genetic work of the sole social species of the Panthera clade. Our assembly is composed of 10x Genomics Chromium data, Dovetail Hi-C, and Oxford Nanopore long-read data. Synteny is highly conserved between the lion, other Panthera genomes, and the domestic cat. We find variability in the length and levels of homozygosity across the genomes of the lion sequenced here and other previous published resequence data, indicating contrasting histories of recent and ancient small population sizes and/or inbreeding. Demographic analyses reveal similar histories across all individuals except the Asiatic lion, which shows a more rapid decline in population size. This high-quality genome will greatly aid in the continuing research and conservation efforts for the lion.
Introduction
The lion (Panthera leo) is historically one of the most widespread carnivores on the planet, previously ranging from the tip of southern Africa, to the southern edge of North America [1,2]. However, over just the past 25 years, the African lion (P. leo leo) has lost more than half of its population, while the other recognized subspecies, the Asiatic lion (P. leo persica), has been reduced to fewer than 1,000 individuals, occupying little of their former range as a single population in the Gir forest, India. The remaining Asiatic lions are suspected to be suffering from reproductive declines due to inbreeding depression [3] and have been subject to several outbreaks of canine distemper virus (CDV;[4]).
Genetic markers have played a key role in studying the biogeography, history, and movement of lions for the past 50 years (e.g. [2,5,6], [7–10]). However, studies have been mostly limited to microsatellites with limited use of nuclear and mitochondrial genetic data (e.g. [11–17]). More recently, reduced representation sequencing has enabled genomic genotyping using the domestic cat or tiger as a reference [18]. Felid karyotypes are thought to be highly conserved [19,20], but studies have shown a reference mapping bias for estimation of statistics such as heterozygosity [21] and accurate allele calling [22], both of which are important for assessing population history.
The causes of the decline in lions are multifactorial. Lions have been hunted by humans for thousands of years, possibly first as a direct competitor and threat to our survival [23], for initiation rituals and rites of passage (e.g. [24–26]), to reduce predation of domesticated animals, and more recently for sport [27–30]. The recent rise of illegal trade in lion parts and illicit breeding practices has escalated over the past ten years, bringing hunting practices and international laws into the spotlight. In addition, several documentaries have exposed the lion breeding industry within South Africa, which uses fenced lions for ‘petting’, canned hunting experiences, and ultimately as skeletons for export, likely destined for Asian medicines [31]. Accurate and rapid genotyping could aid law enforcement to reveal whether the origins of trafficked goods are from wild or captive populations.
Moreover, rapid population decline has put lions at the forefront of the conservation debate over translocations and how best to manage populations. Many efforts to restore previous populations have focused on translocation lions within and between various South African lion populations (e.g. [32,33]. Information about local population adaptation, deleterious alleles, and potential inbreeding is lacking, which further complicates managed relocations. While increasing genetic diversity remains a widely accepted conservation goal, recent theoretical implications suggest consideration should be made when moving individuals from large heterozygous populations into small homozygous populations [34]. Genomic resources will aid immensely in these estimations and have already shown to be highly preferable to microsatellites or a reduced number of loci (see e.g. [35–37]).
To date, no de novo genome assembly for an African lion exists and only two individuals have been resequenced (Cho et al. (2013). A de novo assembly of an Asiatic lion was recently completed [38], but was limited to short-read technology, and was thus highly fragmented. Asiatic and African lions are currently regarded as separate subspecies [1,6,39], and we regard them as such for these analyses. Here, we present a high-quality, de novo genome assembly for the lion (Panthera leo), referred to as Panleo1.0. We use a combination of 10x Genomics linked-read technology, Dovetail Hi-C, and Oxford Nanopore long read sequencing to build a highly contiguous assembly. We verify the conserved synteny of the lion in comparison with the domestic cat assembly and also examine the demography and heterozygosity of the lion compared with other felids. It is our hope that this genome will enable a new generation of high quality genomic studies of the lion, in addition to comparative studies across Felidae.
Results
Genome assembly and continuity
The assembly generated with 10x Genomics Chromium technology yielded a consistent, high-quality starting assembly for the lion. In general, assembly statistics are improved when compared to previous assemblies generated using short-insert and mate-pair Illumina libraries, such as the tiger [56], cheetah [61], Amur leopard [62], lynx [63], and puma [64]. All these assemblies have improved their scaffold statistics through a variety of technologies, such as Pacbio, Bionano, Nanopore, or Hi-C (Supplementary Table S3; see publications above and DNAZoo). The lower contig scores are also emulated through a higher number of missing BUSCOs (Table 2, Supplementary Table S4). Although we were unable to compare it to the de novo assembly of the Asiatic lion from Mitra et al. (2019), they report a scaffold N50 of approximately 20kb, suggesting our assembly represents significant improvement.
Using long sequencing reads to close gaps in draft genome assemblies
While the draft assemblies using either 10x alone or 10x + Dovetail Hi-C were of high-quality, they contained a number of gaps containing unknown sequence (Table 1). We therefore wondered if some of these gaps could be closed using long sequencing reads. Using a single Oxford Nanopore MinION flowcell we generated a total of 1,520,012 reads with an average read length of 6,528 bp, resulting in approximately 4x coverage of the P. leo genome. We then identified single reads which spanned gaps of any size and, for each gap, used MUSCLE and Cons to generate a consensus sequence of sequence spanning that gap (see methods). Using this approach we closed 26,403 gaps of 10, 100, or 400 bp with an average coverage of 3x per gap. We then identified split reads which spanned any gap 3kb or larger and again, for any instance in which multiple reads spanned a gap, pooled those reads and used MUSCLE and cons to generate a consensus sequence of the sequence spanning the gap. If only one read spanned the gap the raw sequence from that read was used to fill the gap. This approach resulted in the closing of 574 gaps of 3,000, 5,000, or 10,000 bp with an average coverage of 1x per gap. Overall, this approach closed 26,977 out of 42,635 gaps on 416 of the 8,061 scaffolds in the 10x + Dovetail assembly and reduced the overall size of the genome assembly by 1.6 million bp while increasing the mean contig size from 66 kb to 106 kb. Overall, this approach resulted in a substantial improvement on average contig size and associated statistics in the lion genome, but did not improve BUSCO scores for the genome. A detailed description of the gaps filled in using Nanopore can be found in Supplementary Table S2.
Phylogenetics
To verify the phylogenetic relationships of the taxa using the de novo genomes, we constructed a phylogenetic tree using a maximum-likelihood approach. Consistent with recent phylogenetic analyses of the, we found that the lion, the leopard, and the tiger form a cluster representing Panthera, with the leopard and lion constituting sister species within the group [65,66]. The cheetah and puma comprise another cluster, with the lynx sitting outside this grouping [66]. The domestic cat is the most distantly related to all of the species tested here. Since we used protein files to infer the phylogenetic relationships, we found very high posterior probabilities across all the nodes (Figure 1).
Repetitive Element and Gene Annotations
We generated statistics for repetitive elements in each genome using a pipeline which combines homology-based evidence and de novo repeat finding. On average, the continuity of the assembly did not greatly affect our ability to identify repeats (Table S5). Assemblies from each felid genome analyzed contained between 40.0%-42.5% repeats (Table S6). Alternatively, gene annotation results showed that more continuous assembles generate less annotated genes on average (Table S7, Table S8). Possibly, this indicates that more fragmented assemblies cause misidentifications of gene regions by automated annotation software.
Synteny
We constructed genome synteny visualizations for chromosome-level assemblies of the domestic cat (F. silvestris: GCA_000181335), the lion (P. leo), and the tiger (P. tigris; [56], [57,58]). Each assembly was aligned to the domestic cat and the lion, in order to observe similarities and differences between the genomes. Consistent with expectation, we found very few large rearrangements in the karyotype across species (Figure 2, Supplementary Figures S1, S2).
Heterozygosity
We mapped raw Illumina reads to each respective species genome, as well as to the domestic cat assembly. We found that, on average, mapping to the domestic cat assembly resulted in lower heterozygosity calls and an average of 10% fewer reads successfully mapped (Table S10). However, this pattern was inconsistent and reversed for the Asiatic lion individual (Figure 3, Table S10). These results are supported by [21], who found that the reference used had some effect on heterozygosity inference, but little effect on the inference of population structure. In addition, we find that there is substantial variation in genome-wide heterozygosity estimates across the four lions that were tested (0.0012, 0.0008, 0.0007, and 0.00019, respectively). The two captive lions sequenced in Cho et al. (2013) may have been substantially inbred as part of their captive population, but no further details on the individuals are available.
Because the assembly quality varied, we also performed two brief tests on Panthera datasets which have multiple reference genome assemblies (Table S9). We find that in general, fragmented assemblies do not seem to strongly influence heterozygosity calls, regardless of whether or not the individual in question represents the reference sequence.
Runs of homozygosity
Using the mapped files created during the previous step, we investigated how runs of homozygosity were distributed across the four lion genomes. We found that there were a high proportion of short runs contained within the Asiatic lion genome (Figure S3), and to a lesser extent, the two previously published captive lion genome sequences from Cho et al. (2013). In general, heterozygosity was much lower genome-wide in the Asiatic individual (Figure 4), indicating that, along with showing signs of recent inbreeding, the population has likely been small for a long time (see [67]). Panleo1.0 (“Brooke”) had the highest overall heterozygosity, which may be a result of admixture that has occurred during unmonitored captive breeding which occurs outside of regulated zoos.
Demographic history
PSMC analyses revealed similar demographic history of Panleo1.0 and the two genomes from Cho et al. (2013). These genomes show an initial, small decline approximately 1 million years ago (MYA), and a secondary more severe decline beginning approximately 500,000 years ago (Figure 5; Supplementary Figures S4, S5, S6). These trends are consistent with the fossil record which has revealed declines of large mammal populations during this time period, possibly due to Archaic human influence and/or climate changes (e.g. [68,69]). However, the Asiatic lion genome shows a more rapid decline over the past 100,000 years, but shows no consistent population size approximately 1MYA. It is possible that the low heterozygosity of the Asiatic lion was low enough to impede the inference of accurate historical NE due to a distortion of the coalescent patterns across the genome. Corroborating these issues, other studies have shown variation between results in PSMC analyses within individuals of the same species and suggest that alternative coalescent methods should be used to confirm historical demographic trends [70]. Further, the genome sequenced here shows inconsistent signals in more recent history as the inference approaches the 100,000 year mark, which could be due to inflations of heterozygosity across the genome. Since the ancestral history of the individual sequenced here is unknown, it is possible that she is the result of crossings between multiple, distinct populations of lion which have variable histories.
Discussion
Long and linked-read genomic technologies such as 10x Genomics, Nanopore, and Hi-C allow rapid and economical de novo construction of high quality and highly contiguous genomes (e.g. [71]). Projects such as Genome 10k [72,73], i5k [74], DNAzoo (DNAzoo.org; [57,58]), and bird10k [75] aim to vastly improve our general understanding of the evolution of genomes, and both the origin and fate of diversity of life on earth. Such high quality assemblies will not only contribute to our understanding of the evolution of genomes, but also have practical applications in population genetics and conservation biology.
The chromosome-level de novo assembly of the lion genome presented here was constructed in three steps - 10x Genomics was used to create the base assembly and Dovetail Hi-C and Oxford Nanopore were used to improve contiguity. We show that each step results in substantial improvement to the genome, indicating that these methods are not redundant. At the same time, our data indicate that 10x and Hi-C alone are enough to approximate chromosomes in a typical mammal genome. Nanopore data, even with a small amount of very long reads, was enough to fill in many of the small gaps and ambiguous sequences across the genome.
The quality of this assembly allowed us to investigate the co-linearity of the genome compared to other felids and the importance of the reference sequence for estimating heterozygosity. As has been reported before [19,20], we find that the genomes of felids are largely co-linear and indicate that few large-scale chromosomal rearrangements have occurred across species. However, reference sequence bias can have substantial and unpredictable effect on estimating heterozygosity, possibly due to mismapping.
The variation of heterozygosity inference across the four lions tested here is further evidence that single genomes are not representative of the heterozygosity of a species or even the populations (captive or wild) from where they are derived. This assembly has also allowed us to compare fine-scale patterns of heterozygosity across the genome, where we find a substantial amount of variation between individuals. This contiguous genome will allow us to perform analyses on recent inbreeding and ROH in wild individuals across their range, how heterozygosity patterns differ between populations with different evolutionary histories, and how management decisions such as translocations and barriers to dispersal affect wild populations. Further, captive management of populations also stand to gain from genetic monitoring tools and, as we have shown here, individuals from zoos may harbor early signs of diversity loss and the accumulation of long runs of homozygosity. Even outside the nuanced case of the Asiatic lion, where dramatic population declines occurred prior to managers stepping in to monitor individuals, captive-bred populations often come from few founders with the addition of new individuals as available. If captive populations are truly meant to be a resource for conservation at large, more work must be done to understand the genetic implications of such scenarios.
Demographic analyses are also greatly aided by continuous sequence and rely on the inference of coalescence across the genome. As we detected a different historic demography for the Asiatic lion, it would be pertinent to examine how recent and rapid inbreeding affects the ability of these software to detect NE over time. Further, examination of the patterns of diversity loss across wild individuals, especially populations which have been suggested to show signs of inbreeding (see the Ngorongoro crater lion population; [3,10,76]), will aid managers in decision making to ensure a future for existing lion populations.
This study will allow a surge forward in conservation efforts for the lion and enable studies across many facets of evolutionary biology, such as the highly conserved, yet phenotypic diversity of the genus Panthera. Undeniably, lion research has a historic legacy of collaboration across fields [77] and this genome will aid in future endeavors to prevent further loss of one of the world’s most iconic species. Most importantly, it will enable low-cost resequencing efforts to be completed, in addition to a wide range of other genetic studies, in order to further the conservation efforts of the lion.
Methods
Library Preparation and Sequencing
Whole blood samples were collected on two occasions during routine dental and medical procedures on an adult female lion (“Brooke”) from the Exotic Feline Rescue Center in 2017. Blood was collected in EDTA tubes, briefly held at −20C before being shipped overnight to Stanford and subsequently frozen at −80C. Approximately 1mL of whole blood was used for 10x Genomics Chromium library preparation and sequencing at HudsonAlpha in Huntsville, Alabama. This library was sequenced on an Illumina HiSeq X Ten. An additional 1mL was then sent to Dovetail Genomics in Santa Cruz, California for HiC library preparation and subsequent sequencing on the Illumina HiSeq X platform.
DNA for Nanopore sequencing was extracted from three 500uL aliquots of whole blood using the Quiagen DNeasy kit following the manufacturer’s instructions. DNA was eluted into 50uL and then concentrated to approximately 25ng/μL using a Zymo DNA Clean and Concentrator Kit. The final elution volume after concentrating was approximately 50μL. Libraries for Nanopore sequencing were prepared using a 1D genomic ligation kit (SQK-LSK108) following the manufacturer’s instructions with the following modifications: da-tailing and FFPE repair steps were combined by using 46.5μL of input DNA, 0.5μL NAD+, 3.5μL Ultra II EndPrep buffer and FFPE DNA repair buffer, and 3.0μL of Ultra II EndPrep Enzyme and FFPE Repair Mix, for a total reaction volume of 60μL. Subsequent thermocycler conditions were altered to 60 minutes at 20C and 30 minutes at 65C. The remainder of the protocol was performed according to the manufacturer’s instructions. 15μl of the resulting library was loaded onto a MinION with a R9.4.1 flowcell and run for 48 hours using MinKNOW version 2.0. Fastq files were generated from raw Nanopore data using Albacore version 2.3.1. Pass and fail reads were combined for a total of 1,520,012 reads with an average read length of 6,528 bp, with 336,792 of these reads greater than 10kb, and a longest read length of 62,463 bp.
Genome Assembly
The 10x reads were assembled using Supernova version 1.2.1 with standard settings [40]. A single version of the genome was output using the ‘--pseudohap 1’ flag. This assembly was then provided to the HiRise software [41] as the starting assembly. HiRise was performed by Dovetail Genomics (Santa Cruz, CA) and the resulting assembly returned to us.
Using long sequencing reads to close assembly gaps
Long sequencing reads generated by Nanopore sequencing were used to close gaps in the final 10x + Dovetail assembly. First, all Nanopore reads were mapped to the 10x + Dovetail Hi-C assembly using BWA [42] with the ont2d option (flags: -k14 -W20 -r10 -A1 -B1 -O1 -E1 - L0). Gaps were then closed using one of two methods, depending on whether single reads mapped to both sides of the gap.
We first identified single reads that had not been split by the aligner that mapped to at least 50 bp of sequence on either side of a gap in the 10x + Dovetail assembly and found 110,939 reads meeting this criteria. The sequence spanning the gap plus 50 bp on either side was extracted from the read and combined with other reads spanning the same gap into a single fasta file. To improve the quality of the alignment, 50 bp of sequence from either side of the gap from the reference genome was added to the fasta file. MUSCLE version 3.8.31 [43] was used, with default settings, to generate a multiple sequence alignment using all input sequences for each gap. Cons version 6.5.7.0 [44] was used to create a consensus sequence from the multiple alignment generated by MUSCLE. For each consensus sequence all N’s were removed.
Gaps not closed by single reads were then filtered and instances in which a single read was split and mapped to either side of a gap were identified, revealing 841 reads meeting this criteria. The sequence that spanned the gap but was not mapped was isolated and the 50 bp of sequence from the reference genome was added to either side of the unmapped sequence in a fasta file containing all gaps. In those instances where more than one split read spanned a gap MUSCLE was used to generate a multiple sequence alignment and Cons was then used to create a consensus sequence. Gaps in the reference genome were then replaced with the new consensus sequence.
Quality Assessment
In order to assess the continuity of each genome assembly, we first ran scripts from Assemblathon 2, which gives a detailed view of the contig and scaffold statistics of each genome [45]. We then ran BUSCOv3 [46] in order to assess the conserved gene completeness across the genomes. We queried the genomes with the mammalian_odb9 dataset (4104 genes in total). We ran all three versions of the genome assembled here (10x, 10x + Hi-C, and 10x + Hi-C + Nanopore). The final version of the assembly (10x + Hi-C + Nanopore) is what we refer to as Panleo1.0.
We also used the genes queried by BUSCOv3 in order to infer phylogenetic relationships among the felids with assembled genomes (see Table S1 for details). We first extracted all the genes in the mammalia_odb9 dataset produced for each genome by each independent BUSCO run. These protein sequences were then aligned using MAAFT ([47]; flags ‘--genafpair’ and ‘--maxiterate 10000’). We then used RAxML [48] to build phylogenies for each of the genes. We used flags ‘-f a’, ‘-m PROTGAMMAAUTO’, ‘-p 12345’, ‘-x 12345’, ‘-# 100’, which applied a rapid bootstrap analysis (100 bootstraps) with a GAMMA model for rate heterogeneity. Flags ‘-p’ and ‘-x’ set the random seeds. We subsequently used the ‘bestTree’ for each gene and ran ASTRAL [49] on the resulting trees (3,439 trees total) to output the best tree under a maximum-likelihood framework.
Repeat Masking
We identified repetitive regions in the genomes in order to perform repeat analysis and to prepare the genomes for annotation. Repeat annotation was accomplished using homology-based and ab-initio prediction approaches. We used the felid RepBase (http://www.girinst.org/repbase/; [50]) repeat database for the homology-based annotation within RepeatMasker (http://www.repeatmasker.org; [51]). The RepeatMasker setting -gccalc was used to infer GC content for each contig separately to improve the repeat annotation. We then performed ab-initio repeat finding using RepeatModeler (http://repeatmasker.org/RepeatModeler.html; [52]). RepeatModeler does not require previously assembled repeat databases and identifies repeats in the genome using statistical models. We performed two rounds of repeat masking for each genome. We first hard masked using the ‘-a’ option and ‘-gccalc’ in order to calculate repeat statistics for each genome. We subsequently using the ‘-nolow’ option for soft-masking, which converts regions of the genome to lower case, but does not entirely remove them. The soft-masked genome was used in subsequent genome annotation steps.
Annotation
Gene annotation was performed with the Maker3 annotation pipeline using protein homology evidence from the felid, human, and mouse UniProt databases. Gene prediction was performed with the Augustus Human model. We calculated annotation statistics on the final ‘gff’ file using jcvi tools ‘-stats’ option [53].
Synteny
We identified scaffolds potentially corresponding to chromosomes and any syntenic rearrangements between species. To do this we used the LAST aligner [54] to align the 20 largest scaffolds from each assembly to the linkage groups established by felcat9 (NCBI: GCA_000181335). We first created an index of each genome using the ‘lastdb’ function with flags ‘-P0’, ‘-uNEAR’, and ‘-R01’. We then determined substitutions and gap frequencies using the ‘last-train’ algorithm with flags ‘-P0’, ‘--revsym’, ‘--matsym’, ‘--gapsym’, ‘-E0.05’, and ‘-C2’. We then produced many-to-one alignments using ‘lastal’ with flags ‘-m50’, ‘-E0.05’, ‘-C2’ and the algorithm ‘last-split’ with flag ‘-m1’. Many-to-one alignments were filtered down to one-to-one alignments with ‘maf-swap’ and ‘last-split’ with flag ‘-m1’. Simple sequence alignments were discarded using ‘last-postmask’ and the output converted to tabular format using ‘maf-convert -n tab’. Alignments were then visualized using the CIRCA software (http://omgenomics.com/circa) and mismap statistics calculated.. We did not visualize any alignments that had an error probability greater than 1×10^-5. We additionally did not plot the sex chromosomes due to excessive repetitive regions and differences between the sexes of the animals that we used.
Heterozygosity
Raw illumina reads from each species were mapped to the domestic cat genome (NCBI: GCA_000181335) and the reference genome for each respective species using BWA-MEM [42]. Observed heterozygosity was calculated using ANGSDv0.922 [55]. We first estimated the SFS for single samples using the options ‘-dosaf 1’, ‘-gl 1’, ‘-anc’, ‘-ref’, ‘-C 50’, ‘-minQ 20’, -’fold 1’, and ‘-minmapq 30’ (where ‘-anc’ and ‘-ref’ were used to specify the genome it was mapped to). Subsequently, we ran ‘realSFS’ and then calculated the heterozygosity as the second value in the site frequency spectrum.
To control for possible differences in heterozygosity due to mapping or assembly quality, we also performed the same analyses on genome assemblies of different qualities for the lion (P.leo; this study, 10x and 10x + HiC + Nanopore), and the tiger (P. tigris; [56], [57,58], [59]).
Runs of homozygosity
Mapped sequences subsequently were used to infer runs of homozygosity across the genome. We used the ‘mafs’ output files from an additional run using ANGSD by adding the filters ‘-GL 1’, ‘-doMaf 2’, ‘-SNP_pval 1e-6’, ‘-doMajorMinor 1’, ‘-only_proper_pairs 0’, and ‘-minQ 15’. This run outputs a file that contains the positions of heterozygous sites across the genome. We counted the number of heterozygous sites in 1Mb bins across each scaffold and computed 1) the number of heterozygous sites in each bin and 2) the frequency of bins containing the number of heterozygous sites per kilobase. We then visualized this across the chromosomes as a proxy for runs of homozygosity in the genome, in addition to the length of the runs.
Demographic history
We removed small scaffolds (cutoff of L90 value for each genome) and those which aligned to the cat X chromosome from the lion (see LAST mapping section above). The remaining scaffolds were used for PSMC. Reads were mapped to the remaining scaffolds using BWA-MEM and the consensus sequence called using SAMtools mpileup [60], BCFtools call, and vcfutils ‘vcf2fastq’. Minimum depth cutoffs of 10 and maximum depth cutoffs of 100 were applied to all genomes using vcfutils.In order to visualize the PSMC graphs, we applied a mutation rates of 0.9e-09 [56] and a generation time of five years for the lion. We compared these inferences with those from two previously resequenced lions [56] and the Asiatic lion [38].
Author Contributions
E.E.A. conceived of the study, performed genome assembly and analysis. R.W.T. performed genome annotation and analysis. D.E.M. performed scaffolding using Nanopore data. G.B. and C.K. assisted with genome sequencing and assembly. E.E.A., R.W.T, D.E.M., E.A.H., and D.P. wrote the manuscript. All authors approved the manuscript.
Data Availability
Data and scripts used to produce these analyses are available upon request from the authors. Print will be updated with NCBI information when available.
Acknowledgements
We thank J. Taft, J. Herrberg, R. Rizzo and the rest of the staff and volunteers of Exotic Feline Rescue Center, Center Point, Indiana, for providing access to samples from Brooke the lion. We thank K. Reeves and B. Nimmo of Tigers in America for coordination of samples and their continuing support of this project and the Stanford Program for Conservation Genomics. We also thank the University of Illinois College of Veterinary Medicine Urbana-Champagne, the Peter Emily Foundation, and Dr. G. Weber-Reid for assistance with sample transfers and access. We also thank K. Panchenko for assistance writing assembly quality scripts. We thank M. Daly of Dovetail Genomics for assistance with the Dovetail submission and processing. We thank E. Ebel, G. Battu of HudsonAlpha, for assistance with lab work and sequencing. A special thanks to T. Yokoyama for assistance with Nanopore sequencing and preparation and C. Yakym for assistance with making plots. Thanks to K. Solari and S. Morgan for helpful comments on the manuscript. Animal silhouettes used are from shutterstock.com (IDs:94265464 and 279191219). We fondly remember Brooke the lion, who passed away July 2019.