Abstract
Using ancient DNA sequences, we explored the function of olfactory receptor genes in the genus Homo. Humans, Neandertals, and Denisovans independently adapted to a wide range of geographic environments and the odours produced by their food. Variations in their odorant receptor protein sequence and structure resulted in variation in detection and perception. Studying thirty olfactory receptor genes, we found our relatives showed highly conserved receptor structures, but Homo sapiens did not. Variants led to changes in sensitivity to some odors, but no change in specificity, indicating a common olfactory repertoire in our genus. Diversity of geographic adaptations in H. sapiens may have produced greater functional variation in our lineage, increasing our olfactory repertoire and expanding our adaptive capacity.
One-Sentence Summary Using ancient DNA we studied the sense of smell in our extinct ancestors and in our relatives, Denisovans and Neanderthals
Main Text
Terrestrial animals smell by binding odorant molecules to odorant receptors (ORs) but there is limited knowledge of odorant-OR and genotype-phenotype associations. Variation in mammalian ORs is strongly linked to ecological and dietary niche (1, 2). The human genus Homo underwent the most radical ecological niche expansion of all primates when migrating out of Africa and adapting to diverse global environments (3). Denisovans and Neandertals ancestors dispersed from Africa earlier than present-day humans (4) (∼750,000 versus 65,000 years ago) and separated from each other ∼300,000 years ago (5) (Fig. S1). Neandertals were geographically wide-ranging (western Europe, Middle East, Asia) while Denisovans were geographically constrained to Siberia (6, 7), the Tibetan plateau (8), and possibly beyond Wallace’s Line (9). Olfactory stimuli from divergent environments following independent dispersals from Africa may have left traces of variation in Homo ORs. Among present-day humans, changes in OR function are linked to major evolutionary dietary shifts, such as scavenging, hunting, animal milk consumption, cooking, domestication (1, 10–14). But, what about the gap between Homo migrations and human-specific changes in ORs?
Using published ancient DNA sequences to analyze genetic variation for population structure and test functional differences in gene variants for ecological differentiation, we assess whether there is a shared Homo olfactory repertoire (range of detectable odors). We previously explored genetic and functional variation in present-day humans, Altai Neandertal, and Denisovan for OR7D4 (15). We extended our study to 29 additional open reading frames (Table S1) for ORs with known human receptor-odor (16, 17) and two additional Neandertals (Chagyrskaya, Vindija) and one ancient human (Ust’-Ishim) who lived in the same Altai montane locality (Table S2). We used 1000 Genomes for present-day humans (Table S3) (6). We found that novel variants alter sensitivity but not specificity of OR function and conclude that the olfactory repertoires of extinct lineages were highly overlapping. We also argue that expansion of the human olfactory repertoire occurred after our split with other migratory members of our genus.
Results
Genetic variation
Extinct lineages and Ust’-Ishim had fewer DNA and protein variants than 1000 Genomes (Fig. 1, Table S4)—on average 0.19% of nucleotides across 17 genes (or 0.11% when dividing across all 30 genes, including those matching the human wild type) compared to 0.82% of nucleotides across 30 genes in 1000 Genomes. Ancient populations were more prone to genetic drift (fixation and loss) due to smaller effective population sizes and living in small, isolated communities (18). Extinct lineages exhibit a pattern of OR gene conservation and present-day humans exhibit a pattern of tolerating variation.
The fixation index (Fst) measures genetic variance due to population structure (typically weighted by population size) and ranges from 0 (no differentiation) to 1 (complete differentiation) (19). The Fst values by gene for 1000 Genomes populations are lower than other large-bodied mammals with wide geographic dispersal (Table S5) (20). OR5P3 fixation was the highest at 11.4%, the lower limit of possible significant differentiation by population structure. The 1000 Genomes populations are not highly differentiation based on the 30 ORs used in this study an Fst mean of 4%. In contrast, the genus Homo (1000 Genomes and ancient samples) has an Fst mean of 11%, the lower end of possible significant differentiation by population structure. There was variation among Fst by gene, however. Ten Fst values were <1% (OR2W1, OR2J2, OR4Q3, OR5K1, OR6P1, OR7D4, OR8B3, OR10G7, OR11A1, OR51L1), which suggests the sort of random distribution associated with panmixia or genetic conservation. Of the 13 genes with Fst > 12%, seven were >20% (1C1, 5AN1, 7C1, 10G3, 10J5, 51E1, 51E2), which suggests the sort of distribution associated with high population differentiation. Of the genes with high Fst, only two had novel variants in extinct populations, which suggests that the small number of novel variants are not highly influential in differentiating populations (Table S7). The greater number of genes with high Fst indicate that Homo used to be more structured by population than present-day humans.
Looking at the distribution of variants, two ancient genes were identical to the human wild type (OR4Q3, OR8B3) and twenty contained variants also observed with 1000 Genomes—shared variation (Table S6), suggesting these variants occurred prior to global divergence of Homo.
Only 11 genes, with a total of 14 variants, were unique to ancient samples (not found in 1000 Genomes). These genes may well be ones that were influenced by olfactory stimuli in divergent environments following dispersals from Africa. Denisovan had nine novel variants (of which two were synonymous) compared to the Neandertal five (of which 2 were synonymous). No novel variants were observed in the ancient human Ust’-Ishim.
Amino acid sequences for the ORs studied were used to form a cladogram to explore the relationships of our samples (Fig. S2). Extinct lineages formed a clade with Vindija Neandertal the most distinct, which was unexpected because genome-wide studies have indicated that Vindija is most genetically similar to Chagyrskaya Neandertal (18). The extinct clade was closest to the ancient human Ust’-Ishim and then to East and South Asian present-day humans—the latter groups harbor genetic signatures of introgression with Neandertals and Denisovans (9).
Functional Variation
Because gene function in not reliably predictable for ORs from sequence data (21, 22), we directly measured the functional responses of ORs containing novel variants.
Each OR protein, expressed in a cell line, was screened against seven odorants previously identified in the literature as evoking responses: OR1A1 (21, 23), OR1C1 (22), OR2C1 (21), OR10J5 (21), OR5P3 (21), and OR10G3 (22). Dose response assays for the top screening responses included seven concentrations of the odors delivered separately.
There were only three Neandertal genes containing novel variants. Their dose responses were not correlated with those of present-day humans (Figs. 2A, 2C, 2D). Only 1C1 had a detectable response but significantly lower than that of present-day humans (Figs. 3A, 3C, 3D, 4, S2, S3).
Despite the higher number of novel OR variants in the Denisovans and higher dose responses compared to present-day humans (Figs. 2C, 2D), the OR responses for six genes and those for human reference were significantly correlated (R2 = 0.87) (Fig. 2B). Tested Denisovan ORs were less sensitive to odors that present-day humans perceive as floral but much more sensitive to odors perceived as spicy, balsamic, or unpleasant (e.g., sulfur 4x greater and balsamic 3x greater than in present-day humans) (Table 1). Higher dose responses in Denisovan ORs appear to be driven by two amino acid variations in two of the ORs with novel variants (Fig. 3, Fig. S3).
Despite using the same DNA sequence as previous studies of 2B11 (22) and 6P1 (17), the human reference for both did not respond to any of the ∼350 odors (100µM) against which they were tested (Fig. S4)—neither did the extinct linages. In previous studies, OR2B11 (22) and OR6P1 (17) responded strongly to coumarin and anisaldehyde, respectively, at concentrations higher than 100µM. We found that such concentrations often cause cell toxicity or OR non-specific cell responses.
OR1A1. All but 2-heptanone induced a response in the screening assay (Fig. S3A) and the highest responses were for honey (particularly for Denisovan) (Table 1, Fig. 3). Neither of the two Denisovan variants (V461I1.54, T257M6.54) were in an amino acid region critical for mammalian OR function nor were they involved in the odorant binding cavity—perhaps explaining their minimal functional impact.
OR1C1. The only significant response was from Chagyrskaya Neandertal to androstadienone (Fig. S3B) and it was weak (Table 1; Fig. 3). Chagyrskaya Neandertal 1C1 variant Y120H3.48 is part of the highly conserved MAY3.48DRY motif involved in the activation of mammalian ORs (Fig. 3, Fig. S3B), which might explain why this variant alters function.
OR2C1. Screening assay responses were strong but not statistically significant (Fig. S3A). The dose response assay for octanethiol produced a statistically significant response in the Denisovan version of this OR (Table 1, Fig. 3). The shared C149W4.50 corresponds to the conserved W4.50. The W residue is highly conserved in GPCRs but less so in ORs (58%). The location of the novel Denisovan I214T5.54 in TM5 is below residues involved in canonical ligand binding cavity and it points into the receptor rather than the surface. In addition, prior functional tests for C149 found a similar response (17). The C149W allele may stop protein function and may have produced octanethiol-specific anosmia.
OR5P3. In the screening assay, Vindija Neandertal did not have a significant response but the human reference responded to five of the seven odorants (Fig. S3B). Vindija dose responses, which included higher concentrations of the top three responses for present-day humans (courmarin and both enantiomers of carvone), did not exceed control (Fig. 3). The cell surface expression for Vindija indicated that the OR proteins was present at the cell surface, albeit at a slightly lower level than that in present-day humans. Vindija F159L4.58 is in the extracellular part of TM4 (Fig. 3), near 4.53, which is involved in mouse OR trafficking (24). We observed slightly lower trafficking of the Vindija protein. A similar mutation (S155A4.56) in human OR1A2 decreases in vitro responses to (S)-(-)-citronellal (23). If OR5P3 F159L4.58 is involved in odorant binding, the mutation of this position from phenylalanine to leucine might prevent the π-π stacking interaction between the aromatic residue and coumarin. We conclude that the Vindija protein is not functional and this might be attributable to many potential reasons (a few examples are lack of odorant binding, or fail in activation mechanism, or fail to bind the G protein).
OR10G3. The screening assay revealed significant responses for all seven odors in the Denisovan OR and in the human reference (Fig. S3A). Denisovan variants had significantly stronger dose responses to vanillin and ethyl vanillin compared to present-day humans (Table 1; Fig. 3). Neither of the Denisovan variants ( S73G2.52, E197K5.36) were located in conserved amino acid regions (Fig. 3). TM2 is not involved in odorant binding or receptor function, implying that S73G2.52 probably did not change the receptor response. TM5 (E197K5.36) forms part of the binding cavity but position 5.36 is located at the very limit of ECL2. K at this position is a rare residue in present-day human ORs (3.6%), suggesting a functional, adaptative reason for this change. The location of the variant suggests it may be involved in ligand entry.
OR10J5. There were three significant screening responses to lyral and helional and eugenol (Fig. S3A). The Denisovan response to top odor lyral was lower than that of the human reference (Table 1, Fig. 3). The G21RNter variant of 10J5 found in the Denisovan is located at the very end of the N terminal end, just before the start of TM1. The role of this region in OR function is undetermined.
Discussion
Ten novel missense variants were located in 8 genes (out of 30), with five being functionally different from present-day humans (1C1, 2C1, 5P3, 10G3, 10J5), one the same (1A1) and two without identifiable ligands (2B11, 6P1). Given the small percentage of genes with variants altering OR function, members of the genus Homo likely shared an olfactory repertoire, with Neandertals and Denisovans smelling the same range of odors we do but having different dose responses to those odors. When OR function was altered by a novel OR, the difference was in sensitivity rather than specificity. Novel Denisovan OR variants (1A1, 2C1, 10G3, 10J5) were twice as responsive as human equivalents to odors present-day humans perceive as spicy, balsamic, and unpleasant (Table 1), but not to odors perceived as floral. Novel Neandertal variants were three times less responsive than human ORs, including reduced responses to odors perceived as green, floral, and spicy (Table 1). There is some correlation in Neandertal skull morphology that suggests their olfactory bulbs were smaller than present-day humans (6), but the link between bulb size and olfactory acuity is unclear (25, 26).
The Denisovan ORs strong response to honey may be ecologically significant. Honey is the most energy-dense natural food and is a prized component of living hunter-gatherer diets (except where bees are rare or absent)—even great apes have a ‘honey tooth’ (27). Energy-dense foods like sugars are sought by larger-brained primates (28) and, based on oral microbiome data, Neandertals and present-day humans share functional adaptations in nutrient metabolism including starch digestion that are not found in our closest ape relatives (29). The high response to vanilla odors suggests a potential response to sweet things—an odor-taste pairing common in present-day humans (30).
Local ecological adaptive pressures may have acted on ORs in extinct lineages to produce the few novel variants observed, but extinct lineages were less variable in OR genes and proteins compared to 1000 Genomes. While differences in sample sizes might account for some of the striking differences, the reduced variation is probably due to genetic drift effect or conservation. Small effective population sizes cause bottlenecks and small, isolated populations (18) are prone to genetic drift. Purifying selection has been observed in chimpanzee ORs compared to a mix of relaxed and positive selection in human ORs (31) and extinct lineages may also have been subject to this, as evidenced by fewer variants that mostly code for synonymous proteins (compared to protein variation in 1000 Genomes). The mean of Fst values across genes comparing 1000 Genomes to extinct lineaages (11%) is higher than those for 1000 Genomes population comparisons (4%), which suggests that there are structural differences between them and us that reflect both explanations—drift and conservation. Based on our data the last common ancestor shared by Homo (present-day humans, Neandertals, Denisova, and others) and Pan (chimpanzees, bonobos) had a conserved set of ORs. Present-day humans derived away from the pattern of conservation more recently, with evolutionary pressure toward increased missense variation.
Despite having a shared repertoire with Neandertals and Denisovans, present-day humans are novel in their highly variable OR repertoire which may reflect cultural adaptations following migrations from Africa. Relaxed selection on OR genes for groups no longer engaging in traditional lifestyles is possible (32). 1000 Genomes groups outside Africa are less variable in most OR genes than those in Africa and OR gene enrichment is observed in African hunter- gatherer groups (Hadza and Pygmies) but not African agricultural (Yoruba) and pastoral (Maasai) groups (33). Tanzanian Sandawe hunter-gatherers show no OR allelic enrichment, however, which undermines the case for relaxed selection (33). High allelic diversity and OR generalization on a broad scale may have functional implications, such as increasing the effective size of the olfactory repertoire (34). Understanding our unique OR allelic diversity is an important challenge.
Our data provide insights into how the dispersal of human lineages outside of Africa (Denisova, Neandertal, ancient human) may have affected olfactory gene repertoire and function.
Understanding the evolutionary genetics and functional significance of observed OR allelic variability in and almong human populations and extinct relatives sheds light on the role of olfaction in key aspects of human culture, and perhaps our current success as a global species.
Funding
Provide complete funding information, including grant numbers, complete funding agency names, and recipient’s initials. Each funding source should be listed in a separate paragraph.
US National Science Foundation Award 1550409 (KCH) National Institutes of Health grant K99DC018333 (CADM) US National Science Foundation Award 1556207 (HM) National Institutes of Health grant DC014423 (HM) National Institutes of Health grant DC016224 (HM)
Author contributions
Conceptualization: KCH, HM, MC
Methodology: KCH, CADM, HM, MC
Investigation: KCH, CADM, MA
Visualization: KCH, CADM
Funding acquisition: KCH, CADM, HM
Project administration: KCH, HM
Supervision: KCH, HM
Writing – original draft: KCH, CADM
Writing – review & editing: KCH, CADM, MC, HM
Competing interests
Authors declare that they have no competing interests.
Data and materials availability
Data Availability. Raw data for this project and data derived from it are available in at https://github.com/kchoover14/OldNoses and are usable under the license provided in the repository.
Code Availability
Code for VCF data scraping and code for analysis in Tables S4-S5 and Figs. 1-2 and Figs. S1-S2, S7-S9 are available at https://github.com/kchoover14/OldNoses and are usable under the license provided in the repository.
Supplementary Materials
Materials and Methods
Materials
We cataloged variants in high quality paleogenomic sequence data produced by the Max Planck Institute Leipzig (Table S2) for Neandertals (Altai, Chagyrskaya, Mezmaiskaya), Denisovan 3 (the only high quality Denisovan genome, finger phalanx), and an ancient human hunter-gatherer from Siberia (Ust’-Ishim). We used the data generated by snpAD in /neandertal/vindija/vcf, an ancient DNA damage-aware genotyper.1 The VCF reference genome for genomes analyzed was hg19/GRCh37. Only variants with a minimum genotype quality of 20 (GC20) were used for downstream analysis. Our sample of present-day humans was from the 1000 Genomes dataset (35) consisting of over 2,500 individuals from 26 populations in Africa, the Americas, Europe, and Asia (Table S3). The 26 groups in 1000 Genomes were comprised of individuals with at least 3 out of 4 grandparents identifying membership in the group. While there are some groups engaging in pastoral or traditional horticulture/farming lifestyles, most practice mixed subsistence economies and lead western lifestyles. Although the draft genomes of Neandertals and Denisovans were reported as having more nonfunctional ORs than present-day humans (36), these genomes have since been revised (37, 38). The bioinformatic identification of olfactory pseudogenes has been experimentally challenged (39) by data showing the receptors containing coding sequence regions split into separate exons are conserved across mammals and expressed at the same level as protein-coding receptors with a single exon (40).
We focused on 30 OR genes shown to generate functional response data in previous studies (16) (Table S1). Gene regions were targeted using NCBI and RefSeq ranges—in the case of eight genes that contained up- and downstream sequences, the region was cut to the protein coding portion of the gene to be consistent with the other 22 genes that contained no up- or downstream areas.
Methods
Variant calling
Max Planck and 1000 Genomes Project both used the reference genome, hg19/GRCh37, to call variants using VCFtools (41). GRCh37 is built from sequences from different individuals and serves as the wild type relative. If the reference allele is C at a specific genomic position, a variant is called if it is not a C. We excluded insertions and deletions from analysis. BCFtools (42) was used to slice VCFs into chromosomes (ancient DNA) and genes of interest. For 1000 Genomes, BCF tools was also used to generate population specific VCF files. BCF tools was used to create consensus sequences for each gene for the entire 1000 Genomes dataset and for each population.
We cataloged variants using two sets of VCF data published by Max Planck Institute for Evolutionary Anthropology Leipzig in 2013 and 2016. Both datasets are based on the same original sequence data but the 2016 VCFs were generated using snpAD, an ancient DNA damage-aware genotyper, and are more conservative estimates mutations (37).
The 2016 ancient VCFs (Table S2, S6) include high quality data from three Neandertals (Altai, Chagyrskaya, Vindija), lower quality data from one Neandertal (Mezmaiskaya), high quality from one Denisova (Denisovan 3), and one ancient human contemporary to Altai Neandertal and Denisova in Denisova Cave (Ust’-Ishim).
The 2013 VCF data analysis included high quality data—but not subject to the damage-aware genotyper—for Altai Neandertal, Denisova 3, and Ust’-Ishim. Variants were called using a custom bioinformations pipeline (Fig. S5). Data are found at http://cdna.eva.mpg.de/ust-ishim/VCF and http://cdna.eva.mpg.de/neandertal/altai/AltaiNeandertal/. Most variants were shared with present-day humans (Table S8). Three genes matched the human wild type (contained no variants, novel or shared) in all three samples tested (OR5K1, OR11A1, OR56A4) and 16 contained novel missense variants (Table S9). See Fig. S6 for functional results and Fig. S7 for regression results—Altai is significantly different from present-day humans and Denisova but there were few differences in terms of patterns of response.
Novel variant calling
R v4.1.0 (43) via R Studio v1.4.1717 (Rstudio, inc., Boston, MA, 2015) were used for analysis. Excel and csv files were read using readxl (44). The hash function in R used 1000 Genomes variant data (35) as a key to flag variants present in ancient samples but not found in present-day humans. Data were manipulated using dplyr (45) and tidyverse (46).
Genetic Analysis
Variants were called using DNAsp to generate Nexus files for each gene comparing a consensus sequence for each of the 26 groups in 1000 Genomes to the human reference sequence used for the published datasets (Max Planck and 1000 Genomes). Consensus sequences were used for each group in 1000 Genomes to reduce the sample size for comparison to ancient sequences which represented individuals, not groups. Biostrings (47) was used in R to create protein sequence files and call amino acid substitutions. Figures for percentage of variation by gene were generated using the R package ggplot2 with the viridis color-blind friendly palette and plotted panels using gridExtra (48). Concatenated amino acid sequences were used to infer phylogenetic relationships across populations for 30 genes (49). The phylogenetic tree was created using the R packages ape (50), seqinr (51), phylogram (52), gdsfmt (53), SNPrelate (54), and dendextend (55). Fst was calculated using gdsfmt (53) and SNPRelate (54) for Figs 1-2 and Figs S1-S2, S7- S9.
Primer design
ORs for extinct humans were created by mutating human ORs to match paleogenomic sequence Data using chimeric PCR. Forward and reverse PCR primers containing the desired mutation were designed to have a 56⁰C or 58⁰C annealing temperature, obtained from Integrated DNA Technologies, and diluted to 5μM.
Chimeric PCR
Chimeric PCR was performed using Phusion polymerase and Rho-tagged OR in a pCI vector as a template, with separate reactions using the forward primers paired with a 3’ pCI primer or reverse primers paired with a 5’ pCI primer (56). The reaction was started at 98⁰C for 30 seconds, then run for 25 cycles of the following conditions: denaturation at 98⁰C for 5 seconds, primer annealing at 55⁰C for 15 seconds, and elongation at 72⁰C for 30 seconds. The reaction was then held at 72⁰C for 5 minutes. The products resulting from the forward and reverse primers were combined for each mutant and diluted 10x with distilled water (Gibco). A second PCR was performed using Phusion polymerase, 5’ and 3’ pCI primers, and the combined and diluted products for each desired mutant as the template. The same PCR conditions were used. The products were purified using the QIAquick PCR Purification Kit (Qiagen), cut with MluI and NotI restriction enzymes, run on a 1.1% agarose gel with GelRed, and extracted using the QIAquick Gel Extraction Kit (Qiagen). The products were then ligated into Rho-tagged pCI cut with MluI and NotI using T4 ligase (New England Biolabs) and used to transform competent ampicillin-resistant E. Coli. These were plated on LB-ampicillin plates and incubated at 37⁰C overnight, then a single colony was grown in 4mL of 2XYT-ampicillin (100mg/mL) medium overnight at 37⁰C. The Denville Miniprep Purification Kit was used to lyse the bacteria and purify the plasmid DNA. The concentration of the products was determined using an Eppendorf Biophotometer, and then adjusted to 100ng/μL using TE buffer. The products were sequenced using BigDye Terminator Sequencing Kit (Applied Biosystems), purified using Sephadex (GE Healthcare), and sequenced with 3130 Genetic Analyzer (Applied Biosystems).
Luciferase assay
Functionality was tested by a Luciferase assay as described in Zhuang and Matsunami (16). Hana3A cells were plated in Minimum Essential Medium (MEM) containing 10% FBS (vol/vol), penicillin-treptomycin and amphotericin B on Poly-D-lysine coated 96-well plates (Corning #3843) and co-transfected using Lipofectamine2000 (Invitrogen) with the prepared Rho-tagged OR or the empty vector pCI (negative control), RTP1S, and the muscarinic receptor M3. The cells were transfected with elements from the Dual-Glo Luciferase Assay System (Promega): the cAMP response element (CRE) coupled with a luciferase reporter gene (L); and the constitutively active SV40 promoter region coupled with the Renilla luciferase (RL) reporter. In a period of 18-24 hours after transfection, the medium was replaced with 0 μM, 1 μM, 3.16 μM, 10 μM, 31.6 μM, 100 μM, or 316 μM of odorants (Sigma Aldrich) in CD293 (Gibco) containing glutamine and copper and incubated for 3.5 hours. After the addition of Dual-Glo luciferase substrate and buffer (Promega), plates were read using BMG Labtech POLARStar Optima plate reader. Promega Stop-and-Glo buffer and RL substrate were added, and plates were read again. The degree of activation was quantified for each well in Microsoft Excel using the formula (L- 400)/(RL-400) and reported on the y axis (emitted in lumens). The RL quantification is made due to variation in cell numbers in each well and the 0μM values provide the basal activity value for each OR. 0μM thus served as a comparison to identify OR response values to odorant stimulations —if the odor value on the y-axis exceeded the 0μM value, the OR was taken to have responded to the odor but if the odor value did not exceed the 0μM value, the OR was considered unresponsive. Dose responses and EC50s were determined with Graphpad Prism 7. Activity index have been calculated from dose responses by multiplying the absolute value of the logEC50 to the efficacy. If EC50 could not be determined because the dose response did not reach its plateau, logEC50 were set as the arbitrary value -2. These values were used for Table 1 along with odor descriptors and olfactory notes from Good Scents (57).
Flowcytometry
The cell surface expression of the Rho-tagged ORs was evaluated as described in Ikegami, de March et al. (24). HEK293 cells were plated in 35 mm plates at 25% confluency and grown overnight at 37°C and 5% CO2. The cells were then transfected with the Rho-tagged ORs and GFP using the Lipofectamine2000 reagent (ThermoFisher Scientific) in MEM supplemented with 10% FBS. 18 to 24h later, the cells were detached from the plates with cellstripper and resuspended in PBS containing 15 mM NaN3, 2% FBS in ice. The cells were then centrifuged at 4°C and resuspended and incubated for at least 30min in primary antibody (mouse anti Rho4D2, Millipore Sigma). The cells were then centrifuged at 4°C and resuspended and incubated for at least 30 min and in the dark in phycoerythrin (PE)-donkey anti-mouse F(ab’)2 Fragment antibody (Jackson Immunologicals: 715-116-150). Finally, the cells were stained with 7-Amino- actinomycin D (7AAD, Calbiochem). The cell surface expression was monitored through the PE fluorescence emitted by cells GFP positive, single, spherical, viable and 7AAD negative using a BD FACSCanto II. The results were analyzed with Flowjo. Olfr539 and Olfr541 were added to the experiment plan as positive and negative, respectively, controls of OR cell surface expression.
Homology model
The protocol followed a previously published method (56). 391 human OR sequences were aligned to pre-aligned sequences of 11 GPCRs including bovine rhodopsin (PDB: 1U19), human chemokine receptors CXCR4 (PDB: 3ODU) and CXCR1 (PDB: 2LNL), and human adenosine a2A receptor (PDB: 2YDV) using Jalview (58). The four experimental GPCR structures (1U19, 3ODU, 2YDV and 2LNL) were used as templates to build the human consensus OR by homology modeling with Modeller. The human consensus amino acid sequence was determined by aligning 391 human OR sequences and by selecting the most conserved amino acid for each position. Five models were obtained and the one fulfilling several constraints (binding cavity sufficiently large, no large folded structure in extra-cellular loops, all TMs folded as α-helices, a small α-helix structure between TM3 and TM4) was retained for further structural analysis.
Visualization of the models and picture generation were performed with VMD.
Regression Analysis
Data were analyzed in R for Windows 4.1.0, using the R Studio GUI for Windows 1.4.1717. Data were wrangled using dplyr (45) and tidyverse (46). The dose response dependent variable was fitted to distributions using fitdistrplus (59) and formed log normal distributions. The natural log was used to transform to normally distributed data for linear regression models. The car package (60) was used for ANOVA. 0μM values, as noted above, provided a control (the baseline threshold of OR activation). Because the number of cells varied across plates, the OR tested for each sample had a different 0μM. As such, removing values below the 0μM for each correlation between present-day humans and ancient DNA sample resulted in some odors missing values for one of the lineages. We compared the results of regression for testing using only active ORs (above the 0μM value) and for the full set of ORs—the results were nearly identical (Fig S8 for 2016 VCFs, Fig S9 for 2013 VCFs). Testing the full suite of responses, active or below the threshold, allowed a broader comparison of how these cells operate. Analysis was conducted on the full dataset for both between sample correlations and for mean difference testing. Plots for data visualization and results were created using ggplot2 (61) and ggpubr (62). Panels were created using cowplot (63) and ggarrange.
Data S1. (separate file)
Raw data and codes for this project and data derived from it are available in at https://github.com/kchoover14/OldNoses and are usable under the license provided in the repository.
Acknowledgments
The high-performance computing and data storage resources operated by the Research Computing Systems Group at the University of Alaska Fairbanks, Geophysical Institute.
Footnotes
This version has a revised Fst table and the data that underlie that calculation. There were some minor edits to text following Fst revision and first review process.