Cytokine gene polymorphism and parasite susceptibility in free-living rodents: Importance of non-coding variants

Associations between genetic variants and susceptibility to infections have long been studied in free-living hosts so as to infer the contemporary evolutionary forces that shape the genetic polymorphisms of immunity genes. Despite extensive studies of proteins interacting with pathogen-derived ligands, such as MHC (major histocompatilbility complex) or TLR (Toll-like receptors), little is known about the efferent arm of the immune system. Cytokines are signalling molecules that trigger and modulate the immune response, acting as a crucial link between innate and adaptive immunity. In the present study we investigated how genetic variation in cytokines in bank voles Myodes glareolus affects their susceptibility to infection by parasites (nematodes: Aspiculuris tianjensis, Heligmosomum mixtum, Heligmosomoides glareoli) and microparasites (Cryptosporidium sp, Babesia microti, Bartonella sp.). We focused on three cytokines: tumour necrosis factor (TNF), lymphotoxin alpha (LTα), and interferon beta (IFNβ1). Overall, we identified four single nucleotide polymorphisms (SNPs) associated with susceptibility to nematodes: two located in LTα and two in IFNβ1. One of those variants was synonymous, another located in an intron. Each SNP associated with parasite load was located in or next to a codon under selection, three codons displayed signatures of positive selection, and one of purifying selection. Our results indicate that cytokines are prone to parasite-driven selection and that non-coding variants, although commonly disregarded in studies of the genetic background of host-parasite co-evolution, may play a role in susceptibility to infections in wild systems.


Introduction
Parasite-driven selection is considered a key factor shaping the evolution of the components of the immune system. In mammals and higher vertebrates, the immune system comprises dozens of interacting molecules, yet the mechanisms of this selection have been comprehensively studied only in the case of major histocompatibility complex (MHC) genes (e.g. Radwan et al 2020). Though the MHC plays an important role in the antigen-based pathogen recognition, it is not the only nor the main factor responsible for resistance against pathogens (Jepson et al. 1997). Recently, researchers focused on the components of the innate immunity, in particular the toll-like receptors (eg. Fornuskova et al. 2013, Babik et al. 2015, Kloch et al. 2018 but evolution of other elements of the immune system, including cytokines, remains poorly understood. The cytokines, signalling molecules capable of triggering and modulating the immune response are the crucial link between innate and adaptive immunity. As an efferent arm of the immune system and they are expected to be more evolutionary constrained than elements of the afferent arm (Chapman et al. 2016) but a handful of studies reported signatures of balancing or positive selection in this group of molecules. For instance, balancing selection was found in interleukins Il-1B, Il-2, and TNF in field voles (Turner et al. 2012), and in Il-10 and CD14 in humans (Ferrer-Atmetlla et al. 2008). Another evidence for contemporary parasite-driven selection operating on cytokines are significant associations between genetic variants and susceptibility to infections. In humans, polymorphism within the LTα has been linked to several diseases, such as Mycobacterium leprae (Ware 2005) and malaria (Barbier et al. 2008). Among rodents, both positive and negative associations between variation in interleukins and infections with uni-and multicellular parasites were find in the field vole (Turner et al. 2011). Single study from a freeliving animal revealed that variants affecting susceptibility may be located outside the coding part of a cytokine gene (Guivier et al. 2010) but further studies are needed to evaluate the role of noncoding polymorphisms.
To better understand the role of the parasite-driven selection in maintaining polymorphism in cytokines, and to further explain the role of this polymorphism in resistance against pathogens in the wild, we studied three cytokine genes: tumor necrosis factor (TNF), lymphotoxin alpha (LTα) formerly known as tumor necrosis factor beta (TNF), and interferon beta (IFNβ1). Their versatile role within the immune system, along with an evidence from human studies underlying its importance in a response against various pathogens, makes them a promising candidate gene for studying the mechanisms of parasite-driven selection. TNF initiates acute phase response and acts as an endogenous pyrogen contributing to the inflammation. Through stimulating endothelial cells in blood vessels it plays a role in preventing the pathogen from entering the bloodstream and containment of local infection (Murphy et al. 2012). LTα is expressed in lymphocytes and it plays a major role in immunomodulation and signal transduction within the immune system. It also induces inflammation and is the key factor facilitating the innate immune response through activation of IFNβ and NF-κB pathways (Iizuka et al. 1999;Ware 2005). LTα is also necessary for effective adaptive responses involving T and B lymphocytes (De Togni et al. 1994). IFNβ is produced in response to viral but also bacterial infections (Nagarajan 2011).
Through specific pathway, it plays a major role in linking innate and adaptive immunity. We expected to detect signatures of selection acting on the studied loci and to find significant associations between specific variants and susceptibility to infections.

Samples and parasite screening
In the present work we used samples collected in 2005 and in 2016 in three sites in NE Poland (Urwitałt, Pilchy, Tałty) located within 30 km distance. There are temporal and spatial differences in parasite prevalence between these sites described comprehensively elsewhere (eg. Bajer et al. 2005, Welc-Faleciak et al. 2008, and associations between parasite load and variation in MHC and TLR genes have been reported (Kloch et al. 2010(Kloch et al. , 2018. Number of samples per year per site is given in Table S1. The field procedures followed the guidances of the National Ethics Committee for Experimentation on Animals and were approved by the Local Ethical Committee no. 1 in Warsaw, decisions no. 280/2003 and 304/2012. Field procedures and analysis of helminth infections followed protocols described in Bajer et al. (2005) and Kloch et al. (2010). In brief, voles were livetrapped in wooden traps, transported to the field station, euthanised, and sectioned in search for internal helminths. Infections with the intestinal protist Cryptosporidium sp. were identified in faecal smears using the Ziehl-Nielsen staining technique (Heinriksen and Pohlenz 1981). Blood pathogen Bartonella sp. was identified by PCR using primers described by Norman et al. (1995), and for Babesia microti we used protocol described by Persing et al. (1992). Details of the PCR reactions are given in Table S2.

Sequencing and genotyping
To minimize PCR errors, cytokine genes were amplified using high fidelity polymerase Phusion or Q5 (NewEngland Biolabs). The mix contained 10uM dTNPs, 0.5uM of each primer, 0.02U/ul of the polymerase, and 10-50 ng of genomic DNA. Primer sequences and PCR conditions specific for each locus are given in Table S2. Amplified regions spanned most of the expressed exons and respective separating introns (Table S3).
Samples from 2005 and 2016 were processed separately following the same protocol. In both experiments, the amplicons were pooled for each individual as described previously (Kloch et al. 2018) and purified twice using CleanUp kit (Aabiot). The libraries were constructed using using Nextera XT DNA Library Preparation Kit, and sequenced using MiSeq Reagent Kit v3 in Illumina MiSeq machine. The only difference between experiments was the number of cycles used for sequencing: samples from 2005 were sequenced using 150 cycles, and samples from 2016 using 300 cycles. Both runs gave high coverage exceeding 1000x per site and per sample.
Reads from both runs were processed in the same pipeline as described previously (Kloch et al. 2018). In brief, adapter-clipped reads were mapped using bwa-mem ver. 0.7.12 with default parameters (Li 2013) against reference constructed from regions including the genes of interests extracted from bank vole genome (PRJNA290429). Duplicated read-pairs were removed using Picard MarkDuplicates (http://picard.sourceforge.net), and variants were called in two-step procedure in FreeBayes v 1.1.0-60 (Garrison & Marth 2012). In the first round, potential variants were called using the following parameters: minimal fraction of alternate allele of 20% as recommended by Nielsen et al. (2011), minimum number of reads supporting alternate allele > 2, and minimal read coverage > 5. The results were filtered using vcffilter v. 41  (Castelli et al. 2015, Lima et al. 2016). The reading frames and intron/exon structure were resolved through an alignment to the orthologue mouse sequences. The sequencing revealed 43 SNP in ~2000bp in total.

Associations between genetic variants and parasite load
The sample size was strongly limited by the permission from the Ethical Committee.
Thus, in order to minimize false positives we selected such a combination of variables fitted in the models that produce reliable results with lowest possible sample size (Hong and Park 2012).
First, we removed variants and individuals with missing SNPs. Next, SNPs were filtered in PLINK v1.90 (Purcell et al. 2007) so that only those in Hardy-Weinberg equilibrium (threshold of p<0.001) and in linkage equilibrium (r>0.7) were retained. (We tested for LD values from r>0.6 to r>0.9 and all the values resulted in the same set of SNPs). This ensured that genetic variants fitted in models can be treated as independent explanatory variables. After filtering, of the initial 18 SNPs in TNF, 17 in LTα, and 8 in IFNβ1 we retained one SNP in TNF, 7 SNP in LTα, and two in

IFNβ1.
In a second step, we excluded from the models variants with MAF<0.5 and those present in fewer than 5 animals (see Table S4), as we lacked statistical power to test for their effects. As response variables we used only pathogens that infected 20-80% of hosts (Table S5).
We started the modelling from testing for the effect of non-genetic terms that may affect the parasite load: year and site of sampling, and host sex and host age approximated by its body mass (Table S6). If p<0.1, these terms were included in the models as co-factors. For each studied gene, we constructed two sets of generalized linear models (GLM) using R with either i) parasite presence/absence, or ii) parasite abundance (number of parasites of a given species per host) as the response variables. We used binomial errors for parasite presence/absence and Poisson for abundance, controlling for overdispersion. The explanatory variables were SNPs retained after filtering and non-genetic terms as described above. To minimize type I error when comparing effects of more than one genetic term, the p-values were corrected for multiple comparisons using Benjamini-Yekutieli false discovery rate (Benjamini and Yekulteli 2005) using p.adjust function in R. For all the calculations we used R version 3.6.3 (R Core Team 2020).

Tests of selection
To identify potential targets of selection, we computed phylogenetically controlled codon-based tests which are suitable for identifying sites under selection using sets of sequences from a single species (Kosakovsky Pond and Frost 2005). For the calculations, we used DataMonkey server (Delport et al. 2010). Prior to analysis, we tested for possible recombinations using GARD, Genetic Algorithm Recombination Detection (Kosakovsky Pond et al. 2006). We

Cytokine polymorphisms and susceptibility to infections
The only SNP in TNF gene retained after filtering did not affect presence/absence nor intensity of infection with any of the studied parasites (Table S7).
Two SNPs in LTα were significantly associated with parasite load (Table 1). Individuals with genotype CT in position 322 coding for non-synonymous substitution Gln > Arg were more often infected with the nematode A. tianjinensis than homozygotes TT (57.1 vs. 34.4% infected, respectively, Figure 1, Table 1). The alternative homozygote CC was found in a single host so we lacked statistical power to estimate its effect. An intronic variant LTα 525 affected abundance of infection with the nematode H. mixtum. Homozygotes TT were infected by fewer worms compared to genotypes GG and TG (Figure 1). Two IFNβ1 variants affected the parasite load (Table 1). Voles heterozygous in synonymous variant IFNβ1 105 were less frequently infected with the nematode H. glareoli (11.7%) compared to homozygotes TT (57.14%) and CC (18.9%, Figure 1), and they also harboured fewer worms (0.411) compared to 1.78 vs 1.19 in the respective homozygotes. In IFNβ1 127 homozygotes AA were most often infected (38%), and prevalence was similar in genotypes AG and GG. Homozygotes GG had the lowerst infection intensity but average high parasite load in heterozygotes may be contributed to a single highly infected individual. When it was excluded, intensity of infection was similar in genotypes AG and GG.
Genotype in IFNβ1 105 affected also infection risk with blood parasite Bartonella yet this effect was marginally non-signficant after controlling for multiple comparisons (p=0.068, Table   1). Heterozygous voles were more often infected than either homozygote (36.3% compared to 18.4% in CC and 21.24% in TT.

Signatures of selection
Selection tests were generally consistent across loci, and each of them reported the same set of codons (Table S8)

Discussion
Studies of associations between polymorphisms in the immunity genes and susceptibility to diseases in wild mammals usually focus on receptor proteins, such as MHC or TLR. Since they physically interact with pathogen-derived motifs, their nucleotide composition can be directly attributed to functional variation. In the case of cytokines -excreted molecules involved in signal transduction -identifying potential targets for selection is not straightforward. Some authors suggested that molecules of such a function are primary affected by purifying selection (Chapman et al. 2016), but several studies in free-living mammals showed otherwise (Turner et al. 2011(Turner et al. , 2012. Numerous human studies associated SNP variation in cytokines with a variety of diseases and conditions (reviewed in Hollegaard and Bidwell 2006), and in the present study we show that cytokine variance play also a role in susceptibility to infections in free-living rodents.
We found signatures of positive selection in two variants: LT 525 and IFN 127 what suggests evolutionary pressure favouring these sites. This may be explained by significant effect of these variants on parasite load what is consistent with a pathogen-mediated model of evolution: repeated rounds of positive selection interspersed with purifying selection. Interestingly, variant IFN1 105 associated with susceptibility to two pathogens displayed signatures of purifying selection. Individuals with the genotype TT were more susceptible to the nematode H. glareoli but less to blood parasite Bartonella sp. compared to other genotypes, and the frequency of the TT genotype was twice lower than the alternative CC. We hypothesize that TT genotype is negatively selected what may be caused by a stronger fitness effect of an infection with bacteria that attacks endothelium and erythrocytes compared to nematodes dwelling the intestine lumen.
Variant IFN1 105 affecting parasite load is synonymous. This result may seem confusing, as synonymous substitutions do not affect the amino-acid composition of a protein.
However, site-specific signals of selection in this locus strongly suggest that non-neutral pressure is exerted on these sites. The role of non-coding variants may be more significant than previously thought, for instance the list of human diseases associated with synonymous mutations is expanding (Sauna and Kimchi-Sarfaty 2011). In a large meta-analysis of human GWAS, Chen et al. (2010) reported that synonymous SNPs were as often involved in disease mechanisms as non-synonymous SNPs, and they were not in linkage disequilibrium with causal non-synonymous SNPs. Several mechanisms may explain the role of synonymous mutations. They may affect mRNA splicing and stability of transcripts (Cartegni et al. 2002). Although coding for the same amino-acid, some variants may be preferred during the elongation, promoting co-evolution to optimise translation efficiency (Plotkin and Kudla 2011). On the other hand, since synonymous mutations are assumed to be evolutionary silent, their effects might have been under-reported (Chen et al. 2010, Sauna andKimchi-Sarfaty 2011). This further underlines the importance of studies on synonymous SNPs in non-model species for better understanding processes maintaining genetic diversity within the immune system in the wild.
Another type of non-coding SNPs that we found to significantly affect the parasite load was an intronic variant LTα 525. Notably, it was not linked to any exonic variant, and strong LD was found only to another intronic SNP. Again, human association studies confirmed the role of intronic polymorphisms in susceptibility to diseases (Manolio et al. 2009;Cooper 2010), particularly when mutations are located close to intron-exon junctions or within a branchpoint sequence (Královicová et al. 2006). Intronic variants may also affect expression through alternative splicing or interactions with regulatory elements (Cooper 2010). The intronic LTvariant 525 that affected susceptibility to infection with H. mixtum was located only 18bp from intron-exon junction. Introns in LTα gene were shown to affect expression in several in vitro and in vivo studies (reviewed in Yokley et al. 2010), and intronic SNPs can still affect splicing or expression, even if separated by over 30bp from any splice site (Coulombe-Huntington et al. 2009;Cooper 2010).
Studies of the effect of intronic variants on resistance against infections in the wild are rare but in humans, positively selected SNP associated with susceptibility to Lassa virus were found in interleukin IL21 gene outside the open-reading frame (Andersen et al. 2012). The authors suggested that those variants may lead to regulatory changes such as differential gene expression.
Intronic variants may also affect cytokine interactions with other components with the immune system. For instance, an intronic variant in the human IFN gene coincides with a putative NF- B binding site which might have functional consequences for the transcription of the human IFN gene (Pravica et al. 2000). In bank voles, SNP located within the promoter of the TNF gene affected susceptibility to Puumala virus (PUUV) (Guivier et al. 2010). Unfortunately, we could not confirm this pattern as the fragment amplified in the current study did not span 5' upstream region.
Our finding suggests that intronic and synonymous variants may play an important role in susceptibility to infections in wild systems. Thus, it should not be neglected in studies of the genetic background of host-parasite co-evolution.

Funding
The work was supported by grant no. 2012/07/B/NZ8/00058 from the Polish National Science Centre to AK.
Author contributions AK conceived the study, genotyped samples and analysed data, ABi did test of selection. AK and ABi prepared the manuscript. AK and ABa identified helminth infections and collected data in the field with the help from EM. EM and RWF analysed infections with blood parasites.

Declaration of Competing Interest
None.