An Alzheimer’s disease-associated common regulatory variant in a PTK2B intron alters microglial function

Genome-wide association studies (GWAS) are revealing an ever-growing number of genetic associations with disease, but identifying and functionally validating the causal variants underlying these associations is very challenging and has only been done for a vanishingly small number of variants. Here we validate a single nucleotide polymorphism (SNP) associated with an increased risk of Alzheimer’s disease (AD) in an intronic enhancer of the PTK2B gene, by engineering it into human induced pluripotent stem cells (hiPSCs). Upon differentiation to macrophages and microglia, this variant shows effects on chromatin accessibility of the enhancer and increased binding of the transcription factor CEBPB but only subtle effects on PTK2B or CLU expression. Nevertheless, this variant also results in global changes to the transcriptome and phenotype of these cells. Expression of interferon gamma responsive genes including chemokine transcripts and their protein products are altered, and chemotaxis of the resulting microglial cells is affected. This variant thus causes disease-relevant transcriptomic and phenotypic changes, and we propose that it acts by altering microglia reactivity, consistent with the role of these cells in progression of AD.


Introduction
Alzheimer's disease (AD) is the most common form of dementia, and has a strong genetic component estimated to be 60-80% [1].AD is characterised by the presence of β-amyloid-containing plaques (Aβ) and intracellular neurofibrillary tangles of hyperphosphorylated tau, accompanied by neuroinflammation, neuronal, and synapse loss [2].Multiple GWAS studies have collectively identified more than 80 genetic loci associated with disease [3][4][5][6][7], the vast majority of which do not affect protein coding sequence.Due to linkage disequilibrium at these loci, it is difficult to identify the causal variants.Multiple computational methods have been developed to prioritise variants that cause disease including statistical fine mapping, co-localisation with expression or chromatin accessibility quantitative trait loci (eQTL, caQTL), overlap with enhancer regions identified by for example the assay for transposase accessible chromatin (ATAC) or transcription factor binding in disease relevant cell types.However, none of these methods are able to unambiguously assign causal variants and, given the very small number of truly validated variants, it is difficult to benchmark their effectiveness.Experimental methods have also been developed to identify or prioritise variants such as massively parallel reporter assays (MPRA [8]) and genome engineering-based interrogation of enhancers (GenIE [9], GenIE-ATAC [10]) but these are limited to understanding the effects of variants on transcription or chromatin accessibility and do not directly demonstrate the downstream effects of these polymorphisms.Genome engineering of isogenic human cell models, such as differentiated hiPSCs differing in a single variant, provide an attractive means for understanding the role of specific SNPs on cell function and have been successful in understanding the role of strong effect familial mutations [11] [12,13].However, such systems have yet to be successfully applied to more than a handful of common genetic variants [14,15,16] and their effects are often confounded by variation between edited clones and during differentiation [17].
Neuroinflammation has been shown to be a crucial factor in many neurodegenerative disorders, especially AD [18].Microglia are resident immune cells of the central nervous system and perform various functions under both homeostatic and disease conditions [19].They play important roles in synaptic remodelling, synaptic pruning and phagocytosis of dead neurons [20].Microglia have been implicated in AD by human genetics studies because of specific expression of AD-related genes in these cells [21][22][23], colocalisation of microglial eQTLs with AD GWAS hits [24,25] and implication of rare causal variants in microglial-specific genes, such as TREM2 [26] and CD33 [26,27].Nevertheless, the exact role of microglia in AD pathogenesis is still unclear.Aβ and tau aggregates are able to activate microglia, stimulate cytokines and chemokine production and elicit an inflammatory response that can have two functions: it can be neuroprotective by increasing Aβ or tau clearance, but it can also contribute to Aβ and tau production and spreading and induce neurodegeneration and synaptic loss [28,29].Microglia are known to be involved in clearance of aggregated proteins such as Aβ [30] and accumulation of extracellular Aβ is accelerated by insufficient microglial activation or phagocytic capacity [31].However, depletion of microglia in amyloid precursor protein-transgenic mouse models was found to have little effect on plaque formation or maintenance.Microglia have also been found to potentially contribute to tauopathy by spreading tau protein via exosomes across brain regions [32] as well as contributing to its uptake and degradation [29].Moreover, sustained microglia activation appears to contribute to neurodegeneration by impacting synaptic plasticity and causing neuronal damage [33,34].As well as microglia, other myeloid cell types, such as infiltrating monocytes [35,36], choroid plexus [37] and perivascular macrophages [38][39][40], have also been found to play a role in AD pathogenesis.Circulating monocytes can infiltrate the brain and target cerebral Aβ deposits to reduce plaque load [41][42][43] and depletion of perivascular macrophages in the TgCRND8 mouse model of AD significantly increases vascular Aβ levels [44].Conversely, β amyloid deposition in cortical blood vessels is reduced by stimulation of perivascular macrophage turnover [44].Moreover, studies have shown that macrophages isolated from individuals with AD are poorly phagocytic for Aβ, more susceptible to apoptosis and have impaired chemotaxis [45,46].Chemokines are produced by several cells in the central nervous system such as neurons, astrocytes, immune cells and particularly microglia.They are known to recruit peripheral immune cells through the blood brain barrier as well as activate brain resident macrophages and microglia [47].Elevated levels of several chemokines have been found in plasma, cerebrospinal fluid and brain tissue of AD patients [48].Proinflammatory chemokines are believed to contribute to neuroinflammation and chronic activation of microglia and macrophages in late and symptomatic stages of AD [47] and may constitute a useful biomarker to monitor disease progression [49].However, chemokine-mediated microglia recruitment and phagocytosis aids clearance of Aβ oligomers and knockout of some chemokines, such as CCL2 (and CCR2 receptor) in mouse models of AD results in accumulation of intracellular soluble Aβ oligomers and decline of cognitive function [50][51][52].Aβ can stimulate human monocytes and microglia to produce chemokines, such as CXCL8, CCL2 and CCL3, which also play a role in leukocyte migration to sites of inflammation [53].PTK2B is a non-receptor tyrosine kinase that is expressed in neurons, microglia, astrocytes, monocytes and tissue-resident macrophages, and has diverse physiological and pathological roles including cell adhesion, cell migration, inflammatory responses, tumour invasiveness [54], neuronal development and plasticity [55,56].PTK2B is a member of the focal adhesion kinase (FAK) subfamily and after autophosphorylation it recruits Src-family kinases [57] and interacts with multiple partners to activate various downstream signalling pathways, such as the MAPK/ERK pathway [55].In neurons, PTK2B is implicated in synaptic plasticity [58] and neurite outgrowth [59].In non-neuronal cells, such as osteoclasts [60], macrophages [61][62][63] and monocytes [64], PTK2B is involved in cell migration and regulation of the actin cytoskeleton downstream of integrins.In monocytes, the ROS-sensitive calcium channel TRPM2 can activate the PTK2B/ERK pathway, leading to nuclear translocation of nuclear factor-kB and increased chemokine production [65].This pathway is activated by Aβ in microglia, where PTK2B is autophosphorylated and generates a positive feedback loop to further activate TRPM2 [66].PTK2B has been implicated in AD through a number of GWAS studies [4,6,[67][68][69][70][71][72][73][74].Evidence suggests that genetic deletion of Ptk2b rescues a number of Aβ-associated phenotypes, such as memory impairment, synapse loss and impaired synaptic plasticity in APP/PS1 animal models of AD [75].Conversely, overexpression of Ptk2b has been reported to be protective in 5xFAD [76] and human neuronal systems [77].Similarly, data collected from hemizygous PS19 (MAPT P301S, 1N4R) transgenic mice crossed with Ptk2b −/− animals, points to a protective role for Ptk2b against Tau phosphorylation and Tau-induced behavioural deficits [78].Thus, several lines of evidence point to PTK2B as an important player in the pathophysiology of AD but much remains to be learnt about the exact role of PTK2B in AD pathogenesis.Clusterin (CLU) is a glycoprotein which is mainly secreted but also exists intracellularly, and is involved in a variety of cellular functions [79].Secreted clusterin acts as a molecular chaperone which stabilizes misfolded proteins and can prevent protein aggregation [80].CLU is also involved in modulating inflammatory proteins as well as proteins involved in cell survival and apoptosis [79].Cellular stress can lead to an upregulation of CLU which activates anti-apoptotic pathways and enables cell survival [81][82][83].CLU is expressed across many tissues, and in the brain it is expressed in neurons as well as astrocytes and microglia [79] .Secreted clusterin interacts and can bind Aβ, and evidence suggests that this can alter Aβ aggregation and promote clearance [81,84,85] These studies point to a neuroprotective role of CLU, however other evidence suggests that clusterin instead reduces Aβ clearance and increases neurotoxicity [86,87].CLU is the third greatest genetic risk factor for late onset AD and several SNPs in this gene alter AD susceptibility [79].CLU is upregulated in the hippocampus, cortex and CSF of AD patients and it colocalises with Aβ [81,88].AD GWAS studies have identified a locus on chromosome 8 which contains two independent association signals, one over PTK2B and one over CLU [67].Mapping of chromosomal interactions at the PTK2B locus in microglia has shown that enhancers harbouring AD-risk variants interact with active promoters of both PTK2B and CLU [89] but further investigation is needed to identify causal variants at this locus and their putative target genes.

Results
Prioritisation of a putative Alzheimer's disease variant at the PTK2B locus GWAS have identified two independent genetic associations at the PTK2B-CLU locus [67].The signal over the PTK2B gene colocalises with both an eQTL and caQTL in hiPSC-derived macrophages (Fig 1A) [90].Re-analysis of publicly available datasets [25,90,91] shows that the risk haplotype is linked to both a downregulation of PTK2B gene expression in hiPSC-derived macrophages (Fig 1B and 1C) and an increase in chromatin accessibility at an enhancer within intron 6 of the PTK2B gene (Fig 1D).Interestingly, in this case there is a single variant in the credible set (rs28834970) that lies within the ATAC peak that is changed in accessibility in the caQTL analysis, strongly suggesting that this variant is causal (C being the risk allele).There is some inconsistency in published literature on the directionality of these effects [25,90,91], but this data has been reanalysed in conjunction with the authors, and in additional hiPSC-derived macrophage datasets so we are confident of the directionality of these changes described here.This has confirmed that the C allele of rs28834970 is associated with decreased expression of PTK2B in hiPSC-derived macrophages (Fig 1B and Fig 1C), whereas in monocytes and one study of primary microglia, the C allele of rs28834970 is associated with increased expression of PTK2B (Supp Fig 1A) [90].In a recent study this variant was also found to be associated with an increase in PTK2B expression in blood of AD patients and not in any brain region examined [92].This evidence suggests that the rs28834970 variant is likely causal but highlights the need for further investigation of the role of this variant in AD pathogenesis.

Characterisation of rs28834970 in hiPSC-derived macrophages and microglia
Given that the PTK2B GWAS signal colocalises with an eQTL/caQTL in hiPSC-derived macrophages, the rs28834970 variant is within a macrophage-and microglia-specific accessible chromatin peak and that PTK2B plays an important role in myeloid cells we decided to focus our investigation on the role of this variant in macrophages and microglia, although we cannot exclude a role in other cell types.We edited the C allele of rs28834970 into a well characterised hiPSC line (KOLF2_C1) that was homozygous for the T allele at this position (and thus the non-risk haplotype) using CRISPR-mediated homology directed repair [93] (Suppl Fig 1B and Methods).This allows us to unambiguously study the role of rs28834970 independently of the other variants in linkage disequilibrium.We subsequently analysed three independent homozygous (C/C) clones and two unedited (T/T) clones, one of which had been through the editing process, and the second being the parental line.These were differentiated in three biological replicates into either macrophages [94,95] or microglia [96,97] followed by a number of assays including transcriptomics (RNAseq), chromatin accessibility (ATACseq) and quantitative digital RT-PCR (dd-qRT-PCR) to measure gene expression (Fig 1E).Differentiation was highly reproducible between the two different genotypes by FACS for macrophage markers CD14, CD16 and CD206 (Supp Fig 2A ) and microglia markers CD11b and CD45 (Supp Fig 2B).This was further validated by analysis of macrophage and microglial marker gene expression [96,98] which were mostly consistent with the differentiation trajectory (Supp Fig 2C ), including the homeostatic microglial marker P2RY12 that distinguishes between microglia and macrophages and microglial enriched genes e.g.TREM2, GPR34 and C1Qa [96,98].Moreover, there did not appear to be genotype-dependent changes in expression of canonical markers of macrophage or microglial activation (Supp Fig 2D).

rs28834970 introduces a novel CEBPB binding site that increases chromatin accessibility at an intronic enhancer in hiPSC-derived macrophages and microglia
We analysed the effect of the rs28834970 allele on chromatin accessibility and function of the intronic enhancer element.ATACseq analysis showed that the C allele created a novel region of accessible chromatin in both macrophages and microglia (Fig 2A ), which was consistent with the expectations from the caQTL analysis in hiPSC-derived macrophages (Fig 1D).Analysis of chromatin accessibility within a 600 kb window around the SNP (Supp Fig 3A ) showed that there was no change in most peaks, and the only statistically significant change consistent between macrophages and microglia was the new peak over rs28834970 (macrophages: p=1.65e-16 log2FC=2.71,microglia: p=4.19e-12 log2FC=2.77)(Supp Fig 3B).Peaks within the CHRNA2 and CLU genes were significantly reduced in microglia, and an intergenic peak was increased in macrophages, but these were not consistent between the two cell types, or with the expression changes of these genes (Supp Fig 3B and Supplementary Table 1).
Further analysis of the sequence around rs28834970 showed that the C allele increased the probability of binding of transcription factors CEBPA and CEBPB (Fig 2B).We thus analysed CEBPB binding to DNA by CUT&RUN, which showed an increased binding for the C allele of rs28834970 (p=0.04 log2FC=2.09)(Fig2A ), consistent with CEBPB binding being responsible for the change in chromatin accessibility.Indeed, in a genome-wide analysis, this was the only region that showed a significant change in CEBPb binding between the C/C and T/T alleles at rs28834970 (Fig 2C)(Supplementary Table 2).
We next analysed the effect of rs28834970 on expression of PTK2B and the other genes within a 600kb window around this variant (Supp Fig 3A).CHRNA2 and EPHX2 are not expressed in macrophages or microglia and were therefore not investigated.Both RNAseq and digital droplet qPCR (dd-qPCR) analysis showed that expression of PTK2B was modestly reduced by introduction of the C allele in macrophages (RNAseq: p=0.019 and log2FC=-0.2,ddqPCR: p=0.02 and log2FC=-0.39)(Fig 2D ) and microglia (RNAseq: p=0.06 log2FC=-0.57,ddqPCR: p=0.29 and log2FC=-0.71)(Fig 2E) but was only statistically significant with ddqPCR in macrophages and not microglia.However, this effect is consistent in direction between RNAseq and ddqPCR and across microglia and macrophages, and it is in line with the effect of the eQTL (Fig 1B,Fig 1C).The effect on PTK2B expression was the most consistent change in microglia and macrophages within the window around rs28834970, but there was additionally an upregulation of CLU expression specifically in macrophages (RNAseq: log2FC=0.72p=0.03) which was not observed in microglia (Supp Fig 3C).This implies that there may be an effect of the rs28834970 enhancer on PTK2B and CLU expression in macrophages, but its effect in microglia is less clear.This could be due to the small effect size of such common regulatory variants, which makes it difficult to detect in these assays, or to other mutations in the same haplotype contributing further to expression changes in this locus.We also analysed changes in splicing of the PTK2B gene and did not detect any significant changes in splicing of the exons near to the variant (Supp Fig 3D).
Taken together, these results validate that rs28834970 within the PTK2B locus is a causal variant and acts through introducing a novel CEBPB binding site that increases chromatin accessibility at the intronic enhancer.There is a subtle decrease in PTK2B expression and increase in CLU expression caused by rs28834970 but this was only marginally significant in macrophages and not in microglia, so it is difficult to establish whether the effect of rs28834970 is mediated through PTK2B or CLU expression.

rs28834970 causes strong effects on the transcriptome including chemokine expression
We next analysed the effects of editing rs28834970 on genome-wide gene expression in macrophages and microglia.In both cell types, principal component analysis showed that although there was a strong clone and batch effect, the two genotypes were well separated by the second principal component (Fig 3A  3).Importantly, there was a significant (p<0.0001)overlap in the differentially expressed genes between the two cell types (Supp Fig 4B ), further corroborating that we were identifying true effects of the genotype, not simply variation between edited clones or differentiation artefacts.
We performed gene set enrichment analysis of the differentially expressed genes relative to all genes expressed in the respective cell type.This showed that a number of gene ontology pathways relevant to microglial or macrophage function were enriched in the downregulated gene set including chemokine-mediated signalling, cellular response to interferon gamma (IFNγ) and migration and chemotaxis (Fig 3C,Fig 3D).These pathways were frequently overlapping between microglia and macrophages (Supp Fig 4C ), again supporting that these changes were genotype-dependent.Conversely, there were no obviously relevant gene ontology (GO) enrichments in the upregulated gene set (Supp Fig 4D ) which may be due to the smaller number of genes analysed.Further analysis of the downregulated genes confirmed that there was considerable overlap between microglia and macrophages across the different GO categories (Fig 3E).Of note, we found that the Macrophage Receptor With Collagenous Structure (MARCO) gene was among the top 3 downregulated genes in both microglia and macrophages with the C/C allele (Supp Fig 4E).MARCO is a putative Aβ receptor that triggers uptake and downstream signalling in microglia [99,100].Given their role in microglial and macrophage function, we focused on the downregulation of chemokines, MHC complex proteins and surface receptors and validated their expression by dd-qPCR.All of the chosen genes showed some level of downregulation as expected in microglia, and all except one (HLA-DQA1) were downregulated in macrophages (Fig 3F ), with effect sizes generally consistent between dd-qPCR and RNAseq.

rs28834970 causes changes in chemokine release and migration
Guided by the consistent effects on chemokine and migration-related gene expression in both macrophages and microglia, we next analysed the release of 13 chemokines using a multiplexed assay on microglial cell supernatants from all differentiated clones.In unstimulated cells, we saw significant reduction in production of 11 of the 13 chemokines tested with the C/C allele at rs28834970, consistent with the results of the transcriptomic analysis (Fig 4A).CCL22 was unexpectedly upregulated, and CXCL2 and CXCL5 were unchanged (Fig 4A), which may be a result of the different readouts measured by the two assays (mRNA vs secreted protein).
Next we aimed to evaluate production of these chemokines in stimulated microglia.We treated microglia with IFNγ and lipopolysaccharide (LPS) and analysed whether chemokine production was altered in stimulated microglia.As expected, both stimulations increased production of most ( 10 We further investigated the effect of rs28834970 on migration and chemotaxis of microglia differentiated from all iPSC clones, using a transwell assay in the presence or absence of IFNγ and with or without a chemoattractant relevant to microglia, complement 5a (C5a).Microglia harbouring the unedited (T/T) allele showed chemotaxis towards C5a that was inhibited by IFNγ treatment (Fig 4C).Microglia harbouring the C/C alleles at rs28834970 showed no significant chemotaxis towards C5a or inhibition of migration following IFNγ treatment (Fig 4C).Taken together, these data suggest that microglia harbouring the C/C allele are less sensitive to the chemoattractant C5a and to the inhibitory effect of IFNγ on migration.
In summary, we show that the C allele of rs28834970 lowers chemokine release by microglia, reduces chemotaxis towards C5a and alters their response to IFNγ stimulation, perhaps by altering microglia reactivity.

Discussion
One of the major challenges in interpreting GWAS is to understand the genes and causal variants underlying the genetic association with disease.The use of isogenic cell lines generated by CRISPR in model systems such as hiPSC differentiated cell types offer one way to dissect the role of individual variants in cellular phenotypes independently of other variants in linkage disequilibrium.However, common regulatory variants that are identified in GWAS have rarely been linked to changes in cellular 'omics or phenotypes.Here we identify a likely causal variant from comparison of Alzheimer's disease GWAS, eQTL and caQTL studies and engineer this using CRISPR/Cas9 into hiPSCs to generate isogenic cell lines differing in this single variant.We show that the C/C allele of rs28834970 introduces a novel CEBPB binding site that increases chromatin accessibility at an intronic enhancer, consistent with the results of the caQTL analysis in hiPSC-derived macrophages.Previous reports showed that this variant causes a decrease in PTK2B expression in monocytes [94] and macrophages [25,82,83] (Fig 1B and C).However, in our study we observed only a slight downregulation of PTK2B in microglia and macrophages expressing the C/C allele, which only reaches statistical significance in macrophages.We also found a subtle upregulation of CLU expression in macrophages but not in microglia expressing the C/C allele, making it difficult to make a definite conclusion on the gene targeted by rs28834970.This could be due to the small effect size making it challenging to study with a limited number of isogenic cell clones, contribution of other SNPs within the same risk haplotype or compensatory regulatory mechanisms that may buffer expression levels.Nevertheless, we found that the C/C allele at rs28834970 caused a large transcriptional response, including downregulation of chemokines and of genes involved in cellular response to IFNγ.This risk variant also resulted in phenotypic changes, such as reduction in chemokine release and decreased chemotaxis towards C5a.Taken together, this suggests that rs28834970 is a functional regulatory variant associated with AD risk at this locus.We propose that rs28834970 causes a change in microglia reactivity, resulting in a different response to stimulations such as IFNγ.This may impact the ability of the microglia to respond to aggregated proteins such as Aβ plaques and cellular damage in the brain, therefore increasing the risk of AD.Indeed several pro-inflammatory chemokines, such as CCL2, CCL5 and IL18, which are downregulated in microglia with the C/C allele both in naïve state and when stimulated with IFNγ, are produced by microglia in response to Aβ and are important for Aβ-induced microglia chemotaxis [103][104][105].Even though overexpression of these chemokines is characteristic of neuroinflammation, correlated with disease progression and found in late stages of AD, knockout of chemokines, such as CCL2, and chemokine receptors, such as CCR2 and CCR5, in mice is associated with increased Aβ deposition and accumulation [47,[50][51][52]106].It has also been found that patients carrying CCR5Δ32 mutation, which prevents CCR5 surface expression, develop AD at a younger age [107].Therefore, we hypothesize that in individuals carrying the C/C (risk) allele of rs28834970, downregulation of these chemokines in macrophages and microglia affects Aβ-induced microglia chemotaxis, leukocyte recruitment and clearance of Aβ, and may increase the risk of developing symptomatic AD.The decrease in expression of the putative Aβ receptor MARCO in the C/C allele may also contribute to this effect.We believe this study demonstrates the power of isogenic cell lines to identify and validate individual variants, including common regulatory SNPs identified from GWAS.

Engineering rs28834970 C/C allele in hiPSCs
The C/C risk allele at variant rs28834970 was engineered in hiPSCs by nucleofection of ribonucleoprotein (RNP) complex containing full-length chemically modified synthetic guide RNA and eSpCas9, along with a ssODN repair template, as previously described [93].Briefly, eSpCas9 [108] was expressed and purified from Escherichia coli using a His-tag.KOLF2_C1 cells were cultured ahead of nucleofection in TeSR E8 medium (StemCell Technologies) on Synthemax (Corning) (final amount 5 μg/cm 2 ).eSpCas9 (5μl, 20 μg) was mixed with full-length chemically modified guide RNA (AGTGGAATTGTACAACACTG) (Synthego) (5 μl, 225 pmol) at room temperature for 20 min for RNP complexes to form, followed by addition of the ssODN repair template (CTGGCAAGACTAATCTACTTTCTATTTTTATGGATCTGCCTTTTCTGGTCATTCCATATAAGTGGAATTGCACAACA CTGTGGCCTTTCGCGACGGCTGCTTTCACTTAGCACAATGTTTTGAAACTTCCTCCATGTTGT) (5 μl, 500 pmol) just before the nucleofection.Cells were detached using Accutase (StemCell Technologies) and dissociated into a single cell suspension. 1 × 10 6 cells were resuspended in P3 buffer (Lonza), mixed with the RNP/template complex and nucleofected using Lonza 4D-Nucleofector with program CA137.After nucleofection cells were plated onto a 10 cm dish coated with Synthemax (5 μg/cm 2 ) with TeSR-E8 supplemented with CloneR (StemCell Technologies).Cells were cultured to 80% confluency, detached using Accutase and dissociated into single cells for subcloning.5000 cells were plated in a 10 cm dish and expanded, and colonies were picked for genotyping and freezing.Colonies were screened for introduction of the homozygous C allele by high throughput sequencing of an amplicon spanning the rs28834970 variant using an Illumina MiSeq instrument.Final cell lines were validated using Sanger sequencing.Clones used in downstream assays include: three homozygous (C/C) clones (named C8, C11 and D3), one unedited (T/T) picked clone (B11) and the parental KOLF2_C1 line.

Fluorescence activated cell sorting analysis
Microglia or macrophages were detached using Accutase (StemCell Technologies) and washed twice in PBS.1x10 5 cells were incubated with Human TruStain FcX (Fc Receptor Blocking Solution) (Biolegend) for 30 minutes at 4°C in the dark.Microglia were stained with 5 ul each of PE-anti human CD45 antibody clone HI30 (Biolegend) and APC-anti human CD11b antibody clone ICRF44 (Biolegend) for 30 minutes at 4°C.Macrophages were instead stained with 5 ul of either APC/Cy7-anti human CD16 antibody clone 3G8 (Biolegend), APC-anti human CD206 antibody (BD Biosciences) or PE-anti human CD14 antibody (BD Biosciences).Isotype controls were stained with either 0.2 mg/ml APC-mouse IgG1 k(BD Biosciences), 0.2 mg/ml of PE-mouse IgG2A (BD Biosciences) or 0.2 mg/ml APC/Cy7 mouse IgG1 k (BD Biosciences).Cells were then washed three times in FACS buffer (5% FBS in D-PBS) and resuspended in 200 l of DAPI solution (1 ug/ml in PBS) and analysed.All samples were analysed using a BD LSRll instrument.An unpaired t-test was used to test the difference between samples from cells with the T/T and the C/C allele at rs28834970.

RNA extraction
TRI Reagent (Zymo) was added direct on cell layers previously washed in PBS and mixed.RNA was extracted using the Direct-zol RNA Miniprep kit (Zymo) following the manufacturer's protocol.The optional in-column DNase digest was performed, and RNA was eluted in 50 μl DNAse/RNAse-free water.We made cDNA from 1 μg RNA using Superscript IV (Thermo-Fisher) according to the manufacturer's protocol.
Reverse transcription 1 μg RNA was reverse transcribed using Superscript IV (Thermo-Fisher) and random hexamers, according to the manufacturer's protocol.

RNA sequencing of hiPSC-derived macrophages and microglia
Transcriptome libraries for hiPSC-derived macrophages and microglia were generated with the Illumina TruSeq stranded RNAseq kit (polyA) and all samples were sequenced using Illumina HiSeq with 50 million mapped reads on average for each sample.Reads were mapped to the human genome (GRCh38/hg38) using the BWA software [109] and coverage tracks were visualised on the UCSC genome browser using bigwig files, made with bamCoverage [110] and normalised by Counts per Million (CPM).The number of reads per transcript were counted using the featureCounts software [111] and a custom R script was used to calculate Transcripts per Million (TPM).For RNAseq analysis of macrophages, TPMs were corrected for batch effects using limma function limma::removeBatchEffect [112].For microglia, all samples were differentiated and sequenced in the same batch.Raw read counts per transcript were used to analyse differential gene expression between cells with the T/T and the C/C allele at rs28834970, using the package DESeq2 [113].DESeq2 was also used to generate PCA plots of variance stabilizing transformation (vst) transformed counts.To investigate whether the rs28834970 variant had an effect on PTK2B splicing, we used EdgeR [114] to calculate differential exon usage between microglia harbouring the C/C and the T/T allele.Gene ontology (GO) analysis of differential expressed genes was carried out using g:Profiler package in R [115].
ATACseq of hiPSC-derived macrophages and microglia hiPSC-derived macrophages and microglia were detached using a solution of 4 mg/ml Lidocaine hydrochloride monohydrate (Sigma) and 5mM EDTA (ThermoFisher scientific) for 15 minutes at 37°C.ATACseq was performed as previously described [116].Briefly cells were lysed using sucrose buffer containing 10 mM Tris-Cl pH 7.5, 3 mM CaCl 2 , 2 mM MgCl 2 , 0.32 M sucrose and permeabilised using 0.5% Triton-X-100 (Sigma).Nuclei were recovered by spinning at 450xg for 5 minutes at 4°C and tagmentation was performed using Illumina Tagment DNA TDE1 Enzyme and Buffer Kits for 30 minutes at 37°C37C. 5 l of TDE1 enzyme was used per 100,000 cells.MinElute PCR Purification Kit buffer PB (Qiagen) was added to stop the reaction and tagmented DNA was purified and eluted in 10 l of EB.The whole volume was PCR amplified and dual indexed Illumina adapters were added (primer sequences from [116]).Amplified libraries were gel purified and sequenced using an Illumina Novaseq 6000 with 30 million reads on average per sample.Reads were mapped to the human genome (GRCh38/hg38) using the BWA software [109] and duplicate reads were removed using picard tools MarkDuplicates.Coverage tracks were visualised on the UCSC genome browser using bigwig files, made with bamCoverage [110] and normalised by Counts per Million (CPM).The number of mapped reads per sample was determined and all files were subsampled to the number of reads of the lowest sample, using samtools view [117].ATACseq peaks were called using MACS2 [118] with flags --shift -100 --extsize 200 for window extension.A merged file of all called peaks in all samples was created and the number of reads under each peak was counted in all samples using bedtools coverageBed function [119].Genome-wide analysis of differential chromatin accessibility between cells harbouring the C/C and the T/T allele at rs28834970 was performed using DESeq2 [113].DESeq2 was also used to generate PCA plots of variance stabilizing transformation (vst) transformed counts.Peaks were annotated to genes using CHIPseeker functiomannotatePeak [120].

CUT & RUN assay
Microglia were harvested using a solution of 4mg/ml Lidocaine hydrochloride monohydrate (Sigma) and 5mM EDTA (ThermoFisher Scientific) for 15 minutes at 37°C37C.CUT&RUN for CEBPβ was performed using CUTANA CUT&RUN Kit (EpiCypher) according to the manufacturer's instructions.400,000 microglia were used per sample and permeabilization was optimised using Digitonin (EpiCypher) 0.01%.1ug of C/EBP beta Antibody (H-7) sc-7962 (Santa Cruz Biotechnology) was used per sample and one control sample was treated in parallel with 1 l of CUTANA Rabbit IgG CUT&RUN Negative Control Antibody.0.5ng of E. coli Spike-in DNA (EpiCypher) was added to each sample to normalise reads to control for experimental variability and sequencing depth.Around 5 ng of purified CUT&RUN-enriched DNA was used to prepare NGS libraries using the NEBNext Ultra II DNA Library Prep Kit (NEB) with NEBNext Multiplex Oligos for Illumina (Dual Index Primers Set 1) (NEB).All libraries were sequenced 150bp paired end using an Illumina Novaseq 6000 with 15 million reads on average per sample.Reads were trimmed to exclude Illumina adapter sequences using trim_galore and mapped to the human genome (GRCh38/hg38) using the bowtie2 software [121].The number of human mapped reads per sample was determined and all files were subsampled to the number of reads of the lowest sample, using samtools view [117].Trimmed reads were also mapped to the E. coli genome using bowtie2 and reads mapped to the human genome were normalised to spiked-in E.coli DNA using a previously published script (https://github.com/Henikoff/Cut-and-Run/blob/master/spike_in_calibration.csh)integrated into a custom bash script (https://github.com/ericabello/PTK2B_rs28834970/blob/main/Cut_Run_CEBPb_micro/cmds_CutRun_CEBPb_micro.sh).Coverage tracks were visualised on the UCSC genome browser using bigwig files made from E.coli normalised bedgraph files of fragments 1-1000bp.Bigwigs were made using UCSC bedGraphToBigWig.Peaks in all samples containing the CEBPβ antibody were called using the SEACR software [122] and the IgG negative control sample was used as input to identify the threshold value at which the percentage of target versus IgG signal is maximized.A merged file of all CEBPβ peaks in all samples was created and the number of reads under each peak was counted in all samples using bedtools coverageBed function [119].Genome-wide analysis of differential CEBPβ binding between microglia harbouring the C/C and the T/T allele at rs28834970 was performed using DESeq2 [113].
Digital droplet qPCR 50 ng of cDNA per sample was mixed with 10 μl of ddPCR Supermix for probes (no dUTP) (Biorad), 1 μl of FAM-tagged TaqMan assay for detection of the gene of interest and 1 μl of VIC-tagged Taqman assay for a housekeeping gene control.The housekeeping control gene was chosen to match the level of expression of the gene of interest in each cell type.
Taqman assays list (ThermoFisher Scientific), containing probes and primers: Macrophages Droplets were generated for each reaction using a QX200 Droplet Generator (Biorad) and transferred to 96 well plates.Sealed plates were incubated in a thermocycler and cDNA was PCR amplified with the following steps: 95°C for 10 min then 95°C for 30 sec and 60°C for 1 min for 40 cycles and finally 98°C for 10 min.The resulting FAM and VIC fluorescence was read using a QX200 Droplet Reader (Biorad) and the fraction of positive droplets in the sample was determined.Poisson statistics were used to determine the absolute number of copies per μl in each sample for both target and housekeeping control genes.The number of copies of target gene per sample was normalised by the control gene for each reaction and an unpaired t-test Benjamini-Hochberg multiple testing correction was used to test the difference between samples from cells with the T/T and the C/C allele at rs28834970.

Microglia stimulation and Luminex assay
hiPSC-derived microglia at day 6 of differentiation were stimulated with IFN (R&D Systems) 200 ng/ml or Lipopolysaccharide (LPS) (Invitrogen) 100 ng/ml for 24 hours.Culture plates were spun at 4,000 rpm for 20 min at 4°C and media collected.Harvested media was diluted 1:20 for unstimulated controls and 1:250 for stimulated samples.Luminex assay was performed using ProcartaPlex Human Basic Kit (Invitrogen) and a panel of ProcartaPlex Human Simplex (Invitrogen) beads, according to the manufacturer's recommendations.Plates were read using a Luminex MAGPIX instrument.Media from cells with the T/T and C/C allele were ran on the same plate, allowing direct comparison of each protein concentration across genotypes.An unpaired t-test with Benjamini-Hochberg multiple testing correction was used to test the difference in concentration (pg/ml) between different samples.The propagation of error for log fold change values of chemokine concentration between unstimulated and stimulated microglia was calculated using the Python package uncertainties (https://pythonhosted.org/uncertainties/).

Microglia stimulation and migration assay
HiPSCs-derived microglia at day 7 of differentiation were stimulated with IFN (R&D Systems) 100 ng/ml for 72 hours.At day 10, cells were detached using a solution of 4 mg/ml Lidocaine hydrochloride monohydrate (Sigma) and 5 mM EDTA (Gibco) and 5.5 × 10 4 cells in 100 l macrophage media were seeded onto each transwell (PET with 5 μm pores, Sarstedt) in a 24-well plate.Cells were incubated for 15 minutes to settle and 600 l of macrophage media containing 3 nM human recombinant C5a was added beneath the transwells.The plate was incubated for 10 hours to allow cell migration.The transwell was then rinsed in PBS, transferred to a fresh 24-well plate, and fixed with 4% paraformaldehyde for 20 min at RT. Cells on either side of the transwell membrane were stained with NucBlue (Invitrogen) and imaged with an EVOS FL Auto automated microscope (Thermo Fisher) set up to take images of each transwell in full (top image).The transwells were swabbed with a cotton wool bud to remove cells on the top surface, leaving behind only migrated cells, transferred into a fresh plate and imaged again with the same settings (bottom image).Cell counting was performed with CellProfiler 3.0 software [123] and the percentage of migrated cells per transwell was calculated as: (no.cells in bottom image) ÷ (no.cells in top image) × 100.Stimulations were performed in triplicate and the average for each sample across wells was taken.An unpaired t-test was used to test the difference between different samples.

Materials and Correspondance
All correspondence and requests for materials should be addressed to Andrew Bassett (ab42@sanger.ac.uk) or Erica Bello (eb956@cam.ac.uk).

Figures
Fig 3B).Similar results were seen with analysis of global chromatin accessibility, where genotypes were separated by the second principal component (Supp Fig 4A).Analysis of the differentially expressed genes between genotypes showed 690 downregulated and 428 upregulated genes in macrophages (Fig 3A), and 760 downregulated and 328 upregulated in microglia (Fig 3B) (Supplementary Table of 12) of the chemokines (Supp Fig 5A and Supp Fig 5B).CCL1 production was increased by LPS stimulation in microglia harbouring the C/C allele but not in microglia with the T/T allele (Supp Fig 5B).Similar to what was observed in unstimulated cells, production of most of the chemokines (10 of 12) was reduced in the C/C allele compared to the T/T allele at rs28834970 when microglia were stimulated with IFNγ, but the magnitude of the reduction was reduced (Fig 4B).A similar response was seen upon stimulation with LPS (Supp Fig 5C).Taken together, these data suggest that the C/C allele results in a lower basal chemokine release, and in the presence of stimuli such as IFNγ, this reduction becomes less marked.

Figure 1 -
Figure 1 -The rs28834970 variant in PTK2B is predicted to modulate an intronic enhancer A) PTK2B gene structure and Manhattan plots for Alzheimer's Disease (AD) GWAS hits (top panel), eQTL in naïve iPSC-derived macrophages (second panel) and caQTL in naïve iPSC-derived macrophages (bottom panel).P-values replotted from[90].B) Boxplots showing the expression of the PTK2B gene stratified by the rs28834970 genotype in iPSC-derived macrophages.Data replotted from[25].The y-axis shows

Figure 2 -
Figure 2 -Isogenic edited cell lines show that rs28834970 affects chromatin accessibility and CEBPβ binding at the intronic enhancer in iPSC-derived macrophages and microglia but only modestly decreases PTK2B expression.A) Coverage tracks of cells with T/T (orange) and C/C (blue) allele at rs28834970.The tracks are: ATAC-seq in macrophages (upper track), ATAC-seq in microglia (second track), CUT&RUN

Figure 3 -
Figure 3 -The rs28834970 variant affects global gene expression in iPSC-derived macrophages and microglia

Figure 4 -
Figure 4 -The rs28834970 variant causes changes in chemokine production and migration in iPSC-derived microglia A) Concentration of downregulated chemokines in unstimulated microglia with the C/C allele at rs28834970 relative to the concentration in cells with the T/T allele, measured by Luminex assay.Results are shown as the mean ± SEM of independent biological replicates (n=3) of each of the five differentiated iPSC clones.ns= non-significant, * p<0.05, **p<0.01,***p<0.001,****p<0.0001,an unpaired t-test.B) Concentration of downregulated chemokines in microglia with the C/C allele at rs28834970 relative to the concentration in cells with the T/T allele, stimulated with IFNγ.Results are shown as the mean ± SEM of independent biological replicates (n=3) of each of the five differentiated iPSC clones.. ns= non-significant, * p<0.05, **p<0.01,***p<0.001,****p<0.0001,an unpaired t-test.C) Bar graph showing the percentage of migrated microglia harbouring the T/T allele (left) or the C/C allele (right) either unstimulated or stimulated with IFNγ, and with or without the chemoattractant C5a.Results are shown as the mean ± SEM of independent biological replicates (n=3) of each of the five differentiated iPSC clones.ns= non-significant, * p<0.05, **p<0.01,***p<0.001,unpaired t-test.