An Alzheimer’s disease-associated common regulatory variant in PTK2B has causal effects on microglial function

transcriptomic and phenotypic changes, and we propose that it acts by altering microglia reactivity, consistent with the role of these cells in progression of AD.


Introduction
Alzheimer's disease (AD) is the most common form of dementia, and has a strong genetic component estimated to be 60-80% 1. AD is characterised by the presence of β-amyloid-containing plaques (Aβ) and intracellular neurofibrillary tangles of hyperphosphorylated tau, accompanied by neuroinflammation and loss of neurons and synapses 2. Multiple GWAS studies have collectively identified more than 80 genetic loci associated with disease [3][4][5][6][7], the vast majority of which do not affect protein coding sequence. Due to linkage disequilibrium at these loci, it is difficult to identify the causal variants. Multiple computational methods have been developed to prioritise variants that cause disease, including statistical fine mapping, co-localisation with expression or chromatin accessibility quantitative trait loci (eQTL, caQTL), and overlap with enhancer regions identified by e.g. assay for transposase accessible chromatin (ATAC) or transcription factor binding in disease relevant cell types. However, none of these methods are able to unambiguously assign causal variants and, given the vanishingly small number of truly validated variants, it is difficult to benchmark their effectiveness. Experimental methods have also been developed to identify or prioritise variants, such as massively parallel reporter assays (MPRA 8) and genome engineering-based interrogation of enhancers (GenIE 9, GenIE-ATAC 10), but these are limited to understanding the effects of variants on transcription or chromatin accessibility and do not directly demonstrate the downstream effects of these polymorphisms. Genome engineering of isogenic human cell models, such as differentiated hiPSCs differing in a single variant, provides an attractive means for understanding the role of specific SNPs on cell function and has been successful in understanding the role of strong effect familial mutations 11,12,13. However, such systems have yet to be successfully applied to more than a handful of common genetic variants 14,15,16 and their effects are often confounded by variation between edited clones and during differentiation 17.
Neuroinflammation has been shown to be a crucial factor in many neurodegenerative disorders, especially AD 18. Microglia are resident immune cells of the central nervous system and perform various functions under both homeostatic and disease conditions 19. Microglia play important roles in synaptic remodelling, synaptic pruning and phagocytosis of dead neurons 20. Microglia have been implicated in AD by human genetics studies because of specific expression of AD-related genes in these cells [21][22][23], colocalisation of microglial eQTLs with AD GWAS hits 24,25 and implication of rare causal variants in microglial-specific genes, such as TREM2 26 and CD33 26,27. Nevertheless, the exact role of microglia in AD pathogenesis is still unclear. Aβ and tau aggregates are able to activate microglia and elicit an inflammatory response that can have two functions: it can be neuroprotective by increasing Aβ or tau clearance, but it can also contribute to Aβ and tau production and spreading and induce neurodegeneration and synaptic loss 28,29. Microglia are known to be involved in clearance of aggregated proteins such as Aβ 30, and imbalanced microglial activation or insufficient microglial phagocytic capacity can accelerate the accumulation of extracellular Aβ 31. However, depletion of microglia in amyloid precursor protein-transgenic mouse models was found to have little effect on plaque formation or maintenance. Microglia have also been found to potentially contribute to tauopathy by spreading tau protein via exosomes across brain regions 32 as well as by its uptake and degradation 29. Moreover, sustained microglia activation appears to contribute to neurodegeneration by impacting synaptic plasticity and causing neuronal damage 33,34. As well as microglia, other myeloid cell types, such as infiltrating monocytes 35,36, choroid plexus 37 and perivascular macrophages [38][39][40], have also been found to play a role in AD pathogenesis.
PTK2B is a non-receptor tyrosine kinase that is expressed in neurons, microglia, astrocytes, monocytes and tissue-resident macrophages, and has diverse physiological and pathological roles including cell adhesion, cell migration, inflammatory responses, tumor invasiveness 41, neuronal development and plasticity 42,43. PTK2B is a member of the focal adhesion kinase (FAK) subfamily and after autophosphorylation it recruits Src-family kinases 44 and interacts with multiple partners to activate various downstream signalling pathways, such as the MAPK/ERK pathway 42. In neurons, PTK2B can be activated following stimulation of glutamate receptors 45, neuronal depolarization 46 and increased intracellular free Ca2+ 47. In neurons PTK2B is implicated in synaptic plasticity 48 and neurite outgrowth 49. In non-neuronal cells, such as osteoclasts 50, macrophages [51][52][53] and monocytes 54, PTK2B is involved in cell migration and regulation of the actin cytoskeleton downstream of integrins. In monocytes, the ROS-sensitive calcium channel TRPM2 can activate the PTK2B/ERK pathway, leading to nuclear translocation of nuclear factor-κB and increased chemokine production 55. This pathway is activated by Aβ in microglia, where PTK2B is autophosphorylated and generates a positive feedback loop to further activate TRPM2 56. PTK2B has been implicated in AD through a number of GWAS studies 4,6,[57][58][59][60][61][62][63][64]. As well as being involved in response to Aβ, PTK2B co-localizes with hyperphosphorylated Tau in the brain of AD patients and in a mouse model, and it may contribute to Tau phosphorylation 65,66. Evidence suggests that genetic deletion of Ptk2b rescues a number of Aβ-associated phenotypes, such as memory impairment, synapse loss and impaired synaptic plasticity, in APP/PS1 animal models of AD 67. Data collected from hemizygous PS19 (MAPT P301S, 1N4R) transgenic mice crossed with Ptk2b −/− animals point to a protective role for Ptk2b against Tau phosphorylation and Tau-induced behavioral deficits, in part through the suppression of LKB1 and p38 MAPK 68. Several lines of evidence point to PTK2B as an important player in the pathophysiology of AD but much remains to be learnt about the exact role of PTK2B in AD pathogenesis.

Prioritisation of a putative Alzheimer's disease causal variant at the PTK2B locus
GWAS have identified two independent genetic associations at the PTK2B-CLU locus 57. The signal over the PTK2B gene colocalises with both an eQTL and caQTL in hiPSC-derived macrophages (Fig 1a) 69. Re-analysis of publicly available datasets 25,69,70 shows that the risk haplotype is linked to both a downregulation of PTK2B gene expression in hiPSC-derived macrophages (Fig 1b and 1c) and an increase in chromatin accessibility at an enhancer within intron 6 of the PTK2B gene (Fig 1d). Interestingly, in this case there is a single variant in the credible set (rs28834970) that lies within the ATAC peak that is changed in accessibility in the caQTL analysis, strongly suggesting that this variant is causal (C being the risk allele). Note that there is some inconsistency in the published literature on the directionality of these effects 69,25,70, but these data have been reanalysed in conjunction with the authors, and in additional hiPSC-derived macrophage datasets, so we are confident of the directionality of the changes described here. This confirmed that the C allele of rs28834970 decreases expression of PTK2B in hiPSC-derived macrophages (Fig 1b and Fig 1c), whereas in monocytes and one study of primary microglia, the C allele of rs28834970 is associated with increased expression of PTK2B (Supp Fig 1a) 69.

Characterisation of rs28834970 in hiPSC-derived macrophages and microglia
In a recent study this variant was found to be associated with an increase in PTK2B expression only in blood of AD patients and not in any brain region examined 71. Given the important role of PTK2B in myeloid cells and the evidence suggesting this variant modifies expression of PTK2B in this cell type, we decided to investigate the role of this variant in macrophages and microglia. We edited the C allele of rs28834970 into a well characterised hiPSC line (KOLF2_C1) that was homozygous for the T allele at this position (and thus the non-risk haplotype) using CRISPR-mediated homology directed repair 72 (Suppl Fig 1b and Methods). This allows us to unambiguously study the role of rs28834970 independently of the other variants in linkage disequilibrium. We subsequently analysed three independent homozygous (C/C) clones and two unedited (T/T) clones, one of which had been through the editing process, and the second being the parental line. These were differentiated in three biological replicates into either macrophages 73,74 or microglia 75,76 followed by a number of assays including transcriptomics (RNAseq), chromatin accessibility (ATACseq) and quantitative digital RT-PCR (dd-qRT-PCR) to measure gene expression (Fig 1e). Differentiation was highly reproducible between the two different genotypes by FACS for macrophage markers CD14, CD16 and CD206 (Supp Fig 2a) and microglia markers CD11b and CD45 (Supp Fig 2b). This was further validated by analysis of macrophage and microglial marker gene expression which were mostly consistent with the differentiation trajectory and did not cluster by genotype (Supp Fig 2c). Similarly, there did not appear to be genotype-dependent change in expression of canonical markers of macrophage or microglial activation (Supp Fig 2d).

rs28834970 reduces PTK2B expression by creating a novel CEBPB binding site in hiPSC-derived macrophages and microglia
We then analysed the effect of the rs28834970 allele on PTK2B expression and function of the intronic enhancer element. Consistent with the prediction from the eQTL analysis in hiPSC-derived macrophages, both RNAseq and digital droplet qPCR (dd-qPCR) analysis showed that expression of PTK2B was modestly reduced by introduction of the C allele in macrophages (RNAseq: p=0.019 and logFC=-0.2, dd-qPCR: p=0.0045 and logFC=-0.39) (Fig 2a) and microglia (RNAseq: p=0.06 logFC=-0.57, dd-qPCR: p=0.22 and logFC=-0.71) (Fig 2b). The effect was consistent between RNAseq and dd-qPCR, and across microglia and macrophages, but was variable in magnitude. This is consistent with the small effect size in the eQTL analysis (Fig 1b and Fig 1c), and is likely a feature of the majority of common variants that would be expected to have only small or context-specific effects on target gene expression 16. Given that an enhancer can have long range effects and mapping of chromosomal interactions at this locus in microglia had shown that enhancers harboring AD-risk variants interacted with active promoters of not only PTK2B and CLU, but also TRIM35 and CHRNA2 77, we also analysed expression of genes in a 600 kb window around rs28834970 (Supp Fig 3a). The effect on PTK2B expression was the most consistent change in microglia and macrophages in this window, but there was additionally a slight upregulation of CLU expression specifically in macrophages (logFC=0.72, p-value=0.03) which was not observed in microglia (Supp Fig 3b). This implies that in macrophages, there may be an effect of the rs28834970 enhancer on CLU expression as well as PTK2B. We also analysed changes in splicing of the PTK2B gene and saw only small changes between the two genotypes (Supp Fig 3c), none of which affected splicing of the exons near to the variant.
We next analysed the effect of rs28834970 on chromatin accessibility, which showed that the C allele created a novel region of accessible chromatin in both macrophages and microglia (Fig 2c) which was consistent with the expectations from the caQTL analysis in hiPSC-derived macrophages (Fig 1d). Analysis of chromatin accessibility within a 600 kb window around the SNP showed that there was no change in most peaks, and the only statistically significant change consistent between macrophages and microglia was the new peak over rs28834970 (macrophages: p=1.65e-16 logFC=2.71, microglia: p=4.19e-12 logFC=2.77) (Supp Fig 3d). Peaks within the CHRNA2 and CLU genes were significantly reduced in microglia, and an intergenic peak was increased in macrophages, but these were not consistent between the two cell types, or with the expression changes of these genes (Supp Fig 3d and Supplementary Table 1). Further analysis of the sequence around rs28834970 showed that the C allele increased the probability of binding of transcription factors CEBPA and CEBPB (Fig 2d). We thus analysed CEBPB binding to chromatin by CUT&RUN, which showed an increased binding for the C allele of rs28834970 (p=0.04 logFC=2.09) (Fig 2c), consistent with CEBPB binding being responsible for the change in chromatin accessibility. Indeed, in a genome-wide analysis, this was the only region that showed a significant change in CEBPB binding between the C/C and T/T alleles at rs28834970 (Fig 2e) (Supplementary Table 2). CEBPA and CEBPB have previously been shown to have both activator and repressor functions 78, and thus we propose that this novel binding event represses expression of PTK2B. Taken together, these results validate that rs28834970 within the PTK2B locus is the causal variant and acts through introducing a novel CEBPB binding site that increases chromatin accessibility at the intronic enhancer, and results in repression of PTK2B expression.

rs28834970 causes strong effects on the transcriptome including chemokine expression
We next analysed the effects of editing rs28834970 on genome-wide gene expression in macrophages and microglia. Principal component analysis showed that although there was a strong clone and batch effect, the two genotypes were well separated by the second principal component (Fig 3a). Similar results were seen with analysis of global chromatin accessibility, where genotypes were separated by the second principal component (Supp Fig 4a). Analysis of the differentially expressed genes between genotypes showed 690 downregulated and 428 upregulated genes in macrophages, and 760 downregulated and 328 upregulated in microglia (Fig 3a) (Supplementary Table 3). Importantly, there was a significant (p<0.0001) overlap in the differentially expressed genes between the two cell types (Supp Fig 4b), further corroborating that we were identifying true effects of the genotype, not simply variation between edited clones or differentiation artefacts. We performed gene set enrichment analysis of the differentially expressed genes relative to all genes expressed in the respective cell type. This showed that a number of gene ontology pathways relevant to microglial or macrophage function were enriched in the downregulated gene set including chemokine-mediated signalling, cellular response to interferon gamma (IFNγ) and migration and chemotaxis (Fig 3b, Fig 3c). These pathways were frequently overlapping between microglia and macrophages (Supp Fig 4c), again supporting that these changes were genotype-dependent. Conversely, there were no obviously relevant gene ontology (GO) enrichments in the upregulated gene set (Supp Fig 4d) which may be due to the smaller number of genes analysed. Further analysis of the downregulated genes confirmed that there was considerable overlap between microglia and macrophages across the different GO categories (Fig 3d). Of note, we found that the Macrophage Receptor With Collagenous Structure (MARCO) gene was among the top 3 downregulated genes in both microglia and macrophages with the C/C allele (Supp Fig 4e). Given their role in microglial and macrophage function, we focused on chemokines, MHC complex proteins and surface receptors and validated their expression by dd-qPCR. All of the chosen genes showed some level of downregulation as expected in microglia, and all except one (HLA-DQA1) were downregulated in macrophages (Fig 3e), with effect sizes generally consistent between dd-qPCR and RNAseq.
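The text above reports a significant overlap between the differentially expressed gene lists of the two cell types but does not name the test used. One common way to quantify such an overlap is an upper-tail hypergeometric test, sketched below as an illustration only. The differentially expressed gene counts come from the text; the size of the gene universe and the number of shared genes are hypothetical placeholders, and the function name `overlap_pvalue` is introduced here for illustration.

```python
from math import comb


def overlap_pvalue(universe: int, n_a: int, n_b: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of genes shared by two lists of sizes n_a and n_b
    when both are drawn at random from `universe` expressed genes.
    """
    if not (0 <= overlap <= min(n_a, n_b) <= max(n_a, n_b) <= universe):
        raise ValueError("inconsistent set sizes")
    total = comb(universe, n_b)
    tail = sum(
        comb(n_a, k) * comb(universe - n_a, n_b - k)
        for k in range(overlap, min(n_a, n_b) + 1)
    )
    return tail / total


# DE gene counts are from the text (macrophages: 690 + 428 = 1118,
# microglia: 760 + 328 = 1088). The universe size (12,000 expressed genes)
# and the overlap (300 shared genes) are HYPOTHETICAL values.
p = overlap_pvalue(universe=12_000, n_a=1_118, n_b=1_088, overlap=300)
print(f"hypergeometric overlap p-value: {p:.3g}")
```

Exact integer arithmetic is used for the binomial coefficients, so the only rounding happens in the final division. The result depends strongly on the chosen universe, which should be the set of genes expressed in both cell types rather than the whole genome.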

rs28834970 causes changes in chemokine release and migration
Guided by the consistent effects on chemokine and migration-related gene expression in both macrophages and microglia, we next analysed the release of 13 chemokines using a multiplexed assay on microglial cell supernatants. In unstimulated cells, we saw significant reduction in production of 10 of the 13 chemokines tested with the C/C allele at rs28834970, consistent with the results of the transcriptomic analysis (Fig 4a). CCL22 was unexpectedly upregulated, and CXCL2 and CXCL5 were unchanged (Fig 4a). Given that several of these differentially expressed chemokines are produced by microglia in response to stimuli such as IFNγ and lipopolysaccharide (LPS), we also analysed whether their production was altered in microglia harbouring the C/C allele in the presence of these stimuli. As expected, both treatments increased production of most (10 of 12) of the chemokines (Supp Fig 5a and Supp Fig 5b). CCL1 production was increased by LPS stimulation in microglia harbouring the C/C allele but not in microglia with the T/T allele (Supp Fig 5b). Similar to what was observed in unstimulated cells, production of most of the chemokines was reduced with the C/C allele at rs28834970 when microglia were stimulated with IFNγ, but the magnitude of the reduction was much smaller, such that only 5 of the 12 chemokines were significantly reduced (Fig 4b). A similar response was seen upon stimulation with LPS (Supp Fig 5c). Taken together, these data suggest that the C/C allele results in a lower basal chemokine release, and in the presence of stimuli such as IFNγ, this reduction becomes less marked.
We further investigated the effect of rs28834970 on microglial migration and chemotaxis using a transwell assay in the presence or absence of IFNγ and with or without a chemoattractant relevant to microglia, complement 5a (C5a). Microglia harbouring the unedited (T/T) allele showed chemotaxis towards C5a that was inhibited by IFNγ treatment (Fig 4c). Microglia harbouring the C/C alleles at rs28834970 showed no significant chemotaxis towards C5a or inhibition of migration following IFNγ treatment (Fig 4c). Taken together, these data suggest that microglia harbouring the C/C allele are less sensitive to the chemoattractant C5a and to the inhibitory effect of IFNγ on migration. In summary, we show that the C allele of rs28834970 lowers chemokine release by microglia, reduces chemotaxis towards C5a and alters their response to IFNγ stimulation, perhaps by altering microglia reactivity.

Discussion
One of the major challenges in interpreting GWAS is to understand the genes and causal variants underlying the genetic association with disease. The use of isogenic cell lines generated by CRISPR in model systems such as hiPSC differentiated cell types offers one way to dissect the role of individual variants in cellular phenotypes independently of other variants in linkage disequilibrium. However, common regulatory variants that are identified in GWAS have rarely been linked to changes in cellular 'omics or phenotypes.
Here we identify a likely causal variant from comparison of Alzheimer's disease GWAS, eQTL and caQTL studies and engineer this using CRISPR/Cas9 into hiPSCs to generate isogenic cell lines differing in the single variant. We show that the C/C allele of rs28834970 likely introduces a novel CEBPB binding site in an intron of PTK2B that results in transcriptional repression of PTK2B expression. Even though the downregulation of PTK2B caused by the variant does not reach statistical significance in our models, it is consistent with previous reports that this variant causes a decrease in PTK2B expression in monocytes 79 and macrophages 25,69,70 (Fig 1b and c). This could be due to the small effect size making it difficult to study with a limited number of isogenic cell clones or compensatory regulatory mechanisms that may buffer expression levels. Nevertheless, we found that the C/C allele at rs28834970 caused a large transcriptional response and resulting phenotypic changes, including reduction in chemokine release and decreased chemotaxis towards C5a. Downregulation of genes involved in cellular response to IFNγ and the lack of inhibition of migration following IFNγ treatment are phenotypes consistent with the downregulation of PTK2B in cells harbouring the C/C allele. Following IFNγ receptor stimulation, PTK2B has been shown to bind to JAK2 and contribute to the activation of STAT1 and MAPK pathways 80. Moreover, inhibition of PTK2B has been shown to decrease GSK-3β-mediated IFNγ signalling 80. We propose that these phenotypic changes are the result of a change in microglia reactivity, causing them to respond differently to stimuli such as IFNγ, and this may impact the ability of the microglia to respond to aggregated proteins such as Aβ plaques and cellular damage in the brain of Alzheimer's disease patients. Indeed, several pro-inflammatory chemokines, such as CCL2, CCL5 and CXCL1, which are downregulated in microglia with the C/C allele both in the naive state and when stimulated with IFNγ, are produced by microglia in response to Aβ and are important for Aβ-induced microglia chemotaxis [81][82][83]. We believe this study demonstrates the power of isogenic cell lines to identify and validate individual variants, including common regulatory SNPs identified from GWAS.

Engineering rs28834970 C/C allele in hiPSCs
The C/C risk allele at variant rs28834970 was engineered in hiPSCs by nucleofection of ribonucleoprotein (RNP) complex containing full-length chemically modified synthetic guide RNA and eSpCas9, along with a ssODN repair template, as previously described 72 .
Briefly, eSpCas9 84 was expressed and purified from Escherichia coli using a His-tag. KOLF2_C1 cells were cultured ahead of nucleofection in TeSR E8 medium (StemCell Technologies) on Synthemax (Corning) (final amount 5 μg/cm²). eSpCas9 (5 μl, 20 μg) was mixed with full-length chemically modified guide RNA (AGTGGAATTGTACAACACTG) (Synthego) (5 μl, 225 pmol) at room temperature for 20 min for RNP complexes to form, followed by addition of the ssODN repair template (CTGGCAAGACTAATCTACTTTCTATTTTTATGGATCTGCCTTTTCTGGTCATTCCATATAAGTGGAATTGCACAACACTGTGGCCTTTCGCGACGGCTGCTTTCACTTAGCACAATGTTTTGAAACTTCCTCCATGTTGT) (5 μl, 500 pmol) just before the nucleofection. Cells were detached using Accutase (StemCell Technologies) and dissociated into a single cell suspension. 1 × 10⁶ cells were resuspended in P3 buffer (Lonza), mixed with the RNP/template complex and nucleofected using Lonza 4D-Nucleofector with program CA137. After nucleofection cells were plated onto a 10 cm dish coated with Synthemax (5 μg/cm²) with TeSR-E8 supplemented with CloneR (StemCell Technologies). Cells were cultured to 80% confluency, detached using Accutase and dissociated into single cells for subcloning. 5000 cells were plated in a 10 cm dish and expanded, and colonies were picked for genotyping and freezing. Colonies were screened for introduction of the homozygous C allele by high throughput sequencing of an amplicon spanning the rs28834970 variant using an Illumina MiSeq instrument. Final cell lines were validated using Sanger sequencing. Clones used in downstream assays include: three homozygous (C/C) clones (named C8, C11 and D3), one unedited (T/T) picked clone (B11) and the parental KOLF2_C1 line.
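The guide RNA and ssODN repair template quoted above can be compared directly to see where the template differs from the guide target. The following is a minimal sketch, not part of the authors' pipeline: it scans only the forward strand of the template as transcribed here, and the helper name `best_window` is introduced for illustration.

```python
# Sequences as transcribed from the Methods text above.
GUIDE = "AGTGGAATTGTACAACACTG"
SSODN = (
    "CTGGCAAGACTAATCTACTTTCTATTTTTATGGATCTGCCTTTTCTGGTCATTCCATATAAGTGGAATTGCA"
    "CAACACTGTGGCCTTTCGCGACGGCTGCTTTCACTTAGCACAATGTTTTGAAACTTCCTCCATGTTGT"
)


def best_window(guide: str, template: str):
    """Return (start, mismatches) for the template window closest to the guide.

    Each mismatch is (position in guide, guide base, template base).
    Only the forward strand is scanned (an assumption of this sketch).
    """
    best = None
    for start in range(len(template) - len(guide) + 1):
        window = template[start:start + len(guide)]
        mismatches = [
            (i, g, w) for i, (g, w) in enumerate(zip(guide, window)) if g != w
        ]
        if best is None or len(mismatches) < len(best[1]):
            best = (start, mismatches)
    return best


start, mismatches = best_window(GUIDE, SSODN)
# The three bases immediately 3' of the matched window (candidate PAM).
pam = SSODN[start + len(GUIDE):start + len(GUIDE) + 3]

for pos, guide_base, template_base in mismatches:
    print(f"guide position {pos + 1}: {guide_base} in guide, {template_base} in template")
print(f"bases 3' of the protospacer in the template: {pam}")
```

With the sequences as written above, the closest window differs from the guide at a single base (T in the guide, C in the template), which matches the T to C edit described in the text, and the window is followed by TGG in the template.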

Fluorescence activated cell sorting analysis
Microglia or macrophages were detached using Accutase (StemCell Technologies) and washed twice in PBS. 1 × 10⁵ cells were incubated with Human TruStain FcX (Fc Receptor Blocking Solution) (Biolegend) for 30 minutes at 4°C in the dark. Microglia were stained with 5 μl each of PE-anti human CD45 antibody clone HI30 (Biolegend) and APC-anti human CD11b antibody clone ICRF44 (Biolegend) for 30 minutes at 4°C. Macrophages were instead stained with 5 μl of either APC/Cy7-anti human CD16 antibody clone 3G8 (Biolegend), APC-anti human CD206 antibody (BD Biosciences) or PE-anti human CD14 antibody (BD Biosciences). Isotype controls were stained with either 0.2 mg/ml APC-mouse IgG1 κ (BD Biosciences), 0.2 mg/ml of PE-mouse IgG2A (BD Biosciences) or 0.2 mg/ml APC/Cy7 mouse IgG1 κ (BD Biosciences). Cells were then washed three times in FACS buffer (5% FBS in D-PBS), resuspended in 200 μl of DAPI solution (1 μg/ml in PBS) and analysed. All samples were analysed using a BD LSRII instrument. An unpaired t-test was used to test the difference between samples from cells with the T/T and the C/C allele at rs28834970.

RNA extraction
TRI Reagent (Zymo) was added directly to cell layers that had been washed in PBS, and mixed. RNA was extracted using the Direct-zol RNA Miniprep kit (Zymo) following the manufacturer's protocol. The optional in-column DNase digest was performed, and RNA was eluted in 50 μl DNase/RNase-free water. We made cDNA from 1 μg RNA using Superscript IV (Thermo-Fisher) according to the manufacturer's protocol.

Reverse transcription
1 μg RNA was reverse transcribed using Superscript IV (Thermo-Fisher) and random hexamers, according to the manufacturer's protocol.

RNA sequencing of hiPSC-derived macrophages and microglia
Transcriptome libraries for hiPSC-derived macrophages and microglia were generated with the Illumina TruSeq stranded RNAseq kit (polyA) and all samples were sequenced using Illumina HiSeq with 50 million mapped reads on average for each sample.
Reads were mapped to the human genome (GRCh38/hg38) using the BWA software 85 and coverage tracks were visualised on the UCSC genome browser using bigwig files, made with bamCoverage 86 and normalised by Counts per Million (CPM). The number of reads per transcript was counted using the featureCounts software 87 and a custom R script was used to calculate Transcripts per Million (TPM). For RNAseq analysis of macrophages, TPMs were corrected for batch effects using the limma function removeBatchEffect 88. For microglia, all samples were differentiated and sequenced in the same batch. Raw read counts per transcript were used to analyse differential gene expression between cells with the T/T and the C/C allele at rs28834970, using the package DESeq2 89. DESeq2 was also used to generate PCA plots of counts after variance stabilizing transformation (vst). To investigate whether the rs28834970 variant had an effect on PTK2B splicing, we used edgeR 90 to calculate differential exon usage between microglia harbouring the C/C and the T/T allele. Gene ontology (GO) analysis of differentially expressed genes was carried out using the g:Profiler package in R 91.
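The CPM and TPM normalisations mentioned above follow standard definitions. The sketch below illustrates them in Python; it is not the custom R script used in the study, and the gene names, counts and lengths are hypothetical.

```python
def cpm(counts: dict) -> dict:
    """Counts per Million: scale each count by the library size."""
    library_size = sum(counts.values())
    return {gene: c / library_size * 1e6 for gene, c in counts.items()}


def tpm(counts: dict, lengths_bp: dict) -> dict:
    """Transcripts per Million.

    Counts are first divided by feature length in kilobases, then the
    length-normalised rates are rescaled so that they sum to one million.
    """
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000) for gene in counts}
    total_rate = sum(rates.values())
    return {gene: rate / total_rate * 1e6 for gene, rate in rates.items()}


# HYPOTHETICAL read counts and feature lengths, for illustration only.
example_counts = {"GENE_A": 100, "GENE_B": 300}
example_lengths = {"GENE_A": 1000, "GENE_B": 3000}

print(cpm(example_counts))
print(tpm(example_counts, example_lengths))
```

In this toy example the longer gene has three times the reads but the same TPM as the shorter one, which is the length correction that distinguishes TPM from CPM.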

ATACseq of hiPSC-derived macrophages and microglia
hiPSC-derived macrophages and microglia were detached using a solution of 4 mg/ml Lidocaine hydrochloride monohydrate (Sigma) and 5 mM EDTA (ThermoFisher Scientific) for 15 minutes at 37°C. ATACseq was performed as previously described 92. Briefly, cells were lysed using sucrose buffer containing 10 mM Tris-Cl pH 7.5, 3 mM CaCl2, 2 mM MgCl2, 0.32 M sucrose and permeabilised using 0.5% Triton-X-100 (Sigma). Nuclei were recovered by spinning at 450 × g for 5 minutes at 4°C and tagmentation was performed using Illumina Tagment DNA TDE1 Enzyme and Buffer Kits for 30 minutes at 37°C. 5 μl of TDE1 enzyme was used per 100,000 cells. MinElute PCR Purification Kit buffer PB (Qiagen) was added to stop the reaction and tagmented DNA was purified and eluted in 10 μl of EB. The whole volume was PCR amplified and dual indexed Illumina adapters were added (primer sequences from 92). Amplified libraries were gel purified and sequenced using an Illumina Novaseq 6000 with 30 million reads on average per sample.
Reads were mapped to the human genome (GRCh38/hg38) using the BWA software 85 and duplicate reads were removed using picard tools MarkDuplicates. Coverage tracks were visualised on the UCSC genome browser using bigwig files, made with bamCoverage 86 and normalised by Counts per Million (CPM). The number of mapped reads per sample was determined and all files were subsampled to the number of reads of the lowest sample, using samtools view 93. ATACseq peaks were called using MACS2 94 with flags --shift -100 --extsize 200 for window extension. A merged file of all called peaks in all samples was created and the number of reads under each peak was counted in all samples using bedtools coverageBed function 95. Genome-wide analysis of differential chromatin accessibility between cells harbouring the C/C and the T/T allele at rs28834970 was performed using DESeq2 89. DESeq2 was also used to generate PCA plots of counts after variance stabilizing transformation (vst). Peaks were annotated to genes using the ChIPseeker function annotatePeak 96.
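Subsampling every library to the depth of the smallest one requires a per-sample keep fraction. The sketch below shows one way to derive those fractions and format them for the `-s` option of `samtools view`, where the integer part of the value is the random seed and the fractional part is the proportion of reads kept. The exact invocation used in the study is not stated, so the sample names, read counts and seed here are hypothetical.

```python
def subsample_fractions(mapped_reads: dict) -> dict:
    """Fraction of reads to keep per sample to match the smallest library."""
    target = min(mapped_reads.values())
    return {sample: target / n for sample, n in mapped_reads.items()}


def samtools_s_arg(fraction: float, seed: int = 42):
    """Format a keep fraction as the SEED.FRACTION value for `samtools view -s`.

    Returns None when no subsampling is needed (fraction rounds to 1),
    because SEED.FRACTION cannot express a fraction of exactly one.
    """
    if round(fraction, 4) >= 1:
        return None
    return f"{seed + fraction:.4f}"


# HYPOTHETICAL mapped read counts per sample.
reads = {"TT_clone": 30_000_000, "CC_clone": 40_000_000}

for sample, fraction in subsample_fractions(reads).items():
    arg = samtools_s_arg(fraction)
    if arg is None:
        print(f"{sample}: keep all reads")
    else:
        print(f"{sample}: samtools view -b -s {arg} {sample}.bam > {sample}.sub.bam")
```

Equalising depth before peak calling avoids deeper libraries yielding more peaks simply because they contain more reads.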

CUT & RUN assay
Microglia were harvested using a solution of 4 mg/ml Lidocaine hydrochloride monohydrate (Sigma) and 5 mM EDTA (ThermoFisher Scientific) for 15 minutes at 37°C. CUT&RUN for CEBPβ was performed using CUTANA CUT&RUN Kit (EpiCypher) according to the manufacturer's instructions. 400,000 microglia were used per sample and permeabilization was optimised using 0.01% Digitonin (EpiCypher). 1 μg of C/EBP beta Antibody (H-7) sc-7962 (Santa Cruz Biotechnology) was used per sample and one control sample was treated in parallel with 1 μl of CUTANA Rabbit IgG CUT&RUN Negative Control Antibody. 0.5 ng of E. coli Spike-in DNA (EpiCypher) was added to each sample to normalise reads to control for experimental variability and sequencing depth. Around 5 ng of purified CUT&RUN-enriched DNA was used to prepare NGS libraries using the NEBNext Ultra II DNA Library Prep Kit (NEB) with NEBNext Multiplex Oligos for Illumina (Dual Index Primers Set 1) (NEB). All libraries were sequenced 150 bp paired end using an Illumina Novaseq 6000 with 15 million reads on average per sample.
Reads were trimmed to exclude Illumina adapter sequences using trim_galore and mapped to the human genome (GRCh38/hg38) using the bowtie2 software 97. The number of human mapped reads per sample was determined and all files were subsampled to the number of reads of the lowest sample, using samtools view 93. Trimmed reads were also mapped to the E. coli genome using bowtie2 and reads mapped to the human genome were normalised to spiked-in E. coli DNA using a previously published script (https://github.com/Henikoff/Cut-and-Run/blob/master/spike_in_calibration.csh) integrated into a custom bash script (https://github.com/ericabello/PTK2B_rs28834970/blob/main/Cut_Run_CEBPb_micro/cmds_CutRun_CEBPb_micro.sh). Coverage tracks were visualised on the UCSC genome browser using bigwig files made from E. coli normalised bedgraph files of fragments 1-1000 bp. Bigwigs were made using UCSC bedGraphToBigWig. Peaks in all samples containing the CEBPβ antibody were called using the SEACR software 98 and the IgG negative control sample was used as input to identify the threshold value at which the percentage of target versus IgG signal is maximized. A merged file of all CEBPβ peaks in all samples was created and the number of reads under each peak was counted in all samples using bedtools coverageBed function 95. Genome-wide analysis of differential CEBPβ binding between microglia harbouring the C/C and the T/T allele at rs28834970 was performed using DESeq2 89.
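The idea behind spike-in calibration can be illustrated with a short sketch: each sample's coverage is multiplied by a factor inversely proportional to its number of spike-in (E. coli) reads, so that samples yielding more spike-in reads are scaled down. This is a simplified illustration and does not reproduce the linked calibration script; the scaling constant, sample names and read counts are hypothetical.

```python
def spike_in_scale_factors(spike_reads: dict, constant: float = 10_000) -> dict:
    """Per-sample scale factor = constant / number of spike-in reads.

    `constant` is an arbitrary multiplier (a HYPOTHETICAL value here) that
    only sets the overall scale; it must be the same for all samples.
    """
    return {sample: constant / n for sample, n in spike_reads.items()}


def scale_bedgraph_line(line: str, factor: float) -> str:
    """Multiply the value column of one 4-column bedGraph line by `factor`."""
    chrom, start, end, value = line.rstrip("\n").split("\t")
    return f"{chrom}\t{start}\t{end}\t{float(value) * factor:.6g}"


# HYPOTHETICAL numbers of reads mapped to the E. coli genome per sample.
spike = {"TT_CEBPB": 5_000, "CC_CEBPB": 20_000}
factors = spike_in_scale_factors(spike)

print(factors)
print(scale_bedgraph_line("chr8\t100\t200\t4\n", factors["CC_CEBPB"]))
```

Because the same amount of spike-in DNA is added to every sample, differences in spike-in read counts reflect technical variation in recovery and sequencing depth rather than biology.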

Digital droplet qPCR
50 ng of cDNA per sample was mixed with 10 μl of ddPCR Supermix for probes (no dUTP) (Biorad). Droplets were generated for each reaction using a QX200 Droplet Generator (Biorad) and transferred to 96 well plates. Sealed plates were incubated in a thermocycler and cDNA was PCR amplified with the following steps: 95°C for 10 min, then 40 cycles of 95°C for 30 sec and 60°C for 1 min, and finally 98°C for 10 min. The resulting FAM and VIC fluorescence was read using a QX200 Droplet Reader (Biorad) and the fraction of positive droplets in the sample was determined.
Poisson statistics were used to determine the absolute number of copies per μl in each sample for both target and housekeeping control genes.The number of copies of target gene per sample was normalised by the control gene for each reaction and an unpaired t-test was used to test the difference between samples from cells with the T/T and the C/C allele at rs28834970.
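The Poisson step can be written out explicitly. If a fraction p of droplets is positive, the mean number of target copies per droplet is λ = −ln(1 − p), and dividing by the droplet volume gives copies per μl. The sketch below assumes a nominal droplet volume of 0.85 nl, which is not stated in the text, and uses hypothetical droplet counts.

```python
import math


def copies_per_ul(positive: int, total: int, droplet_volume_nl: float = 0.85) -> float:
    """Absolute concentration (copies per microlitre) from droplet counts.

    Uses the Poisson correction lambda = -ln(1 - p), where p is the fraction
    of positive droplets. The default droplet volume is an ASSUMED nominal
    value and should be replaced by the instrument's calibrated volume.
    """
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    copies_per_droplet = -math.log(1 - positive / total)
    return copies_per_droplet / (droplet_volume_nl * 1e-3)  # nl -> microlitre


def normalised_expression(target_conc: float, control_conc: float) -> float:
    """Target gene concentration relative to the housekeeping control."""
    return target_conc / control_conc


# HYPOTHETICAL droplet counts for a target (FAM) and a control (VIC) assay.
target = copies_per_ul(positive=2_000, total=15_000)
control = copies_per_ul(positive=6_000, total=15_000)
print(f"target: {target:.1f} copies/ul, control: {control:.1f} copies/ul")
print(f"normalised expression: {normalised_expression(target, control):.3f}")
```

The correction matters most at high occupancy, where many positive droplets contain more than one template molecule and a simple positive fraction would underestimate the concentration.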

Microglia stimulation and Luminex assay
hiPSC-derived microglia at day 6 of differentiation were stimulated with IFNγ (R&D Systems) 200 ng/ml or lipopolysaccharide (LPS) (Invitrogen) 100 ng/ml for 24 hours. Culture plates were spun at 4,000 rpm for 20 min at 4°C and media collected. Harvested media was diluted 1:20 for unstimulated controls and 1:250 for stimulated samples. Luminex assay was performed using ProcartaPlex Human Basic Kit (Invitrogen) and a panel of ProcartaPlex Human Simplex (Invitrogen) beads, according to the manufacturer's recommendations. Plates were read using a Luminex MAGPIX instrument. Media from cells with the T/T and C/C allele were run on the same plate, allowing direct comparison of each protein concentration across genotypes. An unpaired t-test was used to test the difference in concentration (pg/ml) between different samples. The propagation of error for log fold change values of chemokine concentration between unstimulated and stimulated microglia was calculated using the Python package uncertainties (https://pythonhosted.org/uncertainties/).
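The error propagation for a log fold change can also be written by hand. The sketch below applies the first-order (linear) propagation formula to a log2 ratio of two means, assuming the two measurements are independent. The base of the logarithm is not specified in the text, so log2 is an assumption, as are the example concentrations; the study itself used the `uncertainties` package rather than this function.

```python
import math


def log2_fold_change(unstim_mean: float, unstim_sd: float,
                     stim_mean: float, stim_sd: float):
    """log2(stim / unstim) with its first-order propagated uncertainty.

    For f = log2(b / a) with independent errors on a and b:
        sigma_f = sqrt((sd_a / a)**2 + (sd_b / b)**2) / ln(2)
    """
    if unstim_mean <= 0 or stim_mean <= 0:
        raise ValueError("concentrations must be positive")
    lfc = math.log2(stim_mean / unstim_mean)
    rel_var = (unstim_sd / unstim_mean) ** 2 + (stim_sd / stim_mean) ** 2
    return lfc, math.sqrt(rel_var) / math.log(2)


# HYPOTHETICAL chemokine concentrations in pg/ml (mean, standard deviation).
lfc, err = log2_fold_change(unstim_mean=100.0, unstim_sd=10.0,
                            stim_mean=400.0, stim_sd=40.0)
print(f"log2 fold change = {lfc:.2f} +/- {err:.2f}")
```

The propagated uncertainty depends only on the relative errors of the two concentrations, so a 10% error on each measurement gives the same uncertainty on the log fold change whatever the absolute concentrations are.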

Microglia stimulation and migration assay
hiPSC-derived microglia at day 7 of differentiation were stimulated with 100 ng/ml IFNγ (R&D Systems) for 72 hours. At day 10, cells were detached using a solution of 4 mg/ml lidocaine hydrochloride monohydrate (Sigma) and 5 mM EDTA (Gibco) and 5 × 10⁴ cells in 100 μl macrophage media were seeded onto each transwell (PET with 5 μm pores, Sarstedt) in a 24-well plate. Cells were incubated for 15 minutes to settle and 600 μl of macrophage media containing 3 nM human recombinant C5a was added beneath the transwells. The plate was incubated for 10 hours to allow cell migration. The transwell was then rinsed in PBS, transferred to a fresh 24-well plate, and fixed with 4% paraformaldehyde for 20 min at RT. Cells on either side of the transwell membrane were stained with NucBlue (Invitrogen) and imaged with an EVOS FL Auto automated microscope (Thermo Fisher) set up to take images of each transwell in full (top image). The transwells were swabbed with a cotton wool bud to remove cells on the top surface, leaving behind only migrated cells, transferred into a fresh plate and imaged again with the same settings (bottom image). Cell counting was performed with CellProfiler 3.0 software 99 and the percentage of migrated cells per transwell was calculated as: (no. cells in bottom image) ÷ (no. cells in top image) × 100. Stimulations were performed in triplicate and the average for each sample across wells was taken. An unpaired t-test was used to test the difference between samples.

1 μl of FAM-tagged TaqMan assay for detection of the gene of interest and 1 μl of VIC-tagged TaqMan assay for a housekeeping gene control. The housekeeping control gene was chosen to match the level of expression of the gene of interest in each cell type. TaqMan assays list (Thermo Fisher Scientific), containing probes and primers:

of 12) of the chemokines (Supp Fig 5a and Supp Fig 5b). CCL1 production was increased by LPS stimulation in microglia harbouring the C/C allele but not in microglia with the T/T allele (Supp Fig 5b). As in unstimulated cells, production of most of the chemokines was reduced in microglia with the C/C allele at rs28834970 when stimulated with IFNγ, but the magnitude of the reduction was much smaller, such that the difference was significant for only 5 of the 12 chemokines (Fig 4b). A similar response was seen upon stimulation with LPS (Supp Fig 5c).

B) Boxplots showing the expression of the PTK2B gene stratified by the rs28834970 genotype in iPSC-derived macrophages. Data replotted from 25. The y-axis shows normalised expression levels (log TPM value) and each dot shows the expression level of a single sample. C) Boxplots showing the expression of the PTK2B gene stratified by the rs28834970 genotype in naïve iPSC-derived macrophages. Data replotted from 70. The y-axis shows normalised expression levels (log TPM value) and each dot shows the expression level of a single sample. D) ATAC-seq fragment coverage in iPSC-derived macrophages stratified by the rs28834970 genotype (top panel) and CEBPβ ChIP-seq fragment coverage and peaks in primary human macrophages (middle panel). Data replotted from 69. The bottom panel shows the structure of the PTK2B gene. E) Overview of the experimental design. Created with BioRender.com.

Figure labels (biological process terms): cellular response to interferon-gamma; chemokine-mediated signaling pathway; lymphocyte chemotaxis; neutrophil chemotaxis; monocyte chemotaxis; granulocyte chemotaxis; leukocyte chemotaxis; peptide antigen assembly with MHC-II protein complex; regulation of mononuclear cell proliferation; innate immune response; antigen processing of exogenous peptide via MHC-II; cellular response to interleukin-1; response to cytokine; interferon-gamma production; response to other organism.

Figure label (sequence): 5' TGGAATTGCACA 3'

Figure 2 - Isogenic edited cell lines show that rs28834970 affects PTK2B expression through effects on chromatin accessibility and CEBPβ binding at the intronic enhancer in iPSC-derived macrophages and microglia.

Figure 3 - The rs28834970 variant affects global gene expression in iPSC-derived macrophages and microglia.

Figure 4 - The rs28834970 variant causes changes in chemokine production and migration in iPSC-derived microglia. A) Concentration of downregulated chemokines in unstimulated microglia with the C/C allele at rs28834970 relative to the concentration in cells with the T/T allele, measured by Luminex assay. Results are shown as the mean ± SEM of independent biological replicates (n=3). ns = non-significant, * p<0.05, ** p<0.01, *** p<0.001, **** p<0.0001, unpaired t-test. B) Concentration of downregulated chemokines in microglia with the C/C allele at rs28834970 relative to the concentration in cells with the T/T allele, stimulated with IFNγ. Results are shown as the mean ± SEM of independent biological replicates (n=3). ns = non-significant, * p<0.05, ** p<0.01, *** p<0.001, **** p<0.0001, unpaired t-test. C) Bar graph showing the percentage of migrated microglia harbouring the T/T allele (left) or the C/C allele (right) either unstimulated or stimulated with IFNγ, and with or without the chemoattractant C5a. Results are shown as the mean ± SEM of independent biological replicates (n=3). ns = non-significant, * p<0.05, ** p<0.01, *** p<0.001, unpaired t-test.
D) Effect of the rs28834970 SNP on transcription factor binding probability. Position probability matrices are shown for CEBPβ (top) and CEBPα (middle) as well as the position of the binding motifs on the PTK2B sequence (bottom). The plot was generated using motifbreakR 100. E) Volcano plot showing fold changes of CEBPβ binding between microglia harbouring the C/C and the T/T allele for all detectable peaks, measured by CUT&RUN. Blue dots indicate peaks with no significant change, while red dots represent significantly increased peaks (logFC>1 and p<0.05). The only significantly increased CEBPβ peak found spans the rs28834970 variant in the PTK2B intron.