Abstract
Recent genome-wide association studies (GWAS) have identified numerous schizophrenia (SZ)-associated loci. As is common across disorders, many of the SZ-associated variants are located outside protein-coding regions and are hypothesized to differentially affect the transcription of nearby genes. To systematically identify which variants affect potential regulatory activity, we used a massively parallel reporter assay (MPRA) to test 1,049 variants in high linkage disequilibrium (LD) within 64 reported SZ-associated loci, along with an additional 30 SNPs from 9 loci associated with Alzheimer's disease (AD). Each variant was assayed at the center of a 95 bp synthetic oligonucleotide. The resulting library was transfected 3 independent times into K562 chronic myelogenous leukemia lymphoblasts and another 6 times into SK-SY5Y human neuroblastoma cells. We identified 148 SNPs with significant allelic differences in their effect on reporter gene expression in K562 cells and 53 in SK-SY5Y cells, with a mean of 2.6 such SNPs per locus and a median of 1. The overlap between cell lines was modest: 9 SNPs had significant allelic differences in both lines, 8 of them in the same direction. We do not observe a directional preference (increased or decreased enhancer activity) for risk versus non-risk alleles. We find that large LD blocks have a greater density of functional SNPs, which supports instances of combinatorial SNP effects that may lead to selection at the haplotype level. Our results help determine driver GWAS variant(s), guide the functional follow-up of disease-associated loci, and enhance our understanding of the genomic dynamics of gene regulation.