Abstract
Genome-wide association studies (GWAS) have revealed that 88% of disease-associated SNPs reside in noncoding regions. However, noncoding SNPs remain understudied, partly because they are challenging to prioritize for experimental validation. To address this deficiency, we developed the SNP effect matrix pipeline (SEMpl). SEMpl estimates transcription factor binding affinity by observing differences in ChIP-seq signal intensity for SNPs within functional transcription factor binding sites genome-wide. By cataloging the effects of every possible mutation within the transcription factor binding site motif, SEMpl can predict the consequences of SNPs for transcription factor binding. This knowledge can be used to identify potential disease-causing regulatory loci.
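The idea of a position-by-nucleotide effect matrix can be illustrated with a minimal Python sketch. This is not the SEMpl implementation: the function names `build_effect_matrix` and `snp_effect`, the toy sequences, and the signal values are all hypothetical. The sketch also assumes that each entry is the log2 ratio of the mean ChIP-seq signal for sites carrying a given base at a given motif position to the mean signal over all sites.

```python
import math
from collections import defaultdict

BASES = "ACGT"

def build_effect_matrix(sites):
    """Build a toy effect matrix from (sequence, signal) pairs.

    Assumption: all sequences are aligned to the motif, have equal
    length, and carry a positive ChIP-seq signal value.
    """
    length = len(sites[0][0])
    # Baseline: mean signal over all observed sites.
    baseline = sum(signal for _, signal in sites) / len(sites)
    sums = [defaultdict(float) for _ in range(length)]
    counts = [defaultdict(int) for _ in range(length)]
    for seq, signal in sites:
        for pos, base in enumerate(seq):
            sums[pos][base] += signal
            counts[pos][base] += 1
    matrix = []
    for pos in range(length):
        row = {}
        for base in BASES:
            if counts[pos][base]:
                mean = sums[pos][base] / counts[pos][base]
                row[base] = math.log2(mean / baseline)
            else:
                row[base] = None  # base never observed at this position
        matrix.append(row)
    return matrix

def snp_effect(matrix, pos, ref, alt):
    """Predicted change in log2 signal when ref is replaced by alt."""
    ref_score, alt_score = matrix[pos][ref], matrix[pos][alt]
    if ref_score is None or alt_score is None:
        return None
    return alt_score - ref_score

# Hypothetical toy data: three-base "motif" sites with made-up signals.
sites = [("ACG", 8.0), ("ATG", 2.0), ("ACG", 8.0), ("TCG", 2.0)]
matrix = build_effect_matrix(sites)
effect = snp_effect(matrix, 1, "C", "T")  # negative: T lowers the toy signal
```

In this toy setting, a negative `effect` marks a SNP predicted to weaken binding and a positive value marks one predicted to strengthen it. Entries for bases never observed at a position are left as `None` rather than guessed.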
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.