## Abstract

Although quantitative trait locus (QTL) associations have been identified for many molecular traits such as gene expression, it remains challenging to distinguish the causal nucleotide from nearby variants. In addition to traditional QTLs by association, allele-specific (AS) QTLs are a powerful measure of cis-regulation that are largely concordant with traditional QTLs, and can be less susceptible to technical/environmental noise. However, existing asQTL analysis methods do not produce probabilities of causality for each marker, and do not take into account correlations among markers at a locus in linkage disequilibrium (LD). We introduce PLASMA (PopuLation Allele-Specific MApping), a novel, LD-aware method that integrates QTL and asQTL information to fine-map causal regulatory variants while drawing power from both the number of individuals and the number of allelic reads per individual. We demonstrate through simulations that PLASMA successfully detects causal variants over a wide range of genetic architectures. We apply PLASMA to RNA-Seq data from 524 kidney tumor samples and show that over 13 percent of loci can be fine-mapped to within 5 causal variants, compared to less than 2 percent of loci using existing QTL-based fine-mapping. PLASMA furthermore achieves greater power at 50 samples than conventional QTL fine-mapping does at over 500 samples. Overall, PLASMA achieves a 6.4-fold reduction in median 95% credible set size compared to existing QTL-based fine-mapping. We additionally apply PLASMA to H3K27ac ChIP-Seq data from 28 prostate tumor/normal samples and demonstrate that PLASMA is able to prioritize markers even at small sample sizes, with PLASMA achieving a 1.4-fold reduction in median 95% credible set size over existing QTL-based fine-mapping. Variants in the PLASMA credible sets for RNA-Seq and ChIP-Seq were enriched for open chromatin and chromatin looping (respectively) to a comparable or greater degree than credible variants from existing methods, while containing far fewer markers. Our results demonstrate how integrating AS activity can substantially improve the detection of causal variants from existing molecular data and at low sample size.

## 1 Introduction

A major open problem in genetics is understanding the biological mechanisms underlying complex traits, which are largely driven by non-coding variants. A widely adopted approach for elucidating these regulatory patterns is the identification of disease variants that also modify individual-level molecular activity (such as gene expression) in the population [1–4]. These quantitative trait loci (QTLs) are typically single nucleotide polymorphisms (SNPs) that exhibit a statistical association with overall gene expression abundance [5–8]. Although QTL association analysis is now mature, it remains challenging to identify the precise variants that causally influence the molecular trait (as opposed to variants in linkage disequilibrium (LD) with causal variants), a task known as fine-mapping [9]. As only a small subset of QTL-associated markers are estimated to be causal [10, 11], direct experimental validation is prohibitive and has motivated statistical fine-mapping solutions [12]. The aim of statistical fine-mapping is to quantify the probability of each marker being causal, allowing one to prioritize the most likely causal markers, and thus formally quantify the effort needed for experimental validation. Recent statistical fine-mapping methods operate on summary QTL statistics and can handle multiple causal variants by modeling the local LD structure [13–16]. These models have two outputs to help guide the prioritization of putative causal SNPs. First, for each marker, a Posterior Inclusion Probability (PIP) is calculated, which corresponds to the marginal probability of causality for the given marker. Second, an *n*%-confidence credible set is created: a set of markers with an *n*% probability of containing all the causal markers. Although QTL studies have sufficient power to identify thousands of associations, they are typically insufficient for fine-mapping below dozens of credible variants, even for very large studies [5, 17]. The need for large studies severely limits QTL analyses of expensive assays such as ChIP or single-cell RNA-seq, or of difficult-to-collect tissues.
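The two outputs described above can be illustrated with a minimal sketch. It assumes exactly one causal variant per locus, so that the PIPs sum to one and the summed PIP of a set is the probability that the set contains the causal marker. The function name `credible_set` and the example PIP values are hypothetical and not taken from any fine-mapping package.

```python
def credible_set(pips, level=0.95):
    """Greedily add markers by decreasing PIP until the summed PIP reaches `level`.

    Assumes a single causal variant, so the PIPs sum to one (an assumption of
    this sketch, not a general property of multi-causal fine-mapping).
    """
    order = sorted(range(len(pips)), key=lambda i: pips[i], reverse=True)
    chosen, mass = [], 0.0
    for i in order:
        if mass >= level:
            break
        chosen.append(i)
        mass += pips[i]
    return chosen


# Hypothetical PIPs for a five-marker locus.
pips = [0.01, 0.70, 0.04, 0.24, 0.01]
print(credible_set(pips, level=0.95))
```

A smaller returned set means fewer markers need experimental follow-up at the same confidence level.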

Here, we sought to improve molecular fine-mapping by leveraging intra-individual allele-specific (AS) signal, which is a measure of cis-regulatory activity that is independent of total, inter-individual variation. For heterozygous variants residing in expressed exons, it is often possible to map expressed reads to each allele and quantify the extent to which molecular activity is allele-specific [6, 18–21]. AS analysis allows for a precise comparison of the effects on molecular activity that are specific to each allele (cis-effects), while controlling for effects affecting both alleles (trans-effects). Thus, AS data is inherently less noisy than regular QTL data, which captures total expression regardless of source. The AS effect size has also been shown to be highly correlated with conventional QTL effect sizes, implying that both features typically capture the same underlying cis-regulatory patterns [22]. Several methods have recently been developed to robustly identify asQTLs [19, 20, 23], but the calculated association statistics follow a different distribution than QTL summary statistics and cannot be directly integrated into existing fine-mapping software to produce valid posterior measures and credible sets.

To combine the scalability of QTL analysis with the power of AS analysis, we introduce PLASMA (PopuLation Allele-Specific Mapping), a novel fine-mapping method that gains power from both the number of individuals and the number of allelic reads per individual. By modeling each locus across individuals in an allele-specific and LD-aware manner, PLASMA achieves a substantial improvement over existing fine-mapping methods with the same data. We demonstrate through simulations that PLASMA successfully detects causal variants over a wide range of genetic architectures. We apply PLASMA to diverse RNA-Seq data and ChIP-seq data and show a significant improvement in power over conventional QTL-based fine-mapping.

## 2 Results

### 2.1 Overview of PLASMA

PLASMA’s inputs are determined from a given individual-level sequencing-based molecular phenotype (gene or peak) and the corresponding local genotype SNP data (Figure 1a). For each sample, we assume the variant data is phased into haplotypes and expression reads have been mapped to each variant. Reads intersecting heterozygous markers (signified as fSNPs, or feature SNPs, indicated with green or purple on the figure) are then assigned to a particular haplotype, indicated as blue or red on the figure. These reads are then aggregated in a haplotype-specific manner to produce a total expression phenotype and an allelic imbalance phenotype. This aggregation of reads is analogous to the way existing methods such as RASQUAL and WASP calculate allelic fractions and total fragment counts [19, 20]. The total expression phenotype (*y*) is simply the total number of mapped reads. The allelic imbalance phenotype (*w*) is defined as the log read ratio between the haplotypes. This log-odds-like phenotype has previously been used to analyze asQTL effect sizes, showing consistency with conventional QTL analysis [22]. In practice, we also mitigate the effect of mapping bias by running state-of-the-art mapping bias and QC pipelines on all RNA-Seq and ChIP-Seq data prior to analysis [19].
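The two phenotypes defined above can be sketched for a single sample as follows. The function name, the argument names, and the pseudocount (used only to avoid taking the log of zero) are assumptions of this illustration and are not described in the text.

```python
import math


def phenotypes(total_reads, hap_a_reads, hap_b_reads, pseudocount=0.5):
    """Return (y, w) for one sample.

    y: total expression, the total number of mapped reads at the feature.
    w: allelic imbalance, the log ratio of reads assigned to haplotype A
       versus haplotype B. The pseudocount is a hypothetical guard against
       zero counts, not part of the definition in the text.
    """
    y = total_reads
    w = math.log((hap_a_reads + pseudocount) / (hap_b_reads + pseudocount))
    return y, w


# Hypothetical sample: 120 mapped reads, of which 40 overlap heterozygous fSNPs.
y, w = phenotypes(total_reads=120, hap_a_reads=30, hap_b_reads=10)
print(y, round(w, 3))
```

Swapping the two haplotypes flips the sign of `w`, which is what lets the phase of a marker carry directional information.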

PLASMA integrates two statistics computed for each marker to perform fine-mapping: a QTL association statistic (*z_{β}*) based on the total phenotype and an AS association statistic (*z_{ϕ}*) based on the allelic imbalance phenotype. Figure 1b shows how a causal marker influences total expression and allelic imbalance, and how this effect influences the statistics for the marker (see Methods for quantitative explanations of the statistics). Here, the causal marker’s alternative allele causes higher expression compared to the wild-type allele. We see that increasing the dosage (*x*) of the alternative allele increases the total expression (*y*) at the locus. The effect size (*β*), consistent with a typical QTL analysis, quantifies the association between a marker’s allelic dosage and the total expression at the locus with a linear relationship:

$$y = x\beta + \epsilon$$

From this effect size, PLASMA calculates *z_{β}*, the QTL association statistic (see Methods for the precise relationship with *β*). Note that this statistic is not dependent on haplotype-specific data.
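As a rough illustration of how a QTL effect size and association statistic relate, the sketch below fits an ordinary least-squares line of total expression on allelic dosage and reports the Wald z-score. This is a textbook construction used here as an assumption; the exact definition of *z_{β}* used by PLASMA is given in the Methods, and the data values are hypothetical.

```python
import math


def qtl_z(x, y):
    """OLS slope of y on x (with intercept) and its Wald z-score."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    # Residual variance with n - 2 degrees of freedom.
    rss = sum(((yi - my) - beta * (xi - mx)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return beta, beta / se


# Hypothetical dosages (0, 1, or 2 alternative alleles) and total expression.
dosage = [0, 1, 2, 0, 1, 2]
expression = [10, 14, 21, 11, 16, 19]
beta, z_beta = qtl_z(dosage, expression)
print(round(beta, 2), round(z_beta, 2))
```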

On the other hand, looking at the heterozygotes, we see that the haplotype possessing the alternative allele has a higher expression than the haplotype possessing the wild-type allele. In other words, the direction of imbalance of expression (*w*) is the same as the direction of the phase (*v*) of the allele. The *ϕ* effect size quantifies the association between a marker’s phasing and the imbalance of expression. An important departure from existing methods is that PLASMA models a linear relationship between the phase of a causal marker and the log read ratio, rather than directly relating the genotype to the allelic fraction in a non-linear manner:

$$w = v\phi + \zeta$$

To calculate the AS association statistic *z_{ϕ}*, PLASMA models the quality of each sample, taking into account each sample’s read coverage and read overdispersion (Figure 1c, see Methods for the precise weighting scheme).
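The idea of weighting samples by their quality can be sketched with an inverse-variance weighted regression of the imbalance *w* on the phase *v*. The weights below come from the delta-method approximation Var(log(a/b)) ≈ 1/a + 1/b, scaled by a hypothetical overdispersion factor. This is an illustrative stand-in, not PLASMA's actual weighting scheme, which is specified in the Methods.

```python
import math


def sample_weight(hap_a_reads, hap_b_reads, overdispersion=1.0):
    """Inverse of an approximate sampling variance of the log read ratio.

    The multiplicative overdispersion factor is an assumption of this sketch.
    """
    return 1.0 / ((1.0 / hap_a_reads + 1.0 / hap_b_reads) * overdispersion)


def as_z(v, w, weights):
    """Weighted regression through the origin of w on phase v.

    v is +1 or -1 for heterozygotes (by phase) and 0 for homozygotes.
    With weights equal to inverse variances, Var(phi) = 1 / sum(weights * v^2).
    """
    num = sum(k * vi * wi for k, vi, wi in zip(weights, v, w))
    den = sum(k * vi * vi for k, vi in zip(weights, v))
    phi = num / den
    return phi, phi * math.sqrt(den)


# Hypothetical data: three heterozygotes and one homozygote at the marker.
v = [1, -1, 1, 0]
w = [0.5, -0.5, 0.5, 0.3]
weights = [4.0, 4.0, 4.0, 4.0]
print(as_z(v, w, weights))
```

Samples with low coverage or high overdispersion receive small weights, so noisy log ratios contribute little to the statistic.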

These QTL and AS association statistics, together with the local LD matrix, are then jointly used to fine-map the locus (Figure 1c, see Methods for the fine-mapping model). Since PLASMA models both *z_{β}* and *z_{ϕ}* as a linear combination of genotypes, *z_{β}* and *z_{ϕ}* have identical LD (see Supplemental Methods for proof). PLASMA assumes that the QTL and AS statistics measure the same underlying cis-regulatory signal and are thus expected to have the same direction and same causal variants (but see Discussion for possible model violations). Although they measure the same underlying effect, the two statistics have independent noise because the intra-heterozygous variance is considered only in AS analysis, allowing them to be used jointly in fine-mapping. Furthermore, PLASMA accepts, as a hyperparameter, a correlation between QTL and AS effects, allowing the two sets of statistics to utilize a joint probability distribution (though we find that setting this hyperparameter to zero yields the most power). The distribution is used to assign a probability to a given causal configuration, a binary vector signifying the causal status of each marker in the locus. Although the correlation between QTL and AS causal effects can vary based on the hyperparameter specification, PLASMA assumes that the AS and QTL phenotypes have the same causal variants. PLASMA searches through the space of possible causal configurations, within a constraint on the number of causal variants. This procedure is related to that in CAVIAR, CAVIARBF, and FINEMAP [13–15], but generalized to the two correlated expression phenotypes. From these scored configurations PLASMA computes a posterior inclusion probability (PIP) for each marker, indicating the marginal probability that a marker is causal, and a *ρ*-level credible set containing the causal variant with *ρ* probability.
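To make the scoring of causal configurations concrete, the sketch below handles the simplest case: exactly one causal variant, a uniform prior over markers, and zero correlation between QTL and AS effects. Under a CAVIAR-style Gaussian model with a single causal marker, the Bayes factor for a marker depends only on that marker's own z-score, so the joint evidence is the product of a QTL factor and an AS factor. The prior variances `tau_qtl` and `tau_as` are hypothetical hyperparameters, and this is a simplification rather than PLASMA's full multi-variant search.

```python
import math


def log_bf(z, tau):
    """Log Bayes factor (causal vs. null) for one marker with z-score z,
    assuming a N(0, tau) prior on the causal non-centrality parameter."""
    return -0.5 * math.log1p(tau) + 0.5 * z * z * tau / (1.0 + tau)


def pips_single_causal(z_qtl, z_as, tau_qtl, tau_as):
    """PIPs under one causal variant, uniform prior, independent QTL/AS effects."""
    logs = [log_bf(zq, tau_qtl) + log_bf(za, tau_as)
            for zq, za in zip(z_qtl, z_as)]
    top = max(logs)  # subtract the maximum for numerical stability
    scores = [math.exp(l - top) for l in logs]
    total = sum(scores)
    return [s / total for s in scores]


# Hypothetical statistics for a three-marker locus.
pips = pips_single_causal([1.0, 3.0, 2.5], [0.5, 4.0, 1.0],
                          tau_qtl=10.0, tau_as=10.0)
print([round(p, 3) for p in pips])
```

In this simplified case the LD matrix drops out of the per-marker Bayes factor; LD matters once configurations with two or more causal variants are considered.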

### 2.2 Simulation framework

We evaluate PLASMA with a framework that simulates the expression of whole loci in an allele-specific manner. This simulation framework jointly simulates total reads and allele-specific read counts, under given values of the number of causal variants, the QTL heritability, the AS heritability, the variance of the AS phenotype across samples, and the expected read coverage (see Methods). The variance and heritability of the AS phenotype are handled by two separate parameters, where the former describes the total spread of allelic imbalance, and the latter specifies the fraction of the variance that is due to genetic effects. This allows us to investigate cases where a significant amount of observed imbalance is caused by non-genetic variance in the allelic expression. To quantify the total variance of the AS phenotype in the population, we define the “standard allelic deviation” as the standard deviation of the AS phenotype *w*, quantified on the allelic fraction scale (between 0.5 and 1). Importantly, this metric is independent of the genetic effect, which is controlled by the heritability parameter. Simulations were performed using real phased haplotype data from the 1000 Genomes Project European samples.
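One way to read the "standard allelic deviation" is as the standard deviation of the log read ratio *w* mapped onto the allelic fraction scale by the logistic function, so that no spread gives 0.5 and very large spread approaches 1. The sketch below encodes that reading. The mapping and the function names are assumptions of this illustration, and the simulation code may define the quantity differently.

```python
import math


def allelic_deviation(sd_w):
    """Map a standard deviation on the log-ratio scale to the allelic
    fraction scale (0.5 when sd_w = 0, approaching 1 as sd_w grows)."""
    return 1.0 / (1.0 + math.exp(-sd_w))


def sd_w_from_allelic_deviation(dev):
    """Inverse mapping: allelic fraction scale back to the log-ratio scale."""
    return math.log(dev / (1.0 - dev))


# A standard allelic deviation of 0.6 corresponds to sd(w) = log(1.5).
print(round(sd_w_from_allelic_deviation(0.6), 3))
```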

As the performance of standard QTL association models is well established, we first focused on the performance of our proposed AS statistic. Figure S1a shows how the mean *z_{ϕ}* varies as a function of standard allelic deviation and mean read coverage at a fixed AS heritability of 0.5. Figure S1b shows how the mean *z_{ϕ}* varies as a function of standard allelic deviation and heritability with mean coverage fixed at 100. The statistic is the greatest at high read coverage and high heritability, consistent with the degree of experimental and intrinsic signal available to the model. These results hold even at low AS variance (standard allelic deviation of 0.6) and show that PLASMA does not conflate high AS variance (standard allelic deviation) with high signal (coverage or heritability). This robustness to variance in the AS phenotype makes the model resistant to false positives driven by non-genetic sources of allelic variance. At very high variance (standard allelic deviation greater than 0.8), *z_{ϕ}* shows a sharp decrease. This decrease in signal is due to an increase in the sampling error of the AS phenotype (*w*) at high overall variance, as shown in Equation 46 (see Supplemental Methods for a mathematical relationship between total variance and sampling error).

### 2.3 Comparison with existing methods in simulation

Next, we compare PLASMA’s fine-mapping performance with existing fine-mapping methods. We test two different “flavors” of PLASMA, “PLASMA-JI” and “PLASMA-AS.” The PLASMA-JI (Joint-Independent) flavor looks at both AS and QTL statistics, assuming a shared set of AS and QTL causal variants, and also that the AS and QTL causal effects are uncorrelated. The PLASMA-AS flavor is restricted to only AS data. As a baseline, we also compared PLASMA to a QTL-Only version of PLASMA and to the CAVIAR method (expected to be equivalent to PLASMA QTL-Only) [13]. The behavior and performance of CAVIAR is representative of similar QTL-based methods such as CAVIARBF, FINEMAP, and PAINTOR without functional annotation data [14–16]. We furthermore compare the flavors of PLASMA against the only other publicly released fine-mapping method (to our knowledge) that integrates AS data, described in the preprint of Zou *et al.*, 2018 [24]. This unnamed method, which we denote as “CAVIAR-ASE,” utilizes the association between SNP heterozygosity and a binary indicator of allelic imbalance. By binarizing allelic imbalance, CAVIAR-ASE is expected to lose power relative to treating imbalance as a quantitative phenotype but may be more robust to spurious AS signal. Furthermore, CAVIAR-ASE utilizes only indicators of heterozygosity, rather than marker phasing. CAVIAR-ASE can therefore be used with unphased genotypes, but at the expense of being unable to leverage the direction of the allelic effect.

First, we evaluate how well each model prioritizes candidate causal markers using simulated loci with one causal variant. We define the “inclusion curve” for each model, where markers are ranked by posterior probability and added one by one to a cumulative set (note that this set is not dependent on the definition of a credible set). The x axis represents the cumulative number of markers chosen, and the y axis represents the “inclusion rate,” the proportion of true causal markers among the chosen markers. Figures 2a and c show inclusion plots at low and high AS variance, respectively. As expected, the QTL-Only flavor and the CAVIAR method are indistinguishable and do not vary with AS variance (thus, we do not include CAVIAR in further results). Furthermore, we see that PLASMA-JI and PLASMA-AS perform similarly at both levels of AS variance. Lastly, we see a dependency of CAVIAR-ASE’s performance on the degree of AS variance.
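The inclusion curve can be sketched as follows for simulated loci with one known causal marker each. Here the inclusion rate at *k* is taken to be the fraction of loci whose causal marker is among the top *k* markers ranked by posterior probability. That reading of the definition, the function name, and the example values are assumptions of this illustration.

```python
def inclusion_curve(loci, max_k):
    """loci: list of (pips, causal_index) pairs, one per simulated locus.

    Returns the inclusion rate for k = 1..max_k: the fraction of loci whose
    causal marker ranks within the top k markers by posterior probability.
    """
    ranks = []
    for pips, causal in loci:
        order = sorted(range(len(pips)), key=lambda i: pips[i], reverse=True)
        ranks.append(order.index(causal) + 1)
    return [sum(r <= k for r in ranks) / len(ranks)
            for k in range(1, max_k + 1)]


# Three hypothetical loci with three markers each.
loci = [
    ([0.7, 0.2, 0.1], 0),  # causal marker ranked first
    ([0.1, 0.3, 0.6], 1),  # causal marker ranked second
    ([0.2, 0.5, 0.3], 0),  # causal marker ranked third
]
print(inclusion_curve(loci, max_k=3))
```

A curve that rises faster indicates that fewer markers need to be examined before the causal one is reached.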

Second, we evaluate the ability of each model to rule out likely non-causal markers in simulated loci with one causal variant. To do so, we directly compare the distributions of the 95% confidence credible sets, with smaller sets indicating higher specificity. Figures 2b and d show distribution plots at low and high AS variance, respectively. At low variance, PLASMA-JI offers the smallest mean credible set size (10.4), followed by PLASMA-AS (10.6), then CAVIAR-ASE (44.6), and lastly CAVIAR (78.2) and QTL-Only (77.0). The AS-based flavors of PLASMA are resistant to changes in AS variance, with mean credible set sizes at high AS variance of 9.6 for PLASMA-JI and 10.4 for PLASMA-AS. In contrast, the performance of CAVIAR-ASE varies significantly with the degree of AS variance, even when the underlying signal (coverage and heritability) is constant, with a mean set size of 67.4 at high variance. This sensitivity may be due to the fact that CAVIAR-ASE does not incorporate marker phasing, and thus must rely solely on the intensity of imbalance, rather than the direction of imbalance.

Third, we run the AS-based methods across a wide range of coverage and heritability conditions, recording the mean 95% confidence credible set sizes, shown in Figure 3. Figures 3a-d show mean credible set sizes as a function of AS variance and coverage, and Figures 3e-h show mean credible set sizes as a function of AS variance and AS heritability. In terms of the range of set sizes, the PLASMA-JI flavor performs the best (7.4 variants on average at best conditions), followed by the PLASMA-AS flavor (7.4 at best conditions), and lastly the CAVIAR-ASE method (18.0 at best conditions). Generally speaking, all methods show results consistent with the behavior of *z_{ϕ}* in Figure S1. Although increasing either coverage or heritability results in smaller set sizes, increasing coverage beyond 100 gives diminishing returns as the observed expression levels approach the true expression levels. As expected, CAVIAR-ASE tends to struggle at low AS variance, which is especially apparent at a standard allelic deviation of 0.55, with a mean set size of 52.3 at best. This may be due to the large majority of samples falling under the threshold for allelic imbalance at 0.65. To verify that PLASMA is calibrated across the full range of conditions, we show in Figure S2 that the 95% credible sets have at least a 95% chance of including the causal variant.

### 2.4 Inference of multiple causal variants

To demonstrate PLASMA beyond a one-causal-variant assumption, we fine-mapped sets of simulated loci with 2 causal variants with each flavor of PLASMA. Figure 4 shows the inclusion curve and the distribution of 95% confidence credible set sizes for each flavor. For both curves, a success is defined as the inclusion of both causal variants. For PLASMA-JI, the mean credible set size increases from 9.5 for one causal variant to 87.0 for two causal variants. This apparent decrease in power is consistent with results in earlier QTL fine-mapping analyses [13, 14], where capturing all causal variants becomes increasingly difficult as the number of causal variants increases. Nevertheless, PLASMA-JI and PLASMA-AS deliver an improvement over QTL-Only fine-mapping, with mean credible set sizes of 87.0, 93.0, and 95.1 for PLASMA-JI, PLASMA-AS, and QTL-Only, respectively. Due to the difficulty of fine-mapping multiple causal variants [10], along with the estimate that over 75% of loci do not display allelic heterogeneity, further analyses on experimental data were performed under the one-causal-variant assumption.

Unlike the single causal variant case, where all model hyperparameters were inferred from simulation parameters, the causal variance hyperparameters in this case were manually calibrated. We believe that this need for calibration is due to linkage disequilibrium obfuscating the relationship between causal effect sizes and total heritability at a locus (see Supplemental Methods for information about hyperparameter estimation). The results shown in this section are calibrated such that the recall rates for 95% confidence credible sets are 0.95, 0.95, and 0.978 for the PLASMA-JI, QTL-Only, and PLASMA-AS flavors, respectively.

### 2.5 Fine-mapping of TCGA kidney RNA-Seq data

To evaluate our method on real data, we fine-mapped gene expression data from 524 human kidney tumor samples and 70 matched normal samples collected by TCGA [25]. The data was processed through a rigorous QC pipeline to account for mapping biases based on established best practices [19, 22]. Figure 5 shows credible set size distribution plots for tumor and normal data under a 1 causal variant assumption. Among the tumor samples (N=524), 22.4% of loci are fine-mapped within 10 variants with PLASMA-JI, while 3.1% of loci are fine-mapped within 10 variants with QTL-Only fine-mapping (Table S1a). Furthermore, PLASMA-JI achieves a median credible set size of 45 variants, whereas QTL-Only achieves a median credible set size of 289 variants (Table S2a). We also see a significant improvement over CAVIAR-ASE, which has 4.7% of loci fine-mapped within 10 causal variants, and a median credible set size of 292. Results for normal samples (N=70) have a similar trend, with 7.0%, 0.2%, and 0.5% of loci fine-mapped within 10 causal variants, for PLASMA, QTL-Only, and CAVIAR-ASE respectively (Table S1b). The corresponding median credible set sizes are 67, 374, and 348 variants, respectively (Table S2b). The lower power for all models is due to having fewer normal samples than tumor samples. To show that these credible set sizes are robust to our choice of heritability hyperparameters, we also ran the full set of tumor genes with the AS heritability hyperparameter set to 0.05 instead of 0.5. A comparison of the credible set sizes with those from the original parameters is shown in Figure S3 and Table S3.

To investigate how the methods perform at lower sample sizes, we randomly subsample individuals prior to fine-mapping. Figure 6 plots the credible set size distributions for PLASMA-JI, QTL-Only, and CAVIAR-ASE at various sample sizes. In terms of loci fine-mapped to credible set sizes within 10 variants in tumor (Table S4a), PLASMA with 50 samples has approximately the same power as QTL-Only fine-mapping with 500 samples. In terms of median credible set size, PLASMA with 10 samples has about the same power as QTL-Only fine-mapping with 500 samples (Table S4c). All methods increase in credible set size as the sample size is restricted but the relative gain of PLASMA over the other methods decreases across sample sizes. PLASMA yields a 6.4-fold decrease in median credible set size over QTL-Only fine-mapping at 524 samples, but a 1.3-fold decrease at 10 samples (Table S4c). This implies that PLASMA scales more effectively with sample size than conventional QTL fine-mapping. Nevertheless, PLASMA yields a substantial reduction of credible set sizes even with sample sizes as low as 10, with a median credible set size of 287 in tumor, compared to a median set size of 381 with QTL-Only fine-mapping. We furthermore see in Figure 6b that at a given sample size, PLASMA has higher power for normal samples than for tumor samples, which we believe is due to the lower variance in the normal data.
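The fold reductions quoted in the text follow directly from the reported median credible set sizes. The short check below recomputes them from the medians stated for kidney tumor RNA-Seq (289 versus 45 at 524 samples, and 381 versus 287 at 10 samples); the helper name is hypothetical.

```python
def fold_reduction(baseline_median, improved_median):
    """Ratio of the baseline median credible set size to the improved one."""
    return baseline_median / improved_median


# QTL-Only versus PLASMA medians reported for kidney tumor RNA-Seq.
print(round(fold_reduction(289, 45), 1))   # all 524 samples
print(round(fold_reduction(381, 287), 1))  # subsample of 10
```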

Next, we look at how causal variant prioritization is impacted by sample size in the down-sampled analysis. Because we do not know the true causal variants in each locus, as a proxy we use markers with a posterior probability of at least 0.1 when fine-mapped with the QTL-Only method on all samples. We note that this will strongly bias the credible set in favor of the QTL model and thus do not compare to the QTL-Only model. In Figure S4, we again see that PLASMA is more effective than CAVIAR-ASE at each sample size. In terms of loci fine-mapped to within 10 variants in tumor (Table S4a), PLASMA with 100 samples has greater power than CAVIAR-ASE fine-mapping with 500 samples. At a given sample size, PLASMA is thus better able to prioritize variants that will be ranked highly in larger studies.

Lastly, we look at how PLASMA prioritizes experimentally verified causal variants at GWAS risk loci. Figure 7 shows the strength of AS and QTL associations for DPF3 and SCARB1, genes in two kidney GWAS loci that have verified causal variants [23, 26]. At each sample size threshold, the AS statistic more confidently identifies the true causal variant than the QTL statistic. In the case of DPF3, the AS statistic is able to prioritize the true causal variant at a substantially lower sample size than the QTL statistic. Moreover, we see that the 95% credible sets from the PLASMA-AS model are smaller than those from the QTL-Only model at a given sample size. By producing a more accurate and confident prioritization of causal variants, PLASMA can substantially reduce the difficulty of experimentally validating causal variants.

### 2.6 Fine-mapping of prostate H3K27ac ChIP-Seq data

To evaluate PLASMA with a different molecular phenotype, we fine-mapped H3K27ac activity measured by ChIP-Seq from 24 human prostate tumor samples and 24 matched normals. Although this study measures chromatin activity rather than expression, the nature of the data is nearly identical to that of RNA-Seq and is processed analogously by our QC pipeline and by PLASMA. Instead of fine-mapping eQTLs around gene loci, we fine-mapped chromatin QTLs (cQTLs) around chromatin peaks. Figure 8 shows distribution plots for tumor and normal data under a 1 causal variant assumption. Among the tumor loci, 15.2% of loci are fine-mapped within 50 variants with PLASMA-JI, while 1.9% of loci are fine-mapped within 50 variants with QTL-Only (Table S5a). Furthermore, PLASMA achieves a median credible set size of 226, compared to QTL-Only fine-mapping achieving a size of 322 (Table S6a). PLASMA also outperforms CAVIAR-ASE, which has 1.9% of loci fine-mapped within 50 causal variants (no gain over QTL-Only) and a median credible set size of 310. Results from normal samples are similar, with 10.2%, 2.4%, and 2.6% of loci fine-mapped within 50 causal variants, for PLASMA, QTL-Only, and CAVIAR-ASE respectively (Table S5b). These methods achieve median credible set sizes of 232, 321, and 313 variants, respectively (Table S6b). Overall, these ChIP fine-mapping results are roughly in line with those from RNA-Seq fine-mapping.

### 2.7 PLASMA increases functional enrichment of credible set markers

To evaluate PLASMA’s ability to select markers in functional regions using kidney RNA-Seq data, we look for enrichment of prioritized variants at open chromatin regions measured with DNase-Seq in a kidney cell line [27]. Since chromatin accessibility is an indicator of transcription factor binding and regulation [28], an enrichment of credible set markers for open chromatin would indicate that the fine-mapping procedure is prioritizing markers in functionally relevant regions. For instance, the causal variant in the DPF3 locus lies within a DNase-Seq peak (Figure 7a). We note that quantifying overlap with an independent functional feature such as open chromatin imposes no assumptions on the ground truth, in contrast to comparing to external QTL/GWAS data, which may be biased towards QTL-Only analysis. We define the null distribution as the credible set markers being located independently of open chromatin and use Fisher’s exact test to calculate enrichment as a function of minimum causal variant probability. Figures 9a and b, and Tables S7 and S8 show the p-values and odds ratios, respectively (computed by Fisher’s exact test), as a function of posterior probability threshold from each fine-mapping method. We see that the credible set markers produced by PLASMA, for the most part, display a significantly stronger enrichment with open chromatin compared to existing methods in terms of both p-values and odds ratios. For instance, at the *p* = 0.1 threshold for tumor samples, PLASMA’s credible set markers achieve a p-value of 5.61 × 10^{-57} and an odds ratio of 2.30. In comparison, credible sets from QTL-Only fine-mapping at that threshold achieve a p-value of 1.50 × 10^{-5} and an odds ratio of 1.58. This enrichment shows that even with far smaller credible sets, PLASMA is able to prioritize markers that fall in regions of likely functional significance. The difference between PLASMA and existing methods is greatest at higher posterior probability thresholds. We believe that this is due to PLASMA assigning a more meaningful number of markers with such high posterior probabilities, compared to existing methods that are rarely so confident about a marker’s causal status.
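The enrichment calculation described above can be sketched with a 2×2 table of markers cross-classified by credible set membership and overlap with open chromatin. The code below computes the sample odds ratio and a one-sided (enrichment) Fisher's exact p-value from the hypergeometric distribution. The choice of a one-sided test, the function name, and the example counts are assumptions of this illustration; the text does not state which alternative was used.

```python
from math import comb


def fisher_enrichment(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in a 2x2 table.

    a: in credible set and in peak      b: in credible set, not in peak
    c: not in credible set, in peak     d: neither
    Returns (odds_ratio, p_value), where p_value = P(X >= a) under the
    hypergeometric null of independence.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    p_value = tail / comb(n, row1)
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, p_value


# Hypothetical counts: 30 of 100 credible markers fall in peaks,
# versus 150 of 900 remaining markers.
odds, p = fisher_enrichment(30, 70, 150, 750)
print(round(odds, 2), p)
```

Repeating this calculation at each posterior probability threshold gives curves of the kind summarized in Figure 9.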

Similarly, to validate the credible sets computed from prostate ChIP-Seq data, we look for enrichment of credible set markers at chromatin looping anchors measured by Hi-ChIP in a prostate cell line. Regulatory elements overlapping loops are more likely to be involved in cis-regulation, and we reasoned that they should therefore be enriched for true causal cQTLs [29, 30]. Again, we note that this functional feature is independent of the QTL signal or locus LD and is not biased towards a QTL or AS model. Figure 9c and Table S9 show the p-values and odds ratios, respectively, across models as a function of posterior probability threshold (computed by Fisher’s exact test). Comparing the methods, we see that the credible set markers produced by PLASMA display a significantly stronger enrichment with looping anchors compared to the other methods in terms of both p-values and odds ratios. For instance, at the *p* = 0.1 threshold, PLASMA’s credible sets achieve a p-value of 9.13 × 10^{-7} and an odds ratio of 1.76. In contrast, credible set markers from QTL-Only fine-mapping at that threshold achieve a p-value of 0.37 and an odds ratio of 0.34.

## 3 Discussion

We present PLASMA, a statistical fine-mapping method that utilizes allele-specific expression and phased genotypes to select candidate causal variants. By modeling gene expression at a locus in an allele-specific manner, PLASMA scales in power both across individuals and across read counts. Through read-count-level simulations of loci, we show that PLASMA performs robustly across a wide range of realistic conditions and consistently outperforms existing statistical fine-mapping methods, including cases where a significant amount of observed imbalance is caused by non-genetic factors. We further demonstrate this increased power on experimental data by applying PLASMA to a large RNA-Seq study, as well as a smaller ChIP-Seq study. In both cases, PLASMA achieves substantially smaller credible set sizes compared to existing fine-mapping methods, greatly increasing the number of loci amenable to experimental causal variant validation. Lastly, we show that even with these greatly reduced (more specific) credible set sizes, PLASMA achieves a degree of functional enrichment equivalent or superior to that of existing methods. These results not only present PLASMA as a powerful tool for prioritizing causal variants, but also demonstrate how AS analysis can be directly integrated into statistical fine-mapping. A key benefit of PLASMA is its ability to utilize existing, conventional sequencing-based QTL data, such as RNA-Seq, ChIP-Seq, and ATAC-Seq, at low sample size. This allows researchers to gain significant insight simply by revisiting past QTL studies, especially those with sample sizes too low for conventional QTL fine-mapping.

Although it is evident that an AS analysis with PLASMA confers more signal than an equivalently-sized QTL analysis, AS analysis presents additional obstacles and potential confounders. First, unlike conventional QTL fine-mapping methods that rely only on allelic dosage, PLASMA additionally utilizes genotype phasing, making phasing accuracy a potential concern. However, since PLASMA focuses on cis-regulation, the genotypes observed span no more than several hundred kilobases per locus, well within the high accuracy range of modern phasing algorithms [31]. Second, PLASMA depends on having heterozygous individuals in the tested feature and SNP in order to leverage AS signal. In our analyses we focused on features that were testable by AS (10946 of 19645 total genes, 113459 of 525629 total peaks). However, even in the complete absence of heterozygotes, PLASMA can still conduct conventional fine-mapping based on dosage and total expression. Recent technologies that could potentially offer greater signal include RNA-seq with unspliced transcripts [32], and direct allele-specific measurement of expression using single-cell RNA-Seq [33]. Third, PLASMA assumes the same causal configuration underlying both the AS and QTL effects (and is thus able to combine the signals), but the causal effects may differ due to real biological confounding. For example, cis effects on gene A followed by (local) trans effects of gene A on gene B would be identified as a QTL association, but would not exhibit AS association. This would be a model violation for PLASMA and produce larger credible set sizes. Although PLASMA can consider correlations between causal AS and QTL effect sizes, this parameter is hard to estimate, and we find in real data that the model with correlation set to zero (PLASMA-JI) exhibited greater power than a non-zero constant. Future work is required to fully elucidate the relationship between allele-specific and total effects, which likely differs across genes.
Fourth, genomic imprinting (where either the maternal or paternal copy of the gene is silenced) or random monoallelic expression would produce the appearance of allelic imbalance within affected individuals in the absence of true cis-regulatory signal [20]. Although PLASMA does not explicitly model such biases, a bias that is independent of genotype will only cause a reduction in power and not produce false-positives. A potential extension would be to model such violations or discrepancies between the QTL and AS models directly, following the lines of methods such as RASQUAL [20]. Fifth, PLASMA currently does not incorporate covariate analysis in the allele-specific model (though the intra-individual nature of the test controls for false positives), which could additionally be used to model environmental confounders and increase power [34]. AS covariate analysis could potentially be achieved through a multivariate likelihood ratio test as in WASP [19].

PLASMA’s approach in combining QTL and AS signals opens up possible future work in two distinct directions. The first direction would be to build upon the generative fine-mapping model to incorporate additional sources of signal. For example, one can incorporate epigenomic annotation data by setting the priors for causality for each marker. Approaches used in existing QTL-based methods such as PAINTOR and RiVIERA-MT [16, 35] could be transferred to PLASMA with relatively little difficulty. Another possibility would be to conduct N-phenotype colocalization by utilizing additional phenotypes in addition to the AS and QTL phenotypes. Generalizing from two to multiple phenotypes would be straightforward, and could utilize the colocalization algorithm first introduced in eCAVIAR [2]. A second, more general direction would be to adapt QTL-based population genetics methods to utilize AS summary statistics. Since both QTL and AS statistics can be characterized as linear combinations of haplotype-level genotypes, they share many distributional properties, including LD, allowing them to be easily interchangeable in many circumstances. One such application would be gene expression prediction for transcriptome-wide association studies (TWAS) [36], where the increased signal of AS statistics could increase power to identify gene-phenotype relationships. Broadly speaking, the allele-specific model and association statistics that PLASMA introduces will be relevant to any analysis of small sample size or limited tissue.

## 4 Methods

### 4.1 Modeling QTL and AS summary statistics

Marginal QTL effect sizes for a given locus are calculated under the conventional linear model of total gene expression, with the allelic dosage (**x**) as the independent variable, and the total expression (**y**) as the dependent variable. Let us consider a QTL study of a given locus with *n* individuals and *m* markers. Let **y** be an (*n* × 1) vector of total expression across the individuals, recentered at zero. Given a marker *i*, let **x**_{i} be a zero-recentered vector of dosage genotypes. We define *β*_{i}, the genetic effect of marker *i* on total gene expression, as follows:

$$\mathbf{y} = \mathbf{x}_i \beta_i + \boldsymbol{\epsilon}_i$$

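We use the maximum likelihood estimator of *β*_{i}, equivalent to the ordinary-least-squares linear regression estimator:

$$\hat{\beta}_i = (\mathbf{x}_i^\top \mathbf{x}_i)^{-1} \mathbf{x}_i^\top \mathbf{y}$$

We define our QTL summary statistic (Wald statistic) for marker *i* as:

$$\hat{z}_{\beta,i} = \frac{\hat{\beta}_i}{\sqrt{\widehat{\mathrm{Var}}(\hat{\beta}_i)}} = \frac{\mathbf{x}_i^\top \mathbf{y}}{\hat{\sigma}_{\epsilon,i} \sqrt{\mathbf{x}_i^\top \mathbf{x}_i}}$$

where the residual variance *σ̂*^{2}_{ε,i} is calculated from the residuals.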
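As an illustration, the marginal QTL statistic for a single marker can be sketched in a few lines of NumPy. This is a minimal sketch rather than the PLASMA implementation: the function name `qtl_wald_statistic` is hypothetical, and the use of *n* − 2 degrees of freedom for the residual variance is an assumption.

```python
import numpy as np

def qtl_wald_statistic(x, y):
    """Return (beta_hat, z) for one marker from dosages x and total expression y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Recenter both vectors at zero, as in the model description.
    x = x - x.mean()
    y = y - y.mean()
    xtx = x @ x
    # Ordinary-least-squares estimate of the effect size.
    beta_hat = (x @ y) / xtx
    # Residual variance; n - 2 degrees of freedom is an assumption of this sketch.
    resid = y - x * beta_hat
    sigma2 = (resid @ resid) / (len(y) - 2)
    # Wald statistic: effect size over its estimated standard error.
    z = beta_hat / np.sqrt(sigma2 / xtx)
    return beta_hat, z
```

Because the statistic is a ratio of the effect size to its standard error, multiplying the expression vector by a constant leaves `z` unchanged.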

AS effect sizes are calculated under a weighted linear model, with the phasing (**v**) as the independent variable, and the allelic imbalance (**w**) as the dependent variable. We model allele-specific expression under the observation that a cis-regulatory variant often has a greater influence on the gene allele of the same haplotype. We define a marker’s phase *v* as 1 if haplotype *A* contains the alternative marker allele, −1 if haplotype *B* contains the alternative marker allele, and 0 if the individual is homozygous for the marker. Let *w* be the log expression ratio between haplotypes *A* and *B*, *ϕ*_{i} be the AS effect size of variant *i*, and *ζ*_{i} be the residual, interpreted as the log baseline expression ratio between haplotypes *A* and *B*. We additionally define a sampling error *τ* for each individual, quantifying the quality of data from the sample. The genetic effect of marker *i* on allele-specific expression is as follows:

$$\hat{\mathbf{w}} = \mathbf{v}_i \phi_i + \boldsymbol{\zeta}_i + \boldsymbol{\tau}$$

Experimentally-derived AS data, such as RNA-Seq data, yield reads that are mapped to a particular haplotype. For a given individual *j*, we define *c*_{A,j} as the allele-specific read count from haplotype *A*. We model the allele-specific read count as drawn from a beta-binomial distribution, given the total mapped read count *c*_{j}:

$$c_{A,j} \mid c_j \sim \mathrm{BetaBinom}(c_j, \alpha_j, \beta_j)$$

We use this beta-binomial model to estimate the variance of the sampling error *τ*_{j}:

$$\widehat{\mathrm{Var}}(\tau_j) = \frac{2}{c_j} \left(1 + \cosh \hat{w}^{*}_{j}\right) \left(1 + \rho_{e,j} (c_j - 1)\right)$$

where *ρ*_{e,j} is the overdispersion and *ŵ**_{j} is an adjusted estimator of *w*_{j} that reduces the bias of the variance estimate (full derivation in Supplementary Methods).

Due to heteroscedasticity among individuals, we estimate the AS effect size *ϕ*_{i} in a weighted manner, giving larger weights to individuals with lower estimated sampling error. Given individual *j*, we define the weight for *j* as the inverse of the estimated sampling error variance:

$$\omega_j = \frac{1}{\widehat{\mathrm{Var}}(\tau_j)}$$

We define our weight matrix **Ω** as a diagonal matrix with *Ω*_{j,j} = *ω*_{j}. We use the weighted-least-squares estimator for *ϕ*_{i}:

$$\hat{\phi}_i = (\mathbf{v}_i^\top \boldsymbol{\Omega} \mathbf{v}_i)^{-1} \mathbf{v}_i^\top \boldsymbol{\Omega} \hat{\mathbf{w}}$$

With this estimator, we define the AS association statistic for marker *i* as the AS effect size divided by its estimated standard error (full derivation in Supplemental Methods):

$$\hat{z}_{\phi,i} = \frac{\hat{\phi}_i}{\sqrt{\widehat{\mathrm{Var}}(\hat{\phi}_i)}}$$
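The weighting scheme can be illustrated with a short NumPy sketch. The function names are hypothetical. The variance formula is the standard delta-method approximation for the logit of a beta-binomial proportion, and the shrinkage applied to the log ratio before evaluating it is an assumption of this sketch, not necessarily the exact adjustment derived in the Supplementary Methods. Both haplotypes are assumed to have nonzero read counts.

```python
import numpy as np

def sampling_variance(c_a, c_total, rho_e):
    """Approximate Var(tau_j) for the log ratio log(c_A / c_B) of each individual."""
    c_a = np.asarray(c_a, dtype=float)
    c_total = np.asarray(c_total, dtype=float)
    # Observed log expression ratio between haplotypes A and B.
    w_hat = np.log(c_a) - np.log(c_total - c_a)
    # Variance inflation from beta-binomial overdispersion.
    inflation = 1.0 + rho_e * (c_total - 1.0)
    # Shrunken log ratio (assumed form, based on sinh(w) ~ w near zero).
    w_adj = w_hat / (1.0 + inflation / c_total)
    var_tau = 2.0 / c_total * (1.0 + np.cosh(w_adj)) * inflation
    return w_hat, var_tau

def as_effect_wls(v, w_hat, var_tau):
    """Weighted-least-squares AS effect size with inverse-variance weights."""
    v = np.asarray(v, dtype=float)
    w_hat = np.asarray(w_hat, dtype=float)
    omega = 1.0 / np.asarray(var_tau, dtype=float)
    return ((v * omega) @ w_hat) / ((v * omega) @ v)
```

Individuals with deep coverage receive small `var_tau` and therefore dominate the estimate, while homozygous individuals (phase 0) contribute nothing to either sum.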

### 4.2 Inference of credible sets and posterior probabilities

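PLASMA defines a joint generative model for total (QTL) and haplotype-specific (AS) effects on expression. We define **ẑ** as the combined vector of AS association statistics and QTL association statistics:

$$\hat{\mathbf{z}} = \begin{bmatrix} \hat{\mathbf{z}}_{\phi} \\ \hat{\mathbf{z}}_{\beta} \end{bmatrix}$$

Let **R**_{z} be the genotype LD matrix, and *r*_{βϕ} be a hyperparameter describing the overall correlation between the QTL and AS summary statistics calculated across all loci. We define the combined correlation matrix **R** as:

$$\mathbf{R} = \begin{bmatrix} \mathbf{R}_z & r_{\beta\phi} \mathbf{R}_z \\ r_{\beta\phi} \mathbf{R}_z & \mathbf{R}_z \end{bmatrix}$$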

We model the joint distribution as multivariate normal, with covariance **R**:
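A minimal sketch of assembling the combined correlation matrix is given here. It assumes a block structure in which the within-trait blocks equal the LD matrix and the cross-trait blocks equal the LD matrix scaled by *r*_{βϕ}; the function name `joint_correlation` is hypothetical.

```python
import numpy as np

def joint_correlation(R_z, r_bphi):
    """Build the 2m x 2m correlation matrix of stacked AS and QTL statistics."""
    R_z = np.asarray(R_z, dtype=float)
    # Diagonal blocks: LD among markers within each trait.
    # Off-diagonal blocks: LD scaled by the AS/QTL statistic correlation (assumed form).
    return np.block([[R_z, r_bphi * R_z],
                     [r_bphi * R_z, R_z]])
```

Setting `r_bphi` to zero yields a block-diagonal matrix, which corresponds to treating the AS and QTL statistics as independent.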

We introduce a likelihood function that gives the probability of statistics **ẑ**, given a causal configuration. We define a causal configuration **c** as a vector of causal statuses corresponding to each marker, with 1 being causal and 0 being non-causal. We assume that the causal configuration is the same for the QTL and AS signals.

We define two hyperparameters as the variances of the AS and QTL causal effect sizes, respectively, and *r*_{c,βϕ} as the underlying correlation of the causal QTL and AS effect sizes. (This is not to be confused with *r*_{βϕ}, which concerns the correlation between the association statistics. See Supplementary Methods for a mathematical relationship between these two hyperparameters.) We show that these three hyperparameters are closely related to the heritability of gene expression (Supplemental Methods). We define **Σ**_{c}, the covariance matrix of causal effect sizes, given a causal configuration:

We define our likelihood for a causal configuration as:

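Let *γ* be the prior probability that a single variant is causal and 1 − *γ* be the probability that a variant is not causal. We define the prior probability of a configuration **c** over *m* variants as:

$$P(\mathbf{c}) = \gamma^{\lVert \mathbf{c} \rVert_1} (1 - \gamma)^{m - \lVert \mathbf{c} \rVert_1}$$

With the prior and likelihood, we define the posterior probability of a causal configuration, normalized across the set of all possible configurations ℂ:

$$P(\mathbf{c} \mid \hat{\mathbf{z}}) = \frac{P(\hat{\mathbf{z}} \mid \mathbf{c}) \, P(\mathbf{c})}{\sum_{\mathbf{c}' \in \mathbb{C}} P(\hat{\mathbf{z}} \mid \mathbf{c}') \, P(\mathbf{c}')}$$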
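The prior and the normalization step can be sketched as follows. This is an illustrative sketch only: configurations are represented as frozensets of causal marker indices, the log-likelihoods are assumed to be supplied by the caller, and the function names are hypothetical.

```python
import math

def log_prior(n_causal, m, gamma):
    """Log prior of a configuration with n_causal causal markers out of m."""
    return n_causal * math.log(gamma) + (m - n_causal) * math.log(1.0 - gamma)

def normalize_posteriors(log_likelihoods, m, gamma):
    """Map {configuration: log-likelihood} to normalized posterior probabilities."""
    log_post = {c: ll + log_prior(len(c), m, gamma)
                for c, ll in log_likelihoods.items()}
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_post.values())
    total = sum(math.exp(v - top) for v in log_post.values())
    return {c: math.exp(v - top) / total for c, v in log_post.items()}
```

With a small `gamma`, configurations with many causal markers are penalized unless their likelihood is correspondingly larger.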

We define the *ρ*-level credible set 𝒮 as the smallest set of markers with a *ρ*_{c} probability of including all causal markers. We define ℂ_{𝒮} as the set of all causal configurations whose causal markers are a subset of 𝒮, excluding the null set. We calculate *ρ*_{c} as the sum of the probabilities of the configurations in ℂ_{𝒮}:

$$\rho_c = \sum_{\mathbf{c} \in \mathbb{C}_{\mathcal{S}}} P(\mathbf{c} \mid \hat{\mathbf{z}})$$

Additionally, we define a marker’s posterior inclusion probability (PIP) as the probability that a single given marker is causal, marginalized over all other markers. We calculate this probability by summing over all configurations containing the marker.

To reduce the number of configurations to evaluate in the case of multiple causal variants, we use the heuristic that configurations with significant probabilities tend to be similar to each other. We use a shotgun stochastic search procedure to find all configurations with a significant probability. For each iteration of the algorithm, the next configuration is drawn randomly from the neighborhood of similar configurations, weighted by the posterior probability of each candidate. Upon termination, we assume that all configurations with nonzero probability have been uncovered.

Given the large number of configurations evaluated, it is impractical to calculate the best possible credible set satisfying *ρ*_{c}. Instead, we use a greedy approximation algorithm. At each step, before *ρ*_{c} is reached, the algorithm adds the marker that increases the confidence the most.

### 4.3 Generation of simulated loci

Genotype data were sampled from phased SNP data using the CEU population in the 1000 Genomes Project. First, a contiguous section of markers in Chromosome 22 is randomly chosen from the genotypes. Next, a subset of samples is randomly selected from the section. The genotypes corresponding to the chosen samples yield two haplotype matrices, which we denote **H**_{a} and **H**_{b}.

Among the markers, the desired number of causal markers is randomly selected. In the case of multiple causal variants, each causal marker is assigned a relative effect size, sampled from a normal distribution with zero mean and unit variance. For each individual, we calculate **q**_{a} and **q**_{b}, the ideal un-scaled gene expression for each haplotype, by multiplying the relative effect sizes with each haplotype matrix.

With this haplotype-specific expression, we simulate read count data. In real data, only a fraction of the reads can be mapped to a specific haplotype. Due to this difference between total reads and mapped reads, we calculate the allelic imbalance and the total read count (QTL) separately.

To calculate total read count data, we model total ideal un-scaled expression **q**_{t} as **q**_{a} + **q**_{b}, the sum of the haplotype-specific un-scaled gene expression. We then add Gaussian-distributed noise so that the variance of **q**_{t} is consistent with the total variance across samples as specified by the QTL heritability. Finally, we scale this final expression so that the total expression across samples is of unit variance. We do not explicitly generate total read counts, since a multiplicative factor across samples does not influence the QTL association statistics calculated by the model. This is reflective of typical QTL study protocols which aggressively rank/quantile normalize the data to fit a normal distribution.

To calculate allele-specific read counts, we take into account heritability, mean read coverage, and the total variance of the AS phenotype. We model the ideal allelic imbalance phenotype as logit (calculated element-wise). We then add Gaussian-distributed noise so that the signal-to-noise ratio of the phenotype’s variance is consistent with the specified AS heritability. This noisy phenotype is then scaled to the specified total variance. The read coverage for each sample is then drawn from a Poisson distribution, given the mean read coverage. Lastly, allele-specific read counts are generated from these phenotypes, with the counts for each sample being drawn from a beta-binomial distribution.
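The last two sampling steps (Poisson coverage followed by beta-binomial read counts) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `simulate_as_counts` is hypothetical, the phenotype `w` is taken to be the log expression ratio between haplotypes, and the overdispersion is parameterized as *ρ*_{e} = 1/(*α* + *β* + 1) with 0 < *ρ*_{e} < 1.

```python
import numpy as np

def simulate_as_counts(w, mean_coverage, rho_e, rng):
    """Draw haplotype read counts (c_A, c_B) from noisy log-ratio phenotypes w."""
    w = np.asarray(w, dtype=float)
    # Per-sample mapped read coverage.
    coverage = rng.poisson(mean_coverage, size=w.shape)
    # Expected allelic fraction of haplotype A (inverse logit of the log ratio).
    pi = 1.0 / (1.0 + np.exp(-w))
    # Beta parameters from (pi, rho_e); assumes rho_e = 1 / (alpha + beta + 1).
    scale = 1.0 / rho_e - 1.0
    p = rng.beta(pi * scale, (1.0 - pi) * scale)
    # A beta-binomial draw is a binomial draw with a beta-distributed probability.
    c_a = rng.binomial(coverage, p)
    return c_a, coverage - c_a
```

For example, `simulate_as_counts(np.zeros(200), 100, 0.01, np.random.default_rng(1))` produces balanced counts on average, with sample-to-sample spread governed by the coverage and overdispersion.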

### 4.4 Quality control of genotype data

For TCGA data, germline genotype calls were downloaded from the Genomic Data Commons. For PrCa ChIP samples, germline genotypes were called from blood as described in Ref. [37]. Genotypes were imputed to the Haplotype Reference Consortium [38] using the Michigan Imputation Server [39] and restricted to variants with INFO greater than 0.9 and MAF greater than 0.01. Variants were further restricted to QC-passing SNPs from Ref. [38] which represent common, well-mapped variants from the 1000 Genomes project.

### 4.5 Quality control of RNA-seq data

Raw RNA-seq BAM files were downloaded from the Genomic Data Commons. Initial RNA-seq mapping and alignment was performed following TCGA parameters for the STAR aligner [40]. Mapping bias was accounted for by re-mapping using the WASP pipeline [19] and the STAR aligner with the same parameters. Reads were randomly de-duplicated as recommended by the WASP pipeline.

Somatic copy number calls were downloaded from FireBrowse and local beta-binomial overdispersion parameters were estimated for each contiguous region of copy number change.

### 4.6 Quality control of ChIP-seq data

ChIP-seq experiments were performed as described in Ref. [37]. Reads were aligned using bwa and default parameters [41], and peaks were called using MACS2 and default parameters (with DNA-seq input provided as control) [42]. Peaks were then unified across all samples. Mapping bias was accounted for by re-mapping using the WASP pipeline and the bwa aligner with the same parameters. Reads were randomly de-duplicated as recommended by the WASP pipeline. Beta binomial overdispersion parameters were estimated globally for each sample as somatic copy number was expected to be minimal.

### 4.7 Allele-specific quantification

The StratAS algorithm was used to quantify allele-specific signal and identify initially significant features for fine-mapping [23]. For each peak/gene (the feature) and individual, all reads at heterozygous SNPs in the feature were aggregated to compute the haplotype-specific read counts, and summed across the two haplotypes of each individual to compute the QTL read counts. Each QC-passing variant within 100 kb of the feature was then tested for an allele-specific association with the feature, and features that were significant at a genome-wide false discovery rate (FDR) of 5% were retained for fine-mapping.

### 4.8 Functional enrichment analysis

For QTLs fine-mapped from RNA-seq, we selected regions of accessible chromatin in the most relevant tissue as the reference functional feature, reasoning that high-confidence causal variants should be more abundant in accessible regions. For QTLs fine-mapped from ChIP-seq, we selected chromosome looping anchors from Hi-ChIP in the relevant tissue as the reference functional feature, reasoning that high-confidence causal variants should be more abundant in regions that are in conformation with promoters.

Enrichment was then estimated by computing the proportion of markers in credible sets that intersect with the functional feature. Controls were calculated as the intersection between all tested markers and the functional feature. Odds ratios and p-values were computed with Fisher’s exact test.
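The enrichment computation can be illustrated with a small, dependency-free sketch of Fisher's exact test on a 2 × 2 table. The layout of the table (credible-set markers versus the remaining tested markers, overlapping versus not overlapping the functional feature) and the use of a one-sided test for enrichment are assumptions of this sketch, as is the function name `enrichment_test`.

```python
from math import comb

def enrichment_test(set_hit, set_miss, other_hit, other_miss):
    """Odds ratio and one-sided Fisher p-value for enrichment of the first row."""
    a, b, c, d = set_hit, set_miss, other_hit, other_miss
    # Sample odds ratio; infinite when the off-diagonal product is zero.
    odds_ratio = (a * d) / (b * c) if b * c > 0 else float("inf")
    n_row = a + b          # markers in credible sets
    n_hit = a + c          # markers overlapping the functional feature
    total = a + b + c + d
    # Hypergeometric upper tail: probability of at least `a` overlaps by chance.
    upper = min(n_row, n_hit)
    tail = sum(comb(n_hit, k) * comb(total - n_hit, n_row - k)
               for k in range(a, upper + 1))
    return odds_ratio, tail / comb(total, n_row)
```

For instance, `enrichment_test(3, 1, 1, 3)` gives an odds ratio of 9 and a one-sided p-value of 17/70.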

## Supplemental Methods

### 5.1 Modeling total expression at a locus (QTL)

#### 5.1.1 Modeling genetic effects on total expression

We calculate marginal effect sizes for a given locus under the conventional linear model of total gene expression. Let us consider a QTL study of a given locus with *n* individuals and *m* markers. Let **y** be an (*n* × 1) vector of total expression across the individuals, recentered at zero. Given a marker *i*, let **x**_{i} be an (*n* × 1) zero-recentered vector of genotypes. We define *β*_{i}, the genetic effect of marker *i* on total gene expression, as follows:

$$\mathbf{y} = \mathbf{x}_i \beta_i + \boldsymbol{\epsilon}_i$$

We model the residuals **ϵ**_{i} as normally distributed with variance *σ*^{2}_{ε,i}.

#### 5.1.2 Calculation of QTL summary statistics

We use the maximum likelihood estimator of *β _{i}*, equivalent to the ordinary-least-squares linear regression estimator:

Under the null model where *i* is not causal, *i* does not explain any amount of variation of the phenotype, and the variance of **y** is simply *σ*^{2}_{ε,i}. Thus, under the null:

We estimate *σ*^{2}_{ε,i} from the residuals:

We thus define our QTL summary statistic (Wald statistic) for marker *i* as:

We assume that the number of individuals is enough such that the observed statistic is normally distributed with unit variance:

In the case where **x**_{i} is of unit variance, the statistic simplifies to:

### 5.2 Modeling allele-specific expression at a locus (AS)

#### 5.2.1 Modeling haplotype-specific effects on expression

We model allele-specific expression under the observation that a cis-regulatory variant often has a greater influence on the gene allele of the same haplotype. Under this model, an individual who is heterozygous for one or more cis-regulatory markers will show an imbalance in expression between the alleles.

From a quantitative perspective, let us consider a single locus in a single individual who is heterozygous for marker *i*. Let 0 and 1 represent the wild-type and alternative marker alleles, respectively. We define *e*_{0} as the expression of the gene allele on the same phase as marker allele 0, and *e*_{1} as the expression of the gene allele on the same phase as marker allele 1. Each gene allele also has a baseline expression, which is its expression without the effect of marker *i*. We define *δ*_{i} as the cis-regulatory strength of marker allele 1 over marker allele 0 such that:

If we define *i*’s phase, *v*_{i}, we can arbitrarily assign haplotypes *A* and *B*. The above equation then becomes:

The marker’s phase is 1 if haplotype *A* contains the alternative marker allele, −1 if haplotype *B* contains the alternative marker allele, and 0 if the individual is homozygous for the marker.

We now re-write Equation 28 as a linear model. Let *w* be the log expression ratio between haplotypes A and B:

Let *ϕ _{i}* be the log allelic fold change (logAFC) caused by variant i:

Let *ζ _{i}* be the log baseline expression ratio between haplotypes A and B:

With these parameters we rewrite Equation 28 as:

Given *n* individuals, this expression becomes:

We assume that **ζ**_{i} is drawn from a normal distribution with variance *σ*^{2}_{ζ,i}. Note that under this model, *ϕ*_{i} can be interpreted as the effect size of marker *i* on allelic imbalance, with **ζ**_{i} as the residuals. Furthermore, assuming no haplotype bias, both **w** and **v**_{i} are zero-centered in expectation.

Experimentally-derived AS data, such as RNA-Seq data, yield reads that are mapped to a particular haplotype. Given *c*_{A} and *c*_{B}, the read counts mapped to haplotypes *A* and *B* respectively, we define our estimator of *w* as:

$$\hat{w} = \log \frac{c_A}{c_B}$$

For a given individual *j*, we define *c*_{A,j} as the allele-specific read count from haplotype *A*. We model the allele-specific read count as drawn from a beta-binomial distribution, given the total mapped read count *c*_{j}:

$$c_{A,j} \mid c_j \sim \mathrm{BetaBinom}(c_j, \alpha_j, \beta_j)$$

We define *π*_{j} as the expected proportion of read counts (allelic fraction) from haplotype *A*:

$$\pi_j = \frac{\alpha_j}{\alpha_j + \beta_j}$$

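*α*_{j} and *β*_{j} can be re-parameterized in terms of *π*_{j} and the sampling overdispersion *ρ*_{e}.

With this re-parameterization, the mean and variance of *c*_{A,j} are given as follows:

$$\mathrm{E}[c_{A,j}] = c_j \pi_j, \qquad \mathrm{Var}(c_{A,j}) = c_j \pi_j (1 - \pi_j) \left(1 + \rho_e (c_j - 1)\right)$$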

We use this beta-binomial model to estimate the variance of *ŵ*_{j}. We scale the distribution by 1/*c*_{j} to get the mean and variance for the read count proportion:

We define *w** as the logit-transformed allelic fraction:

We can thus find the approximate mean and variance of *ŵ _{j}* given using Taylor expansions:

Note that *w* and *w** are not equivalent because . Equation 45 implies that *ŵ* is a biased estimator of *w**, especially at low read counts and/or high overdispersion. To get an estimator of *w** with reduced bias, we take the approximation that sinh(*w**) ≈ *w** around zero:

We use *ŵ** to find an estimator of , the variance of *ŵ*:

Given our estimator *ŵ*_{j}, we quantify the sampling error *τ*_{j} = *ŵ*_{j} − *w*_{j}. Thus, across individuals:

$$\hat{\mathbf{w}} = \mathbf{v}_i \phi_i + \boldsymbol{\zeta}_i + \boldsymbol{\tau}$$

#### 5.2.2 Calculation of AS summary statistics

Due to heteroscedasticity among individuals, we estimate the AS effect size *ϕ*_{i} in a weighted manner, giving larger weights to individuals with lower expected sampling error. Given individual *j*, we define the weight for *j* as the inverse of the estimated read count variance:

We define our weight matrix **Ω** as a diagonal matrix with **Ω**_{j,j} = *ω _{j}*.

We use the weighted-least-squares estimator for *ϕ _{i}*:

Under the null model where *i* is not causal, the variance of *w*_{j} is *σ*^{2}_{ζ,i}, and the variance of *ŵ*_{j} is *σ*^{2}_{ζ,i} + Var(*τ*_{j}). Thus, under the null:

We now estimate *σ*^{2}_{ζ,i} from the residuals. Note that we are estimating Var(*ζ*_{i}), but the residuals are **ζ**_{i} + **τ**, so we cannot directly use the variance of the residuals. We instead use the following estimator for *σ*^{2}_{ζ,i}:

We show that this estimator is equal to *σ*^{2}_{ζ,i} in expectation:

With this estimator, we define the AS association statistic for marker *i* as follows:

We assume that the observed statistic is normally distributed with unit variance:

To gain an intuitive understanding of the association statistic, let us examine it under simplifying conditions. We assume that **v**_{i} is of unit variance, that read count overdispersion is negligible, and that allelic imbalance and read coverage are fixed across individuals. Under these conditions, let the sampling error variance be *k*/*c* for coverage *c* and some constant *k*. Equation 55 simplifies to:

We can see that under high experimental noise (*k*/*c*), the denominator is dominated by the quality of data (read coverage). In contrast, when experimental noise is low, the denominator is dominated by the variance of **ζ**_{i}, determined by the inherent heritability of the locus’s AS phenotype.

### 5.3 Inference of causal variants with QTL and AS statistics

#### 5.3.1 Modeling the correlation of summary statistics among markers

Due to linkage disequilibrium, there exist significant correlations of genotypes among markers. This correlation is reflected in the correlations in the association statistics. Given a set of *m* markers, we model a set of association statistics **ẑ**_{α} for a locus as following a multivariate normal distribution with covariance **R**_{z}:

Note that because the statistics are all of unit variance, **R**_{z} is also the correlation matrix.

We now show that the correlation matrices for QTL and AS association statistics are both equivalent to the correlation matrix of the marker genotypes. Let **u**_{j,h} be the haploid 0/1 genotypes of markers on haplotype *h* of individual *j*. We assume that the genotypes are well-approximated by a multivariate normal distribution:

The uncentered diploid 0/1/2 genotypes **x′**_{j} can thus be expressed as the sum of two independent haploid genotypes of haplotypes *A* and *B*:

Likewise, the −1/0/1 marker phases **v**_{j} can be expressed as the difference of haploid genotypes:

Note that **x′**_{j} and **v**_{j} refer to the (uncentered) genotypes and phases across markers for a particular individual *j*. This is in contrast to **x**_{i} and **v**_{i} used earlier, which refer to the (centered) genotypes and phases across individuals for a particular marker *i*.

Examining Equations 21 and 24, we see that the QTL association statistic can be expressed as:
where *f*(**x**_{i}) is a vector-to-scalar function that ensures unit variance. Let **X** be an *n* by *m* matrix of genotypes for all individuals and markers, and let **F** be an *m* by *m* diagonal matrix such that **F**_{i,i} = *f*(**x**_{i}). Across markers, the expression becomes:

Given that each row of **X** is an independent realization of a multivariate-normally-distributed variable **x**_{j}, the distribution of **ẑ**_{β} can be expressed as an affine transformation of the distribution of **x**_{j} = **x′**_{j} − 2**μ**_{u}.

Since **F** is a diagonal matrix, the correlation matrix of **ẑ**_{β} is the same as the correlation matrix calculated from **Σ**_{u}.

We examine the AS association statistic in a similar manner. Looking at Equations 51 and 55, the AS association statistic can be expressed as:
where *g*(**v**_{i}, **Ω**) is a vector-to-scalar function that ensures unit variance. Let **V** be an *n* by *m* matrix of phases for all individuals and markers, and let **G** be an *m* by *m* diagonal matrix such that **G**_{i,i} = *g*(**v**_{i}, **Ω**). Across markers, the expression becomes:

The distribution of **ẑ _{ϕ}** can thus be expressed as:

Since **Σ**_{u} is transformed by a diagonal matrix, the correlation matrix of **ẑ**_{ϕ} is the same as the correlation matrix calculated from **Σ**_{u}.

Thus, both sets of summary statistics have the same correlation matrix **R**_{z}, which is also the genotype correlation matrix.

#### 5.3.2 Jointly modeling total and haplotype-specific effects on expression

We define **ẑ** as the combined vector of AS association statistics and QTL association statistics:

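Let *r*_{βϕ} be the overall correlation between the QTL and AS summary statistics calculated across all loci. We define the combined correlation matrix **R** as:

$$\mathbf{R} = \begin{bmatrix} \mathbf{R}_z & r_{\beta\phi} \mathbf{R}_z \\ r_{\beta\phi} \mathbf{R}_z & \mathbf{R}_z \end{bmatrix}$$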

We model the joint distribution as multivariate normal, with covariance **R**:

#### 5.3.3 Modeling summary statistics given a causal configuration

The goal of this method is to infer the causal markers, given QTL and AS association statistics. To this end, we introduce a likelihood function that gives the probability of statistics **ẑ**, given a causal configuration. We define a causal configuration **c** as a vector of causal statuses corresponding to each marker, with 1 being causal and 0 being non-causal.

Let **z**_{c,ϕ} and **z**_{c,β} be the underlying causal AS and QTL effects, respectively, across markers such that:

We define two hyperparameters as the variances of the AS and QTL causal effect sizes, respectively, and *r*_{c,βϕ} as the underlying correlation of the causal QTL and AS effect sizes. (This is not to be confused with *r*_{βϕ}, which concerns the correlation between the association statistics.) We define **Σ**_{c}, the covariance matrix of causal effect sizes, given a causal configuration:

We model the causal effect sizes, given a causal configuration, as drawn from a multivariate normal distribution:

Furthermore, we model the expected association statistic for a given marker as a linear combination of all effects correlated to the marker.

Combining Equations 70 and 75, we get a probability distribution for the observed association statistics given a causal configuration. This is our likelihood for a causal configuration.

To get a prior distribution for the causal configuration **c**, we define the hyperparameter *γ* as the prior probability that a single variant is causal and 1 − *γ* as the probability that a variant is not causal. The probability of a configuration **c** over *m* variants thus becomes:

$$P(\mathbf{c}) = \gamma^{\lVert \mathbf{c} \rVert_1} (1 - \gamma)^{m - \lVert \mathbf{c} \rVert_1}$$

We can view the prior as a regularization term by taking the negative log:

$$-\log P(\mathbf{c}) = -\operatorname{logit}(\gamma) \, \lVert \mathbf{c} \rVert_1 - m \log(1 - \gamma)$$

Since **c** is a binary vector, ║**c**║_{k} is the same for all positive *k*. Thus, the prior imposes *L*_{k} regularization with *λ* = −logit(*γ*). In practice, this regularization favors causal configurations with fewer causal variants.

With the prior and likelihood, we define the posterior probability of a causal configuration, normalized across the set of all possible configurations ℂ:

This posterior probability can be alternatively expressed with Bayes Factors. We define the null model as the scenario where all markers are non-causal, so that **c** = **0**. The Bayes Factor for a particular **c** would thus be:

We rewrite Equation 79 with Bayes Factors:

#### 5.3.4 The *ρ*-level credible set

In practice, due to the large number of possible configurations, the probability of any given configuration will likely be small. For more meaningful probabilities, we calculate the total probability of the possible non-null configurations from a set of markers.

We define 𝒮 as a set of markers that putatively includes all causal markers. We define ℂ_{𝒮} as the set of all causal configurations whose causal markers are a subset of 𝒮, excluding the null set. Thus, the probability that 𝒮 includes all causal markers is the sum of the probabilities of the configurations in ℂ_{𝒮}.

We set this probability as *ρ*_{c}, the confidence level of 𝒮. Given a value for *ρ*_{c}, commonly 0.95, we seek to find the 𝒮 that contains the fewest markers.

#### 5.3.5 The posterior inclusion probability

An alternative way of summarizing the configurations is to calculate a marker’s posterior inclusion probability (PIP), also known as the posterior probability of association. We define the PIP as the probability that a single given marker is causal, marginalized over all other markers. We calculate this probability by summing over all configurations containing the marker.
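Both summaries, the confidence level of a marker set and the PIP, are simple sums over configuration posteriors. The sketch below illustrates them, together with a greedy construction of a credible set of the kind described in the Methods. It is a sketch under stated assumptions: configurations are represented as frozensets of causal marker indices, the posterior table is assumed to be already normalized, and all function names are hypothetical.

```python
def posterior_inclusion_probabilities(posteriors, m):
    """Sum the posterior of every configuration that contains each marker."""
    pips = [0.0] * m
    for config, prob in posteriors.items():
        for i in config:
            pips[i] += prob
    return pips

def set_confidence(posteriors, markers):
    """Total posterior of non-null configurations contained in `markers`."""
    markers = frozenset(markers)
    return sum(prob for config, prob in posteriors.items()
               if config and config <= markers)

def greedy_credible_set(posteriors, m, level=0.95):
    """Add, one at a time, the marker that raises the confidence the most."""
    chosen, confidence = set(), 0.0
    while confidence < level and len(chosen) < m:
        best = max((i for i in range(m) if i not in chosen),
                   key=lambda i: set_confidence(posteriors, chosen | {i}))
        chosen.add(best)
        confidence = set_confidence(posteriors, chosen)
    return chosen, confidence
```

The greedy procedure is not guaranteed to return the smallest possible set, but it avoids enumerating every subset of markers.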

### 5.4 Computational optimization and implementation

#### 5.4.1 Shotgun stochastic search across configurations

The computation of the probability of a given configuration requires knowledge of the Bayes Factor for every possible configuration. As there are 2^{m} possible configurations, traversing this whole space is intractable. To reduce the number of configurations to evaluate, we use the heuristic that configurations with significant probabilities tend to be similar to each other.

We use a shotgun stochastic search procedure to find all configurations with a significant probability. Given a selected configuration **c**, we define the neighborhood of **c** as the union of the following:

1. All configurations resulting from setting a causal marker in **c** to non-causal
2. All configurations resulting from setting a non-causal marker in **c** to causal
3. All configurations resulting from swapping the causal statuses of two markers in **c**

For each iteration of the algorithm, the next configuration is drawn randomly from this neighborhood, weighted by the posterior probability of each candidate. Upon termination, we assume that all configurations with nonzero probability have been uncovered.
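The search can be sketched as follows. This is a simplified illustration rather than the PLASMA implementation: `log_score` is a hypothetical caller-supplied function returning an unnormalized log posterior for a configuration (a frozenset of causal marker indices), the search starts from the null configuration, and the fixed iteration count stands in for whatever termination rule is actually used.

```python
import math
import random

def neighborhood(config, m):
    """Configurations one deletion, one addition, or one swap away from config."""
    config = frozenset(config)
    non_causal = [j for j in range(m) if j not in config]
    out = set()
    for i in config:
        out.add(config - {i})                 # causal -> non-causal
    for j in non_causal:
        out.add(config | {j})                 # non-causal -> causal
    for i in config:
        for j in non_causal:
            out.add((config - {i}) | {j})     # swap statuses of i and j
    return out

def shotgun_search(log_score, m, n_iter=50, seed=0):
    """Return {configuration: log score} for every configuration evaluated."""
    rng = random.Random(seed)
    current = frozenset()
    visited = {current: log_score(current)}
    for _ in range(n_iter):
        candidates = sorted(neighborhood(current, m), key=sorted)
        for cand in candidates:
            if cand not in visited:
                visited[cand] = log_score(cand)
        scores = [visited[cand] for cand in candidates]
        # Sample the next configuration in proportion to its posterior weight.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        current = rng.choices(candidates, weights=weights, k=1)[0]
    return visited
```

The returned table can then be exponentiated and normalized to approximate the posterior over the visited configurations.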

#### 5.4.2 Calculation of the *ρ*-level credible set

Given the large number of configurations evaluated, it is impractical to calculate the best possible credible set satisfying *ρ _{c}*. Instead, we use a greedy approximation algorithm. At each step, before

*ρ*is reached, the algorithm adds the marker that increases the confidence the most.

#### 5.4.3 Bayes factor evaluation with matrix reduction

Direct calculation of the Bayes Factor for a configuration requires the manipulation of *m* × *m* matrices, resulting in an *O*(*m*^{3}) runtime per configuration. We now show that it is sufficient to evaluate the Bayes Factor using only the elements corresponding to causal SNPs. This reduces complexity from *O*(*m*^{3}) to *O*(*k*^{3}), where *k* is the number of causal variants.
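The claimed equivalence can be checked numerically. The sketch below assumes a Bayes factor of the form N(z; 0, R + RΣR) / N(z; 0, R) for a single vector of statistics z, which is a simplification of the joint QTL and AS model; the function names and the simulated inputs are hypothetical.

```python
import numpy as np


def log_mvn_density(z, cov):
    """Log density of a zero-mean multivariate normal evaluated at z."""
    _, logdet = np.linalg.slogdet(cov)
    quad = z @ np.linalg.solve(cov, z)
    return -0.5 * (len(z) * np.log(2 * np.pi) + logdet + quad)


def log_bf_full(z, R, sigma_c):
    """Log Bayes factor using all m markers (m x m matrices)."""
    return log_mvn_density(z, R + R @ sigma_c @ R) - log_mvn_density(z, R)


def log_bf_reduced(z, R, sigma_c, causal):
    """Log Bayes factor using only the k causal markers (k x k matrices)."""
    idx = np.asarray(causal)
    z_c = z[idx]
    R_cc = R[np.ix_(idx, idx)]
    S_cc = sigma_c[np.ix_(idx, idx)]
    return (log_mvn_density(z_c, R_cc + R_cc @ S_cc @ R_cc)
            - log_mvn_density(z_c, R_cc))


rng = np.random.default_rng(0)
m = 6
# Simulated full-rank LD (correlation) matrix among m markers.
R = np.corrcoef(rng.normal(size=(m, 20)))
z = rng.normal(size=m)
causal = [1, 4]
# Prior effect-size variance is nonzero only for the causal markers.
sigma_c = np.zeros((m, m))
sigma_c[causal, causal] = 5.0

full = log_bf_full(z, R, sigma_c)
reduced = log_bf_reduced(z, R, sigma_c, causal)
```

The two values agree up to floating-point error, while the reduced version only factorizes *k* × *k* matrices.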

We expand the MVN probability density functions in Equation 80 and use the binomial inverse theorem:

We permute *s* to separate causal and non-causal SNPs:

We likewise permute the rows and columns of **R** and **Σ _{c}** such that:

Note that **Σ _{c}** can be nonzero only among causal markers since **c** is 0 for non-causal markers. Furthermore:

Blockwise inversion yields:

We can simplify this equation since **Σ _{c,CC}** is of full rank and is thus invertible:

We can also simplify the determinant in Equation 83:

We have thus shown that evaluating the Bayes Factor with only putative causal markers is mathematically equivalent to evaluating with all markers. Thus:

### 5.5 The hyperparameters in terms of heritability

The model takes a number of hyperparameters specifying the variances and covariances of the association statistics. We reparameterize the hyperparameters in terms of the QTL heritability of the locus, *h _{β}*, and the AS heritability of the locus, *h _{ϕ}*.

First, we look at the variances of the causal QTL and AS effect sizes, given *k* expected causal variants. Consider the overall variance of the QTL phenotype across the individuals. Since the heritability *h _{β}* is the proportion of this variance attributed to the causal variants, the average variance of a causal marker’s QTL effect size is given by:

where *l* is the average LD score between the given marker and the other causal markers. Similarly, the variance of the AS effect size is given by:

However, in the case of the AS phenotype, where the quality of data varies considerably among individuals, we must take into account the variance introduced by sampling. As we recall, the variance of the observed phenotype for a given individual *j* under a beta-binomial model is:

We now derive an estimator for the total expected variance of the observed phenotype, across individuals. As an approximation, we substitute the individual read coverage for the expected read coverage , assuming that . Thus:

Since we model *ŵ _{j}*, as normally distributed, is approximately normally distributed. We now find . Given a normally distributed zero-mean variable , the even-numbered moments are given by:

Taking the Taylor expansion of cosh and using linearity of expectation:
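In generic notation, these two steps can be sketched as follows, where $x$ and $\sigma^2$ are placeholder symbols (not necessarily the paper's own) for the zero-mean normal variable and its variance. The even moments of $x \sim \mathcal{N}(0, \sigma^2)$ are

$$
\mathbb{E}\left[x^{2n}\right] = \sigma^{2n}\,(2n-1)!! = \sigma^{2n}\,\frac{(2n)!}{2^{n}\,n!},
$$

and since the Taylor series of $\cosh$ contains only even powers,

$$
\mathbb{E}\left[\cosh x\right]
= \sum_{n=0}^{\infty} \frac{\mathbb{E}\left[x^{2n}\right]}{(2n)!}
= \sum_{n=0}^{\infty} \frac{1}{n!}\left(\frac{\sigma^{2}}{2}\right)^{n}
= \exp\left(\frac{\sigma^{2}}{2}\right).
$$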

Substituting this result back into the formula for the total expected variance of the observed phenotype:

Thus, the variance of the calculated AS effect size is given by:

We also define the observed AS heritability such that

We now derive an expression for *r _{βϕ}*, the overall correlation between the QTL and AS statistics for a causal variant. We first find an expression, given causal variant *i*, for the variance of *ẑ _{β,i}*. Since *i* is causal, the variance of *ẑ _{β,i}* is a combination of the variance of a causal variant and the average phenotypic variation across individuals:

We model the QTL statistic of a causal variant as a zero-mean normal distribution:

We model the AS statistic in a similar manner:

We model the noise as independently distributed between the two statistics, but the causal variance as correlated with coefficient *r _{c,βϕ}*. Thus:

We now find the covariance between *ẑ _{β}* and *ẑ _{ϕ}*. Since the distributions are zero-mean, the covariance is just the expectation of their product. Expanding out this product:

The correlation is thus:
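The structure of this result can be sketched in generic notation. Suppose each statistic is the sum of a causal component and a noise component, $\hat{z}_{\beta} = a_{\beta} + e_{\beta}$ and $\hat{z}_{\phi} = a_{\phi} + e_{\phi}$, where all components are zero-mean, the noise terms are independent of each other and of the causal components, and the causal components have correlation $r_{c,\beta\phi}$. The symbols $a$, $e$, $\sigma_{a}$, and $\sigma_{e}$ are placeholders introduced here for illustration, not the paper's own notation. Then

$$
\operatorname{Cov}\left(\hat{z}_{\beta}, \hat{z}_{\phi}\right)
= \mathbb{E}\left[(a_{\beta} + e_{\beta})(a_{\phi} + e_{\phi})\right]
= \mathbb{E}\left[a_{\beta} a_{\phi}\right]
= r_{c,\beta\phi}\,\sigma_{a,\beta}\,\sigma_{a,\phi},
$$

since every cross term involving a noise component has zero expectation. Dividing by the standard deviations of the two statistics gives

$$
r_{\beta\phi}
= \frac{r_{c,\beta\phi}\,\sigma_{a,\beta}\,\sigma_{a,\phi}}
{\sqrt{\left(\sigma_{a,\beta}^{2} + \sigma_{e,\beta}^{2}\right)\left(\sigma_{a,\phi}^{2} + \sigma_{e,\phi}^{2}\right)}}.
$$

Under these assumptions, the overall correlation is the causal correlation shrunk toward zero by the noise in each statistic.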

## Acknowledgments

We thank F. Hormozdiari for guidance on statistical fine-mapping and C. Kalita for guidance on model validation. We also thank B. Pasaniuc and C. Giambartolomei for helpful feedback.