PT - JOURNAL ARTICLE
AU - Wang, Gao
AU - Sarkar, Abhishek
AU - Carbonetto, Peter
AU - Stephens, Matthew
TI - A simple new approach to variable selection in regression, with application to genetic fine-mapping
AID - 10.1101/501114
DP - 2020 Jan 01
TA - bioRxiv
PG - 501114
4099 - http://biorxiv.org/content/early/2020/06/01/501114.short
4100 - http://biorxiv.org/content/early/2020/06/01/501114.full
AB - We introduce a simple new approach to variable selection in linear regression, with a particular focus on quantifying uncertainty in which variables should be selected. The approach is based on a new model, the “Sum of Single Effects” (SuSiE) model, which comes from writing the sparse vector of regression coefficients as a sum of “single-effect” vectors, each with one non-zero element. We also introduce a corresponding new fitting procedure, Iterative Bayesian Stepwise Selection (IBSS), which is a Bayesian analogue of stepwise selection methods. IBSS shares the computational simplicity and speed of traditional stepwise methods, but instead of selecting a single variable at each step, IBSS computes a distribution on variables that captures uncertainty in which variable to select. We provide a formal justification of this intuitive algorithm by showing that it optimizes a variational approximation to the posterior distribution under the SuSiE model. Further, this approximate posterior distribution naturally yields convenient novel summaries of uncertainty in variable selection, providing a Credible Set of variables for each selection. Our methods are particularly well-suited to settings where variables are highly correlated and detectable effects are sparse, both of which are characteristics of genetic fine-mapping applications. We demonstrate through numerical experiments that our methods outperform existing methods for this task, and illustrate their application to fine-mapping genetic variants influencing alternative splicing in human cell-lines. We also discuss the potential and challenges for applying these methods to generic variable selection problems. Competing Interest Statement: The authors have declared no competing interest.
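The abstract's description of SuSiE and IBSS can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a fixed, known residual variance (`sigma2`), a fixed prior effect variance (`sigma0_2`), a uniform prior over variables, and no intercept. The function names `single_effect_regression`, `ibss`, and `credible_set` are hypothetical and chosen here for illustration only.

```python
import numpy as np

def single_effect_regression(X, r, sigma2, sigma0_2):
    """Fit a regression in which exactly one variable has a non-zero effect.

    Returns alpha (posterior probability that each variable is the one
    with the effect) and mu1 (posterior mean effect of each variable,
    given that it is the selected one). Assumes a uniform prior on which
    variable is selected.
    """
    xtx = np.sum(X ** 2, axis=0)
    bhat = X.T @ r / xtx              # per-variable least-squares estimate
    s2 = sigma2 / xtx                 # its sampling variance
    # Log Bayes factor for "variable j has an effect" against the null.
    log_bf = 0.5 * np.log(s2 / (s2 + sigma0_2)) \
        + 0.5 * (bhat ** 2 / s2) * sigma0_2 / (sigma0_2 + s2)
    w = np.exp(log_bf - log_bf.max())  # subtract the max for stability
    alpha = w / w.sum()
    post_var = 1.0 / (1.0 / s2 + 1.0 / sigma0_2)
    mu1 = post_var / s2 * bhat
    return alpha, mu1

def ibss(X, y, L=5, sigma2=1.0, sigma0_2=1.0, n_iter=50):
    """Sketch of Iterative Bayesian Stepwise Selection with L single effects.

    Each pass refits one single effect to the residuals left after
    removing the current estimates of all the other single effects.
    """
    p = X.shape[1]
    alpha = np.full((L, p), 1.0 / p)
    mu1 = np.zeros((L, p))
    for _ in range(n_iter):
        for l in range(L):
            # Posterior mean coefficients of every effect except effect l.
            b_others = (alpha * mu1).sum(axis=0) - alpha[l] * mu1[l]
            r = y - X @ b_others
            alpha[l], mu1[l] = single_effect_regression(X, r, sigma2, sigma0_2)
    return alpha, mu1

def credible_set(alpha_l, coverage=0.95):
    """Smallest set of variables whose probabilities sum to >= coverage."""
    order = np.argsort(alpha_l)[::-1]
    k = np.searchsorted(np.cumsum(alpha_l[order]), coverage) + 1
    return order[:k]
```

Under these assumptions, the posterior mean of the coefficient vector is `(alpha * mu1).sum(axis=0)`, and a posterior inclusion probability for each variable is `1 - np.prod(1 - alpha, axis=0)`. Each row of `alpha` is the distribution over variables that replaces the single pick made by a traditional stepwise step.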