PT - JOURNAL ARTICLE
AU - Wang, Gao
AU - Sarkar, Abhishek
AU - Carbonetto, Peter
AU - Stephens, Matthew
TI - A simple new approach to variable selection in regression, with application to genetic fine-mapping
AID - 10.1101/501114
DP - 2018 Jan 01
TA - bioRxiv
PG - 501114
4099 - http://biorxiv.org/content/early/2018/12/19/501114.short
4100 - http://biorxiv.org/content/early/2018/12/19/501114.full
AB - We introduce a simple new approach to variable selection in linear regression, and to quantifying uncertainty in selected variables. The approach is based on a new model – the “Sum of Single Effects” (SuSiE) model – which comes from writing the sparse vector of regression coefficients as a sum of “single-effect” vectors, each with one non-zero element. We also introduce a corresponding new fitting procedure – Iterative Bayesian Step-wise Selection (IBSS) – which is a Bayesian analogue of traditional stepwise selection methods. IBSS shares the computational simplicity and speed of traditional stepwise methods, but instead of selecting a single variable at each step, IBSS computes a distribution on variables that captures uncertainty in which variable to select. We show that the IBSS algorithm computes a variational approximation to the posterior distribution under the SuSiE model. Further, this approximate posterior distribution naturally leads to a convenient, novel, way to summarize uncertainty in variable selection, and provides a Credible Set for each selected variable. Our methods are particularly well suited to settings where variables are highly correlated and true effects are very sparse, both of which are characteristics of genetic fine-mapping applications. We demonstrate through numerical experiments that our methods outperform existing methods for this task, and illustrate the methods by fine-mapping genetic variants that influence alternative splicing in human cell-lines. We also discuss both the potential and the challenges for applying these methods to generic variable selection problems.