TY  - JOUR
T1  - Improving on a modal-based estimation method: model averaging for consistent and efficient estimation in Mendelian randomization when a plurality of candidate instruments are valid
JF  - bioRxiv
DO  - 10.1101/175372
SP  - 175372
AU  - Stephen Burgess
AU  - Verena Zuber
AU  - Apostolos Gkatzionis
AU  - Jessica MB Rees
AU  - Christopher N Foley
Y1  - 2017/01/01
UR  - http://biorxiv.org/content/early/2017/08/11/175372.abstract
N2  - Background: A robust method for Mendelian randomization does not require all genetic variants to be valid instruments to give consistent estimates of a causal parameter. Several such methods have been developed, including a mode-based estimation method giving consistent estimates if a plurality of genetic variants are valid instruments; that is, there is no larger subset of invalid instruments estimating the same causal parameter than the subset of valid instruments. Methods: We here develop a model averaging method that gives consistent estimates under the same ‘plurality of valid instruments’ assumption. The method considers a mixture distribution of estimates derived from each subset of genetic variants. The estimates are weighted such that subsets with more genetic variants receive more weight, unless variants in the subset have heterogeneous causal estimates, in which case that subset is severely downweighted. The mode of this mixture distribution is the causal estimate. This heterogeneity-penalized model averaging method has several technical advantages over the previously proposed mode-based estimation method. Results: The heterogeneity-penalized model averaging method outperformed the mode-based estimation method in terms of efficiency and outperformed other robust methods in terms of Type 1 error rate in an extensive simulation analysis.
The proposed method suggests two distinct mechanisms by which inflammation affects coronary heart disease risk, with subsets of variants suggesting both positive and negative causal effects. Conclusions: The heterogeneity-penalized model averaging method is an additional robust method for Mendelian randomization with excellent theoretical and practical properties, and can reveal features in the data such as the presence of multiple causal mechanisms. Key messages: We propose a heterogeneity-penalized model averaging method that gives consistent causal estimates if a weighted plurality of the genetic variants are valid instruments. The method calculates causal estimates based on all subsets of genetic variants, and upweights subsets containing several genetic variants with similar causal estimates. The method is asymptotically efficient and does not rely on bootstrapping to obtain a confidence interval, nor is the confidence interval constrained to be symmetric. In particular, the confidence interval can include multiple disjoint intervals, suggesting the presence of multiple causal mechanisms by which the risk factor influences the outcome. The method can incorporate biological knowledge to upweight the contribution of genetic variants with stronger plausibility of being valid instruments.
ER  -