PT - JOURNAL ARTICLE
AU - Jack Bowden
AU - Fabiola Del Greco M
AU - Cosetta Minelli
AU - Debbie Lawlor
AU - Nuala Sheehan
AU - John Thompson
AU - George Davey Smith
TI - Improving the accuracy of two-sample summary data Mendelian randomization: moving beyond the NOME assumption
AID - 10.1101/159442
DP - 2017 Jan 01
TA - bioRxiv
PG - 159442
4099 - http://biorxiv.org/content/early/2017/07/05/159442.short
4100 - http://biorxiv.org/content/early/2017/07/05/159442.full
AB - Two-sample summary data Mendelian randomization (MR) incorporating multiple genetic variants in a meta-analysis framework is a popular technique for assessing causality in epidemiology. If all genetic variants satisfy the instrumental variable (IV) assumptions, then their individual causal ratio estimates should be homogeneous. Observed heterogeneity, therefore, supports the notion that a portion of variants violate the IV assumptions due to pleiotropy. Model fitting and heterogeneity assessment in MR requires an approximation for the variance of each ratio estimate. We show that the most popular approximation can lead to an inflation in the chances of detecting heterogeneity when in fact it is not present. Conversely, an ostensibly more accurate approximation can dramatically increase the chances of failing to detect heterogeneity, when it is truly present. Here we derive a ‘modified 2nd order’ approximation to the variance that makes use of the derived causal estimate to mitigate both of these adverse effects. Using Monte Carlo simulations, we show that the modified 2nd order approximation outperforms both its 1st and 2nd order counterparts in the presence of weak instruments or a large causal effect. We illustrate the utility of the new method using data from a recent two-sample summary data MR analysis to assess the causal role of systolic blood pressure on coronary heart disease risk.
Modified 2nd order weighting should be used as standard within two-sample summary data MR studies for model fitting, the quantification of heterogeneity and the detection of outliers. R code is provided to apply these weights in practice.