Abstract
Although large amounts of genomic data are available, it remains a challenge to reliably infer causal (i.e., regulatory) relationships among molecular phenotypes (such as gene expression), especially when many phenotypes are involved. We extend the interpretation of the Principle of Mendelian randomization (PMR) and present MRPC, a novel machine learning algorithm that incorporates the PMR in classical algorithms for learning causal graphs in computer science. MRPC learns a causal biological network efficiently and robustly from integrating genotype and molecular phenotype data, in which directed edges indicate causal directions. We demonstrate through simulation that MRPC outperforms existing general-purpose network inference methods and other PMR-based methods. We apply MRPC to distinguish direct and indirect targets among multiple genes associated with expression quantitative trait loci.