TY  - JOUR
T1  - MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect
JF  - bioRxiv
DO  - 10.1101/2020.07.14.201475
SP  - 2020.07.14.201475
AU  - Ammar Tareen
AU  - Mahdi Kooshkbaghi
AU  - Anna Posfai
AU  - William T. Ireland
AU  - David M. McCandlish
AU  - Justin B. Kinney
Y1  - 2021/01/01
UR  - http://biorxiv.org/content/early/2021/06/27/2020.07.14.201475.abstract
N2  - Multiplex assays of variant effect (MAVEs) are diverse techniques that include deep mutational scanning (DMS) experiments on proteins and massively parallel reporter assays (MPRAs) on cis-regulatory sequences. MAVEs are being rapidly adopted in many areas of biology, but a general strategy for inferring quantitative models of genotype-phenotype (G-P) maps from MAVE data is lacking. Here we introduce a conceptually unified approach for learning G-P maps from MAVE datasets. Our strategy is grounded in concepts from information theory, and is based on the view of G-P maps as a form of information compression. We also introduce MAVE-NN, an easy-to-use Python package that implements this approach using a neural network backend. The ability of MAVE-NN to infer diverse G-P maps, including biophysically interpretable models, is demonstrated on DMS and MPRA data in a variety of biological contexts. MAVE-NN thus provides a unified solution to a major outstanding need in the MAVE community. Competing Interest Statement: The authors have declared no competing interest.
ER  -