TY  - JOUR
T1  - Explainable machine learning models of major crop traits from satellite-monitored continent-wide field trial data
JF  - bioRxiv
DO  - 10.1101/2021.03.08.434495
SP  - 2021.03.08.434495
AU  - Saul Justin Newman
AU  - Robert T Furbank
Y1  - 2021/01/01
UR  - http://biorxiv.org/content/early/2021/03/09/2021.03.08.434495.abstract
N2  - Four species of grass generate half of all human-consumed calories [1]. However, abundant biological data on species that produce our food remains largely inaccessible, imposing direct barriers to understanding crop yield and fitness traits. Here, we assemble and analyse a continent-wide database of field experiments spanning ten years and hundreds of thousands of machine-phenotyped populations of ten major crop species. Training an ensemble of machine learning models, using thousands of variables capturing weather, ground-sensor, soil, chemical and fertiliser dosage, management, and satellite data, produces robust cross-continent yield models exceeding R² = 0.8 prediction accuracy. In contrast to ‘black box’ analytics, detailed interrogation of these models reveals fundamental drivers of crop behaviour and complex interactions predicting yield and agronomic traits. These results demonstrate the capacity of machine learning models to build unified, interpretable, and explainable models of crop behaviour, and highlight the powerful role of data in the future of food. Competing Interest Statement: The authors have declared no competing interest.
ER  - 