PT - JOURNAL ARTICLE
AU - Oliver Lester Saldanha
AU - Philip Quirke
AU - Nicholas P. West
AU - Jacqueline A. James
AU - Maurice B. Loughrey
AU - Heike I. Grabsch
AU - Manuel Salto-Tellez
AU - Elizabeth Alwers
AU - Didem Cifci
AU - Narmin Ghaffari Laleh
AU - Tobias Seibel
AU - Richard Gray
AU - Gordon G. A. Hutchins
AU - Hermann Brenner
AU - Tanwei Yuan
AU - Titus J. Brinker
AU - Jenny Chang-Claude
AU - Firas Khader
AU - Andreas Schuppert
AU - Tom Luedde
AU - Sebastian Foersch
AU - Hannah Sophie Muti
AU - Christian Trautwein
AU - Michael Hoffmeister
AU - Daniel Truhn
AU - Jakob Nikolas Kather
TI - Swarm learning for decentralized artificial intelligence in cancer histopathology
AID - 10.1101/2021.11.19.469139
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.11.19.469139
4099 - http://biorxiv.org/content/early/2021/11/20/2021.11.19.469139.short
4100 - http://biorxiv.org/content/early/2021/11/20/2021.11.19.469139.full
AB - Artificial Intelligence (AI) can extract clinically actionable information from medical image data. In cancer histopathology, AI can be used to predict the presence of molecular alterations directly from routine histopathology slides. However, training robust AI systems requires large datasets whose collection faces practical, ethical and legal obstacles. These obstacles could be overcome with swarm learning (SL) where partners jointly train AI models, while avoiding data transfer and monopolistic data governance. Here, for the first time, we demonstrate the successful use of SL in large, multicentric datasets of gigapixel histopathology images comprising over 5000 patients. We show that AI models trained using Swarm Learning can predict BRAF mutational status and microsatellite instability (MSI) directly from hematoxylin and eosin (H&E)-stained pathology slides of colorectal cancer (CRC).
We trained AI models on three patient cohorts from Northern Ireland, Germany and the United States of America and validated the prediction performance in two independent datasets from the United Kingdom using SL-based AI models. Our data show that SL enables us to train AI models that outperform most locally trained models and perform on par with models that are centrally trained on the merged datasets. In addition, we show that SL-based AI models are data efficient and maintain a robust performance even if only subsets of local datasets are used for training. In the future, SL can be used to train distributed AI models for any histopathology image analysis task, overcoming the need for data transfer and without requiring institutions to give up control of the final AI model.
Competing Interest Statement: JNK declares consulting services for Owkin (France) and Panakeia (UK). PQ and NW declare research funding from Roche, and PQ declares consulting and speaker services for Roche. MST has recently received honoraria for advisory work in relation to the following companies: Incyte, MindPeak, MSD, BMS and Sonrai; these are all unrelated to this work. No other potential conflicts of interest are reported by any of the authors. The authors received advice from the customer support team of Hewlett Packard Enterprise (HPE) when performing this study, but HPE did not have any role in study design, conducting the experiments, interpretation of the results or decision to submit for publication.