Privacy-preserving federated neural network learning for disease-associated cell classification

Patterns (N Y). 2022 Apr 18;3(5):100487. doi: 10.1016/j.patter.2022.100487. eCollection 2022 May 13.

Abstract

Training accurate and robust machine learning models requires a large amount of data that is usually scattered across data silos. Sharing or centralizing the data of different healthcare institutions is, however, infeasible or prohibitively difficult due to privacy regulations. In this work, we address this problem by using a privacy-preserving federated learning-based approach, PriCell, for complex models such as convolutional neural networks. PriCell relies on multiparty homomorphic encryption and enables the collaborative training of encrypted neural networks with multiple healthcare institutions. We preserve the confidentiality of each institution's input data, of any intermediate values, and of the trained model parameters. We efficiently replicate the training of a published state-of-the-art convolutional neural network architecture in a decentralized and privacy-preserving manner. Our solution achieves accuracy comparable to that of the centralized, non-secure solution. PriCell guarantees patient privacy and ensures data utility for efficient multi-center studies involving complex healthcare data.
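To make the collaborative-training setting described above concrete, the sketch below simulates the basic federated pattern with plain NumPy and a toy logistic-regression model: each institution trains locally on its own silo and only model updates are aggregated. This is only an illustration under simplified assumptions, not PriCell's protocol; in PriCell the updates would remain under multiparty homomorphic encryption throughout aggregation, whereas here they are plaintext arrays.

```python
# Minimal illustrative sketch of federated averaging across data silos.
# NOT the PriCell protocol: real privacy-preserving aggregation would operate
# on updates encrypted under a collective homomorphic-encryption key.
import numpy as np

rng = np.random.default_rng(0)

def local_gradient_step(weights, X, y, lr=0.1):
    """One gradient step of logistic regression on a site's local data."""
    preds = 1.0 / (1.0 + np.exp(-(X @ weights)))
    grad = X.T @ (preds - y) / len(y)
    return weights - lr * grad

# Simulate three institutions, each holding its own private data silo.
n_features = 8
silos = []
for _ in range(3):
    X = rng.normal(size=(200, n_features))
    true_w = rng.normal(size=n_features)
    y = (X @ true_w + 0.1 * rng.normal(size=200) > 0).astype(float)
    silos.append((X, y))

global_weights = np.zeros(n_features)

for round_idx in range(20):
    local_updates = []
    for X, y in silos:
        w = global_weights.copy()
        for _ in range(5):  # a few local steps per federated round
            w = local_gradient_step(w, X, y)
        # In a privacy-preserving protocol, this update would be encrypted
        # before leaving the institution; plaintext transfer is for illustration.
        local_updates.append(w)
    # Aggregation: average the local updates into the new global model.
    global_weights = np.mean(local_updates, axis=0)

print("Aggregated model weights after 20 rounds:", global_weights)
```

The raw patient-level data never leaves a silo in this pattern; only model parameters are exchanged, which the paper's approach further protects by keeping them encrypted end to end.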

Keywords: federated learning; multiparty homomorphic encryption; neural networks; privacy-preserving machine learning; private training; single-cell analysis.