New Results

dsMTL - a computational framework for privacy-preserving, distributed multi-task machine learning

Han Cao, Youcheng Zhang, Jan Baumbach, Paul R Burton, Dominic Dwyer, Nikolaos Koutsouleris, Julian Matschinske, Yannick Marcon, Sivanesan Rajan, Thilo Rieg, Patricia Ryser-Welch, Julian Späth, The COMMITMENT consortium, Carl Herrmann, Emanuel Schwarz
doi: https://doi.org/10.1101/2021.08.26.457778
Han Cao
1 Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
Youcheng Zhang
2 Health Data Science Unit, Medical Faculty Heidelberg & BioQuant, Heidelberg, 69120, Germany
Jan Baumbach
3 Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
4 Computational Biomedicine Lab, Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
Paul R Burton
5 Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, United Kingdom
Dominic Dwyer
6 Department of Psychiatry and Psychotherapy, Section for Neurodiagnostic Applications, Ludwig-Maximilian University, Munich 80638, Germany
Nikolaos Koutsouleris
6 Department of Psychiatry and Psychotherapy, Section for Neurodiagnostic Applications, Ludwig-Maximilian University, Munich 80638, Germany
Julian Matschinske
3 Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
Yannick Marcon
7 Epigeny, St Ouen, France
Sivanesan Rajan
1 Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
Thilo Rieg
1 Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
Patricia Ryser-Welch
5 Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, United Kingdom
Julian Späth
3 Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
Carl Herrmann
2 Health Data Science Unit, Medical Faculty Heidelberg & BioQuant, Heidelberg, 69120, Germany
  • For correspondence: emanuel.schwarz@zi-mannheim.de, carl.herrmann@bioquant.uni-heidelberg.de
Emanuel Schwarz
1 Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
  • For correspondence: emanuel.schwarz@zi-mannheim.de, carl.herrmann@bioquant.uni-heidelberg.de

Abstract

Multitask learning allows the simultaneous learning of multiple ‘communicating’ algorithms. It is increasingly adopted for biomedical applications, such as the modeling of disease progression. As data protection regulations limit data sharing for such analyses, an implementation of multitask learning on geographically distributed data sources would be highly desirable. Here, we describe the development of dsMTL, a computational framework for privacy-preserving, distributed multi-task machine learning that includes three supervised algorithms and one unsupervised algorithm. dsMTL is implemented as a library for the R programming language and builds on the DataSHIELD platform, which supports the federated analysis of sensitive individual-level data. We provide a comparative evaluation of dsMTL for the identification of biological signatures in distributed datasets using two case studies, and evaluate the computational performance of the supervised and unsupervised algorithms. dsMTL provides an easy-to-use framework for privacy-preserving, federated analysis of geographically distributed datasets, and has several application areas, including comorbidity modeling and translational research focused on the simultaneous prediction of different outcomes across datasets. dsMTL is available at https://github.com/transbioZI/dsMTLBase (server-side package) and https://github.com/transbioZI/dsMTLClient (client-side package).

Competing Interest Statement

AML has received consultant fees from Boehringer Ingelheim, Elsevier, Brainsway, Lundbeck Int. Neuroscience Foundation, Lundbeck A/S, The Wolfson Foundation, Bloomfield Holding Ltd, Shanghai Research Center for Brain Science, Thieme Verlag, Sage Therapeutics, v Behring Roentgen Stiftung, Fondation FondaMental, Janssen-Cilag GmbH, MedinCell, Brain Mind Institute, Agence Nationale de la Recherche, CISSN (Catania Internat. Summer School of Neuroscience), Daimler und Benz Stiftung, American Association for the Advancement of Science, and Servier International. Additionally, he has received speaker fees from Italian Society of Biological Psychiatry, Merz-Stiftung, Forum Werkstatt Karlsruhe, Lundbeck SAS France, BAG Psychiatrie Oberbayern, Klinik fuer Psychiatrie und Psychotherapie Ingolstadt, med Update GmbH, Society of Biological Psychiatry, Siemens Healthineers, and Biotest AG. All other authors have no potential conflicts of interest.

Footnotes

  • https://github.com/transbioZI/dsMTLClient

  • https://github.com/transbioZI/dsMTLBase

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
Posted August 28, 2021.
Citation Tools
dsMTL - a computational framework for privacy-preserving, distributed multi-task machine learning
Han Cao, Youcheng Zhang, Jan Baumbach, Paul R Burton, Dominic Dwyer, Nikolaos Koutsouleris, Julian Matschinske, Yannick Marcon, Sivanesan Rajan, Thilo Rieg, Patricia Ryser-Welch, Julian Späth, The COMMITMENT consortium, Carl Herrmann, Emanuel Schwarz
bioRxiv 2021.08.26.457778; doi: https://doi.org/10.1101/2021.08.26.457778


Subject Area

  • Bioinformatics