New Results

A billion synthetic 3D-antibody-antigen complexes enable unconstrained machine-learning formalized investigation of antibody specificity prediction

Philippe A. Robert, Rahmad Akbar, Robert Frank, Milena Pavlović, Michael Widrich, Igor Snapkov, Maria Chernigovskaya, Lonneke Scheffer, Andrei Slabodkin, Brij Bhushan Mehta, Mai Ha Vu, Aurél Prósz, Krzysztof Abram, Alex Olar, Enkelejda Miho, Dag Trygve Tryslew Haug, Fridtjof Lund-Johansen, Sepp Hochreiter, Ingrid Hobæk Haff, Günter Klambauer, Geir K. Sandve, Victor Greiff
doi: https://doi.org/10.1101/2021.07.06.451258
Philippe A. Robert
1Department of Immunology, University of Oslo, Oslo, Norway
  • For correspondence: philippe.robert@ens-lyon.org victor.greiff@medisin.uio.no
Rahmad Akbar
1Department of Immunology, University of Oslo, Oslo, Norway
Robert Frank
1Department of Immunology, University of Oslo, Oslo, Norway
Milena Pavlović
2Department of Informatics, University of Oslo, Oslo, Norway
Michael Widrich
3ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
Igor Snapkov
1Department of Immunology, University of Oslo, Oslo, Norway
Maria Chernigovskaya
1Department of Immunology, University of Oslo, Oslo, Norway
Lonneke Scheffer
2Department of Informatics, University of Oslo, Oslo, Norway
Andrei Slabodkin
1Department of Immunology, University of Oslo, Oslo, Norway
Brij Bhushan Mehta
1Department of Immunology, University of Oslo, Oslo, Norway
Mai Ha Vu
4Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway
Aurél Prósz
5Danish Cancer Society Research Center, Translational Cancer Genomics, Copenhagen, Denmark
Krzysztof Abram
6The Novo Nordisk Foundation Center for Biosustainability, Autoflow, DTU Biosustain and IT University of Copenhagen, Copenhagen, Denmark
Alex Olar
7Eötvös Loránd University, Department of Complex Systems in Physics, Budapest, Hungary
Enkelejda Miho
8Institute of Medical Engineering and Medical Informatics, School of Life Sciences, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Muttenz, Switzerland
Dag Trygve Tryslew Haug
9Department of Philosophy, Classics, History of Arts and Ideas, University of Oslo, Oslo, Norway
Fridtjof Lund-Johansen
1Department of Immunology, University of Oslo, Oslo, Norway
Sepp Hochreiter
3ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
10Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, Austria
Ingrid Hobæk Haff
11Department of Mathematics, University of Oslo, Oslo, Norway
Günter Klambauer
3ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Linz, Austria
Geir K. Sandve
2Department of Informatics, University of Oslo, Oslo, Norway
Victor Greiff
1Department of Immunology, University of Oslo, Oslo, Norway
  • For correspondence: philippe.robert@ens-lyon.org victor.greiff@medisin.uio.no

Abstract

Machine learning (ML) is a key technology for accurate prediction of antibody-antigen binding, a prerequisite for in silico vaccine and antibody design. Two orthogonal problems hinder the current application of ML to antibody-specificity prediction and the benchmarking thereof: (i) the lack of a unified formalized mapping of immunological antibody specificity prediction problems into ML notation and (ii) the unavailability of large-scale training datasets. Here, we developed the Absolut! software suite, which allows the parameter-based unconstrained generation of synthetic lattice-based 3D-antibody-antigen binding structures with ground-truth access to conformational paratope, epitope, and affinity. We show that Absolut!-generated datasets recapitulate critical biological sequence and structural features that render antibody-antigen binding prediction challenging. To demonstrate the immediate, high-throughput, and large-scale applicability of Absolut!, we have created an online database of 1 billion antibody-antigen structures, the extension of which is only constrained by moderate computational resources. We translated immunological antibody specificity prediction problems into ML tasks and used our database to investigate paratope-epitope binding prediction accuracy as a function of structural information encoding, dataset size, and ML method, an investigation that is infeasible with existing experimental data. Furthermore, we found that the conditions predicted in silico to increase antibody specificity prediction accuracy align with and extend conclusions drawn from experimental antibody-antigen structural data. In summary, the Absolut! framework enables the development and benchmarking of ML strategies for biotherapeutics discovery and design.

Graphical abstract

The software framework Absolut! enables (A,B) the generation of virtually arbitrarily large numbers of in silico 3D-antibody-antigen structures, (C,D) the formalization of antibody specificity as machine learning (ML) tasks as well as the exploration of ML strategies for paratope-epitope prediction.

Highlights

  • Software framework Absolut! to generate an arbitrarily large number of in silico 3D-antibody-antigen structures

  • Generation of one billion in silico antigen-antibody structures

  • Immunological antibody specificity prediction problems formalized as machine learning tasks

  • Exploration of machine learning architectures for paratope-epitope interaction prediction accuracy as a function of neural network depth, dataset size, and sequence-structure encoding
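
As a rough illustration of what formalizing a specificity question as an ML task can look like, the sketch below turns antibody sequences with a per-sequence binding energy into a binary classification dataset. It is not the Absolut! implementation: the function names, the toy sequences and energies, the one-hot encoding, the 50% cutoff, and the convention that a lower energy means stronger binding are all assumptions made for this example.

```python
# Illustrative sketch only; not taken from the Absolut! code base.
# All names, toy values, and the cutoff are hypothetical.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(sequence):
    """Encode a sequence as a flat vector with 20 entries per residue."""
    vector = []
    for residue in sequence:
        row = [0] * len(AMINO_ACIDS)
        row[AMINO_ACIDS.index(residue)] = 1
        vector.extend(row)
    return vector


def label_binders(energies, top_fraction):
    """Label the lowest-energy fraction as binders (1), the rest as 0.

    Assumption: a lower (more negative) energy means stronger binding.
    """
    n_binders = max(1, round(len(energies) * top_fraction))
    cutoff = sorted(energies)[n_binders - 1]
    return [1 if energy <= cutoff else 0 for energy in energies]


# Toy data: made-up sequences and energies for a single antigen.
toy_sequences = ["CARDYW", "CAKGGF", "CTRWLY", "CASSLE"]
toy_energies = [-92.1, -60.3, -88.7, -55.0]

features = [one_hot(seq) for seq in toy_sequences]
labels = label_binders(toy_energies, top_fraction=0.5)
```

The resulting `features` and `labels` lists could be passed to any standard classifier. Choosing a different cutoff, or predicting the energy directly, would define a different task on the same data.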

Competing Interest Statement

E.M. declares holding shares in aiNET GmbH. V.G. declares advisory board positions in aiNET GmbH and Enpicom B.V. V.G. is a consultant for Roche/Genentech.

Footnotes

  • https://github.com/csi-greifflab/Absolut

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.
Posted July 08, 2021.
Subject Area

  • Immunology