bioRxiv
New Results

Robust deep learning based protein sequence design using ProteinMPNN

J. Dauparas, I. Anishchenko, N. Bennett, H. Bai, R. J. Ragotte, L. F. Milles, B. I. M. Wicky, A. Courbet, R. J. de Haas, N. Bethel, P. J. Y. Leung, T. F. Huddy, S. Pellock, D. Tischer, F. Chan, B. Koepnick, H. Nguyen, A. Kang, B. Sankaran, A. K. Bera, N. P. King, D. Baker
doi: https://doi.org/10.1101/2022.06.03.494563
1Department of Biochemistry, University of Washington, Seattle, WA, USA
2Institute for Protein Design, University of Washington, Seattle, WA, USA
3Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
4Molecular Engineering Graduate Program, University of Washington, Seattle, WA, USA
5Berkeley Center for Structural Biology, Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
6Department of Physical Chemistry and Soft Matter, Wageningen University and Research, Wageningen, The Netherlands

  • J. Dauparas: 1, 2
  • I. Anishchenko: 1, 2
  • N. Bennett: 1, 2, 4
  • H. Bai: 1, 2
  • R. J. Ragotte: 1, 2
  • L. F. Milles: 1, 2
  • B. I. M. Wicky: 1, 2
  • A. Courbet: 1, 2, 3
  • R. J. de Haas: 6
  • N. Bethel: 1, 2, 3
  • P. J. Y. Leung: 1, 2, 4
  • T. F. Huddy: 1, 2
  • S. Pellock: 1, 2
  • D. Tischer: 1, 2
  • F. Chan: 1, 2
  • B. Koepnick: 1, 2
  • H. Nguyen: 1, 2
  • A. Kang: 1, 2
  • B. Sankaran: 5
  • A. K. Bera: 1, 2
  • N. P. King: 1, 2
  • D. Baker: 1, 2, 3 (for correspondence: dabaker@uw.edu)

Abstract

While deep learning has revolutionized protein structure prediction, almost all experimentally characterized de novo protein designs have been generated using physically based approaches such as Rosetta. Here we describe a deep learning based protein sequence design method, ProteinMPNN, with outstanding performance in both in silico and experimental tests. The amino acid sequence at different positions can be coupled between single or multiple chains, enabling application to a wide range of current protein design challenges. On native protein backbones, ProteinMPNN has a sequence recovery of 52.4%, compared to 32.9% for Rosetta. Incorporation of noise during training improves sequence recovery on protein structure models, and produces sequences which more robustly encode their structures as assessed using structure prediction algorithms. We demonstrate the broad utility and high accuracy of ProteinMPNN using X-ray crystallography, cryoEM and functional studies by rescuing previously failed designs, made using Rosetta or AlphaFold, of protein monomers, cyclic homo-oligomers, tetrahedral nanoparticles, and target binding proteins.

One-sentence summary: A deep learning based protein sequence design method is described that is widely applicable to current design challenges and shows outstanding performance in both in silico and experimental tests.
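
The sequence recovery figures quoted in the abstract (52.4% versus 32.9%) can be read as the share of positions at which a designed sequence reproduces the native residue. The sketch below illustrates that reading under the assumption of the usual per-position identity definition; this page does not spell out the exact evaluation protocol, and the function name `sequence_recovery` and both toy sequences are hypothetical, chosen only for illustration.

```python
def sequence_recovery(native: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the native one.

    Assumes both sequences describe the same backbone and are already aligned
    position by position (no gaps), so they must have equal length.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(n == d for n, d in zip(native, designed))
    return matches / len(native)


# Toy sequences invented for illustration; they are not data from the paper.
native = "MKTAYIAKQR"
designed = "MKSAYLAKQR"

recovery = sequence_recovery(native, designed)
print(f"sequence recovery: {recovery:.1%}")
```

In practice such a score would be averaged over many backbones and designed sequences; a single pair like the one above only shows the arithmetic.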

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted June 04, 2022.

Subject Area

  • Biophysics