bioRxiv
New Results

De novo protein design by deep network hallucination

Ivan Anishchenko, Tamuka M. Chidyausiku, Sergey Ovchinnikov, Samuel J. Pellock, David Baker
doi: https://doi.org/10.1101/2020.07.22.211482
Ivan Anishchenko
1Department of Biochemistry, University of Washington, Seattle, WA 98105
2Institute for Protein Design, University of Washington, Seattle, WA 98105
Tamuka M. Chidyausiku
1Department of Biochemistry, University of Washington, Seattle, WA 98105
2Institute for Protein Design, University of Washington, Seattle, WA 98105
Sergey Ovchinnikov
3John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA 02138
Samuel J. Pellock
1Department of Biochemistry, University of Washington, Seattle, WA 98105
2Institute for Protein Design, University of Washington, Seattle, WA 98105
David Baker
1Department of Biochemistry, University of Washington, Seattle, WA 98105
2Institute for Protein Design, University of Washington, Seattle, WA 98105
4Howard Hughes Medical Institute, University of Washington, Seattle, WA 98105
  • For correspondence: dabaker@uw.edu

Abstract

There has been considerable recent progress in protein structure prediction using deep neural networks to infer distance constraints from amino acid residue co-evolution [1–3]. We investigated whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generated random amino acid sequences and input them into the trRosetta structure prediction network to predict starting distance maps, which, as expected, are quite featureless. We then carried out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (KL-divergence) between the distance distributions predicted by the network and the background distribution. Optimization from different random starting points resulted in a wide range of proteins with diverse sequences and all-alpha, all-beta, and mixed alpha-beta structures. We obtained synthetic genes encoding 129 of these network-hallucinated sequences, expressed and purified the proteins in E. coli, and found that 27 folded to monomeric stable structures with circular dichroism spectra consistent with the hallucinated structures. Thus, deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute, alongside traditional physically based models, to the de novo design of proteins with new functions.
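
The sampling loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (see the trDesign repository listed under Footnotes). Several pieces are assumptions made only for illustration: `toy_predictor` is a hypothetical stand-in for the trRosetta network, the background distribution is taken to be uniform, and the Metropolis acceptance rule, the `temperature` value, and the tracking of the best sequence seen are choices of this sketch rather than details stated in the abstract.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
N_BINS = 8  # number of distance bins (assumed for this toy example)


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def toy_predictor(seq):
    """Hypothetical stand-in for a structure prediction network.

    Returns one distribution over N_BINS distance bins per residue pair
    (i < j). The rule is arbitrary and has no physical meaning.
    """
    dists = []
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            a, b = AMINO_ACIDS.index(seq[i]), AMINO_ACIDS.index(seq[j])
            peak = (a + b) % N_BINS
            sharpness = (a * b) % 5
            weights = [math.exp(sharpness) if k == peak else 1.0
                       for k in range(N_BINS)]
            total = sum(weights)
            dists.append([w / total for w in weights])
    return dists


def mean_kl(seq, predictor, background):
    """Average KL-divergence of predicted distributions from the background."""
    dists = predictor(seq)
    return sum(kl_divergence(p, background) for p in dists) / len(dists)


def hallucinate(length=12, steps=500, temperature=0.05, seed=0,
                predictor=toy_predictor):
    """Monte Carlo search in sequence space that maximizes mean KL-divergence."""
    rng = random.Random(seed)
    background = [1.0 / N_BINS] * N_BINS  # assumed uniform background
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    score = mean_kl(seq, predictor, background)
    start_score = score
    best_seq, best_score = list(seq), score
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)  # propose a single-residue mutation
        new_score = mean_kl(seq, predictor, background)
        # Metropolis criterion: always accept improvements, sometimes accept
        # worse moves so the search can escape local optima.
        if new_score >= score or rng.random() < math.exp(
                (new_score - score) / temperature):
            score = new_score
            if score > best_score:
                best_seq, best_score = list(seq), score
        else:
            seq[pos] = old  # reject: restore the previous residue
    return "".join(best_seq), start_score, best_score
```

Calling `hallucinate(seed=1)` and `hallucinate(seed=2)` starts from different random sequences, mirroring the abstract's point that different starting points lead to different optimized sequences. With a real network in place of `toy_predictor`, each step would need a forward pass, so the step count and temperature would have to be chosen accordingly.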

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

  • https://github.com/gjoni/trDesign

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
Posted July 23, 2020.
De novo protein design by deep network hallucination
Ivan Anishchenko, Tamuka M. Chidyausiku, Sergey Ovchinnikov, Samuel J. Pellock, David Baker
bioRxiv 2020.07.22.211482; doi: https://doi.org/10.1101/2020.07.22.211482

Subject Area

  • Bioengineering