bioRxiv
New Results

Sampling of Structure and Sequence Space of Small Protein Folds

T Linsky, K Noble, A Tobin, R Crow, Lauren Carter, J Urbauer, D Baker, EM Strauch
doi: https://doi.org/10.1101/2021.03.10.434454
T Linsky
1Department of Biochemistry, University of Washington, Seattle, WA 98195, USA.
2Institute for Protein Design, University of Washington, Seattle, WA 98195, USA.
K Noble
3Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA 30602, USA.
A Tobin
3Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA 30602, USA.
R Crow
4Department of Microbiology, University of Washington, Seattle, WA 98195, USA.
Lauren Carter
2Institute for Protein Design, University of Washington, Seattle, WA 98195, USA.
J Urbauer
5Department of Chemistry, University of Georgia, Athens, GA 30602, USA.
D Baker
1Department of Biochemistry, University of Washington, Seattle, WA 98195, USA.
2Institute for Protein Design, University of Washington, Seattle, WA 98195, USA.
6Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.
EM Strauch
3Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA 30602, USA.
7Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA.
  • For correspondence: estrauch@uga.edu

Abstract

Nature samples only a small fraction of sequence space, yet many more amino acid combinations can fold into stable proteins. Furthermore, small structural variations within a single fold, which may differ from the next homolog by only a few amino acids, define molecular function. Hence, to design proteins with novel molecular functionalities, such as molecular recognition, methods to control and sample shape diversity are necessary. To explore this space, we developed and experimentally validated a computational platform that can design a wide variety of small protein folds while sampling high shape diversity. We designed and evaluated about 30,000 de novo protein designs of 7 different folds. Among these designs, about 6,200 stable proteins were identified, with predicted structures that include a first-of-its-kind minimalized thioredoxin. The data revealed additional protein folding rules, such as those for helix-connecting loops, which were in nature. Beyond providing a resource database for protein engineering, our data provide a large training dataset for machine learning. We developed a high-accuracy classifier to predict the stability of our designed proteins. The methods and the wide range of new protein shapes provide a basis for the design of new protein functions without compromising stability.

Competing Interest Statement

The authors have declared no competing interest.

Copyright 
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Posted March 11, 2021.

Subject Area

  • Bioengineering