ABSTRACT
The generation of de novo protein structures with predefined function and properties remains a challenging problem in protein design. Diffusion models, a novel state-of-the-art class of generative models, have recently shown astounding empirical performance in image synthesis. Here we use image-based representations of protein structure to develop ProteinSGM, a score-based diffusion model that produces realistic de novo proteins and can inpaint plausible backbones and domains into structures of predefined length. With unconditional generation, we show that ProteinSGM can generate native-like protein structures, surpassing the performance of previously reported generative models. We experimentally validate some de novo designs and observe strong structural consistency with generated backbones. Finally, we apply conditional generation to de novo protein design by formulating it as an image inpainting problem, allowing precise and modular design of protein structure.
Competing Interest Statement
P.M.K. is a co-founder of and consultant to multiple companies, including Resolute Bio, Oracle Therapeutics and Navega Therapeutics, and serves on the scientific advisory board of ProteinQure. J.S.L. and J.S.K. declare no competing interests.
Footnotes
Added experimental validation; added new modes of conditional inpainting; updated all figures to reflect the updated model.