PT - JOURNAL ARTICLE
AU - Grzegorz Chojnowski
AU - Adam J. Simpkin
AU - Diego A. Leonardo
AU - Wolfram Seifert-Davila
AU - Dan E. Vivas-Ruiz
AU - Ronan M. Keegan
AU - Daniel J. Rigden
TI - Identification of unknown proteins in X-ray crystallography and cryo-EM
AID - 10.1101/2021.04.18.440303
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.04.18.440303
4099 - http://biorxiv.org/content/early/2021/04/18/2021.04.18.440303.short
4100 - http://biorxiv.org/content/early/2021/04/18/2021.04.18.440303.full
AB - Although experimental protein structure determination usually targets known proteins, chains of unknown sequence are often encountered. They can be purified from natural sources, appear as an unexpected fragment of a well characterized protein or as a contaminant. Regardless of the source of the problem, the unknown protein always requires tedious characterization. Here we present an automated pipeline for the identification of protein sequences from cryo-EM reconstructions and crystallographic data. We present the method’s application to characterize the crystal structure of an unknown protein purified from a snake venom. We also show that the approach can be successfully applied to the identification of protein sequences and validation of sequence assignments in cryo-EM protein structures. Competing Interest Statement: The authors have declared no competing interest.