RT Journal Article
SR Electronic
T1 Sequence-structure-function relationships in the microbial protein universe
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2022.03.18.484903
DO 10.1101/2022.03.18.484903
A1 Julia Koehler Leman
A1 Pawel Szczerbiak
A1 P. Douglas Renfrew
A1 Vladimir Gligorijevic
A1 Daniel Berenberg
A1 Tommi Vatanen
A1 Bryn C. Taylor
A1 Chris Chandler
A1 Stefan Janssen
A1 Nick Carriero
A1 Ian Fisk
A1 Ramnik J. Xavier
A1 Rob Knight
A1 Richard Bonneau
A1 Tomasz Kosciolek
YR 2022
UL http://biorxiv.org/content/early/2022/03/20/2022.03.18.484903.abstract
AB For the past half-century, structural biologists have relied on the notion that similar protein sequences give rise to similar structures and functions. While this assumption has driven research to explore certain parts of the protein universe, it disregards spaces that don’t rely on this assumption. Here we explore areas of the protein universe where similar protein functions can be achieved by different sequences and different structures. We predict ∼200,000 structures for diverse protein sequences from 1,003 representative genomes across the microbial tree of life, and annotate them functionally on a per-residue basis. Structure prediction is accomplished using the World Community Grid, a large-scale citizen science initiative. The resulting database of structural models is complementary to the AlphaFold database with regard to domains of life as well as sequence diversity and sequence length. We identify 161 novel folds and describe examples where we map specific functions to structural motifs. We also show that the structural space is continuous and largely saturated, highlighting the need for shifting the focus from obtaining structures to putting them into context, to transform all branches of biology, including a shift from sequence-based to sequence-structure-function based meta-omics analyses. Competing Interest Statement: The authors have declared no competing interest.