Abstract
We present a structure-based method for finding and evaluating structural similarities in protein regions relevant to ligand binding. PDBspheres comprises an exhaustive library of protein structure regions (“spheres”) adjacent to complexed ligands derived from the Protein Data Bank (PDB), along with methods to find and evaluate structural matches between a protein of interest and spheres in the library. Currently, PDBspheres’ library contains more than 2 million spheres, organized to facilitate searches by sequence and/or structure similarity of protein-ligand binding sites or interfaces between interacting molecules. PDBspheres uses the LGA structure alignment algorithm as the main engine for detecting structure similarities between the protein of interest and library spheres. An all-atom structure similarity metric ensures that sidechain placement is taken into account in the PDBspheres’ primary assessment of confidence in structural matches. In this paper, we (1) describe the PDBspheres method, (2) demonstrate how PDBspheres can be used to detect and characterize binding sites in protein structures, (3) compare PDBspheres use for binding site prediction with seven other binding site prediction methods using a curated dataset of 2,528 ligand-bound and ligand-free crystal structures, and (4) use PDBspheres to cluster pockets and assess structural similarities among protein binding sites of the 4,876 structures in the “refined set” of PDBbind 2019 dataset. The PDBspheres library is made publicly available for download at https://proteinmodel.org/AS2TS/PDBspheres
Competing Interest Statement
The authors have declared no competing interest.