PT - JOURNAL ARTICLE
AU - Sebastian Raschka
AU - Alex J. Wolf
AU - Joseph Bemister-Buffington
AU - Leslie A. Kuhn
TI - Protein-ligand interfaces are polarized: Discovery of a strong trend for intermolecular hydrogen bonds to favor donors on the protein side with implications for predicting and designing ligand complexes
AID - 10.1101/260612
DP - 2018 Jan 01
TA - bioRxiv
PG - 260612
4099 - http://biorxiv.org/content/early/2018/02/05/260612.short
4100 - http://biorxiv.org/content/early/2018/02/05/260612.full
AB - Understanding how proteins encode ligand specificity is fascinating and similar in importance to deciphering the genetic code. For protein-ligand recognition, the combination of an almost infinite variety of interfacial shapes and patterns of chemical groups makes the problem especially challenging. Here we analyze data across non-homologous proteins in complex with small biological ligands to address observations made in our inhibitor discovery projects: that proteins favor donating H-bonds to ligands and avoid using groups with both H-bond donor and acceptor capacity. The resulting clear and significant chemical group matching preferences elucidate the code for protein-native ligand binding, similar to the dominant patterns found in nucleic acid base-pairing. On average, 90% of the keto and carboxylate oxygens occurring in the biological ligands formed direct H-bonds to the protein. A two-fold preference was found for protein atoms to act as H-bond donors and ligand atoms to act as acceptors, and 76% of all intermolecular H-bonds involved an amine donor. Together, the tight chemical and geometric constraints associated with satisfying donor groups generate a hydrogen-bonding lock that can be matched only by ligands bearing the right acceptor-rich key. Measuring an index of H-bond preference based on the observed chemical trends proved sufficient to predict other protein-ligand complexes and can be used to guide molecular design.
The resulting Hbind and Protein Recognition Index software packages are being made available for rigorously defining intermolecular H-bonds and measuring the extent to which H-bonding patterns in a given complex match the preference key. Abbreviations: 3D, three-dimensional; CATH, Class Architecture Topology Homologous superfamily; H-bonds, hydrogen bonds; MMFF94, Merck Molecular Force Field; PDB, Protein Data Bank; PRI, Protein Recognition Index.