RT Journal Article SR Electronic T1 MetaScore: A novel machine-learning based approach to improve traditional scoring functions for scoring protein-protein docking conformations JF bioRxiv FD Cold Spring Harbor Laboratory SP 2021.10.06.463442 DO 10.1101/2021.10.06.463442 A1 Jung, Yong A1 Geng, Cunliang A1 Bonvin, Alexandre M. J. J. A1 Xue, Li C. A1 Honavar, Vasant G. YR 2021 UL http://biorxiv.org/content/early/2021/10/09/2021.10.06.463442.abstract AB Protein-protein interactions play a ubiquitous role in biological function. Knowledge of the three-dimensional (3D) structures of the complexes they form is essential for understanding the structural basis of those interactions and how they orchestrate key cellular processes. Computational docking has become an indispensable alternative to the expensive and time-consuming experimental approaches for determining 3D structures of protein complexes. Despite recent progress, identifying near-native models from a large set of conformations sampled by docking - the so-called scoring problem - still has considerable room for improvement. We present here MetaScore, a new machine-learning based approach to improve the scoring of docked conformations. MetaScore utilizes a random forest (RF) classifier trained to distinguish near-native from non-native conformations using a rich set of features extracted from the respective protein-protein interfaces. These include physico-chemical properties, energy terms, interaction propensity-based features, geometric properties, interface topology features, evolutionary conservation, and scores produced by traditional scoring functions (SFs). MetaScore scores docked conformations by simply averaging the score produced by the RF classifier with that produced by any traditional SF.
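The averaging step described above can be illustrated with a minimal sketch. The abstract does not say how the two scores are brought onto a common scale, so the min-max normalization, the sign flip for energy-like SFs, and the function names `min_max` and `meta_score` are all assumptions made for illustration, not the authors' implementation.

```python
def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to 0.5 (assumed convention)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def meta_score(rf_scores, sf_scores, sf_lower_is_better=True):
    """Average an RF score with a traditional SF score, one value per conformation.

    rf_scores: RF outputs, assumed to lie in [0, 1] with higher meaning more
               likely near-native.
    sf_scores: raw traditional SF values for the same conformations.
    sf_lower_is_better: assumed flag; many SFs are energy-like, so lower is better.
    """
    if len(rf_scores) != len(sf_scores):
        raise ValueError("rf_scores and sf_scores must have the same length")
    # Flip the sign so that a higher value is always better before rescaling.
    oriented = [-s for s in sf_scores] if sf_lower_is_better else list(sf_scores)
    sf_norm = min_max(oriented)
    # The combination itself is a plain arithmetic mean of the two scores.
    return [(r + s) / 2 for r, s in zip(rf_scores, sf_norm)]


# Hypothetical values for three docked conformations.
rf = [0.9, 0.2, 0.6]
sf = [-120.0, -40.0, -80.0]
combined = meta_score(rf, sf)
# Rank conformations from best to worst by the combined score.
ranking = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
```

In this sketch the first conformation ranks highest because both the RF score and the (sign-flipped, rescaled) SF score favor it.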
We demonstrate that (i) MetaScore consistently outperforms each of nine traditional SFs included in this work in terms of success rate and hit rate evaluated over the top 10 predicted conformations; (ii) an ensemble method, MetaScore-Ensemble, that combines 10 variants of MetaScore obtained by combining the RF score with each of the traditional SFs outperforms each of the MetaScore variants. We conclude that the performance of traditional SFs can be improved upon by judiciously leveraging machine learning. Competing Interest Statement: The authors have declared no competing interest.
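The two evaluation metrics named in the abstract, success rate and hit rate over the top 10 conformations, can be sketched as follows. The abstract does not define them, so the definitions used here are assumptions based on common usage in the docking literature: a case counts as a success if at least one near-native model (a "hit") appears among its top N, and the hit rate of a case is the fraction of its hits that appear among the top N. The function names are hypothetical.

```python
def top_n_hits(scores, is_hit, n=10):
    """Count near-native models among the n highest-scoring conformations."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(1 for i in order[:n] if is_hit[i])


def success_rate(cases, n=10):
    """Fraction of cases with at least one hit in the top n (assumed definition).

    cases: list of (scores, is_hit) pairs, one pair per docking case.
    """
    successes = sum(1 for scores, is_hit in cases
                    if top_n_hits(scores, is_hit, n) > 0)
    return successes / len(cases)


def hit_rate(scores, is_hit, n=10):
    """Fraction of a case's hits that are ranked in the top n (assumed definition)."""
    total = sum(is_hit)
    if total == 0:
        return 0.0
    return top_n_hits(scores, is_hit, n) / total


# Two hypothetical cases, evaluated at n=2 to keep the example small.
case_a = ([0.9, 0.1, 0.8, 0.3], [True, False, False, True])
case_b = ([0.2, 0.7, 0.6], [True, False, False])
sr = success_rate([case_a, case_b], n=2)
hr_a = hit_rate(*case_a, n=2)
```

Here only the first case has a hit in its top two, and that case recovers one of its two hits.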