PT - JOURNAL ARTICLE
AU - Jung, Yong
AU - Geng, Cunliang
AU - Bonvin, Alexandre M. J. J.
AU - Xue, Li C.
AU - Honavar, Vasant G.
TI - MetaScore: A novel machine-learning based approach to improve traditional scoring functions for scoring protein-protein docking conformations
AID - 10.1101/2021.10.06.463442
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.10.06.463442
4099 - http://biorxiv.org/content/early/2021/10/09/2021.10.06.463442.short
4100 - http://biorxiv.org/content/early/2021/10/09/2021.10.06.463442.full
AB - Protein-protein interactions play a ubiquitous role in biological function. Knowledge of the three-dimensional (3D) structures of the complexes they form is essential for understanding the structural basis of those interactions and how they orchestrate key cellular processes. Computational docking has become an indispensable alternative to the expensive and time-consuming experimental approaches for determining 3D structures of protein complexes. Despite recent progress, identifying near-native models from a large set of conformations sampled by docking - the so-called scoring problem - still has considerable room for improvement. We present here MetaScore, a new machine-learning based approach to improve the scoring of docked conformations. MetaScore utilizes a random forest (RF) classifier trained to distinguish near-native from non-native conformations using a rich set of features extracted from the respective protein-protein interfaces. These include physico-chemical properties, energy terms, interaction propensity-based features, geometric properties, interface topology features, evolutionary conservation, and scores produced by traditional scoring functions (SFs). MetaScore scores docked conformations by simply averaging the score produced by the RF classifier with that produced by any traditional SF.
We demonstrate that (i) MetaScore consistently outperforms each of the nine traditional SFs included in this work in terms of success rate and hit rate evaluated over the top 10 predicted conformations; and (ii) an ensemble method, MetaScore-Ensemble, which combines 10 variants of MetaScore obtained by combining the RF score with each of the traditional SFs, outperforms each of the MetaScore variants. We conclude that the performance of traditional SFs can be improved upon by judiciously leveraging machine learning. Competing Interest Statement: The authors have declared no competing interest.
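
The score combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states only that the RF score is averaged with a traditional SF score and that MetaScore-Ensemble combines the resulting variants; everything else here is an assumption made for illustration: the function names (`min_max`, `meta_score`, `meta_score_ensemble`, `top_k`), the min-max normalisation of SF scores within one docking case, the convention that lower SF scores are better, and the use of a plain mean for the ensemble.

```python
from statistics import mean


def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def meta_score(rf_probs, sf_scores, lower_is_better=True):
    """Average the RF score with one traditional SF score per conformation.

    Assumption: SF scores are min-max normalised within the docking case
    and flipped so that a higher value always means "more native-like".
    """
    norm = min_max(sf_scores)
    if lower_is_better:
        norm = [1.0 - v for v in norm]
    return [(p + s) / 2.0 for p, s in zip(rf_probs, norm)]


def meta_score_ensemble(rf_probs, sf_score_table):
    """Assumed ensemble rule: mean of the MetaScore variants, one per SF."""
    variants = [meta_score(rf_probs, s) for s in sf_score_table.values()]
    return [mean(column) for column in zip(*variants)]


def top_k(scores, k=10):
    """Indices of the k highest-scoring conformations, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Made-up numbers for three docked conformations of one case.
rf_probs = [0.9, 0.2, 0.6]                 # RF probability of "near-native"
sf_table = {
    "sf_a": [-120.0, -80.0, -100.0],       # hypothetical energy-like scores
    "sf_b": [-3.0, -1.0, -2.0],
}
ensemble = meta_score_ensemble(rf_probs, sf_table)
print(top_k(ensemble))  # ranks conformation 0 first, then 2, then 1
```

Normalising within each docking case is one way to put an energy-like SF score on the same scale as an RF probability before averaging; other rescaling schemes would fit the abstract's description equally well.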