Abstract
Predicting protein-protein interactions (PPI) is a challenging problem of central importance in fundamental biology. With the increasing number of available PPI prediction methods and databases, an effective evaluation model would be extremely valuable. Here we introduce ZIPPI (Z-score for Information about Protein-Protein Interfaces), which evaluates structural models of a complex based on sequence co-evolution and conservation involving residues that are in contact in the interface. The interface Z-score (ZIPPI score) is calculated by comparing metrics for interface contacts to metrics obtained from randomly chosen surface residues. Since contacting residues are defined by the structural model, this obviates the need to account for indirect interactions with methods such as Direct Coupling Analysis. Although ZIPPI relies on species-paired multiple sequence alignments, its focus on contacting interfacial residues and its avoidance of direct coupling methods make it computationally efficient. The performance of ZIPPI is evaluated through applications to experimentally determined complexes from the Protein Data Bank (PDB) and to decoys from the Critical Assessment of PRedicted Interactions (CAPRI) experiment. We demonstrate how ZIPPI can be implemented on a genome-wide scale by calculating scores for millions of structural models of protein-protein interactions in the E. coli interactome as predicted by PrePPI. Many PrePPI predictions filtered by ZIPPI score are novel. In all, this proteome-scale method shows promise for application to the full human protein interactome, which is not yet accessible to deep learning methods.
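The comparison of interface contacts against randomly chosen surface residues can be illustrated with a minimal sketch. This is not the ZIPPI implementation. The function name `interface_z_score`, its parameters, the use of a single precomputed pairwise metric matrix, and the sampling scheme (one random surface residue from each protein per pair, 1000 resamples) are all assumptions made for illustration.

```python
import random
import statistics


def interface_z_score(pair_metric, interface_contacts, surface_a, surface_b,
                      n_samples=1000, seed=0):
    """Illustrative Z-score of interface contacts versus random surface pairs.

    pair_metric[i][j] is an assumed precomputed score (e.g. a co-evolution
    measure) between residue i of protein A and residue j of protein B.
    interface_contacts is a list of (i, j) pairs taken from the structural model.
    surface_a and surface_b list the surface residue indices of each protein.
    """
    rng = random.Random(seed)
    n_contacts = len(interface_contacts)

    # Mean metric over the contacts defined by the structural model.
    observed = statistics.fmean(pair_metric[i][j] for i, j in interface_contacts)

    # Background: mean metric over equally sized sets of random surface pairs.
    background = []
    for _ in range(n_samples):
        pairs = [(rng.choice(surface_a), rng.choice(surface_b))
                 for _ in range(n_contacts)]
        background.append(statistics.fmean(pair_metric[i][j] for i, j in pairs))

    mu = statistics.fmean(background)
    sigma = statistics.stdev(background)
    if sigma == 0:
        raise ValueError("background distribution has zero variance")
    return (observed - mu) / sigma


# Toy data (hypothetical): two contacts with a high metric on a low background.
metric = [[0.1] * 6 for _ in range(6)]
metric[0][0] = 0.9
metric[1][1] = 0.9
contacts = [(0, 0), (1, 1)]
surface = list(range(6))

z = interface_z_score(metric, contacts, surface, surface)
print(f"toy interface Z-score: {z:.2f}")
```

In this toy case the contacts carry a much stronger signal than typical surface pairs, so the score is positive. Because the contacts are read directly from the structural model, only the pairwise metric for those residues and for the sampled background is needed.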
Competing Interest Statement
The authors have declared no competing interest.