PT  - JOURNAL ARTICLE
AU  - Samuel Sledzieski
AU  - Rohit Singh
AU  - Lenore Cowen
AU  - Bonnie Berger
TI  - Contrasting drugs from decoys
AID  - 10.1101/2022.11.03.515086
DP  - 2022 Nov 04
TA  - bioRxiv
PG  - 2022.11.03.515086
4099  - http://biorxiv.org/content/early/2022/11/04/2022.11.03.515086.short
4100  - http://biorxiv.org/content/early/2022/11/04/2022.11.03.515086.full
AB  - Protein language models (PLMs) have recently been proposed to advance drug-target interaction (DTI) prediction, and have shown state-of-the-art performance on several standard benchmarks. However, a remaining challenge for all DTI prediction models (including PLM-based ones) is distinguishing true drugs from highly similar decoys. Leveraging techniques from self-supervised contrastive learning, we introduce a second-generation PLM-based DTI model trained on triplets of proteins, drugs, and decoys (small drug-like molecules that do not bind to the protein). We show that our approach, CON-Plex, improves specificity while maintaining high prediction accuracy and generalizability to new drug classes. CON-Plex maps proteins and drugs to a shared latent space, which can be interpreted to identify mutually compatible classes of proteins and drugs. Data and code are available at https://zenodo.org/record/7127229. Competing Interest Statement: The authors have declared no competing interest.