Abstract
Motivation Recent advances in human genomics have revealed that missense mutations in a single protein can lead to distinctly different phenotypes. In particular, some mutations in oncoproteins such as Ras, MEK, PI3K, PTEN, and SHP2 are linked to various cancers and neurodevelopmental disorders (NDDs). While numerous tools exist for predicting the pathogenicity of missense mutations, linking these variants to specific phenotypes remains a major challenge, particularly in the context of personalized medicine.
Results To fill this gap, we developed protPheMut (Protein Phenotypic Mutations Analyzer), which leverages multiple interpretable machine learning methods and integrates diverse biophysics- and network dynamics-based signatures to predict whether mutations of the same protein promote cancer or NDDs. We illustrate the utility of protPheMut for phenotype (cancer/NDD) prediction through mutation analysis of two proteins, PI3Kα and PTEN. In a comparison with seven other predictive tools, protPheMut achieved high accuracy in predicting phenotypic effects, with an AUROC of 0.8501 for PI3Kα mutations related to cancer and Cowden syndrome. For multi-phenotype prediction of PTEN mutations related to cancer, PHTS, and HCPS, protPheMut achieved a micro-averaged AUC of 0.9349. SHAP model explanations provided insights into the mechanisms driving phenotype formation. A user-friendly website deployment is also provided.
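For readers unfamiliar with micro-averaging, the following minimal sketch shows how a micro-averaged AUC can be computed for a three-class problem: every (sample, class) pair is pooled into a single one-vs-rest binary problem, and one AUC is computed over the pool. This is not the protPheMut evaluation code. The function names, the class encoding (0 = cancer, 1 = PHTS, 2 = HCPS), and the toy probabilities are assumptions made purely for illustration.

```python
from typing import List, Sequence


def binary_auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the Mann-Whitney U statistic: the fraction of
    (positive, negative) pairs ranked correctly, ties counting as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def micro_auroc(y_true: Sequence[int],
                proba: Sequence[Sequence[float]],
                n_classes: int) -> float:
    """Micro-average: flatten the one-vs-rest indicator matrix and the
    probability matrix, then compute a single binary AUROC over all pairs."""
    flat_labels: List[int] = []
    flat_scores: List[float] = []
    for y, row in zip(y_true, proba):
        for k in range(n_classes):
            flat_labels.append(1 if y == k else 0)
            flat_scores.append(row[k])
    return binary_auroc(flat_labels, flat_scores)


# Hypothetical toy data: 4 variants, 3 phenotype classes
# (assumed encoding: 0 = cancer, 1 = PHTS, 2 = HCPS).
y_true = [0, 1, 2, 0]
proba = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
    [0.4, 0.5, 0.1],
]
score = micro_auroc(y_true, proba, n_classes=3)
print(f"micro-averaged AUROC: {score:.4f}")
```

Because micro-averaging pools all pairs, classes with more samples contribute more to the final value. Macro-averaging, which averages per-class AUCs, weights each class equally instead.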
Availability Source code and data are available at https://github.com/Spencer-JRWang/protPheMut. We also provide a user-friendly website at http://netprotlab.com/protPheMut.
Supplementary information Supplementary data are available at Bioinformatics online.
Competing Interest Statement
The authors have declared no competing interest.