PT - JOURNAL ARTICLE
AU - Xiaotong Yao
AU - Shuvadeep Maity
AU - Shashank Gandhi
AU - Marcin Imielenski
AU - Christine Vogel
TI - iSUMO - integrative prediction of functionally relevant SUMOylation events
AID - 10.1101/056564
DP - 2017 Jan 01
TA - bioRxiv
PG - 056564
4099 - http://biorxiv.org/content/early/2017/08/07/056564.short
4100 - http://biorxiv.org/content/early/2017/08/07/056564.full
AB - Post-translational modifications by the Small Ubiquitin-like Modifier (SUMO) are essential for diverse cellular functions. Large-scale experiments and sequence-based predictions have identified thousands of SUMOylated proteins. However, the overlap between the datasets is small, suggesting many false positives with low functional relevance. Therefore, we integrated ~800 sequence features and protein characteristics such as cellular function and protein-protein interactions in a machine learning approach to score likely functional SUMOylation events (iSUMO). iSUMO is trained on a total of 24 large-scale datasets, and it predicts 2,291 and 706 SUMO targets in human and yeast, respectively. These estimates are five times higher than what existing sequence-based tools predict at the same 5% false positive rate. Protein-protein and protein-nucleic acid interactions are highly predictive of protein SUMOylation, supporting a role of the modification in protein complex formation. We note the marked prevalence of SUMOylation amongst RNA-binding proteins. We validate iSUMO predictions by experimental or other evidence. iSUMO therefore represents a comprehensive tool to identify high-confidence, functional SUMOylation events for human and yeast.