Abstract
Protein kinases lie at the heart of cell signalling processes, constitute one of the largest human domain families, and are often mutated in disease. Kinase target recognition at the active site is determined in part by a few amino acids around the phosphoacceptor residue. These preferences vary across kinases and, despite increased knowledge of target substrates, little is known about how most preferences are encoded in the kinase sequence or how they evolve. Here, we used alignment-based approaches to identify 30 putative specificity determinant residues (SDRs) for 16 preferences. These were studied using structural models and validated by activity assays of mutant kinases. Mutation data from patient cancer samples revealed that kinase specificity determinants are often targeted in cancer, to a greater extent than catalytic residues. We observed that, throughout evolution, kinase specificity is strongly conserved across orthologs but can diverge after gene duplication, as illustrated by the evolution of the G-protein-coupled receptor kinase family. The identified SDRs can be used to predict kinase specificity from sequence and to aid in the interpretation of evolutionary or disease-related genomic variants.