PT - JOURNAL ARTICLE
AU - Brandon Frenz
AU - Steven Lewis
AU - Indigo King
AU - Hahnbeom Park
AU - Frank DiMaio
AU - Yifan Song
TI - Prediction of protein mutational free energy: benchmark and sampling improvements increase classification accuracy
AID - 10.1101/2020.03.18.989657
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.03.18.989657
4099 - http://biorxiv.org/content/early/2020/03/20/2020.03.18.989657.short
4100 - http://biorxiv.org/content/early/2020/03/20/2020.03.18.989657.full
AB - Software to predict the change in protein stability upon point mutation is a valuable tool for a number of biotechnological and scientific problems. To facilitate the development of such software and provide easy access to the available experimental data, the ProTherm database was created. Biases in the methods and the types of information collected have led to a disparity in the types of mutations for which experimental data is available. For example, mutations to alanine are heavily overrepresented, whereas those involving charged residues, especially from one charged residue to another, are underrepresented. ProTherm subsets created as benchmark sets that do not account for this disparity often underrepresent certain mutational types. This introduces systematic biases into the ability of previously published protocols to accurately predict the change in folding energy for these classes of mutations. To resolve this issue, we have generated a new benchmark set with these problems corrected. We have then used the benchmark set to test a number of improvements to the point mutation energetics tools in the Rosetta software suite.