RT Journal Article
SR Electronic
T1 Predicting evolutionary site variability from structure in viral proteins: buriedness, flexibility, and design
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 004481
DO 10.1101/004481
A1 Amir Shahmoradi
A1 Dariya K. Sydykova
A1 Stephanie J. Spielman
A1 Eleisha L. Jackson
A1 Eric T. Dawson
A1 Austin G. Meyer
A1 Claus O. Wilke
YR 2014
UL http://biorxiv.org/content/early/2014/04/24/004481.abstract
AB Several recent works have shown that protein structure can predict site-specific evolutionary sequence variation. In particular, sites that are buried and/or have many contacts with other sites in a structure have been shown to evolve more slowly, on average, than surface sites with few contacts. Here, we present a comprehensive study of the extent to which numerous structural properties can predict sequence variation. The structural properties we considered include buriedness (relative solvent accessibility and contact number), structural flexibility (B factors, root-mean-square fluctuations, and variation in dihedral angles), and variability in designed structures. We obtained structural flexibility measures both from molecular dynamics simulations performed on 9 non-homologous viral protein structures and from variation in homologous variants of those proteins, where available. We obtained measures of variability in designed structures from flexible-backbone design in the Rosetta software. We found that most of the structural properties correlate with site variation in the majority of structures, though the correlations are generally weak (correlation coefficients of 0.1 to 0.4). Moreover, we found that measures of buriedness were better predictors of evolutionary variation than were measures of structural flexibility. Finally, variability in designed structures was a weaker predictor of evolutionary variability than was buriedness, but was comparable in its predictive power to the best structural flexibility measures.
We conclude that simple measures of buriedness are better predictors of evolutionary variation than are more complicated predictors obtained from dynamic simulations, ensembles of homologous structures, or computational protein design.