PT - JOURNAL ARTICLE
AU - Christopher Prior
AU - Owen R Davies
AU - Daniel Bruce
AU - Ehmke Pohl
TI - Obtaining tertiary protein structures by the ab-initio interpretation of small angle X-ray scattering data
AID - 10.1101/572057
DP - 2019 Jan 01
TA - bioRxiv
PG - 572057
4099 - http://biorxiv.org/content/early/2019/09/09/572057.short
4100 - http://biorxiv.org/content/early/2019/09/09/572057.full
AB - Small angle X-ray scattering (SAXS) has become an important tool to investigate the structure of proteins in solution. In this paper we present a novel ab-initio method that represents polypeptide chains as discrete curves and can be used to derive a meaningful three-dimensional model from only the primary sequence and experimental SAXS data. High-resolution crystal structures were used to generate probability density functions for each of the common secondary structural elements found in proteins. These are used to place realistic restraints on the model curve’s geometry. To evaluate the quality of potential models and demonstrate the efficacy of this technique, we developed a new statistic to compare the entangled geometry of two open curves, based on mathematical techniques from knot theory. The chain model is coupled with a novel explicit hydration shell model in order to derive physically meaningful 3D models by optimizing configurations against experimental SAXS data using a Monte Carlo-based algorithm. We show that the combination of our ab-initio method with spatial restraints based on contact predictions successfully derives a biologically plausible model of the coiled-coil component of the human synaptonemal complex central element protein.
SIGNIFICANCE Small-angle X-ray scattering allows for structure determination of biological macromolecules and their complexes in aqueous solution.
Using a discrete curve representation of the polypeptide chain, combined with empirically determined constraints and a realistic solvent model, we are now able to derive realistic ab-initio 3-dimensional models from BioSAXS data. The method requires only a primary sequence and the scattering data from the user.