RT Journal Article
SR Electronic
T1 Direct coevolutionary couplings reflect biophysical residue interactions in proteins
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 061390
DO 10.1101/061390
A1 Alice Coucke
A1 Guido Uguzzoni
A1 Francesco Oteri
A1 Simona Cocco
A1 Remi Monasson
A1 Martin Weigt
YR 2016
UL http://biorxiv.org/content/early/2016/06/29/061390.abstract
AB Coevolution of residues in contact imposes strong statistical constraints on the sequence variability between homologous proteins. Direct-Coupling Analysis (DCA), a global statistical inference method, successfully models this variability across homologous protein families to infer structural information about proteins. For each residue pair, DCA infers 21×21 matrices describing the coevolutionary coupling for each pair of amino acids (or gaps). To achieve the residue-residue contact prediction, these matrices are mapped onto simple scalar parameters; the full information they contain gets lost. Here, we perform a detailed spectral analysis of the coupling matrices resulting from 70 protein families, to show that they contain quantitative information about the physico-chemical properties of amino-acid interactions. Results for protein families are corroborated by the analysis of synthetic data from lattice-protein models, which emphasizes the critical effect of sampling quality and regularization on the biochemical features of the statistical coupling matrices.