Abstract
A new method termed “Relative Principal Components Analysis” (RPCA) is introduced that extracts optimal relevant principal components to describe the change between two data samples representing two macroscopic states. The method is applicable in all areas of data-driven science. The components are mined within a unified physical framework that introduces the Kullback-Leibler divergence as the objective function appropriate for quantifying the change of the macroscopic state as it is effected by the microscopic features. We also provide a proof of existence of a low-dimensional space for latent informative features of the change. To demonstrate the applicability of RPCA, we analyze the thermodynamically relevant conformational changes of the protein HIV-1 protease upon binding to different drug molecules. In this case, RPCA provides a sound thermodynamic foundation for the analysis of the binding process. The relevant collective (global) conformational changes can be reconstructed from the informative latent variables to exhibit both the enhanced and the restricted conformational fluctuations upon ligand association. In addition, RPCA characterizes the locally relevant conformational changes, which can be presented on the structure of the protein.
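
To make the role of the Kullback-Leibler divergence as an objective function concrete, the following is a minimal sketch and not the authors' implementation. It assumes that both samples are approximately multivariate Gaussian, whitens the feature space with respect to the reference sample, and diagonalizes the covariance of the second sample in that space, so that the divergence splits into one non-negative term per component. The function name `gaussian_kl_components` and its arguments are hypothetical and chosen for illustration only.

```python
import numpy as np

def gaussian_kl_components(ref, tgt):
    """Per-component KL(tgt || ref), ASSUMING both samples are Gaussian.

    ref, tgt: arrays of shape (n_samples, n_features).
    Returns (kl, proj): kl holds the per-component contributions sorted in
    descending order; the columns of proj are the matching projection vectors.
    """
    mu_a, mu_b = ref.mean(axis=0), tgt.mean(axis=0)
    cov_a = np.cov(ref, rowvar=False)
    cov_b = np.cov(tgt, rowvar=False)

    # Whitening transform of the reference state: W @ cov_a @ W.T = I.
    evals, evecs = np.linalg.eigh(cov_a)
    W = (evecs / np.sqrt(evals)).T

    # Covariance of the second state in the whitened space, then its eigenbasis.
    lam, U = np.linalg.eigh(W @ cov_b @ W.T)

    # Mean shift expressed in the same eigenbasis.
    m = U.T @ (W @ (mu_b - mu_a))

    # Gaussian KL divergence, one term per component (each term is >= 0).
    kl = 0.5 * (lam - np.log(lam) - 1.0 + m**2)

    order = np.argsort(kl)[::-1]
    return kl[order], (W.T @ U)[:, order]
```

Under this Gaussian assumption, the per-component terms sum to the closed-form divergence between the two fitted distributions. Ranking the components by their contribution shows which few directions carry most of the change, and this is the sense in which a low-dimensional informative subspace can be selected in the sketch.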