PT - JOURNAL ARTICLE
AU - Weiguang Mao
AU - Dennis Kostka
AU - Maria Chikina
TI - The perils of interaction prediction
AID - 10.1101/435065
DP - 2018 Jan 01
TA - bioRxiv
PG - 435065
4099 - http://biorxiv.org/content/early/2018/10/05/435065.short
4100 - http://biorxiv.org/content/early/2018/10/05/435065.full
AB - The availability of genome-wide maps of enhancer-promoter interactions (EPIs) has made it possible to use machine learning approaches to extract and interpret features that determine these interactions in different biological contexts. Multiple methods have claimed to predict enhancer-promoter interactions from the corresponding genomic features, but this problem is still far from solved. In our analysis, we show that individual enhancer and promoter regions have widely different marginal interaction probabilities, i.e., propensities, which can lead to overfitting and memorization when random cross-validation is employed. Further, even when a proper cross-validation scheme is adopted, a simple propensity-based model can still achieve competitive performance without capturing any information about the EPI mechanism.
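
The propensity idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a region's propensity is the smoothed fraction of training pairs involving that region that are labelled as interacting, and that a pair is scored by the product of its two propensities. The function names (`fit_propensities`, `score_pair`), the smoothing parameters, and the toy data are all hypothetical.

```python
from collections import defaultdict


def fit_propensities(pairs, prior=0.5, strength=1.0):
    """Estimate a smoothed interaction rate for every region.

    pairs: iterable of (enhancer_id, promoter_id, label), label in {0, 1}.
    prior and strength are assumed smoothing parameters, not from the paper.
    """
    counts = defaultdict(lambda: [0, 0])  # region -> [positives, total]
    for enhancer, promoter, label in pairs:
        # Enhancers and promoters are kept in separate namespaces.
        for region in (("E", enhancer), ("P", promoter)):
            counts[region][0] += label
            counts[region][1] += 1
    return {
        region: (pos + prior * strength) / (total + strength)
        for region, (pos, total) in counts.items()
    }


def score_pair(propensity, enhancer, promoter, prior=0.5):
    """Score a pair from marginals only; no genomic features are used.

    Regions absent from the training data fall back to the prior.
    """
    p_enhancer = propensity.get(("E", enhancer), prior)
    p_promoter = propensity.get(("P", promoter), prior)
    return p_enhancer * p_promoter


# Toy data: e1 interacts with every promoter, e2 with none.
train = [
    ("e1", "p1", 1), ("e1", "p2", 1), ("e1", "p3", 1),
    ("e2", "p1", 0), ("e2", "p2", 0), ("e2", "p3", 0),
]
propensity = fit_propensities(train)
high = score_pair(propensity, "e1", "p1")
low = score_pair(propensity, "e2", "p1")
unseen = score_pair(propensity, "e9", "p9")
```

On this toy data the pair involving `e1` scores higher than the pair involving `e2`, although both share promoter `p1` and the model never looks at sequence or epigenomic features. A baseline of this kind can look competitive whenever the same regions appear in both training and test pairs, which is why the choice of cross-validation split matters.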