PT - JOURNAL ARTICLE
AU - Alexandre A. Lussier
AU - Yiwen Zhu
AU - Brooke J. Smith
AU - Andrew J. Simpkin
AU - Andrew D.A.C. Smith
AU - Matthew J. Suderman
AU - Esther Walton
AU - Kerry J. Ressler
AU - Erin C. Dunn
TI - Updates to data versions and analytic methods influence the reproducibility of results from epigenome-wide association studies
AID - 10.1101/2021.04.23.441014
DP - 2021 Jan 01
TA - bioRxiv
PG - 2021.04.23.441014
4099 - http://biorxiv.org/content/early/2021/04/24/2021.04.23.441014.short
4100 - http://biorxiv.org/content/early/2021/04/24/2021.04.23.441014.full
AB - Introduction: Biomedical research has grown increasingly cooperative, with several large consortia compiling and sharing epigenomic data. Since data are typically preprocessed by consortia prior to distribution, the implementation of new pipelines can lead to different versions of the same dataset. Analytic frameworks also constantly evolve to incorporate cutting-edge methods and shifting best practices. However, it remains unknown how differences in data and analytic versions alter the results of epigenome-wide analyses, which has broad implications for the replicability of epigenetic associations. Thus, we assessed the impact of these changes using a subsample of the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort. Methods: We analyzed two versions of DNA methylation data, processed using separate preprocessing and analytic pipelines, to examine associations of childhood adversity and prenatal smoking exposure with DNA methylation at age 7. We performed two sets of analyses: (1) epigenome-wide association studies (EWAS); (2) Structured Life Course Modeling Approach (SLCMA), a two-stage method that models time-dependent effects.
We also compared results from the SLCMA using more recent methodological recommendations. Results: Differences between ALSPAC data versions impacted both EWAS and SLCMA analyses, yielding different sets of associations at conventional p-value thresholds. However, the magnitude and direction of associations were generally consistent between data versions, regardless of significance thresholds. Updating the SLCMA analytic version similarly altered top associations, but time-dependent effects remained concordant. Conclusions: Changes to data and analytic versions influenced the results of epigenome-wide studies, particularly when using p-value thresholds as reference points for successful replication and stability. Competing Interest Statement: The authors have declared no competing interest.