Abstract
Pancreatic cancers are typically diagnosed at a late stage, when prognosis is poor, as exemplified by a 5-year survival rate of 8.2%. Earlier diagnosis would be beneficial by enabling surgical resection or earlier application of therapeutic regimens. We investigated the non-invasive detection of pancreatic ductal adenocarcinoma (PDAC) by interrogating changes in 5-hydroxymethylcytosine (5hmC) status of circulating cell-free DNA in the plasma of a PDAC cohort (n=51) in comparison with a non-cancer cohort (n=41). We found that 5hmC sites are enriched in a disease- and stage-specific manner in exons, 3’UTRs and transcription termination sites. Our data show that 5hmC density is reduced in promoters and histone H3K4me3-associated sites with progressive disease, suggesting increased transcriptional activity. 5hmC density is differentially represented in thousands of genes, and a stringently filtered set of the most significant genes points to biology related to pancreas (GATA4, GATA6, PROX1, ONECUT1) and/or cancer development (YAP1, TEAD1, PROX1, ONECUT1, ONECUT2, IGF1 and IGF2). Regularized regression models were built using either 5hmC densities in statistically filtered genes or a comprehensive set of highly variable 5hmC counts in genes, and achieved an AUC of 0.94-0.96 on training data. We tested the ability of the Elastic net and Lasso models to classify PDAC and non-cancer samples on two external pancreatic cancer 5hmC data sets and found validation performance of AUC = 0.74-0.97. The findings suggest that 5hmC changes enable classification of PDAC patients with high fidelity and are worthy of further investigation in larger cohorts of patient samples.
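For readers less familiar with the AUC figures quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores, using the Mann-Whitney formulation. It is an illustration only: the function name `roc_auc`, the label encoding (1 for PDAC, 0 for non-cancer) and the toy scores are assumptions introduced here. They are not the study's code or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for PDAC, 0 for non-cancer (hypothetical encoding).
    scores: classifier outputs, higher meaning more PDAC-like.
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample from each class")
    wins = 0.0
    # Compare every positive score with every negative score.
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up scores for illustration only (not data from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

Under this definition, an AUC of 0.5 corresponds to scores that do not separate the two classes, and 1.0 to scores that rank every PDAC sample above every non-cancer sample.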