Abstract
DNA comprises molecular information stored via genetic bases (G, C, T, A) and also epigenetic bases, principally 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC). Both genetic and epigenetic information are vital to our understanding of biology and disease states. Most DNA sequencing approaches address either genetics or epigenetics and thus capture incomplete information. Methods widely used to detect epigenetic DNA bases typically fail to capture common C-to-T mutations or to distinguish 5mC from 5hmC. Here, we present a single-base-resolution sequencing methodology that simultaneously sequences complete genetics and complete epigenetics in a single workflow. The approach is non-destructive to DNA and provides a digital readout of bases, which we exemplify by simultaneous sequencing of G, C, T, A, 5mC and 5hmC (6-Letter sequencing). We demonstrate sequencing of human genomic DNA and also cell-free DNA taken from a blood sample of a cancer patient. The approach is accurate, requires low DNA input and has a simple workflow and analysis pipeline. We envisage it will be versatile across many applications in life sciences.
Competing Interest Statement
Shankar Balasubramanian is an advisor of Cambridge Epigenetix and holds stock options. All the other authors are current or former employees and hold stock options.
Footnotes
An additional figure panel (Fig. 2E) has been added; Figs. 2E and 2F of the previous version are now Figs. 2F and 2G. Supplementary Methods 5 has been added to describe how Fig. 2E was produced; the previous Supplementary Methods 5 is now Supplementary Methods 6. "Six" has been changed to "five" on page 3, because that section describes 5-Letter sequencing rather than 6-Letter sequencing (where "six" would be accurate). Two authors have been added.