Abstract
Bisulfite sequencing is widely used to detect 5mC and 5hmC at single-base resolution. It is the most widely accepted method for detecting these cytosine modifications, but it has significant drawbacks: bisulfite treatment frequently damages DNA, resulting in fragmentation, loss of DNA, and inherent biases in the sequencing data. To overcome these drawbacks, we developed a new method called Enzymatic Methyl-seq (EM-seq). This method relies on two sets of enzymatic reactions. In the first reaction, TET2 and T4-βGT convert 5mC and 5hmC into substrates that cannot be deaminated by APOBEC3A. In the second reaction, APOBEC3A deaminates unmodified cytosines, converting them to uracils. The protection of 5mC and 5hmC permits the discrimination of unmodified cytosines from 5mC and 5hmC. Over a range of DNA inputs, the overall fraction of 5mC and 5hmC in EM-seq libraries was similar to that in bisulfite libraries. However, libraries made using EM-seq outperformed bisulfite-converted libraries in all specific measures examined, including coverage, duplication, sensitivity, and nucleotide composition. EM-seq libraries displayed even GC distribution, improved correlation across input amounts, increased numbers of CpGs confidently assessed within genomic features, and improved accuracy of cytosine methylation calls in other contexts. Bisulfite sequencing is known to severely damage DNA, making library construction from lower DNA inputs very difficult. We show that EM-seq can be used to make libraries from as little as 100 pg of DNA. These libraries maintain all of the previously described advantages over bisulfite sequencing, opening new avenues for research and clinical applications. Even with challenging input material, EM-seq provides a method to detect methylation state more reliably than whole-genome bisulfite sequencing (WGBS).
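
The two-reaction logic described above can be illustrated with a minimal Python sketch. It is a toy model of a single strand, not the authors' analysis pipeline: the function names `em_seq_convert` and `call_modified` and the example sequence are hypothetical, and the model assumes complete protection and complete deamination. Uracils are read as thymines after amplification, so a cytosine that still reads as C is inferred to have been 5mC or 5hmC.

```python
def em_seq_convert(seq, modified_positions):
    """Simulate the read-out of one strand after EM-seq conversion.

    seq: reference strand as a string of A/C/G/T.
    modified_positions: set of 0-based indices carrying 5mC or 5hmC
    (hypothetical input, used only for illustration).
    """
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in modified_positions:
            # Unprotected C is deaminated to U by APOBEC3A and is
            # read as T after amplification.
            out.append("T")
        else:
            # Protected 5mC/5hmC (and all non-C bases) read unchanged.
            out.append(base)
    return "".join(out)


def call_modified(reference, read):
    """Return positions where a reference C still reads as C."""
    return {
        i
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "C" and read_base == "C"
    }


reference = "ACGTCCGA"
read = em_seq_convert(reference, {1})  # only the C at index 1 is modified
print(read)                            # ACGTTTGA
print(call_modified(reference, read))  # {1}
```

In this sketch 5mC and 5hmC are indistinguishable in the read-out, which matches the description above: both are protected, and only unmodified cytosines are converted.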