Abstract
Recent reports show that DNA 5-methylcytosine (5mC) in CpG contexts can be detected using PacBio circular consensus sequencing (CCS). However, the accuracy and robustness of computational methods for long CCS reads still need improvement. In this study, we present ccsmeth, a deep learning method to detect DNA 5mCpGs from PacBio CCS subreads. ccsmeth uses attention-based bidirectional Gated Recurrent Unit (GRU) networks to infer DNA methylation states. Testing ccsmeth on CCS subreads of amplified DNA and M.SssI-treated DNA, we found that it achieved higher performance than existing methods. We also compared the results of ccsmeth on long CCS reads with those from bisulfite sequencing and Nanopore sequencing. The results demonstrate that ccsmeth can accurately detect 5mCpGs from CCS data sequenced using libraries with inserts >10 kb. Moreover, we propose a pipeline that uses PacBio CCS data to detect haplotype-aware methylation in humans.
Competing Interest Statement
The authors have declared no competing interest.