Abstract
Single-molecule chromatin fiber sequencing relies on identifying DNA N6-methyladenine (m6A) at single-nucleotide resolution along individual sequencing reads. We present fibertools, a semi-supervised convolutional neural network that identifies both endogenous and exogenous m6A-marked bases quickly and accurately from single-molecule long-read sequencing data. Fibertools enables highly accurate m6A identification (>90% precision and recall) along multi-kilobase DNA molecules, with a ∼1,000-fold improvement in speed and the capacity to generalize to new sequencing chemistries.
Competing Interest Statement
The authors have declared no competing interest.
Copyright
The copyright holder for this preprint is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.