RT Journal Article
SR Electronic
T1 Model-based differential sequencing analysis
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2023.03.29.534803
DO 10.1101/2023.03.29.534803
A1 Akosua Busia
A1 Jennifer Listgarten
YR 2023
UL http://biorxiv.org/content/early/2023/04/07/2023.03.29.534803.abstract
AB Characterizing differences in biological sequences between two conditions using high-throughput sequencing data is a prevalent problem wherein we seek to (i) quantify how sequence abundances change between conditions, and (ii) build predictive models to estimate such differences for unobserved sequences. A key shortcoming of current approaches is their extremely limited ability to share information across related but non-identical reads. Consequently, they cannot make effective use of sequencing data, nor can they be directly applied in many settings of interest. We introduce model-based enrichment (MBE) to overcome this shortcoming. MBE is based on sound theoretical principles, is easy to implement, and can trivially make use of advances in modern-day machine learning classification architectures or related innovations. We extensively evaluate MBE empirically, both in simulation and on real data. Overall, we find that our new approach improves accuracy compared to current ways of performing such differential analyses. Competing Interest Statement: The authors have declared no competing interest.