Abstract
Tuberculosis remains a global health problem. Making it easier and quicker to identify which antibiotics an infection is likely to be susceptible to will be a key part of the solution. Whilst whole-genome sequencing offers many advantages, processing the genetic reads to produce the relevant public health and clinical information is, surprisingly, often the responsibility of the end user, which inhibits uptake. Here we describe our mycobacterial genetics processing pipeline and its deployment on a cloud-based platform. For antibiotic resistance prediction, we have implemented the second edition of the WHO catalogue of resistance-associated variants. We validate the resistance prediction performance by constructing and processing a diverse dataset of 2,663 publicly available M. tuberculosis samples with published drug susceptibility testing (DST) data, and find that identifying a sample as resistant if it contains a minor allele known to be associated with resistance increases sensitivity. When only high-confidence DST results are considered, resistance prediction for both isoniazid and rifampicin achieves sensitivity and specificity in excess of 95%.
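The effect of minor-allele calling on sensitivity can be illustrated with a minimal sketch. It is not the pipeline's implementation: the `Variant` class, the function names, the read-fraction thresholds (0.9 for a major call, 0.1 for a minor call), the one-entry catalogue and the five toy samples are all assumptions invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str               # hypothetical label, e.g. "rpoB_S450L"
    allele_fraction: float  # fraction of reads supporting the variant


def predict_resistant(variants, catalogue, min_fraction):
    """Call resistance if any catalogued variant reaches min_fraction."""
    return any(
        v.name in catalogue and v.allele_fraction >= min_fraction
        for v in variants
    )


def sensitivity_specificity(predictions, phenotypes):
    """Compare predictions (bool) with DST phenotypes ('R' or 'S')."""
    pairs = list(zip(predictions, phenotypes))
    tp = sum(1 for p, t in pairs if p and t == "R")
    fn = sum(1 for p, t in pairs if not p and t == "R")
    tn = sum(1 for p, t in pairs if not p and t == "S")
    fp = sum(1 for p, t in pairs if p and t == "S")
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy catalogue and samples (invented, not drawn from the 2,663-sample dataset).
catalogue = {"rpoB_S450L"}
samples = [
    ([Variant("rpoB_S450L", 0.98)], "R"),     # fixed resistant variant
    ([Variant("rpoB_S450L", 0.25)], "R"),     # resistant minor allele
    ([], "S"),                                # no variants
    ([Variant("uncatalogued", 0.99)], "S"),   # variant not in catalogue
    ([], "R"),                                # resistance not explained
]
phenotypes = [phenotype for _, phenotype in samples]

# Assumed thresholds: 0.9 ignores minor alleles, 0.1 includes them.
major_only = [predict_resistant(v, catalogue, 0.9) for v, _ in samples]
with_minor = [predict_resistant(v, catalogue, 0.1) for v, _ in samples]

sens_major, spec_major = sensitivity_specificity(major_only, phenotypes)
sens_minor, spec_minor = sensitivity_specificity(with_minor, phenotypes)

print(f"major only: sensitivity={sens_major:.2f}, specificity={spec_major:.2f}")
print(f"with minor: sensitivity={sens_minor:.2f}, specificity={spec_minor:.2f}")
```

In this toy data, lowering the threshold recovers the second sample, so sensitivity rises while specificity is unchanged. On real data a lower threshold could also admit false positives, which is why both metrics are reported together.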
Competing Interest Statement
SS and KG are employed by the Ellison Institute of Technology, Oxford. TEAP, DWC, and PWF receive consultancy fees from the Ellison Institute of Technology, Oxford.