ABSTRACT
Clostridioides difficile is a major cause of hospital-acquired diarrhea and poses significant clinical challenges because of its high morbidity and mortality rates and its involvement in nosocomial outbreaks. Rapid and accurate detection of its toxigenic ribotypes (RTs) is crucial for effective outbreak control. This study aimed to develop a rapid diagnostic methodology based on MALDI-TOF MS and Machine Learning algorithms to differentiate toxigenic C. difficile RTs.
MALDI-TOF spectra were acquired from 379 clinical isolates from 10 Spanish hospitals and analysed using Clover MSDAS, a dedicated software package for MALDI-TOF spectra analysis that is considered the state-of-the-art tool for this purpose, and AutoCdiff, an ad hoc software tool developed in this study.
Seven biomarker peaks were found to differentiate epidemic RT027 and RT181 strains from other RTs (2463, 3353, 4933, 4993, 6187, 6651 and 6710 m/z). Two peaks (2463 and 4993 m/z) were found specifically in RT027 isolates, while combinations of the other five peaks allowed RT181 to be differentiated from other ribotypes. Automatic classification tools developed in Clover MSDAS and AutoCdiff, using the specific peaks and the entire protein spectra, respectively, showed up to 100% balanced accuracy. Both methods correctly assigned the ribotype of isolates from real-time outbreaks.
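As an illustration of the two ideas in the paragraph above (checking a spectrum for the reported biomarker peaks, and scoring a classifier by balanced accuracy), a minimal Python sketch follows. It is not the Clover MSDAS or AutoCdiff implementation. The function names, the peak-list input format and the 1000 ppm matching tolerance are assumptions made for this example only; the m/z values are the ones reported here. The sketch reports only the presence or absence of each peak, because the peak combinations that identify RT181 are not specified in this abstract.

```python
from collections import defaultdict

# Biomarker peaks (m/z) reported in the abstract.
RT027_PEAKS = (2463.0, 4993.0)                        # found specifically in RT027
OTHER_PEAKS = (3353.0, 4933.0, 6187.0, 6651.0, 6710.0)  # used in combination for RT181


def has_peak(peak_list, target_mz, tol_ppm=1000.0):
    """Return True if any detected peak lies within tol_ppm of target_mz.

    tol_ppm is an assumed tolerance for illustration, not a value from the study.
    """
    tol = target_mz * tol_ppm / 1e6
    return any(abs(mz - target_mz) <= tol for mz in peak_list)


def peak_profile(peak_list, tol_ppm=1000.0):
    """Map each of the seven biomarker m/z values to presence (True) or absence (False)."""
    return {t: has_peak(peak_list, t, tol_ppm) for t in RT027_PEAKS + OTHER_PEAKS}


def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy: the unweighted mean of per-class recall."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


if __name__ == "__main__":
    # Hypothetical peak list (m/z values only) for a single isolate.
    detected = [2463.8, 4992.1, 6187.0]
    print(peak_profile(detected))

    # Hypothetical labels, used only to show how the metric behaves.
    truth = ["RT027", "RT027", "RT181", "other"]
    preds = ["RT027", "RT181", "RT181", "other"]
    print(balanced_accuracy(truth, preds))
```

Balanced accuracy averages recall over classes, so a ribotype represented by only a few isolates weighs as much as a common one; this is why it is a sensible metric when ribotypes are unevenly represented in a collection.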
The developed models, available from Clover MSDAS and the AutoCdiff website (https://bacteria.id), offer researchers a valuable tool for quick C. difficile ribotype determination based on MALDI-TOF spectra analysis. Although further validation of the models is still required, they represent rapid and cost-effective methods for standardized C. difficile ribotype assignment.
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
∞ Both authors are senior authors of this article.
The results have been reorganized in a more logical manner: several classification models were trained to differentiate the main C. difficile ribotypes involved in nosocomial outbreaks in our setting, i.e. RT027 and RT181, and were validated with an external dataset. The training and validation datasets were then merged to train an improved model, which was validated in real time with isolates from two hospitals. The models correctly classified 100% of the cases.