A novel mosaic tetracycline resistance gene tet(S/M) detected in a multidrug-resistant pneumococcal CC230 lineage that underwent capsular switching in South Africa

Objective We report a novel tetracycline resistance gene in Streptococcus pneumoniae and investigate its temporal spread in relation to nationwide clinical interventions. Methods We whole-genome sequenced 12,254 pneumococcal isolates from 29 countries on an Illumina HiSeq sequencer. Serotypes, sequence types and antibiotic resistance were inferred from the genomes. A phylogeny was built from single-nucleotide variants. Temporal changes in spread were reconstructed using a birth-death model. Results We identified tet(S/M) in 131 pneumococcal isolates, of which 97 (74%) caused invasive pneumococcal disease among young children (59% HIV-positive, where HIV status was available) in South Africa. Most tet(S/M)-positive isolates (129/131) belonged to clonal complex (CC)230. A global phylogeny of CC230 (n=389) revealed that tet(S/M)-positive isolates formed a sub-lineage that exhibited multidrug resistance. Using the genomic data and a birth-death model, we detected an unrecognised outbreak of this sub-lineage in South Africa between 2000 and 2004, with an expected number of secondary infections (R) of ~2.5. R declined to ~1.0 in 2005 and to <1.0 in 2012. The decline of the epidemic coincided with, and could be related to, the nationwide implementation of anti-retroviral treatment (ART) for HIV-infected individuals in 2004 and of pneumococcal conjugate vaccines (PCVs) in the late 2000s. Capsular switching from vaccine serotype 14 to non-vaccine serotype 23A was observed within the sub-lineage. Conclusions The prevalence of tet(S/M) in pneumococci was low and its dissemination was due to an unrecognised outbreak of CC230 in South Africa prior to ART and PCVs. However, capsular switching in this multidrug-resistant sub-lineage highlights its potential to continue to cause disease in the post-PCV13 era.


Introduction
Pneumococcal conjugate vaccines (PCVs) targeting up to 13 serotypes were gradually introduced into childhood immunisation programmes in many countries and have significantly reduced pneumococcal deaths globally, by 51% and 75% in HIV-uninfected and HIV-infected children aged <5 years, respectively, saving an estimated 375,000 lives annually when compared with the estimated mortality rate in the pre-vaccine era 1. However, increasing invasive pneumococcal disease (IPD) caused by non-vaccine serotype pneumococci has been observed in numerous locations including England and Wales 2, France 3, Germany 4 and Israel 5, a phenomenon known as serotype replacement. Serotype replacement could be mediated by capsular switching, in which a cps locus encoding a vaccine-type (VT) capsule is replaced by a cps locus encoding a non-vaccine-type (NVT) capsule through homologous recombination 6. Capsular switching within multidrug-resistant lineages, especially those recognised by the Pneumococcal Molecular Epidemiology Network (PMEN, http://spneumoniae.mlst.net/pmen/pmen.asp), is of increasing concern, as these expansions can reduce overall vaccine effectiveness in preventing IPD and temper the reduction in antimicrobial-resistant pneumococcal infections associated with the introduction of PCVs 7. The persistence of the multidrug-resistant lineage ST156 (Spain9V-3, PMEN3) in the USA following the introduction of PCV13 provides a clear example of a historically successful lineage that underwent a capsular switch from VT (serotypes 9V, 14 and 19A) to NVT (serotype 35B) and continued to cause IPD in the post-vaccine era 7-9.

Resistance to tetracycline has been frequently observed in S. pneumoniae 10. The genetic basis has been shown to be tet(M) or, less commonly, tet(O), which encode ribosomal protection proteins that prevent tetracycline from binding to the bacterial 30S ribosomal subunit 10,11. Eleven other classes of ribosomal protection proteins such as tet(S) and twelve mosaic structure of tet

a transposase-containing element IS1216, which potentially mediates chromosomal

Isolate collection
In the GPS project, each participating country randomly selected disease isolates collected via laboratory-based surveillance and carriage isolates collected via cohort studies using the following criteria: ~50% of isolates were from children ≤2 years, 25% from children 3-5 years, and 25% from individuals >5 years. By May 2017 (when the GPS database was last accessed for this study), 12,254 isolates representing 29 countries in Africa (65%), North America (14%), Asia (9%), South America (8%) and Europe (4%) had been sequenced, had passed quality control and were included in this study. The collection spanned 26 years, from 1991 to 2016, and included both carriage (n=4,863) and disease isolates (n=7,391). We compiled metadata including age, year of collection, sample source, HIV status and phenotypic antimicrobial susceptibility testing results, where available, from each participating site. In children <18 months of age, HIV status was confirmed by PCR assay. MIC results were interpreted according to Clinical

Genome sequencing and analyses
The pneumococcal isolates were whole-genome sequenced on an Illumina HiSeq platform and the raw data were deposited in the European Nucleotide Archive (ENA) (Supplementary metadata). We inferred serotype, multilocus sequence type (MLST) and resistance profile for penicillin, chloramphenicol, cotrimoxazole, erythromycin and tetracycline from the genomic data as previously described 18

whereas R<1 indicates a declining epidemic. Notably, R≥1 can be reflected in the coalescent-based skyline plot analysis, whereas R<1 cannot. Therefore, we expected the birth-death skyline model to be a better fit for our data. Other Bayesian population size models (coalescent constant, coalescent exponential and Bayesian skyline), in combination with strict and lognormal-relaxed molecular clocks, were also applied for comparison using BEAST.
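The reading of R described above can be illustrated with a minimal sketch. The function name `classify_epidemic`, the labels it returns and the example interval bounds are hypothetical and are not taken from the study. The sketch assumes that an interval estimate of R (for example a 95% HPD) lying wholly above one suggests a growing epidemic, one lying wholly below one suggests a declining epidemic, and one spanning one is left unclassified.

```python
def classify_epidemic(r_lower: float, r_upper: float) -> str:
    """Classify an epidemic phase from an interval estimate of R.

    r_lower and r_upper are the bounds of a credible interval
    (for example a 95% HPD) for the reproductive number R.
    The three labels are an illustrative convention, not the study's.
    """
    if r_lower > r_upper:
        raise ValueError("r_lower must not exceed r_upper")
    if r_lower > 1.0:
        return "growing"      # whole interval above one
    if r_upper < 1.0:
        return "declining"    # whole interval below one
    return "uncertain"        # interval includes one


# Hypothetical intervals, chosen only to exercise each branch.
for bounds in [(1.8, 3.1), (0.6, 0.9), (0.8, 1.3)]:
    print(bounds, classify_epidemic(*bounds))
```

Treating an interval that includes one as unclassified is a conservative choice made for this sketch; other conventions, such as reporting the posterior probability that R is below one, would be equally reasonable.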

Integrative and conjugative element (ICE)
The ICE was extracted from the de novo assemblies of CC230 isolates and compared using EasyFig version 2.2.2. The NCBI accession numbers for the representative ICE sequences in

HPD of R were below one, indicating a declining epidemic. The coalescent-based skyline plot failed to detect the impact of the epidemic decline, as described in a previous study 27.

ICE carrying tet(S/M)
The acquisition of tetracycline and erythromycin resistance determinants by CC230 was the result of the insertion of a Tn5253-type ICE, which shared a similar structure to

together with the established multidrug-resistant genotypes, it is of concern that any further capsular switching may increase the chance of this multidrug-resistant lineage surviving and continuing to cause invasive disease.

also revealed a high degree of allelic variation that was probably due to homologous recombination 36. This finding is consistent with previous studies which suggest that the tet gene evolved separately from Tn916 10,36. However, the driving force behind the evolution of tet genes remains unclear, given that tetracycline is not used as a first-line antibiotic to treat pneumococcal disease and was seldom used in young children 37. The allelic diversity of tet genes may be maintained by 1) frequent recombination among S. pneumoniae and with closely related species such as the normal nasopharyngeal resident S. mitis 38 and the zoonotic pathogen S. suis 39; and 2) antibiotic selective pressure via the food chain, as tetracycline is widely used in agriculture 40 and its residue is detected in milk 41. Future studies that investigate this driving force will improve our understanding and help to develop preventive measures to reduce tetracycline resistance in S. pneumoniae.

In conclusion, we identified a novel tetracycline resistance determinant tet(S/M) in S. pneumoniae and showed that its dissemination is due to a clonal expansion of the multidrug-

Figure S2. A maximum likelihood phylogenetic tree was constructed using 2,178 SNPs extracted from a 21,296-bp alignment of serotype 23A cps locus sequences from the serotype 23A S. pneumoniae isolates (n=130) in the GPS curated dataset. This analysis used the serotype 23F cps locus reference sequence (accession number CR931685) as the outgroup on which to root the tree. The serotype 23A cps reference sequence (accession number CR931683) was included. The primary clonal complex (CC) or sequence type (ST) associated with each Global Pneumococcal Sequence Cluster (GPSC) is indicated in parentheses.

Figure S3. Linear regression of root-to-tip distance against time for the tet(S/M)-CC230 lineage (n=129) using TempEst v1.5. TempEst detected a significant positive correlation between year of collection and genetic distance from the root, indicating a 'molecular clock' signal, with isolates measurably diversifying from their last common ancestor over time.
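The idea behind a root-to-tip regression can be sketched in a few lines of Python. This is not TempEst's implementation: it assumes the root-to-tip distances have already been measured on a rooted tree, and it skips both the search for the best-fitting root and any significance test. The function name `root_to_tip_regression` and the toy data are hypothetical.

```python
from statistics import mean


def root_to_tip_regression(years, distances):
    """Least-squares fit of root-to-tip distance on sampling year.

    Returns (slope, intercept, r_squared, root_year):
      slope     - change in distance per year (a crude clock-rate estimate)
      root_year - year at which the fitted line crosses zero distance,
                  or None when the slope is zero
    """
    if len(years) != len(distances) or len(years) < 2:
        raise ValueError("need two or more paired observations")
    x_bar, y_bar = mean(years), mean(distances)
    sxx = sum((x - x_bar) ** 2 for x in years)
    syy = sum((y - y_bar) ** 2 for y in distances)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, distances))
    if sxx == 0:
        raise ValueError("all sampling years are identical")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    root_year = -intercept / slope if slope != 0 else None
    return slope, intercept, r_squared, root_year


# Toy data (not from the study): distance grows by 0.5 units per year.
toy_years = [2000, 2004, 2008, 2012]
toy_distances = [5.0, 7.0, 9.0, 11.0]
slope, intercept, r_squared, root_year = root_to_tip_regression(
    toy_years, toy_distances
)
print(slope, r_squared, root_year)
```

A positive slope with a high coefficient of determination is what is usually read as a temporal signal. The zero-distance crossing gives only a rough date for the root, and a dated Bayesian analysis remains the appropriate tool for formal estimates.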