The multi-speed genome of Fusarium oxysporum reveals association of histone modifications with sequence divergence and footprints of past horizontal chromosome transfer events

Fusarium oxysporum is an economically important pathogen causing wilting or rotting disease symptoms in a large number of crops. It is proposed to have a structured, “two-speed” genome: i.e. regions containing genes involved in pathogenicity cluster with transposons on separate accessory chromosomes. This is hypothesized to enhance evolvability. Given the continuum of adaptation of all the genes encoded in a genome, however, one would expect a more complex genome structure. By comparing the genome of reference strain Fol4287 to those of 58 other Fusarium oxysporum strains, we found that some Fol4287 accessory chromosomes are lineage-specific, while others occur in multiple lineages with very high sequence similarity, but only in strains that infect the same host as Fol4287. This indicates that horizontal chromosome transfer has been instrumental in past host-switches. Unexpectedly, we found that the sequence of the three smallest core chromosomes (Chr. 11, 12 and 13) is more divergent than that of the other core chromosomes. Moreover, these chromosomes are enriched in genes involved in metabolism and transport and genes that are differentially regulated during infection. Interestingly, these chromosomes are, like the accessory chromosomes, marked by histone H3 lysine 27 trimethylation (H3K27me3) and depleted in histone H3 lysine 4 dimethylation (H3K4me2). Detailed genomic analyses revealed a complex, “multi-speed genome” structure in Fusarium oxysporum. We found a strong association of H3K27me3 with elevated levels of sequence divergence that is independent of the presence of repetitive elements. This provides new leads into how clustering of genes evolving at similar rates could increase evolvability.

Author summary

Fungi that cause disease on plants are an increasingly important threat to food security. New fungal diseases emerge regularly.
The agricultural industry makes large investments to breed crops that are resistant to fungal infections, yet rapid adaptation enables fungal pathogens to overcome this resistance within a few years or decades. It has been proposed that genome ‘compartmentalization’ of plant pathogenic fungi, in which infection-related genes are clustered with transposable elements (or ‘jumping genes’) into separate, fast-evolving regions, enhances their adaptivity. Here, we aimed to shed light on the possible interplay between genome organization and adaptation. We measured differences in sequence divergence and dispensability between and within individual chromosomes of the important plant pathogen Fusarium oxysporum. Based on these differences we defined four distinct chromosomal categories. We then mapped histone modifications and gene expression levels under different conditions for these four categories. We found a ‘division of labor’ between chromosomes, where some are ‘pathogenicity chromosomes’ - specialized towards infection of a specific host, while others are enriched in genes involved in more generic infection-related processes. Moreover, we confirmed that horizontal transfer of pathogenicity chromosomes likely plays an important role in gain of pathogenicity. Finally, we found that a specific histone modification is associated with increased sequence divergence.

[...] (Fig 2, Fig S2). For these regions synteny is also relatively conserved in tomato-infecting isolates, with aligned segments spanning up to 40 kb, whereas for the rest of the accessory genome this is between 5 and 10 kb (Fig S3).
In conclusion, accessory chromosomes of Fol4287 largely correlate with phylogenetic clade, but chromosome 14 and regions on chromosomes 3 and 6 correlate with host and are more conserved in sequence than core chromosomes.

[...] we observed large deletions in core chromosomes in two melon-infecting strains: ~0.5 Mb in a subtelomeric region of chromosome 12 in Fom010, and ~0.5 Mb in a subtelomeric region of chromosome 13 in Fom013 (Fig 2).

In addition to differences in the propensity for loss or large deletions, there are striking differences in the level of sequence conservation within and among core chromosomes. The three smallest core chromosomes of Fol4287 (Chr. 11, 12 and 13) are clearly more divergent than the other, larger core chromosomes (Fig 2) in terms of sequence similarity as well as synteny (Fig S2, Fig S3). Subtelomeric regions of all chromosomes and a ~1 Mb central region of chromosome 4, associated with a genomic rearrangement when compared to F. verticillioides [41], also show elevated levels of sequence divergence and lower synteny levels (Fig S2, Fig S3).

Notably, these chromosomes are not enriched in repetitive elements and have a similar gene density as other core chromosomes [41] (Fig S4). From here on, we will refer to chromosomes 11, 12 and 13 as 'fast-core' chromosomes.

Genes on fast-core and accessory chromosomes have lower expression levels.

We expected that the pathogenicity regions, required for virulence on tomato (Fig 1)

The proteins encoded by the 28 upregulated genes located in the pathogenicity regions on chromosomes 3 and 6 include ten transposase-like proteins with a domain of unknown function, five enzymes and two small secreted proteins, one of which is a homolog of SIX8 (Table S4).

Interestingly, while on the 'normal' core chromosomes only 4% of the genes are differentially regulated during infection, on the fast-core chromosomes this is > 10%.
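One common way to judge whether such a difference in proportions reflects enrichment is a one-sided hypergeometric test. The sketch below is illustrative only: the text does not state which test produced the reported P-values, the function name `hypergeom_enrichment_p` is hypothetical, and the gene counts are invented to mimic a roughly 4% genome-wide rate against a higher regional rate.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Upper-tail hypergeometric P-value, P(X >= k).

    N: total number of genes considered (background)
    K: differentially expressed genes in the background
    n: genes located in the region of interest
    k: differentially expressed genes within that region
    """
    upper = min(K, n)
    # Count the ways to pick n genes that contain i >= k differentially expressed ones.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical counts (NOT taken from the study): 40 of 1000 genes (4%) are
# differentially expressed overall, and 15 of the 100 genes in the region are.
p = hypergeom_enrichment_p(k=15, n=100, K=40, N=1000)
print(f"regional fraction: {15 / 100:.0%}, P(X >= 15) = {p:.3g}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow in the individual terms; for genome-scale counts a statistics library would be the more practical choice.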

Hence not only chromosome 14, but also the fast-core chromosomes are significantly enriched for differentially expressed genes (P-value < 1.17 × 10⁻³¹), both up- (P-value < 6.5 × 10⁻²⁵) and downregulated (P-value < 1.67 × 10⁻⁶) (Fig 3). Not just fast-core chromosomes 11, 12 and 13, but also the other regions that showed elevated levels of sequence divergence (Fig 1, Fig 2, Fig S2), such as the subtelomeric regions of core chromosomes and the central region on chromosome 4, have more differentially expressed genes than core regions (Fig S6).

[...] (Fig 3, Fig S8). Interestingly, the same holds true for the fast-core chromosomes and sub-telomeric regions, in sharp contrast to the rest of the core genome that is enriched in H3K4me2 and depleted in H3K27me3 (Fig 4, Fig S8, Fig S9, Table S7).

[...] H3K27me3-enriched domain (Fig S10). H3K4me2-enriched domains on core chromosomes are larger than those on fast-core and accessory chromosomes: they span between 10 and 100 kb on the core genome, compared to between 2.5 and 5 kb on the fast-core and accessory chromosomes (Fig 4, Fig S9, [...]

The differences in functional gene classes enriched in core, fast-core, accessory and pathogenicity chromosomes raise the question to what extent the observed hierarchy in sequence divergence levels (Fig 1, Fig 2) [...]

Interestingly, both genes on fast-core chromosomes and genes on core chromosomes that are located close to a telomere also have higher dS and dN values (Fig 5A, Table S8, Fig S11). When we compare substitution levels for genes that are located in a region enriched in H3K27me3 on core chromosomes, the distribution of dN and dS of these genes is similar to those located on the fast-core chromosomes (Fig 5A, Fig S12) [...]

[...] Fig 5, Fig S13, Fig S14). We found that on fast-core chromosomes SNP density was higher in H3K27me3-enriched regions (comprising more than 90% of the fast-core chromosomes), compared to regions that are not enriched in H3K27me3, but that this difference was not significant. However, on core chromosomes H3K27me3-enriched regions have a significantly higher density of SNPs (P-value < 1.5 × 10⁻⁷). When we included SNPs in less closely related strains (i.e. all strains in the Clade III, the clade depicted in Fig 1), we found that fast-core chromosomes, the central region on chromosome 4 and sub-telomeric core regions have more SNPs than (other) core regions (Fig S13) [...]

We found that none of the core chromosomes of Fol4287 are rich in transposons (Fig S4), but the three smallest core chromosomes, termed 'fast-core', are distinct from the other core chromosomes in terms of sequence divergence, synteny, dispensability, and selected histone modifications. In that respect they are similar to [...]

[...] (Fig 5, Fig 6), the authors observed a higher mutation rate in (in their case H3K9me2-enriched) heterochromatin [78]. The fact that on core and fast-core chromosomes, H3K27me3 strongly associates with higher synonymous substitution rates (Fig 6A, B) [...]

[...] 6), and one in which omega follows a beta distribution between 0 and 1 (M7M8 in Fig 6) [...]

Table S3. Functional analysis of differentially regulated genes on pathogenicity region on chromosome 3 and 6.
Table S4. Functional analysis of differentially regulated genes on chromosome 14.