The five Urochloa spp. used in development of tropical forage cultivars originate from defined subpopulations with differentiated gene pools

Background and Aims Urochloa (syn. Brachiaria, and including some Panicum and Megathyrsus) is a genus of tropical and subtropical grasses widely sown as forage to feed ruminants in the tropics. A better understanding of the diversity among Urochloa spp. will allow us to leverage their varying ploidy levels and genome composition to accelerate improvement, following the example of other crop genera. Methods We explored the genetic make-up and population structure of 111 accessions, which comprise the five Urochloa species used for the development of commercial cultivars. These accessions are conserved from wild materials collected at sites in their centre of origin in Africa. We used RNA-seq, averaging 40M reads per accession, to generate 1,167,542 stringently selected SNP markers that tentatively encompassed the complete Urochloa gene pool used in breeding. Key Results We identified ten subpopulations, which showed no relation to geographical origin and represented ten independent gene pools, and two groups of admixed accessions. Our results support a division in U. decumbens by ploidy, with a diploid subpopulation closely related to U. ruziziensis, and a tetraploid subpopulation closely related to U. brizantha. We observed highly differentiated gene pools in U. brizantha, which were not related to origin or ploidy. In particular, one U. brizantha subpopulation clustered far from the other U. brizantha and U. decumbens subpopulations, and so likely contains unexplored alleles. We also identified a well-supported subpopulation containing both polyploid U. decumbens and U. brizantha accessions; this was the only group containing more than one species and tentatively constitutes an independent “mixed” gene pool for both species. We observed two gene pools in U. humidicola. One subpopulation, “humidicola-2”, was much less common but likely includes the only known sexual accession in the species.
Conclusions Our results offered a definitive picture of the available diversity in Urochloa to inform breeding and resolve questions raised by previous studies. They also allowed us to identify prospective founders to enrich the breeding gene pool and to develop genotyping and genotype-phenotype association mapping experiments. HIGHLIGHT We clarified the genetic make-up and population structure of 111 Urochloa spp. forage grasses to inform cultivar development.


INTRODUCTION
Urochloa (syn. Brachiaria, and including some Panicum and Megathyrsus) is a genus of tropical and subtropical grasses widely sown as forage to feed ruminants in the American and African tropics, particularly in areas with marginal soils. Urochloa grasses exhibit good resilience and low nutritional needs (Miles, 2007, Gracindo et al., 2014, Maass et al., 2015). Five species, U. ruziziensis, U. decumbens, U. brizantha, U. humidicola, and U. maxima, are broadly used as fodder plants, covering over 100M hectares in Brazil alone. Such an enormous area, about half that of each of the most widely grown cereals, wheat or maize, has a huge environmental impact in terms of displacement of native species, water usage, and provision of ecosystem services. In addition to extensive pasture systems in Latin America, Urochloa is also planted in intensive smallholder systems in Africa and Asia (Keller-Grein et al., 1996, Maass et al., 2015). Breeding programmes in different countries have exploited the diversity among Urochloa spp. for the development of commercial forage cultivars by recurrent selection over many years (Jank et al., 2014, Tsuruta et al., 2015, Worthington and Miles, 2015).

The genus Urochloa includes species previously classified under Brachiaria, Megathyrsus, Eriochloa and Panicum (Torres González and Morton, 2005, Kellogg, 2015). Joint missions between 1984 and 1985 conducted by the CGIAR (Consultative Group on International Agricultural Research) centres in several African countries collected wild materials from the species in the genus, mostly as live plant cuttings or ramets (Keller-Grein et al., 1996). These [...] in an agamic (apomictic) group or complex (Do Valle and Savidan, 1996, Renvoize et al., 1996, Ferreira et al., 2016, Triviño et al., 2017). Crosses between ~10 founders from these three species were completed in the late 1980s and their progeny constitutes the basis of the [...]

Population analysis
Population structure analysis was performed with ADMIXTURE (Alexander and Lange, 2011) using K = 3 to K = 10 for the 111 samples, K = 2 to K = 8 for the 67 samples and K = 2 to K = 8 for the 28 samples. Each value of K was run 10 times, and the cross-validation error was averaged over the 10 runs. The 10 output files were combined using CLUMPP within the R [...]

Species identity and ploidy were previously determined using plant architecture traits and flow cytometry of fluorescently stained nuclei (Tomaszewska et al., 2021a, Tomaszewska et al., 2021b). The country of origin of 92 accessions was known and for 75 accessions we also knew the collection coordinates (Fig. 1). Accessions were collected in a broad range of latitudes (20.08S to 11.37N) but not of longitudes (26.98E to 42.05E), except for one U. brizantha accession from Cameroon. Annotations were summarised in Table 1 [...] (Worthington et al., 2021), which corresponds to a diploid U. ruziziensis sample. Two well-defined groups of species were observed based on alignment metrics (Fig. 2): accessions from U. ruziziensis, U. decumbens and U. brizantha (all but one) had more than 70 % of reads aligning to the reference genome exactly once (uniquely-mapping reads). In contrast, accessions from U. maxima and U. humidicola showed a percentage of uniquely-mapping reads under 70 % (Fig. 2A). The grouping was correlated with the genetic distance to the reference genome (reference bias).

The percentage of reads mapping to multiple loci (multi-mapping reads) increased with ploidy (Fig. 2B) for the accessions belonging to U. ruziziensis, U. decumbens and U. brizantha; diploids had a percentage of multi-mapping reads under 5 %, while it was over 5 % in most polyploid accessions. However, the percentage of multi-mapping reads in the other species, which are more distant from the reference, was directly proportional to the total number of mapped reads (Fig. 2B), i.e. independent of ploidy.

[...] were considered together. The "admixture model" assumes that each individual has ancestry from one or more of "K" genetically distinct sources. An estimation of four subpopulations (K = 4) was selected based on the CV error (Suppl. Fig. 1A) and population structure (Fig. 4). A minimum threshold of 50 % genetic composition was used to assign accessions to groups. This allowed us to place the accessions in four groups (Fig. 3): U. humidicola (28 accessions), U. maxima (13 accessions), "agamic group 1" (54 accessions from the three remaining species) and a closely related "agamic group 2" (that corresponded with the "brizantha-1" subpopulation). Three samples obtained from USDA and identified simply as "Urochloa sp." showed an admixture of these four groups and were annotated as "admixed". Accession 26438 (sample 86) was received as U. humidicola. Since it clustered with the U. maxima accessions, we reassigned it to that species. When we reduced the number of groups (K = 3), the U. humidicola and U. maxima species clustered together, but the agamic groups 1 and 2 were consistent (Suppl. Fig. 2). When we increased the number of groups (K = 5), a new group split from the "agamic group 1" (that corresponded with the "brizantha-2" [...] shared ancestry (1-25 %) with diploid U. decumbens. All seven diploid U. decumbens accessions composed the group "decumbens-P2" and were pure accessions with no shared ancestry with any other group. Similarly, ten tetraploid U. decumbens formed the group "decumbens-P4", with pure accessions that had no shared ancestry with any other group. However, another six tetraploid U. decumbens composed a different group together with five U. brizantha accessions, which was called "decumbens/brizantha". This group of eleven accessions was the only one composed of more than one species. Despite this mix, these accessions showed clear shared ancestry among them and no shared ancestry with any other group (except two samples with minor components). Finally, the groups "brizantha-1" and "brizantha-2" were formed by eight and thirteen U. brizantha accessions, respectively. The group "brizantha-2" had pure accessions with no shared ancestry with other groups (with one minor exception under 5 %), while most samples in "brizantha-1" had shared ancestry with "decumbens-P4". The group "brizantha-1" corresponds to the previous "agamic group 2". The "brizantha-2" subpopulation was only observed in Ethiopia, while "brizantha-1" was observed in a broad range of latitudes. When we reduced the number of groups (K = 5), the "decumbens/brizantha" group merged with "decumbens-P4". When we increased the number of groups (K = 7), five "brizantha-1" accessions split into an independent subpopulation (Suppl. Fig. 3).
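
The assignment rule used in these analyses (an accession joins the group that contributes at least a minimum share of its ancestry, and is otherwise treated as admixed) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function name `assign_group`, the default "admixed" label and the ancestry fractions in the example are hypothetical, and only the 50 % threshold is taken from the text.

```python
from typing import Sequence


def assign_group(
    ancestry: Sequence[float],
    group_names: Sequence[str],
    threshold: float = 0.5,
    admixed_label: str = "admixed",
) -> str:
    """Return the group whose ancestry fraction meets the threshold.

    `ancestry` holds one accession's ancestry fractions (one per group,
    summing to about 1), as estimated by an admixture model.
    """
    if len(ancestry) != len(group_names):
        raise ValueError("ancestry and group_names must have the same length")
    # Index of the largest ancestry component for this accession.
    best = max(range(len(ancestry)), key=lambda i: ancestry[i])
    # Accessions whose largest component is below the threshold are admixed.
    if ancestry[best] >= threshold:
        return group_names[best]
    return admixed_label


# Hypothetical ancestry fractions for K = 4 (illustration only, not study data).
groups = ["humidicola", "maxima", "agamic group 1", "agamic group 2"]
print(assign_group([0.02, 0.01, 0.90, 0.07], groups))
print(assign_group([0.30, 0.25, 0.25, 0.20], groups))
```

Keeping the threshold as a parameter allows the same rule to be reused when a stricter cut-off is preferred for a given analysis.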

The admixture analysis was finally completed using only the twenty-eight U. humidicola accessions (Fig. 3C). An estimation of two groups (K = 2) was selected based on the CV error (Suppl. Fig. 1C) and population structure (Fig. 4). A minimum threshold of 70 % shared genetic composition was used to assign accessions to a group. The twenty-eight samples were divided into two groups: 23 accessions in "humidicola-1" and four accessions in "humidicola-2". Accession 16878 was an equal mixture of both U. humidicola groups and was annotated as "humidicola-admixed". When we increased the number of groups (K = 3 and K = 4), we obtained a small subpopulation with the accessions with higher admixture (16878 and 26155) and an artificial split with some "humidicola-1" accessions in an additional group (Suppl. Fig. 4).

[...] in different subpopulations. The two diploid subpopulations, "decumbens-P2" and "ruziziensis", clustered together and apart from the polyploid subpopulations. Subpopulation "brizantha-1" was more distant from the other agamic subpopulations than "brizantha-2", even though accessions in "brizantha-1" showed admixture with tetraploid U. decumbens, while accessions in "brizantha-2" did not.

Two groups of accessions contained hybrids: one containing hybrids between the distant Urochloa species ("admixed" subpopulation) and another containing hybrids among the three species in the agamic group ("agamic-admixed"), which readily interbreed under controlled conditions.
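
In each admixture analysis, every value of K was run 10 times and the cross-validation (CV) error was averaged over the runs before a K was chosen. A minimal sketch of that averaging step follows. It assumes that each run's log contains a line of the form `CV error (K=4): 0.51234`, which is how ADMIXTURE reports the value when cross-validation is enabled; the function names and the example values are hypothetical. In the study, K was selected from the CV error together with the population structure, so the lowest mean error is only one part of the decision.

```python
import re
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable

# Assumed log format: "CV error (K=4): 0.51234"
CV_LINE = re.compile(r"CV error \(K=(\d+)\):\s*([0-9.]+)")


def mean_cv_error_by_k(log_lines: Iterable[str]) -> Dict[int, float]:
    """Average the CV error over all replicate runs of each K."""
    errors = defaultdict(list)
    for line in log_lines:
        match = CV_LINE.search(line)
        if match:
            errors[int(match.group(1))].append(float(match.group(2)))
    return {k: mean(values) for k, values in sorted(errors.items())}


def lowest_cv_k(mean_errors: Dict[int, float]) -> int:
    """Return the K with the lowest mean CV error."""
    return min(mean_errors, key=mean_errors.get)


# Hypothetical log lines from two replicate runs of K = 2 and K = 3.
example_lines = [
    "CV error (K=2): 0.60",
    "CV error (K=2): 0.62",
    "CV error (K=3): 0.50",
    "CV error (K=3): 0.54",
]
means = mean_cv_error_by_k(example_lines)
for k, err in means.items():
    print(f"K={k}: mean CV error {err:.3f}")
print("Lowest mean CV error at K =", lowest_cv_k(means))
```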

DISCUSSION
We clarified the relationship between the gene pools from five Urochloa spp. that are used in the development of commercial forage cultivars. By using RNA-seq, we leveraged an unprecedented number of markers, over 1.1M SNPs, that virtually encompassed the complete transcriptome from the accessions based on the total genome length covered by the reads (~269 Mbp or 37 % of the genome). We obtained a median of 69 and 13 SNP sites per gene and exon, respectively, which makes this dataset a valuable resource for breeders and researchers (e.g. to design screening markers). The greater length compared to the annotated gene models (Worthington et al., 2021) was probably because the latter only used transcriptomic data from U. ruziziensis. However, unlike our dataset, those transcriptomic data came from multiple tissues and included stress conditions. The genus Urochloa includes species previously classified under other taxonomic groups. We have opted to annotate all as Urochloa, supported by recent work (Tomaszewska et al., 2021b). For example, we observed the same distance between U. maxima and the agamic group as between U. humidicola and the agamic group.

[...] to form an independent cluster ("agamic group 2"; Fig. 3A and 4A). This group of eleven accessions later formed the subpopulation "brizantha-1". Despite "brizantha-1" being distant, we observed admixture between "brizantha-1" and the "decumbens-P4", "brizantha-2" and "decumbens/brizantha" subpopulations (Fig. 3B). Among the possible evolutionary scenarios that would explain the multiple shared ancestry in "brizantha-1" despite it being more distant, previous studies have proposed a single polyploidization event that established both the tetraploid U. brizantha and U. decumbens (Pessoa-Filho et al., 2017, Tomaszewska et al., 2021b). The "brizantha-1" subpopulation was observed in a broad range of latitudes (e.g. in Ethiopia and Zimbabwe), while "brizantha-2" was only observed in Ethiopia.

We obtained a subpopulation, named "decumbens/brizantha", that included an almost equal number of U. decumbens and U. brizantha accessions. This is the only subpopulation with more than one species, and most accessions did not show shared ancestry with the other subpopulations from either of these species (Fig. 3B). Furthermore, the PCA also showed that "decumbens/brizantha" clustered independently from other groups (in the top right corner of Fig. 4D). Remarkably, "decumbens/brizantha" merged with "decumbens-P4" when we did the admixture analysis with fewer subpopulations (K = 5). At the same time, two accessions (16173 and PI226049) shared ancestry with "brizantha-2" and were situated between the subpopulations "decumbens/brizantha" and "brizantha-2" in the PCA (Fig. 4D). Consequently, we concluded that "decumbens/brizantha" cannot be merged with either "decumbens-P4" or "brizantha-2"; on the contrary, the evidence supports it being an independent subpopulation.

[...] from EMBRAPA's collection (so resulting from the same fieldwork in the 1980s as our dataset) using 20 SSR markers. Eleven accessions are common to both studies, and our subpopulations "brizantha-1" and "brizantha-2" corresponded with clusters II and I, respectively (Vigna et al., 2011b). Notably, their cluster III appears to include additional "brizantha-1" and "brizantha-2" accessions (16122, 16480), so it does not correspond with our "decumbens/brizantha" subpopulation. Triviño et al. (2017) did not discuss a division among U. brizantha accessions, but included a tree resulting from UPGMA clustering based on 39 microsatellites that would also support at least two gene pools in U. brizantha.

This cluster of accessions included hybrid accessions resulting from interspecific crosses within the agamic group, and should not be confused with the "admixed" accessions (Fig. 4C), which resulted from crosses between more distant Urochloa species. Our analysis supports that the "agamic-admixed" accessions are progeny from crosses either between U. decumbens and either U. decumbens or U. brizantha (16505, PI210724, PI292187), or between U. ruziziensis and either U. decumbens or U. brizantha (26175, 16494, 26110, 1752).

U. maxima is also known as Panicum maximum or Megathyrsus maximus. All U. maxima accessions (including accession 26438, which was incorrectly annotated as U. humidicola) showed very limited diversity (Fig. 4) and were assigned to a single subpopulation ("maxima"). This could reflect lower diversity in the species, or be a consequence of the original collection and sampling strategy, but it suggests there would be limited gains from including multiple accessions from our study in breeding programmes.