Diversification dynamics and (non-)parallel evolution along an ecological gradient in African cichlid fishes

Understanding the drivers and dynamics of diversification is a central topic in evolutionary biology. Here, we investigated the dynamics of diversification in the cichlid fish Astatotilapia burtoni that diverged along a lake-stream environmental gradient. Whole-genome and morphometric analyses revealed that divergent selection was essential at the early stages of diversification, but that periods in allopatry were likely involved towards the completion of speciation. While morphological differentiation was continuous, genomic differentiation was not, as shown by two clearly separated categories of genomic differentiation. Reproductive isolation increased along a continuum of genomic divergence, with a “grey zone” of speciation at ∼0.1% net nucleotide divergence. The quantification of the extent of (non-)parallelism in nine lake-stream population pairs from four cichlid species by means of multivariate analyses revealed one parallel axis of genomic and morphological differentiation among seven lake-stream systems. Finally, we found that parallelism was higher when ancestral lake populations were more similar.

). 151 Next, we investigated the increase in genomic differentiation against a proxy of time since 152 population divergence, net nucleotide difference (Da). To extend the comparative framework 153 encompassing a whole continuum of genomic divergence from populations to species, we here included 154 the between-species comparisons of the three additional haplochromine species H. stappersii, C. horei 155 and P. philander. We found a rapid increase in FST at low levels of Da, which slowed down as Da 156 increased further. Given this trend, we fitted a logarithmic model to the data which was highly significant 157 (linear regression FST ~ log(Da): P < 2.2e -16 ; Pearson correlation coefficient: R 2 = 0.96). Therefore, to 158 better visualize the dynamics of diversification at early stages of genomic differentiation, Da was plotted 159 on a logarithmic scale (Fig. 2B). The one-species category encompassed Da values ranging from 5x10 -6 160 to 5x10 -4 , while the two-species category features Da values of 0.002 and above. With Da ~0.001, the 161 population pair from the Lufubu system occupied the 'grey zone' of speciation between the one-species 162 and two-species categories. 163 Reproductive isolation begins establishing at low levels of genome-wide differentiation 164 As a next step, we assessed to what extent the observed levels of genome-wide differentiation 165 scale with the degree of reproductive isolation between populations in A. burtoni. To this end, we 166 revisited available data from previous experiments that used the same A. burtoni populations as in the 167 present study (24,27,28) and conducted additional mate-choice experiments in the laboratory between 168 the genetically most distinct populations of A. burtoni from the North and the South of LT (26) ( Table 1,  that premating reproductive isolation mechanisms are at play between the genetically most distinct A. 186 burtoni clades -the northern and southern lineages (26) -at a level of genomic differentiation that is 187 similar to the one typically observed between other haplochromine species (Fig. 2B). 188 Divergent selection and drift accelerate genome-wide differentiation 189 We then investigated the relative influence of divergent selection and geography on genomic and 190 morphological diversification in A. burtoni. To do so, we compared the more closely-related southern 191 populations of A. burtoni focusing on within habitat (lake-lake) versus between habitats (lake-stream) 192 comparisons. The latter consisted of the five A. burtoni lake-stream systems from the South of LT (i.e.

193
Chitili; Kalambo 1; Lunzua; Kalambo 2; Lufubu) ( Fig. 1B), whereas the lake-lake comparisons consisted 194 of all pairwise comparisons of lake populations from that area. In both cases, we found an increase in FST 195 over geographic distance (Fig. 2C), which is compatible with an isolation-by-distance scenario for lake-196 stream (linear regression: P = 0.0013; Pearson correlation coefficient: R 2 = 0.97) and lake-lake (Mantel 197 test: P = 0.08; Pearson correlation coefficient: R 2 = 0.98) contrasts. Interestingly, genomic differentiation 198 increased much stronger with respect to geographic distance when populations were compared that 199 inhabit contrasting environments (lake-stream population comparisons; i.e. in the presence of divergent 200 selection and drift) than when they inhabit similar environments (lake-lake comparisons; i.e. primarily 201 in the presence of drift), suggesting that divergent selection outweighs isolation-by-distance in 202 diversification in A. burtoni (Fig. 2C). 203 Demographic modelling within lake-stream population pairs suggested that there is ongoing gene 204 flow between lake and stream fish in all population pairs (Fig. S3; Table S4) and that the effective 205 population size of most stream populations was smaller than the size of the respective lake population 206 (Table S4), which is compatible with the scenario that the respective rivers were colonized from lake 207 stocks (24, 25). As the impact of genetic drift is stronger in small populations, drift may also have 208 contributed to the increased genomic divergence of stream populations. Interestingly, the diversification 209 trajectories were similar between lake-stream and lake-lake comparisons at low levels of genetic, 210 morphological and geographic distances (Fig. 2D), but diverged as genomic differentiation built up much 211 more rapidly compared to geographic distance in the presence of divergent selection and increased drift 212 (lake-stream comparisons) (Fig. 2D). This corroborates the notion that the environment (via divergent 213 selection) and demographic events play a crucial role in differentiation trajectories in A. burtoni at early 214 stages of differentiation. 215 Little overlap between differentiation regions among independent lake-stream systems 216 We then turned our attention to the question of parallel genomic and morphological evolution 217 along the lake-stream environmental gradient. To do so, we compared the six A. burtoni lake-stream 218 population pairs and used three additional population pairs from different haplochromine species to 219 extend the comparative framework to between-species comparisons. Notably, H. stappersii and C. horei 220 are found in sympatry with A. burtoni in the Rusizi River and in the Lufubu River, respectively, providing 221 an unprecedented opportunity to examine two replicates of lake-stream colonization across species. The  philander populations may actually represent two distinct species, as suggested by their different sex 227 determining systems (29). Consistent with this, the P. philander populations display relatively high levels 228 of absolute divergence (dXY = 3.6x10 -3 ), which are above the levels of divergence measured between the 229 northern and southern A. burtoni lineages (dXY = 3.0-3.3x10 -3 ) ( Table 2). 230 We next investigated genomic regions of high differentiation between lake-stream population 231 pairs and evaluated to which extent such outlier regions (defined here as the intersection between the top 232 5% 10-kb windows with respect to FST, dXY, and absolute value of π difference) are shared between 233 population pairs and species. Our analyses revealed between 2 and 101 outlier regions of high 234 differentiation per lake-stream population pair and that these regions were between 10-70 kb in length 235 (red dots in Fig. 3, Table S5). It has been shown that heterogeneity in crossover rates can produce 236 contrasting patterns of genomic differentiation between diverging populations that are not due to 237 divergent selection. These are manifested, for example, in greater levels of differentiation near 238 chromosome centres (where crossover rates are low) compared to the peripheries (where crossover rates 239 are high) (30). In our case, we did not find evidence for an accumulation of differentiation regions in the 240 chromosome centres in any of the lake-stream population pairs nor when all 525 outlier regions were 241 considered jointly (Table S5), suggesting that our results reflect true signatures of divergent selection. 242 We then evaluated to what extent differentiation regions were shared among lake-stream 243 population pairs and species. We found that there was little overlap (Fig. 4A) and that no such region of 244 high differentiation was shared between more than two systems (Fig. 4A). The regions of high 245 differentiation were distributed across all linkage groups, although there seems to be an  (Table S5). The 525 differentiation regions contained a total of 637 genes. However, there was no 248 obvious overrepresentation in functional enrichment with respect to Gene Ontology categories. The only 249 exception was the Lunzua lake-stream comparison, in which outlier genes were significantly enriched 250 for the molecular function "binding" (Table S6). The 25 genes located in the 19 differentiation regions 251 shared between two systems also showed no functional enrichment (Table S7).

252
In situations of a shared evolutionary history of the populations in question, such as in our set-253 up, frequency-based outlier detection methods may not be the most appropriate way to detect regions 254 under selection. Thus, we also performed Bayesian analyses of selection at the SNP level that take into 255 account population relatedness (31). Due to their high levels of genome-wide differentiation, the 256 population pairs of C. horei and P. philander were excluded from these analyses, as signatures of 257 selection and drift cannot easily be disentangled in such cases. For the same reason, we treated the 258 northern and southern populations of A. burtoni as separate units. We identified 1,704 shared 10-kb windows of differentiation between A. burtoni from the South and H. stappersii, 1,683 shared windows 260 between A. burtoni from the North and H. stappersii, and 1,542 shared windows between A. burtoni from 261 the North and from the South (Fig. 4B). In total, 373 windows were shared among the three core sets, 262 containing 367 genes (Fig. 4B; Table S8). Interestingly, some genes involved in sensory perception 263 (sound and light) were overrepresented in the common set of outliers between H. stappersii and the 264 southern A. burtoni populations.

265
(Non-)parallel evolution in lake-stream divergence 266 We further aimed to assess the extent of phenotypic and genomic (non-)parallelism among lake-267 stream pairs of haplochromine cichlids. To examine how (non-)parallel the nine lake-stream population 268 pairs are, we performed vector analyses (19, 32) using the morphological (37 traits and landmarks 269 representing body shape) and genomic (outlier and non-outlier SNPs) data at hand (outlier SNPs:

278
Most comparisons between independent lake-stream population pairs fell into the category 'non-279 parallel', with close to orthogonal (~90°) angles of differentiation in both phenotype (θP) and genotype 280 (θG_outlier and θG) (Fig. 4F, I, Fig. S5D). However, parallelism was higher when only the within A. burtoni 281 comparisons were considered, with values of θP and θG_outlier between 70° and 80° in many cases (Fig. 282 4F, I), but not for θG (Fig. S5D). More clear signatures of parallelism were found in closely related A. 283 burtoni lake-stream systems, with the Lunzua-Kalambo1 pair being the most 'parallel' system at the 284 phenotype level (θP: 42°) and the Chitili-Kalambo1 pair the most 'parallel' at the genotype level (θG_outlier:  It has recently been proposed to examine the vectors of divergence using a multivariate approach 292 as a complementary set of analyses to investigate (non-)parallelism and convergence (34). For each 293 dataset (phenotype, genotype outlier and genotype non-outlier), we thus calculated the eigen 294 decomposition of the respective C matrix (m lake-stream systems x n traits). For the phenotype data, we 295 found that the first eigenvector (or principal component, PC) encompassed about 33% of the total 296 phenotypic variance, which was significantly higher than expected under the null Wishart distribution 297 (Fig. 4G). In other words, there was one dimension of shared evolutionary change that contained 298 significant parallelism. We next examined if the different lake-stream systems evolve in parallel or anti- anti-parallel evolution was detected in phenotypic divergence in haplochromine cichlid lake-stream 305 systems. To infer which phenotypic characteristics were evolving in parallel among the different lake-306 stream systems, we examined which landmarks were contributing the most to the first PC by examining 307 PC1 loading values. We found that the seven landmarks with the most extreme loading values (<-0.4 or 308 >0.4) were related to mouth position (landmark y1), eye size (landmarks x4, y5) and slenderness of the 309 body (body depth/standard length ratio and landmarks y7, y8, y9) (Fig. S5J, M). 310 We finally examined the genetic outlier data and found that the first PC encompassed about 30% 311 of the total genetic variance, which was significantly higher than expected under the null Wishart 312 distribution (Fig. 4J). Remarkably, the signs of PC1 loadings (positive or negative) were the same for 313 each lake-stream system as for the phenotype data, namely positive for H. stappersii and P. philander, 314 and negative for the remaining seven lake-stream systems. This further highlights parallel and anti-  (Fig. S5K,N). This indicates that half of the PCs contributing to genetic parallelism 318 are involved in lake-stream divergence. Finally, when examining the genetic non-outlier data, we found 319 that the first PC encompassed about 25% of the total genetic (non-outlier) variance, which was also 320 significantly higher than expected under the null Wishart distribution (Fig. S5E). However, the  Parallelism is higher when ancestral populations are more similar 326 We then examined whether the degree of similarity in the ancestral lake populations correlates 327 with the extent of genetic and morphological parallelisms. We found a significant correlation in both  Body shape convergence in species inhabiting the same rivers 338 We finally assessed levels of convergence (and divergence) in phenotype and genotype across all 339 lake-stream systems. We first examined how genomic differentiation scales with morphological 340 differentiation across species. We thus contrasted the levels of absolute genomic (dXY) and morphological 341 (DM) differentiation between all A. burtoni populations and those of the three other haplochromine 342 species, resulting in 136 pairwise comparisons. We used dXY rather than FST as dXY is an absolute measure 343 of genomic differentiation that is better suited for between-species comparisons. As above, we classified 344 these comparisons according to the environmental contrasts (lake-lake; lake-stream; stream-stream). The  in Driver compared to Dlake is indicative of divergent evolution. We found divergence at the genomic level 356 in both 'outlier' and 'non-outlier' datasets, since trace subtraction (tr(Driver)-tr(Dlake)) was positive in both 357 cases. In contrast, we found convergence at the phenotypic level, since the result of trace subtraction was 358 negative. In this latter case, two traits encompassed more than 99% of the total variance, namely centroid 359 size (66%) and total length (33%). This might be explained by the fact that the different traits and 360 landmarks have different units (e.g. variation in total length is in cm, while variation in landmarks is in 361 mm), and therefore account differently for the total amount of variance (this inherent issue to the method 362 has been identified by DeLisle & Bolnick (34), and references therein). Therefore, a more intuitive and 363 biologically more meaningful way to assess convergence/divergence is to compare Euclidean distances 364 for a specific trait. Thus, for each pair of lake-stream systems, we compared the Mahalanobis distance 365 (DM) between the respective lake and the stream populations, where convergence is suggested when DM 366 river -DM lake < 0, with a more negative value indicative of higher convergence (34).

367
Within A. burtoni, the majority of pairwise comparisons indicated divergent evolution in body 368 shape except for the pairs Chitili/Kalambo 1 and Chitili/Kalambo 2, with the highest level of convergence 369 observed for the latter comparison (Table S9). The between-species comparisons revealed mostly   Table S3). Thus, we 384 observed body shape convergence in species that co-occur in the same river systems and, hence, 385 diversified along the very same environmental gradient. Finally, we can rule out gene flow between 386 species as the reason for these similarities between species (Figs. S6, S7). 389 The first aim of this study was to investigate the dynamics of genomic and morphological 390 diversification along the speciation continuum in the cichlid model species Astatotilapia burtoni. We that at early stages of differentiation, lake-stream population pairs displayed higher levels of genomic 396 divergence than lake-lake population pairs for equivalent geographic distances. Furthermore, consistent with previous findings based on RAD-seq data (25), we found that stream populations have smaller 398 effective population sizes than their respective lake population, possibly reflecting a founder effect and/or 399 physical limitations of the riverine environment compared to the lake environment. Therefore, the 400 combined effect of divergent selection and genetic drift likely accelerate genome-wide differentiation at 401 early stages of divergence. 402 Furthermore, we assessed the dynamics of morphological differentiation by examining body 403 shape differences among populations, as a slender body shape has been suggested to be adaptive in the 404 stream environment (24). We found that morphology was the first axis of differentiation in both lake-405 stream and lake-lake comparisons compared to genomic differentiation and geographic distance. Then, 406 differentiation trajectories diverged as genomic differentiation built up rapidly over short geographic 407 distances in the presence of divergent selection and drift (i.e. in a lake-stream setting), while genomic 408 differentiation built up more slowly in the presence of drift only (i.e. in a lake-lake setting). The relatively 409 more rapid morphological differentiation in the early phases of diversification may be explained by the 410 fact that variation in body shape is due to genetic and environmental factors (24, 35). 411 We established that divergent natural selection is a driver of early diversification in A. burtoni showed that the "grey zone" of speciation corresponded to Da levels ranging from 0.5% to 2% (41).

487
However, we acknowledge that this study and our data are not entirely comparable due to different ways 488 of measuring reproductive isolation (i.e. gene flow modelling (41) versus experimental measures in our 489 study), and due to the fact that Da is a relative measure of genome differentiation depending on within-species genetic diversity. Therefore, the absolute measure of divergence dXY is better suited for between- The third aim of this study was to examine and quantify levels of (non-)parallelism among six A. 505 burtoni lake-stream systems and three lake-stream systems from additional haplochromine species. To 506 this end, we identified regions of differentiation that we defined as the overlap of the top 5% of FST, dXY 507 and π-difference values. We found between 2 and 101 such outlier regions ranging from 10 to 70 kb. The 508 number of outlier regions reported here is relatively small compared to other studies in cichlids (43, 44), 509 which, however, can be explained by the more stringent definition of such regions in our study (the 510 intersection between three metrics). Furthermore, we found little overlap among differentiation regions 511 from different lake-stream systems, as only 19 outlier regions were shared among two lake-stream 512 systems, and not a single such region was shared by more than two systems. Interestingly, however, with this, a previous study (43) identified common highly differentiated genomic regions between a 518 young and an old cichlid species pairs diverging along a depth gradient in Lake Victoria, and found that 519 two thirds of the differentiation regions were private to each species pair. This highlights that adaptive 520 divergence often encompasses both parallel and system-specific (non-parallel) components.

521
The overall low levels of gene and function sharing between lake-stream systems reported here 522 may be due to cryptic environmental heterogeneity in the stream environment. The streams from which 523 the populations of this study were collected are not only different in size but also encompasses diverse 524 ecological niches that may require specific mechanisms of adaptation (e.g. (33)). A non-mutually 525 exclusive explanation for the lack of sharing of regions of differentiation is that these regions may not be 526 due to divergent selection but due to genetic drift in allopatry and background selection (45), as the  Parallelism revealed by multivariate analyses 533 To better quantify the level of (non-)parallelism among the nine lake-stream systems, we 534 performed Phenotypic Change Vector Analyses (PCVA) (19, 32). Pairwise examination of θP and θG outlier 535 revealed that diversification was not particularly parallel among lake-stream systems, as the majority of 536 θPand θG outlier-values were close to orthogonal. These values are similar to what has been previously 537 reported in lake-stream population pairs of threespine sticklebacks (33). Interestingly, θG outlier-values 538 correlated with θP-values, but θG non-outlier-values did not. This strongly suggests that adaptive phenotypic 539 divergence has a genetic basis in the species under investigation, which is similar to what has been 540 reported for lake-stream population pairs of threespine stickleback fish (33). 541 It has recently been suggested that signatures of parallelism can be overlooked when relying on 542 pairwise comparisons only (34). Therefore, we conducted the recently proposed multivariate analysis of 543 the C matrix (m lake-stream systems x n traits), which is essentially a PCA of (phenotypic or genomic) 544 vectors of differentiation to uncover major axes of evolutionary change (34). We found that for both 545 phenotypic and genomic outlier datasets, there was one significant axis of parallel evolutionary change, 546 in which seven out of nine lake-stream systems evolved in parallel. Remarkably, these seven lake-stream most to parallel genotypic evolution were related to lake-stream divergence (4 PCs), species differences 558 (3 PCs), and geographic divergence within A. burtoni (1 PCs). Therefore, the genomic outliers mostly 559 encompass candidate loci for lake-stream genomic adaptive divergence. Surprisingly, the analysis of the 560 genomic non-outliers also revealed a significant axis of parallel evolution. However, the genomic traits 561 underlying parallelism were not related to lake-stream divergence, but rather summarized between 562 species (1 PC), geographic (3 PCs) or within-population (3 PCs) divergence. Thus, the biological 563 interpretation of this major axis of parallel evolution is less straightforward. To date, these multivariate 564 analyses have been performed here and on a threespine stickleback dataset of 16 lake-stream population 565 pairs (34), therefore it would be interesting to apply these analyses to a broader range of model systems 566 to uncover additional major axes of parallel evolutionary changes.

567
The distance between ancestral populations influences the level of parallelism 568 It has previously been suggested that the probability of parallelism at the molecular level 569 decreases with time since divergence (50). Furthermore, it has been suggested that the extent of 570 parallelism should be higher when ancestral populations were closely related (17). We thus examined if 571 there was a correlation between the distances between ancestral (i.e. lake) populations and the level of 572 parallelism (i.e. θ). We found that for both phenotypic and genetic outliers, there was a significant 573 positive correlation between these metrics (θP and DM; θG outlier and FST), confirming that the 574 morphological and genetic distances between ancestral populations influence the level of parallelism.

575
Standing genetic variation is an essential component of replicated adaptive evolution (51). Thus, 576 we tested if the amount of standing genetic variation between ancestral populations was correlated to 577 levels of parallelism in the respective lake-stream systems. In line with our predictions, we found that 578 the levels of standing genetic variation and parallelism were negatively correlated. In other words, lake-579 stream population pairs sharing larger amounts of standing genetic variation display more parallelism at 580 the level of both the phenotype and the genotype. Parallelism is, thus, likely constrained by the amount 581 of standing genetic variation upon which natural selection can act, as the effect of de novo beneficial 582 mutations on parallel evolution is much less likely to play an important role in adaptive divergence in 583 recently diverged population pairs. Previous theoretical work has further shown that even small 584 differences in the directionality of selection can greatly reduce genetic parallelism, especially in the case 585 of complex organisms with many traits (52). This suggests that, besides time since divergence, also 586 cryptic habitat heterogeneity (leading to small differences in the directionality of selection) can decrease 587 the likelihood of parallelism. In support of this, a comparison of regional (within Vancouver Island) 588 versus global (North America versus Europe) lake-stream population pairs of sticklebacks showed that 589 parallelism decreases at increased spatial scales (53).

590
Body shape convergence in species inhabiting the same rivers 591 Finally, we investigated levels of convergence or divergence among the nine lake-stream 592 population pairs. While we found divergence at the genomic levels (for both outlier and non-outlier 593 datasets), we found convergence at the morphological level by examining lake and river among-lineage 594 covariance matrices of trait mean values. As body shape contributed significantly to the overall variance, 595 we further focused on this trait for pairwise comparisons of morphological convergence/divergence. 596 Remarkably, we found body shape convergence in both species pairs from the same river systems,  That we did not find convergence at the genomic level despite convergence at the morphological 604 level can arise from several non-mutually exclusive factors. First, the outlier detection method based on 605 lake-stream differences might not be the most appropriate to uncover the genomic basis of morphological 606 evolution. A quantitative traits loci approach might be more appropriate, as has previously been applied 607 to uncover genomic regions underlying body shape differences along a benthic-limnetic axis of 608 differentiation in Midas cichlids (55, 56). Second, while body shape convergence may be due to adaptive 609 genomic divergence, adaptive phenotypic plasticity also plays a role in body shape evolution. Indeed, it 610 has previously been shown that body shape is partially plastic in A. burtoni (24), and that adaptive parallelism, multivariate analyses allowed to uncover major axes of shared evolutionary changes along the lake-stream ecological gradient. To conclude, our study highlights that diversification is a complex 627 product of differentiation trajectories through multivariate space and time.

630
Research objectives 631 In this study we investigated the dynamics of morphological and genomic diversification in 632 Astatotilapia burtoni populations that have diverged along a lake-stream environmental gradient and 633 across geography. We further aimed to investigate the extent and predictability of (non-)parallelism and 634 convergence in nine lake-stream population pairs from four East African cichlid species.   (Tables S1, S2). All fish were sampled with a ~1:1 sex ratio and were adult 642 specimens except for 3 P. philander juveniles from the Mbulu River. Fish were sampled in six different 643 tributaries to LT, whereby each system comprises a riverine population (N = 10-12) and a lake population 644 (N = 12-14) and was named after the river, except for the Lake Chila system that was sampled outside  Table S1). All other populations were sampled in the South of LT. A. 647 burtoni and C. horei were sampled at the Lufubu River; A. burtoni was further sampled in the Lunzua, 648 Chitili and Kalambo rivers ( Fig. 1; Table S1). As two river populations were sampled in the Kalambo 649 River, two lake-stream population comparisons were used for this river, namely Kalambo1 (comparison 650 Kalambo lake versus Kalambo1) and Kalambo2 (comparison Kalambo lake versus Kalambo2). Finally, 651 P. philander were sampled in small Lake Chila and in Mbulu creek ( Fig. 1; Table S1). All fish were 652 caught with fishing rods or minnow traps and anaesthetised using clove oil. Photographs of the leſt lateral 653 side were taken using a Nikon D5000 digital camera, under standardised lighting conditions, and with a 654 ruler for scale. To aid in digital landmark placement, three metal clips were used to spread the fins at the 655 anterior insertions of the dorsal and anal fin, and at the insertion of the pectoral fin (Fig. S2A). To increase 656 the sample size for morphological analyses, additional individuals were sampled and photographed at 657 the same locations and time points as the individuals whose genomes were sequenced (Table S2).

658
Standard length, total length, and weight were measured. A piece of fin clip was preserved in 99% ethanol 659 for DNA extraction. Whole specimens were preserved in 70% ethanol. For the genomic analyses, we planned to sample 12 individuals per population (6 males and 6 662 females) as 24 alleles per population are sufficient to obtain accurate allele frequency estimates. We 663 sampled only 10 Astatotilapia burtoni specimens from the Rusizi River because we did not succeed to 664 catch additional specimens after numerous attempts. 14 A. burtoni specimens from Rusizi lake were 665 sampled to obtain a total of 24 A. burtoni specimens from the Rusizi lake-stream system. For the 666 morphometric analyses, we intended to photograph at least 17 specimens per population, which is equal 667 to the number of landmarks used. Fewer specimens were photographed in three populations (A. burtoni Chitili lake, N=13; A. burtoni Rusizi River, N=10; P. philander Mbulu Creek, N=9) because we did not 669 succeed to catch additional specimens after numerous attempts and because 3 of the individuals caught 670 in Mbulu Creek were juveniles and thus excluded from the morphometric analyses. 672 For the genomic analyses, all individuals were included in the analyses except one hybrid 673 specimen between A. burtoni and another species that was detected only after whole genome sequencing.

674
For the morphometric analyses, juveniles were excluded on the basis that their morphology is different 675 from the adults' morphology.  (Table S1). We performed an indel realignment using RealignerTargetCreator and CatVariants (to merge all VCF files). The VCF file corresponding to the mitochondrial genome 701 (scaffold CM003499.1) was then isolated from the VCF file corresponding to the nuclear genome (that 702 is, all other scaffolds). The latter was annotated with the features ExcessHet (that is, the Phred-scaled p-703 value for an exact test of excess heterozygosity) and FisherStrand (that is, the Strand bias estimated using and 'lake-stream') and the adjusted coefficient of determination R 2 was reported in R.

813
Morphometric analyses 814 Geometric morphometrics was used to compare adult body shape between populations. Three 815 juvenile individuals of P. philander from Mbulu creek whose genomes had been sequenced were 816 excluded from the morphological analyses. In total, the photographs of 468 individuals (Table S2)  the data were corrected for allometric size effects using the residuals of the regression of shape on 823 centroid size for further analyses. Canonical variate analysis (CVA) (69) was used to assess shape 824 variation among A. burtoni populations (Fig. S2B,D) and among all populations of the four species (Fig.   825 S2C,E). The mean shape distances of CVA were obtained using permutation tests (10,000 permutations).

826
Mahalanobis distances (DM) among groups from the CVA were calculated and plotted against genetic 827 distances (FST for within A. burtoni comparisons; dXY for between species comparisons).

828
Vectors of phenotypic and genomic divergence 829 We followed the method first developed by Adams and Collyer (32) and described in detail in of genomic divergence were represented as a matrix CG of 78 t-statistics for each PC × 9 lake-stream 846 pairs. We calculated 9 LP (phenotype), 9 LG (genotype), 9 LG_outlier values, all lake- Multivariate analyses of (non-)parallelism and convergence 852 In addition to comparing all lake-stream divergence vectors in a pairwise manner, we performed

879
Predictability of (non-)parallelism 880 We then investigated the influence of similarity of ancestral (i.e. lake) populations on parallel 881 evolution. We used Mahalanobis distances and FST between lake populations as proxies of morphological 882 and genetic similarities, respectively. We performed Mantel tests between lake-lake Mahalanobis

906
In the second experiment, female spawning decisions were determined in a multi-sensory setting 907 with free contact between females and males. A single tank was subdivided into three equally sized 908 compartments by plastic grids (Fig. S4B). The middle compartment offered a resting and hiding place 909 for the females whereas the two outer compartments served as male territories. The grid size was chosen 910 to allow the smaller females to migrate between the three compartments, and to prevent direct contact 911 between the larger males to exclude male-male competition. We conducted eight trials, each time using 912 two males (one of each population) and six females from both populations (Nmales = 16; Nfemales = 46).

913
Mouthbrooding females were caught and ten larvae from each were collected for paternity analyses based 914 on five microsatellite markers. All adult males and females were also genotyped for these five markers.

915
The percentage of fertilised offspring by con-or heterospecific males in each replicate was then used to     differentiation (see Table 1 for details on the experiments and population used). (C) Isolation-by-distance in the 1162 two comparison categories 'lake-stream' (The colour coding for lake-stream systems is the same as in Fig. 1 increases sharply, likely due to the combined effect of divergent selection and drift at small geographic distance.

1167
In contrast, in the absence of strong divergent selection (that is, in the 'lake-lake' comparisons), morphological 1168 differentiation builds up first, then genomic differentiation accumulates only moderately, which is likely due to 1169 non-adaptive processes such as genetic drift.      "Lake" shape "Stream" shape     Topology weighting analysis (Twisst) reconstructed fixed-length 5 kb windows phylogenies. The species topology 1291 (topology 1) is recovered in all cases and highlights that no introgression was detected between A. burtoni and C.    matrix) and Mahalanobis (lower triangular matrix) distances from the CVA (Fig. S2). Significant body 1298 shape differences (P< 0.05) are highlighted in bold.