Phylogenetic analysis of migration, differentiation, and class switching in B cells

B cells undergo rapid mutation and selection for antibody binding affinity when producing antibodies capable of neutralizing pathogens. This evolutionary process can be intermixed with migration between tissues, differentiation between cellular subsets, and switching between functional isotypes. B cell receptor (BCR) sequence data has the potential to elucidate important information about these processes. However, there is currently no robust, generalizable framework for making such inferences from BCR sequence data. To address this, we develop three parsimony-based summary statistics to characterize migration, differentiation, and isotype switching along B cell phylogenetic trees. We use simulations to demonstrate the effectiveness of this approach. We then use this framework to infer patterns of cellular differentiation and isotype switching from high throughput BCR sequence datasets obtained from patients in a study of HIV infection and a study of food allergy. These methods are implemented in the R package dowser, available at https://dowser.readthedocs.io.

Abstract 22 23 B cells undergo rapid mutation and selection for antibody binding affinity when producing 24 antibodies capable of neutralizing pathogens. This evolutionary process can be intermixed with 25 migration between tissues, differentiation between cellular subsets, and switching between 26 functional isotypes. B cell receptor (BCR) sequence data has the potential to elucidate important 27 information about these processes. However, there is currently no robust, generalizable 28 framework for making such inferences from BCR sequence data. To address this, we develop 29 three parsimony-based summary statistics to characterize migration, differentiation, and isotype 30 switching along B cell phylogenetic trees. We use simulations to demonstrate the effectiveness 31 of this approach. We then use this framework to infer patterns of cellular differentiation and 32 isotype switching from high throughput BCR sequence datasets obtained from patients in a study 33 of HIV infection and a study of food allergy. These methods are implemented in the R package 34 dowser, available at https://bitbucket.org/kleinstein/dowser. proliferation, and selection for antigen binding [1]. This evolutionary process, called affinity 58 maturation, creates many lineages of B cells that each descend from a single naïve progenitor 59 cell. Cells within a clonal lineage differ predominately by point mutations. The genetic variation 60 within these clonal lineages has been long investigated using phylogenetic methods [2]. When 61 obtained through high throughput sequencing, BCR sequences have shown promise in 62 elucidating information about the adaptive immune response in humans, such as the sequence of 63 mutations that occur during antibody co-evolution with HIV [3], and the process of mutation and 64 selection during affinity maturation generally [4]. Other important biological processes may 65 occur as BCR sequences evolve, such as B-cell migration between tissues [5], differentiation into 66 cellular subsets [6], and antibody isotype class switching [7]. If these processes co-occur with 67 SHM, then in principle they can be investigated and inferred using phylogenetic techniques. are not able to be fit within a Markov model framework. The goal of the discrete trait analysis framework presented here is to characterize the distribution 117 of predicted trait values along B cell lineage trees. Given an alignment of sequences inferred to 118 descend from the same naïve ancestor (i.e. the same clonal family), lineage tree topologies and 119 branch lengths were estimated using maximum parsimony using dnapars v3.967 [21]. 120 Importantly, the statistics presented here are not limited to tree topologies inferred through 121 maximum parsimony. 122

123
Maximum parsimony is also used to infer the discrete character states (e.g. cell subtype, isotype, 124 tissue) of internal nodes, given a tree in which each tip is associated with a given character state. 125 Nodes with different states from their immediate ancestors are counted as state changes. More 126 specifically, internal node states were reconstructed using the Sankoff dynamic programming 127 maximum parsimony algorithm [22], which, given a weight matrix for each type of state change, 128 determines the minimum number of state changes that must be made along the tree given the 129 states at the tips. The backtrace step of this algorithm can be used to determine a set of most 130 parsimonious internal node states. Often there are multiple such maximum parsimony sets. To 131 represent state changes across ambiguous internal node sets, trajectories with equal parsimony 132 were randomly chosen in the backtrace step of the Sankoff algorithm, beginning at the root of the 133 tree and moving towards the tips. This process is performed 100 times for each tree, and the 134 mean of each type of state change was reported. 135 136 Strictly bifurcating B cell lineage trees frequently have clusters of nodes separated by zero-137 length branches (soft polytomies), which represent a high degree of uncertainty in tree topology. 138 This uncertainty in the order of bifurcating nodes can result in a potentially large number of 139 uninformative state changes along the polytomy. Multiple steps were taken to minimize the 140 effects of random polytomy resolution (Supplemental File S1). Briefly, nodes within each 141 polytomy were first re-ordered to minimize the number of state changes along the tree. To 142 represent the uncertainty in the order of state changes, nodes within each polytomy were grouped 143 together into separate subtrees according to their predicted state. These state-specific subtrees 144 were then joined together in a balanced manner, ensuring that state changes could occur in any 145 direction among the states contained within the polytomy (Supplemental File S1). The goal of our discrete trait analysis framework is to determine how the distribution of discrete 152 character states along the internal nodes of a tree differs from its expectation if traits are 153 randomly distributed among the tips. The statistics introduced herein are shown graphically in 154 (2) 160 The PS (parsimony score) statistic is the total number of state changes along a tree. The SC 164 (switch count) statistic from i to j is the number of state changes from state i to j. The SP (switch 165 proportion) statistic from i to j is the proportion of state changes from state i to j. 166

167
We calculate the significance of these three statistics using a permutation test. This is done by 168 randomizing traits at the tips of the lineage tree, re-calculating each statistic on the permuted 169 tree, and repeating for a specified number of replicates. For each replicate, we calculate d, which 170 is the difference between the statistic calculated on the observed tree and the same statistic 171 calculated on the permuted tree. If mean d > 0 (hereafter mean d is indicated by d), this indicates 172 the statistic is on average higher in observed trees than in permuted trees. For a one-tailed test, 173 we calculate the p value that d > 0 as the proportion of replicates in which d £ 0. Similarly, we 174 calculate the p value that d < 0 as the proportion of replicates in which d ³ 0. For a two-tailed 175 test, we calculate the p value that d > 0 as the proportion of replicates in which d < 0, plus half of 176 the replicates in which d = 0. We refer to the calculation of these p values for the statistics in Eq.  tends to be more immediately ancestral to state j than expected. This is because trees with 196 randomized tip states often have more state changes events in general, hence d values tend to be 197 negative even when there is no polarity in the ancestor/descendant relationship among type i and 198 j (Fig 1B). 199

200
To normalize for changes in the total number of state changes between observed and permuted 201 trees, we introduce the SP test. A significantly high SP statistic (Eq. 2) from state i to state j 202 indicates a greater proportion of switches from state i to state j than expected from a random 203 distribution of trait values at the tips. In contrast to the SC test, a significant association between 204 the tree and trait may exist, but only associations in which one trait is more often ancestral to the 205 other will give rise to a significantly high or low SP statistic (Fig 1C). Further, the denominator 206 of the SP statistic can be altered to test other hypotheses. For instance, to test whether a greater 207 proportion of state changes to state j come immediately from state i than expected by chance, one 208 can restrict the analysis to consider only state changes towards j. down each tree using a Markov model parametrized by initial frequencies π for each state, 241 relative rate parameters rij for each pair of possible states i and j, and r, the average rate of state 242 changes per mutation per site. The mean value of this rate matrix was calculated as the sum of 243 the diagonal elements weighted by their initial state frequencies. All values of the matrix were 244 then divided by this mean and multiplied by r. This calibration was performed so that r*l state 245 change events are expected to occur across a branch of length l mutations/site. For each tree, the 246 state at the germline node was randomly drawn based on each state's π value. For each node after 247 the germline, the rate matrix is multiplied by the node's ancestral branch length and 248 exponentiated to give the probability of each state at the descendant node, given the state at the 249 node's immediate ancestor. The state at the descendant node is randomly chosen based on these 250 probabilities. This process begins at the germline node and continues down the tree until each tip 251 node has a state. Because each tip corresponds to a sequence, this forms a dataset of sequences 252 paired with simulated discrete characters. Internal node states are not included in the final 253 simulated dataset used for analysis. Simulations were performed with two state models (A and B) 254 that explored a large parameter space (πa = 0.5, 1; rab = 0.1, 1, 10; r = 10, 25, 50, 100, 1000), and 255 four state models (A, B, C, D) that explored more complex patterns of state change at low overall 256 rate (r = 10). Twenty simulation repetitions were performed for each parameter combination. 257 Statistical tests were performed as described in Methods; however, to improve computational 258 efficiency simulation analyses did not use bootstrapped multiple sequence alignments, and 259 instead performed 100 permutations on a fixed maximum parsimony tree for each clone. Only 260 clones with more than one state type were analyzed. We outline three parsimony-based summary statistics to characterize the distribution of trait 287 values along B cell lineage trees (Fig 1). The significance of these statistics can be tested by 288 comparing observed values within the set of trees that comprise a repertoire to those obtained 289 from permuting trait values at the tree's tips. The first statistic, the parsimony score (PS), is the 290 total number of trait value state changes that occurred along a lineage tree. A PS test with d < 0 291 and p < 0.05 (i.e. a significantly low PS statistic) indicates the trait values cluster together in the 292 observed trees more often than expected by chance (Fig 1). We propose two other statistics 293 aimed at determining whether one state is more frequently the immediate ancestor to another 294 state than expected by chance. The switch count (SC) from state i to j is the number of state 295 changes that occurred from i to j [19], while the switch proportion (SP) from state i to j is the 296 proportion of state changes that occurred from i to j. An SC or SP test from i to j with d > 0 and p 297 < 0.05 indicates trait value i was more frequently immediately ancestral to state j than expected 298 by chance. We expect the SP test to be more sensitive to this relationship than the SC test 299 because it accounts for the increased number of state changes expected in randomized trees ( Fig  300   1B-C). Similarly, an SP test from i to j with d < 0 and p < 0.05 indicates trait value i was less 301 ancestral to state j than expected (Fig 1B). All three of these tests may be used to characterize 302 individual lineages or entire B cell repertoires; in this paper we will focus exclusively on 303 repertoires. We used simulations to test the performance of our proposed tests. We model B cell 308 migration/differentiation using a Markov model with two states, A and B, and empirically-309 derived linage tree topologies (Methods). Briefly, the pattern of state changes along a tree was 310 determined by the probability that the state at the root was A (πa = 0.5, 1; πb = 1 -πa), the 311 average rate of state change (r = 10, 25, 50, 100, 1000 changes/mutation/site), and the relative 312 We ran 20 simulation repetitions for each parameter combination, and tested the significance of 319 each of the proposed statistics to assess their statistical power. Our simulations are designed to 320 generate trees whose tip-states are more clustered together than if the tips states are randomly 321 distributed across the tree tips. Consistent with this expectation, 320/320 simulation repetitions in 322 which r < 1000 (i.e. overall rate of state change < 1000 changes/mutation/site) showed a 323 significantly low PS statistic regardless of other parameters (d < 0; one-tailed p < 0.05; 324 changes; namely, in which lineage trees were equally likely to begin at state A as B (πa = 0.5) and 346 had equal rates of state changes between A and B (rab = rba = 1). SP tests from A to B on these 347 datasets resulted in a roughly uniform distribution of p values at all tested migration rates (d > 0, 348 p < 0.05 in 5/100 ; Fig 2A). This indicates that completely unbiased state changing is consistent 349 with the null hypothesis of this test. Simulations in which lineages always had state A at the root 350 (πa = 1) and/or the relative rate of state change was higher from A to B (rab = 10) were expected 351 to give high SP statistics. At low overall rates of state change (r = 10), 55/60 of these simulations 352 had significantly high SP statistics from A to B (d > 0; p < 0.05; Fig 2B-D). At higher rates of 353 state change (r = 25, 50, 100, or 1000), this relationship diminished in these simulations as the 354 distribution of trait values became less distinguishable from random association (Fig 2B-D). 355 These results indicate that, under this two state Markov model framework, a significantly high 356 SP statistic is associated with biased origination, biased rate of state change, or both depending 357 on the overall rate. 358 359 Finally, we used these simulated datasets to test whether the SP test is affected by biased data 360 sampling, as this potential bias is important for some other phylogeographic methods of trait 361 evolution e.g.
[29]. We tested this by randomly discarding half of the sequences associated with 362 A in simulations with totally unbiased state change (πa = 0.5, rab = rba = 1). Though SP tests from 363 A to B on these datasets gave a uniform p value distribution when all sequences were included 364 (Fig 2A), SP statistics became significantly high when half of A sequences were discarded (Fig  365   2e). This indicates that severely biased sampling may give a similar signature as biased 366 origination or state change for the SP test (Figs 2B-D). Biased sampling may be caused by a 367 variety of experimental factors, and applications of these statistics to empirical datasets will need 368 to carefully consider possible effects of biased data collection for each trait type. All the tests detailed above are extendable to data with more than two states; however, due to its 373 superior performance in two state simulations, we will focus in the rest of this study on the SP 374 test. The permutation step of the SP test usually permutes trait values within each tree separately 375 (Methods). However, when more than two states are present it may be advantageous to 376 randomize trait value assignments among trees rather than just within each tree. This changes the 377 null hypothesis, which is now that the proportion of state changes observed is the same as that 378 expected if trait values are randomly distributed among all trees. Deviations from this null 379 hypothesis may be due not only to biased ancestor/descendant relationships within individual 380 trees, but also co-occurrence of trait values within different trees. To demonstrate the difference 381 between these two mechanisms, we performed simulations with four trait values: A, B, C, and D. 382 To test the difference between simple association and biased ancestry, these simulations used 383 unbiased state change between A and B, and unidirectional state change from C to D. Trees 384 began with states A, B, or C in equal probability; state changes were allowed in both directions 385 from A to B and unidirectionally from C to D. For each repetition, the rate of state change (r) = 386 10, and relative rates were equal among allowed state changes. Performing the SP test on these 387 simulations while permuting among trees showed significantly higher SP statistics in both 388 directions between A and B, and between C and D than expected (20/20 for each; Fig 3A). This 389 indicates the SP test when permuting among trees detected the association between these trait 390 values but not the directionality of C to D state changes. In contrast, the SP test when permuting 391 only within trees correctly yielded a significantly high SP statistic from C to D in 19/20 392 simulations; further, no simulation yielded a significantly high SP statistic from D to C, 393 indicating a low false positive rate. No simulation using either permutation method showed a 394 significantly high SP statistic between unassociated trait values (e.g. A and C), indicating a low 395 false positive rate. These results indicate that permuting trait values within trees is a more 396 effective means of detecting biased ancestor/descendant relationships, while permuting between 397 trees is more appropriate for detecting associations among traits (Fig 3B) power. While we previously showed that permuting among trees confuses biased association 418 with biased ancestry (Fig 3A), switching between these states can only occur in one direction. 419 Because of this, association between two states implies a direction of switching and among tree 420 permutation is justifiable. We first simulate direct switching in which trees always had state A at 421 the root and only state changes from A to the other states were allowed (Fig 3C). We expected 422 these simulations to show a significantly high SP statistic only from A to D. Confirming this 423 expectation, all 20 of these simulations had a high SP statistic from A to D (d > 0, p < 0.05; Fig  424   3C). We next simulated sequential switching, where arriving at state D requires transitioning 425 through B. All trees began in A and state changes were allowed from A to B and C, but D arose 426 only from B. We expected these simulations to show a significantly high SP statistic only from B 427 to D. All 20 of these simulations showed a significantly high SP statistic from B to D (d > 0, p < 428 0.05; Fig 3D). These results demonstrate that the SP test using constrained parsimony can 429 discriminate between simple hypotheses of isotype switch patterns, such as direct versus 430 sequential switching. 431 432 We next investigated whether the SP test can distinguish between more complex types of 433 constrained switching. We simulated irreversible isotype switching in which trees begin with 434 state A, and only state changes moving alphabetically (A to D) were allowed. Naively, we may 435 expect these simulations should show similar SP test results from A, B, and C to D. However, all 436 20 of these simulation repetitions showed a significantly high SP statistic to D from B and C, but 437 not from A (Fig 3E). As a control, we simulated unconstrained switching in which trees begin 438 randomly at any state and may change between all states. Using a constrained parsimony model, 439 these simulations showed the same significantly high SP to D from B and C, but not from A (Fig  440   3F), indicating that this pattern is possibly an artifact of the constrained parsimony model. These performed here demonstrated the SC test is highly conservative, and that permuting among trees 456 may only detect unstructured association among trait values (Fig 3) Fig 4B-D). These analyses confirm the conclusions in [6] that CD19 hi MBCs are 466 significantly closer, cladistically, to the predicted germline sequence than GCBC sequences. 467 Naively, one may interpret this as evidence that GCBC cells derive from CD19 hi MBCs. 468 However, because GCBCs are expected to have far higher mutation rates than MBCs, the 469 observed patterns are also consistent with early production of CD19 hi MBC from GCBCs, 470 followed by a near cessation of mutations in CD19 hi MBCs. This is consistent with the 471 conclusions of [6] that CD19 hi MBCs represent earlier stages in the GC reaction, rather than the 472 direction of differentiation. Overall, these SP tests confirm that the previously observed 473 relationships from CD19 hi MBCs and CD19 lo MBCs to GCBCs are driven by biased 474 ancestor/descendant relationships within trees rather than simply association in the same trees, as 475 may have been the case from the previously used SC tests with among tree permutations [6]. between IgE and IgA1 than between IgE and other isotypes in these subjects. A phylogenetic test 487 of this relationship would confirm that IgE and IgA1 sequences show a direct ancestor-488 descendant relationship within these B cell trees rather than just being part of the same clone. 489

490
We applied our discrete trait framework to determine the origins of IgE in a single subject (id = 491 2442) from [28]. This subject was selected due to reported history eczema, food allergy, and B 492 cell clones containing IgE and other isotypes [28]. Using an SP test in which only state changes 493 leading to IgE were considered and trait values were permuted among trees, we found a 494 significantly high SP statistic from IgA1 to IgE (Fig 5B). No other isotype showed a 495 significantly high SP statistic to IgE. These results favor IgE arising from sequential switching 496 through IgA1 over direct switching from IgM in this subject. Performing a similar test using only 497 state changes leading to IgG4 revealed a significantly high SP statistic from IgG1 and IgG2 to 498 IgG4 (Fig 5E). This pattern is similar to irreversible switching within the IgG family (Fig 3E). 499 As shown in simulation analyses, this test is not suited to infer relative rates of switching from 500 different isotypes if all kinds of switches are considered. However, these results are most 501 consistent with origin of IgG4 through sequential switching with other IgG isotypes rather than 502 direct switching from IgM or sequential switching from IgA1. Overall, these results are Simulations demonstrate that the SP test was uniquely able to determine the direction of biased 525 origination and state change among the approaches investigated. In simple simulations 526 containing two states (A and B) a significantly high SP statistic from A to B was associated with 527 origination in A and biased state change from A. This signal decreased as the overall rate of 528 switching increased. In more complex scenarios, the SP test was able to differentiate between 529 traits that were generated through biased state change in a particular direction versus traits that 530 were simply associated with each other. The SP test was also able to distinguish between simple 531 modes of constrained evolution such as direct and sequential switching. These results indicate 532 the SP test may have broad utility in characterizing ancestor/descendant relationships among B 533 cell discrete traits. 534

535
We next used two datasets to demonstrate that the SP test could be used to derive meaningful 536 biological conclusions. In the first, we confirm that T-Bet+ memory B cells tend to be the 537 predicted immediate ancestors of GC B cells within lineages trees obtained from three HIV+ 538 subjects. Though this relationship may primarily be due to differences in mutation rate over time 539 between memory and GC B cell subsets, this does confirm prior findings and demonstrates that 540 the T-Bet+ memory B cell subset represents an earlier state in the affinity maturation process, 541 possibly contributing to an impaired immune response to HIV [6]. We next characterized the 542 isotype switching patterns of sequences obtained from a human over the first three years of life 543 [28]. In this analysis, we found evidence of sequential switching from IgA1 to IgE, as well as 544 evidence of sequential switching from IgG subtypes to IgG4. Sequential switching from IgA1 to 545 IgE is consistent with [28] but not other analyses performed on data taken from adults, which 546 favor sequential switching from IgG [30]. This possibly reflects differences in isotype switching 547 patterns between adults and children. Overall these results demonstrate that the discrete trait 548 analysis framework developed here can be used to test important hypotheses about B cell 549 differentiation and class switching. 550

551
There are a number of limitations with these methods. First, tree topologies were estimated using 552 maximum parsimony. While maximum parsimony is not a statistically consistent estimator of 553 tree topology and is known to give inaccurate predictions over long branch lengths [32], it has 554 been shown to be an accurate estimator of tree topology in certain B cell applications [33], and is 555 widely used in B cell phylogenetic analysis [5,17]. In any case, the statistics presented here are 556 not limited to tree topologies inferred through maximum parsimony. The three statistics 557 proposed here (Eq. 1-3) are also based on maximum parsimony, and may have similar 558 inaccuracies over long branch lengths. Further, the statistical tests assume that the process of 559 state change is independent of the tree shape, when the two may be coupled e.g. [34]. This 560 assumption of independence is commonly made in discrete trait analysis e.g.
[8] to enable 561 computational tractability, and because the actual link between tree shape and state change is 562 unknown or cannot be modelled. A significantly high SP statistic could be potentially caused by 563 factors other than biased state change. For instance, because tree branch lengths represent genetic 564 distance rather than time, it is possible that cell types with low mutation rates over time will 565 spuriously appear ancestral to those with high mutation rates. This effect likely underlies our 566 analysis of B cell subtypes in HIV. Finally, it is possible that SHM is actually occurring at 567 another, un-sampled site which is seeding the sites that were sampled. Overall, it is important to 568 carefully consider alternative explanations when trying to determine the biological basis for a 569 significantly high SP statistic. 570 571 An important limitation of the SP test is that it, like many other phylogeographic approaches e.g. 572 [29] is affected by biased data sampling. This may arise due to experimental factors that are 573 difficult to control. For instance, under-sampling a trait value may cause a spurious, significantly 574 high SP from that trait value. Previous analyses of viral migration have dealt with potential 575 sampling bias by performing tests across multiple down-sampling repetitions [9]. In practice, it 576 can be difficult to know if B cells with certain trait values have been sampled proportionally to 577 their relative population sizes. However, if a type of B cell is known to be under-sampled in a 578 particular experiment, and is predicted to be the descendant of another B cell type, it can be 579 argued that this relationship is unlikely to be due to biased sampling (Fig 2E). Alternatively, if 580 multiple samples are tested it is possible that these samples will have a wide range of sequence 581 proportions belonging to different traits. If these differences in sequence proportions are 582 uncorrelated with SP test results, it could be argued that observed results are unlikely to be due to 583 consistent under-sampling of B cells with a particular trait value. 584

585
Our simulation analyses revealed that that the SP test is difficult to interpret when considering 586 complex constrained models such as irreversible isotype switching (Fig 3C-E). To recreate 587 isotype switching, we performed four state simulations in which only state changes proceeding in 588 the direction of state A, B, C, and D were allowed. Unexpectedly, these simulations tended to 589 show a significantly low SP statistic from A to D, but a significantly high SP statistic for B and C 590 to D (Fig 3E). This biased trend is likely driven by the fact, due to constraints in the direction of 591 state change, randomized trees tend to have more switches from A than expected based on the 592 relative frequency of A. This produces a significantly high SP for switches from A to D. An 593 alternative may be to use the SP statistic (Eq. 3) without comparing to a null distribution, which 594 is equivalent to comparing the relative frequency of each type of switch observed. However, the 595 observed switch frequency (SP statistic) is not proportional to the true relative rate of state 596 change in general. For instance, in the two state Markov model simulations presented here (Fig  597   2), the SP statistic alone is both positively and negatively related to the true relative rate of state 598 change, depending on other parameters (Supplemental File S5). Comparing SP statistics to 599 those obtained from randomized trees (i.e. the SP test) usefully corrects this relationship in 600 unconstrained models (Fig 2), but not always in constrained ones (Fig 3C-E). Ultimately, 601 isotype switching is a complex, constrained process, and our analyses suggest the relative rates 602 of isotype switching inferred from B cell trees should be interpreted cautiously. We suspect a 603 general method for accurately estimating these rates will require a model-based approach, such 604 as a non-reversible Markov model. 605 606 Future methods to differentiate migration, differentiation, and isotype switching patterns in B 607 cells might improve upon the approach developed here by explicitly modeling these processes 608 along a phylogeny, incorporating branch length information, and better accounting for 609 uncertainty in tree topology. The heuristic approach introduced here crucially does not use 610 branch lengths to help predict internal node states of the tree. Ignoring this source of information 611 likely lowers power, but is possibly advantageous because the relationship between mutation rate 612 and time is not currently well understood, and likely varies by cell type. While the approach 613 developed here uses phylogenetic bootstrap replicates to account for uncertainty in tree topology 614 [24], this may also be done using a posterior distribution of topologies generated by MCMC 615 sampling. This was recently done for naïve sequence inference in individual B cell lineages [35]. 616 Phylogenetic bootstrapping has less desirable statistical properties than posterior distributions, 617 but is a widely used means of assessing reproducibility of tree topology and is more 618 computationally tractable for large datasets. Overall, though there is potential for improvement, 619 the approach introduced here effectively deals with important challenges such as incorporating 620 information across trees, accounting for uncertainty in tree topology, and scaling efficiently 621 when analyzing large datasets. 622 trait values clustered in tree: Association between trait and tree structure revealed by significantly 652 low PS statistic. This tree also has a significantly low switch count statistic from A to B (SCab). 653 Further, this tree has an identical switch proportion statistic (1/2) from A to B as from B to A, which 654 is not significantly different from permuted data (SPab). (c) Biased ancestor/descendant 655 relationships among trait values: As in b, this tree also shows a significant relationship between 656 tree and trait distribution (PS). However, this tree also has a significantly high SP statistic from A 657 to B (SPab). Only the SP test captures the directionality of this relationship. with arrows. All allowed state changes occurred at the same relative rate and the total rate of state 679 changing (r) was 10 changes/mutation/site (see Fig 2). Unconstrained switching simulations also result in low p values from B and C to D, but not from 687 A. The strange results of e and f are likely artefacts of the constrained parsimony algorithm, which 688 forbids reverse alphabetical state changes (e.g. D to C), used to count state changes in simulations 689 c-d. 690 691