TopoRoot: A method for computing hierarchy and fine-grained traits of maize roots from X-ray CT images

Background 3D imaging, such as X-ray CT and MRI, has been widely deployed to study plant root structures. Many computational tools exist to extract coarse-grained features from 3D root images, such as total volume, root number and total root length. However, methods that can accurately and efficiently compute fine-grained root traits, such as root number and geometry at each hierarchy level, are still lacking. These traits would allow biologists to gain deeper insights into the root system architecture (RSA). Results We present TopoRoot, a high-throughput computational method that computes fine-grained architectural traits from 3D X-ray CT images of field-excavated maize root crowns. These traits include the number, length, thickness, angle, tortuosity, and number of children for the roots at each level of the hierarchy. TopoRoot combines state-of-the-art algorithms in computer graphics, such as topological simplification and geometric skeletonization, with customized heuristics for robustly obtaining the branching structure and hierarchical information. TopoRoot is validated on both real and simulated root images, and in both cases it was shown to improve the accuracy of traits over existing methods. We also demonstrate TopoRoot in differentiating a maize root mutant from its wild type segregant using fine-grained traits. TopoRoot runs within a few minutes on a desktop workstation for volumes at the resolution range of 400^3, without need for human intervention. Conclusions TopoRoot improves the state-of-the-art methods in obtaining more accurate and comprehensive fine-grained traits of maize roots from 3D CT images. The automation and efficiency makes TopoRoot suitable for batch processing on a large number of root images. Our method is thus useful for phenomic studies aimed at finding the genetic basis behind root system architecture and the subsequent development of more productive crops.

of the root). Second, the time series must be dense enough to observe the correct root hierarchy. 101 Neither assumption may hold for other imaging setups, such as X-ray CT or MRI scans of 102 complex roots, which may lead to numerous topological errors after image segmentation and 103 have few, and often a single, time point available. 104

105
In this work, we present TopoRoot, a method for obtaining the complete root hierarchy and 106 computing the associated traits from a single 3D X-ray CT scan of excavated maize root crowns. 107 TopoRoot adopts a multi-step approach to treat the topological errors in the input image, and it 108 computes the hierarchy and traits using a high-fidelity skeleton representation of the root system 109 architecture. Our method builds on state-of-the-art algorithms from computer graphics and 110 introduces customized heuristics tailored to the image features and the maize root structure. Shovelomics protocol [34] in September 2020 and washed to remove soil. We used an X5000 X-141 ray imaging system and efX-DR software (NSI, Rogers, MN) to collect X-ray computed 142 tomography (XRT) data (pictured in Figure 1). The X-ray source was set to a voltage of 70kV, 143 current of 1700µA, and focal spot length of 119μm. Samples were clamped and placed on a 144 turntable for imaging at a magnification of 1.17X and 10 frames per second (fps), collecting 145 1800 16-bit digital radiographs over a 3 minute scan time. efX-CT software was used to 146 reconstruct the scan into a 3D volume at 109μm voxel resolution. This volume was exported as a 147 16-bit RAW volume. Following imaging, manual counts were collected for the nodal roots. Each 148 sample was dissected starting at the highest node (stalk end) moving downward to the root tips. 149 Only attached roots were counted towards the total number of developed roots at each node. 150 Finally, 14 scans were removed from the analysis due to excessive soil present in the imaging 151 and missing whorls. The remaining 45 scans and their manual nodal root counts provide 152 validation for TopoRoot's computed values. 153 It is generally difficult to obtain manual measurement of fine-grained root traits beyond counting 154 the nodal roots. To validate other fine-grained traits produced by our method, we supplement the 155 real data set with a systematically created set of simulated root images. We adopt OpenSimRoot 156 [24], a highly customizable 3D root growth simulation software that has been widely used in 157 modeling and visualizing root growth [11,31,7]. We used OpenSimRoot to create 55 maize 158 roots ranging in days of growth from 30 to 40 days, numbers of nodal roots ranging from 31 to 159 69, number of whorls from 5 to 6, and lateral root branching frequency from 0.3 to 0.7 cm / 160 branch. The diameter of the stem was set to be 2 cm, starting diameter for nodal roots is 0.3 cm 161 (gradually decreasing to 0.1 cm after 10 days of growth), lateral roots is 0.04 cm, and fine lateral 162 roots is 0.02 cm. OpenSimRoot provides a detailed hierarchy for each of the simulated roots, 163 from which we obtain the ground-truth traits. 164 The example is at 40 days of growth. The closeups show fine lateral roots. With increased noise, 168 the roots exhibit less regular geometry and more topological errors (e.g., disconnections and 169 loops). 170 For each simulated root, we create a 512^3 image by computing the signed distance field from 171 the surface of the root using the method of [35] with the inside of the surface having positive 172 values and the outside having negative values. To simulate various levels of image noise, we 173 randomly perturb the distance value at each voxel, with the amount of perturbation ranging from 174 0 to 0.08 cm in 0.01 increments. This results in 9 images at increasing noise levels for each of the 175 55 roots, and thus 495 images in total. Figure 2 shows images of one simulated root (at day 40) 176 at different noise levels. Note that the amount of geometric irregularity and topological noise 177 (e.g., disconnected components and loops) increase with the noise level. 178 Thresholding 179 Although the X-ray images have good contrast between roots and air, due to the spatial variation 180 in the contrast, limited resolution and noise, there is typically no common threshold that can 181 accurately capture both thicker (e.g., nodal) and thinner (e.g., lateral) roots. To obtain the best 182 result, our method therefore asks the user to provide three thresholds, visualized in Figure 3. The 183 shape threshold provides the best balance between capturing thick and thin roots, but it may 184 contain many topological errors (e.g., disconnected thin roots and merged thick roots). We 185 additionally ask for a kernel threshold (the lowest value that avoids merging of thick roots) and a 186 neighborhood threshold (the highest value that avoids disconnecting thin roots), such that the 187 increasing ordering of the three thresholds are neighborhood, shape, and kernel. Our method will 188 attempt to fix topological errors at the shape threshold guided by the neighborhood and kernel 189 thresholds. For this data set, we set the shape threshold to be 0, kernel threshold to be 0.03 and 190 neighborhood threshold to be -0.15. 191 Iso-surfaces n, s, and k, as visualized for an X-ray image of a maize root. The red closeup shows 194 a fine lateral root which is connected in n, but disconnected in s and which has disappeared in k.

195
The light blue closeup shows two nodal roots which are merged together in n and s, but separated 196 in k. 197 Overview 198 Our pipeline takes as input a 3D grayscale image with three thresholds (shape, kernel and 199 neighborhood) and produces a hierarchy and associated traits in four steps ( Figure 4) The pipeline of TopoRoot for computing fine-grained traits from a 3D X-ray image. Beginning 215 from a 3D grayscale image (A), TopoRoot first simplifies the topological complexity of the iso-216 surface (B), then it creates a geometric skeleton capturing the branching structure (C), from 217 which a hierarchy is obtained (D) and the traits are subsequently computed. 218

Topological simplification 219
As shown in Figures 1 and 2, iso-surfaces of input images often contain numerous topological 220 errors, including disconnections, loops, and voids. These errors make it challenging to infer the 221 branching structure of the root system. We use a recent algorithm [38] to maximally remove 222 topological errors. Given the iso-surfaces at three thresholds (shape, kernel and neighborhood), 223 the algorithm attempts to remove all topological features on the iso-surface at the shape threshold 224 that are not present on either of the other two iso-surfaces. The algorithm allows both addition 225 and removal of contents to the shape's iso-surface, and it tries to minimize these geometric 226 changes to achieve topological simplification. This algorithm can effectively connect broken 227 branches (if they are contiguous at the neighborhood threshold) and split merged branches (if 228 they are separate at the kernel threshold), as shown in Figure 5A and 5B. 229 A common issue in XRT scans of dried roots is that the interior space of the stem and some thick 230 root branches often appears to be hollow. Since our subsequent analysis of the hierarchy relies on 231 a skeleton representation of the root architecture, we need to fill in such hollow space so that the 232 resulting skeleton captures the solid cylindrical shape of the roots (instead of only the walls). 233 However, identifying such interior space is not straight-forward. In the ideal scenario, at the 234 shape threshold, the wall of the stem or root branch completely separates the interior space from 235 the outside space, making the interior space a topological void and hence easy to detect. But 236 oftentimes the iso-surface at the shape threshold exhibits "tunnels" on the wall of the stem or 237 branch (see Figure 5A, top-left purple box), which connect the interior space with the outside 238 and making it impossible to identify as a topological feature. Applying the algorithm of [38] does 239 not fill in the hollow space. On the contrary, it may merge nearby small tunnels into larger 240 tunnels to reduce the number of topological loops (see Figure 5B, top-left). 241 We develop a simple heuristic to identify and fill the hollow interior spaces. Our assumption, 251 based on observation of our data, is that such spaces usually become topological voids at the 252 lower, neighborhood threshold (see Figure 6A). These voids, however, are usually smaller than 253 the hollow space at the shape threshold, and hence need to be expanded before filling in. To do 254 so, we first erode the set of all voxels above the neighborhood threshold, noted as ! , onto the set 255 of voxels above the shape threshold, noted as " , while preserving the topology of ! . The 256 erosion is prioritized by the intensity, such that voxels having lower intensity are eroded earlier. 257 The eroded voxel set, denoted as ! ', consists of " and a minimal set of voxels needed to achieve 258 the topology of ! , denoted as # (colored red in Figure 6B). We then fill each void of ! ', as 259 well as those voxels in # adjacent to these voids (see Figure 6C). Since filling may introduce 260 additional topological errors, such as loops or new voids, we call the heuristic prior to applying 261 the algorithm of [38]. This produces a topologically simple root shape with filled stems and 262 branches (see Figure 6C). 263 showing voxels at or above the shape threshold (white, denoted by " ) and voxels at or above the 267 neighborhood threshold (gray, denoted by ! ). Note that the hollow interior of the stem is 268 connected to the outside space in " , but it forms a void disconnected from the outside in ! . (B) 269 The result of a topology-preserving erosion of ! onto " , denoted by ! $ , which consists of " 270 (white) and additional voxels (red) needed to retain the topology of ! . (C) The voids of ! $ , as 271 well as any voxels that are in ! $ but not in " (denoted # ) are filled in. 272 Not all topological errors will be removed at the end of this step. In particular, the algorithm of 273 [38] will retain those topological features (e.g., loops) that exist on all three thresholds (shape, 274 kernel, and neighborhood). Removing these larger-scale errors often requires more careful 275 analysis of the root architecture, for example, to prevent breaking a root in the middle. We shall 276 address these remaining issues in the next step of our pipeline and with the help of a geometric 277 skeleton. 278

Skeletonization 279
To obtain the root hierarchy and associated traits, our method relies on a representation of the 280 root system as a curve skeleton -a connected set of curves lying in the center of the root 281 branches and capturing the branching structure. While there are many algorithms for computing 282 skeletons of 3D shapes, we adopt the methods of [36, 37] because of their scalability to high-283 resolution iso-surfaces, robustness to noise, and the final representation as an embedded graph 284 (with vertices and edges) which is more convenient for connectivity analysis than voxel-based 285 skeletons. These methods have been previously adopted in skeleton-based phenotyping of 286 sorghum panicles [9]. Specifically, given the topologically simplified iso-surface produced by 287 the previous step, [37] computes a 2-dimensional centered structure known as the medial axis, 288 and [36] further reduces the medial axis to 1c-dimensional curves. The resulting skeleton is also 289 associated with shape attributes, including the thickness and the width of the tubular cross-290 sections, which are useful in our subsequent analysis. An example is shown in Figure 4C (also in 291 Figure 7A). 292 skeleton junction (in the boxed region in (A)) caused by merging between two roots in the iso-297 surface, which leads to a cycle in the skeleton. (C) The cycle is removed by detaching a skeleton 298 segment from the junction. 299 As mentioned earlier, some topological errors remain after topological simplification. These 300 errors manifest as disconnected components and cycles (a path of edges which begin and end at 301 the same vertex) on the curve skeleton. To reduce the number of components to one, we simply 302 take the largest connected component on the skeleton. To remove cycles in that component, we 303 observe that cycles are usually caused by merging of distinct roots, which take place at junctions 304 (vertices with degree three or higher) on the skeletons ( Figure 7B). Our goal is to identify 305 junctions that correspond to merging between roots, as opposed to natural branching of roots, 306 and resolve the cycles by detaching skeleton segments from such junctions ( Figure 7C). This is 307 achieved by computing the minimum spanning tree (MST) on a weighted graph, as described 308 below. 309 As we are primarily concerned with the junctions of the skeleton, we construct an abstract graph 310 where each node represents either a junction vertex or a continuous skeleton segment between 311 junctions ( Figure 8A, 8B). A graph edge connects a node representing a skeleton junction with 312 another node representing a skeleton segment incident to the junction. Removing a graph edge 313 corresponds to detaching a skeleton segment from a junction ( Figure 8C, 8D). 314 The MST is the subset of edges such that all graph nodes remain connected and the sum of edge 325 weights is minimal. An MST is always free of cycles. We define the edge weights such that a 326 lower weight implies a higher likelihood that a skeleton segment should be attached to a 327 junction. Our weight definition is motivated by the observation that a merging between two 328 distinct roots often results in a "T" junction on the skeleton where, among all skeleton segments 329 incident to the junction, one segment (which should be detached) is close to being orthogonal to 330 the other segments (see Figure 8B). In contrast, a typical branching of the root usually results in 331 a "Y" junction on the skeleton where each skeleton segment at the junction forms an obtuse 332 angle with at least one other segment. However, branching close to the stem of the root could be 333 more like "T" than "Y", due to the large difference in thickness between the main and offspring 334 roots. Our weight encourages "Y" junctions while preventing detachment close to the stem. 335 Specifically, let be a graph edge, & the node of the edge that represents a skeleton junction, 336 & the other node of which represents a skeleton segment, and ( ) the set of edges incident to 337 node . We denote by the unit vector of the tangent direction of the skeleton segment 338 represented by node & towards the junction node & . We first define the angle deviation term 339 This term reaches the minimal value of 0 when there is some edge $ incident to & that has the 342 same direction as , and it attains the maximal value of 2 when all other edges incident to & 343 have the opposite direction as . Next, we define 12" ( ) as the shortest distance along the 344 skeleton between the skeleton junction represented by & and any skeleton vertex in the stem. 345 Here, we extract the stem from the skeleton using the heuristic reported in [9]. The heuristic 346 starts from a subset of the skeleton whose thickness measure is above a threshold (since the stem 347 is usually the thickest portion of the maize root system) and extracts the longest simple path from 348 the subset. Finally, the edge weight is defined as: 349 Here is a balancing parameter, which is set to 0.05 in our experiments. 351 Inferring hierarchy 352 assigns hierarchy levels to each skeleton segment based on the parent-child association. 359 A key feature of TopoRoot is computing the hierarchy of the maize root system consisting of the 360 stem, nodal roots, and lateral roots at different levels. Given the cycle-free skeleton computed 361 from the previous steps, we next label each vertex of the skeleton as either part of the stem path, 362 the stem region, a nodal root, or a lateral root at a specific level. As described in the previous 363 step, the stem path is identified using the thickness-based heuristic of [9]. The stem region is then 364 defined as the set of skeleton edges within a cylindrical region around the skeleton path, where 365 the radius of the cylinder varies along the stem path and is set to be 1.6 times the thickness 366 measure stored on the path vertices. In Figure 4D, the stem path and stem region are colored dark 367 blue. 368 To label the remaining skeleton edges by the root hierarchy (e.g., nodal roots, 1st-order lateral 369 roots, 2nd-order lateral roots, etc.), we make the following two assumptions. First, as assumed in 370 previous works including DynamicRoots, roots at lower levels of the hierarchy are generally 371 longer. For example, nodal roots are generally longer than lateral roots, and 1st-order lateral 372 roots are generally longer than 2nd-order lateral roots, and so on. Second, the maximum number 373 of hierarchy levels in a root system is generally kept low. With these assumptions, we developed 374 a labelling heuristic that minimizes the maximal level of hierarchy in the root system while 375 encouraging longer roots to be at lower levels. 376 Our heuristic proceeds in two stages, a bottom-up traversal of the skeleton and then a top-down 377 traversal. They are illustrated in Figure 9 using a cartoon example. We start with a skeleton 378 labelled only by the stem region ( Figure 9A). Since the skeleton has no cycles, it is a "tree". We 379 consider the stem region as the "root" of this tree, and this induces a partial ordering of the 380 skeleton segments such that each skeleton junction (outside the stem region) is incident to 381 exactly one parent segment and zero or more children segments. The first stage of the heuristic 382 computes, at each skeleton junction, the association between the parent skeleton segment with 383 one of the children segments as the continuation of the same root (see arrows in Figure 9B). The 384 association is computed by visiting the skeleton segments from the leaves of the skeleton 385 towards the stem and updating a depth d(b) (numbers in Figure 9B) and a distance l(b) at each 386 visited segment , as follows. First, for each segment incident to a leaf vertex of the skeleton, 387 we set d(b)=0 and l(b) as the length of the segment . We then iteratively visit parent segments 388 whose children segments have already been visited. For a parent segment b whose children are 389 b1, …., bn, we associate with the child 2 that has the maximal depth ( 2 ). If multiple children 390 have the same maximal depth, is associated with the 2 with maximal length ( 2 ). We then set 391 d(b)=d(bi)+1 and l(b) to be l(bi) plus the length of segment . In the second stage, we visit all 392 segments from the stem to the leaves and assign the hierarchy levels. We assign each segment 393 attached to the stem region a hierarchy level of 1 (i.e., nodal roots). For each parent segment 394 assigned with level , we assign level to the child segment associated with the parent 395 (computed from the first stage) and level + 1 to all other children segments. The resulting 396 hierarchy labelling is shown in Figure 9C both as numbers and the heat color (warmer colors 397 have higher levels). 398

Computing traits 399
Given the skeleton and the hierarchy labelling, TopoRoot computes a suite of coarse-grained and 400 fine-grained traits of the root system. Like existing works (e.g. DynamicRoots tends to produce significant mis-classifications of root branches in our dataset. 430 For example, the level-0 roots obtained by DynamicRoots include not only the stem but quite a 431 few nodal roots (black boxes). These errors propagate to higher-level roots, which fork from 432 lower-level roots. In addition, since the thresholded root image at the shape threshold contains 433 many disconnected components, and because DynamicRoots only considered one connected 434 component, a significant portion of the root is missing in the analysis of DynamicRoots (red 435 box). Although performing the topological simplification step in TopoRoot allows 436 DynamicRoots+ to recover a more complete root system, mis-labelling of hierarchy levels 437 remains (black boxes). In contrast, TopoRoot produces a more visually plausible hierarchy 438 separating the stem (region), nodal roots, and lateral roots. 439 TopoRoot exhibits a much lower relative error (mean=8.3%, = 5.6%) and higher correlation 456 (Pearson's coefficient=0.951) than either DynamicRoots (159.5% mean error with =190.0%, 457 Pearson's coefficient=-0.0452) or DynamicRoots+ (235.4% mean error with = 244.4%, 458 Pearson's coefficient=-0.160). The significant over-counting of DynamicRoots+ is mostly 459 caused by the mislabeling of nodal roots as level-0 roots, as explained above, which leads to 460 many lateral roots being labelled as level-1 roots. The over-counting also increases with the size 461 of the root system. Furthermore, both the nodal root counts computed by TopoRoot and the hand 462 measurements exhibited a significant difference between the mutant and wild type samples, as 463 measured by the independent two-sided Wilcoxon rank sum test (p=0.000130 for TopoRoot, 464 p=0.00349 for hand measurements). Neither DynamicRoots (p=0.126) nor DynamicRoots+ 465 (p=0.0199) showed a significant difference between the mutant and wild-type, and both had 466 negative Pearson coefficients. This shows that TopoRoot can perform better for differentiating 467 the root system architecture between these two varieties than can DynamicRoots. 468 shows distribution bars for one trait over all 27 wild type samples and 18 mutants. Traits with 500 p<0.01, as measured by the independent two-sided Wilcoxon rank sum test, are highlighted in 501 red boxes. 502

Simulated Roots 503
We start with a visual comparison of the results of TopoRoot, DynamicRoots+ and GiaRoots+ 504 on one of the simulated root systems. This root system is simulated to be 34 days old, with five 505 whorls, 34 nodal roots, and a lateral root branching frequency between 0.3 -0.7 cm / branch. 506 Figure 13 visualizes the root hierarchies produced by TopoRoots and DynamicRoots+ as well as 507 the voxelized skeletons produced by GiaRoots+ at three different noise levels (0, 0.04cm, 508 0.08cm). As the noise level increases, the thresholded image at the shape threshold becomes 509 more disconnected and contaminated by topological errors (see Figure 13). Accordingly, 510 DynamicRoots and GiaRoots miss more root parts, whereas TopoRoot as well as the extended 511 tools, DynamicRoots+ and GiaRoots+, retain much of the root shape. Observe that, similar to the 512 real roots dataset, the hierarchies produced by DynamicRoots+ incorrectly label many nodal 513 roots as level 0 (black boxes). In contrast, the hierarchies produced by TopoRoot are more 514 visually plausible.  Figure S4) as the noise level increases. Roots less than one 538 voxel long in the ground truth model were ignored in our analysis. We compare and analyze the 539 accuracy of TopoRoot across each category of traits below. In general, we observe that higher 540 image noise leads to larger mean errors and/or greater variance by TopoRoot. For most of the 541 traits, TopoRoot maintains a lower error than other tools across all noise levels. 542 As the base of the hierarchy, the stem traits are among the most accurate. As the noise increases, 543 portions of the stem region are lost, resulting in a thinner stem ( Figure S1). Increased noise also 544 causes the stem to wiggle more in the direction perpendicular to its main path, resulting in an 545 increased stem length. 546 The nodal root traits (Table 2) are the most accurate for the root count (8.3% up to e=0.04, 547 10.3% up to e=0.08) and emergence and midpoint angles (5.7/7.1% up to e=0.04, 9.1/9.5% up to 548 e=0.08). The lowest accuracy is seen for the number of children (40.0% up to e=0.04, 48.6% up 549 to e=0.08) and thickness (39.2% up to e=0.04, and 38.2% up to e=0.08). These errors are due to 550 misclassifications when the nodal root becomes entangled with higher-order lateral roots. 551 TopoRoot slightly underestimates the average and total length due to faulty cycle breaking and 552 misclassification errors in portions of nodal roots further away from the stem. TopoRoot's error 553 in nodal root tortuosity is higher than that of DynamicRoots for two reasons. First, the ground 554 truth tortuosity is close to 1 (only for the simulated data, but not for the real maize roots), and 555 DynamicRoots coincidentally produces values close to this because it mistakes many shorter 556 lateral roots as nodal roots, as evidenced by its much shorter average nodal root length and the 557 black boxes of Figure 13. Second, nodal roots sometimes are misclassified by TopoRoot closer 558 to their tips due to the large number of intersections between roots of different hierarchy levels, 559 resulting in excessive winding. TopoRoot slightly overestimates angle measurements due to 560 misclassification errors further away from the stem which bend the detected paths sideways; 561 these explain the errors in the tip angle measurements. 562 The errors for the lateral root traits (Table 3) are generally larger than nodal root traits, primarily 563 since the imaging noise has a greater impact on the thinner roots more than the thicker ones. 564 There is a greater underestimation of both the total first-order lateral roots and their total length 565 ( Figure S3), due to both the misclassification of the hierarchy levels and the loss of many thin 566 roots in the distance field. On the other hand, the misclassified first-order lateral roots are 567 counted as lateral roots of higher orders, and hence less errors lie in the total lateral root count 568 (22.2% up to e=0.04 and 37.0% up to e=0.08) and lengths (24.3% up to e=0.04 and 24.4% up to 569 e=0.08) over all orders. All methods significantly overestimate the first-order lateral root 570 thickness due to limits in the resolution, but TopoRoot produces the lowest error. The lowest 571 errors are seen in the first-order lateral emergence/midpoint/tip angles (3.4%/4.1%/5.4% up to 572 e=0.04 and 3.0%/4.0%/5.9% up to e=0.08) and tortuosity (4.6% up to e=0.04 and 4.5% up to 573 e=0.08). 574 575 Finally, combining nodal and lateral roots, TopoRoot produces on average 35.4% error (21.5% 576 up to e=0.04) in the total root count and 25.4% error (25.0% up to e=0.04) in the total root 577 length, which are much lower than DynamicRoots and GiaRoots (Table 4). Note that both 578 DynamicRoots and GiaRoots significantly underestimate the root count and total length, even 579 after topological simplification, and the amount of underestimation generally increases with the 580 level of noise ( Figure S4). The only global trait that TopoRoot does not have the lowest error is 581 the average length, due to a combination of DynamicRoots being coincidentally closer due to its 582 underestimation of both the total length and number of roots, and TopoRoot having an excessive 583 number of roots at higher noise levels. These are the same reasons why the two methods have 584 similar lateral root average length errors. 585 Discussion 607 A gap exists in the phenotypic measure of root system architecture between fine-grained 608 analyses that can be conducted on entire seedling root systems in laboratory settings, and much 609 coarser global analyses available to field researchers. Since root systems are an emergent 610 property of their many hundreds, thousands, or tens of thousands of constituent roots, this gap is 611 a major hindrance to a comprehensive understanding of root system development, environmental 612 interaction, and the genetics that influence these processes. In previous work, we showed that 613 when global 3D analysis of field excavated maize root crowns was compared to 3D seedling 614 analysis in gellan gum, genetically encoded differences were consistent despite major differences 615 in developmental stage and the growth environment [17]. Another step of our pipeline which is prone to errors is the cycle breaking portion of 637 skeletonization. Decisions on where to break cycles rely on local angle continuity information 638 near junctions, as well as the distance from the stem. However, if a root continues across many 639 junctions where cycles pass through, then the root may accidentally be cut off early at one of 640 these junctions. Our pipeline is sensitive to the amount of soil remaining in the sample, because 641 soil often has a similar or higher intensity as the roots. If a large clump of soil is stuck onto the 642 root, it may result in a thick region which is mistakenly identified as part of the stem. This 643 problem can be avoided by thorough cleaning of samples prior to scanning, and by basic image 644 thresholding prior to TopoRoot analysis. The final contrast will affect the ideal choices for k and 645 n with respect to s; the greater the contrast, the smaller the gap between k and n need to be to 646 incur the same amount of topological simplification. 647 The quality of the topological repairs may be improved by considering not only the image 648 intensity when computing additions and removals of contents, but also incorporating geometric 649 criteria such as the tubularity of a region to identify it as part of a root. This will allow for a 650 greater amount of topological simplification without sacrificing geometric optimality. The 651 suboptimal local decisions of cycle-breaking may be potentially improved by grouping cutting 652 decisions together to produce a collective score. These groups can gradually be built from the 653 bottom-up to produce a more globally optimal solution. While TopoRoot is designed for and validated on X-ray CT scans of excavated maize root 684 crowns, the tool can be potentially adapted to other types of root systems and imaging 685 modalities. For root crowns with multiple tillers (e.g., sorghum), we offer a mode of TopoRoot 686 which extends the stem-detection heuristic (during the skeletonization step) by producing a stem 687 path within each region of the skeleton above a given thickness threshold. Preliminary visual 688 experiments show that TopoRoot's multiple-tiller mode produces plausible hierarchies at a 689 quality similar to that seen in the single-tiller mode ( Figure 14). Further expanding the stem-690 detection heuristic to identify the primary root would make the pipeline applicable to taprooted 691 systems as well. TopoRoot should work equally well for fine-grained analysis of 3D root system 692 architecture reconstructions derived from in situ imaging methods such as X-ray CT, MRI, and 693 optical imaging, provided they are first segmented from background (primarily soil and pot). 694 Segmentation can be done using a variety of methods such as region-growing [ some of these methods produce a binary volume (e.g., region-growing) whereas others produce a 697 probability density field (e.g., deep learning). Since TopoRoot requires a gray-scale intensity 698 volume with three thresholds (shape, kernel and neighborhood), a binary segmentation will first 699 need to be converted into a Euclidean distance field. 700

Software availability 701
TopoRoot is available for free at: https://github.com/danzeng8/TopoRoot 702 Included in the page are instructions to run the software, and details on the formats of the input 703 files, with a .dat accompanying the .raw file to specify the dimensions. TopoRoot is currently 705 configured to build and run on Windows 10 machines, and has not been tested on other 706 platforms. We plan on releasing a Linux build of TopoRoot in the future. A graphical user 707 interface is also available on the page for visualizing the output hierarchy. 708

709
We introduced TopoRoot, a high-throughput method for computing the root hierarchy and fine-710 grained root traits from a 3D image. TopoRoot specifically addresses topological errors, which 711 are common in 3D imaging and segmentation and are barriers for obtaining accurate root 712 hierarchies. Our method combines state-of-the-art methods developed in computer graphics with 713 customized heuristics to compute a wide variety of traits at each level of the root hierarchy. 714 When tested on both 3D scans of excavated maize root crowns and simulated root systems with 715 artificially added noise, TopoRoot exhibits higher accuracy than existing tools (DynamicRoots 716 and GiaRoots) in both coarse-grained and fine-grained traits. Furthermore, the efficiency and 717 automation of TopoRoot makes it ideal for a high-throughput analysis pipeline, and the results 718

Groups Trait Description
Global traits Total length Sum of the lengths of stem, nodal roots, and lateral roots.
Number of roots Total number of roots across all hierarchy levels.
Average length Total length divided by number of roots.

Stem traits
Average stem thickness Average thickness of the vertices along the stem path.

Stem length
Length of the stem path.
Per-level traits Level n root count The number of level n roots.
Total level n root length Sum of the lengths of level n roots. The length is the skeleton distance from the beginning of the root to its tip.
Average level n root length Total level n root length divided by level n root count.
Level n root tortuosity Length of a root (skeleton distance) divided by the Euclidean distance from the beginning to the tip, averaged across all level n roots.
Level n root thickness Thickness associated with the skeleton vertices in the root, averaged across all level n roots.

Number of level n root children
Number of level 2 roots divided by number of level 1 roots.
Level n root tip angle Angle between the stem direction and the vector from the beginning to the tip of a root, averaged across all level n roots.
Level n root emergence angle Angle between the stem direction and the vector from the beginning to 30 vertices along the skeleton of a root, averaged across all level n roots.
Level n root midpoint angle Angle between the stem direction and the vector from the beginning of a root to the halfway point of the root, averaged across all level n roots.

Aggregated lateral root traits
Total lateral length Sum of the lengths of lateral roots whose hierarchy level is greater than or equal to 2.
Number of lateral roots Number of lateral roots whose hierarchy level is greater than or equal to 2.
Average lateral root length Average length of lateral roots whose hierarchy level is greater than or equal to 2. Comparing the accuracy of computing lateral root traits between TopoRoot, DynamicRoots, and 905

Supplementary
DynamicRoots+ across 55 models. The tan line represents the ground truth, and the results of all 906 methods are computed as a percentage of the ground truth. 907 908 Supplementary Fig. S4  909