## Abstract

Changes in trabecular micro-architecture are key to our understanding of osteoporosis. Previous work focusing on structure model index (SMI) measurements has concluded that disease progression entails a shift from plates to rods in trabecular bone, but SMI is heavily biased by bone volume fraction. As an alternative to SMI, Ellipsoid Factor (EF) has been proposed as a continuous measure of local trabecular shape between plate-like and rod-like extremes. We investigated the relationship between EF distributions, SMI and bone volume fraction of the trabecular geometry in a murine model of disuse osteoporosis as well as from human vertebrae of differing bone volume fraction. We observed a moderate shift in EF median (at later disease stages in mouse tibia) and EF mode (in the vertebral samples with low bone volume fraction) towards a more rod-like geometry, but not in EF maximum and minimum. These results support the notion that the plate-to-rod transition does not coincide with the onset of bone loss and is considerably more moderate, when it does occur, than SMI suggests. A variety of local shapes not straightforward to categorise as rod or plate exist in all our trabecular bone samples.

**Subject Areas** osteoporosis, trabecular bone, morphometry

## 1. Introduction

The metabolic bone disease osteoporosis is a major health concern associated with high mortality rates and considerable economic costs [1,2], likely to be exacerbated by the increase in the proportion of elderly people in future demographics. In this disease, imbalance between osteoblastic (bone-forming) and osteoclastic (bone-resorbing) cell activity is thought to lead to lower bone turnover and relatively higher resorption than formation, and thus to a lower amount of bone [3]. Lower bone mass causes reduced mechanical competence and increased fracture risk with age [4].

The large amount of bone surface relative to bone volume in trabecular bone (compared to cortical bone) may make it particularly sensitive to shifts in the bone remodelling balance [5]. Beyond the loss of bone volume fraction in the trabecular bone compartment, changes in tissue morphology may contribute to the deterioration of bone quality of osteoporotic patients. Because such osteoporosis-related changes to the trabecular bone micro-architecture form a link between the bone (re)modelling balance at a tissue level and the mechanical performance of the bone organ, they are key to our understanding of the disease.

Prominent amongst parameters considered when evaluating tissue-level morphological changes is *structure model index* (SMI) [6]. SMI was designed to estimate how rod- or plate-like a trabecular geometry is [7]. Evaluation of SMI across a number of data sets from human patients and animal models suggests that trabecular geometry transitions from being more plate-like to more rod-like as osteoporosis severity increases (“plate-to-rod transition”) [8–12]. However, it is well known that SMI correlates strongly with bone volume fraction, rendering the comparison of SMI values between samples of vastly different bone volume fraction (such as osteoporotic samples versus healthy control samples) dubious. Furthermore, the concept of SMI is based on relative changes in surface area in response to a small dilation, and relies on the fact that dilating a convex shape (such as a sphere (SMI=4), a cylinder (SMI=3), or an infinite plane (SMI=0)) never creates a smaller surface area. This is not the case in trabecular bone, because parts of the trabecular bone surface are concave and become smaller when the volume is expanded [13].

Ellipsoid Factor (EF) has been proposed as an alternative method to measure the plate-to-rod transition in trabecular bone [14]. EF has since been used within and beyond bone biology (e.g. bone surgical implant testing [15] and the characterisation of the trabecular bone phenotype of genetic dwarfism [16], of the primate mandible [17], of the human tibia [18], and of animal models of osteoarthritis [19], but also studies of fuel cell performance [20,21]). Apart from the original critique of SMI [13], as far as we know, there have been no further reports of EF in osteoporotic samples in the literature.

In this study, we expand on our two previous studies on the use of EF and the putative plate-to-rod transition in osteoporosis [13,14]. Specifically, we present new EF data on trabecular bone from an animal model of disuse osteoporosis as well as from human second lumbar (L2) vertebral bodies from women of varying age and bone volume fraction. The aim of the study is to investigate the association between variables describing the trabecular architecture (EF and SMI) and bone health. Our EF data relies on an updated and validated implementation of EF (details in Supplementary Material (a)) available freely as part of the latest BoneJ, a collection of ImageJ plug-ins intended for skeletal biology [22].

## 2. Methods

### (a) EF algorithm

The EF algorithm was first reported in a previous study [14] and is explained here again due to its fundamental relevance to the present study.

EF is a scalar value assigned to each foreground pixel in the three-dimensional binary image stack of interest. The EF of each pixel depends on the maximal ellipsoid that contains the pixel and that is contained in the image foreground. Denoting the axis lengths of the maximal ellipsoid as *a, b* and *c* (with *a* ≤ *b* ≤ *c*), EF of each pixel is calculated as a difference of sorted axis ratios:

$$\mathrm{EF} = \frac{a}{b} - \frac{b}{c}$$

EF is confined between -1 and 1, with -1 being very plate-like, and 1 very rod-like.
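As a minimal sketch of this definition, the following Python function computes EF from three axis lengths, assuming the difference of sorted axis ratios takes the form a/b − b/c (the form consistent with the stated range of −1 for plates and 1 for rods). The function name and the example axis lengths are illustrative and are not taken from BoneJ.

```python
def ellipsoid_factor(axes):
    """Return EF = a/b - b/c for three positive axis lengths given in any order."""
    a, b, c = sorted(axes)  # enforce a <= b <= c
    if a <= 0:
        raise ValueError("axis lengths must be positive")
    return a / b - b / c


# Illustrative shapes (hypothetical axis lengths):
rod = ellipsoid_factor((1.0, 1.0, 9.0))     # one long axis, EF close to 1
plate = ellipsoid_factor((1.0, 3.0, 3.0))   # one short axis, EF close to -1
sphere = ellipsoid_factor((2.0, 2.0, 2.0))  # all axes equal, EF of 0
```

Because both ratios lie in (0, 1], their difference can approach but never reach −1 or 1 for finite, non-zero axis lengths.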

Ellipsoid Factor is calculated by fitting locally maximal ellipsoids into the image foreground, then iterating over the foreground pixels to find the largest ellipsoid in which each pixel is contained. Note that the locally maximal ellipsoid is generally non-unique (Supplementary Material (d)(ii)).

#### (i) Ellipsoid fitting

First, points where a small sphere can start to grow (“seed points”) are determined. Two strategies for finding seed points exist. The first is called distance-ridge based seeding. It involves subtracting the results of a morphological opening and a morphological closing operation on the distance transform of the input image from each other. The second is a topology-preserving skeletonisation [23]. Distance-ridge based seeding is computationally more efficient than skeletonisation in practice, but it may miss thin features that skeletonisation preserves well and may overestimate the number of seed points needed to fit ellipsoids to a plate.

After being seeded, each spherical ellipsoid grows uniformly by one user-defined increment at a time until a number of surface points equal to the user-defined “contact sensitivity” parameter hit the trabecular bone boundary (a background pixel). Surface points are chosen from a random uniform distribution on the ellipsoid surface.
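The growth step can be sketched as follows. This is a simplified Python illustration rather than the BoneJ implementation: the foreground is represented by a hypothetical predicate `is_bone(x, y, z)` instead of an image stack, the sampling directions are drawn once per sphere (whether they are re-drawn at each step is an assumption left open here), and all names are illustrative.

```python
import math
import random


def random_unit_vectors(n, rng):
    """Draw n directions uniformly on the unit sphere (normalised Gaussian triplets)."""
    vectors = []
    while len(vectors) < n:
        x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
        norm = math.sqrt(x * x + y * y + z * z)
        if norm > 1e-12:  # skip the (very unlikely) degenerate draw
            vectors.append((x / norm, y / norm, z / norm))
    return vectors


def grow_sphere(is_bone, centre, increment, contact_sensitivity, n_points, rng,
                max_radius=1000.0):
    """Grow a sphere by `increment` until at least `contact_sensitivity`
    surface sample points lie in the background; return that radius."""
    directions = random_unit_vectors(n_points, rng)
    radius = 0.0
    while radius < max_radius:
        radius += increment
        contacts = 0
        for dx, dy, dz in directions:
            point = (centre[0] + radius * dx,
                     centre[1] + radius * dy,
                     centre[2] + radius * dz)
            if not is_bone(*point):
                contacts += 1
        if contacts >= contact_sensitivity:
            return radius
    return radius


# Hypothetical foreground: an infinite slab of half-thickness 3 around z = 0.
slab = lambda x, y, z: abs(z) < 3.0
radius_at_contact = grow_sphere(slab, (0.0, 0.0, 0.0), increment=0.5,
                                contact_sensitivity=1, n_points=200,
                                rng=random.Random(42))
```

In this toy slab, growth stops at the first increment for which a sampled surface point leaves the slab, so a smaller increment or more sample points brings the contact radius closer to the true half-thickness.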

When the growing ellipsoid hits the trabecular bone boundary for the first time, the vector from the ellipsoid centre to the average contact point is set as the first ellipsoid axis and the ellipsoid is contracted slightly. Growth of the ellipsoid then continues in the plane orthogonal to this first axis, again until the boundary is hit. This initial ellipsoid fitting is followed by a series of small random rotations, translations and dilations of the ellipsoid in an attempt to find a larger ellipsoid in the local region. These attempts end if no increase in volume of the ellipsoid is found after a user-set maximum number of iterations (default 50, see Supplementary Material (b)), or if the total number of attempts exceeds ten times the maximum iteration number. If more than half of the sampling points on the ellipsoid are outside the image boundary, the ellipsoid is considered invalid, removed and ignored in further calculations.
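The two stopping conditions for the random refinement can be sketched as a small loop. This Python illustration only models the termination logic: `propose` is a hypothetical stand-in for one random rotation, translation or dilation attempt and simply returns a candidate volume, whereas the real algorithm must also check that the candidate ellipsoid stays inside the foreground.

```python
def optimise_volume(initial_volume, propose, max_iterations=50):
    """Repeat random attempts until no improvement has been seen for
    `max_iterations` consecutive attempts, or until the total number of
    attempts exceeds 10 * max_iterations. Returns (best_volume, attempts)."""
    best = initial_volume
    since_improvement = 0
    total_attempts = 0
    while (since_improvement < max_iterations
           and total_attempts <= 10 * max_iterations):
        candidate = propose(best)  # hypothetical: volume after one random tweak
        total_attempts += 1
        if candidate > best:
            best = candidate
            since_improvement = 0  # reset the patience counter on success
        else:
            since_improvement += 1
    return best, total_attempts


# Two extreme, purely illustrative proposal functions:
never_better = optimise_volume(10.0, lambda v: v)        # stops after 50 attempts
always_better = optimise_volume(10.0, lambda v: v + 1)   # stops on the attempt cap
```

The cap on total attempts guards against an ellipsoid that keeps finding marginal improvements and would otherwise never terminate.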

#### (ii) Assign EF to each pixel and averaging over runs

Once maximal ellipsoids are found for each seed point, each foreground pixel is assigned the EF value of the largest ellipsoid that contains it, or NaN (not a number) if no ellipsoids contain that pixel. One iteration of fitting ellipsoids and assigning EF to each pixel is termed a *run*.
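The assignment rule can be illustrated with a short Python sketch. For simplicity it assumes axis-aligned ellipsoids described by a centre and three semi-axis lengths, and it assumes EF = a/b − b/c on the sorted semi-axes; the fitted ellipsoids in the actual algorithm are arbitrarily oriented, and the data layout and names here are hypothetical.

```python
import math


def contains(ellipsoid, point):
    """True if `point` lies inside or on an axis-aligned ellipsoid."""
    (cx, cy, cz), (rx, ry, rz) = ellipsoid["centre"], ellipsoid["radii"]
    x, y, z = point
    return ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 + ((z - cz) / rz) ** 2 <= 1.0


def volume(ellipsoid):
    rx, ry, rz = ellipsoid["radii"]
    return 4.0 / 3.0 * math.pi * rx * ry * rz


def ef(ellipsoid):
    a, b, c = sorted(ellipsoid["radii"])
    return a / b - b / c


def assign_ef(foreground_pixels, ellipsoids):
    """Map each pixel to the EF of the largest ellipsoid containing it, else NaN."""
    by_size = sorted(ellipsoids, key=volume, reverse=True)  # largest first
    result = {}
    for pixel in foreground_pixels:
        result[pixel] = math.nan
        for ellipsoid in by_size:
            if contains(ellipsoid, pixel):
                result[pixel] = ef(ellipsoid)
                break  # first hit is the largest containing ellipsoid
    return result


# Illustrative ellipsoids: a rod along z and a larger plate in the xy-plane.
rod = {"centre": (0, 0, 0), "radii": (1.0, 1.0, 9.0)}
plate = {"centre": (0, 0, 0), "radii": (4.0, 4.0, 1.0)}
ef_map = assign_ef([(0, 0, 0), (0, 0, 5), (10, 10, 10)], [rod, plate])
```

Here the pixel at the origin lies in both ellipsoids and takes the EF of the larger plate, the pixel at (0, 0, 5) lies only in the rod, and the distant pixel remains NaN.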

Ellipsoid Factor is a stochastic process and therefore results can vary from run to run. The user has the option to average the outputs over several runs to smooth the results. From experience on various real-life examples, we recommend averaging over 6 runs (the “repetitions” input parameter) for the final result generation. This typically reduces the median and maximum EF variation per pixel per run to less than 0.15 and 0.4, respectively (see Supplementary Material (c)).
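A minimal Python sketch of averaging over runs and of a per-pixel convergence measure is given below. It assumes each run is a flat list of per-pixel EF values in the same pixel order, that NaN values are ignored when averaging, and that variation is measured as the absolute per-pixel difference between two outputs; how BoneJ handles these details internally is not specified here, so these choices are assumptions.

```python
import math
import statistics


def average_runs(runs):
    """Per-pixel mean EF over several runs, ignoring NaN; NaN if never filled."""
    averaged = []
    for values in zip(*runs):
        valid = [v for v in values if not math.isnan(v)]
        averaged.append(statistics.fmean(valid) if valid else math.nan)
    return averaged


def median_and_max_change(previous, current):
    """Median and maximum absolute per-pixel change between two EF outputs."""
    diffs = [abs(c - p) for p, c in zip(previous, current)
             if not (math.isnan(p) or math.isnan(c))]
    if not diffs:
        return math.nan, math.nan
    return statistics.median(diffs), max(diffs)


# Made-up per-pixel values for three pixels in two runs (illustration only).
run_1 = [0.2, math.nan, -0.4]
run_2 = [0.4, math.nan, -0.2]
mean_ef = average_runs([run_1, run_2])
change = median_and_max_change(run_1, run_2)
```

Tracking the median and maximum change as further runs are added gives a simple way to judge when additional repetitions no longer alter the result appreciably.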

#### (iii) EF inputs and outputs

Some further mathematical considerations on the shape of the distributions to be expected when calculating a difference of axis ratios can be found in Supplementary Material (d).

In the present study, we ran Ellipsoid Factor on two data sets, with sample descriptions and statistical analysis detailed in the next two subsections. EF input parameters used for each of these studies are listed in Table 4. For both studies, we measured BV/TV and SMI, calculated descriptive statistics of the EF distribution (median, maximum, and minimum), and plotted EF histograms.

### (b) Disuse osteoporosis in mouse tibiae

X-ray microtomography (XMT) scans (5 µm nominal pixel spacing) of 12 murine tibiae were obtained from an unrelated study [25] (in preparation). The animals had undergone sciatic neurectomy to the right hindlimb, inducing one-sided disuse osteoporosis. They were divided into three groups of four mice. Groups 1, 2 and 3 were euthanised 5, 35 and 65 days after surgery, respectively. Trabecular bone from the proximal metaphysis was segmented by drawing around the trabecular-cortical boundary using the software CTan (Bruker, Belgium).

The segmented images were denoised using a 3D median filter and thresholded at a pixel value of 75 (Figure 2). The thresholding value was selected visually as sensible on one sample and kept consistent across samples. As the EF distributions were uni-modal and not normal in all cases, the EF median, maximum and minimum were taken as representative values for each specimen. SMI values were computed for each sample (using Hildebrand and Rüegsegger’s method [7] with volume resampling 2 and mesh smoothing 0.5), and bone volume fraction measurements were taken from the raw data of an unrelated study [25] (in press).

For each group, paired t-tests comparing EF median, SMI and bone volume fraction between control and disuse leg were performed using the R software [26]. We performed Pearson’s product-moment correlation tests for association between EF median and bone volume fraction, and between SMI and bone volume fraction for each group. The R scripts used for this purpose can be found in an online repository [27] under `/R/paired-mouse-disuse-test.R`.
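To make the paired comparison concrete, the Python sketch below computes the paired t statistic and its degrees of freedom from matched control and disuse values. It is an illustration only: the analysis in this study was done with the R scripts cited above, the numbers are made up and are not measurements from our samples, and the p-value (which requires the t distribution) is not computed here.

```python
import math
import statistics


def paired_t_statistic(control, treated):
    """Return (t, degrees of freedom) for paired samples.

    t = mean(d) / (sd(d) / sqrt(n)), where d are the within-pair differences.
    """
    if len(control) != len(treated):
        raise ValueError("paired samples must have equal length")
    diffs = [t - c for c, t in zip(control, treated)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Made-up values for four animals (illustration only, not study data).
control_leg = [1.0, 2.0, 3.0, 4.0]
disuse_leg = [2.0, 4.0, 5.0, 7.0]
t_value, dof = paired_t_statistic(control_leg, disuse_leg)
```

With four animals per group the test has only three degrees of freedom, so large t values are needed to reach significance.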

### (c) Ellipsoid Factor in human vertebrae of varying trabecular bone volume fraction

To investigate the association of SMI and EF with human bone health, we imaged sagittal sections of 22 vertebrae from women of varying age (24-88 years old) using XMT (30 µm pixel spacing). Pixels with a linear attenuation coefficient of more than 0.7 cm^{−1} were classified as bone, others as background. Cuboidal regions of interest containing trabecular bone, aligned with the image axes, were chosen manually. The vertebrae were originally collected and prepared for imaging with scanning electron microscopy in a previous study [28]. This data set was interesting to the present study for two reasons. Firstly, these are the first EF numbers obtained on healthy and osteoporotic samples from humans. Secondly, they constitute a challenge for choosing reasonable EF input parameters because they are close to the resolution limit at which we can expect EF to fit the local shape well (trabecular thickness is approximately 5-8 pixels in these images). We additionally report mean and maximum trabecular thickness (Tb.Th [mm]) in these samples.

The age distribution of our vertebral samples was non-normal, as it was skewed to the left by the prevalence of older samples (Shapiro-Wilk test *p <* 0.05). We therefore performed a non-parametric test of association of age with bone volume fraction. All other variables of interest (EF Median, EF Maximum, EF Mode, EF Minimum, SMI, SMI+, SMI-, mean Tb.Th, maximum Tb.Th) could be assumed to follow a normal distribution (Shapiro-Wilk test *p >* 0.05). As a consequence, we used Pearson’s *r* as a measure of association between these variables and bone volume fraction in our statistical tests. All statistical analysis of the vertebral samples was based on a custom script (available at [27] under `R/histo-EF-stats-vertebrae-final.R`) using the R programming language [26].
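The difference between the two measures of association used here can be sketched in Python: Pearson's *r* is computed on the raw values, whereas Spearman's *ρ* is Pearson's *r* computed on ranks (with average ranks for ties). This is an illustrative re-implementation, not the R script cited above, and the example values are made up.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Made-up monotonic but non-linear data (illustration only).
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 4.0, 9.0, 16.0]
rho = spearman_rho(x, y)  # perfectly monotonic
r = pearson_r(x, y)       # strong, but not perfectly linear
```

Because it depends only on ranks, Spearman's *ρ* does not require the variables to be normally distributed, which is why it suits the skewed age distribution.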

## 3. Results

All distributions of EF observed in images of bone were uni-modal, as seen in the histograms of Figures 3 and 12. As described earlier, we used the median, maximum and minimum (and the mode, for the vertebrae) of the distribution as a representative value to describe the distributions of local shape in these images for statistical analysis.

### (a) Disuse osteoporosis in mouse tibiae

Paired one-sided t-tests (n=4) showed that BV/TV and SMI values differed significantly (at the 5% level) between disuse and control limbs at all time points (Figures 5 and 6). Minimum and maximum EF were not statistically associated with disease state (*p >* 0.05) at any time point (Figure 7). There was no link between EF median and disuse at 5 days (paired one-sided t-test, *p >* 0.05) and 35 days (difference not normally distributed (Shapiro-Wilk *p <* 0.05), paired one-sided Wilcoxon rank sum test, *p* = 0.06), but there was a statistical difference at 65 days (*p <* 0.05). Unlike SMI, these measurements therefore suggest that a small shift of about 0.1 in EF occurred only after a large amount of bone had already been lost. Over all time points, bone volume fraction explained considerably less of the variance in EF (Pearson’s *r*^{2} = 0.25, *p <* 0.05) than SMI (Pearson’s *r*^{2} = –0.81, *p <* 0.001, Figure 8). The R script used to perform this analysis can be found under `/R/mouse-smi-tests.R` in [27]. EF images and histograms for our murine samples can be seen in Figures 4 and 3, respectively. EF filling percentage was higher than 90% for all our murine samples, although it differed significantly between disuse and control at all time points (paired t-test, *p <* 0.05).

### (b) Ellipsoid Factor in human vertebrae of varying trabecular bone volume fraction

Filling percentages ranged from 74% to 97% and median change in EF between the two final runs ranged from 0.1 to 0.17 (Figure 9). Correlation tests showed that there was no association (*p >* 0.05) between bone volume fraction and any of the three convergence variables median change, maximum change and filling percentage, indicating that the EF algorithm did not preferentially fill the trabecular bone more completely or in a more stable way in samples with relatively low or high bone volume fraction. This was evidence for a satisfactory convergence of the EF algorithm, albeit not as complete as in the murine samples.
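For readers unfamiliar with this convergence variable, the short Python sketch below computes a filling percentage, assuming it denotes the share of foreground pixels that received an EF value rather than NaN. The function name and the example values are illustrative only.

```python
import math


def filling_percentage(ef_values):
    """Percentage of foreground pixels with a valid (non-NaN) EF value."""
    if not ef_values:
        raise ValueError("no foreground pixels given")
    filled = sum(1 for value in ef_values if not math.isnan(value))
    return 100.0 * filled / len(ef_values)


# Made-up EF values for four foreground pixels, one of them unfilled.
example = filling_percentage([0.1, -0.3, math.nan, 0.5])
```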

There was a negative association between BV/TV and age (Spearman’s *ρ* = –0.58, *p* = 0.004), but not between BV/TV and mean or maximum trabecular thickness (*p >* 0.05). SMI, SMI+ and SMI- were strongly and significantly associated with bone volume fraction: values for Pearson’s *r* were –0.69, –0.65, and –0.73, respectively, while *p*-values were all *<* 0.005 (Figure 10). SMI ranged from 1.36 to 3.11.

Median, maximum and minimum EF were not associated with bone volume fraction (*p >* 0.05, Figure 11), and there was a mild negative association between bone volume fraction and EF modal value (*r* = –0.45, *p* = 0.03). Histograms of the EF distribution were occasionally skewed in either direction across all values for bone volume fraction (Figure 12). Sometimes similar EF values clustered in one region of the vertebra, while in other cases, a range of EF values could be found in all anatomical regions considered. Figure 13 shows EF images for 20 of the 22 vertebrae we analysed.

## 4. Discussion

We measured Ellipsoid Factor distributions in trabecular bone from healthy and unloaded mouse tibiae and from human vertebrae. Only on some occasions, EF supported the presence of a small shift towards a more rod-like geometry linked with decreases in bone volume fraction. SMI, on the other hand, suggested the presence of a drastic plate to rod transition whenever a difference in bone volume fraction was found. EF distributions in the samples from both species we investigated in the present study were consistently uni-modal.

In the murine samples, bone loss happened shortly after surgery in one condyle, but EF median changed only later during disease progression. This suggests that local shape changes in the trabecular bone may be delayed with respect to the initial loss of bone. The strong interdependence between SMI and bone volume fraction is misleading in this case, as it supports an immediate change in local trabecular shape that culminates in a geometry that is more convex than a perfect rod (SMI>3) at the latest time point. Minimum and maximum EF values are not different in healthy and osteoporotic murine samples, underlining that very plate- and very rod-like structures co-exist in all samples.

Similarly, in the human vertebra samples, only the mode of the distribution correlated with bone volume fraction, highlighting that any changes in local shape linked to a decrease in bone volume fraction are subtle. Considerable variability in local shape can be seen in the EF images of the vertebral samples. Some of the samples agree with the results of a descriptive anatomical study of human 4th lumbar vertebral bodies, which characterised the trabecular geometry as central plates and braces surrounded cranially and caudally by a honeycomb of rods [29].

In this study, we further presented some recommendations for suitable default parameters for EF (Table 1), based on the convergence behaviour of EF reported in Supplementary Material (c).

### (a) What is the mechanical relevance of plates and rods in cancellous bone?

Modelling cancellous bone as a cellular solid gave rise to the idea that plates and rods contribute to mechanical performance. Theoretical, idealised models of open-cell and closed-cell porous solids predicted a dependence of the stiffness and strength on the square and the cube of the characteristic length *r*, respectively. In a seminal study for the concept of rods and plates in trabecular bone, Gibson analysed previous data from this perspective and showed that these models were consistent with a transition from open-cell to closed-cell mechanical behaviour at a bone volume fraction of 0.2 [30]. This is further evidence that attempting to measure rods and plates in trabecular bone is not independent of the amount of bone present (contrary to what was stated in the original SMI study [7]). The bone volume fraction in our samples was below 20%, where the influence of concave surface and negative SMI are less than in samples with greater BV/TV [13], so it would be interesting to compare EF in samples with bone volume fraction above and below this value in the future.

The mechanical environment has a strong effect on bone size and shape at an organ and tissue level (e.g. [31–33], for a review, see [34]), but Frost’s mechanostat may not be the main driver of trabecular adaptation within the life of an individual [5]. Across species, trabecular bone micro-structure scales as a function of animal size and is likely to behave differently in small animals compared to large animals [35].

Changes in local shape may indicate preferential osteoclastic resorption and/or osteoblastic formation in certain areas of bone. Qualitative descriptions based on scanning electron micrographs of human lumbar vertebrae suggest defective and/or slowed bone formation and mineralisation, as well as decoupling of resorption and formation as characteristic of the osteoporotic trabecular geometry at a length scale below the one investigated in the present study [36]. Resorption cavities in human fourth lumbar vertebrae may occur most often near trabecular nodes, with plate-like trabeculae the next most common location [37]. The study gives no details on how plates, rods, nodes and “fenestrations” are characterised. It would be interesting to correlate SMI and EF results with such observational studies in the future.

### (b) Measures of local shape beyond SMI and EF

Individual trabecula segmentation (ITS) has been proposed as a method to classify the local shape of trabecular bone as rods and plates [38]. ITS is based on a decomposition of the trabecular geometry into surfaces and curves [39], with subsequent assignment of all foreground pixels to one of these surfaces and curves based on a measure of vicinity and orientation [40]. ITS has been measured in biopsies of hip replacement patients with inter-trochanteric fractures [41]. Compared to cadaveric controls, these fracture patients had lower ITS plate bone volume fraction, but equal ITS rod bone volume fraction, as well as lower stiffness moduli and lower overall bone volume fraction (BV/TV). We find it interesting that ITS-measured plate volume fraction correlates with stiffness in these studies. However, we note that ITS-measured axial volume fraction is also correlated with stiffness, often more strongly than plate volume fraction. It is clear that, at equal bone volume fraction, bone that is less aligned to the direction in which stiffness is measured will behave in a more compliant manner than bone that is more strongly aligned to this direction [42]. We therefore suggest that, in the ITS studies, the driving factor for these observations may not be a change in local plate/rod shape, but rather a change in local alignment to the axes in which stiffness is measured. It would be interesting to compare ITS and EF results in the future.

Another method that decomposes trabecular bone into rods and plates was developed but validated only on piece-wise convex objects [43]. Applying it to human vertebral samples suggested that three parameters of micro-architecture (two relating to the supposed rod elements) explained 90% of bone stiffness, the same amount of variation in sample stiffness explained by apparent bone volume fraction alone [44]. However, all three of these parameters had a significant and strong correlation with bone volume fraction, and this study therefore does not constitute evidence for geometrical changes in the trabecular compartment driving mechanical properties beyond the loss of material. Fatigue failure of trabecular bone may further be related to elements oriented transversely to the main loading direction, which have little effect on stiffness and strength [45].

### (c) Limitations and Future work

Ellipsoid Factor is a useful addition to the many geometrical and topological quantities that are routinely measured in trabecular bone, some of which depend on each other, as we have shown here. Ellipsoid Factor is at least designed to be a priori independent of bone volume fraction, the most important descriptor of trabecular bone mechanical properties [42,46]. The lengths of the ellipsoid semi-axes a,b,c as half-thickness, half-width, and half-length trabecular variables could be seen as an extension to measuring trabecular thickness alone.

The samples we consider in this paper are cross-sectional, which unfortunately precludes us from following the trabecular architecture of a single individual over time. Ellipsoid Factor, like all other measures of trabecular micro-architecture, requires a sufficient resolution of the individual geometrical features to minimise artefacts such as noise and partial volume effect. Where resolution is insufficient for EF to run on a binarised image, it might be possible to locate the trabecular boundary using fuzzy edge detection (and therefore circumventing the need for precise thresholding), as is done in the tensor scale algorithm [47–49]. The current EF software is designed in such a way as to make an approach based on fuzzy boundary detection straightforward. Very small trabeculae may be routinely missed by XMT altogether, but dealing with this limitation was outside the scope of this study.

Ellipsoid Factor is a complex algorithm, with several input parameters that need to be tailored to the application. We believe that this is also an advantage in some ways, as it will force users to better understand the methods they are using. We encourage users to ask questions on the ImageJ forum (https://forum.image.sc/). Despite its complexity, an advantage of EF is that it reduces local shape down to a single number per pixel. Important information on the subtlety of trabecular local shape is lost due to this simplification and users are encouraged to view and interpret the Flinn peak plot because it is a more complete, but more complex, representation of the local shapes present in their sample (Figure 1). The Flinn plot may require more advanced statistics, for 2-D, non-independent response variables, to rigorously compare sampled groups. It might be possible to improve the performance of EF in the future by transferring some parallel computations onto the graphics card [50].

Further avenues of future research could include investigating how well EF characterises curved trabecular bone, and understanding whether characteristic combinations of axis ratios exist for an individual or a group that are not immediately recognised by looking at the axis ratio difference.

## 5. Conclusion

Our investigations suggest that local shape in trabecular bone is not straightforward to decompose into rods and plates, and that a wealth of shapes across the plate-rod continuum exist in any sample. Our data support the presence of a slight tendency of the trabecular geometry to have higher EF in osteoporotic samples, possibly as a consequence of a cell-driven re-organisation that is delayed with respect to the initiation of bone loss. This transition, where it occurs, is considerably more subtle than SMI values suggest.

## Author contributions

MD had the idea for Ellipsoid Factor and designed the overall project. AAF and MD implemented the ImageJ code. AAF wrote the R and Python scripts, did the statistical analysis, made the figures, and wrote the manuscript draft. RS performed the surgery and dissected the mouse tibiae. BJ imaged the murine samples. SM segmented the trabecular compartment of the mouse tibiae, and measured bone volume fraction in the mouse samples under the guidance of BJ. DM and AB prepared and imaged the human vertebrae. All authors read the manuscript, provided feedback and approved the final version of the manuscript.

## Competing interests

MD was a member of the Editorial Board of Royal Society Open Science at the time of submission and was not involved in the assessment of this submission.

## Data availability

Vertebral XMT scans (`https://figshare.com/projects/Assessment_of_Bone_Quality_in_Osteoporosis_-_XMT_and_SEM/76962`) and segmented XMT of mouse trabecular bone (`https://figshare.com/projects/Segmented_trabecularbone_from_microCT_scans_of_mice_with_one-sided_neurectomy_to_hindlimb/79583`) are available on the data sharing repository figshare. Some vertebral scans can additionally be explored as 3d renderings on SketchFab: `https://sketchfab.com/alexjcb/collections/vertebrae-sections`.

The BoneJ source can be found on GitHub at `https://github.com/bonej-org/BoneJ2`, with installation instructions at `https://imagej.net/BoneJ2#Installation`.

## Ethics

The image data were obtained and re-used from unrelated experiments in which animal procedures and human samples were used with appropriate ethical approval. The use of animals in the unrelated study was carried out in accordance with the Animals (Scientific Procedures) Act 1986, an Act of Parliament of the United Kingdom, approved by the Royal Veterinary College Ethical Review Committee and the United Kingdom Government Home Office, and followed ARRIVE (Animal Research: Reporting of In Vivo Experiments) guidelines. Human second lumbar vertebral body samples were obtained via the European Union BIOMED I study “Assessment of Bone Quality in Osteoporosis”.

## Supplementary material

### (a) Implementation and software design

Results presented in this study were obtained with a development version of BoneJ (commit 0ce1c5eba). We have verified that the differences obtained with the initial “styloid” release of BoneJ [22] are approximately 0.15 on average, which is what is to be expected from the stochasticity of the EF algorithm.

The latest Ellipsoid Factor implementation adheres to the principles of modern ImageJ - ImageJ2 [51], dividing the execution of the algorithm into small, modular and re-usable parts (“ops”, in our case: seed-point finding, ellipsoid fitting) combined into a high-level ImageJ plugin (“command”, in our case, the Ellipsoid Factor command, part of BoneJ2 [52]). Installation instructions for BoneJ2 can be found at https://imagej.net/BoneJ2#Installation.

The modularity of the seed point finding allows easy switching between the two existing seed point finding strategies (distance-ridge-based and topology-preserving). Similarly, the modularity of ellipsoid fitting allows future extension to e.g. fuzzy edge detection of trabecular surface points on low-resolution grey scale images (which may be relevant for *in-vivo* HR-pQCT images of human trabecular bone with (comparatively) low resolution [53]; for a related algorithm, see [47,49,54]) or a surface-based ellipsoid fitting strategy [55].

The two modules are decoupled from each other, so a change in one will not affect the usability of the other.

### (b) Sensitivity to max iteration parameter

Figure S1 shows that the difference in EF for different values of “max iterations” was small (other input parameters being equal). Numerical experiments showed that for the emu and shrew test images, an ellipsoid that had not improved its volume for approximately 40 iterations was as likely to be poorly fitted as it was to be well fitted. We therefore chose 50 as a reasonable, conservative default value for “max iterations”.

### (c) Sensitivity to some input parameters

We tested the sensitivity of the EF image to variations in number of seeds (skip ratio), volume-weighted averaging, number of sampling vectors and number of runs. This was done on three test images, already used in the previous EF study [14]. The tests showed consistently that adding more seeds was important to achieve a high filling percentage, and that an increased number of runs (about 6) was necessary for good convergence (median change per run < 0.1). More seeds and more runs come at the cost of a longer run-time, however. We therefore recommend that users set a low skip ratio and average over 6 runs for EF experiments once they are satisfied with the other settings.

### (d) Some mathematical considerations on Ellipsoid Factor

#### (i) How is a difference of ratios of sorted random variable triplets distributed?

As seen in the main document text, EF is calculated as the difference of sorted axis ratios of the fitted ellipsoids. The ellipsoid fitting is stochastic, which means that the ellipsoid axis lengths will be randomly distributed, but their distribution is *a priori* unclear. It may therefore be useful to know how the difference of ratios of sorted random triplets is distributed for some known defined random distributions: the normal distribution (Figure S5) and the uniform distribution (Figure S6).

These simple examples caution against the undiscerning interpretation of EF values. However, it seems unlikely that a,b,c follow either of the distributions above. We can simulate this to an extent with a gamma distribution (Figure S7). This also shows that EF values distribute in an approximately triangular fashion.

Finally, we expect that a structure clearly divisible into rod- and plate-like parts would have the properties displayed in Figure S8, i.e. the middle radius forming two clusters, one near the cluster of the smallest radius and one near the cluster of the largest radius.

This subsection shows that one distribution of the difference of sorted ratios does not imply a unique underlying ellipsoid distribution: we have seen two ways of getting triangular distributions. It also tells us that one way of getting a bimodal distribution is to have bimodal axis ratios. The R script used to perform these numerical experiments and plot their results can be found at [27] under `/R/null-case-EF.R`.
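A numerical experiment of this kind can be sketched in a few lines of Python. This is not the R script referred to above: the distribution parameters, sample size, bin count and random seed are arbitrary assumptions chosen for illustration, and EF is assumed to be a/b − b/c on the sorted triplet.

```python
import random


def ef(axes):
    """Difference of sorted axis ratios, a/b - b/c with a <= b <= c."""
    a, b, c = sorted(axes)
    return a / b - b / c


def simulate_ef(draw, n, rng):
    """EF of n random triplets whose entries are drawn independently by `draw`."""
    return [ef([draw(rng) for _ in range(3)]) for _ in range(n)]


def histogram(values, n_bins=20):
    """Counts of EF values in equal-width bins spanning [-1, 1]."""
    counts = [0] * n_bins
    for value in values:
        index = min(int((value + 1.0) / 2.0 * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


rng = random.Random(0)
# Assumed parameters: uniform on [0.1, 1.0]; gamma with shape 2 and scale 1.
uniform_ef = simulate_ef(lambda r: r.uniform(0.1, 1.0), 10_000, rng)
gamma_ef = simulate_ef(lambda r: r.gammavariate(2.0, 1.0), 10_000, rng)
uniform_hist = histogram(uniform_ef)
gamma_hist = histogram(gamma_ef)
```

Plotting the two histograms side by side is a quick way to see how strongly the shape of the EF distribution depends on the assumed distribution of axis lengths.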

#### (ii) Non-uniqueness of EF

Note that this locally maximal ellipsoid is not unique. For example, theoretically, one could have two axis-aligned ellipsoids with (x,y,z)-axis lengths of (1,1,9) and (3,3,1) respectively. Both would have the same volume (the product of the axis lengths is 9 in each case), but very different EF. Assuming these ellipsoids are locally maximal, pixels within the first ellipsoid would have EF = 1/1 − 1/9 = 8/9 ≈ 0.89 (i.e. rod-like), whereas pixels within the other would have EF = 1/3 − 3/3 = −2/3 ≈ −0.67 (i.e. rather plate-like; Figure S9). The pixels in the intersection of the two ellipsoids would have non-unique EF.

#### (iii) Sensitivity to some more input parameters

We used vertebral sample 57 to investigate how the sampling increment (“step”), the contact sensitivity and a semi-axis length filter (see next paragraph) affected the EF distribution. The results for contact sensitivity values of 1 and 5 are shown in Figures S10 and S11, respectively.

Small ellipsoids may get caught within a thin feature when growing, especially when the step-size is close to some stair-case-like feature and the contact-sensitivity is low (Supplementary Material iii). For this reason, we built an option to remove ellipsoids whose longest semi-axis *c* is smaller than a user-defined threshold (in pixel units, but does not have to be an integer number) into our code. We refer to this as the minimum valid longest semi-axis length filter.

We observed that at contact sensitivity 5 (Figure S11), reducing the step size was enough to remove spurious small ellipsoids, and using larger filters did not appreciably affect the EF distribution. We therefore removed the minimum valid longest semi-axis filter in the interest of reducing an already high number of input parameters, and ran the vertebral measurements with contact sensitivity 5 and the lowest step size.

## Acknowledgments

The authors thank Phil Salmon (Bruker Micro-CT), Andy Pitsillides (RVC) and the wider RVC Skeletal Biology Group for helpful discussions on trabecular bone, as well as Richard Domander (RVC) and Curtis Rueden (University of Wisconsin-Madison) for valuable help and support in working with ImageJ2. We also thank Yu-Mei Chang for advice with statistical analysis, and Eva Herbst for critically reading the manuscript. This research was supported by a BBSRC Project Grant to MD (BB/P006167/1).