2D or 3D? How in vitro cell motility is conserved across dimensions, and predicts in vivo invasion

Cell motility is critical to wound healing and the immune response, and is deregulated in cancer. Current limitations in imaging tools make it difficult to study cell migration in vivo. To overcome this, and to identify drivers from the microenvironment that regulate cell migration, bioengineers have developed 2D and 3D tissue model systems in which to study cell motility in vitro, with the aim of mimicking the environments in which cells move in vivo. However, there has been no systematic study to explicitly relate and compare cell motility measurements between these geometries/systems. Here, we provide such an analysis of our own data, as well as of data in the existing literature, to understand whether, and which, in vitro models are predictive of in vivo cell motility. To our surprise, many metrics of cell movement on 2D surfaces significantly and positively correlate with cell migration in 3D environments, and cell invasion in 3D is negatively correlated with glioblastoma invasion in vivo. Finally, to best compare across complex model systems, in vivo data, and data from different labs, we suggest that groups report an effect size, a statistical tool that is most translatable across experiments and labs, when conducting experiments that affect cellular motility.


Introduction
Cell migration is the evolutionarily conserved ability of cells or cellular components to move varying distances depending on both intrinsic and extrinsic cues from their environment 1-3 . This ability of cells to move is vital for the development of complex, multicellular organisms during development and organogenesis [4][5][6][7] . Several crucial processes important to homeostasis, such as wound healing, inflammatory responses, and angiogenesis, are dependent on cell migration [8][9][10][11][12][13] . Just as cellular motility plays a key role in normal development and function, its dysregulation has serious implications in pathobiology. Absent motility of immune cells leads to serious autoimmune diseases, chronic inflammatory conditions, and delayed wound healing 8,[14][15][16] . Finally, cell migration is a hallmark of cancer, with increased invasion of tumor cells correlated with poor patient prognosis 17 .
To best understand mechanisms of cellular motility, we and others have developed sophisticated and controllable in vitro systems [18][19][20][21][22][23] . These in vitro systems, coupled with live microscopy, have allowed us to see cells move in response to extracellular signals and genetic manipulations that would be impossible in vivo. These analyses have been reviewed most recently by Decaesteker et al. with the merits of each system described 24,25 . The jump to 3D systems creates a more physiologically relevant environment that requires cells not only to feel and move around on surfaces, but also to squeeze through, modify, and manipulate the environment around them. Measuring invasion and cellular movement in vivo is difficult, though it has become possible through the use of intravital imaging and fluorescently labeled cells 26,27 . However, 3D in vitro systems are still preferred for their controllability, ease of implementation, and flexibility.
There are many challenges in analyzing the data collected on cellular motility and invasion with biomaterial-based systems. These include diversity of assays, metrics, and analyses that result in difficulty in correlating results across platforms, stimuli, and labs. Most of the metrics used to analyze cellular invasion and motility have been developed in 2D and translated to 3D studies. We summarized the most commonly used metrics in Table 1, which include both continual live microscopy and endpoint imaging. We found cell migration reported on a population level, such as percent of cells invaded or migrating, or at a single cell level, such as migration speed or distance traveled. In this commentary, we describe the interrelation between these different motility measurements, the important differences in assays and reporting techniques used across the literature, and describe the potential predictive nature of in vitro assays to in vivo outcomes.

Results
To begin to understand how cellular motility metrics may interrelate, we analyzed the correlations between outcomes for multiple glioma cell lines. The metrics are summarized in Table 1 and include percent invading cells, percent migrating cells, chemotactic index, speed, and total and net displacement. Excluding percent invasion, which is a chamber-based endpoint assay, all other metrics mentioned are obtained from live, continuous microscopy. As a first case study, we compared live imaging and percent invasion data for several patient-derived glioma stem cell (GSC) lines, including G2, G34, G62, and G528 (Fig. 1, Supp. Fig. 1). We first compared motility metrics assessed with live imaging to endpoint percent invasion and determined that no single metric significantly correlated with this endpoint metric (Fig. 1a), though there was a medium effect for chemotactic index (negative) and speed (positive). Next, we aimed to determine if there was a correlation between the percent of migrating cells in a total population and single cell metrics of motility (Fig. 1b) and identified that both total and net displacement positively correlated with the total percent of cells that were migrating (r=0.707 and 0.711 respectively, p<0.05). Finally, we compared the single cell metrics of motility based on tracks of individual cells to identify correlations both averaged for the total population (Fig. 1c) and of the single cells (Fig. 1d, n=1182 cells tracked). We found the expected positive correlations for net displacement vs. speed (Supp. Fig. 1, r>0.98, p<0.0001) and a relationship between displacement and chemotactic index for both the population averaged outcomes (Fig. 1c) and the individual cell measurements (Fig. 1d). The correlations with percent invasion are particularly interesting as the invasion of cells in vitro is often assumed to be predictive of invasiveness in vivo.
Overall, these correlations indicate that it may be possible to infer some cellular motility behaviors from a single assay/measurement. This may be important when making decisions regarding experimental design and analysis of data.
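As a rough illustration of how the single cell metrics discussed above can be derived from one tracked cell, the sketch below computes speed, total displacement, net displacement, and chemotactic index from a list of positions. The definitions used here are assumptions rather than the exact ones used for our data: total displacement is taken as the summed path length, net displacement as the start-to-end distance, and chemotactic index as the displacement along an assumed gradient direction divided by the path length. The function name, the example track, and the gradient vector are hypothetical.

```python
import math

def track_metrics(points, interval_min, gradient=(0.0, 1.0)):
    """Summarize one cell track.

    points: list of (x, y) positions in micrometers, one per frame.
    interval_min: time between frames in minutes.
    gradient: assumed direction of the chemotactic gradient (hypothetical).
    """
    # Length of each frame-to-frame step
    steps = [math.dist(a, b) for a, b in zip(points, points[1:])]
    total = sum(steps)                        # total path length
    net = math.dist(points[0], points[-1])    # start-to-end distance
    elapsed = interval_min * len(steps)
    speed = total / elapsed if elapsed else 0.0

    # Displacement projected onto the unit gradient vector
    gx, gy = gradient
    norm = math.hypot(gx, gy)
    dx = points[-1][0] - points[0][0]
    dy = points[-1][1] - points[0][1]
    along = (dx * gx + dy * gy) / norm
    chemotactic_index = along / total if total else 0.0

    return {
        "speed_um_per_min": speed,
        "total_displacement_um": total,
        "net_displacement_um": net,
        "chemotactic_index": chemotactic_index,
    }

# Hypothetical three-frame track imaged every 15 minutes
example = track_metrics([(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)], interval_min=15)
print(example)
```

Population-level values such as those in Fig. 1c would then be averages of these per-track numbers, whereas Fig. 1d uses the per-track numbers directly.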

For glioblastoma cell lines, 2D motility correlates with 3D motility
Although cellular motility in 2D and 3D microenvironments entails many of the same underlying mechanisms of cellular motion, including contractility, adhesion, and cytoskeletal rearrangement, 3D systems are thought to better mimic in vivo conditions by surrounding cells with the extracellular matrix (ECM). Given the increased use of 3D environments in which to study cells, we sought to evaluate which measurements of 2D motility still applied, or were related to, cell migration in 3D. Using glioma as a case study, we compared the 2D and 3D motility measurements (Fig. 2) across experiments with these four glioma stem cell lines. Comparing percent migrating cells, speed, net distance, and chemotactic index in 2D vs 3D environments showed that only one metric, percent of migrating cells, correlated significantly between 2D and 3D (r=0.878, p<0.001). Generally, the total percentage of cells migrating was significantly higher in 2D than in 3D, as explained by a linear regression (2D = 3.3 × 3D + 21.2). Speed of cells migrating was also lower in 3D than in 2D, as has been commonly reported [28][29][30][31][32] . Observationally, the range of chemotactic indices was similar between 2D and 3D, alongside decreases in total and net displacement in 3D compared to 2D culture. Thus, we were surprised to see that many metrics of individual cell motility did not correlate between 2D and 3D, though the total percent of migrating cells did.
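The kind of comparison described above, a Pearson correlation together with a least-squares line relating 2D to 3D values, can be sketched as follows. This is a minimal illustration and not the analysis pipeline used for Fig. 2; the function names and the paired values are hypothetical and are not our measurements.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between paired samples x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical percent-migrating values for the same conditions in 3D and 2D
pct_3d = [2.0, 5.0, 8.0, 12.0]
pct_2d = [30.0, 36.0, 50.0, 60.0]

r = pearson_r(pct_3d, pct_2d)
slope, intercept = linear_fit(pct_3d, pct_2d)
print(f"r = {r:.3f}; 2D = {slope:.2f} x 3D + {intercept:.2f}")
```

Here the 3D value is treated as the independent variable so that the fitted line has the same form as the regression reported above.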

No obvious relationship between measurement time or cell density and cell migration quantification from the literature
The data in Figures 1 and 2 are a result of experiments performed in a single lab, and thus, potential confounding factors were largely controlled for. However, across the literature, cellular motility is examined not only via different metrics and assays, but also with varying experimental setups. Thus, we aimed to examine the variability in assay setup and its potential effects on outcomes through a careful literature search focused on several of the most widely examined cell lines in motility assays. We compiled data from a list of publications measuring motility in 2D and 3D platforms (Figure 3, and Supplemental Tables 1-6) among widely used cell lines to extend our findings beyond our own labs. We focused on studies of cell motility in 3D that reported % invasion (Fig. 3a, b) and % migrating (Fig. 3c, d), and studies that reported % wound closure in 2D (Fig. 3e). We saw no significant correlation for the 3D motility outcomes with the two consistent experimental conditions reported (assay duration and cell density). In the case of wound healing assays, however, there was an unsurprising correlation between assay duration and percent of wound closure (r=0.87, p<0.01) (Fig. 3c).
We found that biomaterial properties like pore size and composition were similar across studies, although concentrations of basement membrane extract (i.e. Matrigel®) used were often not reported (Supp. Tables 1-2). Cell invasion outcomes from tissue culture insert assays were reported differently across publications and included total cell number, self-defined "invasion value", fold change, percent invasion, or images without quantitative metrics (Supp. Table 3). Assay readouts varied widely among crystal violet staining, H&E staining, trypsinization prior to counting, or simply imaging and counting, all at different time points (Supp. . In the case of invasion, attractants used in invasion assays were unique to each study (Supp. Table 6). Thus, we could not determine a correlation between the assay experimental setup and the cell migration-related outcomes. We were also unable to quantitatively evaluate all experimental design components (such as matrix concentration) within this small sample size of publications.

In vivo invasion in glioma negatively correlates with 3D chemotactic index
The ultimate goal of in vitro assays is to predict the behavior of cells in a host organism. For glioblastoma (GBM), the deadliest form of brain cancer, invasion is a hallmark of its behavior and is responsible for recurrence after treatment. Unlike other cancers, in GBM, invasive cells remain within the primary organ, which allows for straightforward quantification of invasion at an endpoint using immunohistochemistry. We hypothesized that this invasion would positively correlate with outcomes of cellular motility in vitro. Using data from five models of GBM (our four glioma stem cell lines and the rat glioma line RT2) implanted into mouse cortex, we quantified cells that had invaded beyond the tumor border and correlated these numbers to our assays in vitro (Fig. 4a).
Results from at least five mice were averaged (data from 33) and plotted against averaged values from at least four in vitro experiments. For cells in 3D, we did not see a statistically significant correlation between any motility metric in vitro and our in vivo results. However, we did see large negative effects when correlating 3D chemotactic index and net displacement with in vivo invasion. Interestingly, the opposite was true with 2D chemotactic index (Fig. S2b). In 2D, we saw large negative relationships of speed and displacement with the invasion metric in vivo (Fig. S2). Because of the low number of cell lines available to compare in vitro and in vivo, it is difficult to conclude anything concrete about the relationship between invasion in vitro and in vivo, though we see interesting negative trends that are contrary to our current assumptions about translating in vitro invasion outcomes to in vivo results.

Effect size as a statistical tool to measure motility changes across dimensions
Mechanistic invasion and motility assays aim to determine the response to a particular stimulus or inhibitor (and determine if that response is significantly different from some internal control). It is often assumed, though not directly tested, that if a stimulus increases 2D motility it will do the same in 3D. To directly test this assumption, we revisited our data and calculated effect sizes (Cohen's d) in 2D and 3D to determine if 1) dimensionality altered the effect of stimuli and 2) we can use effect size to better analyze and compare cell motility in response to stimuli across dimensions. Effect size is a statistical concept that defines the strength of a relationship between two variables or conditions on the same numeric scale 34 . Thus, one can easily compare the effect of one treatment to another regardless of laboratory, experimental setup, or outcome measure to determine how universal findings are.
Glioma motility in response to CXCL12. We examined motility of multiple patient-derived glioma stem cell lines in the presence of 100 nM CXCL12 in 2D and 3D (Fig. 5b) by reanalyzing our previously published data 33 .
CXCL12 is a pro-migratory chemokine that has been implicated in glioma motility and invasion 35 . We quantified multiple outcomes with live cell tracking and found that the effect size varied based on the dimensionality. For some cell lines (G62) the effect size was nearly equal for percent motile cells when cells were stimulated in 2D or 3D and indicated that there was low effect (<0.2) of the stimulation. For G2 and G528, the effect size varied but remained large (>0.8) for both cell lines in both dimensions. Interestingly though, for G34, the effect in 2D was medium, but large in 3D, indicating that dimensionality may affect this cell line-specific response to CXCL12.
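The low, medium, and large labels used above can be made explicit with a small helper. The sketch below assumes Cohen's conventional cut-offs of 0.2, 0.5, and 0.8 applied to the absolute value of d; the text above states only the <0.2 and >0.8 bounds, so the 0.5 boundary and the function name are assumptions for illustration.

```python
def effect_label(d):
    """Label an effect size by magnitude using Cohen's conventional cut-offs."""
    magnitude = abs(d)  # the sign only indicates the direction of the change
    if magnitude < 0.2:
        return "negligible"  # described as a "low effect" in the text
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"

# Hypothetical effect sizes for one cell line in 2D and 3D
for dimension, d in [("2D", 0.55), ("3D", 1.10)]:
    print(dimension, effect_label(d))
```

Comparing labels rather than raw values is a coarse summary; reporting d itself keeps more information for comparison across labs.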

Breast cancer motility in response to EGF and integrin inhibitors.
To broaden the utility of effect size beyond glioma to breast cancer cell behavior, Figure 5g shows SkBr3 cells that were seeded on a bone-ECM functionalized surface and stimulated with EGF or inhibitors for integrin subunits β1 and α2 36 . As we saw for glioma, the effect sizes in 2D and 3D were roughly the same for all types of stimulation. EGF stimulation had a small effect, β1 integrin inhibition had a medium effect, and α2 integrin inhibition had a large effect regardless of geometry. Our analysis highlights the utility of effect size as a statistical tool, given its ability to span dimensionality and cell sources.

Discussion
In this analysis, we found that invasion and motility assay measurement approaches, reporting tools, and responses all vary across labs (Fig. 3 and Suppl. Tables 1-6). Though motility metrics have been studied in multiple contexts for decades, there is still neither consensus nor clarity on the importance of each metric or its impact on outcomes in vivo. In cancer, this is particularly striking as there is already a high level of heterogeneity in the disease itself, which is amplified as we move into complex in vitro models. One major impediment to the field's progress is the variability from lab to lab in the implementation and analysis of these experiments. First, we identified high variability in the assay setup. As illustrated in Supplemental Tables 1-5, concentrations of Matrigel used for invasion assays differed, and in some publications, were not reported. We know that the source and lot of basement membrane extracts (like Matrigel) can alone influence experiments, let alone the concentration 37 . Similarly, assay durations and cell densities differed across most publications using breast cancer cell lines. Unsurprisingly, the assay duration correlated positively with degree of wound closure (Fig. 3c).
When we looked at how different publications quantified their assay outcomes, we noticed variable methods to count invasive cells from the bottoms of tissue culture inserts, including selection of immunocytological stain and/or fixation vs. cellular detachment and counting. Regardless, publications generally reported some final number, though this could be a percent, fold change, or total number of cells, which prevented us from directly comparing their results as we were able to do for our own experiments. A standardized metric that best conveys the raw data would allow outcomes to be compared in a meaningful way across labs.
We propose effect size as a useful metric to understand how and if stimuli and inhibitors affect cell motility across geometries and labs. For example, as seen in Figure 5d
The desire to understand how 2D cell migration relates to that in 3D is not unique to our study. Meyer et al. quantified breast cancer cell line motility and showed that the degree of initial cell protrusion in 2D was predictive of 3D invasion across many different stimuli 39  length/number/time, etc. in 2D and 3D and found no correlation between any of the metrics in the two environments 40 . Next generation biomaterials are being developed that provide possible explanations of the key differences between 3D and 2D environments that drive the unique motility phenotypes, such as confinement 41,42 and porosity 28 .
Ultimately, we are attempting to predict cell invasion in vivo so that we can potentially discover druggable targets to halt malignant cells from invading and metastasizing. In our limited dataset we show that there is a negative correlation between migration in 3D collagen gels and invasion in vivo. Live imaging data may reveal more information, but with at least our endpoint assay, we cannot predict in vivo "invasiveness" from in vitro invasion in glioma. It is possible that our in vitro systems, even when 3D, do not hold enough complexity to capture true in vivo behavior.
Taken together, standardized metrics are needed that allow for direct comparison between 2D, 3D, and in vivo models. Effect size can allow us to better compare the effects of different stimuli on motility metrics and perhaps draw conclusions independent of dimension and environment. Given the rise of more physiological in vitro models that result in more complicated responses, this could be a first step to implement comparison of metrics across the field. Finally, standardizing motility metric outcomes could help bridge the gap between 2D, 3D in vitro systems and their translation to in vivo physiology.

Cell culture
All cell culture supplies were purchased from Thermo Fisher Scientific (Waltham, MA) unless otherwise noted.

Preparation of ECMs for SkBr3 migration experiments
Glass coverslips (15 mm and 18 mm diameter, Fisher Scientific, Agawam, MA, USA) were functionalized with 10 g/L N,N-disuccinimidyl carbonate (Sigma-Aldrich) and 5% v/v diisopropylethylamine (Sigma-Aldrich), and ECM protein cocktails were then covalently bound to the glass coverslips through reactive amines: 5 μg/cm² of 99% collagen I and 1% osteopontin 36 . Coverslips were incubated with proteins at room temperature for three hours, rinsed three times with PBS, and then incubated with 10 μg/cm² MA(PEG)24 (Thermo Scientific, Rockford, IL, USA) for two hours. Coverslips were rinsed three times with PBS, epoxied to the plate (Devcon 5 minute epoxy) and UV-sterilized prior to cell seeding. For invasion studies from coverslips, cells were seeded on coverslips and then overlaid with a collagen gel as previously described 36 .

3D Invasion Assays
Invasion assay data for glioma cells were acquired from our previous publications, where the assay was conducted as described 33,44 . Briefly, cells were seeded in a 1.2 mg/ml thiolated hyaluronan (ESI)/0.8 mg/ml rat tail collagen I (Corning) matrix at a concentration of 1E6 cells/ml. 100 µl of this gel was applied to an 8-µm pore tissue culture insert (Millipore). Serum free media was applied to the top and bottom of the well and the system was incubated for 18 h, after which gels were removed and membranes fixed and stained with DAPI. Membranes were imaged at five non-overlapping locations and % invasion was calculated as the extrapolated cell count divided by the seeded cell count × 100.
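The % invasion arithmetic described above can be sketched as follows. The seeded cell count follows from the stated conditions (1E6 cells/ml × 0.1 ml = 1E5 cells per insert). The extrapolation step is an assumption: the mean count per imaged field is scaled by the ratio of membrane area to field area, and the areas and field counts shown are hypothetical placeholders rather than values from the assay.

```python
def percent_invasion(field_counts, field_area_mm2, membrane_area_mm2, seeded_cells):
    """Extrapolate imaged-field counts to the whole membrane and express
    the result as a percentage of the cells originally seeded."""
    mean_per_field = sum(field_counts) / len(field_counts)
    # Assumed extrapolation: scale by how many fields fit on the membrane
    extrapolated = mean_per_field * membrane_area_mm2 / field_area_mm2
    return 100.0 * extrapolated / seeded_cells

# Seeded cells from the stated protocol: 1E6 cells/ml in 100 ul of gel
seeded = 1e6 * 0.1

# Hypothetical DAPI counts from five non-overlapping fields and assumed areas
counts = [10, 20, 30, 40, 50]
print(percent_invasion(counts, field_area_mm2=0.5,
                       membrane_area_mm2=30.0, seeded_cells=seeded))
```

Reporting the field area, membrane area, and seeded cell number alongside the final percentage would let other groups recompute the value from raw counts.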

Live Imaging and Analysis
Glioma Motility: The motility metrics were determined via live imaging and single-cell tracking of glioma cells in either the hydrogel system (above) or on tissue culture plastic. The EVOS FL Auto (ThermoFisher) microscope and the EVOS Onstage Incubator (ThermoFisher) were used for imaging in 15 minute intervals for 14-18 hours. The incubator was set to the following conditions: 5% CO2, 20% air, and 80% humidity. Images were taken in 20

Tissue post-processing
Evans blue dye injections were administered intravenously to the animals ten to eleven days after tumor inoculation. Intracardial saline perfusion was performed the following day to euthanize the animals. The brains were harvested, cryoembedded, and sectioned at 12 µm. Immunostaining for 4',6-Diamidino-2-Phenylindole, Dihydrochloride (DAPI, Sigma) and mouse anti-human nuclei (clone 235-1, Millipore) was performed on sections at differing depths of the tumor. EVOS FL Auto 2.0 was used to scan whole sections. After importing raw images into ImageJ, integrated density was used to quantify Evans blue intensity in four to five 0.49 mm² regions of the image. Integrated density of each region was normalized to tumor maximum.

Invasion calculations from published data
Percent invasion and migration data were extracted with WebPlotDigitizer v4.1 from the published work cited in Figure 2 and Supplementary Tables 1-4. Re-plotted data were used to calculate the percent of invasion based on the initial number of seeded cells.

Effect size calculations
Effect sizes were calculated between two independent groups as Cohen's d.
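For readers who wish to reproduce this type of calculation, a minimal sketch of Cohen's d for two independent groups is given below. It assumes the common pooled standard deviation form, d = (mean1 - mean2) / s_pooled, with sample variances weighted by their degrees of freedom; the function name and the example values are hypothetical and are not data from this study.

```python
import math
from statistics import mean, variance

def cohens_d(treated, control):
    """Cohen's d for two independent groups using the pooled standard deviation."""
    n1, n2 = len(treated), len(control)
    # statistics.variance returns the sample variance (n - 1 denominator)
    pooled_var = ((n1 - 1) * variance(treated) +
                  (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treated) - mean(control)) / math.sqrt(pooled_var)

# Hypothetical percent-motile values with and without a stimulus
stimulated = [42.0, 48.0, 51.0, 45.0]
unstimulated = [35.0, 40.0, 38.0, 41.0]
print(f"Cohen's d = {cohens_d(stimulated, unstimulated):.2f}")
```

Because d is expressed in units of standard deviation, the same calculation can be applied to speed, displacement, or percent invasion without rescaling.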

Conclusion
Current challenges in the field of cellular motility and invasion within biomaterial-based systems, including diversity of assays, metrics, and analyses, limit the translation of results across platforms and impede correlation between 2D, 3D, and in vivo results. Here, we summarize the most commonly used metrics to quantify cell motility, and describe the interrelation between these different motility measurements, the important differences in assays and reporting techniques used across the literature, and the potential contribution of in vitro predictions to in vivo outcomes. To our surprise, we found cell invasion in 3D is negatively correlated with invasion in a glioblastoma model in vivo. Given the variability we saw in reporting in the literature, and the inability to predict 3D or in vivo invasion from simpler 2D assays, we suggest that standardized metrics are needed. We recommend the use of effect size as a possible avenue that allows direct comparison between two different groups independent of dimensionality or stimulus. Given the rise of more physiological in vitro models that result in more complicated responses, this could be a first step to implement comparison of metrics across the field. Finally, standardizing motility metric outcomes could help bridge the gap between 2D, 3D in vitro systems and their translation to in vivo physiology.