Inter-session repeatability of Theia3D markerless motion capture gait kinematics

The reliability of kinematic gait data obtained using marker-based motion capture has been questioned due to its sensitivity to marker placement and soft tissue artefact. Markerless motion capture systems are free from these errors and may be able to measure gait kinematics more reliably. This work aimed to determine the reliability of kinematic gait data obtained using Theia3D markerless motion capture, based on an established method. Seven healthy adult participants performed three sessions of ten over-ground walking trials in their own clothing while eight synchronized and calibrated cameras recorded video. Theia3D software was used to process the videos and obtain three-dimensional pose estimates of a multibody model scaled to each of the participants. The established method removes inter-subject variability and estimates average inter-session and inter-trial errors, and the ratio between them. Kinematic patterns across all sessions within individuals were consistent, particularly in the sagittal plane. Inter-trial errors were similar to those previously reported, and the average across all joints was 2.61°. Inter-session errors were similar to or smaller than the inter-operator errors reported in previous reliability studies, and the average across all joints was 2.75°. Error ratios were lower than those reported elsewhere, with the largest being 1.1 and the average being 1.06, indicating that multiple sessions increased the total variability of subjects’ joint angle patterns by 6% over their natural inter-trial variability. These results indicate that gait kinematics measured using Theia3D markerless motion capture are more reliable than those measured using marker-based methods and are largely unaffected by multi-session methodologies.


Introduction
Three-dimensional human movement analysis is a widely used tool in clinical and research biomechanics to provide comprehensive 3D representations and quantification of individuals' movement patterns, particularly gait. This tool allows comparisons to be made within and between individuals and groups on a singular or longitudinal basis, and can be used to assess and monitor the progression of disease (Astephen et al., 2008), alter gait patterns to prevent injury or improve treatment outcomes (Crowell and Davis, 2011; Haim et al., 2012; Smania et al., 2011), and inform clinical treatments (Clouthier et al., 2018; Wren et al., 2011a) and surgical decision-making (Wren et al., 2011b). The use of human movement analysis in longitudinal musculoskeletal health studies and in clinical and surgical decision-making places added importance on its reliability over repeated visits.
Currently, the standard approach to human movement analysis is marker-based motion capture, which is widely accepted in the field of research biomechanics. However, it has several inherent issues that affect its ability to collect accurate and reliable measures of gait; thus, some doubt has been cast on its role in clinical decision-making (Narayanan, 2007). This technology requires markers to be placed on patients' palpable anatomical landmarks, a process that makes the resulting data susceptible to inaccurate and inconsistent marker placement (Della Croce et al., 2005). The issues associated with marker placement are especially evident in data collected by different operators, during separate collection sessions, and between laboratories. The changing collection conditions lead to inconsistent error levels across gait studies, demonstrating that clinically acceptable errors are possible but not always achieved in gait analysis (McGinley et al., 2009). Standardized protocols have been shown to reduce undue variability in data collected on the same subject at different laboratories by different operators (Gorton et al., 2009); however, some inter-operator variability persists even when an identical collection protocol and the same laboratory are used (Maynard et al., 2003). In addition to the issue of marker placement, marker-based motion capture is susceptible to errors caused by the movement of the soft tissue to which the markers are affixed relative to the underlying bones. Soft tissue artefact introduces errors that affect individuals' gait data differently, making it difficult to correct for and further limiting the reliability of biomechanical measures (Dumas et al., 2014; Leardini et al., 2005). Finally, marker-based technology requires patients to visit a laboratory environment where they are required to wear minimal and tightly fitting clothing before having markers applied to their body.
In combination, the physical and social discomfort, the unfamiliar laboratory environment, and requests to perform natural gait on demand reduce the fidelity of gait data collected using marker-based motion capture and negatively affect recruitment rates.
Markerless motion capture is a quickly evolving technology that uses alternative methods to measure human movement, without the use of skin-based markers. These systems often use arrays of two-dimensional (2D) video cameras or depth sensors in combination with machine learning algorithms to estimate human pose during physical tasks, and have been implemented to varying levels of success (Mathis et al., 2018). Theia3D software (Theia Markerless Inc., Kingston, ON) is one example of a machine learning-based markerless motion capture system that uses 2D video data from an array of standard video cameras to perform 3D pose estimation on human subjects. Markerless motion capture systems present several benefits over marker-based systems that could allow them to collect more reliable gait data in a minimally invasive manner. Since they do not rely on skin-based markers, there is no longer a need for subjects to wear the minimal and tightly fitting clothing that is characteristic of marker-based motion capture, and they could instead wear their own clothing. Thus, subjects would be significantly more comfortable and able to walk or perform physical tasks more naturally, leading to better data.
Markerless systems also would not be limited to use in laboratory spaces, allowing data to be collected in real-world environments which cannot be replicated in the laboratory. Finally, given that markerless systems do not rely on skin-based markers, they would not be affected by the issues associated with those markers, including inconsistent marker placement or soft tissue artefact. This could allow markerless motion capture systems to record more reliable data, increasing the applicability and impact of gait analyses and human movement research in general.
The objective of this work was to determine the reliability, in the form of test-retest repeatability, of over-ground gait kinematics measured using the Theia3D markerless motion capture system. The reliability of these measures was then compared to those in the literature for field-accepted marker-based motion capture systems. We hypothesized that the markerless motion capture system would have lower variability in joint kinematics between repeated visits compared to marker-based systems.

Theia3D Markerless Motion Capture
Theia3D is a deep learning algorithm-based approach to markerless motion capture that uses an array of synchronized and calibrated video cameras to perform three-dimensional (3D) pose estimation on human subjects. The cameras are arranged such that their 2D views overlap to establish an appropriate 3D capture volume and are calibrated to determine their location in global 3D space using a calibration object with known dimensions. The subject then performs a physical task in the clothing of their choice while video data are recorded. These data are processed by the Theia3D software, which uses deep convolutional neural networks to perform feature recognition on the humans within the 2D camera views, identifying and locating specific anatomical features on a frame-by-frame basis. These neural networks were trained on >500,000 images sourced from Microsoft COCO (Lin et al., 2015) and a proprietary image set, which were manually labelled by highly trained annotators and controlled for quality by a minimum of one additional expert labeller. The training images consisted of humans in a wide array of settings and clothing, performing various activities, and the training received by the networks enabled them to learn the visual features that correspond to the labelled landmarks and apply them to any new image. The extrinsic camera calibration parameters are used to determine the 3D position of each camera, allowing the global 3D position of each landmark to be calculated based on the estimates from the 2D videos. Finally, an articulated multi-body model is applied to the 3D positions of the landmarks to estimate the 3D pose of the subject throughout the physical task.
This inverse-kinematic model is made up of 17 segments, and has three degrees-of-freedom (DOF) at the ankle, two DOF at the knee (flexion/extension and abduction/adduction), three DOF at the hip, three DOF at the shoulder, two DOF at the elbow (flexion/extension and internal/external rotation), and two DOF at the wrist (flexion/extension and radial/ulnar deviation).
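The step from per-camera 2D landmark estimates to a global 3D landmark position can be illustrated with a generic least-squares ray intersection. This is only a sketch of the general idea and not a description of the Theia3D algorithm, whose internals are not given here. It assumes that calibration has already provided each camera's position and a viewing ray toward the landmark in the global frame; the function names `triangulate_rays`, `det3`, and `solve3` are hypothetical.

```python
import math


def triangulate_rays(origins, directions):
    """Return the 3D point closest (least squares) to a set of camera rays.

    origins: camera centres (x, y, z) in the global frame (assumed known
        from calibration).
    directions: vectors pointing from each camera toward the landmark.
    """
    # Build A = sum(I - d d^T) and b = sum((I - d d^T) c) over all rays.
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for c, d in zip(origins, directions):
        norm = math.sqrt(sum(v * v for v in d))
        d = [v / norm for v in d]
        for i in range(3):
            for j in range(3):
                p = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += p
                b[i] += p * c[j]
    return solve3(A, b)


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(A, b):
    """Solve the 3x3 linear system A x = b with Cramer's rule."""
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("rays are parallel; the point is not determined")
    x = []
    for k in range(3):
        Ak = [[b[i] if j == k else A[i][j] for j in range(3)]
              for i in range(3)]
        x.append(det3(Ak) / d)
    return x


# Illustrative check with three made-up cameras viewing one landmark.
target = (1.0, 2.0, 1.0)
cams = [(0.0, 0.0, 1.0), (5.0, 0.0, 1.0), (0.0, 5.0, 3.0)]
rays = [[t - c for t, c in zip(target, cam)] for cam in cams]
point = triangulate_rays(cams, rays)
```

With noise-free rays the recovered `point` coincides with `target`; with noisy 2D estimates, additional overlapping camera views average out part of the error, which is one reason the camera views are arranged to overlap.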

Participants
Seven healthy, recreationally active individuals (2 female, mean (SD) age: 25.7 (6.3) years, height: 172.9 (9.4) cm, weight: 66.6 (11.1) kg) were convenience-recruited to participate in this multi-session study at the Human Mobility Research Laboratory (Kingston, ON). Participants gave written informed consent and this study was approved by the institutional ethics board. Exclusion criteria included having any neuromuscular or musculoskeletal impairments that could prevent their performance of walking. Participants were given no prior instruction regarding what clothing to wear and participated wearing the clothing in which they arrived, along with either their personal running shoes or a provided pair of running shoes. Participants returned for a total of three sessions, on average separated by 8.8 (2.0) days. A composite image of the clothing worn by participants during each session is shown in Figure 1.

Experimental Setup and Data Collection Procedure
Eight Sony RX0 II cameras (Sony Corporation, Minato, Japan) were connected and synchronized using a Sony Camera Control Box and were arranged around a capture volume approximately 12 metres long by 5 metres wide within a large indoor laboratory space. Red tape lines were placed on the ground 10 metres apart and were used as walkway start/finish lines. A large calibration object with known dimensions was placed in the centre of the capture volume and a calibration trial was recorded for later use. At every session, participants performed 10 over-ground walking trials between the red start/finish lines at their comfortable walking speed, alternating direction for each trial, while synchronized 2D video data were collected at 60 Hz during each trial.

Data Analysis
Video data from all trials for all sessions and all subjects were batch processed using Theia3D software to obtain 3D pose estimates of the subjects throughout each over-ground walking trial. The 3D pose estimates for each segment of the articulated multi-body model were exported as 4×4 pose matrices for each frame of data, for further analysis in Visual3D (C-Motion Inc., Germantown, MD). A built-in Visual3D model intended for use with Theia3D rotation matrix data was applied to all trials, and virtual toe and heel markers were added in the model. These virtual markers were used with the method described by Zeni et al. (2008) to determine heel-strike and toe-off gait events throughout each trial. Lower limb kinematic joint angles were calculated and time-normalized to the gait cycle using the heel-strike and toe-off events. The time-normalized joint angles were averaged within each trial for each joint separately and were exported for further analysis in Matlab (The MathWorks Inc., Natick, MA).
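The event detection and time normalization described above can be sketched as follows. This is a minimal illustration of a coordinate-based approach in the spirit of Zeni et al. (2008), in which heel-strikes are taken at the maxima of the heel position relative to the pelvis and toe-offs at the minima of the toe position relative to the pelvis along the direction of progression. It is not the Visual3D pipeline used in the study. The sketch assumes walking in the +x direction (trials in the opposite direction would need the sign flipped), a 101-point gait cycle between consecutive heel-strikes, and synthetic sinusoidal marker data; all function and variable names are hypothetical.

```python
import math


def local_extrema(signal, kind):
    """Return frame indices of local maxima ("max") or minima ("min")."""
    sign = 1.0 if kind == "max" else -1.0
    s = [sign * v for v in signal]
    return [i for i in range(1, len(s) - 1) if s[i - 1] < s[i] >= s[i + 1]]


def gait_events(heel_x, toe_x, pelvis_x):
    """Detect heel-strike and toe-off frames from anterior positions.

    Assumes the subject walks in the +x direction.
    """
    heel_rel = [h - p for h, p in zip(heel_x, pelvis_x)]
    toe_rel = [t - p for t, p in zip(toe_x, pelvis_x)]
    heel_strikes = local_extrema(heel_rel, "max")  # heel farthest forward
    toe_offs = local_extrema(toe_rel, "min")       # toe farthest behind
    return heel_strikes, toe_offs


def time_normalize(angle, start, end, n_points=101):
    """Linearly resample angle[start..end] to n_points (0-100% of cycle)."""
    out = []
    for k in range(n_points):
        pos = start + (end - start) * k / (n_points - 1)
        lo = int(pos)
        hi = min(lo + 1, end)
        frac = pos - lo
        out.append(angle[lo] * (1.0 - frac) + angle[hi] * frac)
    return out


# Synthetic example: one stride every 60 frames (1 s at 60 Hz).
frames = range(151)
pelvis_x = [0.02 * i for i in frames]
heel_x = [p + 0.3 * math.sin(2 * math.pi * i / 60)
          for i, p in zip(frames, pelvis_x)]
toe_x = [p + 0.3 * math.cos(2 * math.pi * i / 60)
         for i, p in zip(frames, pelvis_x)]
heel_strikes, toe_offs = gait_events(heel_x, toe_x, pelvis_x)

knee_angle = [float(i) for i in frames]  # placeholder joint-angle signal
cycle = time_normalize(knee_angle, heel_strikes[0], heel_strikes[1])
```

Resampling every cycle to the same number of points is what allows cycles of different durations to be averaged within a trial and compared across sessions.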
Repeatability of the lower limb joint angles measured by the markerless motion capture system during over-ground walking was assessed using the method described by Schwartz et al. (2004) by pooling trials to obtain within- and across-session averages for each subject, along each kinematic measure. The average difference between each trial from a given subject and the within- and across-session averages for that subject was calculated, to obtain the average inter-trial and inter-session deviations, respectively. The inter-trial deviations capture the stride-to-stride variability that exists within each kinematic measure due to natural, intrinsic subject variability, whereas the inter-session deviations capture any systematic variability due to the nature of the repeated methodology in addition to inter-trial deviations. Since these deviations are measured relative to subject means, inter-subject variability is removed, and the inter-trial and inter-session deviations can be pooled across subjects. The standard deviation of these inter-trial and inter-session deviations provides an estimate of the average inter-trial and inter-session error, respectively, which can additionally be expressed as an error ratio of inter-session to inter-trial error. Results were compared to similar repeatability measures for marker-based motion capture systems from the literature.
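The pooling of deviations described above can be sketched for a single kinematic measure at a single point of the gait cycle. This is a simplified illustration and not the exact estimator of Schwartz et al. (2004): it assumes the pooled standard deviation is computed as a root mean square of the deviations (which have zero mean within each subject by construction), without any degrees-of-freedom correction. The function name `repeatability`, the data layout, and the example values are hypothetical.

```python
import math


def repeatability(data):
    """Estimate inter-trial error, inter-session error, and their ratio.

    data[subject][session] is a list of trial values (degrees) for one
    kinematic measure at one point of the gait cycle.
    """
    trial_dev, session_dev = [], []
    for sessions in data:
        all_trials = [v for trials in sessions for v in trials]
        subject_mean = sum(all_trials) / len(all_trials)
        for trials in sessions:
            session_mean = sum(trials) / len(trials)
            # Deviation from the within-session average (inter-trial).
            trial_dev += [v - session_mean for v in trials]
            # Deviation from the across-session average (inter-session).
            session_dev += [v - subject_mean for v in trials]

    def rms(devs):
        return math.sqrt(sum(d * d for d in devs) / len(devs))

    inter_trial = rms(trial_dev)
    inter_session = rms(session_dev)
    return inter_trial, inter_session, inter_session / inter_trial


# Two made-up subjects, two sessions each, two trials per session.
# The second subject is offset by 20 degrees to show that differences
# between subjects do not affect the estimates.
example = [
    [[9.0, 11.0], [10.0, 12.0]],
    [[29.0, 31.0], [30.0, 32.0]],
]
inter_trial, inter_session, ratio = repeatability(example)
```

Because every deviation is taken relative to that subject's own averages, the 20 degree offset between the two subjects has no effect on the estimates, and the inter-session error can never be smaller than the inter-trial error under this formulation.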

Results
Consistent joint angle patterns were found across all three sessions, within each subject, as shown for one representative subject in Figure 2. Joint flexion/extension angle patterns were particularly consistent, with very little deviation between individual trials, the session means, and the subject's overall pattern within those measures. Joint rotations about the secondary and tertiary rotation axes showed greater variability between each cycle as indicated by the larger standard deviation bounds; however, the mean patterns for these measures were still very consistent across all three sessions. The inter-trial and inter-session error estimates were found to have very similar patterns throughout the gait cycle within each measure, with the inter-session errors being larger than the inter-trial errors but with little difference between them (Figure 3). The knee ab/adduction and ankle flexion/extension measures had greater error during swing phase than stance phase, while knee flexion/extension and ankle ab/adduction had lower error levels immediately preceding and following heel-strike (0% gait cycle). Most measures, including those previously mentioned, had fluctuating error levels throughout the gait cycle. The small differences between the inter-session and inter-trial error levels are exhibited in the error ratio of inter-session to inter-trial error, which is relatively constant and has an average at or below 1.1 for all measures (Figure 3). Since the inter-trial error is an estimation of the intrinsic variability of subjects' gait patterns, the error ratio is an indicator of the proportion of the inter-session error that is accounted for by this natural variability. Thus, an error ratio of 1 would indicate all inter-session variability was due to inter-trial variability, whereas an error ratio of 1.1 indicates the repeated session methodology caused an increase in the error level of 10% relative to the subjects' intrinsic inter-trial variability.
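The interpretation of the error ratio can be written compactly. The symbols below are introduced here for illustration only and are not notation from the original analysis: $\sigma_{\text{session}}$ denotes the inter-session error, $\sigma_{\text{trial}}$ the inter-trial error, and $r$ their ratio.

```latex
\[
  r = \frac{\sigma_{\text{session}}}{\sigma_{\text{trial}}},
  \qquad
  \text{relative increase in error} = (r - 1) \times 100\%
\]
\[
  r = 1 \;\Rightarrow\; 0\%,
  \qquad
  r = 1.1 \;\Rightarrow\; (1.1 - 1) \times 100\% = 10\%
\]
```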
The average inter-trial error, inter-session error, and error ratio are summarized in Table 1. The inter-trial errors were greatest in ab/adduction and internal/external rotation angles at the ankle and hip, all of which were between 3.0 and 3.6 degrees. The knee ab/adduction angles had the lowest inter-trial error of 1.3 degrees. Across all measures, the average inter-trial error was 2.6 degrees. With respect to the inter-session errors, since the Theia3D markerless motion capture system automatically estimates human pose without the operator input that is required when using marker-based motion capture, the inter-session error in this study is analogous to the inter-operator or inter-laboratory errors reported in other marker-based motion capture reliability studies. The inter-session errors were similar to the inter-trial errors, with the greatest errors in the ab/adduction and internal/external rotation angles at the ankle and hip, all of which were between 3.0 and 3.6 degrees. The knee ab/adduction angles had the lowest inter-session error of 1.4 degrees. Across all measures, the average inter-session error was 2.7 degrees. The error ratios were equal to or less than 1.1 for all joint angles. Across all measures the average error ratio was 1.06 (1.1 to one decimal place), which indicates that performing multiple separate sessions increased the total variability of subjects' joint angle patterns by approximately 6% on average, over their intrinsic inter-trial variability.
Table 1: Average inter-trial error, inter-session error, and error ratio from this study and those by Schwartz et al. (2004), Manca et al. (2010), Caravaggi et al. (2011), and Kaufman et al. (2016).


Discussion
Three-dimensional gait analysis has been shown to be a useful tool in biomechanics research, with many relevant scientific findings and demonstrated applications. However, its influence and impact have been severely limited by the marker-based motion capture technology that is typically used in performing human movement analysis. These systems require highly trained operators to affix markers to the skin of participants, a process which is known to reduce the accuracy and reliability of kinematic gait data across operators and laboratories (Della Croce et al., 2005; Gorton et al., 2009). In addition, soft tissue artefact causes skin-mounted markers to move relative to the underlying bones whose motion they are intended to track, introducing another significant source of error (Benoit et al., 2006; Leardini et al., 2005). Finally, data collection conditions including the foreign laboratory environment, clothing requirements, and skin-mounted markers may influence participants' natural gait. Collectively, these factors introduce undesirable variability to measured gait kinematics, reducing their accuracy and reliability. In a systematic review of kinematic gait measurements, McGinley et al. (2009) found that while clinically acceptable errors are possible to obtain in gait analysis, the variability between studies suggested that they are not always achieved. This variability of human movement analysis data has caused its usefulness as a clinical tool to be questioned (Narayanan, 2007).
However, markerless motion capture technology such as Theia3D is not affected by any of the aforementioned sources of error and thus could represent an opportunity to increase the impact of gait analyses, provided its measurements are shown to be reliable. The results of this study showed that the lower limb joint angles measured by Theia3D were consistent within individuals across multiple sessions, particularly with respect to sagittal plane joint angles (Figure 2). Higher levels of variation were observed between trials for the ankle and hip ab/adduction and internal/external rotation angles; however, the session mean patterns for these measures were found to be consistent within subjects. This finding suggests that the gait patterns in these planes vary between trials, but provided that a sufficient number of trials are collected, a subject's average pattern in these measures can be reliably replicated between multiple sessions.
The inter-trial errors measured in this study were generally larger than those reported in other studies using marker-based motion capture and the methodology proposed by Schwartz et al. (Kaufman et al., 2016; Manca et al., 2010; Schwartz et al., 2004). However, larger inter-trial errors do not necessarily have a negative implication in terms of measurement accuracy and could instead represent the possibility that more true movement variability between trials was captured in this study compared to others. The inter-session errors measured here, which are analogous to inter-operator or inter-laboratory errors measured in other studies, were the smallest or within 0.1 degrees of the smallest values among previously reported values for five out of eight joint angles: ankle flexion/extension, ankle internal/external rotation, knee flexion/extension, hip flexion/extension, and hip internal/external rotation (Caravaggi et al., 2011; Kaufman et al., 2016; Manca et al., 2010; Schwartz et al., 2004). The inter-session errors for the remaining measures were comparable to previously reported values. Across all joint angles, the average inter-session error was only 2.7 degrees, the lowest reported average. These inter-session errors indicate that the Theia3D markerless motion capture system produces more reliable lower limb joint angle measurements on average than field-accepted marker-based motion capture systems. The error ratios measured here were lower than those in other studies for all examined joint angles, with the largest in the present study (1.1) being equal to the smallest reported value among the other studies (Manca et al., 2010). The error ratio compares the level of undesirable variability due to methodology (inter-session, inter-operator, or inter-laboratory error) to the level of natural subject variability (inter-trial error), thus indicating the proportionality between methodological and natural variability in the measurements.
The error ratios obtained here indicate that lower limb joint angles measured using Theia3D markerless motion capture are largely unaffected by the multi-session methodology and are almost entirely a result of the natural inter-trial variability of the subjects' gait.
Although the findings presented here showed high levels of reliability in the gait kinematics measured by the Theia3D markerless motion capture system, there are some limitations to this work. We did not directly compare markerless and marker-based motion capture during the same trials because we wanted to perform the markerless data collections with unrestricted attire; a direct comparison study is currently being completed. Despite being provided no instruction regarding attire, the participants wore mostly dark clothing during the data collection sessions. While it is theorized that dark attire provides a greater challenge for the markerless motion capture system due to reduced contrast, it is possible this homogeneity in attire affected the gait kinematic measurements. We acknowledge that to measure lower limb kinematics, both legs must be visible, and so this approach does have some limitations in terms of clothing, excluding, for example, long coats or skirts. Additionally, while the markerless motion capture system is largely unrestricted with regard to the data collection environment, the data used in this study were collected in a laboratory space. Thus, further work should be done to determine the sensitivity of gait kinematics to attire and collection environment.
The findings presented in this study demonstrated that Theia3D markerless motion capture can measure gait kinematics with greater reliability than that of marker-based methods. With this increased reliability and the practical improvements offered over marker-based methods, markerless motion capture could allow gait kinematics to be obtained more quickly, simply, and in a wider variety of environments, increasing the impact of gait analyses in research and clinical biomechanics.

Conflict of Interest Statement
Scott Selbie is the President and Marcus Brown is the Director of Technology of Theia Markerless Inc. (Kingston, Ontario), the developers of Theia3D.