Analyzing Human Movements - Introducing A Framework To Extract And Evaluate Biomechanical Data

This study discusses possible sources of discrepancy between findings of previous human motion studies and presents a framework that seeks to address these issues. Motion analysis systems are widely employed to identify movement deficiencies - e.g. patterns that potentially increase the risk of injury or inhibit performance. However, findings across studies often conflict with respect to what a movement deficiency is or the magnitude of its relationship to a specific injury. To test the information content of movement data, a framework was built to differentiate between movements performed by a control (NORM) and an abnormal (IMP-L and IMP-C) cohort using solely movement data. Movement data was recorded during jumping, hopping and change of direction exercises and was mathematically decomposed into subject scores. Subject scores were then used to identify the most appropriate machine learning technique, which was subsequently utilized to create a prediction model that classified whether a movement was performed by IMP-L, IMP-C or NORM. The Monte Carlo method was used to obtain a measure of expected accuracy for each step within the analysis. Findings demonstrate that even the worst classification model outperformed the best guess observed and that not all members of the NORM group represent a NORM pattern, as they were repeatedly classified as IMP-L or IMP-C. This highlights that some NORM limbs share movement characteristics with the abnormal group and consequently should not be considered when describing NORM.

providing evidence in support for screening for injury risk." However, while predicting acute injuries from movement features is challenging (as there may be multiple contributing factors to injury risk, e.g. movement deficiencies, fatigue, contact with other players, genetics, environmental factors, changes in acute/chronic training load, lost focus, frequency of movement assessments), studies examined within Bahr's [3] review utilized features independently (e.g. peak knee valgus moment, maximum knee flexion on their own), rather than the interaction of features. Further, to date, there has been little progress in the way biomechanical features are extracted and analyzed and this could be a reason for the disparity between studies. Traditional analyses often discard a large portion of the captured data [4][5][6][7] and do not consider the interrelationships within such a complex and multivariate system as the human body.

When describing a movement, studies often extract features based on prior knowledge (previous research and / or personal and clinical experience) or post hoc analysis, performing a comparison of magnitude or timing [8] assuming that these features capture the underlying function of a signal. While discrete points can be helpful in understanding movements, the selection of discrete points has the potential to discard important information [4,5], to compare features that present unrelated neuromuscular capacities [6] and to encourage fishing for significance, e.g. non-trivially biased non-directed hypothesis testing [7]. Due to the apparent limitations in discrete point selection, other analyses have been introduced in recent years to improve the analysis of movement patterns: statistical parametric mapping [7], (functional) principal component analysis [9][10][11], analysis of characterizing phases [6], point by point manner testing [12] and other techniques [8,13,14].

Another possible source for conflicting conclusions is the way extracted features are compared across groups. Comparisons are often made using statistical significance, which infers properties of a population by testing hypotheses and deriving estimates from probability values (p-values). While p-values are useful, they have been criticized in the past and present [15][16][17][18][19]. Criticisms of p-values include the possibility of committing type 1 and 2 errors, that any difference can be statistically significant with a large enough sample size [18], that p-values do not provide statistical precision and that conclusions do not account for subgroups within the data. The way an athlete moves could be influenced by anthropometric measures, sporting and injury history, and there is growing evidence for differences within movement strategies across individuals [20][21][22][23][24][25][26][27][28][29]. As such, features that relate to the risk of injury could be masked during an analysis because of different movement strategies. Finally, most movement analysis studies involve the recording of multiple trials but commonly examine the average of all trials or the maximum trial. When averaging multiple trials, an "artificial" movement is created and examined. Local peaks will be altered in magnitude and temporal appearance [30] and the intersegmental link (coordination) between joints might be lost. As such, examining the best trial seems more valid.

However, such an approach selects a unique instance (based on jump height or performance time) and could bias an analysis towards a non-realistic situation. No athlete will perform a task over and over with a maximal effort and consequently the sub-maximal efforts should not be discarded. An alternative approach is to utilize repeated random sampling (Monte Carlo simulation), where the captured trials are selected at random and the analysis is run multiple times [31,32]. This can overcome the "maximal effort bias" and can also provide a measure of expected differences or accuracy within future studies. Further, such an approach can also overcome discrepancies between findings that are caused by the selection of a reference limb when comparing an abnormal (injured) to a normal or uninjured group.
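As an illustration of the repeated random sampling idea, the following minimal sketch draws one captured trial per limb at random and repeats a placeholder analysis many times. The function name, the limb identifiers, the jump heights and the group-mean "analysis" are all hypothetical and are not taken from the study.

```python
import random
from statistics import mean


def monte_carlo_estimate(trials_by_limb, analysis, n_iter=100, seed=0):
    """Run `analysis` n_iter times, each time on one randomly chosen trial per limb.

    Returns the mean, minimum and maximum of the analysis results.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        # Pick one trial per limb at random instead of the best or the average trial.
        sample = {limb: rng.choice(trials) for limb, trials in trials_by_limb.items()}
        results.append(analysis(sample))
    return mean(results), min(results), max(results)


# Hypothetical jump heights (m): three captured trials per limb.
trials_by_limb = {
    "s01_left": [0.31, 0.29, 0.33],
    "s01_right": [0.30, 0.32, 0.28],
    "s02_left": [0.25, 0.27, 0.26],
}


def group_mean(sample):
    """Placeholder analysis: mean jump height of the sampled trials."""
    return mean(list(sample.values()))


expected, low, high = monte_carlo_estimate(trials_by_limb, group_mean)
```

The spread between `low` and `high` gives a rough sense of how much an outcome depends on which trial happens to be selected.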

When examining a human movement, the method of data analysis chosen needs to contend with a complex and multivariate system, and any analysis using an inference test may not help to progress the understanding of movement further as it does not account for differences in movement strategy or the interrelationship of segments. More suitable methods for movement analysis might be machine learning techniques, which have gained popularity in other fields and have demonstrated an enhanced ability to understand complex and multivariate systems (see Rajpurkar et al. [33] or Barton et al. [34]). Rajpurkar et al. [33] demonstrated the use of such an approach in cardiovascular medicine, by developing an algorithm that outperformed board certified cardiologists when detecting a range of heart arrhythmias from electrocardiograms.

When applying machine learning techniques to motion analysis, the computer is trained to learn the underlying connection between a set of predictor features (e.g. peak knee angle and moment) and a class (abnormal or normal). If the set of predictor features holds sufficient information, the algorithm will be able to predict the class of a previously "unseen" observation correctly. If the set of predictor features does not hold sufficient information, the algorithm will fail to predict the correct class of a previously "unseen" observation. To build a model that can objectively judge a movement, Richter et al. [29] proposed identifying individuals that present the "true" group pattern, i.e. those that are continuously classified into the correct class. For example, if the goal is to describe a normal movement pattern, an algorithm could be trained to differentiate between a normal and abnormal population. Samples that cannot be correctly classified by the classification technique might then be left out when describing the normal behavior. Consequently, the probability of belonging to a class (abnormal or normal) generated during a classification could be used when judging a movement [29]. A good example of an abnormal cohort is athletes recovering from ACL reconstruction, as the ACL reconstruction will have altered the neuromuscular properties of the athlete, influencing the way they move.
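The idea of keeping only samples that are continuously classified into the correct class can be sketched as follows. This is a minimal illustration, assuming that each limb has been classified once per repeated analysis run. The limb identifiers, the predicted labels and the 0.95 cut-off are hypothetical and not taken from the study.

```python
from collections import Counter


def class_membership(predictions):
    """Fraction of repeated classifications that fell into each class."""
    counts = Counter(predictions)
    return {label: n / len(predictions) for label, n in counts.items()}


def consistent_members(predictions_by_limb, target="NORM", min_fraction=0.95):
    """Limbs classified as `target` in at least `min_fraction` of all runs.

    min_fraction is an illustrative cut-off, not a value from the study.
    """
    return [
        limb
        for limb, predictions in predictions_by_limb.items()
        if class_membership(predictions).get(target, 0.0) >= min_fraction
    ]


# Hypothetical predicted classes of two control limbs over 100 repeated runs.
predictions_by_limb = {
    "c01_left": ["NORM"] * 100,
    "c02_left": ["NORM"] * 60 + ["IMP-C"] * 40,
}

membership = class_membership(predictions_by_limb["c02_left"])
true_norm = consistent_members(predictions_by_limb)
```

Here the membership fraction plays the role of the class probability mentioned above: `c02_left` is a control limb by label, but it is classified as NORM in only 60 % of the runs and would therefore not be used to describe the normal pattern.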

The aim of this study was to examine if biomechanical data can distinguish between a normal and abnormal movement pattern and to present a framework that combines an automatic feature extraction with a machine learning approach.

The data set used in this study holds a cohort of athletes recovering from an ACL reconstruction and an uninjured control group. The ACL group (n = 156) was recruited from the caseload of two orthopedic surgeons who specialize in knee surgery between [...] multi-directional sport (i.e. Gaelic Football, Soccer, Hurling, Rugby Union) that were free of injury in the 3 months prior to testing, had no previous knee surgery and were between 18 and 35 years of age. The study received ethical approval from [...] and was registered on clinicaltrials.gov.

The ACL group had an average age of 24.8 ± 4.8 years, was 180 ± 8 cm tall and had a body mass of 84 ± 15.2 kg. The control group had an average age of 24.8 ± 4.2 years, was 183 ± 6 cm tall and had a body mass of 82 ± 8.9 kg.

Data Capture and Preprocessing

The testing took place in the motion analysis laboratory using an eight-camera motion analysis system (200 Hz; Bonita-B10, Vicon, UK), synchronized with two force platforms (1000 Hz; BP400600, AMTI, USA). Before data collection, all subjects undertook a standardized warm-up and wore their own athletic footwear with 24 reflective markers secured to the shoe or to the skin using tape, at bony landmarks according to the Plug-in-Gait marker set. Three trials of each limb for the following exercises were captured: DLCMJ [35], SLCMJ [35], DLDJ [35], SLDJ [35], HoHo [35], SLHop [35],

Data pre-processing (gap filling and waveform screening) was performed using a custom-developed MATLAB program (R2015a, MathWorks Inc., USA) that also computed the additional kinematic measures [35]. The start and end of an exercise were defined using the force trace or a combination of center of mass (CoM) power and force trace. For the CMJ, the start was defined as the first time the ground reaction force (GRF) was less than body weight (BW) minus 25 N, while the end was defined as toe off (force less than 25 N). For the DJs, HH and CoDP, the start was defined as the first instance where the vertical GRF (vGRF) was above 25 N and the end as the instance where the GRF was below 25 N. For the SLHop, the start was defined as the first instance where the GRF was above 25 N and the end as the instance where the CoM power first became positive. All measures were landmark registered using a dynamic time warping process [38] to align the end of the eccentric phase across all curves¹.
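The threshold-based start and end definitions can be sketched as below. This is a minimal illustration in Python rather than the authors' MATLAB program. The function names and the force samples are hypothetical, and forces are assumed to be in newtons.

```python
def first_index(values, condition, start=0):
    """Index of the first sample at or after `start` that satisfies `condition`."""
    for i in range(start, len(values)):
        if condition(values[i]):
            return i
    return None


def contact_window(vgrf, threshold=25.0):
    """Drop-jump style window: first sample above threshold to first later sample below it."""
    start = first_index(vgrf, lambda f: f > threshold)
    if start is None:
        return None
    end = first_index(vgrf, lambda f: f < threshold, start + 1)
    return None if end is None else (start, end)


def cmj_window(grf, body_weight, offset=25.0):
    """CMJ style window: first sample below BW - offset to toe off (force below offset)."""
    start = first_index(grf, lambda f: f < body_weight - offset)
    if start is None:
        return None
    end = first_index(grf, lambda f: f < offset, start + 1)
    return None if end is None else (start, end)


# Hypothetical vertical force traces (N).
drop_jump = [0.0, 3.0, 40.0, 800.0, 1500.0, 900.0, 60.0, 10.0, 0.0]
cmj = [800.0, 798.0, 760.0, 500.0, 1200.0, 1600.0, 900.0, 20.0, 0.0]

dj_window = contact_window(drop_jump)            # (2, 7)
jump_window = cmj_window(cmj, body_weight=800.0)  # (2, 7)
```

Both functions return sample indices, so the window in seconds follows from dividing by the sampling rate of the force platform.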

The Framework

The steps taken during data analysis can be described as follows: feature generation, selection of a supervised learning technique, generation of a classification model, and testing of the generated classification model. During the analysis, each limb was treated as a separate entry to overcome the question of which limb to choose within the control group. Consequently, the data that was used during the following steps contained 156

The first step was the identification of phases of variation, which were used to calculate features that describe the behavior of a trial, similar to Richter et al. [39]. The process is illustrated in detail in fig 2. [...] using the idea of analysis of characterizing phases [6]. The information obtained (measure [e.g. joint angle], start and end) of each phase of variation found was recorded. This process was repeated 100 times, omitting the trials that were not selected (n = 236) during each simulation to increase the generalizability of findings. All phases that occurred at least 95 times during the 100 iterations and spanned at least 5 % of the measure were considered "robust" and used to generate a feature matrix (i features x 436 trials) to describe the movement pattern within a trial. A feature i was calculated as the average value over a robust phase of variation. For every double leg exercise, symmetry was also calculated as limb minus contralateral limb (e.g. left - right and right - left).²

² Measures included were: GRF (x, y, z), GRF impulse (x, y, z), CoM velocity (xy, xyz, z), CoM power (x, y, z), CoM in pelvis (x, y, z), CoM in hip (x, y, z), CoM in knee (x, y, z), CoM in ankle (x, y, z), joint angles of ankle, knee, hip, pelvis, thorax and thorax on pelvis in sagittal, frontal and transversal planes, joint angular velocities of ankle, knee, hip, pelvis, thorax and thorax on pelvis in sagittal, frontal and transversal planes, joint powers, moments, work and impulse of ankle, knee, hip and pelvis in sagittal, frontal and transversal planes, time and foot angle on pelvis.
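The selection of robust phases and the calculation of features can be sketched as follows. This is a simplified illustration under stated assumptions: how phases from different iterations were matched is not described here, so the sketch assumes a point-wise count (a time point is robust if it lay inside a detected phase in at least 95 of 100 iterations) on curves normalized to 101 points. All function names and example values are hypothetical.

```python
def robust_phases(phases_per_iteration, n_points=101, min_count=95, min_span=0.05):
    """Return (start, end) index pairs of phases that recur across iterations.

    phases_per_iteration: one list of inclusive (start, end) pairs per iteration.
    Assumption: robustness is judged point by point, then contiguous runs are kept
    if they span at least `min_span` of the curve.
    """
    counts = [0] * n_points
    for phases in phases_per_iteration:
        covered = set()
        for start, end in phases:
            covered.update(range(start, end + 1))
        for i in covered:
            counts[i] += 1

    robust, run_start = [], None
    for i in range(n_points + 1):
        inside = i < n_points and counts[i] >= min_count
        if inside and run_start is None:
            run_start = i
        elif not inside and run_start is not None:
            if (i - run_start) / n_points >= min_span:
                robust.append((run_start, i - 1))
            run_start = None
    return robust


def phase_feature(curve, phase):
    """Feature value: average of the curve over one robust phase."""
    start, end = phase
    segment = curve[start:end + 1]
    return sum(segment) / len(segment)


def symmetry(limb_feature, contralateral_feature):
    """Symmetry as limb minus contralateral limb."""
    return limb_feature - contralateral_feature


# Hypothetical detections: a stable phase around 20-40 % and an unstable one at 70-72 %.
iterations = [[(20, 40)] for _ in range(97)] + [[(22, 38)] for _ in range(3)]
for phases in iterations[:50]:
    phases.append((70, 72))

phases_found = robust_phases(iterations)
curve = [float(i) for i in range(101)]  # placeholder waveform
feature = phase_feature(curve, phases_found[0])
asymmetry = symmetry(feature, 27.5)
```

Matching phases by overlap or by exact start and end would be equally plausible readings of the text and would change which phases count as robust.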
[...] of IMP-L and IMP-C) resulting in n = 90, while the leave out dataset contained the [...]

To identify features with high importance towards classification, the data was, as before, split into train, test and leave out data sets (see selection of supervised learning technique). The train set was used to teach the best performing machine-learning technique to forecast the three classes (IMP-L, IMP-C and NORM). After the training phase had been completed, the performance of the learning technique was assessed by comparing the predicted class of the test data to the actual class. This process was done for each extracted feature on its own. The accuracy of each "feature model" was recorded and the process was repeated 100 times using different randomly selected train, test and leave out samples to obtain a repeatable measure of the expected accuracy.
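A single-feature model evaluated over repeated random splits can be sketched as follows. This is a minimal illustration, not the study's implementation: a nearest-class-mean rule stands in for the best performing machine-learning technique, the leave out set is omitted for brevity, and the feature values, split size and function names are hypothetical.

```python
import random
from statistics import mean


def nearest_mean_classifier(train_x, train_y):
    """Stand-in classifier: assign the class whose training mean is closest."""
    centroids = {
        label: mean([x for x, y in zip(train_x, train_y) if y == label])
        for label in set(train_y)
    }
    return lambda x: min(centroids, key=lambda label: abs(x - centroids[label]))


def expected_accuracy(values, labels, n_iter=100, test_fraction=0.3, seed=0):
    """Mean test accuracy of a one-feature model over repeated random splits."""
    rng = random.Random(seed)
    indices = list(range(len(values)))
    n_test = max(1, int(len(indices) * test_fraction))
    scores = []
    for _ in range(n_iter):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        predict = nearest_mean_classifier(
            [values[i] for i in train], [labels[i] for i in train]
        )
        hits = sum(predict(values[i]) == labels[i] for i in test)
        scores.append(hits / n_test)
    return mean(scores)


# Hypothetical data: 10 limbs per class, one informative and one uninformative feature.
labels = ["IMP-L"] * 10 + ["IMP-C"] * 10 + ["NORM"] * 10
informative = [group + 0.01 * i for group in (0.0, 1.0, 2.0) for i in range(10)]
uninformative = [((i * 7) % 10) / 10 for i in range(30)]

acc_informative = expected_accuracy(informative, labels)
acc_uninformative = expected_accuracy(uninformative, labels)
```

Ranking features by this expected accuracy is what allows the most informative feature to be picked first in the next step.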

Subsequently, the feature with the highest mean accuracy was identified, removed from the feature matrix and used to build the "model base". To increase interpretability, all features that correlated with the identified feature (greater than 0.7) were also removed, based on a correlation utilizing the whole feature matrix. After the model base was built, the process was repeated while pairing every feature remaining in the feature matrix with [...] accuracy, the Elbow method was used, as described in Hastie and Tibshirani [40] or Vapnik and Vapnik [41]. The "elbow" was defined as the point n where the differentiation of accuracy f improved less than 10 % of its range (Eq 1).
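The correlation filter and the elbow criterion can be sketched as below. Eq 1 is not reproduced here, so the elbow rule in this sketch is one possible reading of the verbal definition (stop at the first added feature whose accuracy gain is below 10 % of the accuracy range). It is further assumed that the 0.7 threshold applies to the absolute Pearson correlation. Feature names and values are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equally long, non-constant sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def drop_correlated(features, selected, threshold=0.7):
    """Names of features left after removing `selected` and its close correlates.

    Assumption: the threshold is applied to the absolute correlation.
    """
    reference = features[selected]
    return [
        name
        for name, column in features.items()
        if name != selected and abs(pearson(reference, column)) <= threshold
    ]


def elbow(accuracy):
    """Number of features at the elbow; accuracy[k] is the accuracy with k + 1 features."""
    span = max(accuracy) - min(accuracy)
    for n in range(1, len(accuracy)):
        if accuracy[n] - accuracy[n - 1] < 0.1 * span:
            return n  # feature n + 1 adds too little, keep the first n
    return len(accuracy)


# Hypothetical feature matrix (one list per feature) and accuracy curve.
features = {
    "knee_flexion": [1, 2, 3, 4, 5],
    "hip_flexion": [2, 4, 6, 8, 11],
    "grf_z": [5, 1, 4, 2, 3],
}
remaining = drop_correlated(features, "knee_flexion")
n_features = elbow([0.55, 0.68, 0.75, 0.76, 0.765])
```

In this example `hip_flexion` is removed because it is almost perfectly correlated with the selected feature, and the elbow falls at three features because the fourth improves accuracy by only about 0.01.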

After the selection of a machine learning technique and most important features, a final model was created and tested. As before, the data was split into train, test and leave [...] (table 1).

Best performing machine-learning techniques

The ability to classify an unseen trial into IMP-L, IMP-C and NORM using every [...] normalizing scores (increasing performances close to best performing technique).

The examined exercises can be split into three groups based on their patterns within findings. The exercises DLDJ, SLCMJ, DLCMJ and SLHop represent non-complex movements. The HuHo, CoDP and CoDU represent complex movements, while the SLDJ represents a non-complex movement with within "group-limb" confusions. [...] deviate from the NORM pattern but not from each other. As such, the IMP-C may not be ideal to be used as reference when judging risk of injury. Another reason could be that the differences between IMP-L and IMP-C might be too small to be detected using the examined features and sample size. In contrast to the SLCMJ and SLDJ, the DLDJ and DLCMJ did not demonstrate an increased confusion pattern toward a specific [...] Symmetry features were selected first, followed by a performance approximate and another symmetry feature. The magnitude of symmetry or asymmetry seems to be a useful feature within these exercises and would support findings of Myer et al. [42].

While symmetry features could have been included within the single leg models, this would require the intervention of the investigators, e.g. calculating symmetry as mean symmetry, symmetry between trial x and trial y and so on [43]. As every exercise execution presents different external and internal conditions, symmetry cannot be calculated without setting subjective rules and was hence not included here. However, this also suggests that a testing battery should contain both single and double leg exercises.

The HuHo, CoDP and CoDU demonstrated a performance about 10 % less than DLDJ, SLCMJ, DLCMJ and SLHop within the optimal model. However, these exercises should not be discarded because the optimal model suffered from the detected number of features, as they lost about 13 % of their prediction ability when the number of features was reduced from 20 to about 5. This suggests that these exercises are more complex and need more information to describe the underlying movement pattern. A reason could be that the initial condition within these exercises is less defined than in the DLDJ, SLCMJ, DLCMJ and SLHop. Jumping height, width and speed of the HuHo were not accounted for within the model, nor did the model have the information (e.g. the features jump height, width and so on) to adjust. Completion time and pre step movement pattern during the CoDP and CoDU were not accounted for within the model, nor did the model have the information to adjust for these differences in execution. As such, there is a [...] present the true pattern). These findings highlight that understanding complex tasks requires many features in combination and that it becomes more challenging to extract a representative group movement with increasing complexity of a task.

The last remaining exercise was the SLDJ, which demonstrated a rather "unusual" behavior. Its performance was comparable to the HuHo, CoDP and CoDU but its performance did not improve beyond the number of optimal features, nor did it present the previously observed confusion pattern within the IMP group. However, a large [...]

As such the SLDJ seems to be able to extract movement patterns that describe if a limb had an ACL reconstruction or not.

Features selected in the SLDJ⁴ can be linked to jump height (CoM vertical velocity prior to take off), knee and hip kinematics during the middle and latter part of the first ground contact as well as a possible cheating pattern, as evident by the selection of a resultant CoM velocity at impact. This feature should not hold any meaningful information towards classification as every trial was recorded from the same drop height. As such, impact CoM velocity should theoretically be nearly the same across all trials. However, results demonstrate that some information is held in this feature and the [...]

Return to play after ACL surgery / prevention of subsequent re-injury are not always guaranteed [44,45] and this might be in part due to the absence of clear criteria identifying if an athlete has returned to pre-injury levels or completed rehabilitation. Current clinical testing batteries often utilize biomechanics to assess movement quality.

However, there is little consensus on the appropriateness of biomechanical analysis and of specific exercise tests and measures when differentiating between two specific groups. This study demonstrates that biomechanical data hold enough information to differentiate between IMP-L, IMP-C and NORM with classification accuracies above 75 %, that a large proportion of individuals included within the control group do not represent a normal movement pattern and that the probability of membership to a class (in this case the NORM class) might allow the generation of a "healthy" score. Such a score can give an objective measure of how close a trial is to a desired class and might present a clear criterion for whether an athlete has returned to normal or has completed rehabilitation. The "healthy" NORM pattern could be represented by trials that were continuously classified as NORM, and might be considered "low risk" for prospective injury. As the risk of a second ACL injury to the same or contralateral limb is considerably higher than the risk of ACL injury in previously un-injured healthy subjects [41-44], the framework presented here may even be able to judge risk of injury by using the strength of a trial's membership to the NORM group. However, this assumption could not be and was not tested here and needs confirmation on a prospective data set containing a similar cohort.

⁴ Selected were: CoM vertical velocity (94 to 101 %), resultant CoM velocity (1 to 6 %), CoM in anterior knee orientation (57 to 69 %), hip abduction angle (72 to 79 %) and knee abduction angle (95 to 101 %).

⁵ Selected were: symmetry knee rotation angle (29 to 33 %), resultant CoM velocity (87 to 94 %), symmetry CoM in anterior hip orientation (57 to 69 %), hip flexion angular velocity (72 to 79 %).

⁶ Selected were: ankle flexion angular velocity (7 to 11 %), knee flexion angular velocity (94 to 101 %), GRF (14 to 18 %) and CoM in anterior knee orientation (33 to 42 %).

Based on the findings, a biomechanical testing protocol should contain single and double leg jumping tasks (SLCMJ, DLCMJ and DLDJ), as they were able to differentiate trials with high accuracy with only a few features. More complex tasks (HuHo and CoDP) should also be included for individuals in a later stage of rehabilitation, as they challenge the athlete's ability in more than one plane and were also able to differentiate limbs with high accuracy when a large number of features was used. The SLDJ and CoDU demonstrated the lowest ability to differentiate limbs due to their complexity and the range of possible execution strategies. However, all exercises demonstrated that they contain valuable information. [...]

This study introduced and tested a framework that combined an automatic feature extraction with machine learning and assessed its ability to differentiate movements performed by limbs with ACL reconstruction (IMP-L), limbs contralateral to IMP-L (IMP-C) and limbs of a control group (NORM). Findings of this study demonstrate that predictor features extracted from biomechanical data hold valuable information for assessing rehabilitation progress/status and that a large portion of a control group might not represent a normal movement pattern, highlighting the potential of movement analysis and machine learning in identifying injury risk and rehabilitation status. Overall, biomechanical data requires advanced statistics to identify true representations of a group movement pattern, which suggests that probabilities of membership to previously identified patterns may be appropriate to objectively judge injury risk and rehabilitation status.