Abstract
Background The motor learning literature focuses on relatively simple laboratory tasks because they are highly controlled and make it easy to apply manipulations that induce learning and adaptation. In recent work we introduced a billiards paradigm and demonstrated the feasibility of real-world neuroscience using wearables for naturalistic full-body motion tracking and mobile brain imaging. Here we developed an embodied virtual reality (VR) environment for our real-world billiards paradigm, which allows us to control the visual feedback for this complex real-world task while maintaining the sense of embodiment.
Methods The setup was validated by comparing real-world ball trajectories with the embodied VR trajectories calculated by the physics engine. We then ran our real-world learning protocol in the embodied VR. Ten healthy human subjects played repeated trials of the same billiard shot, holding a physical cue and hitting a physical ball on the table while seeing it all in VR.
Results We found comparable learning trends in the embodied VR to those we previously reported in the real-world task.
Conclusions Embodied VR can be used for learning real-world tasks in a highly controlled VR environment, which enables applying visual manipulations, common in laboratory tasks and in rehabilitation, to a real-world full-body task. Such a setup can be used for rehabilitation, where the use of VR is gaining popularity but the transfer to the real world is currently limited, presumably due to the lack of embodiment. The embodied VR makes it possible to manipulate feedback and apply perturbations to isolate and assess interactions between specific motor learning mechanisms, and thus to address current questions of motor learning in real-world tasks.
Background
Motor skill learning is a key feature of our development and our daily lives, from a baby learning to crawl, to an adult learning crafts or sports, or undergoing rehabilitation after an injury or a stroke. It is a complex process, which involves movement in many degrees of freedom (DoF) and multiple learning mechanisms. Yet the majority of motor learning literature focuses on simple lab-based tasks with limited DoF such as force-field adaptations [e.g. 1–4], visuomotor perturbations [e.g. 5–9], and sequence-learning of finger tapping or pinching tasks [e.g. 10–13]. The real-world neuroscience approach studies neurobehavioral processes in natural behavioral settings [14–17]. We recently presented a naturalistic real-world motor learning paradigm, using wearables for full body motion tracking and EEG for mobile brain imaging, while making people perform actual real-world tasks, such as playing the competitive sport of pool-table billiards [18,19]. We showed that motor learning is a full body process that involves multiple learning mechanisms, and different subjects might prefer one over the other.
Now we want to introduce manipulations of real-world tasks to establish causality. While the study of real-world tasks takes us closer to understanding real-world motor learning, it lacks the key advantage that made lab-based toy tasks so popular: highly controlled manipulations of known variables that isolate specific movement or learning components. Virtual Reality (VR) provides a handy solution to this problem, making it possible to apply controlled manipulations in a real-world task [20]. VR has clear benefits such as ease of controlling repetition, feedback, and motivation, as well as overall advantages in safety, time, space, equipment, cost efficiency, and ease of documentation [21,22]. Thus, it is commonly used in rehabilitation after stroke [23,24] or brain injury [25,26], and for Parkinson’s disease [27,28]. In simple sensorimotor lab-based motor learning paradigms, VR training was shown to yield results equivalent to those of real training [29–31], though adaptation in VR appears to be more reliant on explicit/cognitive strategies [31].
While VR is very good for visual immersion, it often lacks the Sense of Embodiment (SoE). SoE refers to the senses associated with being inside, having, and controlling a body [32]. SoE requires a sense of self-location, agency, and body ownership [33–35]. This study aims to set up and validate an Embodied Virtual Reality (EVR) for real-world motor learning, which would make it possible to apply highly controlled manipulations in a real-world task. We developed an EVR for our billiards paradigm [18] by synchronizing the positions of the real-world billiards objects (table, cue-stick, balls) into the VR environment using motion capture (MoCap). Thus, the participant can play with a physical cue and a physical ball on the physical pool-table while seeing it all in VR (https://youtu.be/m68_UYkMbSk). We ran our real-world billiards experimental protocol in this novel EVR to explore the similarities and differences in learning between the real-world paradigm and its EVR mockup.
Methods
Experimental Setup
Our EVR experimental setup (Figure 1A) was composed of a real-world environment of a physical pool table, a VR (Unity3d, HTC Vive) environment of the game (Figure 1B), and MoCap to link the two environments (Figure 1C). The positions of the virtual billiards table and balls the subjects saw in the VR were matched with their respective real-world positions during calibration, and the cue-stick trajectory was streamed into the VR using MoCap (Optitrack, Motiv). This allowed subjects to engage in the VR task the same way they would in the real-world. Data collection for game object trajectories was done directly in Unity3d, and the full body movement was recorded with a suit of IMUs (inertial measurement units).
(A) Ten right-handed healthy subjects performed 300 repeated trials of billiards shots in Embodied Virtual Reality (EVR). Green arrows mark the MoCap markers used to track and stream the cue stick movement into the EVR environment. (B) Scene view in the EVR. Subjects were instructed to hit the cue ball (white), which was a physical ball on the table (in A), in an attempt to shoot the virtual target ball (red) towards the far-left corner. (C) For environment calibration, MoCap markers were attached to the HTC Vive controllers, which were placed in the pool-table’s pockets, with an additional solo marker in the cue ball position.
Real-world objects included the same billiards table, cue ball, target ball, and cue stick used in our real-world billiard study [18]. Subjects were unable to see anything in the real-world environment; they could only see a virtual projection of the game objects. They were, however, able to receive tactile feedback from the objects by interacting with them.
Four MoCap cameras (Optitrack, Motiv) with Motiv software were used to stream the position of the real-world cue stick into the VR using 4 markers on the stick (Figure 1A). The position of each marker was streamed to Unity3d using the NATNET Optitrack Unity3d Client plugin and the associated Optitrack Streaming Client script, edited for the application. The positions were transformed from the Optitrack environment to the Unity3d environment with a transformation matrix derived during calibration. The cue stick asset was then reconstructed in VR using known geometric quantities of the cue stick and marker locations (Figure 1B). The placement of markers on the cue stick, as well as the position and orientation of the cameras, were key to providing consistent marker tracking and accurate control in VR without significantly constraining the subject movement. The rotation of the cue stick or the position of the subject can interfere with the line of sight between the markers on the cue stick and the cameras. Thus, to prevent errors in cue tracking, if markers become untracked the cue stick disappears from the visual scene until proper tracking is resumed.
The VR billiards environment was built with the Unity3d physics engine. The head-mounted display (HMD) used was the HTC Vive Pro. Frame rate for VR display was 90 Hz. The Unity3d assets (billiards table, cue stick, balls) were taken from an open source Unity3d project [36] and scaled to match the dimensions of the real-world objects. Scripts were developed in C# to manage game object interactions, apply physics, and record data. Unity3d software was used to develop custom physics for game collisions. Cue stick – cue ball collision force in Unity3d is computed from the median velocity and direction of the cue stick in the 10 frames (∼0.11 seconds) before contact. Sensory and auditory feedback comes from the real-world objects for this initial collision. Cue ball – target ball collision is hard coded as a perfect inelastic collision. A billiard ball sound effect is output to the Vive headphones during this collision. The default Unity3d engine was used for ball dynamics, with specific mass and friction parameters tuned to match real-world ball behavior as closely as possible. For the game physics validation, the physical cue ball on the pool table was tracked with a high-speed camera (Dalsa Genie Nano) and its trajectories were compared with those of the VR ball.
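The median-based estimate of the cue velocity before contact can be illustrated with a short sketch. This is not the authors' C# script: the function name `median_strike_velocity`, the 2D (x, z) representation of the cue tip, and the choice of a component-wise median over frame-to-frame differences are all assumptions made for illustration. Only the 90 Hz frame rate and the 10-frame window come from the text.

```python
from statistics import median

FRAME_RATE_HZ = 90.0  # VR frame rate reported in the text
WINDOW_FRAMES = 10    # frames before contact (about 0.11 s at 90 Hz)


def median_strike_velocity(tip_positions, frame_rate_hz=FRAME_RATE_HZ,
                           window=WINDOW_FRAMES):
    """Component-wise median cue-tip velocity over the frames before contact.

    tip_positions: list of (x, z) cue-tip positions in metres, one per frame,
    with the last entry at the contact frame (hypothetical input format).
    Returns (vx, vz) in metres per second.
    """
    if len(tip_positions) < window + 1:
        raise ValueError("need at least window + 1 position samples")
    recent = tip_positions[-(window + 1):]
    dt = 1.0 / frame_rate_hz
    # Frame-to-frame finite differences give one velocity sample per frame.
    vx = [(b[0] - a[0]) / dt for a, b in zip(recent, recent[1:])]
    vz = [(b[1] - a[1]) / dt for a, b in zip(recent, recent[1:])]
    # The median is robust to a single jittery tracking sample.
    return median(vx), median(vz)


if __name__ == "__main__":
    # Cue tip advancing 1 cm per frame along x: 0.9 m/s at 90 Hz.
    positions = [(0.01 * i, 0.0) for i in range(12)]
    print(median_strike_velocity(positions))
```

A median is a natural choice here because a single mistracked marker frame produces two outlying velocity samples, which a mean would pass on to the collision force.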
For environment calibration, the ‘y-axis’ was set directly upwards (orthogonal to the ground plane) in both the Unity3d and Optitrack environments during their respective initial calibrations. As a result, only a 2D (x-z) transformation between environments is required, with a linear ratio to scale the height. The transformation matrix was determined by matching the positions of known coordinates in both Unity3d and Optitrack environments. We attached markers to the Vive controllers and, during calibration mode, set them in the corner pockets of the table and placed a solo marker on the cue ball location (Figure 1C), to compute the transformation matrix as well as the position and scale of the real-world table. This transformation matrix was then used to transform points from the Optitrack environment into the Unity3d space.
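As an illustration of this calibration step, the sketch below fits a 2D (x-z) map between matched points and applies it together with a height scale. The text does not specify the form of the transformation matrix or how it was fitted, so the least-squares affine fit is an assumption, as are the names `fit_affine_xz` and `to_unity` and the example coordinates.

```python
def _solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("calibration points are degenerate (collinear?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_affine_xz(src_points, dst_points):
    """Least-squares 2D affine map from source (x, z) to destination (x, z).

    Returns two rows (a, b, c) such that out = a * x + b * z + c.
    """
    rows = [[x, z, 1.0] for x, z in src_points]
    # Normal equations: (A^T A) w = A^T y, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    result = []
    for k in range(2):
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst_points))
               for i in range(3)]
        result.append(_solve(ata, atb))
    return result


def to_unity(point_xyz, affine, height_scale):
    """Map a MoCap (x, y, z) point: affine in the x-z plane, scaled height."""
    x, y, z = point_xyz
    (a, b, c), (d, e, f) = affine
    return (a * x + b * z + c, height_scale * y, d * x + e * z + f)


if __name__ == "__main__":
    # Hypothetical matched points: four pockets plus the cue ball position.
    mocap = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
    unity = [(-2.0 * z + 1.0, 2.0 * x + 3.0) for x, z in mocap]
    affine = fit_affine_xz(mocap, unity)
    print(to_unity((1.0, 0.5, 0.25), affine, height_scale=2.0))
```

An affine fit was chosen for the sketch because it also absorbs a possible axis flip between the two coordinate systems; with five matched points the system is overdetermined, so small marker placement errors are averaged out.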
Experimental Design
Ten right-handed healthy human volunteers with normal or corrected-to-normal visual acuity (4 women and 6 men, aged 24±2) participated in the study, following the experimental protocol from Haar et al. [18]. The volunteers, who had little to no previous experience with playing billiards, performed 300 repeated trials in the EVR setup, where the cue ball (white) and the target ball (red) were placed in the same locations and the subject was asked to shoot the target ball towards the pocket of the far-left corner (Figure 1B). VR trials ended when ball velocities fell below a threshold value, and the next trial began when the subject moved the cue stick tip to a set distance away from the cue ball start position. The trials were split into 6 sets of 50 trials with a short break in-between. For the data analysis we further split each set into two blocks of 25 trials each, resulting in 12 blocks. During the entire learning process, we recorded the subjects’ full body movements with a motion tracking ‘suit’ of 17 wireless inertial measurement units (IMUs). Movement of all game objects in Unity3d (most notably ball and cue stick trajectories relative to the table) was captured in every frame at a 90 Hz sampling rate.
Full-Body Motion Tracking
Kinematic data was recorded at 60 Hz using a wearable motion tracking ‘suit’ of 17 wireless IMUs (Xsens MVN Awinda, Xsens Technologies BV, Enschede, The Netherlands). Data acquisition was done via a graphical interface (MVN Analyze, Xsens Technologies BV, Enschede, The Netherlands). The Xsens joint angles and position data were exported as XML files and analyzed using custom software written in MATLAB (R2017a, The MathWorks, Inc., MA, USA). The Xsens full body kinematics were extracted as joint angles in 3 degrees of freedom for each joint, following the International Society of Biomechanics (ISB) recommendations for Euler angle extractions of Z (flexion/extension), X (abduction/adduction), and Y (internal/external rotation).
Movement Velocity Profile Analysis
From the joint angles we extracted the velocity profiles of all joints in all trials. We defined the peak of the trial as the peak of the average absolute velocity across the DoFs of the right shoulder and the right elbow. We aligned all trials around the peak of the trial and cropped a window of 1 sec around the peak for the analysis of joint angles and velocity profiles.
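The peak detection and cropping described above can be sketched as follows. The original analysis was done in MATLAB; this Python version is only illustrative. The function names are hypothetical, and centring the 1 s window on the peak (0.5 s on each side) is an assumption, since the text does not say how the window was positioned.

```python
SAMPLE_RATE_HZ = 60  # Xsens recording rate reported in the text


def find_trial_peak(arm_velocities):
    """Index of the peak of the mean absolute velocity across DoFs.

    arm_velocities: list of per-DoF velocity traces of equal length
    (for example, the shoulder and elbow DoFs of the right arm).
    """
    n = len(arm_velocities[0])
    mean_abs = [
        sum(abs(dof[t]) for dof in arm_velocities) / len(arm_velocities)
        for t in range(n)
    ]
    return max(range(n), key=mean_abs.__getitem__)


def crop_around_peak(signal, peak_index, window_s=1.0,
                     sample_rate_hz=SAMPLE_RATE_HZ):
    """Crop a window centred on peak_index (assumed symmetric around it)."""
    half = int(round(window_s * sample_rate_hz / 2))
    start, stop = peak_index - half, peak_index + half + 1
    if start < 0 or stop > len(signal):
        raise ValueError("window extends beyond the recorded trial")
    return signal[start:stop]


if __name__ == "__main__":
    # Two toy DoF traces with a joint peak at sample 40.
    shoulder = [0.0] * 100
    elbow = [0.0] * 100
    shoulder[40], elbow[40], elbow[70] = -3.0, 1.0, 1.5
    peak = find_trial_peak([shoulder, elbow])
    print(peak, len(crop_around_peak(shoulder, peak)))
```

With a symmetric window at 60 Hz, each cropped trial holds 61 samples, with the peak at the centre sample, so all trials share a common time axis.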
Results
Ball trajectories validation
To validate how well the billiards shot in the EVR resembles the same shot in real life, the cue ball trajectories of 100 shots in various directions (−50°< ø < 50° when 0 is straight forward) were compared between the two environments. The cue ball angles were almost perfectly correlated (Pearson correlation r=0.99) and the root mean squared error (RMSE) was below 3 degrees (RMSE = 2.85°). Thus, the angle of the virtual ball in the EVR, which defines the performance in this billiards task, was very consistent with the angle of the real-world ball (Figure 2A). The velocities were also highly correlated (Pearson correlation r=0.83) between the environments, but the ball velocities in the VR were slightly slower than on the real pool-table (Figure 2B), leading to an RMSE of 1.03 m/s.
Cue ball trajectories comparison between VR and real-world for 100 billiard shots in various directions (−50°< ø < 50° when 0 is straight forward). (A) Cue ball angles, each dot is a trial. (B) Max velocity of the cue ball during each trial. The regression line is in black with its 95% CI in dotted lines. Identity line is in light gray.
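For reference, the two agreement measures used in this validation can be computed as in the sketch below. The example values are invented for illustration and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def rmse(x, y):
    """Root mean squared error between paired measurements."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    # Hypothetical paired cue-ball angles in degrees (real table vs. EVR).
    real_angles = [-40.0, -20.0, 0.0, 20.0, 40.0]
    evr_angles = [-38.0, -22.0, 1.0, 19.0, 43.0]
    print(pearson_r(real_angles, evr_angles), rmse(real_angles, evr_angles))
```

The two measures answer different questions: a high correlation can coexist with a systematic offset or scale difference, which is why the RMSE is reported alongside it, as in the velocity comparison.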
Motor Learning Experiment
To compare the learning of the billiards shot in our EVR to the learning in real-life, we ran the same experimental protocol as in Haar et al. [18] and compared mean subject performance. Accordingly, the trials were divided into blocks of 25 trials each (each experimental set of 50 trials was divided into two blocks to increase resolution) to assess the performance. Over blocks, there is a gradual decay in the mean directional absolute error (Figure 3A). The error was defined as the absolute angular difference between the target ball movement vector direction and the desired direction to land the target ball in the center of the pocket. The decay of error over trials is the clearest signature of learning in the task. Accordingly, success rates increase over blocks (Figure 3B). We also see a decay in inter-subject variability over learning, represented by the decrease in the size of the error bar of the directional error over time (Figure 3A). These learning trends in the directional error and success rates are similar to those reported in the real world. Nevertheless, there are clear differences in the learning curve. In the EVR, learning occurs more slowly than in the real-world task, and subjects’ performance is worse. The most striking difference between the environments is in the intertrial variability (Figure 3C). In the real-world task there was a clear decay in intertrial variability throughout the experiment, whereas in EVR we see no clear trend. Corrected intertrial variability (Figure 3D), calculated to correct for learning happening within the block [18], also showed no learning trend.
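The error definition and the block-wise averaging can be sketched as follows. The function names and the 2D vector representation are assumptions for illustration; only the block size of 25 trials and the definition of the error as an absolute angular difference come from the text.

```python
import math

BLOCK_SIZE = 25  # trials per analysis block, as stated in the text


def directional_error_deg(ball_vec, desired_vec):
    """Absolute angle in degrees between two 2D direction vectors."""
    angle = (math.atan2(ball_vec[1], ball_vec[0])
             - math.atan2(desired_vec[1], desired_vec[0]))
    # Wrap into [-pi, pi) so that directions straddling +/-180 deg compare correctly.
    wrapped = (angle + math.pi) % (2 * math.pi) - math.pi
    return abs(math.degrees(wrapped))


def block_means(values, block_size=BLOCK_SIZE):
    """Mean of consecutive, non-overlapping blocks of trials."""
    blocks = [values[i:i + block_size]
              for i in range(0, len(values), block_size)]
    return [sum(block) / len(block) for block in blocks]


if __name__ == "__main__":
    print(directional_error_deg((1.0, 1.0), (1.0, 0.0)))  # about 45 degrees
    # 300 trials split into 12 blocks of 25.
    print(len(block_means([0.0] * 300)))
```

Passing a list of per-trial success flags (1 for a pocketed ball, 0 otherwise) to `block_means` gives the success rate per block in the same way.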
(A) The mean absolute directional error of the target-ball, (B) The success rate, (C) directional variability, and (D) directional variability corrected for learning (see text). (A-D) presented over blocks of 25 trials. Solid and dashed lines are for EVR and Real-world respectively.
The full body movements were analyzed over the velocity profiles of all joints, as those are less sensitive to potential drifts in the IMUs and more robust and reproducible in natural behavior across subjects and across trials [18,37].
The velocity profiles of the different joints in the EVR showed that the movement is in the right arm, as expected. The velocity profiles of the right arm showed the same changes following learning as in the real-world task. The shoulder velocities showed a decrease from the initial trials to the trials of the learning plateau, suggesting less shoulder movement; while the elbow rotation shows an increase in velocity over learning (Figure 4). The covariance matrix over the velocity profiles of the different joints, averaged across blocks of trials of all subjects, emphasizes this trend. Over the first block it shows that most of the variance in the movement is in the right shoulder while in the 9th block (trials 201-225, the beginning of the learning plateau) there is an overall similar structure of the covariance matrix, but with a strong decrease in the shoulder variance and a strong increase in the variance of right elbow rotation (Figure 5A). This is a similar trend to the one observed in the real-world task, and even more robust.
Velocity profiles in 3 degrees of freedom (DoF) for each joint of the right arm joints. Blue lines are the profiles during the first block (trials 1-25), and red lines are the velocity profiles after learning plateaus, during the ninth block (trials 201-225). Solid and dashed lines are for EVR and Real-world respectively.
(A) The variance covariance matrix of the right arm joints velocity profiles in EVR, averaged across subjects and trials over the first block, second block and the ninth block (after learning plateaus). (B) The trial-by-trial generalized variance (GV), with a double-exponential fit (red curve). (C) The number of principal components (PCs) that explain more than 1% of the variance in the velocity profiles of all joints in a single trial, with an exponential fit (red curve). (D) The manipulative complexity (Belić and Faisal, 2015), with an exponential fit (red curve). (B-D) Averaged across all subjects over all trials. Grey dots are the trial averages for the EVR data. Solid and dashed red lines are fits for EVR and Real-world respectively.
The generalized variance (GV; the determinant of the covariance matrix [38]) over the velocity profiles of all joints was lower in the EVR than in the real-world but showed the same trend: it increased rapidly over the first ∼30 trials and later decreased slowly (Figure 5B), suggesting active control of the exploration-exploitation trade-off. The covariance (Figure 5A) shows that the changes in the GV were driven by an initial increase followed by a decrease in the variance of the right shoulder. As in the real-world task, in the EVR the internal/external rotation of the right elbow showed a continuous increase in its variance, which did not follow the trend of the GV.
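The GV computation itself is compact, as the sketch below shows. It uses only the definition given above (the determinant of the covariance matrix); how samples were pooled across time points and joints in the actual analysis is not specified in the text, so the input format and function names here are assumptions.

```python
def covariance_matrix(samples):
    """Sample covariance matrix.

    samples: list of observations, each a list with one value per variable
    (for example, one velocity value per joint DoF at a given time point).
    """
    n, d = len(samples), len(samples[0])
    means = [sum(s[j] for s in samples) / n for j in range(d)]
    return [
        [sum((s[i] - means[i]) * (s[j] - means[j]) for s in samples) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n, det = len(m), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return det


def generalized_variance(samples):
    """GV: determinant of the covariance matrix of the samples."""
    return determinant(covariance_matrix(samples))


if __name__ == "__main__":
    # Toy data: two variables, four observations.
    print(generalized_variance([[1, 2], [2, 1], [3, 4], [4, 3]]))
```

Because the determinant multiplies variances along all principal axes, the GV shrinks both when overall variability drops and when variables become more strongly correlated.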
Principal component analysis (PCA) across joints for the velocity profiles per trial for each subject showed that in the EVR subjects used more degrees of freedom in their movement than in the real-world task (Figure 5C&D). While in both environments, in all trials, ∼90% of the variance can be explained by the first PC, there is a slow but consistent rise in the number of PCs that explain more than 1% of the variance in the joint velocity profiles (Figure 5C). The manipulative complexity, suggested by Belić and Faisal [39] as a way to quantify complexity for a given number of PCs on a fixed scale (C = 1 implies that all PCs contribute equally, and C = 0 if one PC explains all data variability), showed the same trend (Figure 5D). This suggests that, in both environments, over trials subjects use more degrees of freedom in their movement; and in EVR they used slightly more DoF than in the real-world task.
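Both measures can be computed from the variance explained by each PC, as sketched below. The complexity function is one normalization that satisfies the two endpoints stated above (C = 1 for equal contributions, C = 0 when a single PC explains everything). The published definition in [39] may be written differently, so this should be read as an assumption rather than the authors' exact formula. Function names are hypothetical.

```python
def explained_fractions(pc_variances):
    """Fraction of total variance per PC, sorted from largest to smallest."""
    total = sum(pc_variances)
    return [v / total for v in sorted(pc_variances, reverse=True)]


def count_pcs_above(pc_variances, threshold=0.01):
    """Number of PCs that each explain more than `threshold` of the variance."""
    return sum(1 for f in explained_fractions(pc_variances) if f > threshold)


def manipulative_complexity(pc_variances):
    """Complexity on a fixed 0-1 scale from the cumulative explained variance.

    Sums the shortfall (1 - cumulative fraction) over PCs and rescales so that
    equal contributions give 1 and a single dominant PC gives 0.
    """
    fractions = explained_fractions(pc_variances)
    n = len(fractions)
    if n < 2:
        raise ValueError("need at least two principal components")
    cumulative, shortfall = 0.0, 0.0
    for fraction in fractions:
        cumulative += fraction
        shortfall += 1.0 - cumulative
    return 2.0 * shortfall / (n - 1)


if __name__ == "__main__":
    # Hypothetical variance per PC: one dominant component plus a small tail.
    variances = [90.0, 6.0, 3.5, 0.5]
    print(count_pcs_above(variances), manipulative_complexity(variances))
```

The normalization follows from the two extremes: with N equal PCs the summed shortfall is (N − 1)/2, and with one dominant PC it is 0, so multiplying by 2/(N − 1) maps the range onto [0, 1].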
As a measure of task performance in body space we use the Velocity Profile Error (VPE), as in Haar et al. [18]. VPE is defined as the mean correlation distance (one minus the Pearson correlation coefficient) between the velocity profile of each joint in each trial and the velocity profiles of that joint in all successful trials. As in the real-world task, in the EVR environment we found that VPE shows a clear pattern of decay over trials in an exponential learning curve for all joints (Figure 6A). Intertrial variability in joint movement was also measured over the VPEs in each block. Unlike the real-world task, where learning was evident in the decay over learning of the VPE intertrial variability, in the EVR there was no such decay in most joints (Figure 6B). This is in line with the lack of decay in the intertrial variability of the directional error (Figure 3C&D).
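The VPE definition translates directly into code, as in the sketch below for a single joint DoF. The function names are hypothetical, and the text does not state whether a successful trial is compared against itself, so the sketch simply uses whatever reference profiles it is given.

```python
import math


def correlation_distance(a, b):
    """One minus the Pearson correlation between two velocity profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


def velocity_profile_error(trial_profile, successful_profiles):
    """Mean correlation distance from one trial to all successful trials."""
    distances = [correlation_distance(trial_profile, ref)
                 for ref in successful_profiles]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Toy profiles: one reference matches the trial shape, one is inverted.
    trial = [0.0, 1.0, 2.0, 1.0, 0.0]
    references = [[0.0, 2.0, 4.0, 2.0, 0.0], [0.0, -1.0, -2.0, -1.0, 0.0]]
    print(velocity_profile_error(trial, references))
```

Because the correlation distance ignores amplitude, the VPE captures the shape of the velocity profile: it ranges from 0 for identically shaped profiles to 2 for inverted ones.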
Velocity Profile Error (VPE) and Intertrial variability reduction across all joints in the EVR task. (A) The trial-by-trial VPE for all 3 DoF of all joints, averaged across all subjects, with an exponential fit. The time constants of the fits are reported under the title color coded for the DoF (blue: flexion/extension; red: abduction/adduction; green: internal/external rotation). (B) VPE intertrial variability over blocks of 25 trials, averaged across all subjects.
Discussion
In this paper, we present a novel embodied VR framework capable of providing controlled manipulations in order to better study naturalistic motor learning in a complex real-world setting. By interfacing real-world objects into the VR environment, we were able to provide the SoE subjects experience in the real-world, which is missing in most VR environments. We demonstrate the similarities and differences between the learning in the EVR environment and learning in the real-world environment. This is the first study to directly compare full body motor learning in a real-world task between VR and the real-world.
There is much evidence that humans can learn motor skills in VR and transfer learning from the VR to the real-world, but also a real need to enhance it to make VR really useful for rehabilitation applications [for review see 20]. Theory suggests that transfer should be enhanced when VR simulates the real-world as closely as possible. The physical interaction with the real-world objects in our EVR is adding haptic information about interaction forces with virtual objects, which is lacking in most VR setups, and should enhance learning and transfer.
Comparison of the motor learning in the EVR to learning in the real-world task showed many similarities but also intriguing differences. The main trends over learning that were found in the real-world task include the decrease in directional error, decrease in directional intertrial variability, decrease in shoulder movement and increase in elbow rotation, decrease in joints VPE, and decrease in joints VPE intertrial variability [18]. In the EVR environment we found the same general trends for all these metrics, except for those of the intertrial variability. We do, however, see a systematic difference in learning rates between VR and real-world when comparing the directional error and VPE trends. Across the board we see less learning in VR compared to the real-world.
Decay in intertrial variability over learning is considered to be a feature of motor skill learning and specifically of motor acuity [40,41]. The lack of such decay and the overall differences in the learning curve suggest potential differences in the learning mechanisms used by the subjects who learned the task in the VR. These differences may be attributed to the fact that all subjects are completely naïve to the EVR environment and must learn not just the billiards task but also how to operate in the VR.
Limitations
The comparison of the ball trajectories between the EVR and the real-world environments highlights the similarity in the ball directions, which is the main parameter that determines task error and success. Nevertheless, there were significant velocity differences between the environments. These velocity differences were set to optimize subjects’ experience, accounting for deviations in ball physics due to friction, spin, and follow through, which were not modeled in the VR. Due to these deviations, in VR the cue ball reaches its max velocity almost instantaneously, while in the real-world there is an acceleration phase. For the current version of the setup we neglected these differences, assuming they would not affect the SoE of very naïve pool players. Future studies, testing experts on the setup, would require more accurate game physics in the EVR. Another limitation of the current EVR setup is that subjects are unable to see their own limbs in the environment, whereas in the real-world the positions of the subject’s own limbs may influence how the task is learned. We can probably neglect this difference due to the extensive literature suggesting that learning is optimized by external focus of attention [for review see 42]. Thus, the lack of body vision should not significantly affect learning. Lastly, our setup is limited to visual perturbations and cannot be used to manipulate haptic force feedback.
Conclusions
In this study we have developed an embodied VR framework capable of applying visual feedback manipulations for a naturalistic free-moving real-world skill task. We have demonstrated the similarities in learning for a billiards shot between the EVR and the real-world and have confirmed the findings that motor learning is holistic. By manipulating the visual feedback in the EVR we can now further investigate the relationships between the distinct learning strategies employed by humans for this real-world motor skill. Such a setup can also be highly useful for rehabilitation, as it addresses the lack of embodiment that limits the transfer to the real-world in regular VR setups. Considering the potential of VR-based rehabilitation and the increasing popularity of VR in rehabilitation [20], the importance of EVR is clear.
Ethics statement
All experimental procedures were approved by Imperial College Research Ethics Committee and performed in accordance with the declaration of Helsinki. All subjects gave informed consent prior to participating in the study.
Availability of data and materials
The datasets used and/or analyzed in the current study are available from the corresponding author on reasonable request.
Competing interests
The authors declare no competing financial interests.
Funding
The study was enabled by financial support to a Royal Society-Kohn International Fellowship (NF170650; SH & AAF).
Authors’ contributions
SH and AAF conceived and designed the study; SH, GS and AAF developed the experimental setup; GS acquired the data; GS and SH analyzed the data; SH, GS and AAF interpreted the data; SH drafted the paper; SH and AAF revised the paper.
Acknowledgements
We thank our participants for taking part in the study.
Abbreviations
- EVR: Embodied virtual reality
- DoF: Degrees of freedom
- MoCap: Motion capture
- SoE: Sense of Embodiment
- IMU: Inertial measurement unit
- HMD: Head-mounted display
- RMSE: Root mean squared error
- GV: Generalized variance
- PCA: Principal component analysis
- VPE: Velocity profile error