Abstract
Self-agency, the sense that one is the author or owner of one’s behaviors, is impaired in multiple psychological and neurological disorders, including functional movement disorders (FMDs), Parkinson’s Disease, alien hand syndrome, schizophrenia, and dystonia. Existing assessments of self-agency, many of which focus on agency of movement, can be prohibitively time-consuming and often yield ambiguous results. Here, we introduce a short online motion tracking task that quantifies movement agency through both first-order perceptual and second-order metacognitive judgments. The task assesses the degree to which a participant can distinguish between a motion stimulus whose trajectory is influenced by the participant’s cursor movements and a motion stimulus whose trajectory is random. We demonstrate the task’s reliability in healthy participants and discuss how its efficiency, reliability, and ease of online implementation make it a promising new tool for both diagnosing and understanding disorders of agency.
Introduction
Self-agency refers to the feeling that one is the cause of one’s own behaviors (Tsakiris et al., 2007). Impairments in self-agency have been observed in multiple psychological and neurological disorders, such as functional movement disorders (FMDs) (Fried et al., 2017), Parkinson’s disease (Ricciardi et al., 2017), alien hand syndrome (Biran & Chatterjee, 2004), schizophrenia (Pierre, 2014), and dystonia (Delorme et al., 2016). However, existing measurements of agency are rarely applied in diagnosis due to concerns about unreliability (Armstrong & Okun, 2020; Jinnah & Factor, 2015; Patel et al., 2014). Behavioral tasks in general have been shown to have low test-retest reliability as measures of individual differences and predictors of real-life outcomes (Eisenberg et al., 2019; Enkavi et al., 2019), suggesting limited efficacy in diagnostic procedures. Moreover, prominent behavioral measurements of agency have their own specific limitations. For example, the Libet clock paradigm, which is widely used to assess movement-based self-agency in experimental settings (Libet et al., 1983), has been criticized for confounding perceived timing of voluntary movement initiation and execution with timing for memory processing (Gomes, 2002), and for its general susceptibility to memory-based biases (Dennett & Kinsbourne, 1992; Libet, 1985; Matsuhashi & Hallett, 2008). Taken together, these findings suggest that a reliable and unbiased behavioral method for assessing agency could represent a major step forward in the diagnosis of agency-related disorders (Hauser et al., 2011).
Among the disorders mentioned above, we focus on FMDs here, as their diagnosis is particularly challenging. FMDs are movement disorders characterized by body movements or postures that patients cannot control, and that have no known neurological basis (Kranick & Hallett, 2013). The most prominent feature of FMDs is the lack of a sense of agency over body movements or postures (Fried et al., 2017). A sense of agency is acquired if little or no discrepancy is observed between movement intention and sensory feedback (Voon et al., 2010). However, patients with FMDs report impaired movement intention, which leads to a frequent mismatch between intention and feedback. For example -- setting concerns about the reliability of the Libet clock paradigm to the side temporarily -- in contrast to healthy controls, FMD patients show no difference in reported intention and movement times on the Libet clock paradigm (Edward et al., 2011; Libet et al., 1983). Further, the right temporo-parietal junction (rTPJ) is thought to contribute to the generation of self-agency by comparing the prediction of movements with actual sensory feedback (Decety & Lamm, 2007; Farrer et al., 2008). Decreased connectivity between the rTPJ and sensorimotor regions has been observed in FMD patients, therefore suggesting that an impairment in the ability to generate a sense of movement agency may underlie symptoms (Maurer et al., 2016).
Current diagnosis of FMDs is ambiguous and time-consuming (Brown & Thompson, 2001). One challenge with FMD diagnosis is that about 25% of patients with FMDs have other neurological illnesses, including separate movement disorders with known neurological causes (Factor et al., 1995). These separate movement disorders may hinder detection of FMDs given their similarity in symptoms. It is also difficult to determine whether certain psychological factors, such as anxiety, are causes or consequences of FMDs (Brown & Thompson, 2001).
Though the Fahn and Williams clinical classification of FMDs is widely used (Brown & Thompson, 2001; Kranick et al., 2013; McAuley & Rothwell, 2004), the diagnosis requires a large time investment. According to this classification, an FMD patient’s disorder should be “inconsistent over time or incongruent with a recognized [movement disorder], in association with other related features” (Fahn & Williams, 1988; Williams et al., 1995). Confirmation of this criterion requires extended observation and a clinician’s thorough understanding of all prominent types of movement disorders. Because the early diagnosis of FMDs can benefit recovery (Brown & Thompson, 2001), a more efficient tool to supplement such procedures is needed.
In the current study, we introduce an adaptive online sensorimotor task, in which participants make agency judgments based on the extent to which their cursor movements affect otherwise randomly moving dots, to quantify their sense of movement agency, and validate its implementation in healthy participants. Providing an efficient and unbiased agency measure, this task has the potential to improve diagnostic efficacy not only for FMDs, but for agency-related disorders in general.
Methods
Participants
We conducted two experiments with minor procedural differences through Amazon Mechanical Turk (mTurk). In Experiment 1, the data of 99 participants (71 males, average age = 29.66) was collected, and 53 participants were included in the data analysis (23 males, average age = 35.40; see Exclusion Criteria). In Experiment 2, the data of 94 participants (49 males, average age = 36.89) was collected, and 58 participants (31 males, average age = 37.67) were included in the data analysis.
Participants received $4 for finishing the task. People who partially completed the task were paid at a rate of $4 per hour. An online consent form was given at the beginning of the study. The study was approved by the University of California, Los Angeles Institutional Review Board, and was carried out following the Declaration of Helsinki.
Stimuli and Apparatus
On each trial, two moving dots (dot A and dot B) were presented within separate circles for 4 or 2.5 seconds, in Experiment 1 and Experiment 2, respectively (Figure 1A). Each dot had an independent, pseudorandom trajectory. While the dots were presented, the participant could move the cursor to influence the trajectory of one of the two dots, hereafter referred to as the target dot. The target dot (A or B) was determined randomly at the beginning of each trial.
Independent of cursor movements, random dot trajectories were computed as follows. The position of each dot was updated on each display frame. At onset, or frame i = 1, each dot appeared at the center of its respective circle. The initial angle of motion for each dot, θ1 (Figure 1B), was randomly selected from a full 360-degree range. For successive display frames (i = 2 to i = n-1, where n is the total number of frames over which the dot stimuli were displayed), new angles of motion θi were randomly selected from a uniform distribution with a range of θi-1 ± 11.75 degrees. The movement distance from frame i to frame i+1, Xi (Figure 1B), was randomly selected from a uniform distribution from 0.5 to 2.5 pixels. Given the dot’s position on display frame i (Figure 1B, point Pi), its position on frame i+1 (Figure 1B, point Pi+1) was the location reached by moving from Pi along angle θi by a distance of Xi.
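The random-walk rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's JavaScript implementation: the function name `random_trajectory`, the `n_frames` argument, and the choice of the circle center as the origin are hypothetical, and border handling is left out.

```python
import math
import random

ANGLE_JITTER_DEG = 11.75             # half-width of the uniform range around the previous angle
STEP_MIN_PX, STEP_MAX_PX = 0.5, 2.5  # uniform range of the per-frame movement distance


def random_trajectory(n_frames, rng=None):
    """Return one (x, y) position per frame, starting at the circle center (0, 0).

    Hypothetical helper: n_frames depends on display refresh rate and stimulus
    duration, neither of which is fixed here. Border "bounces" are not modeled.
    """
    rng = rng or random.Random()
    x, y = 0.0, 0.0
    positions = [(x, y)]
    theta = rng.uniform(0.0, 360.0)  # initial heading drawn from the full circle
    for i in range(1, n_frames):
        if i > 1:
            # New heading lies within +/- 11.75 degrees of the previous heading.
            theta = rng.uniform(theta - ANGLE_JITTER_DEG, theta + ANGLE_JITTER_DEG)
        step = rng.uniform(STEP_MIN_PX, STEP_MAX_PX)
        x += step * math.cos(math.radians(theta))
        y += step * math.sin(math.radians(theta))
        positions.append((x, y))
    return positions
```

Passing a seeded `random.Random` instance makes a trajectory reproducible, which may be convenient when checking the step-length and heading constraints.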
Cursor movements affected the trajectory of the target dot as follows. Cursor movement at frame i+1 was represented by a vector, yi+1 (Figure 1B), that pointed from the cursor’s coordinates at the time point when the coordinates of point Pi were calculated (but before Pi was drawn), to the cursor’s coordinates at the onset of frame i+1. The amount of control that cursor movements had over the target dot’s trajectory varied from trial to trial. To implement this, the cursor movement vector’s length was multiplied by the participant’s level of control (between 0 and 100%) on a given trial.
The position of the target dot on frame i+1 was computed by first moving the dot from position Pi+1 along the adjusted mouse movement vector to a new position, Pi+1’ (Figure 1B). The target dot’s final display position on frame i+1, Pi+1’’, was then computed by moving the dot from Pi toward point Pi+1’ along the straight line connecting points Pi and Pi+1’ by a distance of Xi.
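As a rough illustration of this two-step update, the sketch below shifts the random-walk position by the scaled cursor vector and then places the dot at distance Xi from Pi along the resulting direction. It assumes that the final move starts at Pi, and it treats the case in which the shifted point coincides with Pi as "no movement"; the function and argument names are hypothetical and do not come from the study code.

```python
import math


def target_dot_position(p_i, p_next, cursor_vec, control, step):
    """Return the displayed target-dot position on frame i+1.

    p_i        -- (x, y) position on frame i
    p_next     -- (x, y) random-walk position for frame i+1
    cursor_vec -- (dx, dy) cursor displacement since p_i was computed
    control    -- level of control as a proportion between 0.0 and 1.0
    step       -- movement distance X_i in pixels
    """
    # Step 1: shift the random-walk position by the scaled cursor vector.
    shifted_x = p_next[0] + control * cursor_vec[0]
    shifted_y = p_next[1] + control * cursor_vec[1]

    # Step 2: move from p_i toward the shifted point by exactly `step` pixels.
    dx, dy = shifted_x - p_i[0], shifted_y - p_i[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return p_i  # assumption: with no defined direction, the dot stays put
    return (p_i[0] + step * dx / norm, p_i[1] + step * dy / norm)
```

Under this reading, the cursor changes only the direction of the target dot's motion, while its per-frame speed still equals the randomly drawn Xi. With zero control the function returns the unmodified random-walk position.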
If the non-target dot’s computed position at frame i+1 (Pi+1, Figure 1B) was outside its circular border (Figure 1A), its coordinates were re-calculated so as to make it appear to “bounce” off of the border. The new position was re-calculated by reflecting the coordinates over the tangent line of the circular border at the point where the dot was in the previous frame (Pi). For the target dot, if its final position (Pi+1’’) was outside the border, its final coordinates adopted the values of its old coordinates (Pi). This was intended to minimize the extent to which the direction of cursor movements opposed the motion of the target dot.
Both dots were black and had a radius of 8 pixels. The cursor was hidden during the task to prevent participants from inferring the location of the target dot from cursor movement. If the cursor reached the edge of the screen, the corresponding side of an outer rectangular border (Figure 1A) changed color from black to red to alert participants that they had reached an edge of the area in which their cursor movements could be recorded.
Both experiments were conducted through mTurk. The experimental code was written in JavaScript using jsPsych (version 6.0.5). Participants performed the task on their personal computers.
Procedure
On each trial in Experiment 1, a fixation cross was presented at the center of the screen for 0.5 seconds. Dot stimuli were then shown for 4 seconds. These were followed by a 0.5 s blank screen, after which participants answered two questions. They first indicated which dot, A or B, they were better able to control. Then, they reported their confidence in this agency judgment on a scale of one to four. A rating of one corresponded to a complete guess. A two meant that the judgment was better than a guess but the participant was unsure about it. A three meant that they were almost certain, and a four meant that they had no doubt in their judgment. Participants had unlimited time to respond.
Each session started with five practice trials that implemented an adaptive staircasing procedure (Kingdom & Prins, 2010). The control level of the first practice trial was 25%, as pilot data confirmed that healthy subjects have 100% accuracy at this level. If the participant’s answer on a given trial was correct, the control level was reduced by 2.5% for the next trial. If the answer was incorrect, the control level was increased by 7.5% for the next trial. Participants would repeat the whole set of practice trials if their total accuracy was below 80% correct.
Following the practice, the main task employed two randomly interleaved one-up/one-down adaptive staircases with differentially weighted step sizes (Kingdom & Prins, 2010). The ratio of down- to up-step magnitudes following correct and incorrect responses, respectively, was 0.33. This procedure was designed to estimate the control level at which participants are 75% correct in their agency judgments.
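The link between the 0.33 step ratio and the 75% target can be made explicit with the usual equilibrium argument for weighted up/down staircases: the staircase stops drifting when the expected downward movement equals the expected upward movement. The symbols below are introduced here for illustration only and do not appear in the original text.

```latex
% p: probability of a correct response at the convergence point
% \Delta_{\text{down}}, \Delta_{\text{up}}: step sizes after correct and incorrect responses
p \,\Delta_{\text{down}} = (1 - p)\,\Delta_{\text{up}}
\quad\Longrightarrow\quad
p = \frac{\Delta_{\text{up}}}{\Delta_{\text{up}} + \Delta_{\text{down}}}
  = \frac{1}{1 + \Delta_{\text{down}}/\Delta_{\text{up}}}
  = \frac{1}{1 + 0.33} \approx 0.75
```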
The control level on the first trial of each staircase was 2.5% and the initial down step size was 0.5%. Reversal trials were trials in which accuracy was different from that of the previous trial of the current staircase. For each staircase, the down step size was reduced to 0.25% after the second reversal, to 0.1% after the sixth reversal, and to 0.02% after the twelfth reversal. The experiment ended after both staircases accumulated 13 reversals. The maximum number of trials allowed was 100 trials in each staircase.
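A minimal sketch of one such staircase is given below. It follows the step sizes and reversal rule stated above, but several details are assumptions made for illustration: the class and method names are hypothetical, the level stored for a reversal is the level presented on the reversal trial, a new step size takes effect immediately once the relevant reversal is counted, and control levels are clamped to the 0 to 100% range.

```python
DOWN_UP_RATIO = 0.33  # down-step magnitude divided by up-step magnitude


def down_step(n_reversals, final_step=0.02):
    """Down-step size (in % control) given the number of reversals so far."""
    if n_reversals >= 12:
        return final_step
    if n_reversals >= 6:
        return 0.1
    if n_reversals >= 2:
        return 0.25
    return 0.5


class Staircase:
    """Weighted one-up/one-down staircase (illustrative, not the study code)."""

    def __init__(self, start=2.5, final_step=0.02, max_reversals=13, max_trials=100):
        self.level = start
        self.final_step = final_step
        self.max_reversals = max_reversals
        self.max_trials = max_trials
        self.n_trials = 0
        self.prev_correct = None
        self.reversal_levels = []

    def update(self, correct):
        """Register the response at the current level and set the next level."""
        self.n_trials += 1
        # A reversal is a trial whose accuracy differs from the previous trial.
        if self.prev_correct is not None and correct != self.prev_correct:
            self.reversal_levels.append(self.level)
        self.prev_correct = correct
        step = down_step(len(self.reversal_levels), self.final_step)
        if correct:
            self.level = max(0.0, self.level - step)
        else:
            self.level = min(100.0, self.level + step / DOWN_UP_RATIO)

    @property
    def done(self):
        return (len(self.reversal_levels) >= self.max_reversals
                or self.n_trials >= self.max_trials)
```

In the task itself, two such staircases would be interleaved at random, with the session ending once both have finished.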
Experiment 2 was the same as Experiment 1 except for the following changes. To improve task efficiency, dot stimulus duration was decreased to 2.5 seconds, and participants only had 2 seconds to answer each question. Additionally, participants were given an optional break of up to 5 minutes in the middle of the task (after they had finished six reversal trials in both staircases). Finally, the last down step size was increased to 0.07% to prevent the staircasing procedure from becoming prematurely constrained to an overly narrow range of control values.
Data Analysis
Percent control thresholds on the dot trajectory task were estimated by computing the average control level across the last five reversal trials in each staircase. The final control threshold estimation for each participant was the average of the two staircases’ thresholds.
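For concreteness, the threshold computation can be sketched as follows, assuming that each staircase's reversal-trial control levels are available as a list in chronological order. The function names are hypothetical.

```python
from statistics import mean


def staircase_threshold(reversal_levels, n_last=5):
    """Average control level over the last `n_last` reversal trials of one staircase."""
    if len(reversal_levels) < n_last:
        raise ValueError("not enough reversal trials to estimate a threshold")
    return mean(reversal_levels[-n_last:])


def participant_threshold(reversals_a, reversals_b):
    """Participant-level threshold: the mean of the two staircase thresholds."""
    return mean([staircase_threshold(reversals_a), staircase_threshold(reversals_b)])
```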
Participants’ metacognitive sensitivity was quantified using the area under the Type 2 receiver operating characteristic curve (Type 2 AUROC; Galvin et al., 2003). According to Signal Detection Theory, participants’ tasks in the current paradigm can be divided into two types: identifying the target dot is considered a Type 1 task and reporting confidence is considered a Type 2 task. The Type 2 AUROC reflects subjects’ ability to track their performance on the Type 1 task with confidence ratings (Galvin et al., 2003). A Type 2 AUROC of 0.5 indicates that the participant has no metacognitive sensitivity, while a Type 2 AUROC of 1 indicates optimal metacognitive sensitivity (i.e., all correct responses are endorsed with high confidence while all incorrect responses are rated with low confidence). Statistical analyses were conducted in MATLAB R2018b (Natick, MA) and R version 3.6.0 (Vienna, Austria).
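One common way to obtain a Type 2 AUROC from four-point confidence ratings is to sweep a criterion across the rating scale, compute Type 2 hit rates (confidence at or above the criterion on correct trials) and Type 2 false alarm rates (the same on incorrect trials), and integrate the resulting curve with the trapezoidal rule. The sketch below illustrates that approach; it is not necessarily the exact routine used for the analyses reported here, and the function name is hypothetical.

```python
def type2_auroc(correct, confidence, levels=(1, 2, 3, 4)):
    """Area under the Type 2 ROC curve.

    correct    -- iterable of booleans (Type 1 accuracy on each trial)
    confidence -- iterable of confidence ratings, one per trial
    """
    conf_correct = [c for ok, c in zip(correct, confidence) if ok]
    conf_incorrect = [c for ok, c in zip(correct, confidence) if not ok]
    if not conf_correct or not conf_incorrect:
        raise ValueError("need at least one correct and one incorrect trial")

    # ROC points, from the strictest criterion to the most lenient one.
    hit_rates, fa_rates = [0.0], [0.0]
    for criterion in sorted(levels, reverse=True):
        hit_rates.append(sum(c >= criterion for c in conf_correct) / len(conf_correct))
        fa_rates.append(sum(c >= criterion for c in conf_incorrect) / len(conf_incorrect))

    # Trapezoidal integration of hit rate over false alarm rate.
    return sum(
        (fa_rates[k] - fa_rates[k - 1]) * (hit_rates[k] + hit_rates[k - 1]) / 2
        for k in range(1, len(fa_rates))
    )
```

With this construction, ratings that perfectly separate correct from incorrect trials give an area of 1, and ratings that carry no information about accuracy give 0.5.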
Exclusion criteria
Catch trials in which participants had a high level of control (25%) were randomly inserted in between staircasing trials such that they accounted for approximately 15% of the total trial number. In Experiment 1, participants who missed more than 40% of catch trials were excluded (N = 39). Participants who used extreme confidence ratings (one or four) more than 95% of the time (N = 3) were also excluded because such biases would impede a meaningful analysis of metacognitive scores (Maniscalco & Lau, 2012). No participant was excluded due to a control level threshold that was more than three standard deviations away from the mean threshold level across participants. We also excluded participants whose d’ scores (Maniscalco & Lau, 2012, 2014) on the agency judgment task (N = 4), computed with hits corresponding to correctly choosing dot A and correct rejections corresponding to correctly choosing dot B, were less than or equal to zero, as such scores imply a lack of effort.
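The d’ criterion can be illustrated with the standard formula d’ = z(hit rate) - z(false alarm rate), where a hit is choosing dot A when A was the target and a false alarm is choosing dot A when B was the target. The sketch below is an assumption-laden example rather than the analysis code: the function name is hypothetical, and the log-linear correction (adding 0.5 to counts and 1 to trial totals) is an assumption introduced here to keep rates of 0 or 1 finite.

```python
from statistics import NormalDist


def d_prime(n_hits, n_target_a, n_false_alarms, n_target_b):
    """Type 1 sensitivity for the two-choice agency judgment.

    n_hits         -- trials with target A on which A was chosen
    n_target_a     -- total trials with target A
    n_false_alarms -- trials with target B on which A was chosen
    n_target_b     -- total trials with target B
    """
    # Assumed log-linear correction so that extreme rates do not yield infinite z-scores.
    hit_rate = (n_hits + 0.5) / (n_target_a + 1)
    fa_rate = (n_false_alarms + 0.5) / (n_target_b + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

Under the exclusion rule described above, a participant would be dropped if this value were less than or equal to zero.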
Similar exclusion criteria were applied to participants in Experiment 2. Twenty-eight participants were excluded for missing more than 40% of the catch trials. No participants were excluded because of extreme confidence rating or abnormal control level thresholds. Finally, seven participants were excluded for negative d’ scores, and one participant was excluded due to incomplete data acquisition.
Results
The distributions of percent control thresholds estimated in healthy participants in Experiments 1 (M = 1.89%, SD = 2.31%) and 2 (M = 1.48%, SD = 1.79%) are shown in Figure 2A and 2B, respectively. The reduction in stimulus duration and allotted response times from Experiment 1 to Experiment 2 did not significantly change observed control thresholds, t(97.95) = 1.06, p = 0.29.
The trial counts in these experiments (Experiment 1: M = 83.64, SD = 21.93; Experiment 2: M = 83.24, SD = 21.89) were lower than that recommended for the unbiased metacognitive sensitivity estimation procedure used to estimate meta-d’ (Maniscalco & Lau, 2012). Therefore, the method of Type 2 sensitivity estimation used here (Type 2 AUROC) is susceptible to being biased by Type 1 performance. Because the control levels at the start of each thresholding procedure were high by design, participants’ Type 1 performance before the third reversal trial was relatively inflated. Thus, we computed Type 2 AUROC based on participants’ performance after the third reversal trial in order to minimize the extent to which any initial inflation of Type 1 accuracy would bias the estimation of metacognitive sensitivity. Type 2 AUROC scores for Experiments 1 (M = 0.73, SD = 0.09) and 2 (M = 0.74, SD = 0.08) are shown in Figure 2C and 2D, respectively. No significant difference was observed between the two experiments, t(99.99) = 1.01, p = 0.32.
Test-retest reliability of control level threshold estimation was assessed by evaluating the correlation between the average of each participant’s control levels on the 8th to 10th reversal trials and the average of the participant’s control levels on the 11th to 13th reversal trials. Spearman rank correlation tests were used to minimize the influence of extreme values. Positive correlations were observed in both experiments [Experiment 1: r(51) = 0.89, p < 0.001, Figure 3A; Experiment 2: r(56) = 0.84, p < 0.001, Figure 3B]. Because control thresholds were densely clustered at low levels, as shown by Shapiro-Wilk tests [Experiment 1, reversals 8-10: W(52) = 0.75, p < 0.001; Experiment 1, reversals 11-13: W(52) = 0.69, p < 0.001; Experiment 2, reversals 8-10: W(57) = 0.69, p < 0.0001; Experiment 2, reversals 11-13: W(57) = 0.66, p < 0.0001; Figure 3A,B], split-half relationships between log-transformed control thresholds are shown (Figure 4) to visually confirm that the observed reliability was not simply driven by extreme values. To further validate the test-retest reliability of control threshold estimation, we also found that the two control thresholds estimated for each participant from the eighth through thirteenth reversals of each individual staircase were significantly correlated [Experiment 1: r(51) = 0.82, p < 0.0001; Experiment 2: r(56) = 0.82, p < 0.0001].
To again avoid the influence of early inflation of Type 1 performance, data after the third reversal trial was used to check the test-retest reliability of Type 2 AUROC estimates. A positive correlation was found between participants’ Type 2 AUROC in the first and second half of Experiment 1, but not in Experiment 2 [Experiment 1: r(51) = 0.43, p < 0.01, Figure 3C; Experiment 2: r(56) = 0.20, p = 0.13, Figure 3D].
Discussion
The dot trajectory task introduced here was designed to quickly and reliably quantify participants’ sense of movement agency. The present test-retest reliability results suggest that agency control thresholds can be reliably estimated from this relatively short (fewer than 100 trials) online thresholding procedure. Significant test-retest reliability was observed for Type 2 AUROC only in Experiment 1, showing that, across experiments, metacognitive sensitivity was a less stable measure than agency control thresholds. This relative decrease in reliability from a Type 1 to a Type 2 measure may be attributable to additional noise in the signal processing underlying Type 2 decisions (Maniscalco & Lau, 2016; Peters et al., 2017). However, the significant test-retest Spearman correlation for Type 2 AUROC observed in Experiment 1 suggests that further investigation of this measure is warranted (e.g., in case its reliability can be improved by simple procedural modifications like an increase in trial counts).
The current task has several advantages over previous paradigms for quantifying self-agency. First, it avoids biases and memory confounds that may be inherent to paradigms using subjective timing judgments like the Libet clock paradigm (Matsuhashi & Hallett, 2008) or paradigms based on the intentional binding effect (Haggard & Clark, 2003; Ricciardi et al., 2017). Further, it improves upon other paradigms that have moved away from subjective timing judgments (e.g., Daprati et al., 1997; Schimansky et al., 2010) in the use of both adaptive staircasing and metacognitive judgments. Adaptive staircasing increases the task’s flexibility and sensitivity to individual differences between participants, and also allows for between-subjects comparisons of metacognitive sensitivity (e.g., between patient and control groups) that are not confounded by Type 1 performance (Lau, 2008). Importantly, this ensures that, if a difference in Type 1 control thresholds is indeed observed between patients and healthy controls, this difference will not obscure any potential further differences in metacognitive sensitivity between patients and controls. In this regard, future results from this task could also potentially shed light on an extant debate about the extent to which the sense of agency is metacognitive in nature (Chambon et al., 2014). Further, the significant test-retest reliability for agency control thresholds observed in both experiments suggests that the task overcomes a common limitation of weak measurement stability in behavioral tasks (Enkavi et al., 2019). However, to further confirm this, the current results should be replicated in a larger sample, as larger sample sizes can reduce estimates of measurement reliability (Enkavi et al., 2019).
Going forward, the next step will be to compare these measures between patients and healthy controls. Continuing with the example of FMDs, we hypothesize that patients’ control level thresholds will be higher than those of healthy controls. The dot trajectory task requires participants to compare predicted dot trajectories with real trajectories. While the real trajectories are represented by visual input, predicted trajectories are a combination of perceived trajectories and participants’ inner prediction based on hand movements. Because FMD patients have impaired inner predictions (Voon et al., 2010), their predicted trajectories should be less accurate than those of the controls. Therefore, patients with FMDs should require greater control levels to reach 75% accuracy on the dot trajectory task than controls. Similarly, because FMD patients display impaired somatosensory metacognition (Edward et al., 2011; Kranick et al., 2013), we predict that they will have lower Type 2 AUROC compared to healthy controls.
The dot trajectory task may also have broader applicability for the diagnosis of other agency-related disorders whose primary symptoms are not movement-based. For example, some schizophrenia patients have demonstrated an excessive sense of self-agency, such that they tend to consider movements generated by others as being self-generated (Garbarini et al., 2016; Schimansky et al., 2010). Such patients may experience difficulty distinguishing dot movements generated by themselves from those created by the program. This would be expected to manifest in a response profile similar to that predicted for FMD patients: higher Type 1 error rates (and thus, higher control thresholds) and inflated confidence ratings on incorrect trials (and thus, lower Type 2 AUROC) relative to healthy participants.
The present study has at least two limitations. First, participants provided Type 1 and Type 2 responses in two separate questions with a fixed order. This may allow information accumulated after the Type 1 response to be incorporated into Type 2 judgments, which could lead to inaccurate assessment of metacognitive sensitivity (Fleming & Daw, 2017). Thus, future studies may benefit from asking participants to indicate the target dot and rate confidence simultaneously (e.g., Matthews et al., 2020). Second, the motivation for healthy controls and patients may be different. Patients may participate more actively as the task is directly related to their condition, while online subjects are often motivated primarily by moderate monetary rewards. Such discrepancies could make it more difficult to observe any real differences between the two populations, thereby limiting the task’s diagnostic potential. Therefore, any procedural modifications that may increase motivation in healthy participants, such as increased monetary rewards or bonuses, could be important moving forward.
In conclusion, the dot trajectory task introduced here is able to efficiently and reliably estimate two new measures of movement self-agency in healthy participants. We hope that future implementations of this novel task in studies with patients can improve our ability to both understand and diagnose agency-related disorders.
Link to code and data: https://github.com/ShiyunCMC/FMD_public