Genetic studies of accelerometer-based sleep measures in 85,670 individuals yield new insights into human sleep behaviour

Sleep is an essential human function but its regulation is poorly understood. Identifying genetic variants associated with quality, quantity and timing of sleep will provide biological insights into the regulation of sleep and potential links with disease. Using accelerometer data from 85,670 individuals in the UK Biobank, we performed a genome-wide association study of 8 accelerometer-derived sleep traits, 5 of which are not accessible through self-report alone. We identified 47 genetic associations across the sleep traits (P < 5×10⁻⁸) and replicated our findings in 5,819 individuals from 3 independent studies. These included 26 novel associations for sleep quality and 10 for nocturnal sleep duration. The majority of newly identified variants were associated with a single sleep trait, except for variants previously associated with restless legs syndrome that were associated with multiple sleep traits. Of the newly associated and replicated sleep duration loci, we were able to fine-map a missense variant (p.Tyr727Cys) in PDE11A, a dual-specificity 3’,5’-cyclic nucleotide phosphodiesterase expressed in the hippocampus, as the likely causal variant. As a group, sleep quality loci were enriched for serotonin processing genes and all sleep traits were enriched for cerebellar-expressed genes. These findings provide new biological insights into sleep characteristics.


INTRODUCTION
Sleep is an essential human function, but many aspects of its regulation remain poorly understood. Adequate sleep is important for health and wellbeing, and changes in sleep quality, quantity and timing are strongly associated with several human diseases and psychiatric disorders [1][2][3][4][5]. Identifying genetic variants influencing sleep traits will provide new insights into the molecular regulation of sleep in humans and help to establish the genetic contribution to causal links between sleep and associated chronic diseases, such as diabetes and obesity 6 [19][20][21][22][23]. Polysomnography (PSG) is regarded as the "gold standard" method of quantifying nocturnal sleep traits, but it is impractical to perform in large cohorts. Additionally, PSG is relatively burdensome for the participant, making it less suitable for measuring sleep over multiple nights and capturing inter-daily variability. Research-grade activity monitors (accelerometers), also known as actigraphy devices, provide cost-effective estimates of sleep using validated algorithms 24,25. However, accelerometer-based studies have often involved much smaller sample sizes than those required for GWAS and have generally focussed on day-time activity 26,27. The UK Biobank study is a unique resource collecting vast amounts of clinical, biomarker, and questionnaire data on ~500,000 UK residents. Of these, 103,000 participants wore activity monitors continuously for up to 7 days. This provides an unprecedented opportunity to derive accelerometer-based estimates of sleep quality, quantity and timing and to assess the genetics of sleep traits.
In this study we identify genetic variants associated with objective measures of sleep and rest-activity patterns and use them to further understand the biology of sleep.
We used accelerometer data from the UK Biobank to extract estimates of sleep characteristics using a heuristic method previously validated using independent PSG datasets 28,29. We analysed a total of 8 accelerometer-based measures of sleep quality (sleep efficiency and the number of nocturnal sleep episodes), timing (sleep midpoint, timing of the least active 5 hours (L5), and timing of the most active 10 hours (M10)), and duration (diurnal inactivity and nocturnal sleep duration and variability) by performing a GWAS in 85,670 UK Biobank participants, and assessed replication of the findings in 3 independent studies. Our analysis primarily focuses on traits that cannot be captured, or are unavailable, from self-report sleep measures, and are likely to be underpowered for GWAS in studies with PSG data due to limited sample sizes.

Measures of sleep quality and quantity are not correlated with sleep timing
Descriptive statistics and correlations between the eight accelerometer-derived phenotypes are shown in Supplementary Tables 1 and 2. We observed little observational correlation (R) between measures of sleep timing and measures of nocturnal sleep duration and quality (-0.10 ≤ R ≤ 0.12). These negligible or limited correlations between timing and duration are consistent with data from chronotype and self-report sleep duration measures (R = -0.01). We also observed limited correlation between sleep duration and sleep quality as represented by the number of nocturnal sleep episodes (R = 0.14) but observed a stronger correlation between sleep duration and sleep efficiency (R = 0.57). The correlation between self-reported sleep duration and accelerometer-derived sleep duration was 0.19, and that between self-reported chronotype ("morningness") and L5 timing was -0.29.

Accelerometer-derived estimates of sleep patterns are heritable
To estimate the proportion of variance attributable to genetic factors for a given trait, we used BOLT-REML to estimate SNP-based heritability (h²SNP) (

Low genetic correlation between self-reported and accelerometer-derived sleep duration
To quantify the genetic overlap between accelerometer-derived and self-reported sleep traits, we performed genetic correlation analyses using LD-score regression as implemented in LD-Hub 30. We observed strong genetic correlations between L5, M10 and sleep midpoint timing and chronotype (rG > 0.79) and a weaker genetic correlation between objective and self-reported sleep duration (rG = 0.43).
This observation suggests differences in the genetic contribution to variation in self-reported versus objective sleep duration.

Forty-seven genetic associations identified across the accelerometer-derived sleep traits
To identify genetic loci associated with accelerometer-derived sleep traits, we performed a genome-wide association analysis of 11,977,111 variants in up to 85,670 individuals for the 8 accelerometer-derived sleep traits. We identified 47 genetic associations across 7 of the phenotypes at the standard GWAS threshold (P < 5×10⁻⁸). Among these associations, 20 reached a more stringent threshold of P < 8×10⁻¹⁰. We estimate that this threshold better controls the type 1 error rate given the approximate number of independent genetic variants analysed (Table 2 and Supplementary Figs 1-2). Twenty-six associations were observed for sleep quality measures, including 21 variants associated with number of nocturnal sleep episodes and 5 associated with sleep efficiency (8 and 2 at P < 8×10⁻¹⁰, respectively). An additional 8 genetic associations were identified for sleep and activity timing. These included 6 associated with L5 timing, 1 associated with M10 timing, and 1 associated with mid-point sleep. Only 3 associations with L5 timing were detected at P < 8×10⁻¹⁰. Finally, for sleep duration we observed 13 associations: 11 for sleep duration and 2 associated with diurnal inactivity (6 and 1 at P < 8×10⁻¹⁰, respectively). Of these 47 associations reaching P < 5×10⁻⁸ and the 20 associations reaching P < 8×10⁻¹⁰, 31 and 9 were not previously reported in studies based on self-report measures, respectively (Table 2). The variance explained by all the discovered loci ranged from 0.04% for sleep midpoint timing to 0.8% for number of nocturnal sleep episodes. The lambda GC observed across these analyses ranged from 1.03 (sleep duration variability) to 1.14 (number of nocturnal sleep episodes), while LD-Score intercepts ranged from 1.03 (diurnal inactivity) to 1.07 (sleep midpoint timing). These results suggest that any inflation of test statistics observed is more likely to be due to the polygenicity of the phenotypes tested than to population stratification.
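The lambda GC values quoted above follow the conventional definition of the genomic inflation factor: the median observed chi-square association statistic divided by the median of a chi-square distribution with 1 degree of freedom (about 0.455). The following is a minimal sketch of that conventional calculation only; the function name and the use of per-variant z-scores as input are illustrative assumptions and do not describe the study's own pipeline.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom, obtained as
# the square of the 75th percentile of the standard normal (about 0.4549).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def lambda_gc(z_scores):
    """Genomic inflation factor from per-variant z-scores (hypothetical input).

    Each z-score is squared to give a 1-df chi-square statistic; the median of
    these is divided by the expected median under the null.
    """
    chi2 = [z * z for z in z_scores]
    return median(chi2) / CHI2_1DF_MEDIAN
```

A value close to 1 indicates little inflation. Distinguishing polygenicity from population stratification requires a further step, such as the LD-Score intercept reported in the text, which this sketch does not attempt.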

Replication of 47 genetic associations in 5,819 individuals
We attempted to replicate our findings in up to 5,819 adults from the Whitehall (N=2,144), CoLaus (N=2,257), and Rotterdam Study (subsample from RS-I, RS-II and RS-III, N=1,418) who had worn similar wrist-worn tri-axial accelerometer devices for a duration comparable to that of the UK Biobank participants. Individual study and meta-analysis results for the three replication studies are presented in Supplementary Table 3. Of the 20 associations reaching P < 8×10⁻¹⁰, 18 were directionally consistent in the replication cohort meta-analyses (P binomial = 3×10⁻⁴). Of the additional 27 signals, 18 were directionally consistent in the replication meta-analysis (P binomial = 0.03). For traits with more than one SNP associated at P < 5×10⁻⁸ in the UK Biobank, we combined the effects of each SNP (aligned to the trait-increasing allele) and tested them in the replication data. In the combined effects analysis, we observed

Variants associated with sleep quality include known restless legs syndrome, sleep duration, and cognitive decline associated variants
Of the 5 variants associated with sleep efficiency, one was the strongly associated PAX8 sleep duration signal 11 and one was a restless legs syndrome/insomnia associated signal (MEIS1) 17,31. Of the 20 loci associated with number of nocturnal sleep episodes, one is represented by the APOE variant (rs429358). This variant is a proxy for the APOE ε4 risk allele that is strongly associated with late-onset Alzheimer's disease and cognitive decline 32. The  Table 5). This finding is inconsistent with the observational association between cognitive decline in older age and poorer sleep quality [33][34][35][36]. We also noted that the APOE  Table 6 and Supplementary Methods).

Six association signals identified for accelerometer-derived measures of sleep timing
We identified 6 loci associated with L5 timing, of which 3 have not previously been associated with self-report chronotype but have been associated with restless legs syndrome 31  However, our lead SNP (rs12927162) was not in LD with the previously reported index variant at this locus (rs45544231, LD r² = 0.004). There were minimal differences in effect sizes when we performed a range of sensitivity analyses, including removing individuals on depression medication, adjusting for BMI and lifestyle factors, and splitting the cohort by median age (Supplementary Table 6 and Supplementary Methods).

Ten novel sleep duration loci identified from accelerometer-derived sleep duration GWAS
We identified 11 loci associated with accelerometer-derived sleep duration, including ten not previously reported to be associated with self-report sleep duration, despite and Supplementary Table 7). This lower overlap in signals is consistent with the lower genetic correlation between self-reported and objective sleep duration than between chronotype and objective measures of sleep and activity timing. The lead variants representing the ten new sleep duration loci all had the same direction and larger effects in the accelerometer data compared to self-report data, with effect sizes ranging from 1.3 to 5.9 minutes compared to 0. To check that associations were not influenced by age-related differences in sleep, we confirmed that there was no difference in effect sizes between younger and older individuals (below and above the median age of 63.7 years) (Supplementary Table 6).

Fine-mapping analysis identifies multiple likely causal variants
To identify credible SNP sets likely to contain causal variants within 500 kb of lead SNPs (log₁₀ Bayes factor > 2) for each trait with a genetic association (P < 5×10⁻⁸) we used FINEMAP 37

Associated loci are enriched for genes expressed in the cerebellum and serotonin pathway-related genes
We used MAGMA 40 to assess tissue enrichment of genes at associated loci across the sleep traits. All traits showed an enrichment of genes in the cerebellum (Supplementary Figures 3 and 4). Loci associated with number of nocturnal sleep episodes were enriched for genes involved in serotonin pathways (P Bonferroni = 0.0003) (Supplementary Table 9).

Multiple sleep traits have genetic variants previously associated with restless legs syndrome
We observed most variants to be associated with either sleep quality, duration, or timing, but not combinations of these sleep characteristics. However, the variant rs113851554 at the MEIS1 locus was associated with sleep quality (sleep efficiency), duration, and timing (L5). In addition, the variant rs9369062 at the BTBD9 locus was associated with both sleep duration and L5 timing. Both variants have previously been reported as associated with restless legs syndrome (Figure 2 Table 13).

Estimates of the effects on accelerometer-derived and self-report-derived sleep traits are correlated
We compared effects of variants associated with self-reported sleep duration and chronotype identified in parallel GWAS analyses.   Variants with nominal evidence of association with self-reported sleep duration had weaker effects. This difference may be due to reporting biases related to the UK Biobank questionnaire (e.g. response was in hourly increments) and to participants being asked to include nap-time in their sleep duration. In contrast, the accelerometer-derived estimates of L5 timing, the least active 5 hours of the day, correlated well with self-report estimates. These data suggest that the answer to the very simple question "are you a morning or evening person" provides similar power to wearing accelerometers for 7 days and nights. In a parallel GWAS analysis, the PAX8 variant was also associated with self-report insomnia ( This was particularly true of the participants who took part in the activity monitor study.
In conclusion, we have performed the first large-scale GWAS of objective sleep measures. We demonstrate that self-report measures are good proxies for objective sleep measures, but the use of objective measures of sleep quality allowed us to identify additional loci not identified by previous self-report GWAS, including potential new therapeutic targets for poor sleep.

Data availability
The full set of GWAS summary statistics for all eight accelerometer-based measures are available at http://www.t2diabetesgenes.org/data/.

UK Biobank participants
The study population was drawn from the UK Biobank study -a

Activity-monitor Devices
A triaxial accelerometer device (Axivity AX3) was worn between 2.8 and 9.7 years after study baseline by 103,711 individuals from the UK Biobank for a continuous period of up to 7 days. Details of data collection and processing have been previously described 56

Accelerometer data processing and sleep measure derivations
We derived 8 measures of sleep quality, quantity and timing. All measures were derived by processing raw accelerometer data (.cwa). We first converted the .cwa files available from the UK Biobank to .wav files using "omconvert" for signal calibration to gravitational acceleration 56,57 and interpolation 56  Inactivity bouts that are <60 minutes apart are combined to form inactivity blocks.
The start and end of the longest block defined the start and end of the SPT-window.
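The block-merging rule described above can be sketched as follows. This is an illustrative reading of the two sentences only, assuming inactivity bouts have already been detected and are given as (start, end) pairs in minutes; the function name, the input format, and the tie-breaking (the first longest block wins) are assumptions rather than details of the published algorithm.

```python
def spt_window(bouts, max_gap_min=60):
    """Merge inactivity bouts less than max_gap_min apart; return the longest block.

    bouts: list of (start_min, end_min) tuples in minutes from an arbitrary origin.
    Returns the (start, end) of the longest merged block, or None if there are no bouts.
    """
    if not bouts:
        return None
    ordered = sorted(bouts)
    blocks = [list(ordered[0])]
    for start, end in ordered[1:]:
        if start - blocks[-1][1] < max_gap_min:
            # Gap is shorter than the threshold: extend the current block.
            blocks[-1][1] = max(blocks[-1][1], end)
        else:
            # Gap is too long: start a new block.
            blocks.append([start, end])
    longest = max(blocks, key=lambda b: b[1] - b[0])
    return tuple(longest)


# Hypothetical night: a short daytime bout, two night-time bouts 40 minutes
# apart (merged), and a later bout 100 minutes after the merged block.
example = [(600, 630), (1380, 1500), (1540, 1800), (1900, 1930)]
window = spt_window(example)  # (1380, 1800)
```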
Sleep duration and variability. Sleep episodes within the SPT-window were defined as periods of at least 5 minutes with no change larger than 5° associated with the z-axis of the activity-monitor, as motivated and described in van Hees et al. (2015).
The summed duration of all sleep episodes was used as the indicator of sleep duration within the SPT-window. The total duration over the activity-monitor wear-time was averaged. Individuals with an average sleep duration <3 hours or >12 hours were excluded from all analyses. In addition, the standard deviation of sleep duration was calculated and put forward for statistical analysis for individuals with 7 days of accelerometer wear.
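The episode definition and the per-person summary above can be illustrated with a short sketch. It assumes the z-axis angle is available as one value per fixed-length epoch (5 seconds here, which is an assumption not stated in this section), treats "7 days of wear" as exactly 7 nightly values, and uses the sample standard deviation; all function names are hypothetical and this is not the authors' implementation.

```python
from statistics import mean, stdev


def sleep_episodes(z_angle, epoch_s=5, max_change_deg=5.0, min_duration_s=300):
    """Return (start, end_exclusive) epoch index pairs for runs of at least
    min_duration_s in which no epoch-to-epoch angle change exceeds max_change_deg."""
    episodes = []
    run_start = 0
    for i in range(1, len(z_angle) + 1):
        # A run ends at the end of the series or when the angle changes too much.
        ended = i == len(z_angle) or abs(z_angle[i] - z_angle[i - 1]) > max_change_deg
        if ended:
            if (i - run_start) * epoch_s >= min_duration_s:
                episodes.append((run_start, i))
            run_start = i
    return episodes


def sleep_duration_hours(episodes, epoch_s=5):
    """Summed duration of all sleep episodes, in hours."""
    return sum(end - start for start, end in episodes) * epoch_s / 3600


def summarise_nights(night_hours):
    """Average nightly duration and its SD, applying the <3 h or >12 h exclusion.

    Returns None for excluded individuals; the SD is only reported when
    exactly 7 nightly values are available (an assumption of this sketch).
    """
    avg = mean(night_hours)
    if avg < 3 or avg > 12:
        return None
    sd = stdev(night_hours) if len(night_hours) == 7 else None
    return avg, sd
```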
Sleep efficiency. This was calculated as sleep duration (defined above) divided by the time elapsed between the start of the first inactivity bout and the end of the last inactivity bout (which equals the SPT-window duration).
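As a worked illustration of this ratio, the sketch below divides sleep duration by the SPT-window duration. The function name and the choice of hours as the unit are assumptions made for the example.

```python
def sleep_efficiency(sleep_duration_h, spt_start_h, spt_end_h):
    """Sleep duration divided by the SPT-window duration (all values in hours)."""
    spt_duration = spt_end_h - spt_start_h
    if spt_duration <= 0:
        raise ValueError("SPT-window must have a positive duration")
    return sleep_duration_h / spt_duration


# Hypothetical night: 6 hours of sleep inside an 8-hour SPT-window
# running from hour 23 to hour 31 (07:00 the next morning).
efficiency = sleep_efficiency(6.0, 23.0, 31.0)  # 0.75
```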

Number of nocturnal sleep episodes within the SPT-window. This was defined as the number of sleep episodes within the SPT-window. Individuals with an average number of nocturnal sleep episodes ≤5 or ≥30 were excluded from all analyses.
Sleep-midpoint timing. Sleep midpoint was calculated for each sleep period as the midpoint between the start of the first detected sleep episode and the end of the last sleep episode used to define the overall SPT-window (above). This variable is represented as the number of hours from the previous midnight.
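The sleep-midpoint calculation can be sketched as below. The text does not specify which midnight serves as the reference for a given sleep period, so the function takes it as an explicit argument; in the example it is set to the start of the calendar day on which sleep began, which is an assumption of this sketch, as are the function and parameter names.

```python
from datetime import datetime


def sleep_midpoint_hours(onset, wake, reference_midnight):
    """Midpoint between sleep onset and final wake, in hours since reference_midnight."""
    midpoint = onset + (wake - onset) / 2
    return (midpoint - reference_midnight).total_seconds() / 3600


# Hypothetical sleep period from 23:00 to 07:00 the next morning.
onset = datetime(2015, 6, 1, 23, 0)
wake = datetime(2015, 6, 2, 7, 0)
midnight = datetime(2015, 6, 1, 0, 0)  # assumed reference midnight
hours = sleep_midpoint_hours(onset, wake, midnight)  # 27.0, i.e. 03:00 the next day
```

Counting hours from a midnight before sleep onset keeps values for sleep periods that straddle midnight on one continuous scale, so that midpoints at 23:30 and 00:30 differ by 1 hour rather than 23.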
Diurnal inactivity. Diurnal inactivity was estimated by the total daily duration of estimated bouts of inactivity that fell outside of the SPT-window. This measure captures very inactive states such as napping and wakeful rest but not inactivity such as sitting and reading or watching television, which are associated with a low but detectable level of movement.

Comparison against self-reported sleep measures
We performed analyses of self-reported measures of sleep. Self-reported measures analysed included a) the number of hours spent sleeping over a 24-hour period (including naps); b) insomnia; c) chronotype, where "definitely a 'morning' person", "more a 'morning' than 'evening' person", "more an 'evening' than a 'morning' person", "definitely an 'evening' person" and "do not know" were coded as 2, 1, -1, -2 and 0, respectively, in our continuous variable.
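The chronotype coding described above amounts to a simple lookup. The sketch below encodes the five responses and their values exactly as listed; the case-insensitive matching and the treatment of any other response as missing (None) are assumptions of this example.

```python
# Response text (lower-cased) mapped to the continuous chronotype score.
CHRONOTYPE_CODES = {
    "definitely a 'morning' person": 2,
    "more a 'morning' than 'evening' person": 1,
    "do not know": 0,
    "more an 'evening' than a 'morning' person": -1,
    "definitely an 'evening' person": -2,
}


def code_chronotype(response):
    """Return the numeric chronotype score, or None for unlisted responses."""
    return CHRONOTYPE_CODES.get(response.strip().lower())
```

With this coding, higher values indicate greater "morningness", in keeping with the negative correlation between self-reported chronotype and L5 timing reported earlier.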

Statistical Analysis
Genome-wide association analyses. We performed all association tests in the UK Biobank using BOLT-LMM v2.3 59

Fine-mapping association signals
Fine-mapping analyses were performed using FINEMAP v1.2 37  Combined variant effects on respective traits were subsequently calculated using the 'metan' function in STATA using the betas and standard errors obtained through the primary meta-analysis of the three replication studies.
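Pooling per-variant betas and standard errors, as done here with 'metan', is commonly carried out by inverse-variance weighting. The sketch below shows that generic calculation under a fixed-effect assumption; the text does not state which model was used in 'metan', so the fixed-effect choice, the function name and the example values are all assumptions.

```python
from math import sqrt


def inverse_variance_pool(betas, ses):
    """Fixed-effect inverse-variance weighted estimate and its standard error."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = sqrt(1.0 / total)
    return beta, se


# Hypothetical effects for two variants aligned to the trait-increasing allele.
pooled_beta, pooled_se = inverse_variance_pool([0.1, 0.2], [0.5, 1.0])
```

Each estimate is weighted by the inverse of its variance, so more precisely estimated effects contribute more to the combined value.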
Sensitivity Analysis. To assess whether stratification was responsible for any of the individual variant associations in a subset of the cohort, we performed multiple sensitivity analyses in unrelated European subsets of the UK Biobank using STATA.
The sensitivity analyses carried out were: 1) males only, 2) females only, 3) individuals younger than the median age (at start of the activity monitor wear period),      Association tests were carried out for all phenotypes on both the raw scale and inverse-normalised scale. Sensitivity analysis included: 1) in males only, 2) in females only, 3) in those lower than median age at actigraphy (63.7 years), 4) in those greater than or equal to the median age, 5) in all European unrelated but adjusting for BMI in addition to standard adjustments, 6) in all European unrelated but also adjusting for BMI and lifestyle factors, and 7) excluding those reporting shift work, having self-report or hospital-recorded mental health or sleep disorders, and those taking anxiolytic, antipsychotic, antidepressant or sleep medication. Lifestyle adjustments for analysis (6) and exclusions for analysis (7) are described in greater detail in the Supplementary Methods. and sleep duration GWAS in UK Biobank are also provided.
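An inverse-normalised scale, as mentioned above, is commonly obtained with a rank-based inverse normal transformation. The sketch below shows one widely used variant; the Blom offset (c = 3/8) and the averaging of tied ranks are assumptions, since the text does not say which variant was applied.

```python
from statistics import NormalDist


def inverse_normalise(values, c=3.0 / 8.0):
    """Rank-based inverse normal transform; tied values share an average rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j across a block of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tie block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    # Map each rank to a quantile of the standard normal distribution.
    return [nd.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]
```

The transformed values depend only on the ordering of the inputs, which makes analyses on this scale robust to skew and outliers in the raw phenotype.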