Abstract
Risk-taking behaviour is a key component of several psychiatric disorders and could influence lifestyle choices such as smoking, alcohol use and diet. Risk-taking behaviour therefore fits within a Research Domain Criteria (RDoC) approach, whereby elucidation of the genetic determinants of this trait has the potential to improve our understanding across different psychiatric disorders. Here we report a genome-wide association study in 116 255 UK Biobank participants who responded yes/no to the question “would you consider yourself a risk-taker?” Risk-takers (compared to controls) were more likely to be men, to be smokers and to have a history of mental illness. Genetic loci associated with risk-taking behaviour were identified on chromosomes 3 (rs13084531) and 6 (rs9379971). The effects of both lead SNPs were comparable between men and women. The chromosome 3 locus highlights CADM2, previously implicated in cognitive and executive functions, but the chromosome 6 locus is challenging to interpret due to the complexity of the HLA region. Risk-taking behaviour shared significant genetic risk with schizophrenia, bipolar disorder, attention deficit hyperactivity disorder and post-traumatic stress disorder, as well as with smoking and total obesity. Despite being based on only a single question, this study furthers our understanding of the biology of risk-taking behaviour, a trait which has a major impact on a range of common physical and mental health disorders.
Introduction
Risk-taking behaviour is an important aspect of several psychiatric disorders, including attention deficit hyperactivity disorder (ADHD), bipolar disorder (BD) and schizophrenia, as well as problem behaviours such as smoking and drug and alcohol misuse 1-3. Physical health problems such as obesity might also be considered to be related to increased propensity towards risk-taking: obesity includes aspects of reward processing, response inhibition and decision-making 4. The Research Domain Criteria (RDoC) approach suggests that studying dimensional psychopathological traits (rather than discrete diagnostic categories) may be a more useful strategy for identifying biology which cuts across psychiatric diagnoses 5. In this respect, risk-taking behaviour is an important phenotype for investigation. It may also be useful for investigating the overlap between psychiatric disorders and conditions such as obesity, which include components of poor impulse control and abnormal reward processing.
To date, there has been no focused genetic study of risk-taking behaviour. GWAS of related phenotypes, such as impulsivity and behavioural disinhibition, have so far been underpowered for detecting associations at a genome-wide level. Here we conduct a GWAS of self-reported risk-taking behaviour in 116 255 participants from the UK Biobank cohort. We use expression quantitative trait loci analysis to highlight plausible candidate genes and we assess the extent to which there is a genetic correlation between risk-taking and several mental and physical health disorders, including ADHD, BD, major depressive disorder (MDD), post-traumatic stress disorder (PTSD), anxiety, alcohol use disorder, smoking and obesity.
Materials and methods
Sample
UK Biobank is a large population cohort which aims to investigate a diverse range of factors influencing risk of diseases which are common in middle and older age. Between 2006 and 2010, more than 502 000 participants (aged 40 to 69 years) were recruited from 22 centres across the UK 6. Comprehensive baseline assessments included social circumstances, cognitive abilities, lifestyle and measures of physical health status. The present study used the first release of genetic data on approximately one third of the UK Biobank cohort. In order to maximise homogeneity, we included only participants of (self-reported) white United Kingdom (UK) ancestry.
Informed consent was obtained by UK Biobank from all participants. This study was carried out under the generic approval from the NHS National Research Ethics Service (approval letter dated 13 May 2016, Ref 16/NW/0274) and under UK Biobank approval for application #6553 “Genome-wide association studies of mental health” (PI Daniel Smith).
Risk-taking phenotype
The baseline assessment of UK Biobank participants included the question “Would you describe yourself as someone who takes risks?” (data field #2040), to which participants replied yes or no. Of the sample with available genetic data, a total of 29 703 individuals responded ‘yes’ to this question (here referred to as ‘risk-takers’) and 86 552 responded ‘no’ (here referred to as controls). N=3 769 individuals preferred not to say or were missing this data and were excluded from analysis. Characteristics of the individuals included in this analysis are described in Table 1.
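The case/control coding described above can be illustrated with a minimal Python sketch. The response strings and the helper name `code_risk_taking` are hypothetical and do not reflect UK Biobank's actual data coding for field #2040; only the yes/no/excluded logic is taken from the text.

```python
from collections import Counter

def code_risk_taking(response):
    """Map a self-report answer to 1 (risk-taker), 0 (control) or None (excluded).

    The string labels are assumptions made for illustration only.
    """
    answer = (response or "").strip().lower()
    if answer == "yes":
        return 1
    if answer == "no":
        return 0
    return None  # 'prefer not to say' or missing: excluded from analysis

# Toy responses (not real data)
responses = ["yes", "no", "No", None, "prefer not to say", "yes"]
coded = [code_risk_taking(r) for r in responses]
counts = Counter(coded)
print(counts[1], "risk-takers,", counts[0], "controls,", counts[None], "excluded")
```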
Genotyping, imputation and quality control
The first release of genotypic data from UK Biobank, in June 2015, included 152 729 UK Biobank participants. Samples were genotyped with either the Affymetrix UK Biobank Axiom array (Santa Clara, CA, USA; approximately 67%) or the Affymetrix UK BiLEVE Axiom array (33%), which share at least 95% of content. Autosomal data only were available.
Imputation of the data has previously been described in the UK Biobank interim release documentation 7. In brief, SNPs were excluded prior to imputation if they were multiallelic or had minor allele frequency (MAF) <1%. A modified version of SHAPEIT2 was used for phasing and IMPUTE2 (implemented on a C++ platform) was used for the imputation 8,9. A merged reference panel of 87 696 888 biallelic variants on 12 570 haplotypes constituted from the 1000 Genomes Phase 3 and UK10K haplotype panels 10 was used as the basis for the imputation. Imputed variants with MAF <0.001% were filtered out of the dataset used for subsequent analysis.
The Wellcome Trust Centre for Human Genetics applied stringent quality control, as described in UK Biobank documentation 11, before release of the genotypic data set. UK Biobank genomic analysis exclusions were applied (Biobank Data Dictionary item #22010). Participants were excluded from analyses due to relatedness (#22012: genetic relatedness factor; one member of each set of individuals with KING-estimated kinship coefficient >0.0442 was removed at random), sex mismatch (reported compared to genetic) (#22001: genetic sex), non-Caucasian ancestry (#22006: ethnic grouping; self-reported and based on principal component analysis of genetic data), and quality control failure (#22050: UK BiLEVE Affymetrix quality control for samples and #22051: UK BiLEVE genotype quality control for samples). SNPs were removed due to deviation from Hardy–Weinberg equilibrium at P<1×10⁻⁶, MAF <0.01, imputation quality score <0.4 and >10% missingness in the sample after excluding genotype calls made with <90% posterior probability.
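The SNP-level exclusion criteria above amount to a simple filter over per-variant summary statistics. The following Python sketch shows one way to express them; the `VariantStats` record, its field names and the toy variant identifiers are hypothetical, and the statistics are assumed to have been computed beforehand by standard tools.

```python
from dataclasses import dataclass

@dataclass
class VariantStats:
    snp_id: str
    hwe_p: float        # Hardy-Weinberg equilibrium test p value
    maf: float          # minor allele frequency
    info: float         # imputation quality score
    missingness: float  # fraction of samples without a confident genotype call

def passes_qc(v, hwe_min=1e-6, maf_min=0.01, info_min=0.4, miss_max=0.10):
    """Return True if the variant survives all four thresholds quoted in the text."""
    return (
        v.hwe_p >= hwe_min
        and v.maf >= maf_min
        and v.info >= info_min
        and v.missingness <= miss_max
    )

# Toy variants (made-up values, for illustration only)
variants = [
    VariantStats("snp_a", hwe_p=0.20, maf=0.25, info=0.98, missingness=0.01),
    VariantStats("snp_b", hwe_p=1e-9, maf=0.30, info=0.95, missingness=0.02),  # fails HWE
    VariantStats("snp_c", hwe_p=0.50, maf=0.005, info=0.90, missingness=0.00),  # fails MAF
    VariantStats("snp_d", hwe_p=0.70, maf=0.10, info=0.35, missingness=0.03),  # fails info
    VariantStats("snp_e", hwe_p=0.90, maf=0.15, info=0.80, missingness=0.20),  # fails missingness
]
kept = [v.snp_id for v in variants if passes_qc(v)]
print(kept)
```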
Association analyses
A total of 116 255 individuals and 8 781 003 variants were included in the analysis. 29 703 participants were classed as risk-takers and 86 552 were controls. Association analysis was conducted in PLINK 12 using logistic regression, assuming a model of additive allelic effects. Models were adjusted for sex, age, genotyping array and the first eight genetic principal components (PCs; Biobank Data Dictionary items #22009.01 to #22009.08) to control for hidden population stratification. The threshold for GWAS significance was set at p<5×10⁻⁸.
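To make the additive logistic model concrete, the sketch below fits logit P(risk-taker) = b0 + b1 × dosage for a single variant by Newton-Raphson in plain Python. It is a simplified illustration rather than the PLINK implementation used in the study: the covariates (sex, age, array and PCs) are omitted for brevity, the function name is hypothetical and the data are invented.

```python
import math

GWAS_P_THRESHOLD = 5e-8  # genome-wide significance threshold quoted in the text

def fit_additive_logistic(dosages, cases, max_iter=50, tol=1e-10):
    """Fit logit P(case) = b0 + b1 * dosage; return (b1, SE of b1, Wald p value)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0          # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0  # Fisher information matrix entries
        for x, y in zip(dosages, cases):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = gradient
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    se_b1 = math.sqrt(h00 / det)  # from the inverse information matrix
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided Wald test
    return b1, se_b1, p_value

# Toy data: minor-allele dosage (0, 1 or 2) -> (number of cases, number of controls)
toy_counts = {0: (30, 70), 1: (40, 60), 2: (25, 25)}
dosages, cases = [], []
for dose, (n_case, n_ctrl) in toy_counts.items():
    dosages += [dose] * (n_case + n_ctrl)
    cases += [1] * n_case + [0] * n_ctrl

beta, se, p = fit_additive_logistic(dosages, cases)
print(f"beta={beta:.3f} se={se:.3f} p={p:.3g} significant={p < GWAS_P_THRESHOLD}")
```

In a real analysis the covariates would enter the linear predictor alongside the dosage term, which is why a dedicated tool such as PLINK is preferable to a hand-rolled fit.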
Data mining
SNPs associated (at genome-wide significance) with risk-taking behaviour were further investigated for influence on nearby genes (Variant Effect Predictor, VEP 13) and for reported associations with relevant traits (GWAS catalogue 14). Descriptions and known or predicted functions of implicated genes were compiled (GeneCards www.genecards.org and Entrez Gene www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene) and global patterns of tissue expression were assessed (GTEx 15). Exploratory analyses of the impact of significant loci on the expression of nearby genes were carried out using the GTEx Portal “Test your own eQTL” function 15. In the 13 brain regions available in the GTEx dataset, we tested for associations between rs13084531 and CADM2 expression, and between rs9379971 and the expression of POM121L2, PRSS16, ZNF204P and VN1R10P.
SNP heritability and genetic correlation analyses
Linkage Disequilibrium Score Regression (LDSR) 16 was applied to the GWAS summary statistics to estimate the risk-taking SNP heritability (h2SNP). LDSR was also used to assess genetic correlations between risk-taking behaviour and ADHD, schizophrenia, BD, MDD, anxiety, PTSD, smoking status (ever smoked), fluid intelligence, years of education, obesity and alcohol use disorder. Two measures of obesity were included: Body-Mass Index (BMI) as a measure of total obesity 17 and waist-to-hip ratio adjusted for BMI (WHRadjBMI), reflecting metabolically-detrimental central obesity 18. For ADHD, schizophrenia, BD, MDD, anxiety, PTSD and smoking status, we used GWAS summary statistics provided by the Psychiatric Genomics Consortium (http://www.med.unc.edu/pgc/) 19-25. For the two obesity phenotypes, GWAS summary statistics for BMI 17 and WHRadjBMI 18 were taken from the consortium for the Genetic Investigation of Anthropometric Traits (http://portals.broadinstitute.org/collaboration/giant). Summary statistics for years of education 26 and fluid intelligence 27 were downloaded as instructed in the respective publications. Alcohol use disorder was defined using DSM-5 criteria 28. For this phenotype, a GWAS meta-analysis on genotypes imputed to 1000 Genomes was run with five datasets: COGEND, COGEND2, COGEND-23andMe, COGA, and FSCD. In total there were N=2 983 cases with alcohol use disorder and N=1 169 controls. Descriptions of the datasets are in the Supplementary information.
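The idea behind LDSR can be sketched in a few lines of Python. Under the LD score regression model, the expected association chi-square statistic of a variant is approximately 1 + N·a + (N·h2/M) × (its LD score), where N is the sample size, M the number of variants and a captures confounding. The slope therefore estimates heritability and the intercept estimates inflation from confounding. The sketch below uses an unweighted least-squares fit on noise-free toy values; the published method additionally uses regression weights and block-jackknife standard errors, and all names and numbers here are hypothetical.

```python
def simple_ols(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def ld_score_h2(ld_scores, chi2_stats, n_samples, n_snps):
    """Return (h2 estimate, intercept) from regressing chi-square statistics on LD scores."""
    slope, intercept = simple_ols(ld_scores, chi2_stats)
    return slope * n_snps / n_samples, intercept

# Noise-free toy example generated from an assumed h2 of 0.05 and no confounding
n_samples, n_snps, true_h2 = 100_000, 1_000_000, 0.05
toy_ld_scores = [10.0, 50.0, 100.0, 200.0]
toy_chi2 = [1.0 + n_samples * true_h2 / n_snps * ld for ld in toy_ld_scores]

h2_hat, intercept_hat = ld_score_h2(toy_ld_scores, toy_chi2, n_samples, n_snps)
print(f"h2={h2_hat:.3f} intercept={intercept_hat:.3f}")
```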
Results
Demographic characteristics
Small differences were observed between controls and risk-takers with regard to age and BMI (Table 1), but striking differences were observed for sex distribution, smoking and history of mood disorders: risk-takers (compared to controls) were more often men, more likely to be current or ever-smokers and more likely to have a lifetime history of severe depression or BD. Risk-takers were also more likely to have a university/college degree.
GWAS of risk-taking behaviour
GWAS results for risk-taking are summarised in Figure 1 (Manhattan plot), Figure 1 inset (QQ plot) and Supplementary Table 1 (Genome-wide significant loci associated with risk-taking in UK Biobank: basic and conditional analyses).
The GWAS data test statistics showed modest deviation from the null (λGC =1.13). Considering the sample size, the deviation was negligible (λGC 1000=1.002). LDSR suggested that deviation from the null was due to a polygenic architecture in which h2SNP accounted for approximately 4% of the population variance in risk-taking behaviour (observed scale h2SNP (SE 0.006)), rather than inflation due to unconstrained population structure (LD regression intercept=1.003 (SE 0.008)).
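The sample-size-adjusted inflation factor can be illustrated with a short calculation. A commonly used rescaling expresses λGC as if the study had 1 000 cases and 1 000 controls; whether exactly this formula was used here is an assumption, and the function name is hypothetical. With the rounded inputs from the text it gives a value of about 1.003, consistent with the negligible inflation reported.

```python
def lambda_1000(lambda_gc, n_cases, n_controls, ref=1000):
    """Rescale a genomic inflation factor to `ref` cases and `ref` controls.

    Uses lambda_ref = 1 + (lambda - 1) * (1/n_cases + 1/n_controls) / (2/ref).
    """
    scale = (1.0 / n_cases + 1.0 / n_controls) / (2.0 / ref)
    return 1.0 + (lambda_gc - 1.0) * scale

# Inputs taken from the text: lambda_GC = 1.13, 29 703 risk-takers, 86 552 controls
adjusted = lambda_1000(1.13, 29_703, 86_552)
print(f"lambda_1000 = {adjusted:.4f}")
```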
Two loci were associated with risk-taking behaviour at genome-wide significance, on chromosome 3 and chromosome 6 (Figure 1 and Supplementary Table 1). The index SNP on chromosome (chr) 3, rs13084531, lies within the CADM2 gene; however, linkage disequilibrium (LD) suggests that the signal also encompasses miR5688 and borders a CADM2 anti-sense transcript (CADM2-AS2, Figure 2a). The minor allele of rs13084531 was associated with increased risk-taking (G allele, odds ratio (OR) 1.07, confidence interval (CI) 1.04-1.09, P=8.75×10⁻⁹). Conditional analysis of the chr3 locus (including rs13084531 as a covariate) was suggestive of a second signal (index SNP rs62250716, OR 0.96, CI 0.94-0.98, P=8.53×10⁻⁵, LD r2=0.16 with rs13084531, Figure 2b and Supplementary Table 1). The LD structure across the chr3 locus supports the possibility of two distinct signals (Supplementary Figure 1).
The chr6 locus lies within the gene-rich HLA region (Figure 2c), where index SNP rs9379971 demonstrated an association between the minor allele and decreased risk-taking (A allele, OR 0.95, CI 0.93-0.97, P=2.31×10⁻⁹). Conditional analysis (including rs9379971 as a covariate) and assessment of the LD structure across this locus indicated that the associated region probably includes only one signal (Figure 2d, Supplementary Table 1 and Supplementary Figure 2).
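Odds ratios and confidence intervals of this kind follow directly from a logistic regression coefficient (beta, on the log-odds scale) and its standard error. The sketch below applies the usual conversion to the combined-sample estimate for rs13084531 reported in the sex-stratified analysis (beta 0.067, SE 0.012). The 95% confidence level is an assumption, since the text does not state the level, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def odds_ratio_ci(beta, se, level=0.95):
    """Convert a log-odds coefficient and its SE to (OR, lower CI, upper CI)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

or_est, ci_low, ci_high = odds_ratio_ci(0.067, 0.012)
print(f"OR {or_est:.2f} (CI {ci_low:.2f}-{ci_high:.2f})")  # prints "OR 1.07 (CI 1.04-1.09)"
```

Under that assumption the result agrees with the OR and CI quoted for the chr3 index SNP.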
As there was a predominance of men in the risk-taking group, within a secondary analysis, the lead SNP from each locus was assessed in men and women separately. The effect of rs13084531 (chr3) was comparable between men (N=51 662, Beta 0.061, se 0.016, p=0.0001), women (N=48 812, Beta 0.075, se 0.018, p=4.37×10⁻⁵) and the combined dataset (Beta 0.067, se 0.012, p=8.75×10⁻⁹). For rs9379971 (chr6) the signal (Beta -0.063, se 0.011, p=2.31×10⁻⁹) appeared stronger in women (Beta -0.074, se 0.017, p=1.05×10⁻⁵) compared to men (Beta -0.044, se 0.014, p=0.0022), but sex-interaction analysis did not support a significant difference (p=0.1764).
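A quick way to see why the sex difference is not significant is to compare the two sex-specific betas with a z-test for independent estimates. This is an approximate check using the rounded values above, not the interaction model fitted in the study, and the function name is hypothetical.

```python
import math

def beta_difference_test(beta_a, se_a, beta_b, se_b):
    """Two-sided z-test for a difference between two independent estimates."""
    z = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# rs9379971 (chr6): men vs women, values as reported in the text
z_chr6, p_chr6 = beta_difference_test(-0.044, 0.014, -0.074, 0.017)
# rs13084531 (chr3): men vs women
z_chr3, p_chr3 = beta_difference_test(0.061, 0.016, 0.075, 0.018)
print(f"chr6: z={z_chr6:.2f}, p={p_chr6:.3f}; chr3: z={z_chr3:.2f}, p={p_chr3:.3f}")
```

For rs9379971 this approximation gives a p value of roughly 0.17, close to the reported interaction p value of 0.1764.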
Data mining
As with the majority of SNPs identified by GWAS, the genome-wide significant SNPs in both loci are non-coding. Current prediction models ascribe only non-coding modifier functions to the 81 genome-wide significant SNPs (VEP 13, Supplementary Table 2). Expression quantitative trait analysis directly tests association of the index SNPs with expression of nearby transcripts. The chr3 index SNP (rs13084531) lies within the CADM2 gene and adjacent to miR5688 and CADM2-AS2 (Figure 2 and Supplementary Table 3). Currently most miRs are predicted (but not reliably proven) to influence transcription of hundreds or thousands of genes. Furthermore, analysing transcription levels of miRs is challenging. Similarly, the importance of antisense transcripts such as CADM2-AS2 is unclear and difficult to assess. CADM2, which encodes cell adhesion molecule 2 (also known as synaptic cell adhesion molecule, SynCAM2), is a plausible target gene as it is predominantly expressed in the brain (Supplementary Figure 3a). The risk allele at rs13084531 was associated with increased CADM2 mRNA levels in several regions of the brain (including the caudate basal ganglia and putamen basal ganglia, hippocampus and hypothalamus, Supplementary Figure 4). CADM1, a related cell adhesion molecule, demonstrates overlapping and co-regulated (albeit inversely) expression patterns 29. It is worth noting that CADM1 shows a similar, albeit less brain-specific, expression pattern (Supplementary Figure 3b) and that genetic deletion of Cadm1 in mice results in behavioural abnormalities, including anxiety 30.
A recent GWAS of executive functioning and information processing speed in non-demented older adults from the CHARGE (Cohorts for Heart and Aging Research in Genomic Epidemiology) consortium found that genetic variation in the CADM2 gene was associated with individual differences in information processing speed 31. In the same study, pathway analysis implicated CADM2 in glutamate signalling and gamma-aminobutyric acid (GABA) transport 31. The allele of rs17518584 (LD r2=0.45 with rs13084531, LD r2=0.34 with rs62250716) associated with increased processing speed was associated with reduced (self-reported) risk-taking in the current study (Supplementary Table 4, p=1.17×10⁻⁷). Furthermore, a GWAS of educational attainment in the UK Biobank cohort demonstrated a significant signal in CADM2 32. The effect allele of rs56262138 (LD r2=0.00 with rs13084531, LD r2=0.00 with rs62250716) for increased educational attainment showed a negative effect on risk-taking behaviour (Supplementary Table 4, p=0.0210).
Day et al reported an association between the CADM2 locus and age of reproductive onset in UK Biobank 33. In a secondary analysis, they also report an association between the same locus, CADM2, and risk-taking behaviour. However, we have applied more stringent quality control procedures (exclusion of one of each related pair of individuals, exclusion of UK Biobank-defined non-Caucasians and exclusion of SNPs deviating from Hardy-Weinberg equilibrium) and standard analysis methods (logistic regression using PLINK and using eight principal components to adjust for possible population structure). The lead SNP reported by Day et al failed quality control in our analysis, making direct comparison between that study and the current one difficult.
The CADM2 locus has also been tentatively associated with longevity 34 (rs9841144, LD r2=0.99 with rs13084531, LD r2=0.16 with rs62250716), but associations between CADM2 SNPs and longevity, survival and attaining 100 years of age in that study were inconsistent, limiting the interpretation of these signals in the context of risk-taking behaviour.
The chr6 locus is a gene-dense region (Supplementary Table 3) with a highly complex LD structure. Global gene expression analysis (Supplementary Figure 5 A-C) highlighted POM121L2 (testis-specific expression), VN1R10P (predominantly expressed by testis) and ZNF192P2 (predominantly expressed by testis and brain) as potentially interesting genes. Although interpretation of the biological basis for association signals in this region is fraught with difficulty, the identification of associated eQTLs for genes expressed solely or predominantly in the testis is intriguing, particularly given the higher frequency of men in the group who consider themselves risk-takers.
No genotype-specific expression patterns were detected in the brain for ZNF192P2 or (unsurprisingly) POM121L2 and VN1R10P. Despite relatively low expression in the brain (Supplementary Figure 5 D), the minor allele of rs9379971 (associated with reduced risk-taking behaviour) was associated with increased levels of PRSS16 mRNA (Supplementary Figure 6 A and B). Expression of ZNF204P levels was prominent in the brain (Supplementary Figure 5 E) and reduced levels of ZNF204P mRNA were associated with the minor (reduced risk-taking) allele of rs9379971 (Supplementary Figure 6 C).
Genetic correlations
We found significant positive genetic correlations between the risk-taking phenotype and ADHD (rg=0.31, SE=0.13, p=0.01), schizophrenia (rg=0.27, SE=0.04, p=4.54×10⁻¹¹), BD (rg=0.26, SE=0.07, p=1.73×10⁻⁴), PTSD (rg=0.51, SE=0.17, p=0.0018) and smoking (rg=0.17, SE=0.07, p=0.01) and a negative genetic correlation with fluid intelligence (rg=-0.15, SE=0.05, p=0.0013, Table 2). We found no significant genetic correlation between risk-taking and MDD, anxiety or years of education (Table 2). There was also a significant genetic correlation between risk-taking and BMI (rg=0.10, SE=0.03, p=0.003), but a similar correlation was not found for WHRadjBMI. The estimated genetic correlation between alcohol use disorder and the risk-taking phenotype was positive but not significant (rg=0.22, SE=0.31, p=0.47); this analysis was likely underpowered due to the modest size of the alcohol use disorder GWAS (n=4 171) and we draw no conclusions about this correlation.
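The p values attached to these genetic correlations can be understood as two-sided tests of rg against zero, using the ratio of the estimate to its standard error. The Python sketch below recomputes approximate p values from the rounded rg and SE values in the text, so the results differ somewhat from the reported figures, which were presumably based on unrounded estimates. The helper name is hypothetical.

```python
import math

def rg_p_value(rg, se):
    """Two-sided p value for testing a genetic correlation against zero."""
    z = rg / se
    return math.erfc(abs(z) / math.sqrt(2.0))

# (rg, SE) pairs as reported in the text
reported = {
    "ADHD": (0.31, 0.13),
    "schizophrenia": (0.27, 0.04),
    "BD": (0.26, 0.07),
    "PTSD": (0.51, 0.17),
    "alcohol use disorder": (0.22, 0.31),
}
approx_p = {trait: rg_p_value(rg, se) for trait, (rg, se) in reported.items()}
for trait, p in approx_p.items():
    print(f"{trait}: p ~ {p:.2g}")
```

The wide standard error for alcohol use disorder illustrates why that estimate is uninformative despite a point estimate similar in size to those for schizophrenia and BD.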
Discussion
There is a growing emphasis on the importance of using phenotypic traits which cut across traditional diagnostic groups to investigate the biological basis of psychiatric disorders. Risk-taking behaviour is one such trans-nosological characteristic, recognised clinically as a feature of several disorders, including ADHD, schizophrenia and BD. In this study we identified two loci, on 3p12.1 and 6p22.1, that were associated with self-reported risk-taking behaviour.
The chr6 locus falls within the HLA region which encodes a large number of genes and is extremely complicated genetically. Further work is needed to be able to identify the genes responsible for the association with risk-taking behaviour identified here and to dissect the underlying behavioural mechanisms. The (non-significant) sex effect (that is, a stronger association in women) of the chr6 locus is intriguing, especially when considering that there are a number of testis-specific genes expressed in this locus. Further examination of this aspect of risk-taking behaviour would be of value when a larger dataset is available, thus offsetting the loss of power resulting from stratifying the analysis according to sex.
A key finding of our study was that the chr3 SNP rs13084531 was associated with both risk-taking behaviour and CADM2 expression levels: the allele associated with increased self-reported risk-taking behaviour was also associated with increased CADM2 expression. It is of interest that lack of Cadm1 in mice was associated with anxiety-related behaviour 30 and that both CADM1 and CADM2 were identified as BMI-associated loci 17, suggesting that CADM2 and related family members may be involved in balancing appetitive and avoidant behaviours.
Day and colleagues recently identified 38 genome-wide significant loci for age at first sexual intercourse within the UK Biobank cohort 33 and two of these loci were within the 3p12.1 region, close to CADM2 (rs12714592 and rs57401290). The association between rs57401290 (and SNPs in LD) and age at first sexual intercourse was also observed for a number of behavioural traits, including number of sexual partners, number of children and risk-taking propensity. In addition, CADM2 also showed association with information processing speed 31 and educational attainment 32, highlighting the complexity of relationships between cognitive performance and risk-taking. Taken together, this evidence suggests that CADM2 plays a fundamental role in risk-taking behaviours, and may be a gene involved in the nexus of cognitive and reward-related processes that underlie them.
A perhaps surprising observation was the increased frequency of having a university degree in self-reported risk-takers, compared to controls, despite the negative (albeit non-significant) genetic correlation between years of education and risk-taking behaviour. These observations underscore the complexity of the relationship between risk-taking and educational attainment, and highlight differences between genetic and phenotypic relationships. They may also be indicative of selection bias within the UK Biobank cohort towards more highly educated individuals.
Another key finding was the genetic correlation between self-reported risk-taking and obesity. Although there are likely to be a range of potential mechanisms linking risk-taking behaviour with obesity, evidence of a shared genetic component is in keeping with work that has highlighted the importance of the central nervous system in the regulation of obesity (BMI), particularly brain regions involved in cognition, learning and reward 17. In contrast, central fat accumulation (WHRadjBMI) is primarily regulated by adipose tissue 18, which fits with the lower, non-significant genetic correlation between risk-taking behaviour and this measure. Two SNPs (rs13078807 and rs13078960) in the CADM2 locus have previously been associated with BMI 17, 35, 36, but whilst these SNPs tag each other (LD r2=0.99), their LD with the risk-taking index SNP and with the possible secondary signal is low (LD r2=0.31 and 0.01 for rs13084531 and rs62250716 respectively), suggesting that these are distinct signals.
It is perhaps unsurprising that we identified genetic correlations between risk-taking and smoking. Similarly, risk-taking and impulsive behaviour are core features of ADHD and BD, suggesting substantial genetic overlap between variants predisposing to risk-taking behaviour and these disorders. The genetic correlation between risk-taking and schizophrenia is of interest because schizophrenia is commonly comorbid with substance abuse disorders 37. The correlation between risk-taking and PTSD is perhaps plausible if we accept that risk-takers may be more likely to find themselves in high-risk situations with the potential to cause psychological trauma. Overall, these correlations suggest that studying dimensional traits such as risk-taking has the potential to inform not only the biology of complex psychiatric disorders but might also represent a way to sub-classify or stratify individuals in terms of diagnosis and treatment.
Strengths and limitations
We acknowledge that Day et al have previously reported an association for risk-taking within the CADM2 locus. Strengths of our study include the use of a more conservative and standardised methodology and reporting of results across the entire genome. A risk-taking locus was identified in the HLA region as well as the CADM2 locus and we have shown that CADM2 may contain a second signal. Furthermore, we have investigated the possibility of a sex-specific effect of these loci, provided evidence highlighting possible candidate genes at both loci and confirmed the importance of this phenotype in relation to psychiatric illness. In short, our report provides a fuller understanding of the genetic basis of risk-taking behaviour. Despite this, we highlight some limitations. The risk-taking phenotype used was a self-reported measure, based on response to a single question, and is therefore open to responder bias. It is also plausible that there are distinct subtypes of risk-taking behaviour (for example disinhibition, sensation-seeking and calculated risks). Whether the single question used in our analyses captures all, or only some, of these is not clear. It would be of interest to investigate whether the loci identified here are also associated with more quantitative and objective measures of risk taking; however, such measures were not available in the UK Biobank dataset.
Conclusion
In summary, we have identified a polygenic basis for self-reported risk-taking behaviour and two associated loci containing variants likely to play a role in predisposition to this complex but important phenotype. The identification of significant genetic correlations between risk-taking and several psychiatric disorders, as well as with smoking and obesity, suggests that future work on this trait may clarify mechanisms underlying several common psychopathological and physical health conditions, which are important for public health and wellbeing.
Conflict of interest
JPP is a member of the UK Biobank advisory committee; this had no bearing on the study. No other conflicts of interest.
Acknowledgements
This research was conducted using the UK Biobank resource. UK Biobank was established by the Wellcome Trust, Medical Research Council, Department of Health, Scottish Government and Northwest Regional Development Agency. UK Biobank has also had funding from the Welsh Assembly Government and the British Heart Foundation. Data collection was funded by UK Biobank. DJS acknowledges the support of the Brain and Behaviour Research Foundation (Independent Investigator Award 1930) and a Lister Prize Fellowship (173096). EMT is supported by a University Research Fellowship (UF140705) from the Royal Society. JC acknowledges the support of The Sackler Trust and is part of the Wellcome Trust funded Neuroimmunology of Mood and Alzheimer’s consortium that includes collaboration with GSK, Lundbeck, Pfizer and Janssen & Janssen. The funders had no role in the design or analysis of this study, decision to publish, or preparation of the manuscript.