The role of rare copy number variants in depression

The role of large, rare copy number variants (CNVs) in neurodevelopmental disorders is well established,1–5 but their contribution to common psychiatric disorders, such as depression, remains unclear. We have previously shown that a substantial proportion of CNV enrichment in schizophrenia is explained by CNVs associated with neurodevelopmental disorders.6,7 Depression shares genetic risk with schizophrenia8,9 and is frequently comorbid with neurodevelopmental disorders,10,11 which led us to hypothesise that, if CNVs play a role in depression, neurodevelopmental CNVs are those most likely to be associated. We confirmed this in UK Biobank by showing that neurodevelopmental CNVs were associated with depression (24,575 cases, 5.87%; OR=1.36, 95% CI 1.22-1.51, p=1.61×10⁻⁸), whilst finding no evidence implicating other CNVs. Four individual neurodevelopmental CNVs increased risk of depression (1q21.1 duplication, PWS duplication, 16p13.11 deletion, 16p11.2 duplication). The association between neurodevelopmental CNVs and depression was partially explained by social deprivation but not by educational attainment or physical illness.

Studies of CNVs in depression have been based on relatively small samples 12-18, have generated inconsistent results, and have produced no findings that met criteria for genome-wide significance. The UK Biobank offers an opportunity to investigate the relationship between CNVs and depression on a larger scale than has hitherto been possible. We performed a CNV analysis of depression in UK Biobank data from 455,913 individuals who reported white British or Irish ethnicity (Methods). Our primary definition of depression was participant self-report of ever having received a medical diagnosis of depression ('self-reported depression'). To ensure our findings were not restricted to this definition, we also tested two more conservative phenotypes: (i) self-reported depression with an additional requirement for antidepressant prescription, and (ii) a hospital discharge diagnosis of depression. We sought association with depression for (i) a group of 53 CNVs known to be associated with neurodevelopmental disorders ('neurodevelopmental CNVs') 6,7 and (ii) after excluding CNVs relevant to the primary hypothesis, a residual unexplained burden among CNVs ≥ 100KB, ≥ 500KB and ≥ 1MB. The addition of genotyping platform as a covariate in regression analyses did not change the results as presented.
We used data from 157,397 individuals who completed an online follow-up mental health questionnaire to examine association between CNVs and markers of depression severity (age at onset, number of episodes of depression, duration of worst depressive episode). We restricted these analyses to the 73,234 individuals who reported experience of prolonged feelings of sadness or depression (61,467 unaffected), a phenotype which itself was associated with neurodevelopmental CNV carrier status (OR=1.19, 95% CI 1.06-1.33, p=0.004). We did not find any association between neurodevelopmental CNVs and markers of depression severity that survived correction for multiple testing (Supplementary Table 2).
To better understand the relationship between neurodevelopmental CNVs and depression, we investigated whether the association was explained by measures of (i) educational attainment (qualifications), (ii) physical health and (iii) social deprivation (Methods). There remained a strong independent association between these CNVs and depression in analyses incorporating these other measures.
Given recent evidence for an increased rate of large CNVs in female children with anxiety or depression 21, we undertook exploratory analyses, which provided weak evidence for a higher rate of depression in female neurodevelopmental CNV carriers than in male carriers. This increased rate was over and above the baseline increased rate of self-reported depression in females (interaction term OR=0.78, 95% CI 0.62-0.98, uncorrected p=0.03, Supplementary Table 4), although this effect was weaker still and non-significant for our secondary depression definitions.
Here we report the largest study of CNVs in depression to date. We tested two main hypotheses: (i) that risk CNVs for neurodevelopmental disorders would be associated with depression and (ii) that, after excluding CNVs relevant to the primary hypothesis, there would be a residual unexplained burden among CNVs ≥ 100KB, ≥ 500KB and ≥ 1MB. Our results support our first hypothesis: CNVs previously associated with neurodevelopmental disorders were associated with increased risk of lifetime depression, whether defined on the basis of self-reported diagnosis, self-reported diagnosis combined with antidepressant treatment, or hospital discharge diagnosis. Four neurodevelopmental CNVs (1q21.1 duplication, PWS duplication, 16p13.11 deletion, 16p11.2 duplication) were individually associated with depression at levels of significance that survived Bonferroni correction for the 53 CNVs tested. We note that none of these CNV loci overlap with risk loci recently identified in a large depression genome-wide association study 9. The risk of depression in CNV carriers in this study (whether CNVs are considered individually or collectively) was lower than that identified in previous studies of schizophrenia, but qualitatively the results followed a similar pattern: the highest risk for both disorders was conferred by 3q29del (depression OR=10.72, schizophrenia OR=57.65) and the lowest risk for both disorders was conferred by 16p12.1del (depression OR=1.59, schizophrenia OR=3.3). 5,6 After excluding neurodevelopmental CNVs we found no evidence of residual burden of risk for depression among CNVs ≥ 100KB, ≥ 500KB and ≥ 1MB.
Further investigation of the association between neurodevelopmental CNVs and depression indicated that this relationship is partially explained by social deprivation. To our knowledge this is the first report showing that these CNVs are associated with neighbourhood measures of social deprivation. It thus implicates an important mechanism by which CNV carrier status could increase risk of depression, although longitudinal data on these measures are needed to establish the causal directionality between depression and social deprivation.
We acknowledge some limitations of this study. Our primary depression definition relied on self-report, a method known to be subject to information bias. 22 However, this is unlikely to have markedly influenced our findings, given the almost identical, or even stronger, results obtained with the hospital discharge diagnosis of depression. Another limitation is the relatively low rate of depression compared to population estimates 23. While this likely reflects the better than average health and functioning of the UK Biobank sample, and imprecision in the definition of disorder, these factors are not likely to have generated spurious CNV associations.
Neurodevelopmental CNVs have incomplete penetrance for major developmental disorders 24, yet beyond mild cognitive impairment 25,26, little is known about phenotypic associations with CNV carrier status. Our study is the first to robustly demonstrate association between these CNVs and depression and thus extends the spectrum of clinical phenotypes that are associated with CNV carrier status.
Duration of worst depression (UKBB field 20438): this was coded in ranges of months, e.g. less than a month, between one month and three months. Previous data have shown that the median duration of a depressive episode is 3 months. 29 Therefore, this variable was dichotomised into 0-3 months and more than 3 months.
Lifetime number of depressed episodes (UKBB field 20442): this variable was dichotomised using a median split approach (median = 1).
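The two dichotomisations described above can be illustrated with a minimal Python sketch. The category labels below are hypothetical placeholders rather than the actual UK Biobank codings for field 20438, and assigning values equal to the median to the lower group is an assumption, since the text does not state how ties were handled.

```python
from statistics import median

# Hypothetical labels for the duration ranges that fall within 0-3 months;
# the real coding of UKBB field 20438 is not reproduced here.
SHORT_DURATION = {"less than a month", "one to three months"}


def dichotomise_duration(category: str) -> int:
    """Return 0 for a worst episode of 0-3 months, 1 for anything longer."""
    return 0 if category in SHORT_DURATION else 1


def median_split(values):
    """Return 0 for values at or below the sample median, 1 for values above.

    Placing ties in the lower group is an assumption of this sketch.
    """
    cut = median(values)
    return [int(v > cut) for v in values]


if __name__ == "__main__":
    print(dichotomise_duration("less than a month"))
    print(median_split([1, 1, 1, 2, 5]))
```

With a median of 1 episode, as reported above, this split separates individuals with a single episode from those with recurrent episodes.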

Genotyping and CNV Calling
DNA was extracted from whole blood 30 and then genotyped at the Affymetrix Research Services Laboratory, Santa Clara, CA on the UK Biobank Axiom and UK BiLEVE arrays. Genotypes were released to Cardiff University after application to UK Biobank (project number 14421). CNV calling was carried out using PennCNV-Affy 31 with biallelic markers common to both genotyping platforms and is described in detail elsewhere. 25 CNV burden analysis was carried out on the CNV calls generated by PennCNV-Affy using PLINK 1.07. 32 Individual samples were excluded if they had ≥ 30 CNVs, a waviness factor of >0.03 or <-0.03, a SNP call rate of <96% or an LRR SD of >0.35. Individual CNVs were excluded if they were covered by <20 probes, had a density coverage of <1 probe per 20,000 base pairs or a confidence score of <10.
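The sample-level and CNV-level exclusion criteria above can be expressed as two simple predicates. The following is a minimal Python sketch, not the pipeline actually used: the class and field names are hypothetical, and only the numeric thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class SampleQC:
    """Per-sample quality metrics (hypothetical field names)."""
    n_cnvs: int        # number of CNV calls in the sample
    waviness: float    # waviness factor
    call_rate: float   # SNP call rate as a fraction, e.g. 0.98
    lrr_sd: float      # standard deviation of the log R ratio


@dataclass
class CNVCall:
    """Per-CNV quality metrics (hypothetical field names)."""
    n_probes: int
    length_bp: int
    confidence: float


def sample_passes(s: SampleQC) -> bool:
    """Keep a sample unless it meets any exclusion criterion in the text."""
    return (
        s.n_cnvs < 30
        and -0.03 <= s.waviness <= 0.03
        and s.call_rate >= 0.96
        and s.lrr_sd <= 0.35
    )


def cnv_passes(c: CNVCall) -> bool:
    """Keep a CNV unless it meets any exclusion criterion in the text."""
    # At least 1 probe per 20,000 bp, written without division.
    density_ok = c.n_probes * 20_000 >= c.length_bp
    return c.n_probes >= 20 and density_ok and c.confidence >= 10


if __name__ == "__main__":
    print(sample_passes(SampleQC(n_cnvs=12, waviness=0.01, call_rate=0.99, lrr_sd=0.2)))
    print(cnv_passes(CNVCall(n_probes=25, length_bp=600_000, confidence=15)))
```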

Defining Copy Number Variant Sets and Analysis
Following the approach of our recent study using UK Biobank data 25 we defined a group of 'neurodevelopmental CNVs' as those 54 CNVs for which there is at least nominally significant evidence for association with neurodevelopmental disorders (p < 0.05). 3 We excluded the high frequency 15q11. CNV burden analyses were carried out using PLINK on regions of variable copy number at three size thresholds: (i) ≥ 100KB, (ii) ≥ 500KB and (iii) ≥ 1MB. CNVs were filtered for frequency at <1% using the --cnv-freq-exclude-above option, and CNVs overlapping low copy repeat regions were filtered out using the --cnv-exclude option. PLINK outputs were converted into CNV carrier status, which was used as the outcome in regression analyses. For all burden analyses, the group of 53 CNVs associated with neurodevelopmental disorders was excluded.
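The conversion of filtered CNV calls into carrier status at each size threshold can be sketched as follows. This is an illustration in Python under stated assumptions, not the authors' code: the input structure (a mapping from a hypothetical individual identifier to the lengths of that individual's CNVs after frequency and region filtering) is invented for the example, and 1 KB is taken to be 1,000 base pairs.

```python
# Size thresholds from the text, assuming 1 KB = 1,000 bp.
THRESHOLDS_BP = {"100KB": 100_000, "500KB": 500_000, "1MB": 1_000_000}


def carrier_status(cnv_lengths_by_person):
    """Map each individual to a 0/1 carrier flag per size threshold.

    An individual is a carrier at a threshold if at least one of their
    remaining CNVs is at least that long.
    """
    status = {}
    for person, lengths in cnv_lengths_by_person.items():
        longest = max(lengths, default=0)  # 0 when the individual has no CNVs
        status[person] = {
            label: int(longest >= cutoff)
            for label, cutoff in THRESHOLDS_BP.items()
        }
    return status


if __name__ == "__main__":
    calls = {"id1": [150_000], "id2": [], "id3": [1_200_000, 50_000]}
    for person, flags in carrier_status(calls).items():
        print(person, flags)
```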
Association analyses were carried out in R using logistic or linear regression as appropriate, with age and sex as covariates. Analyses were restricted to those who reported white British or Irish ethnicity, and we excluded individuals with CNV-associated neurodevelopmental/neuropsychiatric disorders (autism spectrum disorder (ASD), intellectual disability (ID), attention deficit hyperactivity disorder (ADHD), schizophrenia or bipolar affective disorder (BPAD)) from both the case and control groups.
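As a simplified illustration of how an odds ratio and its 95% confidence interval arise, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2×2 table in Python. It is not the covariate-adjusted logistic regression used in the analyses above (which were run in R), and the counts in the example are invented.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: carriers with depression      b: carriers without depression
    c: non-carriers with depression  d: non-carriers without depression
    All four cells must be non-zero.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Invented counts, for illustration only.
    or_, lo, hi = odds_ratio_wald_ci(20, 80, 10, 90)
    print(f"OR={or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

In a logistic regression the same quantity is obtained by exponentiating the coefficient for carrier status, with age and sex entering as additional predictors.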

Association
To better understand the association between neurodevelopmental CNVs and depression, we investigated whether the association was explained by three variables known to be associated with depression 19,20 and postulated to be associated with CNVs: (i) educational attainment (qualifications), (ii) physical health (number of non-psychiatric hospital admission diagnoses) and (iii) social deprivation (Townsend Deprivation Index). Educational attainment: Prior to analysis, data from the academic qualifications field were dichotomised and recoded into college/university degree or all other qualifications, an approach previously used for this data field (UKBB field 6138). 33 Physical health: We used the number of hospital admission physical health codes as a proxy measure of the extent of physical health problems (excluding psychiatric codes; UKBB fields 41202 and 41204). Social deprivation: This was measured using Townsend Deprivation Index codes (UKBB field 189). Analyses were carried out using structural equation modelling in the lavaan package in R. 34 The proportion explained was estimated as the indirect effect divided by the total effect. Age and sex were used as covariates throughout the analyses.
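The proportion explained can be illustrated with a short Python sketch of the standard single-mediator decomposition. It assumes the product-of-coefficients form, in which the indirect effect is a × b (exposure to mediator, mediator to outcome) and the total effect is the direct effect plus the indirect effect; the text does not specify the exact model parameterisation fitted in lavaan, and the coefficient values in the example are invented.

```python
def proportion_mediated(a: float, b: float, direct: float) -> float:
    """Indirect effect divided by total effect for a single mediator.

    a:      effect of CNV carrier status on the mediator
    b:      effect of the mediator on depression
    direct: effect of CNV carrier status on depression, adjusted for the mediator
    """
    indirect = a * b
    total = direct + indirect
    if total == 0:
        raise ValueError("total effect is zero; proportion is undefined")
    return indirect / total


if __name__ == "__main__":
    # Invented coefficients, for illustration only.
    print(round(proportion_mediated(a=0.5, b=0.2, direct=0.4), 3))
```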

Sex-Specific Analyses
Previous studies have reported a small but significant excess of large (≥ 500KB) rare (< 1%) CNVs in females. 35 Recent evidence has also suggested that female children with anxiety or depression are more likely to carry large CNVs than males. 21 This led us to examine rates of depression in female and male CNV carriers in our sample. Having found an excess of female CNV carriers with depression, we added an interaction term, consisting of the product of neurodevelopmental CNV carrier status and sex, to our main regression model.
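The interaction term described above is simply the product of the two indicator variables, added as an extra predictor. The Python sketch below shows how such a design row might be built and how an interaction coefficient translates into an odds ratio. The column order and the 0/1 coding of sex are assumptions made for illustration; the text does not state which sex was coded as 1.

```python
import math


def design_row(cnv_carrier: int, sex: int, age: float):
    """Predictors for one individual: intercept, CNV, sex, age, CNV x sex.

    cnv_carrier and sex are 0/1 indicators; the coding of sex is an
    assumption of this sketch.
    """
    return [1.0, float(cnv_carrier), float(sex), float(age), float(cnv_carrier * sex)]


def predicted_probability(row, coefficients) -> float:
    """Logistic model prediction for one design row."""
    linear_predictor = sum(x * beta for x, beta in zip(row, coefficients))
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def interaction_odds_ratio(beta_interaction: float) -> float:
    """Odds ratio for the interaction term is the exponentiated coefficient."""
    return math.exp(beta_interaction)


if __name__ == "__main__":
    print(design_row(cnv_carrier=1, sex=1, age=50))
    print(design_row(cnv_carrier=1, sex=0, age=50))
```

An interaction odds ratio different from 1 indicates that the association between carrier status and depression differs between the sexes beyond what the two main effects predict.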