ABSTRACT
We aim to investigate the application of polygenic risk scoring within a family context and thereby unravel the role that common variation plays in disease risk within a pedigree that would traditionally have been viewed through the prism of monogenic inheritance only. We illustrate our discussion by analyzing polygenic risk scores for schizophrenia, major depressive disorder and bipolar disorder in a large pedigree (n~300) in which 30% of family members have major depressive disorder or bipolar disorder. The polygenic risk scores illuminate patterns of assortative mating and anticipation, whereby increased polygenic risk for psychiatric disorders is contributed by affected individuals who married into the family, thereby increasing genetic risk over generations in the family. This may explain the observation of anticipation in mood disorders, whereby onset is earlier and the severity of a disease increases over the generations of a family. Joint analyses of both rare and common variation may be the most powerful way to understand the familial genetics of mood and psychiatric disorders.
INTRODUCTION
The recent development of polygenic risk scoring (PRS) has greatly advanced the field of psychiatric genetics. This approach allows even results below the genome-wide significance threshold from large genome-wide meta-analyses to be leveraged to explore genetic risk in smaller studies (Shaun M Purcell et al. 2009). The effect sizes at many individual single nucleotide polymorphisms (SNPs), estimated by a large GWAS on the disorder of interest, are used to calculate an individual-level genome-wide PRS in individuals from an independent genetic dataset. For example, PRS based on the summary statistics of the SCZ meta-analysis by the Psychiatric Genomics Consortium (Ripke et al. 2014; Sullivan 2010) have proven to be most powerful in predicting not only SCZ (Ahn et al. 2016; S M Purcell et al. 2009) but also other psychiatric disorders (Lichtenstein et al. 2009; Lee et al. 2013; Cross-Disorder Group of the Psychiatric Genomics Consortium 2013). In addition, updated, more powerful summary statistics from the latest Psychiatric Genomics Consortium meta-analyses for Bipolar Disorder (BPD) and Major Depressive Disorder (MDD) are available via the PGC Data Access Portal (https://www.med.unc.edu/pgc/shared-methods).
Aside from increasing power in traditional case-control designs, PRS algorithms also open up new avenues for studying common variation. In this study we consider the application of PRS within a family context. While pedigree studies have traditionally been used to explore rare genetic variation through linkage analyses, studying patterns of PRS throughout a pedigree would allow for assessment of phenomena like assortative mating and anticipation. Assortative (non-random) mating is a common phenomenon where mated pairs are more phenotypically similar for a given characteristic than would be expected by chance (Merikangas and Spiker 1982). Results from a recent study (Nordsletten et al. 2016) show extensive assortative mating within and across psychiatric, but not physical, disorders. This could explain some of the features of the genetic architecture of this category of disorders (Nordsletten et al. 2016; Plomin, Krapohl, and O’Reilly 2016; Robinson et al. 2017). These features include anticipation, a phenomenon where later generations exhibit more severe symptoms at an earlier age, which has been robustly reported (although not explained) in bipolar disorder (O’Donovan, Jones, and Craddock 2003) and recently highlighted in genetic studies of MDD (Power et al. 2016; Power et al. 2012).
In the current study we aim to discuss the application of polygenic risk scoring for SCZ, MDD and BPD to explore patterns of common risk variation within a family context. We illustrate our discussion by investigating the relationship between PRS and apparent assortative mating and anticipation within a complex multigenerational pedigree affected with mood disorders.
MATERIALS & METHODS
Subject description
The Brazilian Bipolar Family (BBF) was ascertained via a 45-year-old female proband who presented with severe Bipolar Type 1 (BPI) disorder and stated there were dozens of cases of mood disorders in the family, most of whom lived in a small village in a rural area of a large state north of São Paulo. Cooperation from the family and a 2003 self-published book about their history were invaluable for our ascertainment. The entire BBF consists of 960 members. Family members >16 years of age underwent semi-structured interviews, using the Portuguese version of the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-I) (Del-Ben, Rodrigues, and Zuardi 1996). Members aged 6-16 were assessed using the Portuguese version of Kiddie-SADS-Present and Lifetime Version (K-SADS-PL) (Brasil and Bordin 2010). In total 308 interviews were completed, and 5 eligible members declined an interview. In the rare event of discrepancies, two independent psychiatrists reviewed them and a final consensus diagnosis was assigned. All affected and unaffected adult family members who have been included in the genetic study have given informed consent. Minors have given assent, followed by consulted consent by their parents in accordance with accepted practice in both the U.K. and Brazil. The project was approved by the Brazilian National Ethics Committee (CONEP). Table 1 contains the demographics of the subjects used in the current analysis (n=243 passed genotype quality control procedures described below). The population control dataset (BRA controls) was collected in São Paulo, Brazil, as a control dataset in a genetic study of first-episode psychosis (Noto et al. 2015). They were volunteers who had no abnormal psychiatric diagnoses (SCID) or family history of psychotic illness. The Research Ethics Committee of Federal University of São Paulo (UNIFESP) approved the research protocol, and all participants gave informed consent (CEP No. 0603/10).
Demographics for n=57 BRA controls can be found in Table 1.
Genotype data
Following the diagnostic interview, interviewers obtained whole blood in EDTA-containing monovettes for adults, and lesser amounts or saliva depending on personal preference or age (DNA Genotek Inc., Ontario, Canada). Genomic DNA was isolated from whole blood and saliva at UNIFESP using standard procedures. Whole genome genotype data was generated using the Illumina Infinium PsychArray-24 (http://www.illumina.com/products/psycharray.html) for both the BBF and the BRA control dataset at the in-house BRC BioResource Illumina core lab according to the manufacturer's protocol. Samples were excluded when average call rate was <98% or missingness >1%, with additional checks for excess heterozygosity, sex, family relationships and concordance rates with previous genotyping assays. SNPs were excluded when missingness >1%, MAF <0.01 or HWE <0.00001, and if showing Mendelian errors for the BBF dataset, in Plink v1.07 (S. Purcell et al. 2007) and v1.9 (Chang et al. 2015) or Merlin v1.1.2 (Abecasis et al. 2002). The BBF and BRA control datasets were QC’d separately and then merged, applying the same SNP QC thresholds to the merged dataset as well. This quality control procedure resulted in a dataset of 225,235 SNPs for 243 BBF individuals (197 family members and 46 married-in individuals) and 57 BRA controls for analysis. Eigensoft v4.2 (Patterson, Price, and Reich 2006) was used to check for population differences between the BBF family members, married-in individuals and BRA control sets. The BBF members self-reported mixed Southern European ancestry, confirmed by genome-wide principal components analysis showing that family members clustered closely with the Northern and Western European and Tuscan Italian populations. The principal components appear to represent within-family structure, with PC1 (variance explained 10.31%) and PC2 (variance explained 5.98%) separating generations, and the subsequent PCs separating subfamilies (Supplementary Information S2).
PC1 and PC2 are significantly correlated with the SCZ:PRS (PC1 r=-0.21, p=1.7 x 10^-4; PC2 r=-0.22, p=1.4 x 10^-4), MDD:PRS (PC1 r=0.12, p=0.04; PC2 r=-0.24, p=2.9 x 10^-5) and BPD:PRS (PC1 r=0.19, p=8.8 x 10^-4; PC2 r=-0.12, p=0.03). The principal components were not used in subsequent analyses.
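The SNP exclusion criteria described above can be sketched as a simple filter. This is only an illustration of the stated thresholds, not the Plink or Merlin commands that were actually run; the SnpStats record and its field names are hypothetical, and the per-SNP statistics are assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Hypothetical per-SNP summary; field names are assumptions for illustration."""
    snp_id: str
    missingness: float   # fraction of samples without a genotype call
    maf: float           # minor allele frequency
    hwe_p: float         # Hardy-Weinberg equilibrium test p-value
    mendel_errors: int   # Mendelian inconsistencies observed in the pedigree


def passes_snp_qc(snp, max_missing=0.01, min_maf=0.01, min_hwe_p=1e-5):
    """Keep a SNP unless it meets any exclusion criterion from the text:
    missingness >1%, MAF <0.01, HWE p <0.00001, or any Mendelian error."""
    return (
        snp.missingness <= max_missing
        and snp.maf >= min_maf
        and snp.hwe_p >= min_hwe_p
        and snp.mendel_errors == 0
    )


# Toy example with made-up SNP identifiers and statistics.
example_snps = [
    SnpStats("snp_ok", 0.000, 0.20, 0.50, 0),
    SnpStats("snp_missing", 0.020, 0.20, 0.50, 0),
    SnpStats("snp_rare", 0.000, 0.005, 0.50, 0),
    SnpStats("snp_hwe", 0.000, 0.20, 1e-7, 0),
    SnpStats("snp_mendel", 0.000, 0.20, 0.50, 2),
]
kept_ids = [s.snp_id for s in example_snps if passes_snp_qc(s)]
```

In this toy input only the first SNP survives all four criteria; each of the others fails exactly one.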
Polygenic risk scores
Polygenic risk scores for each family member (n=243) and population control (n=57) were generated in the same run using the PRSice v1.25 software (Euesden, Lewis, and O’Reilly 2014) with the publicly available PGC2 schizophrenia meta-analysis (Ripke et al. 2014) as a base dataset (36,989 SCZ cases, 113,075 controls), in addition to MDD (51,865 MDD cases, 112,200 controls) and BPD (20,352 BPD cases, 31,358 controls) summary statistics from the latest PGC meta-analyses (unpublished data (Lewis 2015; Sklar 2015)). We performed P-value-informed clumping on the genotype data with a cut-off of r^2 = 0.25 within a 200-kb window, excluding the MHC region on chromosome 6 because of its complex linkage disequilibrium structure. Through linear regression (no covariates) in PRSice we selected the PRS thresholds most predictive in discriminating affected from unaffected family members for SCZ:PRS (P<0.00055, 1,218 SNPs), MDD:PRS (P<0.0107, 2,788 SNPs) and BPD:PRS (P<0.00005, 143 SNPs). The PRS showed only modest correlations (no covariates) with each other in our data (SCZ:PRS vs. MDD:PRS r=0.26, p=5.1 x 10^-6; SCZ:PRS vs. BPD:PRS r=0.13, p=0.03; MDD:PRS vs. BPD:PRS r=0.14, p=0.04).
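As a minimal sketch of the scoring idea (not the PRSice implementation), a PRS for one individual can be written as the sum of effect-allele counts weighted by the base-GWAS effect sizes, restricted to SNPs whose base-GWAS P-value falls below the chosen threshold. The function name, the dictionary layout and the toy values are assumptions for illustration; the SNPs are assumed to have been clumped already, and scaling conventions (for example, dividing by the number of alleles scored) differ between tools.

```python
def polygenic_score(dosages, effect_sizes, p_values, p_threshold):
    """Sum of (effect size x effect-allele count) over SNPs passing the threshold.

    dosages      : dict SNP id -> count of the effect allele (0, 1 or 2)
    effect_sizes : dict SNP id -> base-GWAS effect size (e.g. log odds ratio)
    p_values     : dict SNP id -> base-GWAS P-value
    p_threshold  : only SNPs with P-value below this threshold are scored
    Returns (score, number of SNPs used).
    """
    score = 0.0
    n_used = 0
    for snp, beta in effect_sizes.items():
        if snp not in dosages:
            continue  # SNP not genotyped in the target dataset
        if p_values[snp] < p_threshold:
            score += beta * dosages[snp]
            n_used += 1
    return score, n_used


# Toy values, invented for illustration only.
toy_effects = {"snp1": 0.10, "snp2": 0.05, "snp3": 0.30}
toy_pvals = {"snp1": 1e-5, "snp2": 1e-3, "snp3": 0.2}
toy_dosages = {"snp1": 2, "snp2": 1, "snp3": 2}

toy_score, toy_n = polygenic_score(toy_dosages, toy_effects, toy_pvals, 0.01)
```

With a threshold of 0.01, the third toy SNP is left out, so the score is built from two SNPs only. Repeating the calculation over a grid of thresholds and keeping the most predictive one corresponds to the threshold selection described above.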
Statistical testing
All PRS were standardized to mean=0 and SD=1. Linear mixed model analyses were selected to be able to model covariates and relatedness within this complicated dataset. The analyses were performed using the Wald conditional F-test (Kenward and Roger 1997) in ASReml-R software (Butler et al. 2002) with one of the categories of mood disorders or family status as dependent variable and PRS as the independent variable. The model was adjusted for age (except for the generation analysis) and sex. For 7 individuals in the BBF, age at collection was missing and was imputed as the mean age of the relevant generation. To account for relatedness in within-family comparisons, an additive genetic relationship matrix was fitted as a random effect. The relationship matrix was constructed using LDAK software (Speed et al. 2012) with weighted predictors and LD correction parameters suited for pedigree data, resulting in pairwise relatedness estimates and inbreeding coefficients on the diagonal. The variance explained by each PRS was calculated using var(x × β) / var(y), where x was the standardized PRS, β was the corresponding regression coefficient and y was the phenotype (Nakagawa and Schielzeth 2013). For the analysis of offspring of different spouse pair categories (“both unaffected”, “married-in parent affected”, “family parent affected”, “both affected”) we had to account for the number of children contributed to the same category by each spouse pair. While most spouse pairs contribute 1 or 2 children to the same offspring category (Supplementary Information S3), two “both affected” spouse pairs contribute 7 and 8 children respectively. In the event of more than one child per couple, we calculated the mean PRS per spouse pair and entered this in the model as one representative child for that couple. All p-values reported are uncorrected for multiple testing, since all tests concern overlapping individuals and thus have a complex dependence structure.
However, we have performed 33 tests as listed in Supplementary Information S4, and so a conservative Bonferroni-corrected threshold corresponding to P < 0.05 is approximately 0.002 (0.05/33).
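The simpler arithmetic steps in this section (standardizing the PRS, the variance-explained ratio var(x × β) / var(y), averaging PRS over the children of one spouse pair, and the Bonferroni threshold) can be sketched as follows. This is not the ASReml-R mixed model itself: the regression coefficient is taken as given, the helper names are hypothetical, and using the sample (n-1) variance throughout is an assumption.

```python
from statistics import mean, stdev, variance


def standardize(values):
    """Rescale values to mean 0 and (sample) SD 1."""
    m, sd = mean(values), stdev(values)
    return [(v - m) / sd for v in values]


def variance_explained(x_std, beta, y):
    """var(x * beta) / var(y) for standardized predictor x and phenotype y."""
    fitted = [beta * x for x in x_std]
    return variance(fitted) / variance(y)


def mean_score_per_couple(children):
    """Collapse several children of one spouse pair into one mean PRS.

    children: iterable of (couple_id, prs) tuples.
    Returns dict couple_id -> mean PRS of that couple's children.
    """
    by_couple = {}
    for couple_id, prs in children:
        by_couple.setdefault(couple_id, []).append(prs)
    return {couple: mean(scores) for couple, scores in by_couple.items()}


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Toy data, invented for illustration only.
z = standardize([1.0, 2.0, 3.0, 4.0, 5.0])
r2 = variance_explained(z, beta=0.2, y=[0, 0, 1, 0, 1])
couple_means = mean_score_per_couple([("c1", 1.0), ("c1", 3.0), ("c2", 0.5)])
threshold = bonferroni_threshold(0.05, 33)
```

Because x is standardized, var(x × β) reduces to β squared, so the ratio is driven by the size of the coefficient relative to the phenotypic variance. Averaging per couple keeps a spouse pair with 7 or 8 children from dominating its offspring category.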
RESULTS
Affection status
All three polygenic risk scores are higher in affected (n=78) versus unaffected (n=147) family members: SCZ:PRS (Beta=0.07, SE=0.03, Z-ratio=6.58, p=0.03, R2=0.03), MDD:PRS (Beta=0.07, SE=0.03, Z-ratio=6.58, p=0.03, R2=0.02) and BPD:PRS (Beta=0.09, SE=0.02, Z-ratio=6.81, p=0.002, R2=0.04). None of the PRS significantly discriminated between individuals having experienced a psychotic episode at some point in their lives (n=25) and the unaffected group (n=147). Visualizations of the PRS in the different diagnostic categories are provided in Supplementary Information S5.
Assortative mating
Married-in individuals were defined as individuals married to a BBF member, but having no parents in the family themselves. Of the 70 married-in individuals ascertained (irrespective of having genotype data) 19 (27%) were affected with a psychiatric disorder. This is significantly higher than the 12% population prevalence of MDD (Fisher’s exact p=0.02). The unaffected married-in group does not differ from the general healthy population as evidenced by no significant differences in PRS as compared to the BRA control group. The above led us to investigate whether we can observe assortative mating on a genetic level, using PRS. In spouse pairs, we were unable to predict the PRS of the husband, using that of his wife, even when selecting concordant (both affected or both unaffected) pairs only. We considered the possibility that the married-in individuals might confer a different genetic predisposition to mood disorders to their offspring than the original family members. Demographics of the offspring in the different offspring categories (no affected parents (n=54); one affected family member parent (n=69); one affected married-in parent (n=15) and two affected parents (n=38)) are given in Supplementary Information S3 and S6. Indeed, we find that offspring of an affected married-in parent show increased SCZ:PRS (Beta=0.21, SE=0.03, Z-ratio=4.42, p=0.002, R2=0.19, Figure 1) and BPD:PRS (Beta=0.17, SE=0.03, Z-ratio=4.42, p=0.01, R2=0.13, Figure 1) as compared to having no affected parents.
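The proportion of affected married-in individuals can be checked with simple arithmetic. The sketch below uses a one-sided exact binomial test against a fixed 12% prevalence, which is a swapped-in alternative to the Fisher's exact test reported above (the underlying comparison table is not given in the text), so its p-value is not expected to reproduce the reported one. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed by exact summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_married_in = 70        # married-in individuals ascertained
n_affected = 19          # of whom affected with a psychiatric disorder
prevalence = 0.12        # population prevalence of MDD used in the text

observed_percent = round(100 * n_affected / n_married_in)
p_one_sided = binomial_upper_tail(n_affected, n_married_in, prevalence)
```

Here observed_percent corresponds to the 27% stated in the text, and p_one_sided is the probability of seeing at least 19 affected individuals among 70 if each were affected independently with probability 0.12.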
Anticipation
The BBF shows patterns of anticipation, with individuals having an earlier age at onset (AAO) in later generations. For 104 individuals (irrespective of having genotype data) the average age at onset significantly decreases over generations (Beta=-4.55, SE=1.79, Z-ratio=-2.54, p=0.01, R2=0.06), with G2 (n=1, AAO=8), G3 (n=23, AAO=30.2yrs±21.1), G4 (n=53, AAO=31.2yrs±12.3), G5 (n=23, AAO=19.7yrs±9.5) and G6 (n=4, AAO=13yrs±3.6) (Supplementary Information S7). Older participants recalled their AAO directly, whereas AAO for younger participants was confirmed using clinical records or parental recall. We hypothesized that this decrease in AAO would be reflected in a negative correlation between AAO and PRS and, taken together with the findings regarding assortative mating, in a pattern of increased PRS over generations. Because of the limited sample size of affected individuals per generation, a direct correlation of AAO and PRS does not reach significance, although the youngest generation (G5) does show trends towards negative correlations for SCZ:PRS and MDD:PRS (Supplementary Information S8). The SCZ:PRS does show a significant increase over generations (Figure 2), where n=197 family members were included (46 married-in individuals were excluded from the analysis to capture inheritance patterns of SCZ:PRS) in a linear regression with generation as independent variable (Beta=0.13, SE=0.02, Z-ratio=0.85, p=0.008, R2=0.03). The presence of such an effect when comparing generations suggests that ascertainment effects, such as relying on the recall of older family members with a very long duration of illness in previous generations, may be masking an overall effect across the entire family.
DISCUSSION
To our knowledge, the current study is the first to probe patterns of common genetic variation within a traditional pedigree design. We illustrate the possibilities of this approach by investigating apparent assortative mating and anticipation in a large multigenerational pedigree affected with mood disorders through polygenic risk scores for SCZ (Ripke et al. 2014), MDD and BPD (Lewis 2015; Sklar 2015) and thereby improve mechanistic understanding of common genetic risk for psychiatric disorders.
Recognizing the limitations in sample size of our pedigree, and therefore in the power to discriminate between specific diagnostic categories, we do find all three PRS to be higher in the affected family members, with the BPD score being most powerful. This is particularly relevant because the family features not only 36 individuals with a type of bipolar disorder but also another 38 individuals affected with a form of depression, which may be part of the prodromal stage of bipolar disorder (Berk et al. 2007). Family members currently diagnosed with MDD could still experience a manic episode and thereby convert to a BPD diagnosis.
Highlighting the possibilities of PRS applications within a family context, we find patterns of common variation to reflect phenomena within the family that are out of reach of traditional case/control studies. Assortative mating is one of the features in this family, where married-in individuals are more often affected with a mood disorder than the general population. As opposed to the family members, the married-in individuals are more likely to be affected with (r)MDD than with BPD. Non-random mating patterns have been reported in the population regarding body type, socio-economic factors and psychiatric traits (Plomin, Krapohl, and O’Reilly 2016; Nordsletten et al. 2016). The BBF provides a unique opportunity to look at the genetic correlation between spouse pairs and the contribution of the offspring of married-in individuals to overall psychiatric morbidity. A recent study has found genetic evidence for assortative mating when studying BMI and height in spouse pairs (Robinson et al. 2017). The BBF does not show the same pattern; the affected married-in individuals have a higher, though non-significant, polygenic score than affected or unaffected family members. However, we do appear to observe significant consequences of this in their offspring, with offspring of an affected married-in parent showing significantly increased SCZ:PRS and BPD:PRS.
A contribution of the married-in parents to a genetically driven anticipation in age at onset is supported by the increase in PRS over generations in both affected and unaffected individuals, although our cross-sectional dataset was less well powered to find an association with age at onset within affected family members. We did observe a trend for association between age at onset and PRS in the youngest generation in this study, but not when combining samples across generations. Age at onset can be considered a proxy for severity (Schulze et al. 2002; Schürhoff et al. 2000) and has previously been associated with genetic risk in MDD (Power et al. 2012; Power et al. 2016). However, this variable needs to be interpreted with caution, especially when analyzing patterns over time, since it is dependent on context and memory (Alda et al. 2000). Ascertainment bias can be a confounding factor in studies of psychiatric traits, with older generations having less access to psychiatric care and possibly misremembering the onset or nature of their first episode.
In conclusion, our study is a proof-of-concept exploration of PRS as a tool for investigating patterns of common genetic risk in a traditional pedigree context. The SCZ and BPD scores appear best suited in our data for teasing apart patterns of assortative mating and anticipation, whereby increased polygenic risk for psychiatric disorders is contributed by affected individuals who married into the family, thereby possibly tipping the balance of already present rare risk variation passed on by the early generations (Diniz et al., submitted).
ACKNOWLEDGEMENTS
The authors would like to thank the family members for their enthusiastic participation. We thank our ethics consultant Prof. Barbara Prainsack for insightful discussions. This paper represents independent research part-funded by FAPESP (2014/50830-2; 2010/08968-6), the Marie Curie International Research Staff Exchange (FP7-PEOPLE-2011-IRSES/295192), and the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London. SDJ is funded by the European Union’s Horizon 2020 research and innovation programme under Marie Skłodowska-Curie grant IF 658195. The views expressed are those of the authors and not necessarily those of the EU, the NHS, the NIHR or the Department of Health. GB has been a consultant in preclinical genomics and has received grant funding from Eli Lilly ltd within the last 3 years. AG has participated in advisory boards for Janssen-Cilag and Daiichi-Sankyo. All other authors report no conflict of interest.