Effects of pathogenic CNVs on biochemical markers: a study on the UK Biobank

Background Pathogenic copy number variants (CNVs) increase risk for medical disorders, even among carriers free from neurodevelopmental disorders. The UK Biobank recruited half a million adults who provided samples for biochemical and haematology tests which have recently been released. We wanted to assess how the presence of pathogenic CNVs affects these biochemical test results. Methods We called all CNVs from the Affymetrix microarrays and selected a set of 54 CNVs implicated as pathogenic (including their reciprocal deletions/duplications) and present in five or more persons. We used linear regression analysis to establish their association with 28 biochemical and 23 haematology tests. Results We analysed 421k participants who passed our CNV quality control filters and self-reported as white British or Irish descent. There were 268 associations between CNVs and biomarkers that were significant at a false discovery rate of 0.05. Deletions at 16p11.2 had the highest number of significant associations, but several rare CNVs had higher effect sizes indicating that the lack of significance was likely due to the reduced statistical power for rarer events. The distribution of values can be visualised on our interactive website: http://kirov.psycm.cf.ac.uk/. Conclusions Carriers of many pathogenic CNVs have changes in biochemical and haematology tests, and many of those are associated with adverse health consequences. These changes did not always correlate with increases in diagnosed medical disorders in this population. Carriers should have regular blood tests in order to identify and treat adverse medical consequences early. Levels of cholesterol and related lipids were unexpectedly lower in carriers of CNVs associated with increased weight gain, most likely due to the use of statins by such people.


INTRODUCTION
Large copy number variants (CNVs) have been shown to have significant effects on common medical disorders even among the relatively healthy individuals, with no early-onset neurodevelopmental disorders, taking part in the UK Biobank (Crawford et al, 2018). Such carriers also tended to have changes in physical traits, such as weight, height, body fat content, pulse rate and blood pressure (Owen et al, 2018).
Participants in the UK Biobank agreed to donate blood, saliva and urine samples for biochemical tests at the point of their recruitment. Biochemical and haematology tests were performed on the full set of Biobank participants and the data was recently released. We expected that there will be significant changes in the levels of many of these tests among carriers of pathogenic CNVs, in line with the increased morbidity and mortality and changes in the physical traits in this population.

METHODS
Approval for the study was obtained from the UK Biobank under project 14421: "Identifying the spectrum of biomedical traits in adults with pathogenic copy number variants (CNVs)".

Participants:
The UK Biobank recruited just over half a million people from the UK general population. We restricted analysis to individuals who self-reported as White British or Irish descent and whose samples passed our standard CNV QC filters (genotyping call rate < 0.96, > 30 CNVs per person, a waviness factor of < -0.03 & > 0.03 & LRR standard deviation of > 0.35).

Choice of CNVs:
We tested 54 CNVs that have been proposed to be pathogenic and were carried by at least five participants (the choice of the CNVs, their chromosomal positions and details of analysis are given in our previous publications (Crawford et al, 2018, Owen et al, 2018. Briefly, the CNVs follow the lists of Coe et al (2013) and Dittwald et al (2014).

Tests:
We analysed these CNVs for association with a set of 28 biomarkers and 23 haematology tests that were available for the majority of participants ( and 41k individuals, and were therefore excluded as they provided inadequate samples size for analysis of most rare CNVs.

Statistical analysis:
The tested variables followed normal distributions and did not require transformation prior to analysis. We show the results in non-normalised (original) units, to allow clinicians to relate them more easily to known tests (e.g. mmol/L, U/L). Linear regression analysis was performed with sex and age as co-variates. Other potential confounders were not included, as it is not known whether confounding factors lead to the abnormalities, or are a direct consequence of the presence of a CNV. For example, we previously showed that carrying a CNV from this list has strong adverse effects on measures of social deprivation, occupation and education , which are also factors that could be perceived as contributing to some biochemical tests results abnormalities. In previous work (Crawford et al, 2018) we also observed practically no effect from controlling for principal components, as should be expected for genetic variants whose frequencies are determined by selection pressure and mutation rates, rather than genetic drift (Rees et al, 2011). We used the Benjamini-Hochberg false-discovery rate (FDR) method to estimate the project-wide statistical significance (Benjamini and Hochberg 1995). A conservative false discovery rate (FDR) of 0.05 was accepted as our significance threshold (Supplementary   Tables 1 and 2).

RESULTS
The comparisons of 54 CNVs and 51 tests produced 2751 associations, of which 268 were significant at FDR=0.05 (marked in bold in Supplementary Tables 1 and 2). All results are also displayed on our institutional website (http://kirov.psycm.cf.ac.uk/) where they will be updated when required. The website allows an interactive examination of the distribution of results, including those for physical traits (e.g. weight and height). Table 2   Some of the well-established medical consequences of CNVs were associated with corresponding abnormalities in biochemical tests. For example 17q12 deletions (Renal Cysts and Diabetes syndrome) were associated with increased levels of creatinine, urea, Cystatin-C and glucose; 22q11.2 deletion carriers exhibited reduced calcium levels; 16p11.2 deletion carriers had the expected multiple abnormalities associated with cardiometabolic disorders shown for this condition; and 16p12.1 deletion carriers, who have high rates of renal failure and heart disease, have abnormalities in creatinine and Cystatin-C ( Table 2). Other abnormalities were more subtle and would have escaped detection, if they had not been analysed in such a large dataset. Most notable are the many changes detected among carriers of 15q11.2 deletions, a CNV with neurodevelopmental phenotypes, but no confirmed medical co-morbidities. The changes were subtle (around 0.1 SD for the significant results and 0.05 SD overall) and therefore unlikely to result in significant increases of overt medical disorders.
As an example, 23.7% of carriers had increased levels of Cystatin-C, compared to 18% of non-carriers. However, the abnormalities might manifest with problems in later life and reduce life expectancy if left untreated.
Most but not all significant changes suggest deterioration of function, as follows: Liver: Nearly all significant changes affecting liver function test (except bilirubin levels) were in the direction of higher levels, indicating liver abnormalities. Bone: All significant results for the three tests connected with bone function were adversely changed -reduced Vitamin D and Calcium, raised ALP. Diabetes: All significant changes were in the direction of raised glucose and HbA1c. Renal: All significant Cystatin-C results (in 14 CNVs) were increased, as were nearly all significant urea and uric acid levels, suggesting reduced renal function in several CNVs. There were exceptions: 17p12 duplications (causing Charcot-Marie-Tooth disease type 1A) and 16p11.2 deletion carriers had reduced creatinine levels. Cancer markers: There were very few significant results, with mixed effects. It is difficult to interpret these as a group, as they are relevant for specific cancers. Homozygous deletions at 2q13del locus affect the gene NPHP1 and are known to cause the kidney disorder juvenile nephronophthisis. Three homozygous individuals were excluded from analysis, as we have reported previously that all three had renal failure (Crawford et al, 2018). Heterozygous individuals have subtle phenotypes, with changes in cholesterol and total protein (reductions) and increases in direct bilirubin. Creatinine was also increased but the evidence for association was weak after correction for multiple testing, with a change of less than 1 mcmol/L.
Mirror-image phenotypes, as reported for physical traits in our previous work (Owen et al, 2018) were rare, almost entirely confined to 16p11.2 (creatinine, platelet count and SHBG) and for 15q13.3 (MCH, MCV and MRV).

DISCUSSION
Most pathogenic CNVs tested in this study are known to increase risk for neurodevelopmental disorders and common medical phenotypes. As such it is not surprising that they are also associated with abnormal biochemical tests. About 10% of all possible CNV/test associations were significant at a conservative FDR = 0.05, suggesting multiple effects. While most are expected, e.g. high creatinine and Cystatin-C for CNVs associated with renal failure, others suggest more subtle effects that could affect medical outcomes and all-cause mortality, some of which have so far not been demonstrated. Indeed, we previously showed that carriers of 11 of these CNVs have statistically significant increase in mortality, which was not fully explained by the statistically significant associated medical conditions (Crawford et al, 2018).
One set of associations runs against the expected directions and deserves a further discussion: Cholesterol and associated biochemical tests (LDL, HDL, ApoA1 and ApoB) were reduced (rather than increased) among carriers of CNVs that are associated with obesity or ischaemic heart disease, such as 16p12.1 deletions, 16p11.2 deletions and 16p11.2 distal deletions (Bochukova et al, 2010, Crawford et al, 2018. We suspect that this is due to a large proportion of such patients taking statins, precisely because they are overweight or have cardiovascular problems. This possibility is confirmed by an extreme example: people diagnosed with "high cholesterol" in the sample as a whole, also showed significantly reduced levels of these tests, a seemingly counter-intuitive finding. Levels of cholesterol were, on average, 0.66 mmol/L lower among the 70,052 people who had this diagnosis, a situation only likely if statins were taken after the diagnosis was made ( Table 3). This class of medicines is highly effective in lowering cholesterol and lipid concentrations (Cholesterol treatment trialists, 2019). At the point of recruitment the participants listed the medications they were taking, so we tested whether reductions in cholesterol were present only among those on statins. This could not fully explain the observations. Even CNV carriers who had not declared statins intake had lower cholesterol levels than non-carriers who were not taking statins. In a linear regression analysis, being on statins was associated with a 1.61 mmol/L lower cholesterol for the population as a whole, while a diagnosis of "high cholesterol" was associated with only 0.4 mmol/L higher cholesterol. Similar analysis on 16p11.2 deletion carriers showed that being on statins was associated with a 1.34 mmol/L lower cholesterol, while carrying the deletion was associated with 0.19 mmol/L lower cholesterol. We suggest that the only way to reconcile these findings is that a proportion of people (CNV carriers and non-carriers) had either taken statins at an earlier time, or had not declared them. CNV carriers who are obese or have cardiovascular incidents could be more likely to have been prescribed statins. A separate analysis of all biochemical results with statins intake as a covariate are presented in Supplementary Table 2. Researchers should be aware of potential bias in the UK Biobank when interpreting findings on cholesterol and other lipids, a problem which could be reduced once the primary care (GP) data become available.

Our findings of changes in biochemical tests extend our previous reports on this population,
showing adverse changes in basic physical characteristics in CNV carriers (Owen et al, 2018) and higher rates of common medical disorders (Crawford et al, 2018). This strengthens the case for general monitoring of such carriers and treatment and prevention of biochemical abnormalities, such as with statins or antidiabetic drugs. As an example, 43% of carriers of 16p11.2 deletions have levels of HbA1c above the normal range of 40 mmol/mol, compared to 12.7% of CNV non-carriers (Figure 1). The findings of this study cannot yet be used for analysis of cholesterol and related lipids, due to the high rates of statins intake in this population. Clinicians involved in the care of CNV carriers will be able to examine the results in more detail, using our interactive website which shows the distribution of values between carriers and non-carriers, separately for men and women: http://kirov.psycm.cf.ac.uk/.
Acknowledgements. This research has been conducted using the UK Biobank Resource under Application Number 14421.

Conflict of interest statement. The work at Cardiff University was funded by the Medical
Research Council (MRC) Centre Grant (MR/L010305/1) and Program Grant (G0800509). :   Table 1. List of biochemical and haematological tests, their ranges and the number of people with valid results. The "Function" that the tests measure is listed according to the UK Biobank website. Table 2. CNVs analysed and the associated biochemical tests and medical conditions (as presented in Crawford et al, 2018). An * denotes cholesterol and related lipid levels that are unexpectedly reduced in carriers of CNVs associated with increased BMI. Table 3. Cholesterol levels in cases and controls, according to statins intake, diagnosis of "high cholesterol" and presence of a 16p11.2 deletion, an example of a CNV highly associated with obesity and cardiometabolic disorders.