Whole genome analysis of an extended pedigree with Prader–Willi Syndrome, hereditary hemochromatosis, and dysautonomia-like symptoms

This report describes the discovery and analysis of a pedigree with Prader–Willi Syndrome (PWS), hereditary hemochromatosis (HH), and dysautonomia-like symptoms. Nine members of the family participated in whole genome sequencing (WGS), which enabled a wide scope of variant calling from single-nucleotide polymorphisms to copy number variations. First, a 5.5 Mb de novo deletion is identified in the chromosome region 15q11.2 to 15q13.1 in the boy with PWS. Second, a female individual with HH is homozygous for the p.C282Y variant in HFE, a mutation known to be associated with HH. Her brother is homozygous for the same variant, although he has yet to be clinically diagnosed with HH. Third, none of the people with dysautonomia-like symptoms carry any reported or novel rare variants in IKBKAP that are implicated in familial dysautonomia (FD - HSAN III). Although two people with dysautonomia-like symptoms carry two heterozygous variants in NTRK1, a gene that has been shown to contribute to HSAN IV (congenital insensitivity to pain with anhidrosis, a disease that closely resembles FD), these variants are not present in the third proband. Fourth, WGS revealed pharmacogenetic variants influencing the metabolism of warfarin and simvastatin, which are being routinely prescribed to the proband. Finally, reports of the phenotypes were standardized with the Human Phenotype Ontology annotation, which may facilitate the search for other families with similar phenotypes. Due to the extreme heterogeneity and insufficient knowledge of human diseases, it is of crucial importance that both phenotypic data and genomic data are standardized and shared.

Physicians have also been routinely prescribing prenatal genetic tests and newborn screenings in clinics (Thompson et al. 2001;Morton and Nance 2006;Palomaki et al. 2011). However, there is a degree of uncertainty inherent in most genetic testing regarding the development, age of onset, and severity of disease (Evans et al. 2001). In addition, current genetic testing has not yet established predictive or even diagnostic value for common complex diseases (Smith et al. 2005). Some groups have begun to leverage the power of next-generation sequencing (NGS) to help diagnose rare diseases. The clinical utility of genomic medicine is also uncertain, although some have suggested the need for better standards and benchmarking (Lyon 2012a;Dewey et al. 2014).
Furthermore, research effort to date has been mostly driven by practicality and certain assumptions, such as focusing on coding regions, searching only for single-nucleotide polymorphisms (SNPs), or looking at a small set of known disease-relevant genes (Lyon and Wang 2012). However, the genetic architecture behind human disease is heterogeneous, and there are many reports of regulatory variants in the non-coding genome and splicing variants in the intronic regions that have a large effect size on particular phenotypes (Slaugenhaupt et al. 2001;Faustino and Cooper 2003;Pagani et al. 2003;Venables 2004;Wang and Cooper 2007;Esteller 2011). In hypothesis-driven research studies, one might gain higher statistical power with a larger sample size by using cheaper NGS assays like WES or gene panels. But whole genome sequencing (WGS) has a unique strength in its ability to cover a broader spectrum of variants, including small insertions and deletions (INDELs), structural variants (SVs), and copy number variants (CNVs), in studies where phenotype-relevant variants might not necessarily be SNPs (Wang et al. 2013;Weischenfeldt et al. 2013;G. Day-Williams et al. 2015). In particular, WGS results in more uniform coverage and better detection of INDELs, and is free of exome capture deficiency issues (Fang et al. 2014). When multiple human diseases segregate in the same family with distinct patterns, a more comprehensive genetic testing assay would be ideal, relative to targeted sequencing. Of course, cost and technical considerations have prohibited the wide adoption of highly accurate WGS for humans thus far, but this would indeed be the best assay to address the extreme heterogeneity of different genetic architectures for different diseases.

In line with other WGS efforts by the human genetics community, we report here the discovery and comprehensive WGS analysis of an extended pedigree with Prader-Willi Syndrome (PWS), Hereditary Hemochromatosis (HH), dysautonomia-like symptoms, Tourette Syndrome (TS) and other illnesses.

Clinical presentation (with HPO annotation) and family history
Here, we present the phenotypic characterization of a Utah pedigree K10031, consisting of 14 individuals from three generations (Figure 1) with various medical conditions as mentioned above. The two probands we discuss in detail below come from two nuclear families in this extended pedigree.
Her tilt table test yielded a positive result. The ophthalmic exam revealed unusual changes to her optic disks but without an elevated intraocular pressure, suggesting that her large optic nerves might represent physiologic cupping rather than glaucoma. Her brain MRI showed nonspecific findings, including a subtle focus of T2 signal abnormality involving the subcortical white matter of the right parietal lobe without associated enhancement. Other negative diagnostic test results included kidney ultrasound, chest X-ray, thyroid profile, urine vanillylmandelic acid (VMA) level, catecholamines panel (urine-free) and basic metabolic panel (BMP), and epinephrine and norepinephrine levels.
Her other remarkable medical history included a right hemisphere ischemic stroke at the age of 22. Causes for the stroke include the added risk of oral contraceptive pill (OCP) use for her irregular periods and a pre-existing patent foramen ovale (PFO). The stroke has led to residual left-side numbness, weakness and balance issues, as well as apraxia and dysarthria. Her other diagnoses include asthma, joint stiffness, hyperlipidemia, sleep walking, and dyspnea. See Table 2
These deletions were not detected in either the proband's father or brother using the WGS data (Figure S7). The orthogonal Illumina microarray data further confirmed this discovery; his father (K10031-10231) and his brothers (K10031-10233 and K10031-1
microarray because the density of SNPs varies between genomic regions). This highlights the higher resolution and completeness of WGS over microarray for precise molecular diagnosis of such diseases. Using other methods would make it necessary to go through a more complicated workflow with standard genetic tests for PWS (Dittrich et al. 1992;Cassidy and Driscoll 2008;Cassidy et al. 2012
which has been shown to contribute to HSAN IV (congenital insensitivity to pain with anhidrosis, CIPA). CIPA is a disease closely resembling FD (HSAN III), and is characterized by a lack of pain sensation, anhidrosis, unexplained fever since childhood, self-mutilating behavior, and intellectual disability of varying degree (Swanson 1963;Indo et al. 1996). However, since both variants have been reported before in CIPA individuals diagnosed with cancer as well as healthy individuals, they are considered to be polymorphisms in the population (Cargill et al. 1999;Gimm et al. 1999;Shatzky et al. 2000;Indo 2001;Greenman et al. 2007). These two variants appear to be linked, since they have always been observed together (Gimm et al. 1999). Both variants are located within the intracellular tyrosine kinase domain (amino acids 510-781) of the encoded protein.
However, neither site is conserved, and biochemical studies further confirmed that neither of these two variants has any effect on protein expression or phosphorylation compared to the wild type (Mardy et al. 2001). The fact that the mother's brother (K10031-10231) also carries these two variants further supports the view that they are likely to be polymorphisms, and neither of these variants is present in the proband (K10031-10133).
Hence, we investigated whether the dysautonomia-like symptoms presenting within the family might arise in conjunction with other mutations. To do so, we used pVAAST, CADD, and other prioritization tools to leverage the power of the large pedigree and WGS. Table S4 summarizes the variants that meet the following criteria: 1) found in the probands and not found in unaffected people in the family, 2) called by at least one pipeline and supported by a second pipeline, 3) located within coding regions, 4) ranked highly by pVAAST, with either a Combined Annotation Dependent Depletion (CADD) score greater than 15 or at least a medium effect predicted by GEMINI, and 5) have an Alternate Allele Frequency (AAF) < 1% in either the ExAC or 1000G databases.
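The five criteria above can be read as a conjunction of simple per-variant tests. The following is a minimal sketch of that logic, not the pipeline used in this study. The `Variant` record, its field names, the `max_pvaast_rank` cutoff, and the `LOW`/`MED`/`HIGH` impact labels are all illustrative assumptions. The sketch also assumes that "found in the probands" means carried by every affected individual, and it takes the stricter reading of criterion 5, requiring the frequency to be below 1% in both databases (a missing frequency is treated as absent from that database).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

# Assumed GEMINI-style impact severity labels, ordered weakest to strongest.
IMPACT_ORDER = {"LOW": 0, "MED": 1, "HIGH": 2}


@dataclass(frozen=True)
class Variant:
    """Hypothetical per-variant record; field names are illustrative only."""
    variant_id: str
    carriers: FrozenSet[str]      # sample IDs carrying the alternate allele
    n_pipelines: int              # number of calling pipelines reporting it
    is_coding: bool
    pvaast_rank: int              # 1 = top-ranked by pVAAST
    cadd: Optional[float]         # scaled CADD score, None if unavailable
    gemini_impact: str            # "LOW", "MED" or "HIGH"
    aaf_exac: Optional[float]     # None = not observed in the database
    aaf_1000g: Optional[float]


def passes_filters(v, affected, unaffected,
                   max_pvaast_rank=50, cadd_min=15.0, aaf_max=0.01):
    """Return True if a variant satisfies all five criteria (as assumed here)."""
    # 1) carried by every affected individual and by no unaffected individual
    segregates = affected <= v.carriers and not (unaffected & v.carriers)
    # 2) called by one pipeline and supported by a second
    supported = v.n_pipelines >= 2
    # 3) coding region: v.is_coding
    # 4) high pVAAST rank AND (CADD > 15 OR at least medium GEMINI impact)
    cadd_ok = v.cadd is not None and v.cadd > cadd_min
    impact_ok = IMPACT_ORDER.get(v.gemini_impact, -1) >= IMPACT_ORDER["MED"]
    prioritized = v.pvaast_rank <= max_pvaast_rank and (cadd_ok or impact_ok)
    # 5) rare in both reference databases (stricter reading of the criterion)
    rare = all((aaf or 0.0) < aaf_max for aaf in (v.aaf_exac, v.aaf_1000g))
    return segregates and supported and v.is_coding and prioritized and rare
```

Keeping each criterion as a named boolean makes it easy to relax one at a time, for example to test how sensitive the candidate list is to the allele frequency cutoff.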

Pharmacogenomic analyses for individual K10031-10133
Pharmacogenomic analyses were performed on individual K10031-10133 using
For the study on the family members with dysautonomia-like symptoms, the ambiguous description of their physical symptoms and the incompleteness of their clinical investigation and records hinder both the analysis of their phenotypic data and the interpretation of their genotypic data. Indeed, inaccurate phenotypic data can be quite misleading in finding disease-related variants, since it is often necessary to construct a disease inheritance model to partition the variants, and that model is largely based on the segregation of the phenotype in the pedigree under study. In this particular family, the inheritance model appears to be dominant, since multiple family members in different generations share similar symptoms such as dizziness and syncope. However, this assumption might not necessarily be true, given the unclear segregation of more specific and objective clinical findings such as absence of fungiform papillae of the tongue and decreased deep tendon reflexes for FD, and anhidrosis for CIPA. To help pin down the disease-relevant variants in this family, emphasis needs to be placed on performing more clinical investigations, including diagnostic tests, and on collecting complete medical records from as many family members as possible.
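The dependence of variant partitioning on phenotype labels can be illustrated with a small sketch. It is not the method used in this study. It assumes a fully penetrant dominant model with no phenocopies, and the function name, the sample labels, and the genotype encoding (alternate allele count per sample) are all hypothetical.

```python
def consistent_with_dominant(genotypes, affected_status):
    """Check whether one variant segregates under a simple dominant model.

    genotypes: dict of sample ID -> alternate allele count (0, 1, 2), or None
               when the genotype is missing.
    affected_status: dict of sample ID -> True (affected), False (unaffected),
                     or None (phenotype unknown or ambiguous).

    Assumes full penetrance and no phenocopies: every affected individual must
    carry the variant and no unaffected individual may carry it. Samples with
    an unknown phenotype or a missing genotype place no constraint.
    """
    for sample, affected in affected_status.items():
        genotype = genotypes.get(sample)
        if affected is None or genotype is None:
            continue  # uninformative sample
        carrier = genotype >= 1
        if carrier != affected:
            return False
    return True


# Hypothetical four-person family carrying one heterozygous variant.
genotypes = {"grandmother": 1, "mother": 1, "proband": 1, "uncle": 1}

# If the uncle is recorded as unaffected, the variant is rejected ...
strict = {"grandmother": True, "mother": True, "proband": True, "uncle": False}
# ... but if his phenotype is recorded as uncertain, the same variant is kept.
uncertain = dict(strict, uncle=None)

rejected = not consistent_with_dominant(genotypes, strict)
kept = consistent_with_dominant(genotypes, uncertain)
```

A single relabelled individual changes whether the variant survives filtering, which is why ambiguous symptom descriptions propagate directly into the candidate list.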
The consequence of general and incomplete medical notes is not limited to discovering the variants; it also prevents some features from being selected and included in the HPO analysis. For instance, K10031-10133 is reported to have "irregular menses", but this term does not appear when entered directly into the "features" search tab or the "disease" search tab. Instead, the feature was pursued under the "ontology" tab, which is a top-down approach to selecting features. The available features relating to irregular menses in the tool include amenorrhea, delayed menarche, menometrorrhagia, menorrhagia, etc., but the proband's medical documents lacked the details necessary for selecting the best option. However, this kind of information is necessary for more accurate, optimal Phenomizer results, and emphasizes the necessity of data availability as

REFERENCES
ostics in human genetics with semantic similarity searches in ontologies. Am J Hum Genet 85:
The Lancet 337:
souras A, Waszak SM, Albarca-Aguilera M, Hens K, Holcombe W, Ayroles JF, Dermitzakis ET, Stone EA, Jensen JD, Mackay TFC et al. 2012. Genomic Variation and Its Impact on Gene Expressio
The Lancet 355
pean and african ancestry.

FIGURE LEGENDS
Figure legends are now beneath each figure and will be moved here later.
Supplemental data
Figure S1
We used PennCNV to inspect this region from Illumina 2.5M microarray data.
We used PennCNV to call this deletion from the microarray data; it is likewise detected only in the proband, and not in the father or the two unaffected brothers. The dashed lines in the proband's panel indicate the interval of the ERDS copy number variant call.