There has been a recent surge in interest in the use of endophenotypes in psychiatric research, although the concept was introduced to psychiatry by Gottesman and Shields as long as 35 years ago.1 This has been driven by concerns about the limited success and poor reproducibility of existing approaches as well as the fact that current diagnostic systems in psychiatry by and large lack aetiological justification. In addition, the identification of disease endophenotypes offers the prospect of creating animal models relevant to human psychopathology, which will be suitable for experimental approaches and greatly facilitate the development and screening of novel therapeutics.

Gottesman and Shields,1 who adapted the term from insect biology, described endophenotypes as internal phenotypes that lie on the pathway between genes and disease. Fundamental to the concept is the assumption that variation in an endophenotype will depend upon variation in fewer genes than the more complex disease phenotype and therefore be more tractable to genetic analysis.2 There have been a number of attempts to devise criteria to define the optimal characteristics of an endophenotype.2, 3, 4, 5, 6, 7, 8 There is general agreement that an endophenotype should occur at a higher frequency in individuals with the disease than in the general population; moreover, this association should derive from shared genes. It therefore follows that an endophenotype should be heritable, tend to co-segregate with the illness in multiply affected families, be found in unaffected relatives of cases at a higher rate than in the general population and ideally show evidence for common genetic risk factors from twin studies. In addition, if an endophenotype truly lies on the causal pathway between genes and disease, it should be state-independent, in that it manifests in an individual whether or not illness is active (although it may require challenge or provocation) and to an extent that is not critically dependent upon the degree of activity of the illness. However, it is important to note that even if a putative endophenotype satisfies all these criteria, this does not exclude the possibility that it is epiphenomenal with respect to the disease. In other words, it might occur as a pleiotropic consequence of the risk gene or genes and not lie on the disease pathway (see Figure 1 and Table 1). A further widely supported criterion is that an endophenotype should have good psychometric properties, especially reliability and validity, and be sufficiently sensitive to detect individual differences.

Figure 1

This shows, in simplified schematic terms, some of the possible relationships between putative endophenotypes, genes and disease. In reality, different combinations of these simplified scenarios are likely. G, genes; Enviro, environmental factors; Endo, putative endophenotype; P, disease phenotype.

Table 1 The relationship between genes and a putative endophenotype impacts on the various criteria proposed for endophenotypes

Whilst there is widespread acceptance of the importance of these criteria, it is disconcerting how few endophenotypes used in psychiatric research actually fulfil them. For example, there is limited evidence for co-segregation with illness, let alone heritability for many neurophysiological and neurocognitive measures that have been suggested as schizophrenia endophenotypes.9, 10 The extent to which this matters depends upon the use to which an endophenotype is being put and it is helpful to consider the possibilities in some detail.

Broadly speaking, endophenotypes are used under two circumstances. First, as originally proposed,1, 2 they are used to aid in the discovery of novel genes. The critical assumption here, as we have seen, is that the genetic architecture of the endophenotype is simpler than that of the disease phenotype and this, together with the opportunity to study clinically unaffected relatives, who display the endophenotype by virtue of their increased risk of disorder, should increase power. However, there are a number of concerns that need to be addressed before the wholesale adoption of endophenotype approaches in gene-finding studies. First, it is often not clear how state-independent many of the measures proposed really are, with the potential for contamination not just by fluctuations in course of illness and drug treatment, but also by factors such as smoking and menstrual cycle phase.9, 11 Second, there are uncertainties about reliability and particularly inter-laboratory variation for many of the methods espoused. For example, there are no generally agreed protocols for reliably eliciting electrophysiological deficits in pre-pulse inhibition (PPI) and P50 sensory gating, which have been proposed as endophenotypes in schizophrenia, and there are also concerns regarding test–retest reliability.9 The neuroimaging community is still struggling with issues of reliability as well as with developing methods to allow data obtained from different scanners to be meaningfully combined.12 The neurocognition literature continues to be overwhelmed by the use of multiple variations of tests for the same cognitive domains and poor examination of reliability issues, although consensus approaches and computer-delivered batteries should help.10, 13 Fortunately, many of these issues are being addressed by the development of multisite initiatives such as the Consortium on the Genetics of Endophenotypes in Schizophrenia.14, 15 Such projects represent an advance in this field in aiming to address reliability concerns by standardizing methodologies for electrophysiological and neurocognitive measures with regular monitoring of procedures, training and reliability.

The main justification for the use of endophenotypes in gene-finding studies is their assumed genetic simplicity in comparison to complex disease phenotypes. Originally, it was hoped that this would allow disorders such as schizophrenia to be decomposed into a set of single-gene deficits. Most recent commentators accept that this is unlikely although they still believe that the genetic architecture of endophenotypes will be significantly simpler than that of disease. Recently, even this assumption has been challenged on both empirical and theoretical grounds.16 The authors acknowledge that their work has some limitations, but it seems likely that endophenotypes, like complex diseases, will reflect the operation of many genes of small effect. This work cautions against unfettered enthusiasm, but it would seem to be premature to exclude the possibility that endophenotypes will be useful in defining more aetiologically homogeneous groups. Moreover, the use of carefully designed measures offers the possibility of phenotypes that are more reliable and objective than those based on patient report as well as the potential to harness the increased power that accrues from the use of quantitative phenotypes. However, it is also important to note that the cost of measuring some endophenotypes, particularly those based on neuroimaging, currently prohibits their application to the large samples required for gene-finding studies.

It is generally assumed that to be useful as an endophenotype, a trait should lie on the causal pathway between genes and the disorder.2, 3, 4, 5, 6, 7 Indeed, this would seem to be fundamental to the concept. However, this is one of the most difficult criteria to demonstrate. It is often indirectly inferred if a trait that is associated with a disorder is found in unaffected relatives. However, as we have seen, this does not exclude the possibility that the trait bears an epiphenomenal relationship with respect to genes and the disorder (Figure 1 and Table 1). To prove causality, longitudinal studies are required, preferably in a genetically informative design,8 although even here intervention studies are really required to prove that the putative endophenotype lies on the causal path to disorder. It is worth pointing out that proof that a trait lies on the disease pathway is not, strictly speaking, required for a trait to be a useful aid to finding disease genes; a trait that is epiphenomenal will do just as well here as long as it simplifies the genetic architecture by defining a more genetically homogeneous disease subgroup or identifies carriers of the risk genotype among unaffected relatives. However, some might argue that the more general term ‘biomarker’ is more appropriate in this instance.

Given these difficulties, it is perhaps not surprising that to date there are few examples of the use of endophenotypes leading to the identification of novel risk genes or even robust linkages for psychiatric phenotypes. An elegant and instructive exception is a series of studies by the Collaborative Study of the Genetics of Alcoholism (COGA) in which electrophysiological endophenotypes were studied in addition to clinical diagnoses.17 Here the use of endophenotypes substantially improved both the strength and localization of linkage findings and allowed the identification of GABRA2 and CHRM2 as genes associated with predisposition to alcohol dependence.17, 18 In this case, the utility of analysing electrophysiological data seems to have been that it allowed broad linkage signals reflecting linkage to more than one locus to be decomposed into constituent signals reflecting variation in individual genes; gains in power resulted from a combination of greater genetic homogeneity and the use of quantitative phenotypes. Further studies have used individual and combined endophenotypes in genetic linkage studies of schizophrenia, although as yet these studies have not resulted in the unambiguous identification of susceptibility genes.19, 20, 21

The second use of endophenotypes is to study the functional consequences of risk alleles rather than as a means of identifying novel risk genes.22 This approach is becoming increasingly popular as evidence for susceptibility genes accumulates since it is perceived as holding much promise for establishing disease mechanisms, for example, by seeking associations of risk alleles with structural or neurocognitive variables.22 However, there are dangers arising from the fact that, as we have seen, it is difficult to establish that traits are actually on the disease pathway even if all the generally agreed criteria for an endophenotype have been met (Figure 1 and Table 1). Most would agree that if a trait is associated with a disease, satisfies the other criteria for an endophenotype and is associated with the presence of a robustly associated risk allele, then this is prima facie, if not conclusive, evidence that it lies on the disease pathway. Indeed once these criteria are met, there are statistical approaches available to determine the extent to which a putative endophenotype might mediate a gene–disease association.4
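To illustrate the general kind of calculation involved, the following is a minimal sketch of a product-of-coefficients mediation analysis on simulated data. It is not taken from the cited work, and it is only one of several possible approaches. The function names, the linear model, the effect sizes and the use of a continuous outcome standing in for disease liability are all assumptions made purely for illustration.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def mediation_summary(genotype, endophenotype, outcome):
    """Split the genotype-outcome regression slope into an indirect part
    (through the endophenotype) and a direct part, using ordinary least
    squares written out in closed form."""
    v_g = cov(genotype, genotype)
    v_e = cov(endophenotype, endophenotype)
    c_ge = cov(genotype, endophenotype)
    c_gy = cov(genotype, outcome)
    c_ey = cov(endophenotype, outcome)
    denom = v_g * v_e - c_ge ** 2

    a = c_ge / v_g                               # genotype -> endophenotype
    b = (c_ey * v_g - c_ge * c_gy) / denom       # endophenotype -> outcome, adjusted for genotype
    direct = (c_gy * v_e - c_ge * c_ey) / denom  # genotype -> outcome, adjusted for endophenotype
    total = c_gy / v_g                           # genotype -> outcome, unadjusted
    indirect = a * b
    return {
        "total": total,
        "direct": direct,
        "indirect": indirect,
        "proportion_mediated": indirect / total,
    }


def simulate(n=20000, seed=0):
    """Hypothetical data: risk-allele count, a quantitative endophenotype and
    a continuous outcome. All effect sizes are invented for illustration."""
    rng = random.Random(seed)
    genotype, endophenotype, outcome = [], [], []
    for _ in range(n):
        g = (rng.random() < 0.3) + (rng.random() < 0.3)  # 0, 1 or 2 risk alleles
        e = 0.5 * g + rng.gauss(0, 1)                    # allele shifts the endophenotype
        y = 0.4 * e + 0.1 * g + rng.gauss(0, 1)          # partly mediated, partly direct
        genotype.append(g)
        endophenotype.append(e)
        outcome.append(y)
    return genotype, endophenotype, outcome


if __name__ == "__main__":
    summary = mediation_summary(*simulate())
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

Under the effect sizes assumed here, about two-thirds of the simulated gene–outcome association passes through the endophenotype. The arithmetic presupposes the causal ordering of gene, endophenotype and outcome; it does not by itself establish that ordering.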

However, a number of dangers present themselves in the absence of robust evidence for disease association with a specific allele or alleles, as is currently the case for most, if not all, associations in psychiatric genetics. In particular, there is the potential for ill-substantiated disease associations to gain spurious support from associations with endophenotypes; the combination of uncertain genetic associations leading to multiple testing of alleles or haplotypes together with the analysis of multiple endophenotypes leads to the potential generation of false positives that gain spurious credibility from appeals to the endophenotype concept. The solutions to this problem are clear, if not easily attained.23 Such studies should be based upon robust evidence for association with the primary disease phenotype, which will usually require large samples, low P-values and evidence of replication. Fine mapping studies are then required to allow exclusion of many polymorphisms in the region as causal variants. This will then leave a subset of putative candidate variants that will usually show linkage disequilibrium with each other. These will then be suitable for further genetic studies of endophenotypes, where the per-gene effect size might be larger than that for the clinical phenotype, although it is here that researchers will need to remain vigilant that the endophenotypes studied are true mediating variables.
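The scale of the multiple-testing problem described in this paragraph can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that every variant-by-endophenotype test is independent and that no true effects exist. The function names and the example figures of 10 variants and 5 endophenotypes are hypothetical.

```python
def familywise_error_rate(n_variants, n_endophenotypes, alpha=0.05):
    """Probability of at least one nominally significant result when every
    variant is tested against every endophenotype, assuming independent
    tests and no true effects."""
    n_tests = n_variants * n_endophenotypes
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_variants, n_endophenotypes, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise error
    rate at or below alpha."""
    return alpha / (n_variants * n_endophenotypes)


if __name__ == "__main__":
    # Hypothetical study: 10 candidate variants, 5 endophenotypes.
    print(f"family-wise error rate: {familywise_error_rate(10, 5):.3f}")
    print(f"Bonferroni threshold:   {bonferroni_threshold(10, 5):.4f}")
```

With 50 such tests at a nominal threshold of 0.05, the chance of at least one false positive is about 0.92, and a Bonferroni correction would require a per-test P-value below 0.001. Real tests are correlated, because variants are in linkage disequilibrium and endophenotypes overlap, so the independence figure is only a rough guide.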

It has not been our intention to try to dissuade researchers from using the endophenotype approach. Indeed, we see great potential for its use in psychiatric genetics. However, we do believe that the use of endophenotypes needs to be more carefully considered and more attention given to the choice of possible measures with specific use and setting in mind. In our view, the use of endophenotypes in gene finding, by both linkage and association approaches, is most likely to add value when multiple measures are employed in combination with clinical data.17, 20, 24 Putative endophenotypes should be chosen on the basis of robust evidence not only that they are associated with the disease but also that the association reflects shared genes (Table 1). Fortunately, increasing attention is now being given to this latter issue.9, 10, 25, 26, 27, 28 Whether or not the genetic architecture of endophenotypes will be substantially simpler overall than that of the diseases to which they are related remains to be seen,16 but there are encouraging signs, from the COGA work cited above and from work on non-psychiatric disorders,29, 30, 31 that their use might make complex traits more tractable to genetic analysis.

There also seems to be great potential to use objectively measurable endophenotypes to illuminate the brain mechanisms linking specific gene actions and products to the subjective experience of psychopathological symptoms. However, this work must be based upon robust genetic associations that have been simplified by fine mapping, and researchers should remain vigilant for evidence of pleiotropy. Given the complexities of psychiatric phenotypes and our lack of understanding of disease pathogenesis, the penalties for not adhering to this could be severe.

Finally, we believe that the above considerations are relevant to the question of nomenclature. There are some who prefer the term ‘intermediate phenotype’ to ‘endophenotype’ because it implies a position on the pathway between genes and disease. It is for precisely this reason that we prefer the more mechanistically neutral term ‘endophenotype’ at least until convincing evidence for mediation has been obtained.