Abstract
Purpose Panel germline testing allows for the efficient detection of pathogenic variants for multiple conditions. However, because the benefits and harms of identifying these variants are not always well-understood, it may not be beneficial to make panels arbitrarily large.
Methods We present a multi-gene, multi-disease aggregate utility formula that allows the user to consider the addition or removal of each gene based on its own merits. This formula takes as inputs variant frequency, penetrance estimates, and subjective disutilities for false positives (testing positive but not developing the disease) and false negatives (testing negative but developing the disease). We provide credible intervals for utility that reflect uncertainty in penetrance estimates.
Results Rare, highly penetrant pathogenic variants tend to contribute positive net utilities for a wide variety of user-specified utility costs and even when accounting for uncertainty in parameter estimation. On the other hand, for pathogenic variants of moderate, uncertain penetrance, the clinical utility is more dependent on assumed disutilities.
Conclusion The decision to include a gene on a panel depends on variant frequency, penetrance, and subjective utilities and should account for uncertainties around these factors. Our framework and accompanying webtool help quantify the utility of testing particular genes.
Introduction
Genetic screening for pathogenic variants can be a valuable component of risk management for hereditary diseases1,2. Testing results can prompt heightened surveillance, prophylactic surgery, and other measures to enhance prevention or treatment. Technological advances such as next-generation sequencing have made simultaneous testing of multiple genes cheaper and more accurate than ever before3,4. Panel studies have led to many clinically significant findings that would have been missed by single-gene or single-syndrome testing5–8. However, such comprehensive panel germline testing may not have clinical utility in all contexts.
For some genes and diseases, published guidelines provide best practices about actions to take when pathogenic variants are identified in the context of diagnostic testing. For other genes and settings (e.g. secondary findings or population screening), there is a lack of consensus on whether interventions should be recommended, often because the disease penetrance (the probability that carriers of pathogenic variants will develop disease) is low or unknown. For example, where penetrance has been estimated through families with strong family histories, the penetrance estimates in population screening may still be uncertain9. If the benefits, risks, and guidelines are unclear for these genes, it could be harmful rather than beneficial to include them in a testing panel. Instead of mitigating risk and improving outcomes, testing may lead to unnecessary surveillance and overtreatment.
We consider a scenario in which the goal is to determine which genes should be included in a panel as part of non-diagnostic screening for a fixed group of diseases. While our approach is readily generalizable to other settings, such as indication-based screening and the incorporation of variants of uncertain significance (VUSs), we will focus on population-based screening for pathogenic variants in asymptomatic individuals as our motivating context. We propose an aggregate utility function that incorporates quantitative measures of genetic and disease characteristics (carrier prevalences and disease penetrances) and utility benefits and costs (sometimes termed “disutility”) for multiple diseases and germline tests. Positive utilities could include identifying individuals at high risk for disease who would benefit from intervention and who would remain unrecognized in the absence of testing. Disutilities could include anxiety or false reassurance in response to test results and overtreatment. Utilities and disutilities can be individualized for specific diseases and tests, as well as patient and clinician concerns. This approach generates a single net utility across all genes proposed for inclusion in a panel, but our construction also allows for the evaluation of each disease and gene combination on its own merits.
Additionally, we incorporate credible intervals for disease penetrances that reflect our confidence in available penetrance estimates and propagate this uncertainty into the net utility calculation. For sufficiently large penetrances, the net utility may provide evidence in favor of keeping the test even when the penetrance estimate is unreliable. For low or moderate penetrances, the net utility may point toward removing the gene from the panel or needing to improve the reliability of the penetrance estimate. This utility approach can be used to help formalize the decision-making process when designing a gene panel. Some approaches that take uncertainty in risk estimates into account for other domains include Berry & Parmigiani10 and Ding et al.11
In the Materials and Methods section, we formulate general expressions for our proposed multi-gene, multi-disease aggregate utility. An illustration of our approach to germline testing for pathogenic variants on five genes (ATM, BRCA1, BRCA2, CHEK2, and PALB2) associated with increased risk of developing breast cancer follows in the Results section, along with a broad exploration of the effects of varying parameter inputs (true penetrance, uncertainty in penetrance estimates, variant frequencies, relative utilities). We conclude with a Discussion. Our method is implemented in an R Shiny app, freely accessible at https://janewliang.shinyapps.io/agg_utility, where users can enter parameter estimates and uncertainties for calculating their own net utilities.
Materials and Methods
Suppose that we are interested in risk assessment for some predetermined set of diseases, indexed i = 1, …, I, and are considering the genes j = 1, …, J to be included in a panel for germline testing. We define an aggregate utility expression in terms of the following notation:
Di ∈ {0, 1} is the indicator for developing disease i.
Gj ∈ {0, 1} is the indicator for testing positive for carrying a pathogenic variant on gene j.
Cij11 is the utility associated with a true positive, i.e. the individual tests positive for carrying a pathogenic variant on gene j and does develop disease i.
Cij00 is the utility associated with a true negative, i.e. the individual tests negative for carrying a pathogenic variant for gene j and does not develop disease i.
Cij01 is the utility associated with a false positive, defined here as instances where the individual tests positive for carrying a pathogenic variant on gene j but does not develop disease i. Assume that Cij01 < Cij00.
Cij10 is the utility associated with a false negative, defined here as instances where the individual tests negative for carrying a pathogenic variant for gene j but develops disease i. Assume that Cij10 < Cij11.
We emphasize that the terms “false positive” and “false negative” are used to refer to incomplete penetrance, as opposed to genotyping errors or misclassifying pathogenic variants.
Let δij01 = Cij00 − Cij01 be the utility cost associated with a false positive test or alternatively the utility benefit of a true negative test, for gene j in relation to disease i, e.g. unnecessary surveillance and over-treatment and possible anxiety due to a positive test. Similarly, define δij10 = Cij11 − Cij10 as the utility cost associated with a false negative test, or alternatively the utility benefit of a true positive test, for gene j in relation to disease i, e.g. missed opportunity for surveillance and prevention along with false reassurance. We assume both δij01 and δij10 are greater than 0. (We do not consider situations where either δij10 or δij01 is negative, although we note that these may exist; for example, where the utility of a false negative is larger than the utility of a true positive [where “the cure is worse than the disease”].) Finally, let Kij be the net utility cost (potentially including psychological or physical harms) associated with conducting the test for gene j in relation to disease i, independent of test results. Then the net utility for disease i in the setting where we test for gene j is
$$C_{ij}^{11}\Pr(D_i = 1, G_j = 1) + C_{ij}^{10}\Pr(D_i = 1, G_j = 0) + C_{ij}^{01}\Pr(D_i = 0, G_j = 1) + C_{ij}^{00}\Pr(D_i = 0, G_j = 0) - K_{ij}. \qquad (1)$$
Assuming that the utility associated with developing disease i in the absence of testing information for gene j is equal to Cij10, and assuming that the utility for not developing disease i in the absence of testing is equal to Cij00, the net utility for disease i in the scenario where we do not test for gene j is
$$C_{ij}^{10}\Pr(D_i = 1) + C_{ij}^{00}\Pr(D_i = 0). \qquad (2)$$
Of interest is the difference in utility for disease i when testing vs. not testing for gene j, which we define as the difference of Eq. (1) and Eq. (2):
$$\Delta_{ij} = \delta_{ij}^{10}\Pr(D_i = 1, G_j = 1) - \delta_{ij}^{01}\Pr(D_i = 0, G_j = 1) - K_{ij}.$$
This difference in utility can be re-expressed in terms of Pr(Gj = 1), the prevalence of pathogenic variants in gene j, and Pr(Di = 1 | Gj = 1), the penetrance, i.e. the cumulative lifetime risk of developing disease i given that one carries a pathogenic variant in gene j:
$$\Delta_{ij} = \Pr(G_j = 1)\left[\delta_{ij}^{10}\Pr(D_i = 1 \mid G_j = 1) - \delta_{ij}^{01}\left\{1 - \Pr(D_i = 1 \mid G_j = 1)\right\}\right] - K_{ij}.$$
It is beneficial to test for gene j when Δij > 0, which occurs when the utility for testing is greater than the utility for not testing. For simplicity, we will generally treat testing for a given gene as testing for a particular pathogenic variant in the gene, but the framework readily extends to handle variant-specific tests, prevalences, and penetrances.
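As a minimal sketch of the decision rule, the difference in utility follows from the definitions above as prevalence × (δij10 × penetrance − δij01 × (1 − penetrance)) − Kij. The function name, argument names, and input values below are illustrative assumptions, not part of the accompanying app.

```python
def net_utility(prevalence, penetrance, delta_fn, delta_fp, test_cost=0.0):
    """Difference in utility (testing minus not testing) for one gene-disease pair.

    prevalence : Pr(G_j = 1), carrier prevalence
    penetrance : Pr(D_i = 1 | G_j = 1)
    delta_fn   : utility cost of a false negative (delta_ij^10)
    delta_fp   : utility cost of a false positive (delta_ij^01)
    test_cost  : utility cost of testing itself (K_ij)
    """
    true_positive = prevalence * penetrance            # Pr(D = 1, G = 1)
    false_positive = prevalence * (1.0 - penetrance)   # Pr(D = 0, G = 1)
    return delta_fn * true_positive - delta_fp * false_positive - test_cost


# Hypothetical inputs: equal costs, no testing cost, prevalence of 1 in 1000.
high_penetrance = net_utility(0.001, 0.8, delta_fn=1.0, delta_fp=1.0)
low_penetrance = net_utility(0.001, 0.2, delta_fn=1.0, delta_fp=1.0)

print("include (high penetrance):", high_penetrance > 0)
print("include (low penetrance):", low_penetrance > 0)
```

With equal false positive and false negative costs, the sign of the result is determined by whether the penetrance exceeds one half.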
For multiple diseases (indexed by i) and tests (indexed by j), the aggregate utility Δ sums over all combinations of i and j. Doing so requires carrier prevalence and disease penetrance estimates for each gene and disease, as well as the specification of utility costs for false positives, false negatives, and testing. Δ provides a simple summary value while still allowing genes to be evaluated individually:
$$\Delta = \sum_{i=1}^{I}\sum_{j=1}^{J}\Delta_{ij}.$$
An additional Kij can be included for each (i, j) combination (with perhaps less weight given for each additional test), or a single overall utility cost for all testing can be used. Since Δ is the sum of the net utilities of particular disease-gene pairs (i, j), the decision whether or not to include a given test on a multi-gene, multi-disease panel depends only on the net utility of that test.
The number of utility cost parameters δij01, δij10, and Kij grows as the number of diseases and tests increases, but one can consider simplifications such as assuming the same costs across diseases/tests or subgroups of diseases/tests. For example, it may be reasonable to assume that the utility cost of each test for an additional gene j is negligible. Specification of these utility costs is largely subjective and should depend on clinical setting and patient concerns.
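The additive structure can be sketched as follows, under the simplifying assumption that all pairs share the same utility costs. The panel, gene labels, and numbers are hypothetical and chosen only to show that each pair can be judged by the sign of its own contribution.

```python
def aggregate_net_utility(pairs, delta_fn, delta_fp, test_cost=0.0):
    """Return per-pair net utilities and their sum (the aggregate).

    pairs maps a (disease, gene) label to (prevalence, penetrance).
    The same delta_fn, delta_fp and test_cost are assumed for every pair.
    """
    per_pair = {
        label: prev * (delta_fn * pen - delta_fp * (1.0 - pen)) - test_cost
        for label, (prev, pen) in pairs.items()
    }
    return per_pair, sum(per_pair.values())


# Hypothetical panel: one disease, three candidate genes.
panel = {
    ("disease", "gene_A"): (0.001, 0.80),
    ("disease", "gene_B"): (0.002, 0.40),
    ("disease", "gene_C"): (0.003, 0.15),
}
per_pair, total = aggregate_net_utility(panel, delta_fn=1.0, delta_fp=1.0)

# Because the aggregate is a plain sum, keeping only the pairs with a
# positive contribution can never lower it.
kept = {label: u for label, u in per_pair.items() if u > 0}
total_kept = sum(kept.values())

for label, u in per_pair.items():
    print(label, "keep" if u > 0 else "drop")
```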
Utility threshold
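For an individual test for disease i and gene j, a utility threshold can be defined as the value bij = δij10/δij01 for which Δij = 0:
$$\Pr(G_j = 1)\left[\delta_{ij}^{10}\Pr(D_i = 1 \mid G_j = 1) - \delta_{ij}^{01}\left\{1 - \Pr(D_i = 1 \mid G_j = 1)\right\}\right] - K_{ij} = 0,$$
which implies
$$\delta_{ij}^{10}\Pr(D_i = 1 \mid G_j = 1) = \delta_{ij}^{01}\left\{1 - \Pr(D_i = 1 \mid G_j = 1)\right\} + \frac{K_{ij}}{\Pr(G_j = 1)},$$
so
$$b_{ij} = \frac{\delta_{ij}^{10}}{\delta_{ij}^{01}} = \frac{1 - \Pr(D_i = 1 \mid G_j = 1)}{\Pr(D_i = 1 \mid G_j = 1)} + \frac{K_{ij}}{\delta_{ij}^{01}\Pr(G_j = 1)\Pr(D_i = 1 \mid G_j = 1)}.$$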
Note that when Kij = 0, bij = 1/Pr(Di = 1 | Gj = 1) − 1 and depends only on the penetrance Pr(Di = 1 | Gj = 1). If the ratio of the cost of a false negative to the cost of a false positive is greater than bij, then including gene j to test for disease i has positive net utility. If the ratio needed to achieve a non-negative utility is unreasonable (e.g. in many settings ascribing a far higher cost to false negatives than to false positives would be inappropriate), then the test should not be kept as part of the panel. Basing analysis around a threshold ratio allows for an alternative interpretation that does not require upfront specification of the utility costs δij01, δij10, and Kij.
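If one assumes that δij01 = δ01 and δij10 = δ10 for all values of i and j, then
$$\Delta = \sum_{i}\sum_{j}\left\{\Pr(G_j = 1)\left[\delta^{10}\Pr(D_i = 1 \mid G_j = 1) - \delta^{01}\left\{1 - \Pr(D_i = 1 \mid G_j = 1)\right\}\right] - K_{ij}\right\}$$
and the threshold b = δ10/δ01 when Δ = 0 can be expressed as
$$\begin{aligned} b &= \frac{\sum_{i}\sum_{j}\Pr(G_j = 1)\left\{1 - \Pr(D_i = 1 \mid G_j = 1)\right\} + \sum_{i}\sum_{j}K_{ij}/\delta^{01}}{\sum_{i}\sum_{j}\Pr(G_j = 1)\Pr(D_i = 1 \mid G_j = 1)} \\ &= \frac{\sum_{i}\sum_{j}\Pr(G_j = 1)\left\{1 - \Pr(D_i = 1 \mid G_j = 1)\right\}}{\sum_{i}\sum_{j}\Pr(G_j = 1)\Pr(D_i = 1 \mid G_j = 1)}, \end{aligned}$$
where the last line holds when Kij = 0 for all i, j.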
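Setting the net utility to zero and solving for the ratio of false negative to false positive cost gives the thresholds discussed above. The sketch below is one way to compute them; the function names and example inputs are assumptions made for illustration.

```python
def utility_threshold(prevalence, penetrance, delta_fp=1.0, test_cost=0.0):
    """Ratio delta_fn / delta_fp at which a single pair has zero net utility."""
    return ((1.0 - penetrance) / penetrance
            + test_cost / (delta_fp * prevalence * penetrance))


def aggregate_threshold(pairs, delta_fp=1.0, total_test_cost=0.0):
    """Threshold for a panel with shared costs; pairs holds (prevalence, penetrance)."""
    numerator = (sum(prev * (1.0 - pen) for prev, pen in pairs)
                 + total_test_cost / delta_fp)
    denominator = sum(prev * pen for prev, pen in pairs)
    return numerator / denominator


# Hypothetical inputs.
b_free = utility_threshold(0.001, 0.5)                      # no testing cost
b_costly = utility_threshold(0.001, 0.5, test_cost=0.0001)  # small testing cost
b_panel = aggregate_threshold([(0.001, 0.8), (0.003, 0.2)])

print(b_free, b_costly, b_panel)
```

With no testing cost the single-pair threshold depends on the penetrance alone, whereas a nonzero testing cost raises the threshold more for rarer variants.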
Uncertainty distribution for disease penetrance
Of additional interest is the incorporation of uncertainty in the penetrance estimates Pr(Di = 1 | Gj = 1). Denoting pij = Pr(Di = 1 | Gj = 1) for a given disease i and gene j, we model the uncertainty in the penetrance pij as a beta distribution Beta(αij, βij). One can motivate the choice of the parameters αij and βij by conceiving the penetrance’s uncertainty distribution as the posterior distribution from a trial of nij carriers of pathogenic variant j. Then, set αij = nijpij to represent the expected number of cases of disease i in the trial and set βij = nij(1 − pij) to represent the expected number of individuals who do not develop the disease. Through specification of the precision nij, we can express our confidence level in the estimation of pij, with larger values of nij corresponding to a greater degree of certainty about the estimate and smaller ones indicating less confidence.
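A minimal sketch of this parameterization, using only the Python standard library, is given below. The point estimate of 0.35 and the two precision values are hypothetical, and the helper names are assumptions.

```python
import random
import statistics


def beta_parameters(penetrance, precision):
    """alpha = n * p (expected cases), beta = n * (1 - p) (expected non-cases)."""
    return precision * penetrance, precision * (1.0 - penetrance)


def sample_penetrance(penetrance, precision, draws=10_000, seed=0):
    """Draw from the Beta(alpha, beta) uncertainty distribution of the penetrance."""
    rng = random.Random(seed)
    alpha, beta = beta_parameters(penetrance, precision)
    return [rng.betavariate(alpha, beta) for _ in range(draws)]


# Hypothetical point estimate of 0.35 under two precision levels.
wide = sample_penetrance(0.35, precision=100)
narrow = sample_penetrance(0.35, precision=10_000)

spread_wide = statistics.stdev(wide)
spread_narrow = statistics.stdev(narrow)
print("spread shrinks with precision:", spread_narrow < spread_wide)
```

Both sets of draws are centered on the same point estimate; only the spread changes with the precision.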
The uncertainty from pij can then be propagated into a distribution and credible interval for the corresponding Δij and the aggregate Δ (assuming independence of pij across all i, j), as well as additional summary values. We will assume that we are not concerned about incorporating uncertainty from estimating Pr(Gj = 1). The probability that the individual net utility Δij is positive (i.e. adding the test for gene j makes an improvement) can be written as
$$\begin{aligned} \Pr(\Delta_{ij} > 0) &= \Pr\left(\Pr(G_j = 1)\left[\delta_{ij}^{10}p_{ij} - \delta_{ij}^{01}(1 - p_{ij})\right] - K_{ij} > 0\right) \\ &= \Pr\left(p_{ij} > \frac{\delta_{ij}^{01} + K_{ij}/\Pr(G_j = 1)}{\delta_{ij}^{10} + \delta_{ij}^{01}}\right) \\ &= \Pr\left(p_{ij} > \frac{\delta_{ij}^{01}}{\delta_{ij}^{10} + \delta_{ij}^{01}}\right), \end{aligned}$$
where the last line holds if Kij = 0. Pr(Δij > 0) does not generally have a closed form, but can be calculated empirically from the sampling distributions of the pij. One can also derive a lower bound on the estimated Δij that accounts for uncertainty by plugging in the fifth percentiles of the pij in the uncertainty distributions in place of Pr(Di = 1 | Gj = 1) in Equation (6). This fifth percentile represents a “near-worst case scenario” for the net utility in which the true disease penetrance is at the low end of its credible range.
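The two summaries can be approximated by simulation. The sketch below draws penetrances from the beta uncertainty distribution, counts how often the resulting net utility is positive, and plugs the empirical fifth percentile of the penetrance into the net utility. The function names, number of draws, and inputs are illustrative assumptions.

```python
import random


def net_utility(prevalence, penetrance, delta_fn, delta_fp, test_cost=0.0):
    return prevalence * (delta_fn * penetrance
                         - delta_fp * (1.0 - penetrance)) - test_cost


def uncertainty_summaries(prevalence, penetrance, precision,
                          delta_fn, delta_fp, test_cost=0.0,
                          draws=20_000, seed=1):
    """Monte Carlo estimate of Pr(net utility > 0) and a fifth-percentile bound."""
    rng = random.Random(seed)
    alpha = precision * penetrance
    beta = precision * (1.0 - penetrance)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))

    positive = sum(
        net_utility(prevalence, p, delta_fn, delta_fp, test_cost) > 0
        for p in samples
    )
    prob_positive = positive / draws

    # Net utility increases with penetrance when both costs are positive, so
    # the fifth percentile of the penetrance yields the fifth percentile of
    # the net utility.
    p05 = samples[int(0.05 * draws)]
    lower_bound = net_utility(prevalence, p05, delta_fn, delta_fp, test_cost)
    return prob_positive, lower_bound


# Hypothetical gene: prevalence 0.002, penetrance 0.35, precision 100.
prob_hi, lower_hi = uncertainty_summaries(0.002, 0.35, 100, delta_fn=10.0, delta_fp=1.0)
prob_lo, lower_lo = uncertainty_summaries(0.002, 0.35, 100, delta_fn=0.1, delta_fp=1.0)
print(prob_hi, prob_lo)
```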
Results
Female breast cancer application
We first consider a specific application for the aggregate utility approach that incorporates panel germline testing for ATM, BRCA1, BRCA2, CHEK2, and PALB2 (J = 5) as part of risk assessment for female breast cancer (I = 1), for hypothetical screening of a woman without a previous breast cancer diagnosis or breast cancer family history. These five genes are commonly included in risk panels for hereditary breast cancer. We chose this example for its familiarity and relevance, as well as the availability of empirical estimates of the lifetime risk of breast cancer in women for carriers of pathogenic variants in these genes and their relative precisions. We stress that these results are largely presented for illustrative purposes. In particular, although our understanding of the absolute and relative uncertainty in the penetrance estimates for these genes is changing as more data become available in more diverse populations, the lifetime risk for carriers of pathogenic variants in these genes is relatively well known; the uncertainty in penetrance estimates for other diseases and other genes is often much greater12,13. Users are free to input their own prevalence and penetrance estimates, as well as the uncertainty in the penetrance estimates, into our R Shiny app to calculate their impact on likely net utility.
Pathogenic variants in these genes have all been linked to breast cancer, but some are better studied than others. BRCA1 and BRCA2 pathogenic variants are highly penetrant with widely adopted guidelines for enhanced screening and other clinical interventions4,14–16. While pathogenic variants of ATM, CHEK2, and PALB2 have been linked to breast cancer, the additional risk conferred is not as well-understood17–20, especially among individuals with non-European ancestries. In this section, quantities (prevalences and penetrances) involving ATM, BRCA1, BRCA2, and PALB2 are estimated for any pathogenic variant in the given gene; for CHEK2, quantities are for the 1100delC variant only.
The carrier prevalences for pathogenic variants on BRCA1 (0.00058) and BRCA2 (0.00068) are calculated based on allele frequency estimates reported in Antoniou et al.21 (see also Dullens et al.22; Krassuski et al.23). Those for ATM (0.0019), CHEK2 (0.0026), and PALB2 (0.00057) are calculated based on allele frequencies reported in Lee et al.24 Cumulative lifetime penetrance estimates for female breast cancer are taken from the literature review performed by the All Syndromes Known to Man Evaluator25–29: 0.35 (ATM), 0.73 (BRCA1), 0.72 (BRCA2), 0.19 (CHEK2), and 0.38 (PALB2). This is the genotype-specific probability of developing breast cancer among females, prior to dying.
To reflect our greater confidence in the penetrance estimates for BRCA1 and BRCA2, we use nij = 10000, i.e. a trial size/precision of 10,000, to specify the parameters in their beta uncertainty distributions. For ATM, CHEK2, and PALB2, we specify nij = 100, i.e. a smaller trial size of 100. (We chose these values to illustrate the impact of uncertainty on net utility calculations. They should not be taken as indicative of the absolute or relative strength of the available data on the penetrance of pathogenic variants in these genes.) Figure S1 plots the beta uncertainty distributions of the five lifetime penetrance estimates. The wider spread for ATM, CHEK2, and PALB2 reflects greater uncertainty. Table S1 summarizes the quantiles for these uncertainty distributions at 2.5%, 5%, 10%, 50%, 90%, 95%, and 97.5%.
Suppose in the case of the aggregate Δ that we take the false positive, false negative, and test utility costs to be the same across all five tests, denoted as δ01, δ10, and K, respectively.
This would be a suitable assumption in scenarios where the clinical intervention for those carrying any one of the pathogenic variants is the same, regardless of which gene the variant is found in. For simplicity, we will drop the ij subscript where it is clear that we refer to gene- and disease-specific utilities (e.g. Figures 1, S2-S4; Table 1). We fix K = 0 and δ01 = 1, allowing the ratio b = δ10/δ01 to vary from 0.1 to 10 in increments of log10(0.1). Figure 1 plots the net utilities against the ratio of false negative to false positive utility costs for each of the individual genes, as well as the aggregate Δ for all five genes. (Figure S2 depicts the same curves on a log-transformed x-axis, to help illustrate the behavior for small values.) Table 1 gives the utility threshold values with a 95% credible interval.
As expected, the credible intervals for the BRCA1 and BRCA2 net utilities are very narrow and the credible intervals for ATM, CHEK2, and PALB2 are wider, reflecting the widths of the credible intervals in their uncertainty penetrance distributions. The aggregate utility has the widest credible intervals of all, because it incorporates uncertainty from all five penetrance estimates.
The utility thresholds, defined as the ratios of the cost of a false negative to the cost of a false positive such that the net utility is 0, are quite low for BRCA1 (0.37) and BRCA2 (0.40). These thresholds are well below 1, so even in a scenario where one is highly concerned with avoiding false positive tests, there is a wide range of possible utility costs that can be specified to result in a positive net utility. The net utility for testing these two genes is positive, except for some extreme cases when δ10 is very low compared to δ01. The curves for the probability of the net utility being positive resemble step functions, with the jump from being 0% positive to 100% positive occurring at a sharp, early point.
In contrast, the less-penetrant genes have utility thresholds above one: ATM (1.9), CHEK2 (4.1), and PALB2 (1.6). In these cases, the false negative utility cost needs to outweigh the false positive utility cost in order for it to be beneficial to keep the gene in the panel, sometimes by a considerable amount. Because of the greater uncertainty in the penetrance estimates, the lower bound of the utility threshold credible interval is noticeably even less favorable. The probability curve for observing a positive net utility also bends toward 100% at a much more gradual incline. The aggregate utility threshold is somewhere intermediate (1.8), balancing between the larger and smaller effect sizes, as is the shape of the probability curve.
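The thresholds discussed above can be approximately reproduced from the prevalences and penetrances quoted earlier in this section, with K = 0. This is a sketch rather than the code behind Table 1: the inputs are the rounded values given in the text, so the second significant digit may differ slightly from the reported thresholds.

```python
# (carrier prevalence, lifetime penetrance) as quoted in the text, rounded.
genes = {
    "ATM":   (0.0019, 0.35),
    "BRCA1": (0.00058, 0.73),
    "BRCA2": (0.00068, 0.72),
    "CHEK2": (0.0026, 0.19),
    "PALB2": (0.00057, 0.38),
}

# With K = 0 the single-gene threshold is 1 / penetrance - 1.
thresholds = {gene: (1.0 - pen) / pen for gene, (prev, pen) in genes.items()}

# The aggregate threshold weights each gene by its carrier prevalence.
aggregate = (
    sum(prev * (1.0 - pen) for prev, pen in genes.values())
    / sum(prev * pen for prev, pen in genes.values())
)

for gene, b in thresholds.items():
    print(f"{gene}: {b:.2f}")
print(f"aggregate: {aggregate:.2f}")
```

The aggregate falls between the single-gene values because it is a prevalence-weighted combination, so the more common, less penetrant variants pull it above 1.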
Heatmaps of the individual and aggregate net utilities while holding K = 0 and varying δ01 and δ10 from 1 to 100 in increments of 5 are depicted in Figure 2. Similar heatmaps for the probability of a positive net utility and the fifth percentile net utility are shown in Figures S3 and S4. Utility thresholds based on the original penetrance estimates are drawn as solid black lines on all three sets of heatmaps. The dashed black reference lines have intercept 0 and slope 1, and correspond to cases where false positives and false negatives have equal costs. In Figure S4, the additional solid blue lines indicate the utility thresholds based on the fifth percentiles of the penetrances’ uncertainty distributions.
These heatmaps offer an alternative visualization as well as some additional insight on the behavior of the utilities under different false positive and false negative utility cost conditions. In Figure 2, the net utilities for BRCA1 and BRCA2 are positive (blue) for a much broader range of δ01 and δ10 values compared to the net utilities for the other genes, in concordance with the utility threshold discussion for Figure 1.
The utility threshold reference lines in the heatmaps for the probability of a positive utility (Figure S3) track with the regions where the probability of positive utility transitions from 0 (white) to 1 (dark blue). The sharp transitions for BRCA1 and BRCA2 reflect their tight credible intervals, and the more gradual transitions for ATM, CHEK2, and PALB2 reflect their wider credible intervals. The heatmaps for the fifth percentiles of the net utilities (Figure S4) closely resemble the heatmaps for the net utilities based on the original penetrance estimates. The utility thresholds for BRCA1 and BRCA2 are quite similar; there is more variability in the utility thresholds for ATM, CHEK2, and PALB2. Again, this reflects the wider credible intervals and uncertainty distributions for these genes. Under a “near-worst case scenario” interpretation (i.e. basing decisions about potential utility on the 5th percentile of the utility distribution), ATM, CHEK2, and PALB2 require the specification of even larger false negative costs relative to their false positive costs in order to result in positive net utilities for testing.
Net utility behavior as parameters vary
In order to explore the properties of our proposed utility expression across a wider range of scenarios, we varied the parameters influencing Δij as follows:
False negative utility cost δij10: Ranging from 0.1 to 10 in increments of log10(0.1)
Cumulative lifetime disease penetrance Pr(Di = 1 | Gj = 1): {0.2, 0.4, 0.6, 0.8, 0.99}
Carrier prevalence Pr(Gj = 1): {0.001, 0.002, 0.003, 0.004}
Precision nij used to specify parameters in the uncertainty distribution: {10, 100, 1000, 10000}
The chosen penetrance and carrier prevalence values reflect those seen in clinical practice. Lifetime penetrances vary between 0.195 and 0.732 and carrier prevalences vary between 0.00114 and 0.00519 in the female breast cancer application. The ClinGen actionability reports30 frequently list disease risks with broad ranges of possible values. For example, carriers of STK11 pathogenic variants have a 38-66% estimated risk of developing gastrointestinal cancer by age 60-70 and a 13-18% risk for gynecological cancer31. The penetrance of developing dopa-responsive dystonia among GCH1 carriers is 87-100% for females and 35-55% for males32. Carriers of MLH1, MSH2, MSH6 or PMS2 have a 25-70% cumulative risk for colorectal cancer by age 70 and 30-70% for endometrial cancer. Penetrances by age 70 for other cancers are generally lower in effect size and narrower in range of estimated values, including 1-9% for gastric, 2-16% for bladder, 6-14% for ovarian, 9-30% for prostate, and 5-14% for breast33.
As in the previous subsection, we set the test utility cost Kij to 0 and the false positive utility cost δij01 to 1, thereby normalizing the ratio to be bij = δij10/δij01 = δij10 and allowing it to range from 0.1 to 10 in increments of log10(0.1). Figure 3 plots Δij against bij for each combination of parameters, with prevalences varying in the rows and penetrances varying in the columns. (Figure S5 depicts the same curves on a log-transformed x-axis, to help illustrate the behavior for small values.) The colored shading represents 95% credible intervals for different values of nij. Dashed reference lines are drawn to indicate the utility threshold in each scenario. Table 2 gives these threshold values with a 95% credible interval.
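A sweep of this kind can be sketched as below. The spacing of the ratio grid is an assumption: the sketch uses 21 log-spaced values from 0.1 to 10 (steps of 0.1 on the log10 scale), which is one reading of the increments described above. The precision nij is omitted because it affects only the credible intervals, not the point values of Δij.

```python
import itertools

# Assumed grid: log-spaced ratios from 0.1 to 10 (step 0.1 in log10 units).
ratios = [10 ** (-1 + 0.1 * k) for k in range(21)]
penetrances = [0.2, 0.4, 0.6, 0.8, 0.99]
prevalences = [0.001, 0.002, 0.003, 0.004]


def net_utility(prevalence, penetrance, ratio):
    """Net utility with K = 0 and delta_fp = 1, so delta_fn equals the ratio."""
    return prevalence * (ratio * penetrance - (1.0 - penetrance))


grid = [
    {"prevalence": prev, "penetrance": pen, "ratio": b,
     "net_utility": net_utility(prev, pen, b)}
    for prev, pen, b in itertools.product(prevalences, penetrances, ratios)
]

# With K = 0 the threshold depends on penetrance only, not on prevalence.
thresholds = {pen: (1.0 - pen) / pen for pen in penetrances}

for pen, b in thresholds.items():
    print(f"penetrance {pen}: threshold {b:.3f}")
```

Prevalence scales the magnitude of each net utility but does not move the point at which it changes sign.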
Overall, higher carrier prevalences and lower penetrances tend to correspond to wider credible intervals for all precision levels. However, if the carrier prevalence is very low, the credible intervals remain consistently narrow even when the penetrance is also very low, and similarly when the disease penetrance is very high in the presence of high prevalence. So the net utilities for rare, highly penetrant genes are more likely to have narrow credible intervals, independent of the amount of confidence we have about the penetrance estimates.
Interestingly, the credible intervals for net utilities at given prevalence and penetrance values grow wider as δij10 (utility cost of a false negative) increases relative to δij01 (utility cost of a false positive). So, even supposing that one undervalues the relative cost of a false positive, the uncertainty from the distribution of the penetrance allows the decision of which genes to keep in the panel to be less dependent on the exact choice of δij01. Higher penetrances correspond to lower utility thresholds (since Kij = 0, the utility threshold does not depend on prevalence), which makes intuitive sense: high penetrance implies the proportion of carriers who are false positives is small, so interventions with larger false positive:false negative ratios can still have positive net utility.
Discussion
We have derived net utility expressions for usage in determining which genes add or detract utility from a genomic testing panel. These expressions are functions of carrier prevalence and disease penetrance estimates, as well as user-specified utility costs for false positives, false negatives, and testing. Our approach is flexible, and allows users to estimate the impact in a variety of clinical contexts, from population-level applications to screening of high-risk populations. We present utility thresholds, the probability of a positive utility, and lower bounds on the net utilities as summary values that can provide additional insight. The framework presented can be used to assist decision-making for a broad range of clinical uses for panel testing in asymptomatic patients. We evaluated the net utility of population screening for pathogenic variants in five breast-cancer predisposition genes, as well as a hypothetical range of disease penetrance, carrier prevalence, and precision (used for specifying the penetrance’s uncertainty distribution).
Our work provides a needed approach for estimating the incremental utility or disutility of genetic screening for pathogenic variants in numerous genes and conditions simultaneously. Published estimates about the clinical benefits of genetic screening to date have focused on conditions with reasonably developed evidence bases34–37. Yet, the ability of genomic sequencing to identify genetic variants associated with rare disorders, for which epidemiological evidence is typically limited, has been one of the most promising successes from advances in genetic testing capabilities38–41. Moreover, the American College of Medical Genetics and Genomics is integrating an increasing number of conditions into its recommendations for a minimal list for secondary findings disclosure, even when data about the penetrance of pathogenic variants in associated genes is limited42–44. The approach we have developed allows for better estimation of the benefits and harms of such recommendations, estimates that have been omitted from research to date37. Our tool provides a flexible approach that can accommodate varying measures of utility and disutility, including quality-adjusted life years, life years gained or lost, and death rates. Moreover, the tool can be easily tailored to accommodate utility and disutility for a variety of perspectives, from patient outcomes to societal impact45.
Our approach does not directly account for potential challenges in curation accuracy for pathogenic variants, and we generally do not distinguish between different pathogenic variants of the same gene (although our framework is easily modified to have several individual variants or classes of variants). We assume that modern germline testing technology detects pathogenic variants with near-perfect sensitivity and specificity. We further assume that the carrier prevalences can be estimated with a high degree of accuracy, such that they do not contribute a significant amount of additional uncertainty to the net utilities. When formulating the net utility expression, we treat untested individuals as being equivalent to those who test negative for the pathogenic variant(s) in question.
Further work can explore challenges when building a utility that incorporates many more genes or genes with unknown parameter estimates, as well as accounting for the age of the person being tested or measured polygenic risk scores44,46. We can also conduct a more rigorous exploration of simplifying assumptions that reduce the number of utility cost parameters that need to be specified. Nevertheless, our work provides a feasible approach to estimating the clinical benefits or harms of genetic screening. Tools such as ours are critically needed by policymakers and payers as they make decisions about how to regulate and reimburse the current generation of genomic tests.
Data availability
Data sharing is not applicable to this article as no new data were created or analyzed in this study. Our net utility calculations are available in an R Shiny app, freely accessible at https://janewliang.shinyapps.io/agg_utility. Code is available at https://github.com/janewliang/agg_utility.
Author contributions
Conceptualization: J.W.L., K.D.C., R.C.G., P.K.; formal analysis: J.W.L.; Methodology: J.W.L., P.K.; software: J.W.L.; visualization: J.W.L.; writing-original draft: J.W.L.; writing-review & editing: J.W.L., K.D.C., R.C.G., P.K.
Ethics declaration
None.
Conflict of interest
R.C.G. has received compensation for advising the following companies: AIA, Allelica, Fabric, Genome Web, Genomic Life, Grail, OptumLabs, Verily, VinBigData; and is co-founder of Genome Medical and Nurture Genomics.
Acknowledgements
J.W.L. was supported by the National Cancer Institute at the National Institutes of Health (5T32CA009337). K.D.C. was supported by the National Human Genome Research Institute (K01HG009173).
Footnotes
* pkraft{at}hsph.harvard.edu