## Abstract

Detecting somatic mutations within tumors is key to understanding treatment resistance, patient prognosis, and tumor evolution. Mutations at low allelic frequency, those present in only a small portion of tumor cells, are particularly difficult to detect. Many algorithms have been developed to detect such mutations, but none models a key aspect of tumor biology. Namely, every tumor has its own profile of mutation types that it tends to generate. We present BATCAVE (Bayesian Analysis Tools for Context-Aware Variant Evaluation), an algorithm that first learns the individual tumor mutational profile and mutation rate, then uses them in a prior for evaluating potential mutations. We also present an R implementation of the algorithm, built on the popular caller MuTect. Using simulations, we show that adding the BATCAVE algorithm to MuTect improves variant detection. It also improves the calibration of posterior probabilities, enabling a more principled tradeoff between precision and recall. We also show that BATCAVE performs well on real data. Our implementation is computationally inexpensive and straightforward to incorporate into existing MuTect pipelines. More broadly, the algorithm can be added to other variant callers, and it can be extended to include additional biological features that affect mutation generation.

## Introduction

Cancer develops through the accumulation of somatic mutations and clonal selection of cells with mutations that confer an advantage. Understanding the evolutionary history of a tumor, including the mutations that drive its growth, the genetic diversity within it, and the accumulation of new mutations, requires accurate variant identification, particularly at low variant allele frequency [1, 2, 3, 4]. Accurate variant calling is also critical for optimizing the treatment of individual patients’ disease [5, 6, 7, 8, 9]. Low frequency mutations challenge current variant calling methods, because their signature in the data is difficult to distinguish from the noise introduced by Next Generation Sequencing (NGS), and this challenge increases with sequencing depth.

Many methods have been developed for calling somatic mutations from NGS data. The earliest widely used somatic variant callers developed specifically for tumors, MuTect1 [10] and Varscan2 [11], used a combination of heuristic filtering and a model of sequencing errors to identify and score potential variants and set a threshold score designed to balance sensitivity and specificity. Subsequent research gave rise to a number of alternate strategies, including haplotype-based calling [12], joint genotype analysis (SomaticSniper [13], JointSNVMix2 [14], Seurat [15], CaVEMan [16], and MuClone [17]), allele frequency-based analysis (Strelka [18], LoFreq [19], EBCall [20], deepSNV [21], LoLoPicker [22], and MuSE [23]), and ensemble and deep learning methods (MutationSeq [5], BAYSIC [24], SomaticSeq [25], and SNooPer [26]). These methods vary in their complexity and specific focus. But they all implicitly or explicitly assume that the rate of mutation is uniform across the genome.

The mutational processes that generate single nucleotide variants in tumors do not act uniformly across the genome. In fact, even the processes of spontaneous mutation that are active in all somatic tissues depend sensitively on local nucleotide context [27, 28, 29]. Additional mutational processes are active in tumors, due to mutagen exposure or defects in DNA maintenance and repair, and these processes are also sensitive to local nucleotide context [30, 31, 32, 33, 34]. The specific mutational processes active in a particular tumor generate its unique mutation profile, and differences within and between tumor types are pronounced [35, 36, 37, 38, 39]. For example, the mutation profiles differ substantially among the three breast tumors illustrated in Figure 1B-D.

Here we present an enhanced variant-calling algorithm that uses the biology of each individual tumor’s mutation profile to improve identification of low allelic frequency mutations. Our BATCAVE algorithm first estimates the tumor’s mutation profile and mutation rate using high-confidence variants and then uses them as a prior when calling other variants. Our R implementation of the algorithm, batcaver, takes output from the MuTect variant caller as input and returns the posterior probability that a site is variant for every site observed by MuTect. Using both simulated and real data, we show that the addition of a mutation profile prior to MuTect produces a superior variant caller. Our algorithm is simple and computationally inexpensive, and it can be integrated into numerous other variant callers. Broad adoption of our approach will enable more confident study of low allelic frequency mutations in tumors in both research and clinical settings.

## MATERIALS AND METHODS

### Somatic variant calling probability model

At every site in the genome with non-zero coverage, Next Generation Sequencing produces a vector **x** = ({*b*_{i}}, {*q*_{i}}), *i* = 1 … *d*, of base calls *b* and their associated quality scores *q*, where *d* is the local read depth. Variant callers use the data **x** to choose between competing hypotheses:

$$
\begin{aligned}
H_0 &: \text{alternate allele} = m,\ \nu = 0 \\
H_1 &: \text{alternate allele} = m,\ \nu = \hat{\nu}
\end{aligned}
\tag{1}
$$

Here *m* is any of the 3 possible alternate non-reference bases and *ν* is the variant allele frequency. The maximum likelihood estimate of *ν* is simply $\hat{\nu}$, the number of variant reads divided by the local read depth. The posterior probability of a given hypothesis, P(*m, ν*), is proportional to the product of the likelihood of the data given that hypothesis and the prior probability of that hypothesis. Assuming that reads are independent, this is

$$
P(m, \nu) \propto p(m, \nu) \prod_{i=1}^{d} f_{m,\nu}(x_i) \tag{3}
$$

where f_{m,ν} (*x*_{i}) is the probability model for reads, and *p*(*m, ν*) is the prior.

Assuming that the identity of the alternate allele and its allele frequency are independent and that *ν* is uniformly distributed, Eq. 3 becomes

$$
P(m, \nu) \propto p(m) \prod_{i=1}^{d} f_{m,\nu}(x_i) \tag{4}
$$

The focus of BATCAVE is to provide a tumor- and site-specific estimate of the prior probability of mutation p(*m*).
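As a rough illustration of how the likelihood term for the two hypotheses can be evaluated at one site, the following Python sketch computes a log10 likelihood ratio from a pileup of base calls and quality scores. The per-read model in `read_prob` follows the form published for MuTect and is an assumption here, since this section does not spell out f; the function names and the toy pileup are hypothetical.

```python
import math

def read_prob(base, err, ref, alt, nu):
    """Probability of one base call given error rate err and allele frequency nu.

    Assumed MuTect-style read model; not specified in this section.
    """
    if base == ref:
        return nu * err / 3.0 + (1.0 - nu) * (1.0 - err)
    if base == alt:
        return nu * (1.0 - err) + (1.0 - nu) * err / 3.0
    return err / 3.0

def log10_likelihood(bases, quals, ref, alt, nu):
    """Sum of per-read log10 probabilities, assuming independent reads."""
    total = 0.0
    for base, qual in zip(bases, quals):
        err = 10.0 ** (-qual / 10.0)  # Phred score to error probability
        total += math.log10(read_prob(base, err, ref, alt, nu))
    return total

def log10_likelihood_ratio(bases, quals, ref, alt):
    """Return (log10 likelihood ratio of nu = nu_hat versus nu = 0, nu_hat)."""
    nu_hat = sum(1 for base in bases if base == alt) / len(bases)
    ratio = (log10_likelihood(bases, quals, ref, alt, nu_hat)
             - log10_likelihood(bases, quals, ref, alt, 0.0))
    return ratio, nu_hat

# Hypothetical pileup: 95 reference reads and 5 alternate reads at Q30.
bases = "C" * 95 + "T" * 5
quals = [30] * 100
lod, nu_hat = log10_likelihood_ratio(bases, quals, ref="C", alt="T")
print(nu_hat, lod > 0)
```

The ratio depends only on the reads, so any site-specific prior p(*m*) enters afterwards as a separate multiplicative term.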

### Site-specific prior probability of mutation

The probability that we have denoted p(*m*) in Eq. 4 is more precisely the joint probability that a mutation has occurred (*M*) and that it was to allele *m*, which we denote p(*m, M*). But p(*m, M*) is not uniform across the genome. Rather, it depends on the local genomic context *C*, so its full form is p(*m, M*|*C*) [42]. Assuming that *m* and *M* are independent conditional on the genomic context, p(*m, M | C*) = p(*m | C*)p(*M | C*), which we can further decompose using Bayes’ theorem as

$$
p(m, M \mid C) = p(m \mid C)\, \frac{p(C \mid M)\, p(M)}{p(C)} \tag{5}
$$

We next show how to estimate the quantities in Eq. 5.

### Estimation of the mutation profile

Many aspects of genomic architecture can affect the somatic mutation rate at multiple scales [42]. Here we focus on a small-scale feature, the trinucleotide context, which is known to strongly affect the prior probability of single-nucleotide mutation [27, 28, 29]. The trinucleotide context of a genomic site consists of the identity of the reference base and the 3’ and 5’ flanking bases. Folding the central base to the pyrimidines, there are two possible bases at the focal site, and there are four possible bases 3’ and 5’ of the focal site, yielding 2 ⋅ 4 ⋅ 4 = 32 possible trinucleotide contexts *C*. At the focal site, a mutation *m* can be to any of three alternate alleles. Indexing by the *c* = {1 … 32} contexts and by the *m* = {1 … 3} alternate bases, we have 96 possible substitution types *S*_{m,c}. Eq. 5 is then

$$
p(S_{m,c}) = p(m \mid C = c)\, \frac{p(C = c \mid M)\, p(M)}{p(C = c)} \tag{6}
$$

The first two terms on the right-hand side can be estimated from the observed mutation profile (Fig. 1).
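The counting argument for the 96 substitution types can be checked with a short Python sketch. It enumerates the pyrimidine-centred contexts and shows how a call on the purine strand is folded onto its reverse complement; the function names are hypothetical and this is not the `batcaver` code.

```python
from itertools import product

BASES = "ACGT"
PYRIMIDINES = "CT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def substitution_types():
    """Return all (context, alt) pairs whose central base is a pyrimidine."""
    types = []
    for five, ref, three in product(BASES, PYRIMIDINES, BASES):
        context = five + ref + three
        for alt in BASES:
            if alt != ref:
                types.append((context, alt))
    return types

def fold(context, alt):
    """Map a context and alternate base onto the pyrimidine strand."""
    if context[1] in PYRIMIDINES:
        return context, alt
    # Reverse-complement so that the central base becomes C or T.
    return context.translate(COMPLEMENT)[::-1], alt.translate(COMPLEMENT)

types = substitution_types()
print(len({context for context, _ in types}))  # 32
print(len(types))                              # 96
print(fold("TGA", "A"))                        # ('TCA', 'T')
```

Folding means that a G>A change read on one strand and the C>T change on the opposite strand are counted as the same substitution type.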

We model the observed mutation profile *S* as multinomial with parameter **π** = {*π*_{m,c}}. Each element of **π** represents the expected proportion of mutations that are to allele *m* and in context *c*. In a tumor with many high-confidence observed mutations, **π** could be estimated directly from the observed mutation profile *S*. But in practice many entries in **π** would then have zero weight. We thus model the distribution of *S* as Dirichlet-multinomial with pseudo-count hyper-parameter **α**. In BATCAVE we use the symmetric non-informative hyper-parameter **α** = **1**, so a priori mutation is equally likely to any allele and in any context.

To estimate **π**, we identify a subset of high confidence variants, based on an initial calculation of their likelihood given the data. These are variants for which the evidence in the read data overwhelms any reasonable value of the site-specific prior probability of mutation. Let D be the set of high confidence variant calls, which we define as those having posterior odds greater than 10 to 1 without the site-specific prior, and *s* ∈ D be the substitution type of each mutation in D. The posterior distribution of **π** is then p(**π** | D) ∼ Dirichlet(**α′**), where

$$
\alpha'_{m,c} = \alpha_{m,c} + \sum_{s \in D} I(s = S_{m,c})
$$

and *I* is the indicator function. Returning to Eq. 6, given that a mutation has occurred, the posterior probability it occurred in context *c* is

$$
p(C = c \mid M) = \frac{\sum_{m=1}^{3} \alpha'_{m,c}}{\sum_{c'=1}^{32} \sum_{m=1}^{3} \alpha'_{m,c'}}
$$

The posterior probability of mutation to allele *m* given that a mutation has occurred in context *C* = *c* is then

$$
p(m \mid C = c) = \frac{\alpha'_{m,c}}{\sum_{m'=1}^{3} \alpha'_{m',c}}
$$
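The Dirichlet update described above can be sketched in a few lines of Python. The sketch assumes that the two profile terms are taken as posterior means of the Dirichlet distribution with symmetric pseudo-count 1; the function names and the toy set of high-confidence calls are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"

# All 96 substitution types as (trinucleotide context, alternate base).
TYPES = [(five + ref + three, alt)
         for five, ref, three in product(BASES, "CT", BASES)
         for alt in BASES if alt != ref]

def profile_posterior(high_conf_calls, alpha=1.0):
    """Posterior pseudo-counts: prior alpha plus the observed count per type."""
    counts = Counter(high_conf_calls)
    return {s: alpha + counts[s] for s in TYPES}

def p_context_given_mutation(alpha_post, context):
    """Posterior mean probability that a mutation falls in this context."""
    total = sum(alpha_post.values())
    in_context = sum(v for (c, _), v in alpha_post.items() if c == context)
    return in_context / total

def p_allele_given_context(alpha_post, context, alt):
    """Posterior mean probability of this alternate allele within the context."""
    in_context = sum(v for (c, _), v in alpha_post.items() if c == context)
    return alpha_post[(context, alt)] / in_context

# Hypothetical high-confidence set D with ten calls.
calls = [("TCA", "T")] * 6 + [("TCA", "G")] * 2 + [("ACG", "T")] * 2
post = profile_posterior(calls)
print(p_context_given_mutation(post, "TCA"))     # 11/106
print(p_allele_given_context(post, "TCA", "T"))  # 7/11
```

With only ten calls the pseudo-counts still dominate, which is why unobserved substitution types keep a non-zero weight.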

The prior probability of each particular trinucleotide context *p*(*C* = *c*) is computed simply as the proportion of sequenced trinucleotide contexts that have context *c*. The R implementation of BATCAVE ships with pre-computed tables for both human whole exomes and whole genomes.
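For readers who want to see what such a table contains, the following Python sketch counts pyrimidine-folded trinucleotide contexts in an arbitrary sequence. It is an illustration only: the function name is hypothetical, and the pre-computed tables shipped with the R package are not reproduced here.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def context_frequencies(sequence):
    """Proportion of each pyrimidine-folded trinucleotide context."""
    counts = Counter()
    seq = sequence.upper()
    for i in range(1, len(seq) - 1):
        tri = seq[i - 1:i + 2]
        if set(tri) - set("ACGT"):
            continue  # skip windows containing N or other ambiguity codes
        if tri[1] in "AG":
            # Fold purine-centred contexts onto the reverse complement.
            tri = tri.translate(COMPLEMENT)[::-1]
        counts[tri] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {context: n / total for context, n in counts.items()}

# Toy sequence: every usable window folds to the context ACG.
freqs = context_frequencies("ACGTNACGT")
print(freqs)  # {'ACG': 1.0}
```

In practice the sequence would be restricted to the regions actually covered by sequencing, so exome and whole-genome tables differ.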

### Estimation of the mutation rate

The final piece of Eq. 6 is *p*(*M*), the prior probability of mutation, which we specify as the per-base per-division mutation rate *µ*. In an exponentially growing and neutrally evolving tumor, branching process calculations [3] show that the expected total number of mutations M_{tot} between two allele frequencies (*f*_{min}, *f*_{max}) is

$$
M_{tot} = \frac{\mu N}{\beta} \left( \frac{1}{f_{min}} - \frac{1}{f_{max}} \right)
$$

The number of bases *N* is 3 ⋅ 10^{9} for a whole genome and 3 ⋅ 10^{7} for a whole exome. The quantity *µ/β* is the effective mutation rate, where *β* is the fraction of cell divisions that lead to two surviving lineages. We make the simplifying assumption that there is no cell death (*β* = 1), so we somewhat over-estimate *µ*. We then estimate *µ* by counting observed high-confidence mutations between allele frequencies *f*_{min} and *f*_{max}. We set *f*_{max} to be the largest allele frequency in D, but we must choose *f*_{min} conservatively, depending on sequencing depth. In the R implementation of BATCAVE, *f*_{min} is a free parameter. For this paper, we set *f*_{min} = 0.05, because we are working at high depth.
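The estimate of *µ* can be sketched as a simple inversion of the expected mutation count. The Python example below assumes that the expected count takes the neutral-growth form (*µN*/*β*)(1/*f*_{min} − 1/*f*_{max}); the function name and the toy allele frequencies are hypothetical.

```python
def estimate_mutation_rate(allele_freqs, n_bases, f_min=0.05, beta=1.0):
    """Estimate the per-base mutation rate from high-confidence allele frequencies.

    Assumes M_tot = (mu * N / beta) * (1 / f_min - 1 / f_max) and solves for mu.
    """
    f_max = max(allele_freqs)
    if f_max <= f_min:
        raise ValueError("f_max must exceed f_min")
    n_observed = sum(1 for f in allele_freqs if f_min <= f <= f_max)
    return beta * n_observed / (n_bases * (1.0 / f_min - 1.0 / f_max))

# Toy exome: 540 calls at or above f_min, plus 100 calls below it that are ignored.
freqs = [0.5] + [0.1] * 539 + [0.01] * 100
mu = estimate_mutation_rate(freqs, n_bases=3e7)
print(mu)  # close to 1e-6
```

Lowering `f_min` admits more calls but also more sequencing errors, which is why the choice depends on depth.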

### Likelihood function

The current implementation of BATCAVE builds on MuTect, because MuTect reports the log ratio of the likelihood functions for the null and alternative hypotheses (Eq. 1) as TLOD (MuTect1) or `t_lod_fstar` (MuTect2). We used MuTect 1.1.7 for all analyses in this paper, so we have

$$
\mathrm{TLOD} = \log_{10} \frac{\prod_{i=1}^{d} f_{m,\hat{\nu}}(x_i)}{\prod_{i=1}^{d} f_{m,0}(x_i)}
$$

The log posterior odds is the log likelihood ratio (TLOD) plus the log prior odds, so the posterior odds in favor of the alternate hypothesis for a given substitution type is

$$
\mathrm{odds}_{\mathrm{BATCAVE}} = 10^{\mathrm{TLOD}} \cdot \frac{p(S_{m,c})}{1 - p(S_{m,c})}
$$

Here p(S_{m,c}) is the prior probability of a substitution of type S_{m,c}, as described in Eq. 6 and specified in Eq. 9-11. When comparing our posterior odds to those of MuTect, we assume a uniform per-base probability of mutation of 3 ⋅ 10^{−6} [10], so

$$
\mathrm{odds}_{\mathrm{MuTect}} = 10^{\mathrm{TLOD}} \cdot \frac{3 \cdot 10^{-6}}{1 - 3 \cdot 10^{-6}}
$$
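The conversion from a log likelihood ratio and a prior to a posterior probability can be sketched as follows. The Python example assumes that TLOD is a base-10 logarithm; the function name, the TLOD value, and the enriched prior are hypothetical.

```python
import math

def posterior_probability(tlod, prior):
    """Posterior probability of the alternate hypothesis.

    Adds the log10 prior odds to the log10 likelihood ratio, then converts
    the resulting log10 odds to a probability.
    """
    log10_odds = tlod + math.log10(prior / (1.0 - prior))
    if log10_odds > 0:
        # Avoid overflow for very strong evidence.
        return 1.0 / (1.0 + 10.0 ** (-log10_odds))
    odds = 10.0 ** log10_odds
    return odds / (1.0 + odds)

tlod = 6.3  # illustrative evidence level for one candidate site
uniform = posterior_probability(tlod, 3e-6)   # uniform per-base prior
enriched = posterior_probability(tlod, 3e-5)  # hypothetical favoured substitution type
print(uniform < enriched)  # True
```

The same read evidence therefore yields a higher posterior for a substitution type that the tumor generates often and a lower one for a type it rarely generates.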

### Implementation

We have implemented the BATCAVE algorithm as an R package `batcaver`. The package leverages the Bioconductor packages BSgenome [43], GenomicAlignments [44], VariantAnnotation [45], and SomaticSignatures [46] for fast and memory-efficient variant annotation and genomic context identification. Reference sequences are specified as BSgenome objects, allowing efficient access to genomic context information.

### Tumor simulations

We used a neutral branching process with no death and *µ* = 3 ⋅ 10^{−6} to simulate realistic distributions of mutation frequencies. Tumors were simulated with three different mutation profiles composed of COSMIC mutation signatures (version 2) [47]. Each simulated profile includes COSMIC signature 1, which is found in nearly all tumors and is associated with spontaneous cytosine deamination. The “Concentrated” profile (Fig. 2A) is an equal combination of COSMIC signatures 1, 7, and 11, which has a large percentage of C *>* T substitutions such as are often seen in cancers caused by UV exposure [48]. The “Intermediate” profile (Fig. 2B) is an equal combination of COSMIC signatures 1, 4, and 5, which has been associated with tobacco carcinogens and is representative of some lung cancers [48]. The “Diffuse” profile (Fig. 2C) is an equal combination of COSMIC signatures 1, 3, and 5, which has been associated with inactivating germline mutations in the BRCA1/2 genes leading to a deficiency in DNA double strand break repair [32]. Simulated variants were sampled from a combination of the Cancer Genome Atlas (TCGA) and Pan-Cancer Analysis of Whole Genomes (PCAWG) databases, which include mutations found in all types of cancer. Whole genome (100X depth) and whole exome (500X depth) reads were simulated from the GRCh38 reference genome using VarSim [49] and aligned with BWA [50], both with default parameters. Variants were inserted to create tumors with BAMSurgeon with default parameters [51] and called with MuTect 1.1.7 [10] with the following parameters:

Variants identified by MuTect are labelled as to whether they pass all filters, fail to pass only the evidence threshold `t_lod_fstar` filter, or fail to pass any other filter. Variants that passed all filters or failed only `t_lod_fstar` were then passed to BATCAVE for prior estimation and rescoring.

### Calibration metric

To quantify the difference in calibration between MuTect and BATCAVE, we used the Integrated Calibration Index [52]. Briefly, a loess-smoothed regression was fit by regressing the binary (True=1, False=0) true variant classification against the reported posterior probability for both MuTect and BATCAVE. For a perfectly calibrated caller, the regression fit would be the diagonal line *y* = *x*. The Integrated Calibration Index is a weighted average of the absolute distance between the calibration curve and the diagonal line of perfect calibration.
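The idea behind the metric can be illustrated with a simplified Python sketch. Note that it swaps the loess smoother for equal-width probability bins, so it is a binned stand-in (close to an expected calibration error) rather than the Integrated Calibration Index itself; the function name and toy data are hypothetical.

```python
def binned_calibration_index(probs, labels, n_bins=10):
    """Mean absolute gap between reported probability and observed truth rate.

    The observed rate is estimated per equal-width bin instead of by loess.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    total_gap = 0.0
    for members in bins:
        if not members:
            continue
        observed = sum(y for _, y in members) / len(members)
        total_gap += sum(abs(p - observed) for p, _ in members)
    return total_gap / len(probs)

# A caller that reports 0.9 for sites that are true only half the time.
overconfident = binned_calibration_index([0.9] * 10, [1, 0] * 5)
# A caller that reports 0.25 for sites that are true a quarter of the time.
calibrated = binned_calibration_index([0.25] * 4, [1, 0, 0, 0])
print(overconfident, calibrated)
```

Because each potential variant contributes one term, regions with many reported probabilities carry more weight, which mirrors the density weighting of the index.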

### Real data

We analyzed two real data sets, one from an acute myeloid leukemia (AML) [40] and one from a multi-region sequencing experiment in breast cancer [4]. We downloaded the normal and primary whole-genome AML tumor bam files from dbGaP accession number phs000159.v8.p4. Griffith et al. generated a platinum set of variant calls for this tumor [40], which we used for our true positive dataset. We downloaded the normal and tumor whole-exome breast cancer bam files from NCBI Sequence Read Archive accession SRP070662. Shi et al. generated a gold set of variant calls for each tumor region sequenced [4], which we used for our true positive dataset. For these multi-region data, we ran BATCAVE separately on each sequenced region and combined results to generate precision-recall curves. We called variants using MuTect 1.1.7 as in our simulations, except that both these data sets were originally aligned to GRCh37, so we used that reference.

## RESULTS

We implemented BATCAVE as a post-call variant evaluation algorithm to be used with MuTect (Versions 1.1.7 or *>*2.0) [10]. BATCAVE extracts the log-likelihood ratio for each potential variant site from the MuTect output, and then it uses that ratio to separate the potential sites into high and low confidence groups. The mutation profile and mutation rate are estimated from the high confidence sites, and the posterior probability of mutation is then recomputed for all sites. The BATCAVE algorithm is inexpensive, processing 22,000 variants per second on a typical desktop computer, which corresponds to roughly 100 seconds to process a 500X exome and 2,000 seconds for a 100X whole genome.

To test the performance of BATCAVE, we generated six different tumor/normal pairs, corresponding to 100X whole genomes and 500X whole exomes for three different mutation profiles. The three mutation profiles were chosen to resemble a melanoma (concentrated), a lung cancer (intermediate), and a BRCA-driven breast cancer (diffuse) (Fig. 2). We also tested BATCAVE using two real cancer data sets, a whole-genome Acute Myeloid Leukemia (AML) [40] and a whole-exome multi-region breast cancer [4]. In both, deep sequencing and variant validation were performed with the specific purpose of evaluating tumor variant calling pipelines. Because our focus is on evaluating the statistical calling model, we computed all test metrics using only those potential variants that passed MuTect’s heuristic filters and entered the statistical model.

### Tests using simulated data

To improve variant identification, the context-dependent prior probability of mutation must converge to an accurate representation of the data generating distribution within the set of high-confidence mutations. When applied to simulated data, the prior converged within a few hundred mutations (Fig. 3). For comparison, in our simulated data sets the number of high-confidence mutations ranged between 1,500 and 5,000, and in the real AML we test on it is over 17,000 [40].

We assessed classification performance using the areas under both the receiver operating characteristic and the precision-recall curves, because the classes are unbalanced (approximately 5 to 1 ratio of false to true variants in our simulated data). By both metrics BATCAVE outperforms MuTect (Fig. 4A&B, Fig. S1A&B, and Table 1). The extent of the performance difference is dependent on both the sequencing depth and the concentration of the mutation profile. Deeper sequencing and more concentrated mutation profiles increase the performance advantage of BATCAVE.

For all simulated tumors, the estimated mutation rate was approximately 3 · 10^{−7} (Table 1), which is lower than the simulated rate of 3 · 10^{−6}. This is likely due to restrictions within BAMSurgeon, such as sequencing depth and quality, that prevent 100% of simulated variants from being inserted into the reads.

We also assessed calibration, the likelihood that a potential variant with a given posterior probability is actually a true variant. We measured overall calibration performance using the Integrated Calibration Index (ICI) [52], which integrates the difference between predicted and observed probabilities, weighted by the density of the predicted probabilities. This metric is particularly useful in our case, because the density of posterior probabilities is bi-modal (Fig. 4C&D and S1C&D). A large fraction of true negative variants have posterior probabilities less than 10^{−4}, far below any meaningful threshold, so we evaluated calibration only on potential variants with posterior probability greater than 0.01. For these potential variants, BATCAVE tends to increase posterior probabilities of low probability but true variants (density curves in Fig. 4C&D and S1C&D) while decreasing probabilities of low probability but false variants. For 500X exomes, the calibration of BATCAVE is better than MuTect across the full spectrum of posterior probabilities (Fig. 4 and Table 1). For 100X whole genomes, the calibration of BATCAVE is slightly worse (Fig. S1 and Table 1), likely because there are few low probability true positive variants in tumors sequenced to 100X depth. As with the other metrics, the advantage of BATCAVE increases with the concentration of the mutation profile and the sequencing depth.

In practice, variant callers are typically used with a threshold score above which a variant is called. The user’s choice of threshold ideally meets their need to balance precision and recall; accurate posterior probability estimates enable an informed choice. For posterior probability thresholds between 60 and 90%, the precision of BATCAVE calls is similar to the chosen threshold (Fig. 5&S2). For this range of thresholds, however, the posterior probabilities from MuTect poorly predict precision (Fig. 5&S2). For any posterior probability threshold above 70%, MuTect has a false positive rate of roughly 8%, whereas BATCAVE has a false positive rate that decreases as the threshold increases. The cost of MuTect’s compressed range of posterior probabilities is recall; at any posterior probability threshold BATCAVE has recall better than MuTect. Consequently, BATCAVE posterior probabilities are more informative than MuTect’s with regard to choosing a calling threshold.
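The threshold comparison described above amounts to computing precision and recall over the potential variants whose posterior probability meets a cutoff. A minimal Python sketch, with a hypothetical function name and toy data, is:

```python
def precision_recall(probs, labels, threshold):
    """Precision and recall when calling every site with prob >= threshold."""
    called = [y for p, y in zip(probs, labels) if p >= threshold]
    true_positives = sum(called)
    total_true = sum(labels)
    precision = true_positives / len(called) if called else float("nan")
    recall = true_positives / total_true if total_true else float("nan")
    return precision, recall

# Toy posterior probabilities with known truth (1 = true variant).
probs = [0.95, 0.90, 0.80, 0.70, 0.60, 0.30]
labels = [1, 1, 1, 0, 1, 0]
print(precision_recall(probs, labels, 0.75))  # (1.0, 0.75)
print(precision_recall(probs, labels, 0.50))  # (0.8, 1.0)
```

For a well-calibrated caller, the precision obtained at a given threshold should be at least as high as the threshold itself, which is what makes the threshold an informative tuning parameter.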

### Tests using real tumor data

We tested BATCAVE using two data sets for which deep sequencing and variant validation were performed with the express purpose of evaluating tumor variant calling pipelines, yielding high quality true and false positive data [40, 4]. However, only variants called by at least one variant caller were validated. As a result, there are no validated true or false negative calls, so we considered only precision-recall comparisons for these data.

Griffith et al. sequenced the whole genome of an acute myeloid leukemia (AML) primary tumor to a depth of *>*360X and used targeted sequencing to validate nearly 200,000 mutations [40]. We estimated a per-base mutation rate for this tumor of 4 · 10^{−8}, which is consistent with previous estimates of AML mutation rates [40, 3]. For both MuTect and BATCAVE, the precision-recall curve is almost perfect for the validated variants (0.995 and 0.996 area under the curve, respectively) (Fig. 4E and Table 1).

Shi et al. performed multi-region whole exome sequencing on six individual breast tumors to a mean target sequencing depth of 160X and validated all variants identified by three different variant calling pipelines [4]. We estimated an average per-base mutation rate for these tumor regions of 4 · 10^{−8}, which is consistent with observed mutation rates for breast cancers [41] and with the low number of validated somatic mutations. For the validated variants, MuTect and BATCAVE yielded almost identical precision-recall curves (Fig. 4F and Table 1).

## DISCUSSION

BATCAVE is an algorithm that leverages the biology of individual tumor mutation profiles to improve identification of low allelic frequency somatic variants. Our implementation is built on MuTect, one of the most widely used somatic variant callers. BATCAVE improves on the classification accuracy of MuTect in synthetic data (Fig. 4A-D, S1, and Table 1) across the entire range of recall and specificity. Moreover, BATCAVE is better calibrated than MuTect at relevant posterior probability thresholds (Fig. 5 and S2), allowing researchers and clinicians to make informed choices about the trade-off between precision and recall. For real data, testing on validated calls shows that BATCAVE does not degrade performance for variants that are relatively easy to identify (Fig. 4E&F and Table 1). The BATCAVE algorithm can thus be included in a wide variety of sequencing pipelines.

We evaluated BATCAVE with simulated tumors with three different mutation profiles and two real tumors. The simulated intermediate and diffuse profiles (Fig. 2B&C) represent baseline profiles of lung and breast tumors, respectively. The concentrated profile (Fig. 2A) represents a tumor driven by a particular mutational process, such as UV exposure. But mutational profiles are highly heterogeneous, so concentrated profiles can be found in any tumor type (e.g., Fig. 1C). The two real data sets we considered are among the few for which extensive validation of variant calls has been performed [40, 4]. They happen, however, to have diffuse mutation profiles (Fig. 1A&B), which reduces the expected advantage of BATCAVE over MuTect (Table 1). A more fundamental challenge of using these real data for testing callers is that only a subset of potential variants are validated. This subset tends to be relatively easy to call, so both MuTect and BATCAVE have almost perfect precision and recall for variants that pass heuristic filters (Fig. 4 and Table 1). Moreover, few true negative sites are validated, so specificity and calibration are impossible to calculate. Deep sequencing experiments that validate random samples of uncalled potential variants would give much-needed insight into the differences among statistical models in variant calling.

The improved calibration of BATCAVE posterior probabilities compared to MuTect provides several advantages. In practice, called variants are often manually reviewed to further reduce false positives [53]. Improved calibration enables users to focus review on the most questionable variants. In the clinic, identified variants act as biomarkers for susceptibility to targeted drugs [54]. Well-calibrated posterior probabilities facilitate the use of probabilistic risk models in the choice of treatment [55], rather than an all or nothing approach. For research purposes, the International Cancer Genome Consortium recommends that catalogs of somatic mutations target a precision of 95% and a recall of 80% [56]. Achieving this goal while minimizing cost demands well-calibrated posterior probabilities.

Our current implementation of BATCAVE is as a post-calling algorithm for MuTect, but the algorithm is broadly applicable. We chose to build BATCAVE off MuTect because MuTect is widely used, has state-of-the-art sensitivity and specificity, and includes numerous heuristic filters and alignment adjustments that reduce the prevalence of sequencing errors in results [10, 40]. But the mutational prior can be incorporated into almost any caller with an underlying probabilistic model. For example, Strelka2 computes a joint posterior probability over tumor and normal genotypes, assuming a constant somatic mutation probability at each genomic site [57]. Replacing that constant probability with a mutational prior would require a more complicated manipulation of the quality scores output by Strelka than for MuTect, but it is conceptually straightforward.

The BATCAVE algorithm is computationally inexpensive; our current implementation adds 1 second per 22,000 variants evaluated to a standard GATK best-practices variant calling pipeline. The majority of the computational cost is associated with extracting the trinucleotide context for each potential variant site from the reference genome. Since most callers are already walking the reference genome during the calling process, extracting the trinucleotide context simultaneously would virtually eliminate the computational cost of implementing a mutational prior.

The BATCAVE algorithm incorporates genomic context into the probabilistic model for variant calling. Our current implementation focuses on trinucleotide context, which is known to have a large effect on local mutation rates [58, 59]. There are, however, many other aspects of genomic context that can affect local mutation rates [42], including replication timing [60], expression level [61], and chromatin organization [62]. Some of these, such as replication timing and chromatin organization, could be incorporated into the BATCAVE mutational prior using the empirical distribution of mutations in the human germline [63]. Others, such as expression level, could be tumor-specific, but would require information not available in the variant calls to compute. In the long run, we believe that incorporating more tumor biology into variant calling models will continue to improve performance.

BATCAVE divides the data into two classes: high- and low-confidence variants. The high-confidence variants are used to estimate the mutational prior and mutation rate, which are then used to improve the calling of low-confidence variants. Statistically, this is an empirical Bayesian approach [64], in which the high and low-confidence variants are treated as parallel experiments [65, 66]. In general, high-confidence variants tend to have relatively high allelic frequencies, and consequently tend to have arisen early in tumor development. An implicit assumption of our approach is that the mutational process does not change between high- and low-confidence variants, implying that the mutational profile of the tumor is temporally constant. Recent studies have found differences in mutational profiles among variants of different allelic frequencies [67], although those differences are relatively small. A potential extension of the BATCAVE algorithm is to process potential variants in order of descending allelic frequency and to update the estimated mutational prior as the algorithm proceeds. This approach might increase sensitivity to low-frequency variants generated by recently-arisen mutational processes, at the cost of potentially increasing sensitivity to patterns of sequencing error.

Our results show that adding a mutational prior substantially improves probabilistic variant calling, particularly for tumors with concentrated profiles. Improved variant calling increases the benefit-to-cost ratio of deep sequencing in both research and clinical applications. Moreover, BATCAVE proves to be a better calibrated caller than vanilla MuTect (Fig. 5). Different users will prefer different tradeoffs in terms of precision and recall, which can be more accurately made with BATCAVE. Our R implementation, batcaver, can be easily incorporated into any MuTect-based pipeline, and the mutational profile algorithm can be incorporated into many other callers.

## Software Availability

The `batcaver` R package can be downloaded or installed from http://github.com/bmannakee/batcaver. The version of `batcaver` used to generate results and all analysis code have been preserved on Zenodo (https://doi.org/10.5281/zenodo.3471715). Python code used to generate simulated tumors has been preserved on Zenodo (https://doi.org/10.5281/zenodo.3471741).

## ACKNOWLEDGEMENTS

This work was supported by the National Science Foundation via Graduate Research Fellowship award number DGE-1143953 to BKM and by the National Institute of General Medical Sciences of the National Institutes of Health under award number R01GM127348 to RNG. We thank Prof. Edward J. Bedrick for fruitful discussions about the statistical model. This material is based upon High Performance Computing (HPC) resources supported by the University of Arizona TRIF, UITS, and RDI and maintained by the UA Research Technologies department.
