## ABSTRACT

Test-negative designs have become commonplace in assessments of seasonal influenza vaccine effectiveness. Vaccine effectiveness is measured from the exposure odds ratio (OR) of vaccination among individuals seeking treatment for acute respiratory illness and receiving a laboratory test for influenza infection. This approach is widely believed to correct for differential healthcare-seeking behavior among vaccinated and unvaccinated persons. However, the relation of the measured OR to true vaccine effectiveness is poorly understood. We derive the OR under circumstances of real-world test-negative studies. The OR recovers the true vaccine direct effect when two conditions are met: (1) that individuals’ vaccination decisions are uncorrelated with exposure or susceptibility to infection, and (2) that vaccination confers “all-or-nothing” protection (whereby certain individuals have no protection while others are perfectly protected). Biased effect size estimates arise if either condition is unmet. Such bias may suggest misleading associations of the OR with time since vaccination or the force of infection of influenza. The test-negative design may also fail to correct for differential healthcare-seeking behavior among vaccinated and unvaccinated persons without stringent criteria for enrollment and testing. Our findings demonstrate a need to reassess how data from test-negative studies are interpreted for policy decisions conventionally based on causal inferences.

Prospective studies randomizing participants to vaccination or placebo are the gold standard for measuring the protective efficacy of vaccines against disease and infection. However, in certain situations it is not possible to conduct such studies to measure protection. Influenza vaccines must be re-formulated each year to allow for antigenic drift in prevalent strains; it is therefore not feasible to conduct large randomized placebo-controlled trials with clinical endpoints on an annual basis (1), nor would it be ethical to use placebos in locations such as the United States where annual influenza vaccination is recommended for all individuals over 6 months old. These circumstances necessitate the use of observational study designs to measure vaccine effectiveness (VE) annually (2,3).

The “test-negative” design, implemented as a modification of the traditional case-control design, has become a popular approach for measuring clinical effectiveness of seasonal influenza vaccines (2). It resembles earlier designs such as the indirect cohort method (4) and the selection of “imitation disease” controls in case-control studies (5). Individuals who experience acute respiratory illness (ARI) and present for care receive a laboratory test for influenza virus infection, and their vaccination history is ascertained. The exposure-odds ratio of vaccination among test-positive and test-negative subjects (OR), in some instances adjusted for potential confounding using stratification or regression, has frequently been used to measure VE (6), where VE = 1 − OR (7). Causal interpretations of resulting estimates have become the basis for major policy decisions, such as the US Advisory Committee on Immunization Practices recommendation that quadrivalent live attenuated influenza vaccine (LAIV) should not be used in the US during the 2016-17 season (1,8,9) (a decision reversed in 2018).

Unlike VE estimates from traditional case-control studies, the test-negative measure is expected to correct for differential treatment-seeking behaviors among vaccinated and unvaccinated persons because only individuals who seek care are included (10). However, recent attention to confounding, misclassification, and selection biases under the test-negative design (3,11–13) has ignited debate about the suitability of test-negative studies as a basis for policymaking. Whereas directed acyclic graphs have been useful in revealing such biases (9,14,15), quantitative implications of these biases for VE estimates remain uncertain (16). The validity of VE estimates from test-negative studies is thus poorly understood.

To resolve this uncertainty, we derived the relation of the test-negative OR to true VE, defined as the vaccine-conferred reduction in susceptibility to influenza infection and/or influenza-caused ARI. We used this mathematical relationship to assess the quantitative impact of potential biases in test-negative studies. We consider a test-negative study of VE against seasonal influenza as a guiding example throughout the text, noting that our findings also have implications for test-negative studies of vaccines against rotavirus (17,18), cholera (19,20), meningococcus (21), pneumococcus (4), and other infections.

## NOTATION

For consistency, we use notation from a previous study (10) where possible; we list all parameters in **Table 1**. Assume that ARI may result from influenza infection (*I*) or other causes (*N*). Susceptible individuals acquire infection at time-constant rates *λ*_{I} and *λ*_{N}; we show later that results hold for seasonal or otherwise time-varying acquisition rates *λ*(*t*). Infections cause ARI with probability *π*_{I} and *π*_{N}, respectively. Out of the entire population *P*, a proportion of individuals (*v*) received vaccine prior to the influenza season. Because individuals who opted for vaccination may differ from others in their likelihood for seeking treatment for ARI, define the probability of seeking treatment for an ARI episode as *μ*_{V} among the vaccinated and *μ*_{U} among the unvaccinated.

Because a single type or subtype of influenza is typically dominant during a season, assume that naturally-acquired immunity protects against re-acquisition of influenza within a single season. The proportion of individuals remaining susceptible to infection at any time *t* is thus exp[–*λ*_{I}*t*]. Assume further that the various non-influenza causes of ARI (*N*) are unlikely to provide immunity against one another, so that the full population remains at risk of *N* throughout; we show later that this assumption does not impact estimates.

Consider two mechanisms by which vaccination protects against infection. Define *φ* as the proportion of individuals responding to vaccine, so that a proportion 1 − *φ* remain unaffected by vaccination. Among the responders, define *θ* as the hazard ratio for infection (measured relative to the hazard rate of infection among non-responders and unvaccinated persons) resulting from vaccine-derived protection (22,23). The special case where *θ*=0 and 0<*φ*<1 corresponds to a situation of “all-or-nothing” protection for responders and non-responders, respectively, while “leaky” protection for all recipients arises under *φ*=1 and 0<*θ*<1 (22–24). The more general circumstances of 0<*φ*<1 and 0<*θ*<1 correspond to an intermediate scenario of “leaky-or-nothing” protection. Perfect protection is attained for *θ*=0 and *φ*=1, and no protection is attained when *φ*=0 (no individuals respond to vaccination) or *θ*=1 (responders receive no protection). The true vaccine direct effect on susceptibility to infection can be measured from the rate ratio of infection given vaccination:

$$\mathrm{VE} = 1 - \frac{(1-\varphi)\lambda_I + \varphi\theta\lambda_I}{\lambda_I} = (1-\theta)\varphi.$$

To highlight design-level features most pertinent to the interpretation of test-negative studies, and in line with typical reporting of vaccine effectiveness estimates, our analysis does not address heterogeneity in vaccine response beyond the consideration of “all-or-nothing” and “leaky-or-nothing” protection, nor do we address impacts of vaccination on infectiousness, as estimates from the test-negative study design do not capture indirect effects. We refer readers to previous studies addressing such issues in the contexts of differing study designs (24–27). Where applicable, we address sources of confounding in test-negative studies that may lead to incorrect inferences of heterogeneity in vaccine effects among individuals or over time.

## PERFORMANCE OF THE ODDS RATIO UNDER VACCINATION UNCONFOUNDED BY EXPOSURE OR SUSCEPTIBILITY TO THE INFECTIONS

Here we consider the case where individuals’ decision-making about whether to receive influenza vaccine is uncorrelated with their *a priori* risk of acquiring influenza and test-negative conditions, and with the probability that these conditions would cause ARI (*π*_{I} and *π*_{N}). To examine the potential for the test-negative design to correct for treatment-seeking biases, we allow for the possibility that vaccine recipients and non-recipients have different probabilities of seeking treatment for ARI (*μ*_{V} and *μ*_{U}), assuming for now that this probability is equal, given vaccination status, regardless of the cause of the ARI. We relax these assumptions in a later section.

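To understand what is measured by the OR in test-negative studies, we derive the rate at which individuals enter into the study as test-positive or test-negative subjects given their vaccination status. The rate of ascertaining test-positive, vaccinated persons is

$$\Lambda_{VI}(t) = \lambda_I\left[(1-\varphi)e^{-\lambda_I t} + \theta\varphi e^{-\theta\lambda_I t}\right]vP\pi_I\mu_V,$$

where the force of infection (*λ*_{I}) is applied upon as-yet uninfected members of the vaccinated population; we further account for the proportion (*π*_{I}*μ*_{V}) of individuals expected to show symptoms and seek treatment. Similarly, the rate of ascertaining test-positive, unvaccinated subjects is

$$\Lambda_{UI}(t) = \lambda_I e^{-\lambda_I t}(1-v)P\pi_I\mu_U.$$

Test-negative vaccinated and unvaccinated persons are ascertained at the rates $\Lambda_{VN}(t) = \lambda_N vP\pi_N\mu_V$ and $\Lambda_{UN}(t) = \lambda_N(1-v)P\pi_N\mu_U$, respectively.

Test-negative studies typically measure the OR of vaccination among the test-positive and test-negative subjects, similar to the exposure OR in case-control studies, using cumulative cases (*C*). For the test-positive outcome,

$$C_{VI}(t) = \int_0^t \Lambda_{VI}(\tau)\,d\tau = \left[(1-\varphi)\left(1-e^{-\lambda_I t}\right) + \varphi\left(1-e^{-\theta\lambda_I t}\right)\right]vP\pi_I\mu_V, \qquad C_{UI}(t) = \int_0^t \Lambda_{UI}(\tau)\,d\tau = \left(1-e^{-\lambda_I t}\right)(1-v)P\pi_I\mu_U.$$

Under the assumption that test-negative infections are not immunizing, cumulative cases are proportional to the incidence rate and study duration:

$$C_{VN}(t) = \lambda_N t\,vP\pi_N\mu_V, \qquad C_{UN}(t) = \lambda_N t\,(1-v)P\pi_N\mu_U.$$

We consider the case of immunizing test-negative outcomes below. Using the vaccine-exposure OR measured from cumulative cases,

$$1 - OR^{C}(t) = 1 - \frac{C_{VI}(t)\,C_{UN}(t)}{C_{UI}(t)\,C_{VN}(t)} = \varphi\left[1 - \frac{1-e^{-\theta\lambda_I t}}{1-e^{-\lambda_I t}}\right]. \tag{1a}$$

The terms *v*, *P*, *π*_{I}, *π*_{N}, *μ*_{V}, *μ*_{U}, and *λ*_{N}*t* cancel in this ratio. Under the special case of “all-or-nothing” protection (*θ*=0), $1 - OR^{C}(t) = \varphi$, equal to the vaccine direct effect against infection. In contrast, under the special case of “leaky” protection for all recipients (*φ*=1),

$$1 - OR^{C}(t) = 1 - \frac{1-e^{-\theta\lambda_I t}}{1-e^{-\lambda_I t}},$$

resulting in a bias toward the null value of 0. This bias is nonexistent near *t*=0 (where $1 - OR^{C}(t) \to 1-\theta$), but grows as *t* increases (with $1 - OR^{C}(t) \to 0$ as $t \to \infty$).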
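The cancellation of the care-seeking terms in the cumulative-case odds ratio can be checked numerically. The sketch below is a minimal illustration rather than the authors' code: it assumes the model described in this section (constant *λ*_{I} and *λ*_{N}, non-immunizing test-negative causes, responders with hazard ratio *θ*), and the function names and parameter values are hypothetical choices made for the example.

```python
import math

def cumulative_cases(t, lam_I, lam_N, theta, phi, v, P, pi_I, pi_N, mu_V, mu_U):
    """Expected cumulative cases by time t, returned as (C_VI, C_UI, C_VN, C_UN)."""
    # Vaccinated non-responders (1 - phi) face hazard lam_I; responders face theta * lam_I.
    c_vi = ((1 - phi) * (1 - math.exp(-lam_I * t))
            + phi * (1 - math.exp(-theta * lam_I * t))) * v * P * pi_I * mu_V
    c_ui = (1 - math.exp(-lam_I * t)) * (1 - v) * P * pi_I * mu_U
    # Non-immunizing test-negative causes accrue in proportion to rate x duration.
    c_vn = lam_N * t * v * P * pi_N * mu_V
    c_un = lam_N * t * (1 - v) * P * pi_N * mu_U
    return c_vi, c_ui, c_vn, c_un

def ve_cumulative(t, **params):
    """Test-negative estimate 1 - OR^C(t) computed from cumulative cases."""
    c_vi, c_ui, c_vn, c_un = cumulative_cases(t, **params)
    return 1 - (c_vi * c_un) / (c_ui * c_vn)

# Illustrative values only (assumptions); mu_V != mu_U mimics differential care seeking.
base = dict(lam_I=0.002, lam_N=0.004, v=0.45, P=1e6,
            pi_I=0.6, pi_N=0.4, mu_V=0.5, mu_U=0.3)
days = (1, 90, 180)
all_or_nothing = [ve_cumulative(t, theta=0.0, phi=0.6, **base) for t in days]
leaky = [ve_cumulative(t, theta=0.4, phi=1.0, **base) for t in days]

print("all-or-nothing:", [round(x, 3) for x in all_or_nothing])
print("leaky:         ", [round(x, 3) for x in leaky])
```

With these illustrative inputs, the all-or-nothing estimate stays at *φ* = 0.6 even though *μ*_{V} and *μ*_{U} differ, whereas the leaky estimate starts near 1 − *θ* = 0.6 and declines as the season progresses.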

Despite the lack of data in test-negative studies on the population (or cumulative person-time) at risk for infection, this result (eq. 1a) is equal to the measure of vaccine effectiveness obtained from the cumulative-incidence ratio in randomized controlled trials. While methods have previously been proposed to recover the vaccine effect on susceptibility through uses of population-at-risk or person-time-at-risk data (22,28), we note that the absence of such measures presents a unique obstacle to bias correction in the context of test-negative studies.

Test-negative studies may also measure time-specific ORs, for instance by stratifying analyses into sub-seasonal intervals (29–32) or by interacting vaccination and time in logistic regression models fitted to individual-level data (33,34). As the time increment approaches zero, the terms included in the OR approach the ascertainment rates of test-positive and test-negative subjects. We therefore define this measurement as *OR*^{Λ}(*t*), the odds ratio formed from the four ascertainment rates at time *t*, which gives

$$1 - OR^{\Lambda}(t) = \varphi\left[1 - \theta e^{(1-\theta)\lambda_I t}\right], \tag{1b}$$

again reducing to *φ* under “all-or-nothing” protection (*θ*=0) but allowing bias to persist under “leaky” protection for all recipients (*φ*=1):

$$1 - OR^{\Lambda}(t) = 1 - \theta e^{(1-\theta)\lambda_I t}.$$

Here bias is again nonexistent at *t*=0 and worsens as *t* → ∞. Intuitively, the bias described here arises due to differential depletion of vaccinated and unvaccinated susceptible individuals, consistent with other study designs (22,35,36). Presuming the vaccine is efficacious, more unvaccinated than vaccinated individuals will have been depleted later in the epidemic, confounding instantaneous comparisons of rates in the measurement of “leaky” vaccine effectiveness. We illustrate functional forms of 1 − *OR*^{C} and 1 − *OR*^{Λ} under scenarios of “leaky” and “leaky-or-nothing” protection in **Figure 1** and **Figure 2**, respectively.
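The depletion-of-susceptibles argument can be traced by building the odds ratio directly from instantaneous ascertainment rates. The following is a hedged sketch under the assumptions of this section (constant rates, vaccination unrelated to baseline risk); `ascertainment_rates`, `ve_rate`, and all default parameter values are hypothetical names and numbers introduced for illustration.

```python
import math

def ascertainment_rates(t, lam_I, lam_N, theta, phi,
                        v=0.45, P=1e6, pi_I=0.6, pi_N=0.4, mu_V=0.5, mu_U=0.3):
    """Instantaneous ascertainment rates at time t, returned as (VI, UI, VN, UN)."""
    # Hazard-weighted share of vaccinated persons still uninfected at time t.
    at_risk_v = ((1 - phi) * math.exp(-lam_I * t)
                 + theta * phi * math.exp(-theta * lam_I * t))
    r_vi = lam_I * at_risk_v * v * P * pi_I * mu_V
    r_ui = lam_I * math.exp(-lam_I * t) * (1 - v) * P * pi_I * mu_U
    r_vn = lam_N * v * P * pi_N * mu_V
    r_un = lam_N * (1 - v) * P * pi_N * mu_U
    return r_vi, r_ui, r_vn, r_un

def ve_rate(t, lam_I, lam_N, theta, phi):
    """Time-specific estimate 1 - OR^Lambda(t) from the four rates."""
    r_vi, r_ui, r_vn, r_un = ascertainment_rates(t, lam_I, lam_N, theta, phi)
    return 1 - (r_vi * r_un) / (r_ui * r_vn)

def ve_rate_closed_form(t, lam_I, theta, phi):
    """Closed form phi * (1 - theta * exp((1 - theta) * lam_I * t))."""
    return phi * (1 - theta * math.exp((1 - theta) * lam_I * t))

# Leaky vaccine with true effect 1 - theta = 0.6 (illustrative values).
leaky_curve = {t: ve_rate(t, lam_I=0.01, lam_N=0.004, theta=0.4, phi=1.0)
               for t in (0, 60, 120, 200)}
for t, ve in leaky_curve.items():
    print(t, round(ve, 3))
```

In this example the estimate equals the true effect only at *t* = 0, then falls steadily and eventually turns negative, which is the behaviour the time-specific odds ratio is expected to show for a leaky vaccine.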

To aid interpretation in the context of previous studies (3,11), we also illustrate the modeled causal process using a directed acyclic graph (**Figure 3**), revealing that the special case of “all-or-nothing” protection precludes bias from vaccine-derived protection against influenza infections occurring before the ARI episode for which an individual seeks care.

Conditions for sign bias

In some applications, testing for a protective or harmful effect of the vaccine may take priority over obtaining precise measurements of the effect size. The conclusions of such hypothesis tests rest on an assumption that the OR is not subject to sign bias, which occurs when *OR* > 1 for an effective vaccine (as defined by the condition *θ* < 1) or *OR* < 1 for a harmful vaccine (for which *θ* > 1). Here we identify circumstances under which estimates may be subject to sign bias.

### Measuring the odds ratio from cumulative cases

Consider the condition *OR*^{C}(*t*) > 1, that is,

$$\frac{(1-\varphi)\left(1-e^{-\lambda_I t}\right) + \varphi\left(1-e^{-\theta\lambda_I t}\right)}{1-e^{-\lambda_I t}} > 1,$$

which implies $1-e^{-\theta\lambda_I t} > 1-e^{-\lambda_I t}$ and thus *θ* > 1. Sign bias would require *θ* < 1 at the same time, so the necessary conditions for sign bias cannot be met. In the converse situation, *OR*^{C} < 1 implies *θ* < 1, which again cannot be true when *θ* > 1. Thus, sign bias does not occur in measurements based on cumulative cases, provided vaccination is uncorrelated with exposure or susceptibility to the infections. Because such confounding is likely in real-world studies, we assess resulting biases in a later section.

### Measuring the odds ratio from ascertainment rates

**Figure 1** and **Figure 2** illustrate that sign bias may affect measurements based on the ascertainment rates of test-positive and test-negative subjects. For *θ* < 1, the condition

$$OR^{\Lambda}(t) = (1-\varphi) + \theta\varphi e^{(1-\theta)\lambda_I t} > 1$$

implies $\theta e^{(1-\theta)\lambda_I t} > 1$, so that sign bias arises when

$$t > \frac{-\ln\theta}{(1-\theta)\lambda_I}.$$

Conversely, for *θ* > 1, sign bias (indicated by *OR*^{Λ} < 1) arises under

$$t > \frac{\ln\theta}{(\theta-1)\lambda_I}.$$

These circumstances demonstrate the need for caution in interpreting time-specific (continuous or sub-seasonal) VE measurements (29–32).
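The time at which the rate-based odds ratio crosses 1 can be computed from *θ* and *λ*_{I} alone. The sketch below assumes the unconfounded model of this section, in which the rate-based odds ratio equals (1 − *φ*) + *θφ* exp[(1 − *θ*)*λ*_{I}*t*]; the function names and numerical values are hypothetical and serve only as an illustration.

```python
import math

def or_rate(t, lam_I, theta, phi):
    """Rate-based odds ratio: (1 - phi) + theta * phi * exp((1 - theta) * lam_I * t)."""
    return (1 - phi) + theta * phi * math.exp((1 - theta) * lam_I * t)

def sign_bias_onset(lam_I, theta):
    """Time after which the rate-based OR falls on the wrong side of 1.

    Valid for theta > 0 and theta != 1; the same expression covers both
    protective (theta < 1) and harmful (theta > 1) vaccines.
    """
    return math.log(theta) / ((theta - 1) * lam_I)

lam_I = 0.01  # illustrative daily force of infection (assumption)
t_protective = sign_bias_onset(lam_I, theta=0.4)  # vaccine that truly protects
t_harmful = sign_bias_onset(lam_I, theta=2.0)     # vaccine that truly harms

print(round(t_protective, 1), round(t_harmful, 1))
```

Note that the onset time does not depend on *φ*: the proportion of responders scales the size of the bias but not the moment at which the sign flips.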

Extensions maintaining the assumption of vaccination unconfounded by exposure or susceptibility to the infections

### Time-varying infection rates

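Consider that transmission intensity varies over time, so that acquisition rates are *λ*_{I}(*τ*_{j}) at time *τ*_{j}. Indexing by day, the probability of evading infection to time *t* is $\exp\left[-\sum_{\tau_j \le t}\lambda_I(\tau_j)\right]$, which can be substituted for exp[–*λ*_{I}*t*] in eqs. 1a and 1b so that

$$1 - OR^{C}(t) = \varphi\left[1 - \frac{1-\exp\left[-\theta\sum_{\tau_j \le t}\lambda_I(\tau_j)\right]}{1-\exp\left[-\sum_{\tau_j \le t}\lambda_I(\tau_j)\right]}\right]$$

and

$$1 - OR^{\Lambda}(t) = \varphi\left[1 - \theta\exp\left[(1-\theta)\sum_{\tau_j \le t}\lambda_I(\tau_j)\right]\right],$$

resembling the expression for time-invariant *λ*_{I} and retaining the relevant biases.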
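Because only the accumulated hazard enters these expressions, an epidemic curve of any shape gives the same estimates as a flat curve with the same total. The sketch below illustrates this under the assumptions of this section; the bell-shaped daily hazard and the function name `ve_from_daily_hazards` are hypothetical choices for the example.

```python
import math

def ve_from_daily_hazards(daily_lam_I, theta, phi):
    """Return (1 - OR^C, 1 - OR^Lambda) given daily influenza hazards up to time t."""
    cum_hazard = sum(daily_lam_I)  # replaces lam_I * t from the constant-rate case
    ve_cumulative = phi * (1 - (1 - math.exp(-theta * cum_hazard))
                           / (1 - math.exp(-cum_hazard)))
    ve_rate = phi * (1 - theta * math.exp((1 - theta) * cum_hazard))
    return ve_cumulative, ve_rate

# Hypothetical epidemic curve: bell-shaped daily hazard peaking at day 60.
seasonal = [0.004 * math.exp(-((day - 60) / 20) ** 2) for day in range(120)]
# Flat curve carrying the same total hazard over the same 120 days.
flat = [sum(seasonal) / 120] * 120

ve_seasonal = ve_from_daily_hazards(seasonal, theta=0.4, phi=1.0)
ve_flat = ve_from_daily_hazards(flat, theta=0.4, phi=1.0)
ve_early = ve_from_daily_hazards(seasonal[:30], theta=0.4, phi=1.0)

print([round(x, 3) for x in ve_seasonal], [round(x, 3) for x in ve_early])
```

The early-season estimates sit closer to the true leaky effect (1 − *θ* = 0.6) than the end-of-season estimates, since less hazard has accumulated by day 30.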

### Immunizing test-negative conditions

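Under an assumption of immunity to the test-negative condition(s), the ascertainment rates of test-negative vaccinated and unvaccinated persons become

$$\Lambda_{VN}(t) = \lambda_N e^{-\lambda_N t}\,vP\pi_N\mu_V \quad\text{and}\quad \Lambda_{UN}(t) = \lambda_N e^{-\lambda_N t}\,(1-v)P\pi_N\mu_U,$$

and the corresponding cumulative cases become

$$C_{VN}(t) = \left(1-e^{-\lambda_N t}\right)vP\pi_N\mu_V \quad\text{and}\quad C_{UN}(t) = \left(1-e^{-\lambda_N t}\right)(1-v)P\pi_N\mu_U.$$

The terms describing the cumulative proportions of vaccinated and unvaccinated persons infected by the test-negative condition (1 – exp[–*λ*_{N}*t*]), and the proportions of vaccinated and unvaccinated persons remaining susceptible (exp[–*λ*_{N}*t*]), cancel in the expressions for *OR*^{C} and *OR*^{Λ}, respectively, which invoke (1 – exp[–*λ*_{N}*t*])/(1 – exp[–*λ*_{N}*t*]) and exp[–*λ*_{N}*t*]/exp[–*λ*_{N}*t*], respectively. Thus, our original VE derivations apply to the scenario of immunizing test-negative conditions.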
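This cancellation can be confirmed with a short numerical check. The sketch below assumes the unconfounded model of the preceding section and compares cumulative-case estimates when test-negative infections are or are not immunizing; the function name, its `immunizing_negative` flag, and the parameter values are hypothetical.

```python
import math

def ve_cumulative(t, lam_I, lam_N, theta, phi, immunizing_negative,
                  v=0.45, pi_I=0.6, pi_N=0.4, mu_V=0.5, mu_U=0.3):
    """1 - OR from cumulative cases, with or without immunity to test-negative causes."""
    pos_v = ((1 - phi) * (1 - math.exp(-lam_I * t))
             + phi * (1 - math.exp(-theta * lam_I * t))) * v * pi_I * mu_V
    pos_u = (1 - math.exp(-lam_I * t)) * (1 - v) * pi_I * mu_U
    # Time-dependent factor shared by vaccinated and unvaccinated test-negatives:
    # cumulative proportion infected if immunizing, rate x duration otherwise.
    neg_factor = (1 - math.exp(-lam_N * t)) if immunizing_negative else lam_N * t
    neg_v = neg_factor * v * pi_N * mu_V
    neg_u = neg_factor * (1 - v) * pi_N * mu_U
    return 1 - (pos_v * neg_u) / (pos_u * neg_v)

ve_immunizing = ve_cumulative(120, 0.002, 0.004, 0.4, 1.0, immunizing_negative=True)
ve_non_immunizing = ve_cumulative(120, 0.002, 0.004, 0.4, 1.0, immunizing_negative=False)

print(round(ve_immunizing, 4), round(ve_non_immunizing, 4))
```

Both calls return the same value up to floating-point rounding, because the shared test-negative factor appears once in the numerator and once in the denominator.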

## PERFORMANCE OF THE ODDS RATIO UNDER DIFFERENTIAL EXPOSURE OR SUSCEPTIBILITY OF VACCINATED AND UNVACCINATED PERSONS TO THE INFECTIONS

The test-negative design is typically employed in observational studies where individuals have received vaccination voluntarily. In contrast to assumptions in the above section that vaccination is uncorrelated with exposure or susceptibility to infection, variation in vaccine uptake across risk groups is well-recognized (3). For instance, preferential vaccine receipt has been reported among relatively healthy older adults (37,38) and among persons prioritized for vaccination such as healthcare workers (who may have elevated risk of encountering infected persons) and individuals with underlying health conditions (who may be at risk for severe outcomes if infected) (39,40). This circumstance corresponds to the presence of a confounder (“G” in **Figure 3**) related to disease risk as well as vaccination.

In the absence of vaccine-derived protection, define and as the relative rates at which individuals who seek vaccination would be expected to acquire influenza and test-negative conditions, respectively, measured against the rates at which individuals who do not seek vaccination would be expected to acquire these conditions. These relative rates do not consider the biological effect of the vaccine, but only the counterfactual associated with “vaccine-seeking” status.

Accounting further for vaccine-induced protection, the ascertainment rates of test-positive and test-negative subjects are , resulting in cumulative case measures .

Estimating VE from cumulative cases, , whereas the estimate based on ascertainment rates is

. These estimates reduce to under “all-or-nothing” protection, and under “leaky” protection.

Consider alternatively that and are the relative risks of ARI given influenza and test-negative infections, respectively, for individuals who seek vaccination, measured against the risk among individuals who do not seek vaccination; we again distinguish that these differences owe to factors other than vaccine-derived protection (24), and consider vaccine protection against disease progression in a subsequent section. Incorporating *π*_{l} and *π*_{N} into the ORs formulated above to allow such heterogeneity,
and

. Under “all-or-nothing” protection,
, which reduces to *φ* if differences between vaccinated and unvaccinated persons equally affect progression of influenza and test-negative conditions to symptoms, i.e. . For a vaccine conferring “leaky” protection to all recipients,
, reducing when to the bias present when vaccine-seeking is uncorrelated with exposure or susceptibility to infection (eqs. 1a and 1b).

Incorporating heterogeneity in both acquisition and progression, and

. These circumstances underscore that differential vaccine uptake among persons at high and low risk for infection or for symptoms given infection—a well-known phenomenon in observational studies of vaccines and other health interventions—may undermine causal interpretations of the OR in test-negative studies.
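A numerical sketch can show how such confounding distorts the cumulative-case estimate. The code below is an illustration under stated assumptions, not the authors' formulation: the multipliers `alpha_I` and `alpha_N` (relative acquisition rates among vaccine seekers) and `beta_I` and `beta_N` (relative symptom risks among vaccine seekers) are hypothetical names, and they are assumed to scale the vaccinated group's hazards and symptom probabilities multiplicatively.

```python
import math

def ve_cumulative_confounded(t, lam_I, theta, phi,
                             alpha_I=1.0, alpha_N=1.0, beta_I=1.0, beta_N=1.0):
    """1 - OR from cumulative cases when vaccine seekers differ in baseline risk.

    alpha_* and beta_* are hypothetical multipliers (assumptions for this sketch).
    The terms v, P, pi, mu and lam_N * t cancel and are therefore omitted.
    """
    hazard_v = alpha_I * lam_I  # baseline influenza hazard among vaccine seekers
    cum_v = ((1 - phi) * (1 - math.exp(-hazard_v * t))
             + phi * (1 - math.exp(-theta * hazard_v * t)))
    cum_u = 1 - math.exp(-lam_I * t)
    odds_ratio = (beta_I * cum_v) / (alpha_N * beta_N * cum_u)
    return 1 - odds_ratio

# No confounding, all-or-nothing protection: recovers phi.
unconfounded = ve_cumulative_confounded(90, lam_I=0.002, theta=0.0, phi=0.6)
# Vaccine with no effect (theta = 1), given to people with half the influenza exposure.
spurious = ve_cumulative_confounded(90, lam_I=0.002, theta=1.0, phi=0.5, alpha_I=0.5)

print(round(unconfounded, 3), round(spurious, 3))
```

In the second call the vaccine has no biological effect, yet the estimate is well above zero because vaccine seekers are assumed to be less exposed to influenza but equally exposed to test-negative causes.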

## BIAS ASSOCIATED WITH DIFFERENTIAL TREATMENT SEEKING AMONG THE VACCINATED AND UNVACCINATED

To this point we have considered ARI as a singular clinical entity and assumed all individuals seeking care for ARI are tested for influenza. However, different infections may cause clinically-distinct presentations that influence the likelihood that individuals seek treatment, or the likelihood that clinicians test for influenza (41). Indeed, previous studies have reported variation across settings and over time in the proportion of ARI or influenza-like-illness patients from whom specimens are collected, and the proportion of these specimens that are tested (42). Here we address the possibility for such a scenario to lead to selection bias from conditioning on the collider T (testing), the pathway in **Figure 3**.

Consider that the spectrum of clinical presentations can be discretized into “moderate” (*M*) and “severe” (*S*) classes, occurring with probabilities *π*_{I}^{M}, *π*_{I}^{S}, *π*_{N}^{M}, and *π*_{N}^{S}. Define *μ*_{V}^{M}, *μ*_{V}^{S}, *μ*_{U}^{M}, and *μ*_{U}^{S} as the associated probabilities of seeking care given symptoms and vaccination status, and let *ξ*^{M} and *ξ*^{S} indicate the probabilities of receiving a test given symptoms. Ascertainment rates of subjects are
, so that

. The test-negative VE measures reduce to and

. In both situations, bias associated with differential treatment-seeking persists unless the relative risk of testing given infection (which includes experiencing symptoms, seeking treatment, and being tested) does not differ for influenza and other conditions:

. Expressed more generally, this bias arises unless when accommodating all possible factors that influence whether individuals are tested.

Correction of bias through the use of clinical criteria for enrollment and testing

A possible correction exists when enrollment and testing are tied to stringently-defined clinical criteria, i.e. criteria for which eq. 6 holds. For example, if tests are performed conditioning on cases resembling a well-defined and monotypic “Severe” entity (substituting *ξ*^{M}=0 in eqs. 5a and 5b), the OR retains bias only from differential infection rates and symptom risk between the vaccinated and unvaccinated:
when measured from cumulative incidence, or
when measured from the ascertainment rate (resembling eqs. 4a and 4b). Absent any association of the decision to receive the vaccine with individuals’ exposure or susceptibility to infection and ARI, eqs. 7a and 7b reduce to eqs. 1a and 1b. Thus, bias associated with differential care-seeking or different clinical presentations between influenza and test-negative infections can be eliminated if testing is tied to strict clinical criteria, although previously-identified sources of bias may persist.

## MEASURING VACCINE EFFECTIVENESS AGAINST PROGRESSION

In addition to protection against infection, reductions in symptom risk given infection are of interest in VE measures (24). Define *ρ* as the relative risk for vaccine-protected individuals to experience symptoms given infection owing to vaccine-derived immunity. When decisions to vaccinate are not correlated with exposure or susceptibility to the infections, other than through vaccine-derived immunity,
and
, so that

. Under the special case that a vaccine reduces risk of symptoms without protecting against infection (*θ*=1)—as might apply to oral cholera vaccines (43–45)—these measures reduce to
, an unbiased estimate of VE against progression. Under confounding between vaccination and exposure or susceptibility to the infections,
, reducing to
for a vaccine protecting against symptoms only (*θ*=1).

## IMPLICATIONS

Recent years have seen growing enthusiasm about the integration of data from observational epidemiologic studies in decisions surrounding influenza vaccine policy (46), in part based on belief that vaccine direct effects, which have traditionally been measured in prospective, randomized controlled trials, can be recovered under the test-negative design (6,10,16,23). However, uptake of the test-negative design by researchers and policymakers has preceded thorough examination of its theoretical justification (14). Our analysis highlights limitations to interpreting VE estimates based on the exposure OR from test-negative studies.

Our most troubling finding is that the OR measured by test-negative studies is unsuited to estimating the vaccine direct effect on susceptibility to infection even under circumstances consistent with randomized vaccine allocation, unless protection is known to follow an “all-or-nothing” mechanism of action. These results echo longstanding concerns about measurement of the effectiveness of “leaky” vaccines in case-control studies (22,47–49) as well as clinical trials (35,36). Researchers rarely know *a priori* to what extent a vaccine confers “leaky” or “all-or-nothing” protection, making it difficult to know under what circumstances studies may be subject to the resulting bias.

We also show that certain traditionally-recognized sources of confounding in observational studies—arising due to differential exposure or susceptibility to infection and symptoms among vaccinated and unvaccinated persons—persist under the test-negative design. Because resulting biases may lead to time-varying estimates of vaccine effectiveness, inferences of waning vaccine protection in test-negative studies may be suspect when predicated on an assumption of no change in the OR over time (30–34). Last, whereas the test-negative design has traditionally been viewed as a strategy to eliminate treatment-seeking bias, we find that bias may persist under differential symptom severity for influenza and test-negative infections. These concerns have not been addressed at length in previous studies supporting use of the test-negative design (6,10,16).

Several assessments of test-negative studies based on DAGs (3,11) have pointed to similar sources of confounding, and the practical importance of these findings has been debated amid uncertainty about the magnitude of associated bias in estimates (16). Although DAGs can identify design factors and statistical adjustments for bias, they do not convey the magnitude or direction of such biases in practice. The framework we have taken provides a basis for quantifying bias directly. We identify not only that the OR of test-negative studies can supply VE estimates that are not equal to the causal vaccine effect on susceptibility, but also that sign bias may arise such that the OR leads to incorrect inferences about whether a vaccine is effective or not. This is contrary to the frequent assumption that the OR provides, at minimum, a valid and direction-unbiased test of the null hypothesis of no causal effect (11).

Other approaches have been taken to assess bias in test-negative studies. In informal comparisons, vaccine effectiveness estimates from test-negative studies of live oral rotavirus vaccines and oral cholera vaccines have appeared similar to vaccine efficacy estimates from randomized controlled trials in the same settings (50,51). While these findings may suggest the quantitative sources of bias we identify are not always large in practice, our study and others (35,36) have pointed to potential sources of bias that may also affect estimates of the vaccine direct effect in randomized controlled trials. Moreover, seasonal influenza vaccine trials are not conducted on a year-to-year basis amid alterations to the strain composition of vaccines and changes to the immune profile of hosts. This has led to difficulty accounting for instances where conclusions of randomized controlled trials and test-negative studies have appeared to be in conflict. For instance, LAIV effectiveness has appeared poor in test-negative studies undertaken since the emergence in 2009 of a novel H1N1 influenza A virus (52,53), despite superior efficacy of LAIV over inactivated influenza vaccine among children in earlier randomized controlled trials (54–56). While this led to recommendations against LAIV use during the 2016–2017 season in the United States, debate continues to surround the contribution of strain mismatch, confounding host factors, and study design to the apparent underperformance of LAIV in recent test-negative studies (15,57,58).

Our analysis identifies limitations to the validity of VE estimates based on the vaccine-exposure OR under the test-negative design, formalizing certain concerns about the suitability of test-negative designs as a basis for decision-making (13,14,16,46). Importantly, our analysis suggests that specific improvements can be made to strengthen the evidence base contributed by test-negative studies. We have shown that the use of strict clinical criteria or case definitions for enrollment and testing can ensure bias does not arise due to differential healthcare-seeking behavior among vaccinated and unvaccinated persons. Whereas test-negative studies typically stratify estimates according to influenza type/subtype or even the genetic clade, our findings suggest bias may persist if there are meaningful epidemiologic differences in risk factors for infection and disease among vaccinated and unvaccinated persons. This bias can be reduced by stratifying estimates to minimize within-stratum differences in exposure or susceptibility to infection among vaccinated and unvaccinated persons. While we point out the inability of test-negative studies to measure “leaky” or “leaky-or-nothing” protection accurately, the persistence of such bias in randomized controlled trials echoes a broader need to consider epidemiological approaches for the measurement of imperfect forms of immunity (22,47–49). Evidence from test-negative studies should be interpreted with the limitations we report here in mind, in particular for vaccination policymaking traditionally premised on causal inferences.

## Footnotes

**Source of funding:** This work was supported by the National Institutes of Health/National Institute of General Medical Sciences (grant U54GM088558 to ML).

**Conflicts of interest:** The authors declare no competing interests.

## Abbreviations

- ARI: Acute respiratory illness
- DAG: Directed acyclic graph
- LAIV: Live attenuated influenza vaccine
- OR: Odds ratio
- VE: Vaccine effectiveness (used interchangeably with vaccine efficacy in this context as the estimand of both observational and randomized studies, and defined as the causal effect of the vaccine on susceptibility of individuals to infection and/or disease)