## Abstract

The test-negative design has become the standard approach for assessing real-world performance of vaccines against influenza, with increasing applications in studies of other infectious disease interventions. Vaccine effectiveness is measured from the exposure odds ratio (OR) of vaccination among individuals seeking treatment for acute respiratory illness who receive a laboratory test for influenza infection. This approach is argued to provide a natural correction for differential healthcare-seeking behavior among vaccinated and unvaccinated persons. However, the relation of the measured OR to true vaccine effectiveness is not well established. We derive the OR under circumstances consistent with real-world test-negative studies. The OR recovers the true vaccine direct effect when two circumstances are met: (1) that vaccination is uncorrelated with exposure or susceptibility to infection, and (2) that vaccination confers “all-or-nothing” protection (whereby certain individuals have no protection while others are perfectly protected). Biased effect size estimates, potentially including sign bias, arise if either condition is unmet. Such bias may generate time-varying OR estimates suggestive of vaccine waning in the absence of true time-varying protection. Troublingly, the test-negative design may fail to correct for differential healthcare-seeking behavior among vaccinated and unvaccinated persons unless stringent clinical criteria are upheld for enrollment and testing. Our findings demonstrate a need to reassess how data from test-negative studies are interpreted for policy decisions conventionally based on causal inferences.

## BACKGROUND

Prospective studies randomizing participants to vaccination or placebo are the gold standard for measuring the protective efficacy of vaccines against disease and infection. However, in certain situations it is not feasible to conduct such studies to measure protection. For instance, influenza vaccines must be re-formulated each year to allow for antigenic drift in prevalent strains; it is therefore not feasible to conduct large randomized placebo-controlled trials with clinical endpoints on an annual basis^{1}, nor would it be ethical to use placebos in locations such as the United States where annual influenza vaccination is recommended for all individuals over 6 months old. This places a premium on annual estimates of vaccine effectiveness (VE) obtained from observational study designs.^{2,3}

The “test-negative” design—implemented as a modification of the traditional case-control design—has become a popular approach for measuring clinical effectiveness of seasonal influenza vaccines.^{2} Individuals who experience acute respiratory illness (ARI) and present for care receive a laboratory test for influenza virus infection, and their vaccination history is ascertained. The exposure-odds ratio of vaccination among test-positive and test-negative subjects (OR), in some instances adjusted for potential confounding using stratification or regression, has frequently been used to measure vaccine effectiveness (VE),^{4} where VE = 1 – *OR*.^{5} Causal interpretations of resulting estimates have become the basis for major policy decisions, such as the recent US Advisory Committee on Immunization Practices recommendation that quadrivalent live attenuated influenza vaccine (LAIV) should not be used in the US during the 2016-17 season.^{1,6,7}

Unlike VE estimates from traditional case-control studies, the test-negative measure is expected to correct for differential treatment-seeking behaviors among vaccinated and unvaccinated persons because only individuals who seek care are included.^{8} However, recent attention to confounding, misclassification, and selection biases under the test-negative design^{3,9} has ignited debate about the suitability of test-negative studies as a basis for policymaking. Whereas causal inference based on directed acyclic graphs has been useful in revealing such biases,^{7,10–12} the quantitative implications of these biases for VE estimates remain uncertain. The validity of VE estimates from test-negative studies is thus poorly understood.

To resolve this uncertainty, we aimed to derive the relation of the test-negative OR to true VE. We used this mathematical relationship to assess the quantitative impact of potential biases in test-negative studies. We consider a test-negative study of VE against seasonal influenza as a guiding example throughout the text, noting that our findings also have implications for test-negative studies of vaccines against rotavirus,^{13,14} cholera,^{15,16} meningococcus, pneumococcus, and other infections.

## NOTATION

For consistency, we use notation from a previous study^{8} where possible. Assume that ARI may result from influenza infection (*I*) or other causes (*N*). Susceptible individuals acquire infection at time-constant rates *λ_I* and *λ_N*; we show later that results hold for seasonal or otherwise time-varying acquisition rates *λ*(*t*). Infections cause ARI with probability *π_I* and *π_N*, respectively. Out of the entire population *P*, a proportion of individuals (*v*) received vaccine prior to the influenza season. Because individuals who opted for vaccination may differ from others in their likelihood of seeking treatment for ARI, define the probability of seeking treatment for an ARI episode as *μ_V* among the vaccinated and *μ_U* among the unvaccinated.

Because a single type or subtype of influenza is typically dominant during a season, assume that naturally-acquired immunity protects against re-acquisition of influenza within a single season. The proportion of individuals remaining susceptible to infection at any time *t* is thus *e*^{–λ_I t}. Assume further that the many causes of ARI (*N*) are unlikely to provide immunity against one another, so that the full population remains at risk of *N* throughout; we show later that this assumption does not affect estimates.

Consider two mechanisms by which vaccination protects against infection. Define *φ* as the proportion of individuals responding to vaccine, so that a proportion 1 – *φ* remain unaffected by vaccination. Among the responders, define *θ* as the hazard ratio for infection (measured relative to the rate of infection among non-responders and unvaccinated persons) resulting from vaccine-derived protection.^{19,20} The special case where *θ*=0 and 0<*φ*<1 corresponds to a situation of “all-or-nothing” protection for responders and non-responders, respectively, while “leaky” protection for all recipients arises under *φ*=1 and 0<*θ*<1.^{19–21} The more general circumstances of 0<*φ*<1 and 0<*θ*<1 correspond to an intermediate scenario of “leaky-or-nothing” protection. Perfect protection arises for *θ*=0 and *φ*=1, and no protection arises under any situation where *φ*=0 (no responders) or *θ*=1 (no protection among responders). True VE can be measured from the rate ratio of infection among vaccinated and unvaccinated persons: VE = 1 – [(1 – *φ*) + *φθ*].
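As a quick check on these definitions, the true VE implied by any (*φ*, *θ*) pair can be computed directly. A minimal Python sketch (illustrative only; parameter values are assumptions, not from the analysis):

```python
# True vaccine effectiveness under "leaky-or-nothing" protection:
# a fraction phi of recipients respond, and responders acquire infection
# at hazard ratio theta relative to unprotected persons.

def true_ve(phi: float, theta: float) -> float:
    """VE measured from the rate ratio of infection: 1 - [(1 - phi) + phi*theta]."""
    return 1.0 - ((1.0 - phi) + phi * theta)

print(true_ve(0.5, 0.0))  # "all-or-nothing": VE equals the responder fraction phi
print(true_ve(1.0, 0.6))  # "leaky": VE equals 1 - theta
print(true_ve(0.8, 0.5))  # intermediate "leaky-or-nothing" case
```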

## PERFORMANCE OF THE ODDS RATIO UNDER VACCINATION UNCONFOUNDED BY EXPOSURE OR SUSCEPTIBILITY TO THE INFECTIONS

Here we consider the case where vaccination is uncorrelated with the incidence of influenza and test-negative conditions, and with the probability that these cause ARI (*π_I* and *π_N*). To examine the potential for the test-negative design to correct for treatment-seeking biases, we allow for the possibility that vaccine recipients and non-recipients have different probabilities of seeking treatment for ARI (*μ_V* and *μ_U*), assuming for now that this probability is equal, given vaccination status, regardless of the cause of the ARI. We relax these assumptions in a later section.

The rate of ascertaining test-positive, vaccinated persons is *vPμ_V π_I λ_I*[(1 – *φ*)*e*^{–λ_I t} + *φθe*^{–θλ_I t}], which equals *vPμ_V π_I*(1 – *φ*)*λ_I e*^{–λ_I t} under the special case of “all-or-nothing” protection (*θ*=0 among the responders), and *vPμ_V π_I θλ_I e*^{–θλ_I t} under the special case of “leaky” protection for all recipients (*φ*=1). The rate of ascertaining test-positive, unvaccinated subjects is (1 – *v*)*Pμ_U π_I λ_I e*^{–λ_I t}. Test-negative vaccinated and unvaccinated persons are ascertained at the rates *vPμ_V π_N λ_N* and (1 – *v*)*Pμ_U π_N λ_N*, respectively.

Test-negative studies typically measure the OR of vaccination among the test-positive and test-negative subjects, similar to the exposure OR in case-control studies, using cumulative cases (*C*). For the test-positive outcome, *C_V^I*(*t*) = *vPμ_V π_I*[(1 – *φ*)(1 – *e*^{–λ_I t}) + *φ*(1 – *e*^{–θλ_I t})] and *C_U^I*(*t*) = (1 – *v*)*Pμ_U π_I*(1 – *e*^{–λ_I t}). Under the assumption that test-negative infections are not immunizing, cumulative cases are proportional to the incidence rate and study duration: *C_V^N*(*t*) = *vPμ_V π_N λ_N t* and *C_U^N*(*t*) = (1 – *v*)*Pμ_U π_N λ_N t*. We consider the case of immunizing test-negative outcomes below. Using the vaccine-exposure OR measured from cumulative cases,

*OR^C*(*t*) = [*C_V^I C_U^N*]/[*C_U^I C_V^N*] = [(1 – *φ*)(1 – *e*^{–λ_I t}) + *φ*(1 – *e*^{–θλ_I t})]/(1 – *e*^{–λ_I t}).

Under the special case of “all-or-nothing” protection (*θ*=0), *OR^C* = 1 – *φ*, equal to the vaccine direct effect against infection. In contrast, under the special case of “leaky” protection for all recipients (*φ*=1), *OR^C*(*t*) = (1 – *e*^{–θλ_I t})/(1 – *e*^{–λ_I t}), resulting in a bias toward the null value of 0. This bias is nonexistent near time 0 and grows as study duration increases: *OR^C* → *θ* as *t* → 0, while *OR^C* → 1 as *t* → ∞.

Test-negative studies may also measure time-specific ORs, for instance by stratifying analyses into sub-seasonal intervals^{22–25} or by interacting vaccination and time in logistic regression models fitted to individual-level data.^{26,27} As the time increment approaches zero, the terms included in calculation of the OR approach the ascertainment rates of test-positive and test-negative subjects. We therefore define this measurement as

*OR^Λ*(*t*) = [(1 – *φ*)*e*^{–λ_I t} + *φθe*^{–θλ_I t}]/*e*^{–λ_I t} = (1 – *φ*) + *φθe*^{(1–θ)λ_I t},

again reducing to 1 – *φ* under “all-or-nothing” protection but allowing bias to persist under “leaky” protection for all recipients, for whom *OR^Λ*(*t*) = *θe*^{(1–θ)λ_I t}; bias follows the same pattern of being nonexistent at *t*=0 and complete as *t* → ∞. Intuitively, the bias described here arises due to differential depletion of vaccinated and unvaccinated susceptible individuals, consistent with case-control studies more generally.^{19} Presuming the vaccine is partially efficacious, more unvaccinated than vaccinated individuals will have been depleted later in the epidemic, confounding instantaneous comparisons of rates in the measurement of leaky vaccine effectiveness. We illustrate functional forms of 1 – *OR^C* and 1 – *OR^Λ* under scenarios of “leaky” and “leaky-or-nothing” protection in **Figure 1** and **Figure 2**, respectively.
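The functional forms described above can be reproduced numerically. A sketch of the two OR measurements under an assumed daily infection hazard (all parameter values are illustrative assumptions):

```python
import math

def or_cumulative(phi, theta, lam_i, t):
    """OR^C: odds ratio measured from cumulative cases."""
    num = (1 - phi) * (1 - math.exp(-lam_i * t)) + phi * (1 - math.exp(-theta * lam_i * t))
    return num / (1 - math.exp(-lam_i * t))

def or_rate(phi, theta, lam_i, t):
    """OR^Lambda: odds ratio measured from instantaneous ascertainment rates."""
    return (1 - phi) + phi * theta * math.exp((1 - theta) * lam_i * t)

lam = 0.01  # assumed daily hazard of influenza infection

# "All-or-nothing" protection (phi = 0.5, theta = 0): both ORs equal 1 - phi at all t.
print(or_cumulative(0.5, 0.0, lam, 90), or_rate(0.5, 0.0, lam, 90))

# "Leaky" protection (phi = 1, theta = 0.5; true VE = 0.5): 1 - OR^C drifts toward
# the null of 0 with study duration, while 1 - OR^Lambda can even change sign.
for t in (1, 90, 500):
    print(t, 1 - or_cumulative(1.0, 0.5, lam, t), 1 - or_rate(1.0, 0.5, lam, t))
```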

To aid interpretation in the context of previous studies,^{3,9} we also illustrate the modeled causal process using a directed acyclic graph (**Figure 3**) revealing that the special case of all-or-nothing protection precludes bias from vaccine-derived protection against influenza infections occurring before the ARI episode for which an individual seeks care.

### Conditions for sign bias

In some applications, testing for a protective or harmful effect of the vaccine may take priority over obtaining precise measurements of the effect size. The conclusions of such hypothesis tests rest on an assumption that the OR is not subject to sign bias, defined as the circumstance *OR* > 1 for an effective vaccine (i.e., one with [(1 – *φ*) + *φθ*] < 1), or *OR* < 1 for an ineffective vaccine (for which [(1 – *φ*) + *φθ*] > 1). Here we identify circumstances under which estimates may be subject to sign bias.

#### Measuring the odds ratio from cumulative cases

Consider the condition *OR^C* > 1, which implies (1 – *φ*)(1 – *e*^{–λ_I t}) + *φ*(1 – *e*^{–θλ_I t}) > 1 – *e*^{–λ_I t}, equivalently *e*^{–λ_I t} > *e*^{–θλ_I t}, and thus *θ* > 1. Sign bias is present if [(1 – *φ*) + *φθ*] < 1, which requires *θ* < 1. Thus, the necessary conditions for sign bias cannot be met. In the converse situation, *OR^C* < 1 and [(1 – *φ*) + *φθ*] > 1 imply *θ* < 1 and *θ* > 1, respectively, which again cannot be mutually fulfilled. Thus, sign bias does not occur in measurements based on cumulative cases, provided vaccination is uncorrelated with exposure or susceptibility to the infections.

#### Measuring the odds ratio from ascertainment rates

**Figure 1** and **Figure 2** illustrate that sign bias may affect measurements based on the ascertainment rates of test-positive and test-negative subjects. For *θ* < 1, the condition *OR^Λ* > 1 implies *θe*^{(1–θ)λ_I t} > 1, so that sign bias arises when *t* > ln(1/*θ*)/[(1 – *θ*)*λ_I*]. Conversely, for *θ* > 1, sign bias (indicated by *OR^Λ* < 1) arises under *t* > ln(1/*θ*)/[(1 – *θ*)*λ_I*], a positive quantity in this case because the numerator and denominator are both negative. These circumstances demonstrate the need for caution in interpreting time-specific (continuous or sub-seasonal) VE measurements.^{22–25}
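Setting *OR^Λ* equal to 1 and solving for *t* gives the onset time of sign bias explicitly; *φ* cancels from the condition. A sketch under an assumed daily hazard (illustrative values):

```python
import math

def or_rate(phi, theta, lam_i, t):
    """Rate-based OR under leaky-or-nothing protection."""
    return (1 - phi) + phi * theta * math.exp((1 - theta) * lam_i * t)

def sign_bias_onset(theta, lam_i):
    """t* = ln(1/theta) / ((1 - theta) * lam_i); phi cancels from the condition."""
    return math.log(1 / theta) / ((1 - theta) * lam_i)

lam = 0.02                             # assumed daily infection hazard
t_star = sign_bias_onset(0.5, lam)     # onset time for theta = 0.5
print(t_star)                          # roughly 69 days under these assumptions
print(or_rate(1.0, 0.5, lam, t_star))  # OR crosses the null value of 1 at t*
print(or_rate(1.0, 0.5, lam, t_star + 30) > 1)  # sign bias thereafter
```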

### Extensions maintaining the assumption of vaccination unconfounded by exposure or susceptibility to the infections

#### Time-varying infection rates

Consider that transmission intensity varies over time, so that acquisition rates are *λ_I*(*τ_j*) at time *τ_j*. Indexing by day, the probability of evading infection to time *t* is exp[–Σ_{τ_j≤t} *λ_I*(*τ_j*)] = exp[–Λ_I(*t*)], which can be substituted for *e*^{–λ_I t} in the above expressions, so that *OR^C*(*t*) = [(1 – *φ*)(1 – *e*^{–Λ_I(t)}) + *φ*(1 – *e*^{–θΛ_I(t)})]/(1 – *e*^{–Λ_I(t)}) and *OR^Λ*(*t*) = (1 – *φ*) + *φθe*^{(1–θ)Λ_I(t)}, resembling the expressions for time-invariant *λ_I* and retaining the relevant biases.
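This substitution is easy to verify numerically: the OR depends on the epidemic curve only through the cumulative hazard. A sketch assuming an illustrative sinusoidal seasonal forcing:

```python
import math

def or_rate_cumhaz(phi, theta, cum_haz):
    """Rate-based OR with the cumulative hazard substituted for lam_i * t."""
    return ((1 - phi) * math.exp(-cum_haz)
            + phi * theta * math.exp(-theta * cum_haz)) / math.exp(-cum_haz)

# Assumed seasonal forcing: daily hazards over a 180-day season.
daily = [0.01 * (1 + math.sin(2 * math.pi * j / 180)) for j in range(180)]
cum_haz = sum(daily)  # Lambda_I(t): all that enters the OR

# For phi = 1 this matches the constant-rate form theta * exp((1-theta)*Lambda).
print(or_rate_cumhaz(1.0, 0.5, cum_haz))
print(abs(or_rate_cumhaz(1.0, 0.5, cum_haz) - 0.5 * math.exp(0.5 * cum_haz)) < 1e-9)
```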

_{l}#### Immunizing test-negative conditions

Under an assumption of immunity to the test-negative condition(s), test-negative vaccinated and unvaccinated subjects are ascertained at the rates *vPμ_V π_N λ_N e*^{–λ_N t} and (1 – *v*)*Pμ_U π_N λ_N e*^{–λ_N t}, with cumulative cases *C_V^N*(*t*) = *vPμ_V π_N*(1 – *e*^{–λ_N t}) and *C_U^N*(*t*) = (1 – *v*)*Pμ_U π_N*(1 – *e*^{–λ_N t}). The terms describing the proportions of vaccinated and unvaccinated persons remaining susceptible to the test-negative condition (exp[–*λ_N t*]) and the cumulative proportions infected by the test-negative condition (1 – exp[–*λ_N t*]) cancel in the expressions indicated above for *OR^Λ* and *OR^C*, respectively, which invoke exp[–*λ_N t*]/exp[–*λ_N t*] and (1 – exp[–*λ_N t*])/(1 – exp[–*λ_N t*]). Thus, our original VE derivations apply to the scenario of immunizing test-negative conditions.

_{N}t## PERFORMANCE OF THE ODDS RATIO UNDER DIFFERENTIAL EXPOSURE OR SUSCEPTIBILITY OF VACCINATED AND UNVACCINATED PERSONS TO THE INFECTIONS

The test-negative design is typically employed in observational studies where individuals have received vaccination voluntarily. In contrast to assumptions in the above section that vaccination is uncorrelated with exposure or susceptibility to infection, variation in vaccine uptake across risk groups is well-recognized.^{3} For instance, preferential vaccine receipt has been reported among relatively healthy older adults^{28,29} and among persons prioritized for vaccination such as healthcare workers (who may have elevated risk of encountering infected persons) and individuals with underlying health conditions (who may be at risk for severe outcomes if infected).^{30,31} This circumstance corresponds to the presence of a confounder (“G” in **Figure 3**) related to disease risk as well as vaccination.

Define *α_I* and *α_N* as the relative rates of acquiring influenza and test-negative conditions, respectively, between the vaccinated and unvaccinated, resulting from differential exposure to infectious persons or differential susceptibility to infection given exposure. Ascertainment rates of test-positive and test-negative vaccinated subjects are *vPμ_V π_I α_I λ_I*[(1 – *φ*)*e*^{–α_I λ_I t} + *φθe*^{–θα_I λ_I t}] and *vPμ_V π_N α_N λ_N*, resulting in cumulative case measures *C_V^I*(*t*) = *vPμ_V π_I*[(1 – *φ*)(1 – *e*^{–α_I λ_I t}) + *φ*(1 – *e*^{–θα_I λ_I t})] and *C_V^N*(*t*) = *vPμ_V π_N α_N λ_N t*; expressions for the unvaccinated are unchanged. Test-positive ascertainment rates and cumulative cases are *vPμ_V π_I*(1 – *φ*)*α_I λ_I e*^{–α_I λ_I t} and *vPμ_V π_I*(1 – *φ*)(1 – *e*^{–α_I λ_I t}) under the scenario of “all-or-nothing” protection, and *vPμ_V π_I θα_I λ_I e*^{–θα_I λ_I t} and *vPμ_V π_I*(1 – *e*^{–θα_I λ_I t}) under “leaky” protection among all vaccinated individuals.

Estimating VE from cumulative cases, 1 – *OR^C*(*t*) = 1 – [(1 – *φ*)(1 – *e*^{–α_I λ_I t}) + *φ*(1 – *e*^{–θα_I λ_I t})]/[*α_N*(1 – *e*^{–λ_I t})], whereas the estimate based on ascertainment rates is 1 – *OR^Λ*(*t*) = 1 – (*α_I*/*α_N*)[(1 – *φ*)*e*^{–α_I λ_I t} + *φθe*^{–θα_I λ_I t}]/*e*^{–λ_I t}. These estimates reduce to 1 – (1 – *φ*)(1 – *e*^{–α_I λ_I t})/[*α_N*(1 – *e*^{–λ_I t})] and 1 – (*α_I*/*α_N*)(1 – *φ*)*e*^{(1–α_I)λ_I t} under “all-or-nothing” protection, and to 1 – (1 – *e*^{–θα_I λ_I t})/[*α_N*(1 – *e*^{–λ_I t})] and 1 – (*θα_I*/*α_N*)*e*^{(1–θα_I)λ_I t} under “leaky” protection.

Consider alternatively that *β_I* and *β_N* are the relative risks of ARI given infection in vaccinated and unvaccinated persons resulting from differences in risk factors influencing disease progression; we distinguish that these differences owe to factors other than vaccine-derived protection,^{21} and consider vaccine protection against disease progression in a subsequent section. Incorporating *π_I* and *π_N* into the ORs formulated above to allow such heterogeneity, *OR^C*(*t*) = (*β_I*/*β_N*)[(1 – *φ*)(1 – *e*^{–λ_I t}) + *φ*(1 – *e*^{–θλ_I t})]/(1 – *e*^{–λ_I t}) and *OR^Λ*(*t*) = (*β_I*/*β_N*)[(1 – *φ*) + *φθe*^{(1–θ)λ_I t}]. Under “all-or-nothing” protection, *OR^C* = *OR^Λ* = (*β_I*/*β_N*)(1 – *φ*), which reduces to the direct VE if differences between vaccinated and unvaccinated persons equally affect progression of influenza and test-negative conditions to symptoms, i.e. *β_I* = *β_N*. For a vaccine conferring “leaky” protection to all recipients, *OR^C*(*t*) = (*β_I*/*β_N*)(1 – *e*^{–θλ_I t})/(1 – *e*^{–λ_I t}), reducing when *β_I* = *β_N* to the bias present under randomization. Incorporating heterogeneity in both acquisition and progression, *OR^C*(*t*) = [*β_I*/(*α_N β_N*)][(1 – *φ*)(1 – *e*^{–α_I λ_I t}) + *φ*(1 – *e*^{–θα_I λ_I t})]/(1 – *e*^{–λ_I t}) and *OR^Λ*(*t*) = [*α_I β_I*/(*α_N β_N*)][(1 – *φ*)*e*^{–α_I λ_I t} + *φθe*^{–θα_I λ_I t}]/*e*^{–λ_I t}. These circumstances underscore that differential vaccine uptake among persons at high and low risk for infection or for symptoms given infection—a well-known phenomenon in observational studies of vaccines and other health interventions—undermines causal interpretations of the OR in test-negative studies.
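Confounded uptake can also mimic waning. A sketch of the rate-based OR with relative acquisition rates (a_i, a_n) and relative progression risks (b_i, b_n) among vaccinated relative to unvaccinated persons (all parameter values are illustrative assumptions):

```python
import math

def or_rate_confounded(phi, theta, lam_i, t, a_i, a_n, b_i, b_n):
    """Rate-based OR when vaccinated persons acquire influenza and test-negative
    conditions at relative rates a_i and a_n, and progress to symptoms with
    relative risks b_i and b_n; setting all four to 1 recovers the unconfounded OR."""
    num = a_i * b_i * ((1 - phi) * math.exp(-a_i * lam_i * t)
                       + phi * theta * math.exp(-theta * a_i * lam_i * t))
    return num / (a_n * b_n * math.exp(-lam_i * t))

# "All-or-nothing" vaccine with true VE = 0.5, but vaccinees experience only
# 0.7x the exposure of unvaccinated persons (a_i = 0.7): the apparent VE
# declines over the season even though true protection never changes.
for t in (0, 120, 240):
    print(t, 1 - or_rate_confounded(0.5, 0.0, 0.01, t, 0.7, 1.0, 1.0, 1.0))
```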

## BIAS ASSOCIATED WITH DIFFERENTIAL TREATMENT SEEKING AMONG THE VACCINATED AND UNVACCINATED

To this point we have considered ARI as a singular clinical entity and assumed all individuals seeking care for ARI are tested for influenza. However, different infections may cause clinically-distinct presentations that influence the likelihood that individuals seek treatment, or the likelihood that clinicians test for influenza.^{32} Indeed, previous studies have reported variation across settings and over time in the proportion of ARI or influenza-like-illness patients from whom specimens are collected, and the proportion of these specimens that are tested.^{33} Here we address the possibility for such a scenario to lead to selection bias, highlighted by the pathway V←H→T in **Figure 3**.

Consider that the spectrum of clinical presentations can be discretized into “moderate” (*M*) and “severe” (*S*) classes, occurring with probabilities *p_I^M* and *p_I^S* = 1 – *p_I^M* for influenza and *p_N^M* and *p_N^S* = 1 – *p_N^M* for test-negative conditions. Define *μ_V^M*, *μ_V^S*, *μ_U^M*, and *μ_U^S* as the associated probabilities of seeking care given symptoms and vaccination status, and let *ξ^M* and *ξ^S* indicate the probabilities of receiving a test given symptoms. An infected person is thus ascertained with probability *m_V^I* = *p_I^M μ_V^M ξ^M* + *p_I^S μ_V^S ξ^S* if vaccinated and influenza-infected, with *m_U^I*, *m_V^N*, and *m_U^N* defined analogously, so that each OR derived above is multiplied by the factor (*m_V^I*/*m_U^I*)/(*m_V^N*/*m_U^N*). The test-negative VE measures reduce to 1 – *OR^C* × (*m_V^I*/*m_U^I*)/(*m_V^N*/*m_U^N*) and 1 – *OR^Λ* × (*m_V^I*/*m_U^I*)/(*m_V^N*/*m_U^N*). In both situations, bias associated with differential treatment-seeking persists unless the relative probability of ascertainment given infection does not differ for influenza and other conditions: *m_V^I*/*m_U^I* = *m_V^N*/*m_U^N*. For the scenario with severity classes indexed by *K*, bias persists unless Σ_K *p_I^K μ_V^K ξ^K*/Σ_K *p_I^K μ_U^K ξ^K* = Σ_K *p_N^K μ_V^K ξ^K*/Σ_K *p_N^K μ_U^K ξ^K*, or, more generally, unless the analogous equality holds when accommodating all possible factors that influence whether individuals are tested.
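The persistence condition can be explored numerically by computing the multiplicative factor that severity-dependent care-seeking and testing introduce into the OR; a sketch with hypothetical probabilities (all values are assumptions for illustration):

```python
def ascertain_prob(p_severe, mu_m, mu_s, xi_m, xi_s):
    """Probability an ARI case is ascertained: mixture over moderate/severe
    presentation of (seek care) x (get tested)."""
    return (1 - p_severe) * mu_m * xi_m + p_severe * mu_s * xi_s

# Hypothetical values: influenza is severe more often than test-negative ARI
# (40% vs 10%), and vaccinated persons seek care more readily for moderate illness.
flu_v = ascertain_prob(0.4, mu_m=0.5, mu_s=0.9, xi_m=0.3, xi_s=0.8)
flu_u = ascertain_prob(0.4, mu_m=0.2, mu_s=0.9, xi_m=0.3, xi_s=0.8)
neg_v = ascertain_prob(0.1, mu_m=0.5, mu_s=0.9, xi_m=0.3, xi_s=0.8)
neg_u = ascertain_prob(0.1, mu_m=0.2, mu_s=0.9, xi_m=0.3, xi_s=0.8)

bias = (flu_v / flu_u) / (neg_v / neg_u)
print(bias)  # differs from 1: the OR is multiplied by this factor

# Restricting testing to severe presentations (xi_m = 0) removes the factor.
strict = (ascertain_prob(0.4, 0.5, 0.9, 0.0, 0.8) / ascertain_prob(0.4, 0.2, 0.9, 0.0, 0.8)) \
       / (ascertain_prob(0.1, 0.5, 0.9, 0.0, 0.8) / ascertain_prob(0.1, 0.2, 0.9, 0.0, 0.8))
print(strict)  # 1.0
```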

### Correction of bias through the use of clinical criteria for enrollment and testing

A possible correction exists when enrollment and testing are tied to stringently-defined clinical criteria. If tests are performed only for “severe” presentations in the above example (substituting *ξ^M*=0), the care-seeking and testing terms cancel, and the OR retains bias only from differential infection rates and symptom risk between the vaccinated and unvaccinated: *OR^C*(*t*) = [*β_I*/(*α_N β_N*)][(1 – *φ*)(1 – *e*^{–α_I λ_I t}) + *φ*(1 – *e*^{–θα_I λ_I t})]/(1 – *e*^{–λ_I t}) when measured from cumulative incidence, or *OR^Λ*(*t*) = [*α_I β_I*/(*α_N β_N*)][(1 – *φ*)*e*^{–α_I λ_I t} + *φθe*^{–θα_I λ_I t}]/*e*^{–λ_I t} when measured from the ascertainment rate. Thus, bias associated with differential care-seeking or different clinical presentations between influenza and test-negative infections can be eliminated if testing is tied to strict clinical criteria, although previously-identified sources of bias will persist.

## MEASURING VACCINE EFFECTIVENESS AGAINST PROGRESSION

In addition to protection against infection, reductions in symptom risk given infection are of interest in VE measures.^{21} Define *ρ* as the relative risk for vaccine-protected individuals to experience symptoms given infection owing to vaccine-derived immunity. When vaccination is not correlated with exposure or susceptibility to the infections, the rate of ascertaining test-positive vaccinated persons is *vPμ_V π_I λ_I*[(1 – *φ*)*e*^{–λ_I t} + *ρφθe*^{–θλ_I t}] and the cumulative count is *vPμ_V π_I*[(1 – *φ*)(1 – *e*^{–λ_I t}) + *ρφ*(1 – *e*^{–θλ_I t})], so that *OR^Λ*(*t*) = (1 – *φ*) + *ρφθe*^{(1–θ)λ_I t} and *OR^C*(*t*) = [(1 – *φ*)(1 – *e*^{–λ_I t}) + *ρφ*(1 – *e*^{–θλ_I t})]/(1 – *e*^{–λ_I t}). However, under the special case that a vaccine reduces risk of symptoms without protecting against infection (*θ*=1)—as might apply to oral cholera vaccines^{34–36}—these measures reduce to *OR* = (1 – *φ*) + *ρφ*, an unbiased estimate of VE against progression (1 – *OR* = *φ*(1 – *ρ*)). Under confounding between vaccination and exposure or susceptibility to the infections, *OR^Λ*(*t*) = (*α_I*/*α_N*)[(1 – *φ*)*e*^{–α_I λ_I t} + *ρφθe*^{–θα_I λ_I t}]/*e*^{–λ_I t}, reducing to (*α_I*/*α_N*)[(1 – *φ*) + *ρφ*]*e*^{(1–α_I)λ_I t} for a vaccine protecting against symptoms only (*θ*=1).
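A numerical check of the *θ*=1 special case, folding the symptom-risk reduction *ρ* into the rate-based OR (parameter values are illustrative assumptions):

```python
import math

def or_rate_progression(phi, theta, rho, lam_i, t):
    """Rate-based OR when vaccine responders have hazard ratio theta for
    infection and relative risk rho of symptoms given infection."""
    return (1 - phi) + rho * phi * theta * math.exp((1 - theta) * lam_i * t)

# No protection against infection (theta = 1), halved symptom risk (rho = 0.5),
# 80% responders: the OR is time-invariant, and 1 - OR = phi * (1 - rho) = 0.4,
# the true VE against progression.
for t in (0, 60, 180):
    print(t, 1 - or_rate_progression(0.8, 1.0, 0.5, 0.01, t))
```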

## IMPLICATIONS

Recent years have seen growing enthusiasm about the integration of data from observational epidemiologic studies in decisions surrounding influenza vaccine policy,^{37} in part based on belief that vaccine direct effects—which have traditionally been measured in prospective, randomized controlled trials—can be recovered under the test-negative design.^{4,8,10,20} However, the uptake of test-negative designs by researchers and policymakers has preceded thorough examination of its theoretical justification.^{11} Our analysis highlights limitations to causal interpretations of VE estimates based on the exposure OR from test-negative studies.

Our most troubling finding is that the OR measured by test-negative studies is unsuited to estimating VE even under circumstances consistent with randomized vaccine allocation, unless protection is known to follow an “all-or-nothing” mechanism of action. These results echo longstanding concerns about measurement of the effectiveness of “leaky” vaccines in case-control studies.^{19,38–40} Researchers rarely know *a priori* whether a vaccine confers “leaky” or “all-or-nothing” protection, or the intermediate general case of “leaky-or-nothing” protection considered here. This makes it difficult to know under what circumstances test-negative studies may be subject to the resulting bias.

We also show that certain traditionally-recognized sources of confounding in observational studies—arising due to differential exposure or susceptibility to infection and symptoms among vaccinated and unvaccinated persons—persist under the test-negative design. Because resulting biases may lead to time-varying estimates of vaccine effectiveness, inferences of waning vaccine protection in test-negative studies may be suspect when predicated on an assumption of no change in the OR over time. This null hypothesis is commonly invoked in test-negative studies reporting waning protection.^{23–27} Last, whereas the test-negative design has traditionally been viewed as a strategy to eliminate treatment-seeking bias, we find that such bias may persist under differential symptom severity for influenza and test-negative infections. These concerns have received little attention in studies supporting the theoretical basis of the test-negative design and use of the OR to measure VE.^{4,8,10}

Several assessments of test-negative studies based on DAGs^{3,9} have pointed to such sources of confounding, and the practical importance of these findings has been debated amid uncertainty about the magnitude of associated bias in estimates.^{10} Although DAGs can identify design factors and statistical adjustments for bias, they do not convey the magnitude or direction of such biases in practice. The framework we have taken provides a basis for quantifying bias directly. We identify not only that the OR of test-negative studies can supply VE estimates that are not equal to the causal risk ratio, but also that sign bias may arise such that the OR leads to incorrect inferences about whether a vaccine is effective or not. This is contrary to the frequent assumption that the OR provides, at minimum, a valid test of the null hypothesis of no causal effect.^{9}

Several other approaches have been taken to assess bias in test-negative studies. In informal comparisons, vaccine effectiveness estimates from test-negative studies of live oral rotavirus vaccines and oral cholera vaccines have appeared similar to vaccine efficacy estimates from randomized controlled trials in the same settings.^{41,42} These findings may suggest the quantitative sources of bias we identify are not always large in practice. However, such comparisons have been difficult to make for test-negative estimates of seasonal influenza vaccine effectiveness, as randomized studies of efficacy are not conducted on a year-to-year basis amid alterations to the strain composition of vaccines and changes to the immune profile of hosts. This has led to difficulty accounting for instances where conclusions of randomized controlled trials and test-negative studies have appeared to be in conflict. For instance, LAIV effectiveness has appeared poor in test-negative studies undertaken since the emergence in 2009 of a novel H1N1 influenza A virus,^{43,44} despite superior efficacy of LAIV over inactivated influenza vaccine among children in earlier randomized controlled trials.^{45–47} While this led to recommendations against LAIV use during the 2016–2017 season in the United States, debate continues to surround the contribution of strain mismatch, confounding host factors, and study design to the apparent underperformance of LAIV in recent test-negative studies.^{12,48,49}

Our analysis identifies limitations to the validity of VE estimates based on the vaccine-exposure OR under the test-negative design. Certain improvements can be made to strengthen the evidence base contributed by such studies. We have shown that the use of strict clinical criteria or case definitions for enrollment and testing can ensure bias does not arise due to differential healthcare-seeking behavior among vaccinated and unvaccinated persons. Whereas test-negative studies typically stratify estimates according to influenza type/subtype or even the genetic clade, our findings suggest bias may persist if there are meaningful epidemiologic differences in risk factors for infection and disease among vaccinated and unvaccinated persons. This bias can be reduced by stratifying estimates to minimize differences in exposure or susceptibility to infection among vaccinated and unvaccinated persons. However, the inability of test-negative studies to measure “leaky” or “leaky-or-nothing” protection accurately suggests a need for prospective study designs to monitor effectiveness of seasonal influenza vaccines.^{19,38–40} Evidence from test-negative studies should be interpreted with the limitations we report here in mind, in particular for vaccination policymaking traditionally premised on causal inferences from randomized, prospective studies.

## Footnotes

Source of funding: This work was supported by the National Institutes of Health/National Institute of General Medical Sciences (grant U54GM088558 to ML).

Conflicts of interest: The authors declare no competing interests.