Comparative Analysis of Genetically-Modified Crops: Conditional Equivalence Criteria

The comparative assessment of genetically-modified (GM) crops relies on the principle of substantial equivalence, which states that such products should be compared to conventional counterparts that have an established history of safe use. To operationalize this principle, the GMO Panel of the European Food Safety Authority (EFSA) proposed an equivalence test that directly compares a GM test variety with a set of unrelated, conventionally-bred reference varieties, with part of the difference being the known genotypic background of the test (the same as that of the given control). The criterion of the EFSA test, however, is defined solely by genotypic differences between the non-traited control and the reference varieties (i.e. the background effect), while the so-called GM trait effect is assumed to be zero. Because the outcome of an EFSA equivalence test is determined primarily by the similarity, or lack thereof, of the control and the references, this investigation proposes a conditional equivalence criterion that focuses on the “unintended” effects of a GM trait, irrespective of the (random) genotypic value of a given control. The new criterion also includes a mean-scaled standard, similar to the 80-125% rule for bioequivalence assessment practiced in the pharmaceutical industry, as an alternative when the reference variation is zero or close to zero. In addition, optional criteria are proposed with a step-wise procedure to control the rate of false negatives (non-equivalence by chance), providing a comprehensive assessment under multiple comparisons. An application to maize grain composition data demonstrates that the conditional equivalence criterion provides a more effect-specific and more robust assessment of equivalence than the EFSA criterion, especially for GM traits showing negligible or no unintended effects, which is likely true for most traits in the current market.


Introduction
genetic variation among references, and (∆_TC, ∆_CR, ∆_TR) the mean differences among the three groups. For the parameters of interest, ∆_TC represents the GM trait effect with a given control, while ∆_CR is the effect of the genotypic background shared by the test and the control from traditional plant breeding. The simple difference of a GM test from the reference mean, ∆_TR (= ∆_TC + ∆_CR), as expected consists of both the GM trait effect ∆_TC and the background effect ∆_CR.

The following are the underlying assumptions of the principle of substantial equivalence and

"It focuses on assessing the safety of any identified differences so that the safety of the new product can be considered relative to its conventional counterpart" [2]. Although the regulatory evaluation of a GM crop is made on a given control background, upon approval the GM trait could be integrated into any conventional reference (in the current market or from a breeding program) during commercial application, and the background effect ∆_CR is expected to vary from endpoint to endpoint for a given control, or for the same endpoint across different controls [17,18,19]. An equivalence assessment of a GM crop should focus solely on the GM trait effect ∆_TC (= ∆_TR - ∆_CR), regardless of the genotypic background effect ∆_CR of a given control (Fig 1).

(b) Substantial equivalence of a GM crop in statistics is a similarity measure to a distribution of conventional references with a history of safe use;

"Any observed differences should be assessed in the context of the range of natural variations" [2] demonstrated by conventional references, with no requirement of a trait effect ∆_TC = 0. Although a given control is used in a TCR trial, the equivalence of a GM crop in statistics is a similarity or distance measurement of the mean difference ∆ between two probability distributions, one for the GM crop with various genotypic backgrounds and one for conventional references, on the scale of the reference variation.

(c) Equivalence conclusion of a GM crop (or a GM trait) relies on the totality of evidence across key components when compared with a given control background.

Codex guidelines state [2] that "A variety of data and information are necessary to assess unintended effects because no individual test can detect all possible unintended effects or identify, with certainty, those relevant to human health. These data and information, when considered in total, provide assurance that the food is unlikely to have an adverse effect on human health". In practice a comprehensive assessment has been performed over a wide range of endpoints from various studies, e.g. often > 50 analytes in a composition study alone, and any experimental deviation from equivalence has been evaluated in terms of the "natural

First, by the criteria of EFSA and of Vahl and Kang, the equivalence of a trait effect ∆_TC would be largely determined by the background effect ∆_CR, not only by its magnitude but also by its direction.

Opposite signs of ∆_TC and ∆_CR would be much more likely to lead to an equivalence conclusion than the same sign would. In addition, the probability thresholds for and assume ∆

those cases with zero estimate of the reference variation as "Equivalence Not Concluded", an arbitrary conclusion is expected as the reference variation becomes less than a certain threshold (relative to the residual variation) due to a large proportion of close-to-zero criterion. A second problem is that, with a criterion defined by a 95% confidence limit, the false negative rate (i.e. non-equivalence by chance) is expected to be at least 5% for each endpoint and would be much higher due to the proof-of-equivalence. As a comprehensive assessment requires a totality of evidence, optional criteria become necessary.
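The dependence on the sign of the background effect can be illustrated with a small numerical sketch. It is not the authors' procedure: the function name, the subscript labels (TC, CR, TR for test-control, control-reference and test-reference) and the numbers are assumptions chosen for illustration only. The same trait effect is combined with a background effect of opposite or equal sign, and the resulting test-reference difference is printed.

```python
def decompose(test_mean, control_mean, reference_mean):
    """Split the test-vs-reference difference into trait and background parts.

    Hypothetical helper for illustration; the labels follow a
    test (T), control (C), reference (R) layout.
    """
    d_tc = test_mean - control_mean        # trait effect with the given control
    d_cr = control_mean - reference_mean   # background effect of the control
    d_tr = test_mean - reference_mean      # simple difference from the reference mean
    return d_tc, d_cr, d_tr


# Illustrative numbers only (not taken from the maize data).
reference_mean = 10.0
trait_effect = 0.5

for background_effect in (-0.5, 0.5):
    control_mean = reference_mean + background_effect
    test_mean = control_mean + trait_effect
    d_tc, d_cr, d_tr = decompose(test_mean, control_mean, reference_mean)
    print(f"trait={d_tc:+.1f}  background={d_cr:+.1f}  test-reference={d_tr:+.1f}")
```

With these made-up values the trait effect is identical in both runs, yet the test-reference difference is 0.0 when the background effect has the opposite sign and 1.0 when it has the same sign. A criterion applied to the test-reference difference would therefore treat the two cases differently.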

Alternative criteria in a comprehensive assessment
An empirical mean-scaled criterion when the reference variation is low
Alternative criteria are discussed in this section for the case in which the reference variation is too low for a variation-scaled criterion. When the reference variance is low, the references in a TCR trial become similar to each other, including the control. Consequently, the equivalence of a GM crop may become the same as the bioequivalence of a generic drug to a brand-name reference in the pharmaceutical industry, in terms of the comparison between the test and the control and the absence of references.
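As a rough illustration of the bioequivalence analogy, a mean-scaled check in the spirit of the 80-125% rule can be sketched as follows. This is a simplification under stated assumptions: it compares a point estimate of the test/control mean ratio with the bounds, whereas bioequivalence practice applies the bounds to a confidence interval. The function name and the numbers are hypothetical, and both means are assumed to be positive.

```python
import math

LOWER, UPPER = 0.80, 1.25  # the 80-125% bounds on the ratio of means


def within_80_125(test_mean, control_mean):
    """Return True if the test/control mean ratio lies within 80-125%.

    Simplified sketch: uses a point estimate rather than a confidence
    interval, and assumes both means are positive.
    """
    ratio = test_mean / control_mean
    return LOWER <= ratio <= UPPER


# The bounds are symmetric on the log scale: log(1.25) equals -log(0.80).
print(math.isclose(math.log(UPPER), -math.log(LOWER)))

# Illustrative values only.
print(within_80_125(11.0, 10.0))  # ratio 1.10
print(within_80_125(13.0, 10.0))  # ratio 1.30
```

The log-scale symmetry is the reason the interval is 80-125% rather than 80-120%: a ratio and its reciprocal are judged alike.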

The 80-125% rule has long been adopted in the pharmaceutical industry for two drugs being "similar to such a degree that their effects, with respect to both efficacy and safety, will

Two standards appear to be independent in theory, but in practice they are highly correlated, as will be shown in the following maize grain composition example. Let

whole range of reference variation, and with a 95% confidence limit a minimum of 5% false negatives is expected even with no trait effect, and the rate could be much higher due to the proof-of-equivalence (as shown in the following example). The same is true for conditional equivalence.

Therefore, an optional criterion = 2 0.995 -1 ≈ 2.38 corresponding to a 99% confidence limit is recommended to control the number of false negatives.

For the use of the whole range of "natural variation" in a proof-of-equivalence, OECD provided

The last three columns of Table 1

proof-of-equivalence, an estimated difference exceeding the criterion would automatically lead to a non-equivalence conclusion. Estimated differences exceeding the criterion in absolute value were observed for both the EFSA and VK criteria. The three such cases for the EFSA criterion represent almost exactly the 5% of non-equivalence expected by chance, 2.6 (i.e. 5% of 51) in expectation, which simply suggests no evidence of GM trait effects.

These results are the same as in EFSA's original analysis with the transformation, where three analytes were classified as "Non-Equivalence More Likely Than Not" or "Non-Equivalence".
However, for the conditional criteria no estimated trait effect exceeding the criterion was observed. Note that although no trait effect exceeds the 0.95 criterion in Table 1, the 0.99 criterion might still be necessary in formal testing due to the proof-of-equivalence.
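The chance expectation quoted above (5% of 51 analytes) can be checked with a short calculation. The sketch below is illustrative only. It assumes that the endpoints are independent, which the text does not claim, and it treats 5% and 1% as the nominal per-endpoint false negative rates for the 95% and 99% limits. The function names are hypothetical.

```python
from math import comb


def expected_false_negatives(n_endpoints, rate):
    """Expected number of chance non-equivalence calls across endpoints."""
    return n_endpoints * rate


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), assuming independent endpoints."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))


n_analytes = 51  # number of analytes in the maize example

for rate in (0.05, 0.01):  # nominal rates for the 95% and 99% limits
    expected = expected_false_negatives(n_analytes, rate)
    p_any = prob_at_least(1, n_analytes, rate)
    p_three = prob_at_least(3, n_analytes, rate)
    print(f"rate={rate:.2f}  expected={expected:.2f}  "
          f"P(>=1)={p_any:.3f}  P(>=3)={p_three:.3f}")
```

Under these assumptions the expected count at the 5% rate is 51 × 0.05 = 2.55, which matches the value of about 2.6 given in the text, and moving to the 1% rate lowers it to 0.51.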

In summary, although a formal statistical test is yet to be developed, simple comparisons in

should be considered as a natural alternative to a variation-scaled criterion, which is strongly supported by the close correlation between the mean and variation in the example (Fig 3) and

In their guideline, EFSA also proposed a simulation approach for evaluating equivalence by an empirical distribution of the number of significant outcomes in the difference testing between two independent references. Regardless of the residual variation, the absolute mean difference between two references is 2 and a 95% confidence limit would be approximately 2.