Abstract
The Rotating Snakes illusion is a motion illusion based on repeating, asymmetric luminance patterns. Recently, we found certain grey-value conditions where a weak, illusory motion occurs in the opposite direction. Of the four models for explaining the illusion, one (Backus and Oruç, 2005) also explains the unexpected perceived opposite direction. We here present a simple new model, without free parameters, based on an array of standard correlation-type motion detectors with a subsequent non-linearity (e.g., saturation) before summing the detector outputs. The model predicts (1) the pattern-appearance motion illusion for steady fixation, (2) an illusion under the real-world situation of saccades across or near the pattern (pattern shift), (3) a relative maximum of illusory motion for the same grey values where it is found psychophysically, and (4) the inverse illusion for certain luminance values. We submit that the model’s sparseness of assumptions justifies adding a fifth model to explain this illusion.
Introduction
Certain spatial patterns can evoke illusory movement, especially under dynamic viewing. An early example was the Peripheral Drift Illusion (Fraser & Wilcox, 1979). A. Kitaoka optimized its patterns and colors, leading to the strong, beautiful and widely known “Rotating Snakes Illusion” (Kitaoka, 2003; Kitaoka & Ashida, 2003, Fig. 5). While usually rendered in color, it is nearly as strong in greyscale (Conway et al., 2005). The Rotating Snakes Illusion with its many variations continues to fascinate beyond the vision research community. It basically needs four luminance levels in an asymmetric spatial arrangement, e.g. black/light-grey/dark-grey/white (Murakami et al., 2006). When we assessed the optimal luminance conditions for the intermediate grey levels, we found a previously unknown “parameter island of weak opposite rotation” when mapped into the plane of light-grey vs. dark-grey values for the middle two patches (Atala-Gérard & Bach, 2017).
To date, there are four models explaining the illusory motion in the Rotating Snakes Illusion (Backus & Oruç, 2005; Conway et al., 2005; Fermüller et al., 2010; Murakami et al., 2006). As we have discussed previously, only the Backus and Oruç (2005) model is able to predict the “island of opposite rotation” (Atala-Gérard & Bach, 2017). However, for this to work it requires a specific contrast transfer function (Atala-Gérard, 2018) which differs from the one used by Backus and Oruç (2005). Furthermore, in the natural viewing situation, the Snake pattern does not suddenly appear from neutral background, as assumed in this model, and for seeing the illusion, saccades are necessary (small or large (Otero-Millan et al., 2012)).
We here present a fifth, simpler and parameter-free model, based on nothing but an array of standard Reichardt correlation detectors [which are equivalent to the motion energy model (Adelson & Bergen, 1985), but see Borst (2007)] with a subsequent non-linear (e.g. saturating) transfer function before summing their outputs. This, together with saccades while viewing, or the appearance of the pattern out of a grey background, predicts the standard Rotating Snakes Illusion, including a parameter region leading to the opposite direction of rotation.
These findings suggest that the Rotating Snakes Illusion can be regarded as a necessary side effect when arrays of motion detectors are combined in a non-linear fashion.
A new, simple computational model
We will present the model in two steps, beginning with a simpler situation, namely the appearance of the stimulus pattern from a grey background. We will then move on to natural viewing conditions, in which the observer performs saccades across the stimulus picture. The model only assumes the presence of arrays (Snippe & Koenderink, 1994) of standard Reichardt-Hassenstein correlation detectors (Borst & Egelhaaf, 1989; Hassenstein & Reichardt, 1956, Fig. 4e; Reichardt, 1986). As these are mathematically equivalent to the motion energy model (Adelson & Bergen, 1985), all our findings will hold for the energy model as well. In the model presented here, it proved necessary to add a sign-conserving non-linearity (of nearly any shape, see below) at the output of individual motion detectors, before averaging across the detector array. Saturating non-linearities are frequently observed in neural systems (Peirce, 2007) and, specifically, in the motion system (Derrington & Goddard, 1989), and were suggested by Adelson and Bergen (1985): “A compressive nonlinearity (such as a square root) may follow…”. We tested various sigmoid functions that all share the property of being point-symmetric around zero, including the arc tangent, the hyperbolic tangent, and the logistic function.
First approximation: pattern appearance
This first approximation to a working model is based on the observation that the appearance of a Snake Pattern from a grey background (pattern appearance) evokes a strong apparent motion even with steady fixation. This is demonstrated on the website https://michaelbach.de/ot/mot-snakesLum/ (Bach, 2020a) if “Modulate contrast” is selected there. To make the geometry more tractable, we uncoiled the original Rotating Snake Illusion with its several “Snake wheels”, and investigated a pattern consisting of repeated “Snake cycles”, each cycle containing four grey values (Fig. 1B).
Basic motion detectors (Fig. 1A, based on Reichardt, 1987, his Fig. 4c) include a delay τ, which we account for by simply comparing the time points before and after appearance (t0 and t1, respectively), thus leaving the exact delay time undetermined (it may be 50–100 ms). The spatial span of each detector input R equals the width of one stripe of the Snake pattern, and there is no space between adjacent detectors. The sensitivity of the detectors is spatially constant and normalized to 1, so that the input to each detector is simply ∫ L(x) dx, with L(x) being the luminance and the limits of the integral being the spatial span of the respective detector input. We use the sign convention that a dark structure on a light background yields a positive output when moving to the right. Given
g1: grey value at t1 at the right input,
g1p: previous (t0) grey value at the right input,
g2: grey value at t1 at the left input,
g2p: previous (t0) grey value at the left input,
and assuming some function f (which could be the identity),
the output d of the Reichardt detector can be calculated by
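One form of this expression that agrees with the definitions and with the sign convention above is sketched here. It is an assumption inferred from the surrounding text, not a quotation; in particular, placing f around the signed difference follows from the statement that the non-linearity acts at the output of each detector.

```latex
d = f\left( g_{1p}\, g_{2} \;-\; g_{1}\, g_{2p} \right)
```

As a check of the sign convention: for a dark bar on a light background stepping from the left to the right input, g2p = 0, g1 = 0 and g1p = g2 = 1, so d = f(1), which is positive for any sign-conserving f.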
Main assumption
The sum (or the average, these differ here only by a scaling factor) of an array of such motion detectors, stimulated by the appearance of a Snake cycle, subserves the apparent motion (Fig. 1B). Thus, given
gp: the previous (at t0) background grey value at all inputs,
gb and gw for the black and white value, respectively,
the sum d∑ of four detectors, stimulated by a full Snake cycle, will be given by
Note that d∑ will collapse to zero if f is the identity function. For our purposes we will only require f to be a mapping from [-1, 1] to [-1, 1], point-symmetric around zero, with f(0) = 0, f(1) = 1 and f(−1) = −1. In this generality, we could not solve the problem analytically, so we implemented it as a computational model in the R language (R Core Team, 2020), a free open-source programming and statistical environment; graphs were produced using the package ggplot2. [Full source code in the repository (Bach, 2020b)].
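For orientation, the summed output can be written out under two assumptions that the text does not confirm: each detector outputs f(g1p·g2 − g1·g2p), and the stripes of one cycle are ordered black, g1, white, g2, where g1 and g2 now denote the two intermediate grey values (as in Fig. 2) rather than detector inputs. With all inputs at gp before appearance, each detector sees gp times the difference between its left and right stripe:

```latex
d_{\Sigma} = f\big(g_p\,(g_b - g_1)\big) + f\big(g_p\,(g_1 - g_w)\big)
           + f\big(g_p\,(g_w - g_2)\big) + f\big(g_p\,(g_2 - g_b)\big)

% identity f: the four arguments telescope
g_p\big[(g_b - g_1) + (g_1 - g_w) + (g_w - g_2) + (g_2 - g_b)\big] = 0
```

In this form the collapse to zero for the identity function is immediate, because the four arguments sum to zero around the closed cycle. A non-linear f weights large and small luminance steps differently, so the cancellation generally fails.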
Fig. 2A shows the motion detector array along a Snake cycle with two examples of grey-value pairs (bottom left), and corresponding summed outputs ∑. To give but two examples, for the combination (g1, g2) = (0.25, 0.75), the net output was zero; for the combination (g1, g2) = (0.05, 0.5), a non-zero output resulted. Non-zero net outputs only occurred with the insertion of the aforementioned non-linearity.
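The two examples can be reproduced with a minimal Python sketch of the pattern-appearance model. This is not the authors' R code. The detector form f(g1p·g2 − g1·g2p), the stripe order (black, first grey, white, second grey), the background value 0.5, and the tanh gain k = 3 are all assumptions made for illustration.

```python
import math

def identity(x):
    return x

def saturating(x, k=3.0):
    # Odd, saturating transfer function scaled so that f(1) = 1.
    # The gain k = 3 is an arbitrary illustrative choice.
    return math.tanh(k * x) / math.tanh(k)

def detector(g1, g1p, g2, g2p, f):
    # g1, g1p: right input at t1 and t0; g2, g2p: left input at t1 and t0.
    # Assumed form: the non-linearity acts on the signed correlation difference.
    return f(g1p * g2 - g1 * g2p)

def appearance_sum(grey_a, grey_b, f, g_black=0.0, g_white=1.0, g_prev=0.5):
    # Assumed stripe order within one Snake cycle: black, grey_a, white, grey_b.
    cycle = [g_black, grey_a, g_white, grey_b]
    total = 0.0
    for i in range(4):
        # The cycle repeats, so the last detector wraps around to the first stripe.
        left, right = cycle[i], cycle[(i + 1) % 4]
        total += detector(right, g_prev, left, g_prev, f)
    return total

for greys in [(0.25, 0.75), (0.05, 0.5)]:
    print(greys, appearance_sum(*greys, identity), appearance_sum(*greys, saturating))
```

Under these assumptions the identity function gives a net output of zero for both grey-value pairs, while the saturating function gives zero for (0.25, 0.75) and a non-zero value for (0.05, 0.5).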
Fig. 2B shows the net motion for the full parameter space of the possible grey values (g1, g2). We observed a maximum velocity around the region where that was expected (g1≈5%, g2≈50%) and, indeed, found opposite motion direction where the psychophysical data predict opposite direction of illusory motion.
The various shapes of the non-linearity affected the magnitude of the net motion output, but the distribution in (g1, g2) space remained the same (for all the tested non-linearities). Non-linearities within the receptors themselves were tested as well, but had no qualitative effect in this model and were thus omitted from further analysis, although physiologically they are likely to occur.
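The claim that the shape of the non-linearity changes the magnitude but not the distribution of signs can be illustrated by scanning the (g1, g2) plane. The Python sketch below is not the authors' R code; it uses an assumed summed output (stripe order black, g1, white, g2; background 0.5) and three saturating functions with arbitrarily chosen gains, each rescaled so that f(1) = 1. The centred logistic is one possible reading of "logistic function" that satisfies f(0) = 0.

```python
import math

def tanh_f(x, k=3.0):
    return math.tanh(k * x) / math.tanh(k)

def atan_f(x, k=3.0):
    return math.atan(k * x) / math.atan(k)

def logistic_f(x, k=4.0):
    # Logistic function shifted to pass through the origin, then rescaled to f(1) = 1.
    centred = lambda v: 2.0 / (1.0 + math.exp(-k * v)) - 1.0
    return centred(x) / centred(1.0)

def appearance_sum(g1, g2, f, g_prev=0.5):
    # Assumed cycle: black (0), g1, white (1), g2; every detector sees
    # g_prev * (left stripe - right stripe) after appearance from the background.
    cycle = [0.0, g1, 1.0, g2]
    return sum(f(g_prev * (cycle[i] - cycle[(i + 1) % 4])) for i in range(4))

def sign_map(f, steps=20, eps=1e-9):
    # Sign of the net output on a regular grid; |value| < eps counts as zero
    # to absorb floating-point rounding on the zero lines.
    rows = []
    for j in range(steps + 1):
        row = []
        for i in range(steps + 1):
            v = appearance_sum(i / steps, j / steps, f)
            row.append(0 if abs(v) < eps else (1 if v > 0 else -1))
        rows.append(row)
    return rows

maps = [sign_map(f) for f in (tanh_f, atan_f, logistic_f)]
print("identical sign maps:", maps[0] == maps[1] == maps[2])
```

In this sketch the regions of positive and negative net output coincide for all three functions, and only the magnitudes differ.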
Closer approximation, full model: “Pattern shift”, appearance at random positions along the Snake cycle
We model the effects of saccades across the image by stimulating the motion detector array with Snake cycles changing their positions randomly (Fig. 3). Due to Kitaoka’s (2003; Kitaoka & Ashida, 2003, Fig. 5) arrangement in Snake wheels, saccades can affect the position of a Snake cycle as a random shift (Fig. 3) or at any other angle. Assuming that the latter average out, we will only compute the effects of lateral shifts here.
In Fig. 3, we sketch the model structure: As an initial condition, a Snakes sequence could have any possible shift relative to the final position which is equal to the position of the detectors Ri. Since the problem is circular, we averaged over 40 small shifts of the Snakes sequence until it was identical again. The results are shown in Fig. 4, using three different non-linearities. Unsurprisingly, no motion illusion appears for a linear transfer function (A). For any (of the tested) saturating transfer functions a non-zero (illusory) motion, including the “opposite island”, occurs with the pattern shift model just as with the pattern appearance model. Fig. 4C adds an accelerating non-linearity: illusory motion appears again, but with opposite sign; the same sign reversal occurs also in the pattern-appearance model (not depicted).
For all tested non-linearities, we found areas in the (g1, g2) plane which give non-zero results (and thus illusory motion), but the relative areas of positive and negative motion varied slightly.
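The pattern-shift computation can be sketched in Python as follows. This is one possible reading of the description, not the authors' R implementation. Before the saccade, each detector input is taken as the mean of the displaced pattern over its stripe-wide span; after the saccade, the pattern is aligned with the detectors. Ten samples per stripe (giving 40 shifts per cycle), the stripe order, the detector form, and the particular transfer functions are assumptions.

```python
import math

def identity(x):
    return x

def saturating(x, k=3.0):
    # Odd, saturating; scaled so that f(1) = 1 (k = 3 is an arbitrary choice).
    return math.tanh(k * x) / math.tanh(k)

def accelerating(x):
    # Odd, accelerating; f(1) = 1.
    return x * abs(x)

def shift_model(g1, g2, f, samples_per_stripe=10):
    # Assumed cycle: black, g1, white, g2, sampled finely so it can be shifted in small steps.
    cycle = [0.0, g1, 1.0, g2]
    sps = samples_per_stripe
    n = 4 * sps
    fine = [cycle[j // sps] for j in range(n)]
    total = 0.0
    for k in range(n):  # all circular shifts until the pattern is identical again
        # Inputs at t0: stripe-wise means of the pattern displaced by k samples.
        prev = [sum(fine[(i * sps + j - k) % n] for j in range(sps)) / sps
                for i in range(4)]
        for i in range(4):
            r = (i + 1) % 4  # right-hand neighbour, wrapping around the cycle
            # Assumed detector: f(previous right * current left - current right * previous left).
            total += f(prev[r] * cycle[i] - cycle[r] * prev[i])
    return total / n

for name, f in [("linear", identity), ("saturating", saturating), ("accelerating", accelerating)]:
    print(name, shift_model(0.05, 0.5, f))
```

With these assumptions the linear transfer function averages to zero over the full set of shifts, whereas the saturating and the accelerating function give non-zero net outputs of opposite sign for (g1, g2) = (0.05, 0.5).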
Discussion
We present a computational model which demonstrates that arrays of standard motion detectors (of the correlation or motion-energy type), followed by a non-linear transfer function, exhibit the motion illusion known as the Rotating Snakes illusion. The non-linearity can have nearly any shape, as long as it is monotonic. The model predicts several psychophysical findings: (1) the pattern-appearance motion illusion for steady fixation, (2) an illusion under the natural viewing situation of performing saccades across the pattern (pattern shift), (3) the presence of a relative maximum of illusory motion right at the location in the (g1, g2) parameter space where it is found psychophysically, and (4) the (recently discovered) inverse illusion in a certain parameter region. There are a number of shortcomings and assumptions associated with our simple model, which will be discussed next.
While several aspects of the illusion are captured by the model, there is no quantitative prediction, and it predicts equal strength for the “island of opposite rotation” which perceptually is markedly weaker.
Initially we had considered that the real-world saccadic condition can also be traced to pattern-appearance, invoking saccadic suppression to transform pattern-shift into pattern appearance. This may well be the case, but it seems that saccadic suppression is not needed to explain the illusion.
The most famous version of the “Rotating Snakes” (Kitaoka, 2003) is in color, which we have simplified here to a luminance-only version. This is, of course, computationally much more tractable and the simplification seems justified, as Conway et al. (2005) found similar illusion strengths for the luminance and color variants.
While the present model needs no free parameters to be fitted, there are a number of inherent simplifying assumptions. A major one is that the spatial tuning of the detector inputs is matched to the Snake pattern sequence (Fig. 1B). This may indeed be related to the finding that the illusion is typically strongest when performing saccades in the neighborhood of the picture. Consequently, somewhat larger receptive fields (Strasburger et al., 2011) that are spatially better tuned to the pattern may become involved. Furthermore, optimal matches between the stimulus and the motion detector arrangement will be only fleeting, which matches the perception of this illusion. The model rests on summation of motion detector outputs, thus losing spatial information. But some aggregation of motion detectors is required anyway to account, e.g., for higher-order motion (Lu & Sperling, 1996) or independence of form (Glünder, 1990).
A very specific shortcoming of the model is the missing prediction of the factual illusion weakness in what we call the “island of opposite rotation” as opposed to the standard grey-value region (it rather predicts the same strength). A further assumption is connected to our simplified saccade model. In the model, we consider the saccade’s motion trajectory as an instantaneous step function. However, in reality, saccades will occur more or less randomly when viewing the Snake patterns. Thus, our “pattern shift” situation is only one of many possibilities that can occur. Our assumption that all other angles and positions will “average out” may well deserve more detailed scrutiny.
While the above is a long, and possibly incomplete, list of assumptions, they all appear physiologically plausible. We were surprised that this very simple model predicts more properties of the Rotating Snakes illusion than any of the previous models, and that it yielded similar results for any of the tested monotonic non-linearities.
Conclusion
We demonstrate that an array of standard motion detectors with a non-linear transfer function for each detector before summing the individual receptor outputs gives rise to a motion signal, which qualitatively shows all the known properties of the “Rotating Snakes” illusion. We submit that more complicated models are not required to explain this illusion, since it appears to be a straightforward consequence of the non-linearity (which is widely found in the nervous system) when confronted with the repeated, spatially asymmetric grey-value sequence of the Rotating Snakes illusion. Taken together, this underlines the notion that understanding the mechanisms of illusion can be an automatic byproduct of understanding mechanisms of general visual perception.
Acknowledgements
We thank Hans Strasburger for very careful correction of an earlier version of the manuscript.
The article processing charge was funded by the Baden-Wuerttemberg Ministry of Science, Research and Art and the University of Freiburg in the funding program “Open Access Publishing”.
Footnotes
Conflict of Interest: None.