The threshold of binocularity: natural image statistics explain the reduction of visual acuity in peripheral vision

Visual acuity is greatest in the centre of the visual field, peaking in the fovea and degrading significantly towards the periphery. The rate of decay of visual performance with eccentricity depends strongly on the stimuli and task used in measurement. While detailed measures of this decay have been made across a broad range of tasks, a comprehensive theoretical account of this phenomenon is lacking. We demonstrate that the decay in visual performance can be attributed to the efficient encoding of binocular information in natural scenes. The efficient coding hypothesis holds that the early stages of visual processing attempt to form an efficient coding of ecologically valid stimuli. Using Independent Component Analysis to learn an efficient coding of stereoscopic images, we show that the ratio of binocular to monocular components varied with eccentricity at the same rate as human stereo acuity and Vernier acuity. Our results demonstrate that the organisation of the visual cortex is dependent on the underlying statistics of binocular scenes and, strikingly, that monocular acuity depends on the mechanisms by which the visual cortex processes binocular information. This result has important theoretical implications for understanding the encoding of visual information in the brain.


Introduction
It has long been known that visual acuity is greatest in the centre of the visual field, peaking in the fovea and degrading significantly towards the periphery [1], [14]. For very simple tasks, such as detection of low contrast targets, the rate of change of the cortical magnification factor is correlated with the rate of change of task performance [26].
inputs from photographs produced filters with a similar structure to the receptive fields of simple cells in V1 [22]. Similar results were found for binocular images by Hoyer and Hyvärinen [11]. As with simple-cell receptive fields, ICA components can be characterised in terms of their position, frequency, phase, orientation and binocular properties. The distributions of ICA component characteristics share substantial similarities with the distributions of V1 receptive fields; in particular, the distributions of horizontal and vertical disparities closely resemble those found in V1 [12].

We propose that the requirement for binocular vision imposes a constraint on the finest spatial scale at which visual processing can reliably occur. This threshold of binocularity constraint can then account for both the reduction of binocular acuity with eccentricity and stimulus wavelength, and the reduction in acuity for purely monocular tasks. As such, the need for binocular vision can be seen as imposing a fundamental bottleneck on visual encoding which extends beyond the requirements of binocular depth perception, and provides a direct explanation for the overall fall-off in visual acuity with eccentricity.

Threshold of Binocularity
Central to our theory is the idea that some visual features are optimally encoded binocularly, resulting in binocularly tuned cells. Conversely, other features are optimally encoded monocularly, leading to cells that respond maximally to features in one eye only. Typically, ICA will learn both monocular and binocular components [11,12]. In both cases, this reflects the redundancy present in natural binocular images. As eccentricity increases, the range of binocular disparities increases [10,18,23,28], thus reducing the similarity expected in a corresponding region of an image pair across the two eyes.
For a given image eccentricity, the degree to which binocular processing is possible will depend on the availability of a sufficient number of binocular components. Since this will depend on the spatial scale of analysis, we can define a Threshold of Binocularity: the smallest spatial scale at which sufficient binocular components exist to support binocular processing. Given the dependence of binocular components on both eccentricity and scale, the Threshold of Binocularity will increase with eccentricity. It follows that binocular processing will only be possible at increasingly coarse spatial scales as eccentricity increases. Fine scale features in eccentric regions that cannot be binocularly matched would need to be processed monocularly and would not (at least directly) form part of a cyclopean percept. The Threshold Angle of Binocularity thus poses a constraint on the finest spatial scale of processing, at each image eccentricity, that is consistent with binocular integration. Critically, if we wish to ensure binocular vision across the visual field, then it follows that both monocular and binocular processing will be constrained to occur only at progressively coarser scales with increasing eccentricity.

We generated an efficient coding of stereoscopic images using Independent Component Analysis. The proportion of binocular components varied substantially depending on both the angle of eccentricity and the wavelength of the feature, as shown in Figure 1.

The proportion of binocular components is close to zero at short wavelengths and large eccentricities, and close to 1 at long wavelengths and small eccentricities. This arises because long wavelength features are more likely to overlap in each view and thus merge into a single binocular feature. Similarly, since the range of disparities is greater at larger eccentricities than in the fovea [10,28], we would expect a lower proportion of binocular components in these areas. Between these two points a clear and rapid

2D histogram of the proportion of binocular components across wavelength and eccentricity.

analysis, E_2 was used to describe the effect of eccentricity on the wavelength threshold: the spatial scale up to which binocular processing can occur, given the components learned through ICA. An alternative approach is to analyse the impact of both eccentricity and scale on the ratio of binocular to monocular components. In this approach E_2 is calculated for each wavelength by measuring the slope of the binocular/monocular ratio, i.e. the rate of performance decline for each wavelength. A 2D sigmoid function was fitted to the data as a smoothing function, and E_2 was calculated as the eccentricity at which the ratio of binocular/monocular components was half of that at the fovea. The fitted sigmoid function is plotted in Figure 1. The iso-contour showing the binocular threshold (at 50% binocular) is shown in black; this is the omnibus E_2 from the previous experiment. E_2 for each wavelength is calculated on vertical slices of the 2D sigmoid taken at each wavelength.

Values of E_2 for each wavelength are shown in Table 1. Results reported for human stereo-acuity by Siderov and Harwerth [27] are shown where appropriate data are available. It is worth noting that our estimate of E_2 matches that of Siderov and Harwerth at 7.5 c/arcmin; however, only one wavelength from their data lies within the reliable range of our data. The second comparison, at 30 arcmin, is within the standard error of the result quoted by Siderov and Harwerth; however, our value is outside the range of our results and has been imputed.

Figure 2. Proportion of binocular components across wavelength and eccentricity. The heatmap shows the iso-contours of a sigmoid fitted to the data points (blue marks). The iso-contour showing 50% binocular is highlighted in black.

measurements of E_2 in humans obtained psychophysically. Figure 3 shows a log-log plot of E_2 as estimated from the fitted 2D psychometric function together with data from Siderov and Harwerth [27]

and E_2, the gradient of our ICA data was -1.265. This is slightly surprising given the linear relationship between wavelength and eccentricity in our psychometric function; the effect is due to the proportion of binocular components being less than 100% at zero eccentricity. We do not measure the binocular/monocular ratio at zero eccentricity but rather from the area between zero and 150 arcmin.
Figure 3.

There is a clear effect of wavelength on the proportion of binocular components that can be seen in Figures 1 and 2. The proportion of binocular components, and by extension the proportion of monocular components, approximates a sigmoid function, from which we were able to calculate an approximation to the threshold between predominantly binocular and predominantly monocular components. A high prevalence of monocular components at a particular wavelength or eccentricity indicates that most low-level features vary independently of each other at that wavelength and location. It is not unreasonable to believe that humans would perform poorly in

Harwerth [27] show that this is indeed the case, with both lower binocular acuity and a steeper reduction in visual acuity with eccentricity for high frequency stimuli compared with low frequency stimuli.

We found that the binocular threshold wavelength varies in an approximately linear fashion with eccentricity, with an E_2 of 0.74. In terms of frequency, the binocular threshold function varies in the inverse exponential decay expected of an eccentricity function. This result places the rate of change of the binocular threshold squarely within the range of results for binocular depth acuity [9,27]. We have also shown that both the predicted values and the general trend of our predictions lie within one standard deviation of psychophysical measurements of E_2 in humans [27]. This indicates that human binocular visual performance in eccentric regions is optimised to the statistics of binocular natural images.

The substantial reduction in monocular visual acuity with visual eccentricity has not yet been fully explained. Weymouth proposed retinal ganglion cells as the bottleneck on eccentric visual performance [31]. However, the greater loss of visual acuity in positional (e.g. Vernier) [16] and recognition [1] tasks versus simple motion detection [17] indicates that loss of fidelity occurs after retinal ganglion cell computation.

Formation of ocular dominance columns is widely viewed as being, at least in part, a response to the properties of visual stimuli [6]. Previously, Chklovskii used wire-length minimisation to suggest a link between binocular disparity and ocular dominance columns [5]. The physiological evidence together with the psychophysical evidence indicates that both Vernier acuity and binocular acuity are limited by the same processes in V1.

We have found that the proportion of monocular components relative to binocular components increases with eccentricity, and that this increase occurs at the same rate as both binocular depth acuity and monocular position acuity as measured psychophysically in humans. This indicates that binocular visual acuity is tuned to the statistics of natural images. Moreover, monocular visual acuity is linked to binocular acuity and therefore to the statistics of binocular images.

Further thoughts
Binocular vision produces substantial advantages through enhanced signal to noise ratio, greater visual acuity and a direct perception of stereoscopic depth. Most of these benefits are found in the area of the fovea; towards the periphery they are greatly reduced. Adopting a binocular configuration is a trade-off between the added benefits in the fovea and reduced binocular fidelity in the periphery.

The dataset
We used the binocular photographic image set of [10] as a source of natural images. The image set consists of scenes covering a wide range of depths and disparities, from interior still life scenes on a light-table to outdoor scenes of woodland and beaches, taken around the town of St Andrews in Scotland, UK. The images were taken using a DSLR camera seated on a horizontal slide mount. Two images were taken of each scene, separated using the slide mount by 65 mm, close to the average inter-pupillary separation for human adults [8]. In all scenes the cameras were independently verged

The raw patch images undergo a pre-processing stage in order to perform gain control and to meet the assumptions of the FastICA algorithm.
This function approximates gain control systems in early binocular features [7], reducing luminance differences between the two eyes and enhancing the phase differences that we are interested in. The FastICA algorithm requires normalised vectors,

location of the feature in one eye relative to another. The FastICA algorithm [13] attempts to learn a sparse linear decomposition of the dataset by maximising the non-Gaussianity of the loading (weights) matrix. If the underlying structure of the input dataset is linear-sparse, ICA will learn a sparse linear decomposition. The set of normalised image patches x_i forms the matrix X. ICA decomposes X into two matrices, a factor matrix W and a score matrix A, as

\[ X = WA \]

The columns of W are the independent components used in this analysis. The FastICA algorithm attempts to find A such that its elements are maximally non-Gaussian.

FastICA is a two-stage process: a whitening pre-processing step using Principal Component Analysis, followed by the ICA algorithm itself. Whitening using PCA is a necessary step in the FastICA algorithm [13] that acts as a low-pass filter on the patch samples, removing high frequency information. As high frequency information has a lower signal to noise ratio than low frequency information, a low-pass filter acts to reduce noise and increase the signal to noise ratio [2]. Together with the limits set by the patch size, this stage produces a bandpass filtered representation of the binocular image data.

Figure 5. Examples of binocular linear components learned using Independent Component Analysis.
The entire process of image rescaling, patch sampling and ICA was repeated 10 times to produce 2000 components per region and scale combination. The linear components learned in this fashion resemble the receptive fields of simple cells in V1 [11,12,22,24], allowing for comparison between the distribution of ICA components learned from natural images and the distribution of known simple cells in V1 [12,24,30]. The distribution of frequencies across all learned components is shown in Figure 6. Components varied in the energy ratio between left-eye and right-eye parts: components with at least two thirds of their energy in one eye were classified as monocular, and components with a more equal energy distribution, at least one third of energy in each eye, were classified as binocular.

which it is retinotopically mapped [26]. This factor varies across the visual field such that the inverse of M is roughly proportional to the angle of eccentricity. M^{-1} can therefore be described as:

\[ M^{-1} = M_0^{-1}\left(1 + \frac{E}{E_2}\right) \]

where E is the eccentricity in degrees of visual arc and M_0^{-1} is the inverse cortical magnification factor (°/mm) at the fovea (E = 0°). E_2 is the eccentricity at which the inverse cortical magnification factor is doubled relative to the fovea (or equivalently the cortical magnification factor is halved relative to the fovea). In a psychophysical

We define the Threshold of Binocularity as the iso-contour where the proportion of binocular components generated by our learned model is 0.5 (50% binocular; see Figure 1).
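The energy-based classification of components described in this section, and the proportion of binocular components that underlies the Threshold of Binocularity, can be sketched as follows. This is a minimal illustration rather than the code used in the study: it assumes that "energy" means the sum of squared weights in each eye's half of a component, and that a component lying exactly on the two-thirds boundary is counted as binocular. The function names are hypothetical.

```python
def eye_energy_fractions(left, right):
    """Fraction of a component's total energy in the left and right eye parts.

    Energy is assumed here to be the sum of squared weights.
    """
    e_left = sum(w * w for w in left)
    e_right = sum(w * w for w in right)
    total = e_left + e_right
    if total == 0:
        raise ValueError("component has zero energy")
    return e_left / total, e_right / total


def classify_component(left, right, monocular_cutoff=2.0 / 3.0):
    """Label a component 'monocular' or 'binocular' from its energy split."""
    f_left, f_right = eye_energy_fractions(left, right)
    # More than two thirds of the energy in one eye: monocular.
    # The exact boundary case is treated as binocular (an assumption).
    if max(f_left, f_right) > monocular_cutoff:
        return "monocular"
    return "binocular"


def proportion_binocular(components):
    """Proportion of (left, right) components classified as binocular."""
    labels = [classify_component(left, right) for left, right in components]
    return labels.count("binocular") / len(labels)
```

Computing `proportion_binocular` separately for each eccentricity and wavelength bin would give the kind of grid from which a 0.5 iso-contour can be read off.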

The rate of change of the Threshold of Binocularity can be measured in a similar

The values of S_0 and E_2 were determined from the data using standard regression techniques, assuming that measurement errors were normally distributed. We found that S_0 = −0.005° and E_2 = −0.7421. S_0 indicates a negative wavelength at the fovea, which is physically impossible; this results from interpolation from data points in the middle of the heatmap's bins. Because S_0 is negative, E_2 is negative also; as the function is linear, we can simply take the absolute value of E_2. For the Threshold of Binocularity, E_2 represents the rate of decrease, with eccentricity, of the maximum spatial wavelength at which binocular processing can occur.
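The relationship between the regression and the sign of E_2 can be illustrated with a short sketch. The fitted equation is not recoverable from the text here, so the code assumes, by analogy with the inverse cortical magnification function, a threshold of the form S = S_0(1 + E/E_2). Under that assumption the intercept of a straight-line fit is S_0 and the slope is S_0/E_2, so a slightly negative intercept with a positive slope yields a negative E_2. The function names and the ordinary least squares fit are illustrative choices, not the authors' procedure.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def e2_from_line(intercept, slope):
    """E_2 for the assumed form S = S_0 * (1 + E / E_2).

    Expanding gives S = S_0 + (S_0 / E_2) * E, so S_0 is the intercept
    and E_2 = intercept / slope.
    """
    if slope == 0:
        raise ValueError("threshold does not change with eccentricity")
    return intercept / slope


# Synthetic data generated with S_0 = 0.5 and E_2 = 2 (hypothetical values).
eccentricities = [0.0, 1.0, 2.0, 3.0, 4.0]
thresholds = [0.5 * (1 + e / 2.0) for e in eccentricities]
s0, slope = fit_line(eccentricities, thresholds)
e2 = e2_from_line(s0, slope)
```

Because E_2 is the ratio of intercept to slope, its sign follows the sign of the intercept whenever the slope is positive, which is why the text takes the absolute value.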

A wavelength-eccentricity surface on binocular thresholds.

The Threshold of Binocularity (ToB) as calculated from our ICA resembles a sigmoid function. In order to extend our analysis to consider both eccentricity and wavelength, we need to define a sigmoid function over two dimensions. We define p(β|µ, λ) as the probability that a sample component taken from eccentricity µ and wavelength λ is binocular. The probability that the component is monocular is then 1 − p(β|µ, λ). The standard sigmoid can be extended to a two-dimensional form as:

\[ p(\beta \mid \mu, \lambda) = \frac{1}{1 + e^{-f(\mu, \lambda)}} \qquad (6) \]

where f is a simple linear function

\[ f(\mu, \lambda) = a + b\lambda + c\mu \qquad (7) \]

The free parameters a, b and c were determined by fitting p(β|µ, λ) to our data using least squares minimisation tools from Matlab's curve fitting toolbox.
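A minimal sketch of the two-dimensional sigmoid may help make the surface concrete. It assumes the standard logistic form applied to the linear function f of equation 7. The parameter values are hypothetical and chosen only so that binocularity rises with wavelength (b > 0) and falls with eccentricity (c < 0). They are not the fitted values from the study.

```python
import math


def p_binocular(mu, lam, a, b, c):
    """Probability that a component at eccentricity mu and wavelength lam
    is binocular, assuming a logistic function of f = a + b*lam + c*mu."""
    f = a + b * lam + c * mu
    return 1.0 / (1.0 + math.exp(-f))


def threshold_wavelength(mu, a, b, c):
    """Wavelength at which p_binocular equals 0.5 for eccentricity mu.

    The logistic function equals 0.5 where f = 0, so lam = -(a + c*mu) / b.
    """
    if b == 0:
        raise ValueError("b must be non-zero to solve for wavelength")
    return -(a + c * mu) / b


# Hypothetical parameters for illustration only.
A, B, C = -1.0, 2.0, -0.5
lam_fovea = threshold_wavelength(0.0, A, B, C)
lam_periphery = threshold_wavelength(2.0, A, B, C)
```

With these parameters the 50% iso-contour is a straight line in the wavelength-eccentricity plane, so the threshold wavelength grows linearly with eccentricity, in line with the approximately linear relationship reported in the text.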

The 2D sigmoid forms 1D sigmoids in both the wavelength and eccentricity

from a sigmoid. However, an alternative definition of E_2 defines E_2 as the 'eccentricity at which [the stimulus size] is twice the foveal value' [29]. In our case the sigmoid is at its maximum at the fovea rather than its minimum; therefore we invert the concept and find the eccentricity at which the threshold of binocularity is half that at the fovea.
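The half-foveal definition can be sketched numerically. The code below assumes a logistic sigmoid of the linear function f(µ, λ) = a + bλ + cµ and solves for the eccentricity µ at which the sigmoid falls to half of its value at µ = 0. Setting 1/(1 + e^{-(a + bλ)}) equal to 2/(1 + e^{-(a + bλ + cµ)}) and rearranging gives µ = −(a + bλ + ln(1 + 2e^{-(a + bλ)}))/c. The parameter values are hypothetical, and the function name is an illustrative choice.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def half_foveal_eccentricity(lam, a, b, c):
    """Eccentricity at which the sigmoid is half its foveal (mu = 0) value.

    Assumes p = sigmoid(a + b*lam + c*mu) with c != 0. The result is
    positive when c < 0, i.e. when binocularity falls with eccentricity.
    """
    if c == 0:
        raise ValueError("c must be non-zero to solve for eccentricity")
    f0 = a + b * lam  # value of f at the fovea
    return -(f0 + math.log(1.0 + 2.0 * math.exp(-f0))) / c


# Hypothetical parameters for illustration only.
A, B, C = -1.0, 2.0, -0.5
LAM = 2.0
e2 = half_foveal_eccentricity(LAM, A, B, C)
p_fovea = sigmoid(A + B * LAM)
p_at_e2 = sigmoid(A + B * LAM + C * e2)
```

When the foveal value approaches 1, the logarithmic term vanishes and the result approaches −(a + bλ)/c, the eccentricity at which f = 0 and the sigmoid equals 0.5.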

This concept is shown in diagrammatic form in Figure 7.
We can calculate E_2 by solving equation 8 for µ. Here the left hand side returns the value of the sigmoid at zero eccentricity (µ = 0), and the right hand side is twice the value of the unmodified sigmoid. Equation 8 can be solved for µ as:

\[ \mu = -\frac{a + b\lambda + \ln\left(1 + 2e^{-(a + b\lambda)}\right)}{c} \qquad (9) \]

Clearly, when the binocular/monocular ratio at the fovea is one, E_2 is equal to the threshold of binocularity and can be found trivially using equation 9. As the value at the fovea tends to 1, the value of equation 9 will tend to equation 7.

The author(s) declare no competing financial interests.