Learning Complex Representations from Spatial Phase Statistics of Natural Scenes

Natural scenes contain higher-order statistical structures that can be encoded in their spatial phase information. Nevertheless, little progress has been made in modeling phase information of images in order to understand efficient representation of image phases in the brain. Based on recent findings of spatial phase structure in natural scenes, we introduce a generative model of the phase information in the visual system according to the efficient coding hypothesis. In this model, we assume independent priors for the amplitude and phase of the coefficients, and model the phase using a non-uniform distribution, which extends existing models of independent component analysis for complex-valued signals. The parameters of the proposed model are then estimated under the maximum-likelihood principle. Using simulated data, we show that the proposed model outperforms conventional models with a uniform phase prior in blind source separation of complex-valued signals. We then apply the proposed model to natural scenes in the Fourier domain. The learning yields nonlinear features specified by a pair of similar Gabor-like filters in quadrature phase structure. These features predict properties of phase-sensitive complex cells in the visual cortex, and indicate that the phase-sensitive complex cells are essential for removing redundancy in natural scenes.

more perceptual information than the amplitude of an image [24]. Perceptually salient features such as edges and bars are encoded in the ventral visual cortex based on their phase congruency [2, 5, 7, 15]. It has also been shown that both simple and complex cells in macaque V1 are sensitive to the phase of images [9, 16, 20, 28]. Nevertheless, constructing models of a visual system that utilizes the characteristic phase information in natural scenes remains a challenging problem.

The classical linear generative models represent natural images by linearly combining features (i.e., receptive fields), weighted by different coefficients [22, 4]. The coefficients of the features (i.e., responses of the receptive fields) are learned from natural images so that they become as independent as possible, according to the efficient coding hypothesis. Nevertheless, it is known that their dependency cannot be completely removed. It was pointed out that the residual dependency in the responses of the receptive fields is conveniently described by using scalar and circular components [30, 21, 19]. This suggests using a complex representation (a pair of real and [...]

In this study, we present a linear generative model of complex representation for natural images using a superposition of complex features (a pair of features). While we consider independent priors for the amplitude and phase of the coefficients, our attention is particularly paid to the phase distribution. [...]

Let $X_{\mathrm{obs}} = (X_1, X_2, \ldots, X_T)$ be a complex-valued matrix that collects the Fourier transforms of $T$ image patches, each of size $N$ pixels. These $T$ patches were selected randomly from natural scenes. If we whiten the complex-valued data, we can assume that the samples, $X_t$ ($t = 1, \ldots, T$), are mutually uncorrelated with zero mean.

We consider the following complex-valued generative model for these observations.
In this model, the complex-domain representation of natural patches, $X$, is generated from a superposition of $N$ unknown complex [...] Here $s_i$ is a complex coefficient given as [...] Complex independent component analysis (cICA) aims to infer the transform matrix $A$ (or $W$) and the source signals $S$ under the assumption of their independence.

In this study, we propose to model each complex coefficient by polar coordinates, and impose [...] (2)

Throughout this paper, we assume that the amplitude distribution $p_{r_i}(r_i)$ follows the gamma distribution with the shape parameter being 2, [...] where $\beta_i > 0$ is a scale parameter. This distribution resembles the amplitude distribution obtained from responses of complex Gabor filters to natural scenes [19], and imposes sparseness on the complex coefficients. We let the shape parameter be 2 because we found that the optimization [...] gives the log-likelihood function of the model parameters: [...] where $W_i$ is the $i$-th row of $W$.

By considering the prior knowledge of amplitude, Eq. 2, and a uniform phase distribution, we have [...] where the superscripts $*$ and $H$ denote the conjugate of the complex coefficient $s_m$ and the Hermitian transpose of the de-mixing matrix $W$, respectively. Note that $r_i$ and $s_m$ are calculated from $W$ and $X$. [...]

More specifically, we model the phase distribution by a mixture of uniform and von Mises distributions: [...] where $I_0(\cdot)$ is the modified Bessel function of order 0. This model is a modification of the previous circular cICA. We call this new model the modified circular cICA (mc-cICA).
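As an illustration of the kind of prior described above, the following minimal sketch draws complex coefficients in polar form. It is not the authors' implementation. It assumes a gamma amplitude prior with shape 2 and scale `beta`, and a phase drawn with probability `lam` from a single von Mises component (mean `mu`, concentration `kappa`) and otherwise from a uniform distribution. The names `sample_sources`, `beta`, `lam`, `mu`, and `kappa` are hypothetical, and the exact mixture used in the paper, whose equation is not reproduced in this text, may differ.

```python
import cmath
import math
import random


def sample_sources(n, beta=1.0, lam=0.5, mu=0.0, kappa=2.0, seed=0):
    """Draw n complex coefficients s = r * exp(j * phi).

    Assumptions (hypothetical parameterisation, not taken from the paper):
      beta      : scale of the gamma amplitude prior (shape fixed at 2)
      lam       : probability of drawing the phase from the von Mises part
      mu, kappa : mean direction and concentration of the von Mises part
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Amplitude: gamma distribution with shape 2 and scale beta.
        r = rng.gammavariate(2.0, beta)
        # Phase: mixture of a von Mises and a uniform distribution.
        if rng.random() < lam:
            phi = rng.vonmisesvariate(mu, kappa)
        else:
            phi = rng.uniform(0.0, 2.0 * math.pi)
        samples.append(cmath.rect(r, phi))
    return samples


if __name__ == "__main__":
    s = sample_sources(5000, beta=1.0, lam=0.5, mu=0.0, kappa=2.0)
    mean_amplitude = sum(abs(v) for v in s) / len(s)
    print("mean amplitude (expected near 2 * beta):", mean_amplitude)
```

Setting `lam=0` in this sketch gives a uniform phase prior, which corresponds to the circular case that the proposed model modifies.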

The log-likelihood function of the mc-cICA model is [...] where [...] is a vector of the model parameters.
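To make the phase part of such a log-likelihood concrete, the sketch below evaluates the log-density of a uniform and von Mises mixture and its derivative with respect to the concentration. The derivative involves the ratio $I_1(\kappa)/I_0(\kappa)$ because the derivative of $I_0$ is $I_1$. The single-component mixture form and the names `lam`, `mu`, `kappa`, `bessel_i`, `phase_log_density`, and `dlogp_dkappa` are assumptions made for illustration only; the paper's own equations are not reproduced in this text.

```python
import math


def bessel_i(order, x, terms=40):
    """Modified Bessel function of the first kind, I_order(x), by power series."""
    total = 0.0
    for k in range(terms):
        total += (x / 2.0) ** (2 * k + order) / (
            math.factorial(k) * math.factorial(k + order)
        )
    return total


def _components(phi, mu, kappa):
    """Return the uniform and von Mises densities at phase phi."""
    uniform = 1.0 / (2.0 * math.pi)
    von_mises = math.exp(kappa * math.cos(phi - mu)) / (
        2.0 * math.pi * bessel_i(0, kappa)
    )
    return uniform, von_mises


def phase_log_density(phi, lam, mu, kappa):
    """Log-density of the assumed mixture: (1 - lam) * uniform + lam * von Mises."""
    uniform, von_mises = _components(phi, mu, kappa)
    return math.log((1.0 - lam) * uniform + lam * von_mises)


def dlogp_dkappa(phi, lam, mu, kappa):
    """Analytic derivative of phase_log_density with respect to kappa."""
    uniform, von_mises = _components(phi, mu, kappa)
    mixture = (1.0 - lam) * uniform + lam * von_mises
    ratio = bessel_i(1, kappa) / bessel_i(0, kappa)
    return lam * von_mises * (math.cos(phi - mu) - ratio) / mixture


if __name__ == "__main__":
    # Compare the analytic derivative with a central finite difference.
    phi, lam, mu, kappa, h = 0.7, 0.6, 0.2, 1.5, 1e-6
    numeric = (
        phase_log_density(phi, lam, mu, kappa + h)
        - phase_log_density(phi, lam, mu, kappa - h)
    ) / (2.0 * h)
    print("analytic:", dlogp_dkappa(phi, lam, mu, kappa), "numeric:", numeric)
```

Checking an analytic gradient against a finite difference in this way is a common safeguard before handing the gradient to an optimizer.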

The MLEs of the parameters in the proposed model are obtained by gradient descent algorithms using Eq. 6 and the following gradients: [...] where $I_1(\cdot)$ is the modified Bessel function of order 1. For both c-cICA and mc-cICA, we use the conjugate gradient method.

3 Results

Performance comparison using simulated data. In this section, we evaluate the efficiency of source separation by variants of cICA, using simulated data. For this purpose, we generated a data set $X$ by mixing 10 independent complex-valued source signals $S$, using a random invertible matrix $A$. We considered two different types of complex-valued source signals in polar coordinates: in one data set, phases were sampled from a uniform distribution; in the other data set, phases were sampled from a bimodal distribution (Eq. 8). Amplitudes were sampled from Eq. 3 in both cases. In total, we generated 20 data sets for each case. The mixing matrix and the parameters of the probability density function of the sources were chosen randomly for each data set.

We then applied the various cICA models to estimate the de-mixing matrix $W$. We evaluated their performance as follows. If the source signals are perfectly separated by these algorithms, the product [...] where $P_{m,n}$ is the $(m,n)$-element of $P$.
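For readers who want to reproduce this kind of evaluation, the sketch below computes one common form of the Amari index for a square matrix $P$, assumed here to be the product of the estimated de-mixing matrix and the true mixing matrix. The normalisation by $2N(N-1)$, which maps the index to the range from 0 to 1, is an assumption; the normalisation used in the paper is not reproduced in this text. The function name `amari_index` is hypothetical.

```python
def amari_index(P):
    """Amari index of a square matrix P given as a list of rows.

    Entries may be real or complex; only their magnitudes are used.
    The index is 0 when P is a scaled permutation matrix (perfect
    separation up to ordering and scaling) and grows as P departs from it.
    Assumes P has no all-zero row or column.
    """
    n = len(P)
    mag = [[abs(v) for v in row] for row in P]
    # Row term: how far each row is from having a single dominant entry.
    row_term = sum(sum(row) / max(row) - 1.0 for row in mag)
    # Column term: the same measure applied to each column.
    col_term = sum(sum(col) / max(col) - 1.0 for col in zip(*mag))
    return (row_term + col_term) / (2.0 * n * (n - 1))


if __name__ == "__main__":
    P_perfect = [[0, 2j], [-3, 0]]  # scaled permutation: perfect separation
    P_poor = [[1, 1], [1, 1]]       # every source leaks into every output
    print(amari_index(P_perfect))   # 0.0
    print(amari_index(P_poor))      # 1.0
```

Because the index uses only magnitudes, it is insensitive to the phase and scale ambiguities that are inherent to complex-valued source separation.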

Using the Amari index, we compared the mc-cICA with c-cICA as well as the previously suggested [...] with the sample sizes. Figure 2A shows the performance of models when they are applied to mixed [...] complex features were tuned to the same specific orientation bandwidth (Fig. 5B). Finally, the real and imaginary components of the complex features were orthogonal to each other (Fig. 5C).

In summary, the real and imaginary components of the complex features learned from natural