Research Article: Ideal Binocular Disparity Detectors Learned Using Independent Subspace Analysis on Binocular Natural Image Pairs

Date Published: March 16, 2016

Publisher: Public Library of Science

Author(s): David W. Hunter, Paul B. Hibbard, Maurice Ptito.


An influential theory of mammalian vision, known as the efficient coding hypothesis, holds that early stages in the visual cortex attempt to form an efficient coding of ecologically valid stimuli. Although numerous authors have successfully modelled some aspects of early vision mathematically, closer inspection has found substantial discrepancies between the predictions of some of these models and observations of neurons in the visual cortex. In particular, analysis of linear-non-linear models of simple cells using Independent Component Analysis has found a strong bias towards features on the horopter. In order to investigate the link between the information content of binocular images, mathematical models of complex cells and physiological recordings, we applied Independent Subspace Analysis to binocular image patches in order to learn a set of complex-cell-like models. We found that these complex-cell-like models exhibited a wide range of binocular disparity discriminability, although only a minority exhibited high binocular discrimination scores. However, in common with the linear-non-linear model case, we found that feature detection was limited to the horopter, suggesting that current mathematical models are limited in their ability to explain the functionality of the visual cortex.

Partial Text

We performed Independent Subspace Analysis [15] on rectangular image patches cut from binocular image pairs [37]. This analysis produced a set of non-linear models whose behaviour was explored using their responses to sine-grating and bar stimuli. Finally, we used the Disparity Discrimination Index [9] to determine how much of the variance in complex-cell model responses could be explained by the disparity of the stimuli.
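As an illustration of the last step, the following is a minimal sketch of a Disparity Discrimination Index computation. It assumes the commonly used form DDI = (Rmax - Rmin) / (Rmax - Rmin + 2 * RMSerror), evaluated on square-root-transformed responses, where RMSerror is the root-mean-square deviation of individual responses about the mean response at each disparity. The exact procedure of [9] is not reproduced in this text, so the function name, the array layout and the square-root transform are assumptions made for this example only.

```python
import numpy as np

def disparity_discrimination_index(responses):
    """Sketch of a DDI under the assumed definition given in the lead-in.

    responses: non-negative array of shape (n_disparities, n_trials),
    one row per stimulus disparity, one column per repeated presentation.
    """
    # Assumption: responses are square-root transformed before analysis.
    r = np.sqrt(np.asarray(responses, dtype=float))
    n_disparities, n_trials = r.shape

    # Mean response at each disparity gives the tuning curve.
    tuning = r.mean(axis=1)
    spread = tuning.max() - tuning.min()

    # Residual variation around the tuning curve (within-disparity scatter).
    residuals = r - tuning[:, None]
    dof = n_disparities * (n_trials - 1)
    rms_error = np.sqrt((residuals ** 2).sum() / dof)

    denom = spread + 2.0 * rms_error
    return spread / denom if denom > 0 else 0.0
```

Under this definition the index approaches 1 when the response varies with disparity and is highly repeatable, and approaches 0 when trial-to-trial scatter dominates any disparity-related modulation.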

Theoretical complex cell models such as the binocular energy model [3], and derivatives of this model, calculate disparity by combining the squared responses of simple cell models with Gabor-shaped receptive fields in quadrature phase. Quadrature pairs of Gabor functions are by definition orthogonal (decorrelated), so the orthogonality constraint applied by ISA to the sub-spaces makes sense in a binocular context. Assuming the signal (or part thereof) has the same frequency and orientation as the carrier wave of the Gabor function, the well-known Fourier shift theorem holds that the result of convolving the signal with the Gabor function will be proportional to the cosine of the phase difference between the signal and the Gabor function. As a consequence, two Gabor functions in quadrature phase are sufficient to calculate the local phase of a signal [10], and two pairs of Gabor functions in quadrature phase in each eye are sufficient to calculate the disparity between two signals of equal frequency and orientation in each eye [4]. This combination of components implies that the joint distribution of responses is polar and equal in all directions. Such a distribution can be considered to lie on a space described by an L2 norm.
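The quadrature argument above can be illustrated with a minimal one-dimensional sketch of a binocular energy unit. This is a simplified textbook-style construction, not the ISA models learned in this study: the function names, the identical receptive fields in the two eyes (a unit tuned to zero disparity), and the values of the frequency, envelope width and window size are all assumptions chosen for the example.

```python
import numpy as np

def gabor_pair(x, freq, sigma):
    """Even (cosine) and odd (sine) Gabor functions in quadrature phase."""
    envelope = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    even = envelope * np.cos(2.0 * np.pi * freq * x)
    odd = envelope * np.sin(2.0 * np.pi * freq * x)
    return even, odd

def grating(x, freq, phase, shift=0.0):
    """Sinusoidal grating; `shift` displaces it horizontally (in pixels)."""
    return np.cos(2.0 * np.pi * freq * (x - shift) + phase)

def binocular_energy(left, right, even, odd):
    """Sum left and right linear responses, then square and add the pair."""
    s_even = even @ left + even @ right
    s_odd = odd @ left + odd @ right
    return s_even ** 2 + s_odd ** 2

# Hypothetical parameters: 16-pixel wavelength, envelope sigma of 16 pixels.
x = np.arange(-128, 129, dtype=float)
freq, sigma = 1.0 / 16.0, 16.0
even, odd = gabor_pair(x, freq, sigma)

def energy_at(disparity, phase):
    left = grating(x, freq, phase)
    right = grating(x, freq, phase, shift=disparity)
    return binocular_energy(left, right, even, odd)

# Response at three disparities, each probed at eight stimulus phases.
phases = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
tuning = {d: [energy_at(d, p) for p in phases] for d in (0.0, 4.0, 8.0)}
```

For a grating matched to the carrier frequency, the values stored for each disparity in `tuning` are essentially identical across stimulus phases, while the response level changes with disparity. This is the phase-invariant, disparity-dependent behaviour that motivates pooling squared quadrature responses under an L2 norm.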

The linear-non-linear model of simple cells used here has a substantial limitation in that it does not allow for half-wave rectification. Half-wave rectification of sub-unit responses has been found to play a substantial role in binocular visual processing. For example, Read and Cumming [65] found evidence for both inhibitory and excitatory combinations of binocular stimuli post-filtering and rectification.

Previous work has applied ICA to natural binocular images [20, 21]. This has resulted in components bearing a number of similarities to binocular simple cells in the visual cortex. Binocular independent components tend to be Gabor-like, with very similar orientation and frequency tuning for the two eyes, and a range of position and phase disparity tunings. However, there were also a number of differences. In particular, the distribution of phase-disparity tunings did not match that found in the visual cortex. While phase disparities are strongly peaked around zero in cortical cells, independent components also showed a marked peak for anti-phase components [21]. This distribution was also found with ISA.

Our work has shown that an ISA model applied to binocular natural images produces complex cell models, a subset of which fit the criteria of an ideal disparity detector [4]. In particular, a subset of the models generated by ISA displayed high levels of local position and phase invariance, resulting in a response that was constant for all stimulus positions within the receptive field. In complex cell models with high DDI, responses to anti-correlated stimuli are also zero or near zero, as required of an ideal disparity detector.