Visual Inference Amid Fixational Eye Movements
Yoram Burak, Center for Brain Science, Harvard University

Oct 18, 2010, 11:45 pm – 1:00 pm
Joseph Henry Room


Event Description
Our visual system can infer the structure of two-dimensional images at a resolution comparable to (or, in some tasks, far finer than) the receptive field size of individual retinal ganglion cells (RGCs). This ability is all the more surprising given that, while we perform such tasks, the image projected on the retina is in constant jitter due to eye and head motion. For example, the motion between two successive discharges of a foveal RGC typically exceeds the receptive field size, so the two spikes report on different regions of the visual scene. This suggests that, to achieve high-acuity perception, the brain must take the image jitter into account. I will discuss two theoretical investigations of this theme.

I will first ask how the visual system might infer the structure of images drawn from a large, relatively unconstrained ensemble. Because the number of possible images is combinatorially large, the brain cannot act as an ideal observer that performs optimal Bayesian inference based on the retinal spikes. However, I will propose an approximate scheme derived from such an approach, based on a factorial representation of the multi-dimensional probability distribution, similar to a mean-field approximation. The decoding scheme that emerges from this approximation suggests a neural implementation involving two neural populations: one that represents an estimate of the position of the eye, and another that represents an estimate of the stabilized image. I will discuss the performance of this decoding strategy under simplified assumptions on retinal coding. I will also compare it to other schemes and discuss possible implications for neural visual processing in the foveal region.

In the second part of the talk I will focus on the Vernier task, in which human subjects achieve hyperacuity: they resolve offsets far smaller than the receptive field size of a single RGC. The optimal decoder for this task can be formalized and analyzed mathematically in detail. I will show that a linear, perceptron-type decoder cannot achieve hyperacuity. On the other hand, a quadratic decoder, which is sensitive to coincident spiking in pairs of neurons, constitutes an effective and structurally simple solution to the problem. Furthermore, the performance achieved by such a decoder is close to the limit imposed by the ideal Bayesian decoder. Therefore, spike coincidence detectors in the early visual system may facilitate hyperacuity vision in the presence of fixational eye motion.
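
The contrast between linear and quadratic readouts in the Vernier task can be illustrated with a toy calculation. The sketch below is not the model analyzed in the talk: the circular one-dimensional retina, the Gaussian receptive fields, the Poisson spiking, the uniformly distributed eye position, and all names and parameter values (`N`, `SPACING`, `SIGMA`, `HALF_OFFSET`, `rate`, `statistics`) are assumptions made only for illustration. It averages exactly over every eye position and compares the two offset directions. Under these assumptions the mean spike count of every cell is the same for both directions, whereas the mean product of counts in pairs of cells (a coincidence statistic) is not.

```python
import math

# All values below are illustrative assumptions, not parameters from the talk.
N = 240          # fine positions on a circular 1-D retina
SPACING = 12     # distance between neighbouring receptive-field centres
SIGMA = 12.0     # receptive-field width, comparable to the spacing
HALF_OFFSET = 1  # each bar shifts by 1 fine step, so the Vernier offset is 2

CENTERS = list(range(0, N, SPACING))


def rate(center, stimulus):
    """Mean spike count of a cell with a Gaussian receptive field."""
    d = (center - stimulus) % N
    d = min(d, N - d)  # circular distance
    return math.exp(-0.5 * (d / SIGMA) ** 2)


def statistics(sign):
    """Average over all eye positions for offset direction sign = +1 or -1.

    Returns the mean counts of the cells viewing the upper bar, the mean
    counts of the cells viewing the lower bar, and the mean products
    E[n_i * m_j] between upper-bar cell i and lower-bar cell j.
    Counts are assumed Poisson and independent given the eye position,
    so the conditional mean of a product is the product of the rates.
    """
    k = len(CENTERS)
    mean_upper = [0.0] * k
    mean_lower = [0.0] * k
    coincidence = [[0.0] * k for _ in range(k)]
    for eye in range(N):
        upper = [rate(c, eye + sign * HALF_OFFSET) for c in CENTERS]
        lower = [rate(c, eye - sign * HALF_OFFSET) for c in CENTERS]
        for i in range(k):
            mean_upper[i] += upper[i] / N
            mean_lower[i] += lower[i] / N
            for j in range(k):
                coincidence[i][j] += upper[i] * lower[j] / N
    return mean_upper, mean_lower, coincidence


up_r, lo_r, co_r = statistics(+1)
up_l, lo_l, co_l = statistics(-1)

# Largest difference between the two offset directions in first-order statistics.
linear_gap = max(abs(a - b) for a, b in zip(up_r + lo_r, up_l + lo_l))

# Largest difference between the two directions in pairwise coincidence statistics.
k = len(CENTERS)
quadratic_gap = max(abs(co_r[i][j] - co_l[i][j]) for i in range(k) for j in range(k))

print("difference in mean counts:     ", linear_gap)
print("difference in mean coincidences:", quadratic_gap)
```

In this toy setting the first number is zero up to floating-point rounding, because averaging over eye position erases the offset from each cell's mean count. The second number is clearly nonzero, because the product of two counts depends on the relative position of the two bars, which eye motion does not change. This shows only that trial-averaged linear statistics carry no information about the offset here; it says nothing about how close a coincidence-based readout comes to the ideal Bayesian decoder.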