Neural dynamics of invariant object recognition: relative disparity, binocular fusion, and predictive eye movements
How does the visual cortex learn invariant object categories as an observer scans a depthful scene? Two neural processes that contribute to this ability are modeled in this thesis. The first model clarifies how an object is represented in depth. Cortical area V1 computes absolute disparity, which is the horizontal difference in retinal location of an image in the left and right foveas. Many cells in cortical area V2 compute relative disparity, which is the difference in absolute disparity of two visible features. Relative, but not absolute, disparity is unaffected by the distance of visual stimuli from an observer, and by vergence eye movements. A laminar cortical model of V2 that includes shunting lateral inhibition of disparity-sensitive layer 4 cells causes a peak shift in cell responses that transforms absolute disparity from V1 into relative disparity in V2. The second model simulates how the brain maintains stable percepts of a 3D scene during binocular eye movements. The visual cortex initiates the formation of a 3D boundary and surface representation by binocularly fusing corresponding features from the left and right retinotopic images. However, after each saccadic eye movement, every scenic feature projects to a different combination of retinal positions than before the saccade. Yet the 3D representation, resulting from the prior fusion, is stable through the post-saccadic re-fusion. One key to stability is predictive remapping: the system anticipates the new retinal positions of features entailed by eye movements by using gain fields that are updated by eye movement commands.
The 3D ARTSCAN model developed here simulates how perceptual, attentional, and cognitive interactions among brain regions within the What and Where visual processing streams coordinate predictive remapping, stable 3D boundary and surface perception, spatial attention, and the learning of object categories that are invariant to changes in an object's retinal projections. Such invariant learning helps the system to avoid treating each new view of the same object as a distinct object to be learned. The thesis thereby shows how a process that enables invariant object category learning can be extended to also enable stable 3D scene perception.
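
The claim above that relative disparity, unlike absolute disparity, is unaffected by vergence eye movements can be checked with a small numerical sketch. This is not the laminar V2 model of the thesis. It assumes the standard small-angle geometric approximation for features near the midline, in which absolute disparity is roughly the interocular separation times the difference between the inverse feature distance and the inverse fixation distance. The function names, the 0.065 m interocular separation, and the sign convention (features nearer than fixation are positive) are illustrative assumptions.

```python
import math

INTEROCULAR_M = 0.065  # assumed interocular separation in metres (illustrative)


def absolute_disparity(feature_dist_m, fixation_dist_m, interocular_m=INTEROCULAR_M):
    """Small-angle absolute disparity (radians) of a feature near the midline.

    Assumed convention: positive when the feature is nearer than fixation.
    """
    return interocular_m * (1.0 / feature_dist_m - 1.0 / fixation_dist_m)


def relative_disparity(dist_a_m, dist_b_m, fixation_dist_m, interocular_m=INTEROCULAR_M):
    """Difference between the absolute disparities of two visible features."""
    return (absolute_disparity(dist_a_m, fixation_dist_m, interocular_m)
            - absolute_disparity(dist_b_m, fixation_dist_m, interocular_m))


if __name__ == "__main__":
    # Two features at fixed distances, viewed under three vergence states.
    near_m, far_m = 0.8, 1.2
    for fixation_m in (0.5, 1.0, 2.0):
        abs_near = absolute_disparity(near_m, fixation_m)
        rel = relative_disparity(near_m, far_m, fixation_m)
        print(f"fixation={fixation_m} m  absolute(near)={abs_near:.5f}  relative={rel:.5f}")
```

In this approximation the fixation term cancels when the two absolute disparities are subtracted, so the printed relative disparity is the same for every fixation distance while the absolute disparity changes with each vergence state.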