Fast Learning VIEWNET Architectures for Recognizing 3-D Objects from Multiple 2-D Views

OpenBU


dc.contributor.author Bradski, Gary en_US
dc.contributor.author Grossberg, Stephen en_US
dc.date.accessioned 2011-11-14T18:19:25Z
dc.date.available 2011-11-14T18:19:25Z
dc.date.issued 1995-08-01 en_US
dc.identifier.uri http://hdl.handle.net/2144/2030
dc.description.abstract The recognition of 3-D objects from sequences of their 2-D views is modeled by a family of self-organizing neural architectures, called VIEWNET, that use View Information Encoded With NETworks. VIEWNET incorporates a preprocessor that generates a compressed but 2-D invariant representation of an image, a supervised incremental learning system that classifies the preprocessed representations into 2-D view categories whose outputs are combined into 3-D invariant object categories, and a working memory that makes a 3-D object prediction by accumulating evidence from 3-D object category nodes as multiple 2-D views are experienced. The simplest VIEWNET achieves high recognition scores without the need to explicitly code the temporal order of 2-D views in working memory. Working memories are also discussed that save memory resources by implicitly coding temporal order in terms of the relative activity of 2-D view category nodes, rather than as explicit 2-D view transitions. Variants of the VIEWNET architecture may also be used for scene understanding by using a preprocessor and classifier that can determine both What objects are in a scene and Where they are located. The present VIEWNET preprocessor includes the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and suppresses image noise. This boundary segmentation is rendered invariant under 2-D translation, rotation, and dilation by use of a log-polar transform. The invariant spectra undergo Gaussian coarse coding to further reduce noise and 3-D foreshortening effects, and to increase generalization. These compressed codes are input into the classifier, a supervised learning system based on the fuzzy ARTMAP algorithm. Fuzzy ARTMAP learns 2-D view categories that are invariant under 2-D image translation, rotation, and dilation as well as 3-D image transformations that do not cause a predictive error.
Evidence from sequences of 2-D view categories converges at 3-D object nodes that generate a response invariant under changes of 2-D view. These 3-D object nodes input to a working memory that accumulates evidence over time to improve object recognition. In the simplest working memory, each occurrence (nonoccurrence) of a 2-D view category increases (decreases) the corresponding node's activity in working memory. The maximally active node is used to predict the 3-D object. Recognition is studied with noisy and clean images using slow and fast learning. Slow learning at the fuzzy ARTMAP map field is adapted to learn the conditional probability of the 3-D object given the selected 2-D view category. VIEWNET is demonstrated on an MIT Lincoln Laboratory database of 128x128 2-D views of aircraft with and without additive noise. A recognition rate of up to 90% is achieved with one 2-D view and of up to 98.5% with three 2-D views. The properties of 2-D view and 3-D object category nodes are compared with those of cells in monkey inferotemporal cortex. en_US
dc.description.sponsorship National Science Foundation (IRI-90-24877); Office of Naval Research (N00014-92-J-1309); Air Force Office of Scientific Research (F49620-92-J-0499); Advanced Research Projects Agency (AFOSR 90-0083, ONR N00014-92-J-4015) en_US
dc.publisher Boston University Center for Adaptive Systems and Department of Cognitive and Neural Systems en_US
dc.relation.ispartofseries BU CAS/CNS Technical Reports;CAS/CNS-TR-1993-053 en_US
dc.rights Copyright 1995 Boston University. Permission to copy without fee all or part of this material is granted provided that: 1. The copies are not made or distributed for direct commercial advantage; 2. the report title, author, document number, and release date appear, and notice is given that copying is by permission of BOSTON UNIVERSITY TRUSTEES. To copy otherwise, or to republish, requires a fee and / or special permission. en_US
dc.subject Pattern recognition en_US
dc.subject Neural networks en_US
dc.subject ART en_US
dc.subject ARTMAP en_US
dc.subject 3-D object recognition en_US
dc.subject Learning en_US
dc.subject Probability learning en_US
dc.subject Fuzzy logic en_US
dc.subject Working memory en_US
dc.title Fast Learning VIEWNET Architectures for Recognizing 3-D Objects from Multiple 2-D Views en_US
dc.type Technical Report en_US
dc.rights.holder Boston University Trustees en_US
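
Illustrative sketch (not part of the original record). The abstract describes the simplest working memory as one in which each occurrence of a category raises the corresponding node's activity, each nonoccurrence lowers it, and the maximally active node gives the 3-D object prediction. The Python below is a minimal sketch of that rule only. The function names, the `gain` and `decay` values, the clamping of activity at zero, and the object labels are all assumptions made for illustration; the report's actual update equations are not given in this record.

```python
from typing import Dict, Iterable, List


def accumulate_evidence(
    view_predictions: Iterable[str],
    objects: List[str],
    gain: float = 1.0,    # assumed increment for an occurrence
    decay: float = 0.5,   # assumed decrement for a nonoccurrence
) -> Dict[str, float]:
    """Accumulate evidence over a sequence of per-view predictions.

    Each element of view_predictions is the object label predicted from
    one 2-D view. Temporal order is not stored explicitly; only the
    running activity of each node is kept.
    """
    activity = {obj: 0.0 for obj in objects}
    for predicted in view_predictions:
        for obj in activity:
            if obj == predicted:
                # Occurrence: raise this node's activity.
                activity[obj] += gain
            else:
                # Nonoccurrence: lower activity, clamped at zero (assumption).
                activity[obj] = max(0.0, activity[obj] - decay)
    return activity


def predict_object(activity: Dict[str, float]) -> str:
    """Return the label of the maximally active node."""
    return max(activity, key=activity.get)


if __name__ == "__main__":
    # Hypothetical labels and a hypothetical three-view sequence.
    labels = ["jet_a", "jet_b", "jet_c"]
    wm = accumulate_evidence(["jet_a", "jet_b", "jet_a"], labels)
    print(wm, predict_object(wm))
```

In this sketch a single mistaken view (here the second one) is outvoted by the other two, which is the intuition behind the abstract's report that recognition improves when three views are used instead of one.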
