<html>
<body>
<div align="center"><font size=4><b>The UCSD Department of Cognitive
Science is pleased to announce a talk by<br><br>
</font><font size=6>Tim Marks, Ph.D.<br><br>
</font><font size=4>Department of Computer Science and Engineering<br>
University of California, San Diego<br><br>
</font><font face="arial" size=4>Monday, February 25, 2008 at 12pm<br>
Cognitive Science Building, room 003<br><br>
<br>
</b></font><font size=6>"Probabilistic Models of Visual
Processing"<br><br>
</font></div>
<b>If the brain has developed optimal solutions to the perceptual
problems it faces, then it may be implementing elegant probabilistic
solutions to the problems of dealing with an uncertain world. Studying
human perception can inform new systems for machine perception, and
developing new systems for machine perception improves our understanding
of human and animal perception. My research takes advantage of this
synergy between human and machine perception by developing probabilistic
models and deriving optimal inference algorithms on those models to
enable machines to perform a variety of visual processing tasks.<br><br>
I will first discuss a generative model and probabilistic inference
algorithm, called G-flow, for tracking a human face (or other deformable
object) in 3D from single-camera video. Two standard computer vision
algorithms, optical flow and template matching, emerge as special
limiting cases of optimal inference under G-flow. This elucidates the
conditions under which each of these existing methods is optimal and
suggests a broader family of tracking algorithms that includes an entire
continuum between these two extremes.<br><br>
Then I will discuss diffusion networks, a stochastic version of
continuous-time, continuous-state
recurrent neural networks. I will present the surprising result that a
particular class of linear diffusion networks is equivalent to factor
analysis, and demonstrate that this neurally plausible architecture can
be trained, using contrastive divergence (Hinton 2002), to learn the type
of 3D deformable models used by G-flow for face tracking.<br><br>
The same mathematical technique used in G-flow can be applied to
simultaneous localization and mapping (SLAM) in mobile robots. I will
present a new
algorithm, called Gamma-SLAM, that uses stereo vision for SLAM in
off-road outdoor environments by representing the world using variance
grid maps.<br><br>
Next, I will present two probabilistic models of visual
processing and compare them directly with human performance: NIMBLE, a
Bayesian model of saccade-based visual memory; and SUN, a Bayesian
framework for saliency using natural statistics. NIMBLE achieves human
levels of performance in a face recognition task using a fixation-based
memory model, and SUN derives a state-of-the-art saliency algorithm from
the simple assumption that a goal of the visual system is to find
potential targets that are important to survival. I will conclude with a
discussion of my plans for future research in each of these areas.<br>
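<br>
As a rough sketch of the "saliency from natural statistics" idea mentioned
above: a feature is salient when it is improbable under the statistics of
natural scenes, i.e. its self-information -log p(feature) is high. The
snippet below is a simplified illustration of that principle only (the
feature, distribution, and estimator are assumptions for the example, not
the actual SUN model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for natural image statistics: a simple scalar feature
# (e.g. local contrast) drawn from a heavy-tailed distribution.
natural_features = rng.laplace(loc=0.0, scale=1.0, size=100_000)

# Estimate p(feature) with a normalized histogram.
counts, edges = np.histogram(natural_features, bins=100, density=True)

def self_information(x):
    """Saliency score: -log p(x) under the learned feature distribution."""
    i = np.clip(np.searchsorted(edges, x) - 1, 0, len(counts) - 1)
    p = max(counts[i], 1e-12)  # floor to avoid log(0) for unseen values
    return -np.log(p)

# A common feature value is unsurprising; a rare one is salient.
assert self_information(8.0) > self_information(0.1)
```

Under this view, frequent feature values get low scores and rare ones get
high scores, which is the sense in which "potential targets important to
survival" (rare, unexpected stimuli) pop out.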
</b></body>
</html>