Douglas Reece

Thesis Title: Selective Perception for Robot Driving
Degree Type: Ph.D. in Computer Science
Advisor(s): Steven Shafer
Graduated: May 1992

Abstract:

Robots performing complex tasks in rich environments need very good perception
modules in order to understand their situation and choose the best action. Robot planning
systems have typically assumed that perception was so good that it could refresh the entire
world model whenever the planning system needed it, or whenever anything in the world
changed. Unfortunately, this assumption is completely unrealistic in many real-world
domains because perception is far too difficult. Robots in these domains cannot use the
traditional planner paradigm, but instead need a new system design that integrates
reasoning with perception. In this thesis I describe how reasoning can be integrated with
perception, how task knowledge can be used to select perceptual targets, and how this
selection dramatically reduces the computational cost of perception.


The domain addressed in this thesis is driving in traffic. I have developed a
microscopic traffic simulator called PHAROS that defines the street environment for this
research. PHAROS contains detailed representations of streets, markings, signs, signals,
and cars. It can simulate perception and implement commands for a vehicle controlled by a
separate program. I have also developed a computational model of driving called Ulysses
that defines the driving task. The model describes how various traffic objects in the world
determine what actions a robot must take. These tools allowed me to implement robot
driving programs that request sensing actions in PHAROS, reason about right-of-way and
other traffic laws, and then command acceleration and lane changing actions to control a
simulated vehicle.


In the thesis I develop three selective perception techniques and implement them in
three robot driving programs of increasing sophistication. The first, Ulysses-1, uses
perceptual routines to control visual search in the scene. These task-specific routines use
known objects to guide the search for others--e.g., a routine scans along the right side of the
road ahead for a sign. The second program, Ulysses-2, decides which objects are the most
critical in the current situation and looks for them. It ignores objects that cannot affect the
robot's actions. Ulysses-2 creates an inference tree to determine the effect of uncertain input
data on action choices, and searches this tree to decide which data to sense. Finally,
Ulysses-3 uses domain knowledge to reason about how dynamic objects will move or change
over time. Objects that do not move enough to affect the robot can be ignored by perception.
The program uses the inference tree from Ulysses-2 and a time-stamped, persistent world
model to decide what to look for. When run in the PHAROS world, the techniques included
in Ulysses-3 reduced the computational cost for perception by 9 to 12 orders of magnitude
when compared to an uncontrolled, general perception system.
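
The idea behind Ulysses-2, sensing only the data that could change the chosen action, can be illustrated with a small sketch. This is not the thesis's inference tree or its search procedure: it substitutes brute-force enumeration over possible values, and the driving rule, the fact names, and the function names (choose_action, relevant_unknowns) are all hypothetical and invented for illustration.

```python
from itertools import product


def choose_action(facts):
    """Hypothetical toy driving rule (not taken from the thesis)."""
    if facts["signal"] == "red":
        return "stop"
    if facts["lead_car_close"]:
        return "slow"
    return "go"


def relevant_unknowns(known, unknown_domains, decide):
    """Return the unknown facts whose value could change the action.

    known           -- dict of facts already sensed
    unknown_domains -- dict mapping each unsensed fact to its possible values
    decide          -- function mapping a complete fact dict to an action

    An unknown is relevant if, for some assignment of the other unknowns,
    varying it over its domain yields more than one action.
    """
    names = list(unknown_domains)
    relevant = set()
    for name in names:
        others = [n for n in names if n != name]
        # Enumerate every assignment of the remaining unknowns.
        for combo in product(*(unknown_domains[n] for n in others)):
            base = dict(known, **dict(zip(others, combo)))
            actions = {
                decide(dict(base, **{name: value}))
                for value in unknown_domains[name]
            }
            if len(actions) > 1:
                relevant.add(name)
                break
    return relevant


# With a red signal already sensed, the lead car cannot change the action,
# so there is nothing further worth sensing.
nothing_to_sense = relevant_unknowns(
    {"signal": "red"}, {"lead_car_close": [True, False]}, choose_action
)

# With a green signal, the lead car's distance decides between "slow" and "go".
worth_sensing = relevant_unknowns(
    {"signal": "green"}, {"lead_car_close": [True, False]}, choose_action
)
```

Enumeration grows exponentially with the number of unknowns, which is one reason a structured search such as the inference tree described above would be preferable in practice.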

Keywords:
Vision and scene understanding, active vision, robotics, knowledge representation, reasoning with uncertainty, driving, traffic simulation, tree search strategies, graphics applications

CMU-CS-92-139.pdf (11 MB)