lates relative measurements to them. Landmarks are
then integrated in the map with an Extended Kalman
Filter associated with each one. However, this approach does
not correctly manage the uncertainty associated with
robot motion, and only one hypothesis over the pose
of the robot is maintained. Consequently it may fail
in the presence of large odometric errors (e.g. while
closing a loop). In (Miró et al., 2005) a Kalman filter
is used to estimate an augmented state constituted
by the robot pose and N landmark positions (Dissanayake
et al., 2001). SIFT features are also used
to manage the data association among visual land-
marks. However, since only one hypothesis is main-
tained over the robot pose, the method would fail in
the presence of incorrect data associations. In addi-
tion, in the presence of a significant number of land-
marks the method would be computationally expen-
sive.
The work presented in (Sim et al., 2005) uses SIFT
features as significant points in space and tracks them
over time. It uses a Rao-Blackwellized particle filter
to estimate both the map and the path of the robot.
The robot movement is here estimated from stereo
ego-motion (Little et al., 2001), providing a corrected
odometry that simplifies the SLAM problem, since no
large odometric errors are introduced.
The contribution of this paper is twofold. First, we
present a new mechanism to deal with the data association
problem in the case of different landmarks with similar
appearance, a situation that may occur in most environments.
Second, our approach actively tracks landmarks prior to
their integration in the map. As a result, only the more
stable landmarks are incorporated in the map. With this
approach, our map typically consists of fewer landmarks
than those of (Little et al., 2002) and (Sim et al., 2005),
for comparable map sizes. In addition, we have applied
effective resampling techniques, as described in (Stachniss
et al., 2005). This reduces the number of particles
needed to construct the map, and thus the computational
time.
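As an illustration of how resampling can be made selective, the following minimal sketch resamples only when the effective sample size of the particle weights drops below a threshold. That the cited technique corresponds to this criterion is an assumption here, as are the function names and the threshold of half the particle count; it is a sketch of a common scheme, not the implementation used in this work.

```python
import random


def effective_sample_size(weights):
    """Return 1 / sum(w_i^2) for the normalized weights (ranges from 1 to N)."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)


def systematic_resample(particles, weights, rng=random):
    """Draw len(particles) samples using one random offset and equal strides."""
    n = len(particles)
    step = sum(weights) / n
    start = rng.uniform(0.0, step)
    resampled = []
    cumulative = weights[0]
    index = 0
    for m in range(n):
        target = start + m * step
        # Advance to the particle whose cumulative weight covers the target.
        while cumulative < target and index < n - 1:
            index += 1
            cumulative += weights[index]
        # A real filter would copy the particle (pose and map) here.
        resampled.append(particles[index])
    return resampled


def selective_resample(particles, weights, threshold_ratio=0.5, rng=random):
    """Resample only if the effective sample size is below threshold_ratio * N.

    threshold_ratio=0.5 is an assumed value chosen for illustration.
    """
    n = len(particles)
    if effective_sample_size(weights) >= threshold_ratio * n:
        return particles, weights
    return systematic_resample(particles, weights, rng), [1.0 / n] * n
```

With uniform weights the particle set is returned unchanged, whereas a set in which a single particle carries all the weight is replaced by copies of that particle with equal weights.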
The remainder of the paper is structured as follows.
Section 2 deals with visual landmarks and their utility
in SLAM. Section 3 explains the basics of the Rao-
Blackwellized particle filter. Next, section 4 presents
our solution to the data association problem in the
context of visual landmarks. In section 5 we present
our experimental results. Finally, section 6 sums up
the most important conclusions and proposes future
extensions.
2 FEATURE-BASED METHODS:
VISUAL FEATURES
Previous work in map building has revolved around
two topics: occupancy or certainty grids, and feature-based
methods. Feature-based methods work by locating
features in the environment, estimating their
position, and then using them as known landmarks.
Different kinds of features have been used to create a
map of the environment. In our work, we use visual
landmarks as features to build the map. In particular,
we use SIFT features (Scale Invariant Feature Trans-
form) which were developed for image feature gen-
eration, and used initially in object recognition ap-
plications (see (Lowe, 2004) and (Lowe, 1999) for
some examples). Key locations are selected at max-
ima and minima of a difference of Gaussian function
applied in scale space. The features are invariant to
image translation, scaling, rotation, and partially in-
variant to illumination changes and affine or 3D pro-
jection. They are computed by building an image
pyramid with resampling between each level. SIFT
locations extracted by this procedure may be understood
as significant points in space that are highly distinctive
and can thus be found from a set of robot poses.
In addition, each SIFT location is given a descriptor
that describes this landmark. Thus, this process en-
ables the same points in the space to be recognized
from different viewpoints, which may occur while
the robot moves around its workplace, thus providing
information for the localization process. SIFT features
have been used in robotic applications, showing
their suitability for localization and SLAM tasks (Little
et al., 2001), (Little et al., 2002), (Sim et al., 2005).
Figure 1 shows the visual features extracted from an image.
These SIFT locations are used as landmarks in
the map. The size of the arrow on each feature is
proportional to the scale of the SIFT feature.
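The selection of key locations at maxima and minima of a difference-of-Gaussian function in scale space can be illustrated with a minimal one-dimensional sketch. It is not the SIFT implementation used in this work: SIFT operates on 2-D images, resamples between pyramid levels and applies further stability checks. The function names, the border handling and the scale values are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel truncated at 3 * sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Convolve a 1-D signal with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for offset, w in enumerate(kernel):
            j = min(max(i + offset - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def dog_extrema(signal, sigmas):
    """Return (position, level) pairs that are strict extrema of the DoG stack.

    A sample is kept if it is larger or smaller than all 8 neighbours
    in position and scale (the 1-D analogue of the 26 neighbours in 2-D).
    """
    blurred = [blur(signal, s) for s in sigmas]
    dog = [[b - a for a, b in zip(blurred[k], blurred[k + 1])]
           for k in range(len(sigmas) - 1)]
    keypoints = []
    for k in range(1, len(dog) - 1):
        for i in range(1, len(signal) - 1):
            centre = dog[k][i]
            neighbours = [dog[kk][ii]
                          for kk in (k - 1, k, k + 1)
                          for ii in (i - 1, i, i + 1)
                          if (kk, ii) != (k, i)]
            if centre > max(neighbours) or centre < min(neighbours):
                keypoints.append((i, k))
    return keypoints
```

For a box-shaped blob such as `[0.0] * 18 + [1.0] * 5 + [0.0] * 18` and scales `[1.0, 2.0, 4.0, 8.0]`, the blob centre (index 20) is reported at the middle difference-of-Gaussian level, possibly among other extrema.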
3 RAO-BLACKWELLIZED SLAM
We estimate the map and the path of the robot using
a Rao-Blackwellized particle filter. Using the nomenclature
of this filter, we denote as $s_t$ the robot pose at
time $t$. The robot path until time $t$ will be denoted
$s^t = \{s_1, s_2, \ldots, s_t\}$, the set of observations
made by the robot until time $t$ will be denoted
$z^t = \{z_1, z_2, \ldots, z_t\}$, and the set of actions
$u^t = \{u_1, u_2, \ldots, u_t\}$. Therefore, the SLAM problem
can be formulated as that of determining the location of
all landmarks in the map $\Theta$ and the robot poses $s^t$
from a set of measurements $z^t$ and robot actions $u^t$.
The map is composed of a set of different landmarks
$\Theta = \{\theta_1, \theta_2, \ldots, \theta_i, \ldots, \theta_N\}$.
We denote by $c_t$ the correspondence of the landmarks extracted from