The remainder of this paper is organized as follows.
Section 2 discusses related literature.
Section 3 outlines the methodology used to solve this
problem. Section 4 presents experimental results.
Finally, Section 5 presents conclusions and future work.
2 PREVIOUS WORK
One application of gesture recognition that has been
active for many years is sign language recognition.
Starner et al. present a system that tracks sign
language in real time for continuous sentence recognition
(Starner, 1998). Their system combines measurements of
the hand (shape, orientation, and trajectory) with
hidden Markov models to achieve a recognition rate
between 97.8 and 99.3 percent while using a wearable
computer.
Hand gestures have also been explored as a means
of human-computer interaction. Oka et al. demonstrate
a system of hand gesture recognition and finger
tracking for use in an application called the Enhanced
Desktop (Oka, 2002). An image of the application
(such as a drawing application) is projected
onto a desk, and users can then manipulate the image
using hand gestures. The system must support both direct
manipulation of objects (tasks such as grabbing
an object and moving it) and communication with the
computer through symbolic gestures.
Hidden Markov models (HMMs) are a popular
method for recognizing gestures (Starner,
1998). HMMs were originally employed in the field
of automatic speech recognition (ASR). The dynamic
nature of both gesture and speech suggested that
a similar approach might be successful in gesture
recognition. Oka uses an HMM to recognize 12 different
gestures, based on the direction of motion of the
detected fingertips (Oka, 2002). The authors report
an accuracy of 99.2% for single-finger gestures
and 97.5% for double-finger gestures.
Starner achieved similar recognition rates in
ASL recognition (Starner, 1998).
The major difficulty in using HMMs for gesture
recognition is related to the quality of the sensor.
When the sensor quality is poor, tracking becomes
much more difficult. In such situations, measure-
ments become less reliable and HMMs yield poor re-
sults.
3 METHODOLOGY
The convex hull of a shape is the smallest convex
shape that contains it, where a shape is convex if no
two of its points are connected by a line segment
that contains points outside of the shape (Oviatt,
2002). Fig. 1 shows two examples of convex hulls of
hand gestures. The lines show the convex hull, while
hull points are marked with an x. The gesture on
the left is created by extending the index
finger and curling the remaining fingers towards the
palm. The gesture on the right is created
by extending and splaying all of the fingers.
Figure 1: Gesture Silhouette and Convex Hull.
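To make the notion of a convex hull concrete, the following is a minimal sketch of a Graham-scan-style hull computation. It uses Andrew's monotone chain, a common variant of the Graham scan, and is not necessarily the implementation used in this paper. The names `cross`, `convex_hull`, and the sample point set are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def cross(o: Point, a: Point, b: Point) -> int:
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return the hull vertices, counter-clockwise in standard (y-up) axes.

    Monotone-chain variant of the Graham scan (an assumption of this sketch).
    """
    pts = sorted(set(points))  # sorting costs O(n log n); each scan below is linear
    if len(pts) <= 2:
        return pts

    lower: List[Point] = []
    for p in pts:
        # Discard points that would make a clockwise or collinear turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


# Hypothetical example: a square outline with two interior points.
shape_points = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(shape_points)
```

In this example the two interior points are discarded and only the four corners remain as hull points, which mirrors the hull points marked in the figure.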
We compute the convex hull using the Graham scan
algorithm, which was selected for its speed and
simplicity. The Graham scan algorithm builds the convex
hull by systematically examining all the shape points
to determine whether they lie inside or outside the
current hull. Points inside are discarded,
while points outside are added to the hull.
Once the points are sorted, the scan itself runs in time
linear in the number of points in the object (O’Rourke,
1998). Next, we extract the deficits of convexity, the
regions that lie inside the convex hull but outside the
hand. Starting at the top-left
hull point, we trace the contour of the hand until another
hull point is found. The contour of the hand can
be efficiently traced by examining the points neighboring
the current point (proceeding clockwise from the previous
location) until the next hand point has been found (Chang,
1989). Each extracted deficit of convexity is normalized
by rotating it so that the edge it shares with the convex
hull is aligned with the x axis; the rotation angle is
given in eq. (1). In this equation, start denotes the
hull point at which the shared edge begins and finish
denotes the hull point at which it ends. The rotated
deficit is then centered by translating its first moment
(centroid) to the midpoint of the image.
$$\theta = -\arctan\left(\frac{y_{\mathrm{finish}} - y_{\mathrm{start}}}{x_{\mathrm{finish}} - x_{\mathrm{start}}}\right) \qquad (1)$$
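The normalization step can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: a deficit is represented as a list of pixel coordinates, `atan2` replaces the plain arctangent of eq. (1) so that vertical hull edges do not cause a division by zero, and the names `rotation_angle`, `normalize_deficit`, and `image_size` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def rotation_angle(start: Point, finish: Point) -> float:
    """Angle of eq. (1). atan2 is an assumption made here to handle
    vertical edges; it can differ from arctan by pi when x_finish < x_start."""
    return -math.atan2(finish[1] - start[1], finish[0] - start[0])


def normalize_deficit(pixels: List[Point], start: Point, finish: Point,
                      image_size: Tuple[int, int]) -> List[Point]:
    """Rotate the deficit so the shared hull edge lies along the x axis,
    then translate its centroid (first moment) to the image midpoint."""
    theta = rotation_angle(start, finish)
    c, s = math.cos(theta), math.sin(theta)

    # Rotate every pixel about the start hull point.
    rotated = [((x - start[0]) * c - (y - start[1]) * s,
                (x - start[0]) * s + (y - start[1]) * c)
               for x, y in pixels]

    # Centroid of the rotated deficit.
    cx = sum(x for x, _ in rotated) / len(rotated)
    cy = sum(y for _, y in rotated) / len(rotated)

    # Shift the centroid to the midpoint of the image.
    mx, my = image_size[0] / 2.0, image_size[1] / 2.0
    return [(x - cx + mx, y - cy + my) for x, y in rotated]


# Hypothetical example: three pixels along a 45-degree hull edge.
edge_pixels = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
normalized = normalize_deficit(edge_pixels, (0.0, 0.0), (2.0, 2.0), (10, 10))
```

After normalization, the three sample pixels lie on a horizontal line through the image midpoint, so deficits extracted from differently oriented hull edges become directly comparable.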
The process of computing the deficits can occasionally
result in a deficit that is extremely small. As we
see in fig. 1, this is particularly true around the tips
of the fingers and the side of the hand. Therefore,
we establish three thresholds for evaluating whether
a deficit is accepted or rejected. The width threshold
$T_w$ rejects deficits that are too narrow. The height
threshold $T_h$ rejects deficits that are too short. The
area threshold $T_a$ rejects deficits that meet the height
and width requirements, but are still too small. Examples
of normalized deficits of convexity that meet
these thresholds are shown in fig. 2.
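The three-threshold test can be sketched as below. The paper does not state how width, height, and area are measured, so this sketch assumes the bounding-box extents of the normalized deficit for width and height and the pixel count for area. The function name `accept_deficit` and the threshold values in the example are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int]


def accept_deficit(pixels: List[Pixel], t_w: float, t_h: float, t_a: float) -> bool:
    """Return True if the deficit passes the width, height, and area thresholds.

    Assumptions of this sketch: width and height are bounding-box extents
    in pixels, and area is the number of pixels in the deficit.
    """
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    area = len(pixels)
    return width >= t_w and height >= t_h and area >= t_a


# Hypothetical threshold values, chosen only for this example.
T_W, T_H, T_A = 5, 5, 50

# A one-pixel-high sliver, such as might appear near a fingertip.
sliver = [(x, 0) for x in range(20)]
# A solid 10 x 12 block, such as the gap between two splayed fingers.
finger_gap = [(x, y) for x in range(10) for y in range(12)]
# A thin L shape: wide and tall enough, but covering very few pixels.
thin_l = [(x, 0) for x in range(8)] + [(0, y) for y in range(1, 8)]

results = [accept_deficit(d, T_W, T_H, T_A) for d in (sliver, finger_gap, thin_l)]
```

The thin L shape illustrates why a separate area threshold is useful: it satisfies both the width and height thresholds yet is rejected because it covers too few pixels.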
During training, a number of examples of each ges-
ture are captured. The deficits of convexity from these
VISAPP 2006 - IMAGE UNDERSTANDING