which uses a personal calibration process for each
user and does not allow large head motions. Limit-
ing the head motion is typical for systems that utilize
only a single camera. (T. Ohno and Yoshikawa, 2002)
uses a (motorized) auto-focus lens to estimate the dis-
tance of the face from the camera. In (J.G. Wang and
Venkateswarku, 2003), the eye gaze is computed by
using the fact that the iris contour, which is a circle in 3D, projects to an ellipse in the image under perspective. The drawback of this approach is that a high-resolution image of the iris area is necessary. This severely limits
the possible motions of the user, unless an additional
wide-angle camera is used.
In this paper we introduce a new approach with
several advantages. The system is monocular, hence
the difficulties associated with multiple cameras are
avoided. The camera parameters are kept constant over time. The system requires no personal cali-
bration and the head is allowed to move freely. This is
achieved by using a model of the face, deduced from
anthropometric features. This kind of method has already received some attention in the past (T. Horprasert and Davis, 1997; Gee and Cipolla, 1994a; Gee and Cipolla, 1994b). However, our approach is simpler, requires fewer points to be tracked, and is ultimately more robust and practical.
In (Gee and Cipolla, 1994a; Gee and Cipolla,
1994b), the head orientation is estimated under the as-
sumption of the weak perspective image model. This
algorithm works using four points: the mouth cor-
ners and the external corners of the eyes. Once those
points are precisely detected, the head orientation is
computed by using the ratio of the lengths L_e and L_f, where L_e is the distance between the external eye corners and L_f is the distance between the mouth and the eyes.
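To make the role of these two lengths concrete, the following minimal Python sketch (ours, not taken from the cited papers) computes L_e and L_f from four detected image points and returns their ratio; the definition of L_f as the distance from the midpoint of the eye corners to the mouth midpoint, as well as all pixel coordinates, are illustrative assumptions.

import math

def eye_mouth_ratio(eye_left, eye_right, mouth_left, mouth_right):
    # L_e: distance between the external eye corners.
    L_e = math.dist(eye_left, eye_right)
    # L_f: distance from the midpoint of the eye corners to the mouth
    # midpoint (one plausible reading of "between the mouth and the eyes").
    eye_mid = ((eye_left[0] + eye_right[0]) / 2.0,
               (eye_left[1] + eye_right[1]) / 2.0)
    mouth_mid = ((mouth_left[0] + mouth_right[0]) / 2.0,
                 (mouth_left[1] + mouth_right[1]) / 2.0)
    L_f = math.dist(eye_mid, mouth_mid)
    return L_e / L_f

# Hypothetical pixel coordinates of the four tracked points.
print(eye_mouth_ratio((120, 100), (200, 100), (140, 180), (180, 180)))  # 1.0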
In (T. Horprasert and Davis, 1997), a five-point algorithm is proposed to recover the 3D orientation of the head under full perspective projection. The internal and external eye corners provide four points, while the fifth point is the bottom of the nose. The
first four points approximately lie on a line. There-
fore the authors use the cross-ratio of these points as
an algebraic constraint on the 3D orientation of the
head. It is worth noting that the cross-ratio is known
to be very sensitive to noise. Consider, for example, four points A, B, C, D lying on the x-axis, whose x-coordinates are 5, 10, 15 and 20, respectively. These could typically be the eye corners. Then the cross-ratio [A, B, C, D] = (5 − 15)/(5 − 20) × (10 − 20)/(10 − 15) = 4/3 ≈ 1.33. Now, if A is detected at 4 and B at 11, the cross-ratio becomes [A, B, C, D] = (4 − 15)/(4 − 20) × (11 − 20)/(11 − 15) = 99/64 ≈ 1.55. This simple computation
shows that using the cross-ratio as a constraint on the
3D structure requires detection with a precision gen-
erally beyond the capability of a vision system.
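This sensitivity can be reproduced numerically. The short Python sketch below (ours, not from the cited work) evaluates the cross-ratio of the example above, first with the ideal coordinates and then with A and B mislocalized by one unit each.

def cross_ratio(a, b, c, d):
    # Cross-ratio of four collinear points given by their x-coordinates,
    # in the same form as in the text: (a - c)/(a - d) * (b - d)/(b - c).
    return (a - c) / (a - d) * (b - d) / (b - c)

print(cross_ratio(5, 10, 15, 20))   # 4/3  ~ 1.33, ideal positions
print(cross_ratio(4, 11, 15, 20))   # 99/64 ~ 1.55, A and B off by one unit

A localization error of about one unit thus changes the constraint by roughly 16%, which is one reason our formulation avoids the cross-ratio.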
In contrast, our approach is based on three points
only and works with a full perspective model. The
three points are the eye centers and the middle point
between the nostrils. Using these three points, we can
compute several algebraic constraints on the 3D head
orientation, based on an anthropometric model of the
human face. These constraints are explicitly formu-
lated in section 2.
Once the head orientation is recovered, further
computations are possible. In this paper, we show an
application to gaze detection. This mechanically simple, automatic and non-intrusive approach allows eye-gaze detection to be used in a variety of applications where it was not an option before. For example, such a system may be installed in mass-produced cars. With the growing concern over car accidents, customers and regulators are demanding safer cars. Active sensors that may prevent accidents are actively pursued. A non-intrusive, cheaply
produced, one-size-fits-all eye-gazing system could
monitor driver vigilance at all times. Drowsiness and
inattention can then immediately trigger an alarm. In conjunction with other active sensors, such as radar or obstacle detection, the driver may be warned of an
unnoticed hazard outside the car.
Psychophysical and psychological tests and exper-
iments with uncooperative subjects such as children
and/or primates may also benefit from such a static
(no moving parts) system, which allows the subject
to focus solely on the task at hand while remaining
oblivious to the eye-gaze system.
In conjunction with additional higher-level sys-
tems, a covert eye-gazing system may be useful in se-
curity applications, for example to monitor the eye-gaze of ATM clients. At automated airport check-in counters, such a system may raise an alert about suspiciously behaving individuals.
The paper is organized as follows. In section 2,
we present the core of the paper, the face model that
we use and how this model leads to the computation of the 3D Euclidean orientation and position of the face. We present simulations that show the results are robust to errors in both the model and the measurements. Sec-
tion 3 gives an overview of the system, and some ex-
periments are presented.
2 FACE MODEL AND
GEOMETRIC ANALYSIS
2.1 Face Model
Following the statistical data taken from (Farkas,
1994), we assume the following model of a generic
human face. Let A and B be the centers of the eyes,
and let C be the middle point between the nostrils.