We have demonstrated the feasibility of a novel
single- and multi-camera pose estimation technique
that relies exclusively on the computed 3D head
pose of a human in the scene. A broad range of experiments
was carried out on simulated and real images
of vehicle cockpit scenes with varying camera
configurations. Our tests on real multi-camera data
have shown an average translational error of about 17 cm
and an average rotational error of less than 5 degrees.
The proposed method can be applied to use
cases where a certain decrease in accuracy compared
to traditional checkerboard calibration is outweighed
by the natural, easy and flexible handling of
head-pose-based calibration. Such use cases include camera
setups within the cockpit of a vehicle, train or plane,
where one or more cameras focus on the occupants,
for example, for the purpose of attention monitoring
or early sensor fusion in a multi-camera environment.
Other potential applications include robot attention
tracking or monitoring customer interest in automated
In future work, the 2D facial landmarks employed
in our approach and the symmetries typically present in
human faces could potentially be used to extend our
approach to estimate the camera intrinsics as well.
This would allow a full camera calibration to be
extracted using human faces as the calibration object.
Currently, our approach relies on detecting 2D facial
landmarks for head pose calculation. Further research
could try to relax the requirement for facial landmark
detection in order to generalize the head pose
estimation algorithm to viewing conditions where the
human face is not visible to all cameras.
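As an illustration of the underlying idea and of the error figures quoted above, the following is a minimal sketch, not the implementation evaluated in this work. It assumes that each camera provides the head pose as a 4x4 homogeneous transform mapping head coordinates into that camera's frame, and it uses a common choice of error metrics (Euclidean distance between translations, geodesic angle between rotations), which may differ from the exact metrics used in our experiments. All function names (make_pose, rot_z, relative_camera_pose, pose_errors) are hypothetical.

```python
import numpy as np


def make_pose(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z-axis."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def relative_camera_pose(T_a_head, T_b_head):
    """Pose of camera B expressed in camera A's frame.

    Assumption: T_x_head maps head coordinates into camera x's frame, so
    going from B to the head and then from the head to A gives
    T_a_b = T_a_head @ inv(T_b_head).
    """
    return T_a_head @ np.linalg.inv(T_b_head)


def pose_errors(T_est, T_ref):
    """Return (translational error, rotational error in degrees)."""
    # Translational error: Euclidean distance between the two positions.
    dt = np.linalg.norm(T_est[:3, 3] - T_ref[:3, 3])
    # Rotational error: angle of the residual rotation R_est^T @ R_ref.
    R_delta = T_est[:3, :3].T @ T_ref[:3, :3]
    cos_angle = np.clip((np.trace(R_delta) - 1.0) / 2.0, -1.0, 1.0)
    return dt, np.degrees(np.arccos(cos_angle))


if __name__ == "__main__":
    # Synthetic example: a known camera-to-camera pose and one head pose.
    T_a_b_true = make_pose(rot_z(30.0), [0.5, 0.0, 0.1])
    T_b_head = make_pose(rot_z(-10.0), [0.0, 0.0, 0.6])
    T_a_head = T_a_b_true @ T_b_head  # head pose as camera A would see it

    T_a_b_est = relative_camera_pose(T_a_head, T_b_head)
    print(pose_errors(T_a_b_est, T_a_b_true))
```

With noise-free head poses the recovered transform matches the reference up to floating-point precision. In practice, each per-frame head pose estimate is noisy, so some form of aggregation over many frames would be needed.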
This work was partly supported by the SyntheticCabin
project (no. 884336), which is funded through
the Austrian Research Promotion Agency (FFG) on
behalf of the Austrian Ministry of Climate Action
(BMK) via its Mobility of the Future funding pro-
Camera Pose Estimation using Human Head Pose Estimation