Section 3, we describe the geometry of refraction dis-
tortion due to surface waves. Section 4 describes the
details of our method. In Section 5, we demonstrate
our algorithm on video sequences acquired in the lab-
oratory and outdoors. Finally, in Section 6, we present
our conclusions and directions for future work.
2 RELATED WORK
Several authors have addressed the general problem
of analyzing images distorted by water waves. The
dynamic nature of the problem requires the use of
video sequences of the target scene
as a primary source of information. In the discussion
and analysis that follow in this paper, we assume that
frames from an acquired video are available.
The literature has contributions by researchers in
various fields including computer graphics, computer
vision, and ocean engineering. Computer graphics
researchers have primarily focused on the problem
of rendering and reconstructing the surface of the
water (Gamito and Musgrave, 2002; Premoze and
Ashikhmin, 2001). Ocean engineering researchers
have studied sea surface statistics and light refrac-
tion (Walker, 1994) as well as numerical modeling of
surface waves (Young, 1999). The vision community
has attempted to study light refraction between water
and materials (Mall and da Vitoria Lobo, 1995), re-
cover water surface geometry (Murase, 1992), as well
as reconstruct images of submerged objects (Efros
et al., 2004; Shefer et al., 2001). In this paper, we
focus on the problem of recovering images with min-
imum distortion.
A simple approach to reconstructing images of
submerged objects is to compute a temporal average
over a large number of consecutive video frames
acquired over an extended duration, i.e., the mean
value of each pixel over time (Shefer et al., 2001).
This technique rests on the assumption that the time
average of the periodic function modeling the water
waves tends to zero (or a constant) as the averaging
window grows. Average-based methods such as the one
described in (Shefer et al., 2001) can produce
reasonable results when the distortion is caused by
low-energy waves (i.e., waves of low amplitude and
low frequency). However, this method does not work well
when the waves are of higher energy, as averaging
over all frames equally combines information from
both high and low distortion data. As a result, the av-
eraged image will appear blurry and the finer details
will be lost.
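As a minimal sketch (assuming the frames are stacked in a NumPy array; the function name and the synthetic example are illustrative, not from Shefer et al.), the temporal average reduces to a per-pixel mean over time:

```python
import numpy as np

def temporal_average(frames):
    """Reconstruct a submerged scene by averaging video frames pixel-wise.

    frames: array of shape (T, H, W) -- T grayscale frames.
    Returns the per-pixel temporal mean.
    """
    frames = np.asarray(frames, dtype=np.float64)
    return frames.mean(axis=0)

# Synthetic example: a static scene displaced by a small periodic shift,
# mimicking low-energy wave distortion.
rng = np.random.default_rng(0)
scene = rng.random((32, 32))
frames = np.stack([np.roll(scene, shift=s, axis=1)
                   for s in (-1, 0, 1, 0)])  # low-amplitude oscillation
recovered = temporal_average(frames)
```

With low-amplitude shifts the mean stays close to the true scene; under larger displacements the same averaging smears detail, which is exactly the blurring failure mode described above.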
Modeling the three-dimensional structure of the
waves also provides a way to solve the image-recovery
problem. Murase (1992) approaches the
problem by first reconstructing the 3D geometry of
the waves from videos using optical flow estimation.
He then uses the estimated optical flow field to cal-
culate the water surface normals over time. Once the
surface normals are known, both the 3D wave geometry
and the image of submerged objects are reconstructed.
Murase’s algorithm assumes that the water depth is
known and that the wave amplitude is low enough that
image features are neither separated nor eliminated
across frames. If these conditions are
not met, the resulting reconstruction will contain er-
rors mainly due to incorrect optical flow extraction.
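Under a small-amplitude assumption, the apparent refraction displacement at each pixel is roughly proportional to the water-surface gradient, which lets surface normals be read off a flow field. The sketch below illustrates only that final step, not Murase's full algorithm; the proportionality constant `alpha` is an assumed placeholder standing in for the depth- and refraction-dependent factor:

```python
import numpy as np

def normals_from_displacement(dx, dy, alpha=0.3):
    """Estimate unit surface normals from per-pixel refraction displacements.

    Under a small-wave assumption, the image displacement (dx, dy)
    at each pixel is roughly proportional to the water-surface
    gradient; alpha is an assumed proportionality constant (it would
    depend on water depth and refractive index).
    """
    gx, gy = alpha * dx, alpha * dy          # surface-slope estimate
    n = np.stack([-gx, -gy, np.ones_like(gx)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

# Flat water (zero displacement) should yield straight-up normals.
flat = normals_from_displacement(np.zeros((4, 4)), np.zeros((4, 4)))
```

Errors in the estimated flow field propagate directly into these slopes, which is why incorrect optical flow extraction corrupts the reconstruction.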
More recently, Efros et al. (2004) proposed a
graph-based method that attempts to recover images
with minimum distortion from videos of submerged
objects. The main assumption is that the underlying
distribution of local image distortion due to
refraction is Gaussian (Cox and Munk, 1956). Efros
et al. propose to form an embedding of the subregions
observed at a fixed location over time and then
estimate the subregion that corresponds to the center
of the embedding. The Gaussian-distortion assumption
implies that the local patch closest to the mean is
fronto-parallel to the camera and, as a result, the
underwater object should be clearly observable
through the water at that point in time.
The solution is given by selecting the local patches
that are the closest to the center of their embedding.
Efros et al. propose the use of a shortest-path
algorithm that selects as the solution the frame
having the shortest overall path to all the other
frames. Distances are computed transitively using
normalized cross-correlation (NCC). Their method
addresses likely leakage problems, caused by
erroneously short distances between similar but
blurred patches, by calculating paths using a
convex-flow approach. The reconstructions produced
by their algorithm are considerably sharper than
those of average-based methods, even when applied
to sequences distorted by high-energy waves.
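The selection step can be illustrated with a simplified stand-in: instead of shortest paths with convex flow, the sketch below picks, for one patch location, the observation whose summed (1 − NCC) distance to all other observations is smallest. All names and the synthetic data are illustrative, not from Efros et al.:

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equal-size patches."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())

def center_patch(patches):
    """Index of the patch closest (in summed 1 - NCC distance) to all
    others -- a direct, non-transitive stand-in for the shortest-path
    selection described above."""
    n = len(patches)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = 1.0 - ncc(patches[i], patches[j])
    return int(d.sum(axis=1).argmin())

rng = np.random.default_rng(1)
base = rng.random((16, 16))
# Three noisy observations of the same patch; the second is least distorted.
patches = [base + 0.8 * rng.random((16, 16)),
           base + 0.02 * rng.random((16, 16)),
           base + 0.8 * rng.random((16, 16))]
best = center_patch(patches)
```

The least-distorted observation correlates well with every other view of the same patch, so it ends up nearest the center of the embedding.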
In this paper we follow Efros et al. (2004),
considering an ensemble of fixed local regions
over the whole video sequence. However, our method
differs from theirs in two main ways. First, we pro-
pose to reduce the leakage problem by addressing the
motion blur effect and the refraction effect separately.
Second, we take a frequency domain approach to the
problem by quantifying the amount of motion blur
present in the local image regions based on measure-
ments of the total energy in high frequencies.
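As a rough illustration of such a measurement (a sketch with an assumed radial cutoff, not the paper's exact formulation), the fraction of spectral energy above a cutoff frequency drops sharply for blurred patches:

```python
import numpy as np

def high_freq_energy(patch, cutoff=0.25):
    """Fraction of spectral energy above a radial frequency cutoff.

    Motion-blurred patches lose high-frequency content, so a low
    value flags likely blur. The cutoff (in cycles per pixel) is an
    assumed parameter, not one taken from the paper.
    """
    f = np.fft.fftshift(np.fft.fft2(patch - patch.mean()))
    power = np.abs(f) ** 2
    h, w = patch.shape
    yy, xx = np.mgrid[:h, :w]
    r = np.hypot((yy - h // 2) / h, (xx - w // 2) / w)  # cycles/pixel
    total = power.sum() + 1e-12
    return float(power[r > cutoff].sum() / total)

# A sharp random patch retains far more high-frequency energy
# than its blurred (5x5 box-filtered) counterpart.
rng = np.random.default_rng(2)
sharp = rng.random((32, 32))
kernel = np.ones((5, 5)) / 25.0
blurred = np.real(np.fft.ifft2(np.fft.fft2(sharp) *
                               np.fft.fft2(kernel, s=sharp.shape)))
e_sharp, e_blur = high_freq_energy(sharp), high_freq_energy(blurred)
```

Scores of this kind give each candidate region a scalar blur measure, which is what allows the regions to be clustered into high- and low-distortion groups.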
Our method aims to improve upon basic average-based
techniques by separating image subregions into high-
and low-distortion groups.
The K-Means algorithm (Duda et al., 2000) is used
along with a frequency domain analysis for generat-
ing and distinguishing the two groups in terms of the
quality of their member frames. Normalized cross-
correlation is then used as a distance measurement to