Clustering-based Acceleration for High-dimensional Gaussian Filtering

Sou Oishi and Norishige Fukushima

Nagoya Institute of Technology, Japan

https://fukushima.web.nitech.ac.jp/en/

Keywords:

High-dimensional Gaussian Filtering, Approximated Acceleration, Clustering, Constant-time Filtering,

Bilateral Filtering, Non-local means, Dual Bilateral Filtering, Cross Trilateral Filtering.

Abstract:

Edge-preserving ﬁltering is an essential tool for image processing applications and has various types of ﬁl-

tering. For real-time applications, acceleration of its speed is also essential. To accelerate various types of

edge-preserving ﬁltering, we represent various edge-preserving ﬁltering by high-dimensional Gaussian ﬁl-

tering. Then, we accelerate the high-dimensional Gaussian ﬁltering by clustering-based constant algorithm,

which has O(K) order, where K is the number of clusters. The clustering-based method was developed for

color bilateral ﬁltering; however, this paper used it for high-dimensional bilateral ﬁltering. Also, cooperating

with tiling, k-means++, and principal component analysis, we can further improve the ﬁlter’s performance.

Experimental results show that our method can approximate various edge-preserving ﬁltering by approximated

clustering-based high-dimensional Gaussian ﬁltering.

1 INTRODUCTION

Edge-preserving smoothing is an essential tool for

image processing and computer vision. The ﬁlter-

ing is used in various image processing applications,

such as denoising (Zhang and Gunturk, 2008), deblur-

ring (Dai et al., 2007), detail enhancement (Farbman

et al., 2008), high-dynamic-range imaging (Durand

and Dorsey, 2002), haze removing (Fukushima et al.,

2018), stereo matching (Matsuo et al., 2015), and op-

tical ﬂow (Fujita et al., 2015).

A representative edge-preserving ﬁlter is bilateral

ﬁltering (BF) (Tomasi and Manduchi, 1998). The

BF’s kernel composites two kernels: a spatial kernel

for pixel positions and a range kernel for pixel val-

ues. Joint/cross bilateral ﬁltering (JBF) (Eisemann

and Durand, 2004; Petschnigg et al., 2004) is a natural

extension, which uses an additional image for range

kernel computing. Joint bilateral upsampling (Kopf

et al., 2007) utilizes the JBF’s kernel for upsampling.

Dual bilateral ﬁltering (DBF) (Bennett et al., 2007)

also extends the JBF, which uses three kernels for

space, an RGB image, and an IR image. Joint/cross-

trilateral ﬁltering (JTF) has a similar way for depth

map ﬁltering, which uses a depth map instead of us-

ing an IR image for ﬁltering depth map (Mueller

et al., 2010; Matsuo et al., 2013). Vector bilateral

ﬁltering (VBF) is an extension for hyperspectral im-

https://orcid.org/0000-0001-8320-6407

ages (Peng and Rao, 2009). Non-local means ﬁltering

(NLM) (Buades et al., 2005) is an extension of BF,

which uses patch vectors instead of color intensities.

In this paper, we summarize these variants of BF

as high-dimensional Gaussian ﬁltering (HDGF). The

HDGF is a convolution ﬁlter with a distance not only

in space but also in other dimensions. As typical ex-

amples, the grayscale BF is a 3-dimensional HDGF

for space and intensity value, and the color BF is a

5-dimensional HDGF, which has 3-dimensions in in-

tensity values. The DBF and JTF, including an IR

image or a depth map, are also 3 + 1-HDGF, respec-

tively. n channels hyperspectral images are an n + 2-

dimensional HDGF. The NLM is a m

+ 2 dimen-

sional HDGF in the case of creating an m ×m patch.

The HDGF has several acceleration algorithms.

The accelerated grayscale BF, i.e., 3-dimensional

HDGF, is proposed early, and it becomes fundamen-

tal. The state-of-the-art approaches have O(K) order,

where K is the approximating order, and it is smaller

than the ﬁltering radius D; K << D. We can extend

the specialized approaches for grayscale BF to HDGF,

e.g., color BF. However, as the number of dimensions

increases, the efﬁciency decreases due to the curse of

dimensionality. The computational order of n chan-

nels BF is O(K

). Spatial and range downsampling is

a solution; however, these methods do not fundamen-

tally solve the issue.

The clustering-based approaches resolve the curse

Oishi, S. and Fukushima, N.

Clustering-based Acceleration for High-dimensional Gaussian Filtering.

DOI: 10.5220/0010548600650072

In Proceedings of the 18th International Conference on Signal Processing and Multimedia Applications (SIGMAP 2021), pages 65-72

ISBN: 978-989-758-525-8

of dimensionality. The methods can reduce the com-

putational order O(K), where K is the number of clus-

ters. However, the approximation performance de-

pends on clustering methods. Also, clustering itself

is an overhead to the processing time. The sophisti-

cated clustering (Miyamura et al., 2020), which uses a

tiling strategy and advanced clustering, improves the

performance of 5-dimensional HDGF of color BF, but

the evaluation as the HDGF is not sufﬁcient. The ﬁlter

was only evaluated as the color BF.

We often use dimensionality reduction to com-

press signals for accelerating processing, but the veri-

ﬁcation of the clustering-based HDGF is not enough.

This paper evaluates the performance of the constant-

time algorithm with clustering for various HDGFs.

Also, we improve the clustering performance by

tiling, K-means++ (Arthur and Vassilvitskii, 2007),

and dimensionality reduction. The contributions of

this paper are as follows:

• We comprehensively deal with various local

smoothing as the HDGF.

• We accelerate various HDGFs by clustering-based

constant-time ﬁltering leveraged by tiling and so-

phisticated clustering techniques.

• We evaluate the dimensionality reduction for var-

ious types of the clustering-based HDGF.

2 RELATED WORKS

1) Grayscale BF (3D-HDGF): (Durand and Dorsey,

2002) proposed the early work of accelerating

grayscale BF. This approach decomposes BF into

multiple Gaussian ﬁltering, and fast Fourier trans-

form (FFT) accelerates it. (Paris and Durand, 2006;

Chen et al., 2007) and (Chen et al., 2007) extended

it by representing BF as the HDGF cooperating with

downsampling. (Weiss, 2006) approximated BF by

using histogram, and (Porikli, 2008) reduces the

other to O(K). (Yang et al., 2009) extended Du-

rand’s approach by constant Gaussian ﬁltering (De-

riche, 1992), whose order is also O(K). (Chaud-

hury et al., 2011; Chaudhury, 2013; Fukushima et al.,

2017) proposed a more efﬁcient O(K) method by

raised cosine approximation. Further, (Sugimoto and

Kamata, 2015; Sumiya et al., 2020) sophisticated the

approach by using Fourier series expansion. Also,

eigenvalue decomposition improved the performance

of grayscale BF (Sugimoto et al., 2016a; Papari et al.,

2017) The BF consists of a numerator and denomina-

tor, which requires convolution, individually. (Chaud-

hury and Dabhade, 2016) shared the numerator and

denominator’s convolution. The idea was extended

for Fourier series expansion (Deng, 2017) and singu-

lar value decomposition (Sugimoto et al., 2019).

2) Color BF (5D-HDGF): Changing the

grayscale BF to the color BF is not simple. We care-

fully handle the distance function of color channels,

i.e., 1-dimension to 3-dimension. (Paris and Durand,

2009) proposed the natural extension from the

grayscale BF. This approach was, however, O(K

);

thus, it utilized downsampling. (Yang et al., 2015)

also extended grayscale one. Recent approaches

reduced the number of the range kernel convolu-

tions (Karam et al., 2015; Ghosh and Chaudhury,

2016; Tu et al., 2016).

3) nD-HDGF: (Adams et al., 2009; Adams

et al., 2010) proposed a suitable data structure for

the downsampling-based HDGF, which are Gaussian

KD-tree and permutohedral lattice. The order is also

O(k

), but we can more massively downsample data

than (Paris and Durand, 2009; Yang et al., 2015).

Also, (Gastal and Oliveira, 2012) approximated the

HDGF by recursive geodesic ﬁltering, which has

Gaussian-like output, but it has high speed.

Clustering-based constant-time BFs (Mozerov

and van de Weijer, 2015; Sugimoto et al., 2016b;

Nair and Chaudhury, 2019; Miyamura et al., 2020)

are aimed for color BF, but we can use them for the

HDGF. The order is O(k), but it requires an additional

footprint of clustering. (Miyamura et al., 2020) im-

proved the clustering speed and accuracy.

4) Non-local Means: Non-local means ﬁlter-

ing(NLM) (Buades et al., 2005; Buades et al., 2005),

proposed simultaneously with unsupervised informa-

tion theoretic adaptive (UINTA) ﬁlter (Awate and

Whitaker, 2005), is regarded as the HDGF. The NLM

requires an entire image for a pixel convolution. Let

patch size P, windows size W , image size N. Na

ıve

approach, W=N, O(PN

). Restricting the search

to patches in a local neighborhood (Buades et al.,

2005), O(PW N), W << N Random sampling also

reduces the cost (Awate and Whitaker, 2005) using

pre-selected patches (Buades et al., 2005), O(PW

N),

<< W . Searching the patches by keying means

and variances, mean, and gradients (Mahmoudi and

Sapiro, 2005; Coup

e et al., 2006; Kervrann et al.,

2007; Gilboa and Osher, 2007) improve the perfor-

mance. (Brox et al., 2008) used recursive a k-means

build-up cluster tree. This approach used cluster-

ing, but the order is O(PK log(k)N). Instead, the

clustering-based HDGF is O(PKN). Computing inte-

gral images of certain error terms (Wang et al., 2006;

Darbon et al., 2008) had O(PN +WN).

Principal component analysis (PCA) was utilized

for dimensionality reduction for the NLM’s patches

for saving distance computation (Tasdizen, 2008; Tas-

SIGMAP 2021 - 18th International Conference on Signal Processing and Multimedia Applications

dizen, 2009). We can extend the idea for any HDGF.

3 BILATERAL FILTERING AND

ITS ACCELERATION

3.1 Deﬁnition

We ﬁrstly deﬁne the BF. Let the D-dimensional R-

tone image be f : S 7→ R , where S ⊂ R

is the spa-

tial domain, R ⊂ [0, R]

is the range domain, and c is

the range dimension (generally, D = 2, R = 256, and

c = 3), respectively. Denote p ∈ S as a pixel posi-

tion, f

∈ R as its intensity vector, and N

⊂ S as its

neighboring pixels. The BF is deﬁned as follows;

∑

q∈N

(p, q)w

, f

∑

q∈N

(p, q)w

, f

)

, (1)

where w

: S ×S 7→ R is a spatial kernel and w

: R ×

R 7→ R is a range kernel. The kernels are deﬁned

based on the Gaussian distribution;

(p, q) = exp



−

kq − pk

2σ



, (2)

, f

) = exp



−

− f

2σ



, (3)

where σ

∈ R

is a spatial scale and σ

∈ R

is a

range scale.

3.2 Clustering-based Constant-time

Bilateral Filter

We introduce a clustering-based constant-time bilat-

eral ﬁltering (CBF). The CBF used in this paper

adopts Nystr

om approximated acceleration of eigen-

value decomposition (EVD) (Nair and Chaudhury,

2019). First, we explain EVD for CBF.

Let T = {f

: p} be a list and let T =

, t

, . . . , t

} be a ordering of the elements in T ,

where m is the number of elements. This means that,

given l ∈ [1, m], t

= f

for some x ∈ S. We track the

correspondence by an index map: S 7→ [1, m], where

= l if t

= f

. (4)

We next deﬁne the kernel matrix W ∈ R

m×m

given by

W (i, j) = w

, t

). (5)

Substituting (5) for (1) gives

∑

q∈N

(p, q)W (τ

, τ

∑

q∈N

(p, q)W (τ

, τ

)

, (6)

Since it is clear from (5) that W is a symmetric ma-

trix, EVD of W is as follows;

W =

∑

k=1

, (7)

where λ

(λ

≥ λ

≥, ..., ≥ λ

∈ R) are eigenvalues

and u

is the corresponding eigenvectors. Substitut-

ing (7) to (6) gives

∑

q∈N

(p, q)

∑

k=1

(τ

∑

q∈N

(p, q)

∑

k=1

(τ

)

, (8)

On switching the sums, this becomes

∑

k=1

(τ

)

∑

q∈N

(p, q){u

(τ

}

∑

k=1

(τ

)

∑

q∈N

(p, q){u

(τ

)}

. (9)

Let

W ∈ R

×m

be the matrix of a low-rank approx-

imation of W using the top m

 m) eigenvalues

and eigenvectors. Using

W instead of W in (9), the

BF can be approximated as

≈

∑

k=1

(τ

)

∑

q∈N

(p, q){u

(τ

}

∑

k=1

(τ

)

∑

q∈N

(p, q){u

(τ

)}

(10)

In the color case, the number of convolutions is 4m

where K in the denominator and 3K in the numerator.

In the grayscale case, the size of the matrix is

256 × 256. However, in the color case, the size of

the matrix is 256

. Therefore, direct EVD of W is

difﬁcult. For Nyst

om approximation of EVD of W ,

we ﬁrst construct a small kernel A and then extrapo-

late the eigenvectors of A to approximate those of W .

A ∈ R

×m

is deﬁned using dominant color vectors

determined by clustering as

A(i, j) = w

(µ

, µ

) i, j ∈ [1, m

]. (11)

The size of A is much smaller than that of W . Thus,

we can easily compute EVD:

A =

∑

k=1

, (12)

where λ

∈ R and v

∈ R

. B ∈ R

×m

is :

B(i, j) = w

(µ

, t

) i ∈ [1, m

], j ∈ [1, m]. (13)

This matrix is used to extrapolate u

as follows;

. (14)

These calculations eliminate the computation of EVD

of the large matrix W .

4 PROPOSED METHOD

This section summarizes various neighborhood ﬁlter-

ing as the HDGF. Also, we accelerate it by clustering-

based constant time algorithm. Then we further accel-

erate it by PCA, tiling, and K-means++.

Clustering-based Acceleration for High-dimensional Gaussian Filtering

4.1 High-dimensional Gaussian Filter

The HDGF increases the distance measure’s dimen-

sionality in the range weights of the BF. Let the

n-dimensional input signal be I ∈ R

and the m-

dimensional guide signal be G ∈ R

. The HDGF’s

output of

at a pixel p is;

∑

q∈N

(p, q)w

, G

∑

q∈N

(p, q)w

, G

)

, (15)

where the kernel w

is deﬁned based on the Gaussian

distribution as follows:

, G

) = exp



−

− G

2σ



. (16)

The form is similar to (1) that is changing the range

kernel’s argument. When we use n-dimensional vec-

tors for f for (1), the equation is the HDGF; thus, we

can directly use clustering-based CBF as the HDGF.

This form is called joint bilateral ﬁltering (Eise-

mann and Durand, 2004; Petschnigg et al., 2004),

when m = 1, 3. When G = I, this is the grayscale

BF and the color BF. This direct representation of the

grayscale BF is the 2 + 1-dimensional HDGF, and the

color BF is the 2 + 3-dimensional HDGF. When we

use a n-channels hyperspectral image, it becomes the

2 + n-dimensional HDGF. Therefore, we can repre-

sent various neighborhood ﬁltering by changing the

guide signal of G:

GBF

:= {I

GJBF

:= {G

CBF

:= {I

, I

CJBF

:= {G

, G

HBF

:= {G

, G

, . . . , G

}, (17)

where GBF is the grayscale BF, GJBF is the grayscale

JBF, CBF is the color BF, CJBF is the color JBF,

HBF is the hyperspectral BF. I

and G

are an in-

put grayscale image and an additional guidance im-

age, respectively. I

{R,G,B}

and G

{R,G,B}

are a vector

for each RGB components. G

is a vector of each

component of a n-channels hyperspectral image.

The NLM is also the HDGF. For example, for the

5 × 5 patch size, NLM is a 2 + 25 dimensional GF.

The NLM is deﬁned as follows

∑

q∈N

(p, q)w

(

∑

r∈R

p+r

− f

q+r

∑

q∈N

(p, q)w

(

∑

r∈R

p+r

− f

q+r

)

(18)

where R ⊂ S is set of patch neighborhood pixels and r

is its element. The form can be represented as neigh-

borhood pixels as high-dimensional vectors. For ex-

ample, 3 × 3 patch’s NLM is represented by the fol-

lowing HDGF:

NLM

:= {I

g(−1,−1)

, I

g(−1,0)

, I

g(−1,1)

g(0,−1)

, I

g(0,0)

, I

g(0,1)

g(1,−1)

, I

g(1,0)

, I

g(1,1)

CNLM

:= {I

B(−1,−1)

, I

B(−1,0)

, I

B(−1,1)

B(0,−1)

, I

B(0,0)

, I

B(0,1)

B(1,−1)

, I

B(1,0)

, I

B(1,1)

G(−1,−1)

, I

G(−1,0)

, I

G(−1,1)

G(0,−1)

, I

G(0,0)

, I

G(0,1)

G(1,−1)

, I

G(1,0)

, I

G(1,1)

R−1,−1)

, I

R(−1,0)

, I

R(−1,1)

R(0,−1)

, I

R(0,0)

, I

R(0,1)

R(1,−1)

, I

R(1,0)

, I

R(1,1)

}, (19)

where the G

NLM

is grayscale NLM and the G

NLM

color NLM. The form can also handle joint ﬁltering,

i.e., we can replace I as G. Also, (x, y) represents the

shift operator, e.g., I

(−1,−1)

is a vector whose position

is shifted by (−1, −1).

We handle dual bilateral ﬁltering (DBF) or joint

trilateral ﬁltering (JTF) for the other extension. These

ﬁlters have an additional kernel for BF expression

of (1) and require both input and guidance signals of

I and G. Also, the input signal and guidance signal

have different characteristics. For example, ﬂash/no-

ﬂash image pair, RGB/IR image pair, RGB/depth im-

age pair, RGB/optical ﬂow pair, RGB/alpha matting

mask, RGB/normal map image pair. The form is rep-

resented by:

∑

q∈N

(p, q)w

, G

, I

∑

q∈N

(p, q)w

, G

, I

)

(20)

where w

is an additional kernel, and is a usually

Gaussian kernel:

, I

) = exp



−

− I

2σ



. (21)

For handling different characteristics of the input and

guidance signals, the parameters for each kernel is

different, i.e., σ

6= σ

We can merge the two Gaussian kernels of w

and

by the exponential function’s rules:

, I

, G

)

= exp



− G

− I

−2σ



= exp



− G

+ ktI

−tI

−2σ



, (22)

SIGMAP 2021 - 18th International Conference on Signal Processing and Multimedia Applications

where σ

= tσ

. We can control each parameter by

splitting the smoothing parameters between the orig-

inal image and the additional image by multiplying

the pixel value by t. The input signal I : S 7→ [0, R] is

transformed into I : S 7→ [0,tR]. The original paper of

DBF handles an RGB image and an IR image to ﬁlter

the RGB image, and also the CTF paper handles an

RGB image and a depth map to ﬁlter the depth map.

For representing the DBF and CTF as the HDGF, the

guidance signals are represented as:

DBF

:= {tI

,tI

, I

CT F

:= {I

, I

,tD}. (23)

The DBF and CTF are the 2 + 3 + 1-dimensional

HDGF, including 2D-spatial dimension, RGB range

dimensions, and a IR image or depth map dimen-

sion. The weight control idea can also be applied

for the NLM and the color BF, e.g., using Gaus-

sian patch weighting or handling the important color

channels, such as the green channel. For example,

we can separately set parameter for each dimension

= t

= ·· · = t

4.2 Dimensionality Reduction by PCA

The HDGF utilizes PCA for reducing dimensionality.

We compute covariance matrix C for G:

C =(G−

(G−

G) =







. . . σ







(24)

where

G contains mean vectors of G. For simplicity,

p+r

= x

n,p

= ¯x

|Ω|

∑

p∈Ω

m,p

− ¯x

)(x

n,p

− ¯x

) (25)

PCA decomposes the matrix by EVD.

C = V ΛV

−1

, (26)

where, V is matrix, which contains eigenvectors.

Λ = diag(λ

, . . . , λ

) is a diagonal matrix, which has

eigenvalue with decent order. PCA projection is rep-

resented as:

= V

(27)

where V

has k-components of eigenvectors, and G

is the projected guidance signal.

4.3 Tiling with Clustering and PCA

For acceleration, we perform image tiling for

constant-time ﬁltering (Miyamura et al., 2020). The

previous method, which is for color BF, acceler-

ates clustering processing and accelerates convolution

processing in (10). For the clustering method, we

used K-means++ (Arthur and Vassilvitskii, 2007) for

improving the accuracy.

The proposed method requires PCA for guidance

signals; thus, we also perform PCA on tile images in-

stead of entire images. Therefore, tiling accelerates

the computational speed, including convolution, clus-

tering, and PCA. Note that PCA for the tiled image

does not include full image information, but the ef-

fect is not large, veriﬁed in experimental results.

5 EXPERIMENTAL RESULTS

We evaluated the approximation accuracy for the

NLM, CTF, and DBF. The accuracy was quanti-

ﬁed as peak signal-to-noise ratio (PSNR) between

approximation and non-approximation results. The

codes were implemented in C++, vectorized by AVX,

and parallelized by OpenMP. We used Visual Stu-

dio 2019’s compiler and Intel Core i7-6700K @

4.00GHz. Also, we divided the image into 4 × 4 for

tiling. In our experiments, we set the parameters to

K = 30, σ

= 70, and σ

= 70, respectively, as default

parameters.

Firstly, we evaluated the NLM approximation ac-

curacy. The input image was Lenna (Figure 1a), and

the guidance image’s channels changed from 4 to 10

by PCA from 3×3×3 = 27 channels. Figure 1 shows

the ﬁltered images by σ

= 70, K = 30 and 4 chan-

nels. Figure 2, 3 (resp. Figure 4) show the accuracy

results (resp. the acceleration results) w.r.t. changing

the number of clusters K or parameter σ

with tiling

(a) input (b) without tiling (c) with tiling

Figure 1: Input/Output of NLM (Lenna: σ

= 70, K = 30).

10 20 30 40 50 60 70 80 90 100

PSNR [dB]

4channels

5channels

6channels

7channels

8channels

9channels

10channels

(a) without tiling

10 20 30 40 50 60 70 80 90 100

PSNR [dB]

4channels

5channels

6channels

7channels

8channels

9channels

10channels

(b) with tiling

Figure 2: K w.r.t PSNR [dB] (NLM approx.: σ

= 70). Over

7 channels case, Accuracy is saturated in without tiling.

With tiling, 5 and 6 channels cases are improved.

Clustering-based Acceleration for High-dimensional Gaussian Filtering

20 30 40 50 60 70 80 90 100

PSNR [dB]

4channels

5channels

6channels

7channels

8channels

9channels

10channels

(a) without tiling

20 30 40 50 60 70 80 90 100

PSNR [dB]

4channels

5channels

6channels

7channels

8channels

9channels

10channels

(b) with tiling

Figure 3: σ

w.r.t PSNR [dB] (NLM approx.: K = 30).

and without tiling cases. Other parameters are default

parameters. We use averaged PSNR and the calcula-

tion time with 100 trials. From these results, PSNR

is improved by increasing K, σ

, and the dimension-

ality. Moreover, we can keep PSNR using tiling in

the HDGF at the highest level, though PCA uses

only local tile information. Also, in the other low-

dimensional approximation, our method has a higher

PSNR than the method without tiling.

100

200

300

400

500

600

700

10 20 30 40 50 60 70 80 90 100

time [ms]

without tiling

with tiling

(a) K w.r.t time (σ

= 70)

100

150

200

250

20 30 40 50 60 70 80 90 100

time [ms]

without tiling

with tiling

(b) σ

w.r.t time (K = 30)

Figure 4: K and σ

w.r.t time [ms] (NLM approximation

with 4 channels).

Figure 4a is computational time w.r.t. the number

of K and Fig. 4b is one w.r.t. σ

. The calculation time

with tiling is faster than without tiling. When σ

small, the speed is slow, though it is a constant-time

algorithm, because we need to compute quite small

numbers, i.e., subnormal numbers. The same prob-

lem is found in the normal BF (Maeda et al., 2018)

cases.

Secondly, we evaluated the CTF. The ﬁltering in-

put image was the depth map of Tsukuba, and the

guidance image was the depth map and the RGB im-

age of Tsukuba (Fig. 5). The depth map is obtained

from (Scharstein and Szeliski, 2002; El-Etriby et al.,

2007). PCA compressed the 4 channels guide image

from 1 to 3 channels. Figure 5 shows the ﬁltered

images by σ

= 70, σ

= 50 and 3 channels. Fig-

ure 6 shows the accuracy results w.r.t. changing σ

and σ

. We used averaged PSNR with 1000 trials.

The ground truth image for measuring PSNR was a

4-dimensional HDGF output without PCA. Figure 6a,

in the dimensions compressed to 2 and 3, PSNR is im-

proved by increasing the smoothing parameter σ

for

the RGB original image. However, in the dimension

compressed to 1, PSNR is constant at a low value.

PSNR does not depend on σ

due to the 1-st compo-

nent of the guide image is mostly the depth map by

PCA. Figure 6b, in the dimensions compressed to 2

and 3, it did not change signiﬁcantly with increasing

the smoothing parameter of the depth map. In the di-

mension compressed to 1, PSNR degrades with too

large the depth map component.

(a) RGB image (b) depth map (c) output

Figure 5: Input/Output of CTF (Tsukuba: σ

= 70, σ

= 50,

and 3-dimensional ﬁltering result).

10 20 30 40 50 60 70 80 90 100

PSNR [dB]

1channels 2channels 3channels

(a) changing RGB’s range

kernel (ﬁx σ

= 70)

10 20 30 40 50 60 70 80 90 100

PSNR [dB]

1channels 2channels 3channels

(b) changing depth map’s

range kernel (ﬁx σ

= 70)

Figure 6: Smoothing parameters w.r.t. PSNR [dB] (CTF

approximation: K = 30).

Finally, we evaluated the DBF. We used Book data

for RGB and IR images. The ﬁltering image was the

RGB image, and the guidance image is the RGB+IR

4 channels image. Figure 7 shows the ﬁltered images

by σ

= 40, σ

= 70 and 3 channels. Figure 8 shows

the accuracy results w.r.t. changing the smoothing pa-

rameters σ

and σ

. We used averaged PSNR with

1000 trials. In the DBF, the compressed guide image

in a dimension is mostly the IR image information

by PCA. However, since the DBF is ﬁltering for the

RGB image, whose color information is mostly deter-

mined by the input image, PSNR can be improved by

increasing the parameters of the IR image.

(a) RGB image (b) IR image (c) output

Figure 7: Input/Output of DBF (Books: σ

= 40, σ

= 70,

and 3-dimensional ﬁltering result).

10 20 30 40 50 60 70 80 90 100

PSNR [dB]

1channels 2channels 3channels

(a) changing RGB’s range

kernel (ﬁx σ

= 70)

10 20 30 40 50 60 70 80 90 100

PSNR [dB]

1channels 2channels 3channels

(b) changing IR’s range ker-

nel (ﬁx σ

= 70)

Figure 8: Smoothing parameters w.r.t. PSNR [dB] (DBF

approximation: K = 30).

SIGMAP 2021 - 18th International Conference on Signal Processing and Multimedia Applications

6 CONCLUSIONS

This paper handled several edge-preserving ﬁlters

as the HDGF, which includes the NLM, DBF, and

CTF. We also accelerate the various HDGF by the

clustering-based HDGF. Also, we improved the ac-

curacy and computational speed of the HDGF by

PCA, K-means++, and tiling. In our experiments,

the clustering-based HDGF with tiling showed that

its performance is high in the accuracy and speed as-

pects.

ACKNOWLEDGEMENTS

This work was supported by JSPS KAKENHI Grant

Number 18K19813, 21H03465.

REFERENCES

Adams, A., Baek, J., and Davis, M. A. (2010). Fast high-

dimensional ﬁltering using the permutohedral lattice.

Computer Graphics Forum, 29(2):753–762.

Adams, A., Gelfand, N., Dolson, J., and Levoy, M. (2009).

Gaussian kd-trees for fast high-dimensional ﬁltering.

ACM Transactions on Graphics, 28(3):21.

Arthur, D. and Vassilvitskii, S. (2007). K-means++: The

advantages of careful seeding. In Proc. Annual ACM-

SIAM Symposium on Discrete Algorithms (SODA).

Awate, S. P. and Whitaker, R. T. (2005). Higher-order im-

age statistics for unsupervised, information-theoretic,

adaptive, image ﬁltering. In IEEE Computer Society

Conference on Computer Vision and Pattern Recogni-

tion (CVPR), volume 2, pages 44–51 vol. 2.

Bennett, E., Mason, J., and McMillan, L. (2007). Multi-

spectral bilateral video fusion. IEEE Transactions on

Image Processing, 16(5):1185–1194.

Brox, T., Kleinschmidt, O., and Cremers, D. (2008). Ef-

ﬁcient nonlocal means for denoising of textural pat-

terns. IEEE Transactions on Image Processing,

17(7):1083–1092.

Buades, A., Coll, B., and Morel, J. . (2005). A non-local

algorithm for image denoising. In IEEE Computer

Society Conference on Computer Vision and Pattern

Recognition (CVPR), volume 2, pages 60–65 vol. 2.

Buades, A., Coll, B., and Morel, J. M. (2005). A review of

image denoising algorithms, with a new one. Multi-

scale Modeling & Simulation, 4(2):490–530.

Chaudhury, K. N. (2013). Acceleration of the shiftable o(1)

algorithm for bilateral ﬁltering and nonlocal means.

IEEE Transactions on Image Processing, 22(4):1291–

1300.

Chaudhury, K. N. and Dabhade, S. D. (2016). Fast and

provably accurate bilateral ﬁltering. IEEE Transac-

tions on Image Processing, 25(6):2519–2528.

Chaudhury, K. N., Sage, D., and Unser, M. (2011).

Fast o(1) bilateral ﬁltering using trigonometric range

kernels. IEEE Transactions on Image Processing,

20(12):3376–3382.

Chen, J., Paris, S., and Durand, F. (2007). Real-time edge-

aware image processing with the bilateral grid. ACM

Transactions on Graphics, 26(3).

Coup

e, P., Yger, P., and Barillot, C. (2006). Fast non

local means denoising for 3d mr images. In In-

ternational Conference on Medical Image Comput-

ing and Computer-Assisted Intervention, pages 33–

40. Springer.

Dai, S., Han, M., Wu, Y., and Gong, Y. (2007). Bilateral

back-projection for single image super resolution. In

Proc. IEEE International Conference on Multimedia

and Expo (ICME), pages 1039–1042.

Darbon, J., Cunha, A., Chan, T. F., Osher, S., and Jensen,

G. J. (2008). Fast nonlocal ﬁltering applied to electron

cryomicroscopy. In IEEE International Symposium

on biomedical imaging: from nano to macro, pages

1331–1334. IEEE.

Deng, G. (2017). Fast compressive bilateral ﬁlter. Electron-

ics Letters, 53(3):150–152.

Deriche, R. (1992). Recursively implementating the gaus-

sian and its derivatives. In Proc. IEEE International

Conference on Image Processing (ICIP).

Durand, F. and Dorsey, J. (2002). Fast bilateral ﬁltering

for the display of high-dynamic-range images. ACM

Transactions on Graphics, 21(3):257–266.

Eisemann, E. and Durand, F. (2004). Flash photography en-

hancement via intrinsic relighting. ACM Transactions

on Graphics, 23(3):673–678.

El-Etriby, S., Al-Hamadi, A., and Michaelis, B. (2007).

Dense stereo correspondence with slanted surface us-

ing phase-based algorithm. In Proc. IEEE Interna-

tional Symposium on Industrial Electronics (ISIE).

Farbman, Z., Fattal, R., Lischinski, D., and Szeliski, R.

(2008). Edge-preserving decompositions for multi-

scale tone and detail manipulation. ACM Transactions

on Graphics, 27(3):67.

Fujita, S., Matsuo, T., Fukushima, N., and Ishibashi, Y.

(2015). Cost volume reﬁnement ﬁlter for post ﬁltering

of visual corresponding. In Proc. Image Processing:

Algorithms and Systems XIII.

Fukushima, N., Sugimoto, K., and Kamata, S. (2017). Com-

plex coefﬁcient representation for iir bilateral ﬁlter. In

Proc. International Conference on Image Processing

(ICIP).

Fukushima, N., Sugimoto, K., and Kamata, S. (2018).

Guided image ﬁltering with arbitrary window func-

tion. In Proc. IEEE International Conference on

Acoustics, Speech and Signal Processing (ICASSP).

Gastal, E. S. and Oliveira, M. M. (2012). Adaptive man-

ifolds for real-time high-dimensional ﬁltering. ACM

Transactions on Graphics (TOG), 31(4):1–13.

Ghosh, S. and Chaudhury, K. N. (2016). Fast bilateral ﬁl-

tering of vector-valued images. In Proc. IEEE Inter-

national Conference on Image Processing (ICIP).

Clustering-based Acceleration for High-dimensional Gaussian Filtering

Gilboa, G. and Osher, S. (2007). Nonlocal linear image reg-

ularization and supervised segmentation. Multiscale

Modeling & Simulation, 6(2):595–630.

Karam, C., Chen, C., and Hirakawa, K. (2015). Stochastic

bilateral ﬁlter for high-dimensional images. In Proc.

IEEE International Conference on Image Processing

(ICIP).

Kervrann, C., Boulanger, J., and Coup

e, P. (2007). Bayesian

non-local means ﬁlter, image redundancy and adap-

tive dictionaries for noise removal. In International

conference on scale space and variational methods in

computer vision, pages 520–532. Springer.

Kopf, J., Cohen, M., Lischinski, D., and Uyttendaele, M.

(2007). Joint bilateral upsampling. ACM Transactions

on Graphics, 26(3).

Maeda, Y., Fukushima, N., and Matsuo, H. (2018). Ef-

fective implementation of edge-preserving ﬁltering on

cpu microarchitectures. Applied Sciences, 8(10).

Mahmoudi, M. and Sapiro, G. (2005). Fast image and video

denoising via nonlocal means of similar neighbor-

hoods. IEEE Signal Processing Letters, 12(12):839–

842.

Matsuo, T., Fujita, S., Fukushima, N., and Ishibashi, Y.

(2015). Efﬁcient edge-awareness propagation via

single-map ﬁltering for edge-preserving stereo match-

ing. In Proc. Three-Dimensional Image Processing,

Measurement (3DIPM), and Applications.

Matsuo, T., Kodera, N., Fukushima, N., and Ishibashi, Y.

(2013). Depth map reﬁnement using reliability based

joint trilateral ﬁlter. ECTI Transactions on Computer

and Information Technology, 7(2):108–117.

Miyamura, T., Fukushima, N., Waqas, M., Sugimoto, K.,

and Kamata, S. (2020). Image tiling for clustering to

improve stability of constant-time color bilateral ﬁl-

tering. In Proc. International Conference on Image

Processing (ICIP).

Mozerov, M. G. and van de Weijer, J. (2015). Global color

sparseness and a local statistics prior for fast bilateral

ﬁltering. IEEE Transactions on Image Processing,

24(12):5842–5853.

Mueller, M., Zilly, F., and Kauff, P. (2010). Adaptive

cross-trilateral depth map ﬁltering. In Proc. 3DTV-

Conference: the True Vision-Capture, Transmission

and Display of 3D Video (3DTV-CON), pages 1–4.

Nair, P. and Chaudhury, K. N. (2019). Fast high-

dimensional kernel ﬁltering. IEEE Signal Processing

Letters, 26:377–381.

Papari, G., Idowu, N., and Varslot, T. (2017). Fast bilateral

ﬁltering for denoising large 3d images. IEEE Trans-

actions on Image Processing, 26(1):251–261.

Paris, S. and Durand, F. (2006). A fast approximation

of the bilateral ﬁlter using a signal processing ap-

proach. In Proc. European conference on computer

vision (ECCV).

Paris, S. and Durand, F. (2009). A fast approximation of

the bilateral ﬁlter using a signal processing approach.

International Journal of Computer Vision, 81(1):24–

52.

Peng, H. and Rao, R. (2009). Hyperspectral image enhance-

ment with vector bilateral ﬁltering. In IEEE Interna-

tional Conference on Image Processing (ICIP), pages

3713–3716.

Petschnigg, G., Agrawala, M., Hoppe, H., Szeliski, R., Co-

hen, M., and Toyama, K. (2004). Digital photography

with ﬂash and no-ﬂash image pairs. ACM Transac-

tions on Graphics, 23(3):664–672.

Porikli, F. (2008). Constant time o(1) bilateral ﬁltering. In

Proc. IEEE Conference on Computer Vision and Pat-

tern Recognition (CVPR).

Scharstein, D. and Szeliski, R. (2002). A taxonomy and

evaluation of dense two-frame stereo correspondence

algorithms. International Journal of Computer Vision,

47(1/2/3):7–42.

Sugimoto, K., Breckon, T., and Kamata, S. (2016a).

Constant-time bilateral ﬁlter using spectral decompo-

sition. In Proc. IEEE International Conference on Im-

age Processing (ICIP).

Sugimoto, K., Fukushima, N., and Kamata, S. (2016b).

Fast bilateral ﬁlter for multichannel images via soft-

assignment coding. In Proc. Asia-Paciﬁc Signal and

Information Processing Association Annual Summit

and Conference (APSIPA).

Sugimoto, K., Fukushima, N., and Kamata, S. (2019). 200

fps constant-time bilateral ﬁlter using svd and tiling

strategy. In Proc. IEEE International Conference on

Image Processing (ICIP).

Sugimoto, K. and Kamata, S. (2015). Compressive bilat-

eral ﬁltering. IEEE Transactions on Image Process-

ing, 24(11):3357–3369.

Sumiya, Y., Fukushima, N., Sugimoto, K., and Kamata, S.

(2020). Extending compressive bilateral ﬁltering for

arbitrary range kernel. In Proc. IEEE International

Conference on Image Processing (ICIP).

Tasdizen, T. (2008). Principal components for non-local

means image denoising. In IEEE International Con-

ference on Image Processing (ICIP). IEEE.

Tasdizen, T. (2009). Principal neighborhood dictionaries for

nonlocal means image denoising. IEEE Transactions

on Image Processing, 18(12):2649–2660.

Tomasi, C. and Manduchi, R. (1998). Bilateral ﬁltering for

gray and color images. In Proc. IEEE International

Conference on Computer Vision (ICCV).

Tu, W., Lai, Y., and Chien, S. (2016). Constant time bilat-

eral ﬁltering for color images. In Proc. IEEE Interna-

tional Conference on Image Processing (ICIP).

Wang, J., Guo, Y., Ying, Y., Liu, Y., and Peng, Q. (2006).

Fast non-local algorithm for image denoising. In 2006

International Conference on Image Processing, pages

1429–1432. IEEE.

Weiss, B. (2006). Fast median and bilateral ﬁltering. ACM

Transactions on Graphics, 25(3):519–526.

Yang, Q., Ahuja, N., and Tan, K. H. (2015). Constant time

median and bilateral ﬁltering. International Journal

of Computer Vision, 112(3):307–318.

Yang, Q., Tan, K. H., and Ahuja, N. (2009). Real-time o(1)

bilateral ﬁltering. In Proc. IEEE Conference on Com-

puter Vision and Pattern Recognition (CVPR).

Zhang, M. and Gunturk, B. K. (2008). Multiresolution bilat-

eral ﬁltering for image denoising. IEEE Transactions

on image processing, 17(12):2324–2333.

SIGMAP 2021 - 18th International Conference on Signal Processing and Multimedia Applications