is to calculate spatial statistics on the texton maps and
use these statistics for classification.
In this paper, we propose the use of co-occurrence
spatial statistics calculated on the texton maps as a
means to improve classification of textures that dif-
fer mostly in their spatial structure. In the implemen-
tation proposed in this paper, we use the gray-level
co-occurrence matrix (GLCM) (Haralick et al., 1973)
to account for the spatial structure of textons. These
statistics improve classification, especially for
textures that histogram-based methods alone cannot
classify correctly.
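As a minimal sketch of this idea (not the implementation evaluated in this paper), the following NumPy code counts texton co-occurrences at a single pixel offset. The function name `texton_cooccurrence`, the offset convention, and the two toy 4 x 4 texton maps are assumptions made for illustration. Both maps contain eight pixels of each texton, yet their co-occurrence counts differ.

```python
import numpy as np

def texton_cooccurrence(texton_map, num_textons, offset=(0, 1)):
    """Count how often texton j lies at `offset` (rows, cols) from texton i."""
    dr, dc = offset
    rows, cols = texton_map.shape
    # Crop two views so that src[r, c] and dst[r, c] are exactly `offset` apart.
    src = texton_map[max(0, -dr):rows - max(0, dr), max(0, -dc):cols - max(0, dc)]
    dst = texton_map[max(0, dr):rows - max(0, -dr), max(0, dc):cols - max(0, -dc)]
    counts = np.zeros((num_textons, num_textons), dtype=np.int64)
    # Accumulate one count per (source texton, neighbour texton) pair.
    np.add.at(counts, (src.ravel(), dst.ravel()), 1)
    return counts

# Hypothetical texton maps with two textons (labels 0 and 1).
stripes = np.tile([0, 0, 1, 1], (4, 1))           # wide vertical stripes
checker = np.indices((4, 4)).sum(axis=0) % 2      # checkerboard

print(np.bincount(stripes.ravel()), np.bincount(checker.ravel()))  # [8 8] [8 8]
print(texton_cooccurrence(stripes, 2))  # [[4 4], [0 4]]
print(texton_cooccurrence(checker, 2))  # [[0 6], [6 0]]
```

The default offset pairs each pixel with its right-hand neighbour; other offsets, such as `(1, 0)` for the pixel below, can be passed to probe different directions.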
We are particularly interested in the classification
of textures that present similar texton frequency, but
differ in the way their textons are spatially distributed
in the image. Examples of such textures include
man-made ones such as buildings and cloth patterns, and
natural ones such as coral reefs and pollen grains.
In this paper, we show how the spatial structure of
textons can be used for the recognition of pollen in
optical microscope images (Bush, 2000; Mark and
Wengs, 2006). We are currently working on the de-
velopment of an automatic image-based algorithm
for pollen recognition. Figure 1 shows examples of
pollen grain images acquired using an optical micro-
scope.
The remainder of this paper is organized as follows.
In Section 2, we provide a concise review of the cur-
rent related literature. In Section 3, we summarize
the original texton-based classification framework. In
Section 4, we describe the details of our method. In
Section 5, we demonstrate the effectiveness of our al-
gorithm on two texture datasets. Finally, in Section 6,
we present our conclusions and plans for future work.
2 RELATED WORK
The literature on texture classification is extensive.
In general, approaches can be divided into two main
groups: 2D methods and 3D methods. The first group
models texture in terms of the spatial variations
of albedo on a planar surface. The second group
attempts to model the reflectance variations
due to 3D factors such as surface relief,
camera viewpoint, and illumination. Representative
2D methods include works based on Markov
Random Fields, which model the spatial statistical
relationships of pixels (Chellapa et al., 1985; Cross and
Jain, 1983), and descriptive models based on filter
banks (Leung and Malik, 1999). Recent work on
texture classification has focused on the problem of
modeling the 3D appearance of surface materials (Leung and
Malik, 2001; Varma and Zisserman, 2004; Dong and
Chantler, 2005).
Leung and Malik (Leung and Malik, 1999; Le-
ung and Malik, 2001) have introduced a descriptive
model capable of encoding essential local structures
and attributes of natural textures. Their method be-
gins by representing each pixel of a texture as the
convolution response of a bank of multi-scale and
multi-orientation filters. The filter responses at each
pixel location are concatenated into vectors. A
K-means clustering method (Duda et al., 2001) is
applied to the set of filter response vectors, and the estimated
cluster centers are taken as statistical representations
of the texture elements, or “textons”. The idea
of textons as cluster centers of filter responses has in-
spired several extensions of Leung and Malik’s origi-
nal method (Varma and Zisserman, 2004; Dong and
Chantler, 2005; Cula and Dana, 2004; Zhu et al.,
2005).
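The clustering step described above can be sketched as follows. This is a plain Lloyd-style K-means written in NumPy under stated assumptions, not Leung and Malik's code: the function name `learn_textons`, the iteration count, the random initialisation, and the synthetic 4 x 6 "image" with 3 filter responses per pixel are all hypothetical.

```python
import numpy as np

def learn_textons(responses, num_textons, num_iters=20, seed=0):
    """Cluster per-pixel filter response vectors; centers play the role of textons.

    responses: array of shape (num_pixels, num_filters).
    Returns (textons, labels): cluster centers and the texton index of each pixel.
    """
    rng = np.random.default_rng(seed)
    # Initialise the centers from randomly chosen response vectors.
    picks = rng.choice(len(responses), size=num_textons, replace=False)
    centers = responses[picks].astype(float)
    for _ in range(num_iters):
        # Squared Euclidean distance from every vector to every center.
        dists = ((responses[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(num_textons):
            members = responses[labels == k]
            if len(members):  # an empty cluster keeps its previous center
                centers[k] = members.mean(axis=0)
    # Final assignment against the updated centers.
    dists = ((responses[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dists.argmin(axis=1)

# Synthetic responses: the left and right image halves respond differently.
height, width, num_filters = 4, 6, 3
base = np.zeros((height, width, num_filters))
base[:, width // 2:, :] = 10.0
noise = np.random.default_rng(1).normal(scale=0.1, size=base.shape)
responses = (base + noise).reshape(-1, num_filters)

textons, labels = learn_textons(responses, num_textons=2)
texton_map = labels.reshape(height, width)  # one texton index per pixel
print(texton_map)
```

Reshaping the labels back to the image grid gives a texton map, which is the input assumed by any later histogram or spatial statistic.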
The strength of Leung and Malik’s (LM) method
is the ability to learn a statistically descriptive model
of textures based on the responses of the filter bank.
However, the method in its original version relies
on similarity measurements between one-dimensional
frequency histograms of vector-quantized texture
images. As a result, information about the spatial
arrangement of textons is lost in the process of
histogram generation.
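To illustrate this loss, the sketch below compares the texton histograms of two toy texton maps whose textons are arranged very differently. The chi-square distance is used here only as an assumed example of a histogram similarity measure, and the names `texton_histogram` and `chi_square_distance` as well as both maps are hypothetical.

```python
import numpy as np

def texton_histogram(texton_map, num_textons):
    """Normalised frequency of each texton index in the map."""
    counts = np.bincount(texton_map.ravel(), minlength=num_textons)
    return counts / counts.sum()

def chi_square_distance(h1, h2):
    """Chi-square distance between two normalised histograms."""
    denom = h1 + h2
    mask = denom > 0  # skip bins that are empty in both histograms
    return 0.5 * np.sum((h1[mask] - h2[mask]) ** 2 / denom[mask])

# Two hypothetical 4 x 4 texton maps with the same texton frequencies.
blocks = np.repeat([0, 1], 8).reshape(4, 4)       # top half 0, bottom half 1
interleaved = np.tile([0, 1], 8).reshape(4, 4)    # alternating columns

distance = chi_square_distance(texton_histogram(blocks, 2),
                               texton_histogram(interleaved, 2))
print(distance)  # 0.0: the histograms cannot tell the two maps apart
```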
A possible solution to account for spatial struc-
ture is to represent the relationship among textons us-
ing spatial statistics measurements such as Markov
Random Fields (Chellapa et al., 1985; Cross and
Jain, 1983; Li, 1995) as well as descriptors of a
co-occurrence matrix (Haralick et al., 1973). In this
paper, we propose the use of the well-known gray-level
co-occurrence matrix as a way to capture spatial
interaction among textons (Haralick et al., 1973). Next,
we show how representing the spatial interaction of
textons can improve recognition rates for certain
types of textures.
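As a rough sketch of what descriptors of a co-occurrence matrix can look like, the code below computes two of the measures described by Haralick et al. (angular second moment and entropy) from a matrix of co-occurrence counts. The function name, the extra `diagonal_mass` value (the probability that neighbouring pixels share a texton, which is not one of Haralick's features), and the two hand-written count matrices are assumptions for illustration only.

```python
import numpy as np

def cooccurrence_descriptors(counts):
    """Summarise a texton co-occurrence count matrix with a few scalars."""
    p = counts / counts.sum()              # joint probability of texton pairs
    nonzero = p[p > 0]                     # avoid log(0) in the entropy term
    return {
        "asm": float(np.sum(p ** 2)),                      # angular second moment
        "entropy": float(-np.sum(nonzero * np.log(nonzero))),
        "diagonal_mass": float(np.trace(p)),               # illustrative extra
    }

# Hypothetical count matrices for two 2-texton maps with equal texton frequency.
counts_a = np.array([[4, 4], [0, 4]])  # neighbours often share a texton
counts_b = np.array([[0, 6], [6, 0]])  # neighbours always differ

print(cooccurrence_descriptors(counts_a))
print(cooccurrence_descriptors(counts_b))
```

Because texton indices are arbitrary labels, descriptors that depend on the numeric difference between indices would change if the textons were relabeled; the measures used in this sketch do not depend on the labeling.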
3 TEXTONS AS CLUSTER CENTERS
In this section, we briefly describe Leung and Ma-
lik’s (Leung and Malik, 1999) approach to texture
classification. The key idea is to model texture el-
ements as cluster centers of convolution filter re-
sponses. Let F be a bank of filters of several ori-
entations and scales. Figure 2 shows an example of
the filter bank used in (Leung and Malik, 1999; Leung
and Malik, 2001). The filter bank consists of 4
types of filters including an oriented filter set and an
isotropic filter set. The oriented set comprises
two groups, each with 6 different orientations
and 3 different scales. The isotropic set consists
of 12 filters at different scales.
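The counts above imply 2 x 6 x 3 = 36 oriented filters plus 12 isotropic filters, 48 in total. The following sketch builds a bank with that layout. Everything beyond the counts is an assumption made for illustration: the use of first and second Gaussian derivatives for the two oriented groups, the split of the isotropic set into 8 Laplacian-of-Gaussian and 4 Gaussian filters, the 49 x 49 support, the elongation factor of 3, and all scale values.

```python
import numpy as np

def gaussian_1d(x, sigma, order):
    """Gaussian (order 0) or its first/second derivative, evaluated at x."""
    g = np.exp(-x ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
    if order == 1:
        return -x / sigma ** 2 * g
    if order == 2:
        return (x ** 2 - sigma ** 2) / sigma ** 4 * g
    return g

def grid(size):
    """Pixel coordinates (ys, xs) centred on the middle of the kernel."""
    half = size // 2
    return np.mgrid[-half:half + 1, -half:half + 1].astype(float)

def oriented_filter(size, sigma, theta, order):
    ys, xs = grid(size)
    # Rotate the coordinate frame by theta.
    u = np.cos(theta) * xs + np.sin(theta) * ys
    v = -np.sin(theta) * xs + np.cos(theta) * ys
    # Elongated Gaussian along u, derivative of Gaussian across v.
    kernel = gaussian_1d(u, 3 * sigma, 0) * gaussian_1d(v, sigma, order)
    kernel -= kernel.mean()                # zero mean
    return kernel / np.abs(kernel).sum()   # unit L1 norm

def isotropic_filter(size, sigma, kind):
    ys, xs = grid(size)
    r2 = xs ** 2 + ys ** 2
    g = np.exp(-r2 / (2 * sigma ** 2))
    if kind == "log":                      # Laplacian of Gaussian
        kernel = (r2 - 2 * sigma ** 2) / sigma ** 4 * g
        kernel -= kernel.mean()
        return kernel / np.abs(kernel).sum()
    return g / g.sum()                     # plain Gaussian, sums to 1

def build_filter_bank(size=49):
    scales = [1.0, np.sqrt(2), 2.0]                    # assumed values
    thetas = [np.pi * k / 6 for k in range(6)]         # 6 orientations
    oriented = [oriented_filter(size, s, t, order)
                for order in (1, 2) for s in scales for t in thetas]
    isotropic = [isotropic_filter(size, np.sqrt(2) ** k, "log") for k in range(8)]
    isotropic += [isotropic_filter(size, np.sqrt(2) ** k, "gaussian")
                  for k in range(1, 5)]
    return oriented, isotropic

oriented, isotropic = build_filter_bank()
print(len(oriented), len(isotropic), len(oriented) + len(isotropic))  # 36 12 48
```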
The process begins with the construction of a tex-
ton dictionary that will represent the information in
VISAPP 2006 - IMAGE UNDERSTANDING