The Kullback-Leibler divergence (KL-divergence), a measure of how different two
probability distributions are, is used to quantify the divergence between the
theoretical distribution, which is derived from the hypothesis about the model of
randomness, and the actual distribution observed in the data. The sense whose
distribution has the least divergence from the model of randomness is selected as the
correct sense for the target word. As for the model of randomness, we assign three
alternative theoretical distributions to the related synsets and evaluate each of them:
the standard Normal distribution, the Poisson distribution and the Binomial distribution.
Within the same framework, any model of randomness could be assigned to the data, and
any measure of difference between distributions could be used to quantify the
"discrepancy" between the theoretical and the actual distribution.
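The selection rule described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The direction of the divergence (observed distribution relative to the theoretical one), the truncated Poisson model with rate 1.0, the sense labels and the toy histograms are all assumptions made for the example.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i) over a shared discrete support."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same support")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # the term 0 * log(0 / q_i) is taken to be 0
        if q_i == 0.0:
            return math.inf  # P puts mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total


def normalize(weights):
    """Scale non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def poisson_pmf(k, lam):
    """Poisson probability of observing the value k with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def select_sense(observed_by_sense, model):
    """Return the sense whose observed distribution diverges least from the model."""
    return min(
        observed_by_sense,
        key=lambda sense: kl_divergence(observed_by_sense[sense], model),
    )


# Assumed model of randomness: Poisson(1.0) truncated to the values 0..4.
theoretical = normalize([poisson_pmf(k, 1.0) for k in range(5)])

# Hypothetical observed histograms for two candidate senses (toy data).
observed = {
    "sense_a": normalize([4, 4, 2, 1, 0]),
    "sense_b": normalize([0, 1, 2, 4, 4]),
}

best = select_sense(observed, theoretical)
```

Replacing `theoretical` with a Binomial or a discretized Normal distribution, or `kl_divergence` with another measure, leaves `select_sense` unchanged, which reflects the generality of the framework.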
In section 2, we describe the WordNet relations used by our algorithm to form the
bags of the related synsets. In section 3, we describe our algorithm and how it works
with the various models of randomness. In section 4, experimental results and a com-
parison with the results of other systems are given. Finally, some aspects of our
method and future activities are discussed in section 5.
2 WordNet
WordNet is an electronic lexical database developed at Princeton University in 1991
by Miller et al. [1]; in recent years it has become a valuable resource for identifying
taxonomic and networked relationships among concepts.
Lexical entries in WordNet are organized around logical groupings called synsets.
Each synset consists of a list of synonymous words, that is, words that can be
interchanged in the same context without changing the meaning of the context.
Thus, the synset
{administration, governance, establishment, brass, organization, organisation}
represents the sense of a governing body that administers something. The basic feature
that differentiates WordNet from conventional dictionaries is its relations:
pointers that describe the relationships between a synset and other synsets. WordNet
makes a distinction between semantic relations and lexical relations. Lexical relations
hold between word forms; semantic relations hold between word meanings. Since a
semantic relation is a relation between meanings, and since meanings can be repre-
sented by synsets, we must think of semantic relations as pointers between synsets.
For each synset in WordNet, such pointers connect the synset with other ones and
form a list of connected synsets (the "related synsets"). WordNet stores information
about words that belong to four parts-of-speech: nouns, verbs, adjectives and adverbs.
Prepositions, conjunctions and other functional words are not included. Besides sin-
gle words, WordNet synsets also sometimes contain collocations (e.g. fountain pen,
take in) which are made up of two or more words but are "treated" like single words.
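The organization described above can be pictured as a small data structure in which each synset holds its word list and a set of named pointers to other synsets. The sketch below is illustrative only and does not reproduce WordNet's actual storage format or any library API. The class `Synset`, its methods, the relation labels and the two placeholder target synsets are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Synset:
    """A group of synonymous words plus semantic pointers to other synsets."""
    words: List[str]
    gloss: str
    # relation name -> target synsets (semantic relations hold between synsets)
    pointers: Dict[str, List["Synset"]] = field(default_factory=dict)

    def add_pointer(self, relation: str, target: "Synset") -> None:
        self.pointers.setdefault(relation, []).append(target)

    def related_synsets(self, relations: Optional[List[str]] = None) -> List["Synset"]:
        """Collect the targets of the chosen relations (all relations if None)."""
        names = list(self.pointers) if relations is None else relations
        bag: List["Synset"] = []
        for name in names:
            bag.extend(self.pointers.get(name, []))
        return bag


# The synset quoted in the text; a collocation would simply be one more entry
# in `words`, e.g. "fountain pen".
admin = Synset(
    ["administration", "governance", "establishment", "brass",
     "organization", "organisation"],
    "governing body who administers something",
)

# Placeholder targets and relation labels, invented for illustration only.
broader = Synset(["placeholder_broader"], "hypothetical more general concept")
narrower = Synset(["placeholder_narrower"], "hypothetical more specific concept")
admin.add_pointer("hypernym", broader)
admin.add_pointer("hyponym", narrower)

bag = admin.related_synsets(["hypernym", "hyponym"])
```

Passing a different list of relation names to `related_synsets` yields a different bag of related synsets.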
Our algorithm makes use of a subset of the relations provided by WordNet for
nouns, verbs, adjectives and adverbs, and any other combination of these relations
could be used in a similar way in an attempt to achieve better results. We give a
short description below of the relations used in our work.