op cvalue is the threshold directive: op is one of >, >=, <, or <=, and cvalue is
an explicit certainty value.
If the weight is omitted, it defaults to 1. The threshold directive can
also be omitted. Certainty values are reals in the range 0..1.
It should be emphasized that the value of a term depends on the current value of the
predicate for the particular instantiation of its variables. If a threshold directive is
used, the predicate value is replaced by 0 (if the current value of the predicate does not
satisfy the directive) or 1 (if it does). The resulting value of the term is then the value
of the predicate, modified by the threshold directive if one is present, and multiplied by
the weight.
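
The evaluation of a single term described above can be illustrated with a minimal Python sketch. It is not McESE code: the function name term_value, its parameters, and the operator table are hypothetical names introduced here only for illustration.

```python
import operator

# Comparison operators allowed in a threshold directive (op cvalue).
_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def term_value(pred_value, weight=1.0, op=None, cvalue=None):
    """Value of a term: predicate value, optionally thresholded, times weight.

    Hypothetical helper, not part of McESE. The weight defaults to 1 and the
    threshold directive may be omitted, as stated in the text.
    """
    if not 0.0 <= pred_value <= 1.0:
        raise ValueError("certainty values must lie in the range 0..1")
    if op is not None:
        # With a threshold directive the predicate value collapses to 0 or 1.
        pred_value = 1.0 if _OPS[op](pred_value, cvalue) else 0.0
    return weight * pred_value
```

For example, a predicate with current value 0.8 under the directive >= 0.7 and weight 0.5 yields a term value of 0.5, while a predicate value of 0.6 under the same directive yields 0.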
When the backward-chaining mode is used in the McESE system, each rule that
has the predicate being evaluated as its right-hand side predicate is eligible to ‘fire’.
The firing of a McESE rule consists of instantiating the variables of the left-hand side
predicates by the instances of the variables of the right-hand side predicate, evaluating
all the left-hand side terms, and assigning the new certainty value to the predicate of
the right-hand side term (for the given instantiation of variables). The value is computed
by the CVPF F based on the values of the terms T_1, ..., T_n. In simplified terms,
the certainty of the evaluation of the left-hand side terms determines the certainty of
the right-hand side predicate. There are several built-in CVPFs the user can use (min,
max, average, weighted average), or the user can provide custom-made
CVPFs. This approach makes it possible, for instance, to create expert systems with fuzzy
logic, Bayesian logic, or many others [14].
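
The way a CVPF turns the values of the left-hand side terms into the certainty of the right-hand side predicate can be sketched as follows. This is an illustration under assumptions, not the McESE implementation: the names CVPFS and fire_rule are hypothetical, only three of the built-in functions are shown (the weighted average is left out because its exact definition is not given here), and the product used as a custom CVPF is merely an example.

```python
import math
from statistics import mean

# Hypothetical table of built-in CVPFs; each maps term values to one certainty.
CVPFS = {"min": min, "max": max, "average": mean}


def fire_rule(term_values, cvpf="min"):
    """Return the certainty assigned to the right-hand side predicate.

    term_values: values of the left-hand side terms T_1, ..., T_n.
    cvpf: name of a built-in function or any user-supplied callable.
    """
    f = CVPFS[cvpf] if isinstance(cvpf, str) else cvpf
    return f(term_values)


# A user-supplied CVPF (assumption for illustration): product of term values.
custom = fire_rule([0.5, 0.5], cvpf=math.prod)
```

With term values 0.8, 0.5, and 0.9, the min CVPF gives 0.5 and the max CVPF gives 0.9; passing a callable covers the case where the user provides a custom-made function.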
Any rule-based expert system must deal with the problem of which of the eligible rules
should be ‘fired’. This is handled by what is commonly referred to as conflict
resolution. In McESE the problem is slightly different: each rule is fired and provides
an evaluation of the right-hand side predicate, and we face the problem of which of the
evaluations should be used. McESE provides the user with three predefined conflict
resolution strategies: min (where one of the rules leading to the minimal certainty value
is considered fired), max (where one of the rules leading to the maximal certainty value
is considered fired), and rand (a randomly chosen rule is considered fired). The user can
also supply a custom conflict resolution strategy.
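
The three predefined strategies can be illustrated with a short Python sketch. It assumes, purely for illustration, that every eligible rule has already been fired and has produced a (rule name, certainty) pair; the function resolve and this data layout are hypothetical and not taken from McESE.

```python
import random


def resolve(evaluations, strategy="max", rng=random):
    """Pick the evaluation that counts as the result of the fired rules.

    evaluations: list of (rule_name, certainty) pairs, one per fired rule.
    strategy: "min", "max", "rand", or a user-supplied callable.
    """
    if callable(strategy):
        # Custom conflict resolution strategy provided by the user.
        return strategy(evaluations)
    if strategy == "min":
        return min(evaluations, key=lambda e: e[1])
    if strategy == "max":
        return max(evaluations, key=lambda e: e[1])
    if strategy == "rand":
        return rng.choice(evaluations)
    raise ValueError(f"unknown strategy: {strategy!r}")


evals = [("r1", 0.4), ("r2", 0.9), ("r3", 0.1)]
lowest = resolve(evals, "min")
highest = resolve(evals, "max")
```

When several rules tie for the minimal or maximal value, this sketch returns the first of them, which is one possible reading of "one of the rules" in the description above.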
3 Survey of Genetic Algorithms
Data Mining (DM) consists of several procedures that process real-world data.
One of its components is the induction of concepts from databases, which consists of
searching a usually large space of possible concept descriptions. There exist several
paradigms for controlling this search, for instance various statistical methods,
logical/symbolic algorithms, neural nets, and the like. However, such traditional
algorithms select immediate (usually local) optimal values.
Genetic algorithms (GAs) represent a newer paradigm for the search for concept
descriptions. They comprise a long process of evolution of a large population of
individuals (objects, chromosomes) before selecting optimal values, thus giving a
‘chance’ to weaker, worse objects. They exhibit two important characteristics: the
search is usually global, and it is parallel in nature, since a GA processes not just a
single individual but a large set (population) of individuals.
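
The population-based search described above can be illustrated with a generic textbook-style sketch. It is not the algorithm of this paper: the bit-string encoding, the toy fitness function (counting 1-bits), fitness-proportional selection, one-point crossover, the mutation rate, and the elitism step are all assumptions chosen only to keep the example short.

```python
import random


def fitness(individual):
    # Toy objective (assumption): the number of 1-bits in the chromosome.
    return sum(individual)


def next_generation(population, rng, mutation_rate=0.05):
    """Produce one new population of the same size."""
    # Fitness-proportional selection; the +1 keeps every weight positive,
    # so even the weakest individuals retain a chance of being selected.
    weights = [fitness(ind) + 1 for ind in population]
    # Elitism (assumption): copy the best individual unchanged.
    children = [max(population, key=fitness)[:]]
    while len(children) < len(population):
        mother, father = rng.choices(population, weights=weights, k=2)
        cut = rng.randrange(1, len(mother))  # one-point crossover
        child = mother[:cut] + father[cut:]
        # Flip each bit with a small probability (mutation).
        child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
        children.append(child)
    return children


def evolve(pop_size=20, length=16, generations=30, seed=0):
    """Evolve a random population of bit strings for a fixed number of steps."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population = next_generation(population, rng)
    return population
```

Each call to next_generation works on the whole population at once, which is the sense in which the search is parallel; weaker individuals are not discarded outright but are merely selected less often.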