OR2 OR3DELAY AND3 H
1
SUM_SIG123
H
3
MAX_SIG2 H
2
DIFF MAX_SIG2
MAX_SIG123 OR3DELAY AND3 H
1
SUM_SIG123 H
3
MAX_SIG3 H
2
DIFF
MAX_SIG3 MAX_SIG123.
(21)
The GP’s evolutionary process for inverting (21) is
summarized in Table 3.
This example is similar to (1), in fact if in (1) the
MAX operations are replaced by the SUM operation
and SUM replaced by MAX, then (21) is obtained.
Given that (1) and (21) only differ in labeling of the
underlying graphs it is anticipated that the GP based
evolutionary processes that yield (1) and (21) would
be similar. This anticipation is born out, but there
are differences in the evolutionary processes. One
significant difference is that the chromosome in (21)
is evolved in a smaller number of generations than
the one found in (1). There is nothing that is
obvious about the rule set or input-output data based
used for both chromosomes that would favor one
over the other. Experimentation seems to indicate
the difference in the number of generations required
is related to the seed of the random number
generator.
Just as with the example in Table 2, the best
chromosome of the first generation has an OR2 at
the end, but is otherwise too short and far removed
from the correct answer. By the eighth generation
the “H
2
DIFF MAX_SIG2 MAX_SIG123” structure
has emerged. The best chromosome of the 16
th
generation preserves the best features of previous
generations and also makes use of an OR3DELAY,
but it still has many defects. For all generations
after the 26
th
generation the elite chromosome has
two subgraphs that take a form that can be derived
in closed form. The elite chromosome of the 30
th
generation has many correct labels and incorrect
ones. It illustrates how evolution can fluctuate from
generation to generation producing individuals of
higher fitness, but departing significantly from the
true DL in form. Finally in generation 46 the GP
converges having produced the correct DL design.
As referenced above it is possible to derive
closed form exact results for the set of digital logic
diagrams or set of DL maps referred to as DLS that
exactly maximize the rule score. The graph
underlying each DL diagram is isomorphic in the
graph-theoretic sense to the graph underlying
T
DL .
Furthermore, for certain signal types it is possible to
write down closed form exact results for the image
sets under these DL maps. From the image set
closed form exact entries for an input-output
database can be derived that maximize the overall
fitness.
It is found that by the 26
th
generation in the
computational studies of Tables 2 and 3 that the GP
finds a member of DLS. After the 26
th
generation
the GP’s elite solutions remain within DLS each
generation. The GP spends the rest of the
generations until it converges, re-labeling the
underlying elite graphs eventually evolving
T
DL .
By selectively eliminating rules from the rule set,
of which Table 1 is a subset or eliminating rows
from the derived IO database, the effect of
uncertainty can be studied. The elimination of rules
represents an approach to the determining the effect
of linguistic imprecision, i.e., the inability of experts
to provide crisp rules. The random loss of a row or
rows from the IO database provides a model of the
effect of uncertainty born of randomness during
measurement.
The implications for the two kinds of uncertainty
can be significantly different. Loss of a rule or rules
can greatly expand the set of DL maps that will
maximize the rule-score. If all the rules are
maintained, but rows are lost from the IO database,
then the ultimate solution can be quite different than
truth, but the underlying graph will still be
isomorphic to
T
DL . In many instances the real
effect of loss of rows from the IO database is to
interchange thresholds on the resulting DL map.
When this occurs input signals can still be designed
that will produce a desirable output. The resulting
signal will be mathematically similar to the true DL,
but more signal power will be required. So the
effect of certain kinds of uncertainty is the
requirement for more power. So the DL map has an
uncertainty insensitivity (UI) up to power.
6 SUMMARY AND
CONCLUSIONS
Genetic program (GP) based data mining has proven
effective for reverse engineering the complex digital
logic underlying sensor devices (SDs) when the
original design specifications for these devices are
unavailable and invasive study of the systems is
impossible.
The database that was subjected to data mining
consisted of known input to the digital logic (DL),
the associated measured output and a set of rules
provided by experts relating to their assumptions
about the digital logic. It is found that having a set
of expert rules in the database is essential; the
ICINCO 2006 - SIGNAL PROCESSING, SYSTEMS MODELING AND CONTROL
112