χ
2
P
6.635 0.01
10.83 0.001
18.70 2
−16
40.17 2
−32
24.02 2
−40
83.82 2
−64
105.8 2
−80
This type of test is dependent upon the sample size;
even a very slightly biased function will yield a high
χ
2
value by the test if the sample size is allowed to be
arbitrarily large. The sample sizes are bound by com-
putational restrictions, however. A distinguishing at-
tack is not relevant unless its total expected computa-
tional complexity is smaller than the claimed security
level of the cipher (typically equivalent to 2
k−1
key
trials, where k is the size of the secret key).
4 GATE COMPLEXITY AND THE
D-MONOMIAL TEST
In this section we will give a formal definition for gate
complexity and investigate its relationship with the d-
Monomial test. Gate complexity is essentially equiv-
alent to circuit complexity with realistic limitations
(Clote, 2002; Wegener, 1987).
Definition. Gate complexity of a Boolean function
f(x
1
, x
2
, . . . , x
n
) is the minimum number of gates
required to implement it in an acyclic circuit network.
A gate is a Boolean function with two inputs. The
constant functions 0 and 1, together with trivial func-
tions x
1
, x
2
, . . . have gate complexity 0.
Note that all 2
2
2
= 16 two-bit functions count as a
single gate, not just the standard ones (∨, ∧, ¬, ⊕).
We have determined the gate complexity of all
2
2
4
= 65536 four-bit Boolean functions. This was
done by performing an exhaustive search over all cir-
cuits with one gate, two gates, etc, until circuits for all
functions had been found. The task was computation-
ally nontrivial, even though we optimized the code to
take various symmetries and isometries into account.
The maximum gate complexity turned out to be 7 (see
Figure 2).
Table 1 gives the distribution of functions by gate
complexity. In it, G
i
is the number of functions of
gate complexity i. These sum to
P
i
G
i
= 65536.
Here g
i,d
is the number of monomials of degree d and
gate complexity i. These sum to
P
d
g
i,d
= G
i
. The
maximum possible value for g
i,d
is G
i
4
d
. The ex-
pected number in a d-monomial test is half of this
value. The table contains the “bias” fraction q
i,d
=
g
i,d
/(G
i
4
d
).
Note how in Table 1 the d-Monomial “bias” q
i,d
tends to be strongly increasing as the gate complex-
ity i grows (apart for anomaly at q
6,4
). This is clear
evidence of a correlation between the complexity of a
Boolean circuit and the d-monomial test. It is plausi-
ble to expect that a similar phenomenon is exhibited
by Boolean functions with 5, 6, . . . inputs. However,
the exact degree of this bias is currently an open prob-
lem for n > 4. We can expect simple functions to be
distinguishable in a d-monomial test even when n is
large.
It is interesting to note that it is even possible to test
the opposite; to distinguish a complex function from a
randomly chosen one, as the following example illus-
trates.
Example. With the 2720 functions of gate com-
plexity 7, all d-Monomial counts appear to be biased
upwards; q
7,d
≥ 0.5. We will use a d-Monomial test
to create a distinguisher based on this fact, particu-
larly that q
7,1
= 0.606.
Consider the following game. There is a list L con-
taining binary vectors of length 5. Entries in L are
may have been generated with one of the following
two methods:
1. Choose a random 4-bit Boolean function of gate
complexity 7 for each entry, and add the following
vector to the list
(f(0, 0, 0, 0), f(1, 0, 0, 0), f(0, 1, 0, 0),
f(0, 0, 1, 0), f(0, 0, 0, 1)).
2. Choose a completely random Boolean function
(one of the 65536 possibilities) and create a vector
in similar fashion.
We pose the following question: How long does L
need to be for us to see which type of list it is ?
We first note that the vectors contain sufficient in-
formation for computation of 1-Monomial test (e.g.
ˆ
f(1, 0, 0, 0) = f(0, 0, 0, 0) + f (1, 0, 0, 0)). Each 1-
Monomial test is simply the sum of 4 bits in the ANF
result. The expected sum after n list entries is 2n for a
random function and based on our exhaustive search,
g
7,1
n/G
7
= 6592/2720n ≈ 2.424n for a gate com-
plexity 7 function. Our distinguisher will simply re-
turn “a” if the sum is greater than 2n and “b” other-
wise.
In the second, fully random case, the distinguisher
has no advantage as the bits in the vector are random
too; “a” and “b” will both be returned with probability
1/2 regadless of the length of L.
In case 1, after n = 34 steps, the sum can be ex-
pected to reach 2.424∗34 = 82.4. ”a” will be returned
by the distinguisher with probability 99%. Hence we
can distinguish the list of (partially computed and ran-
domly chosen) “complex” functions with significant
certainty with a list of only 34 entries! Note that the
CHOSEN-IV STATISTICAL ATTACKS ON eSTREAM CIPHERS
263