FINITE FIELD MULTIPLICATION IN LAGRANGE

REPRESENTATION USING FAST FOURRIER TRANSFORM

Christophe Negre

Equipe DALI, LP2A, Universit

e de Perpignan

avenue P. Alduy, 66 000 Perpignan, France

Keywords:

Finite ﬁeld arithmetic, FFT, multiplication.

Abstract:

The multiplication in F

can be performed using a polynomial version of Montgomery multiplication (Mont-

gomery, 1985). In (Bajard et al., 2003) Bajard et al. improved this method by using a Lagrange represen-

tation: the elements of F

are represented by their values at a ﬁxed set of points. The costly operations in

this new algorithm are the two changes of Lagrange representation which require 2r

operations in F

with

n ≤ r ≤ 2

⌈log

(n)⌉

. In this paper we present a new method to perform the change of Lagrange represen-

tation. This method uses Fast Fourier Transform and has a cost equal to 3rlog

(r) operations in F

with

r = 2

⌈log

(n)⌉

INTRODUCTION

Finite ﬁeld arithmetic is used in many applications,

for example in cryptography (Koblitz, 1987; Miller,

1986) and in error correcting code (Berlekamp, 1982).

Getting efﬁcient ﬁnite ﬁeld arithmetic is an important

challenge to make these applications as fast as pos-

sible. Specially in ECC essentially two ﬁeld opera-

tions are required: multiplication and addition, and

the multiplication is the most costly. Consequently,

many works have been done to get an efﬁcient mul-

tiplication in ﬁnite ﬁeld F

(Ning and Yin, 2001;

Sunar and Koc, 1999), F

and also for F

(Bailey

and Paar, 1998).

Our interest here concerns ﬁnite ﬁeld extensions

. Elements in these ﬁeld can be seen as polyno-

mial A(X) ∈ F

[X] of degree n − 1. The multiplica-

tion consist in a multiplication of polynomial follow-

ing by a reduction modulo an irreducible polynomial

N(X) ∈ F

[X].

A method to perform arithmetic in F

was pro-

posed by Bajard et al. in (Bajard et al., 2003). This

method is based on a generalization of the polynomial

version of the Montgomery Algorithm (Montgomery,

1985). The arithmetic is done modulo friable poly-

nomials Ψ and Ψ

′

with degree r ≥ n; by friable we

means they are product of polynomial of degree one.

The arithmetic modulo such polynomials is very efﬁ-

cient when the elements are expressed in the Lagrange

representation (Bajard et al., 2003) LR

and LR

′

In this situtation the most costly operation is the

change of representation from LR

to LR

′

which

consists of a product of an r × r constant matrix with

the vector LR

(A). This requires roughly r

multi-

plications and r

additions. Here we will improve this

method for well chosen Ψ and Ψ

′

and by using Fast

Fourier Transform. The change of representation will

require 4r log(r) multiplications and 8r log(r) addi-

tions .

This article is organized as follows: in the ﬁrst

section we will give some background on Lagrange

representation of polynomials. In the section 2 we

explain how the Lagrange algorithm (Bajard et al.,

2003) of Bajard et al. works. The main contribution

of the paper is in section 3 in which we present our

new method to perform the change of Lagrange rep-

resentation. At the end we evalutate the complexity of

our algorithm and compare it to the original method

of Bajard et al. and we conclude by a brief conclu-

sion.

1 LAGRANGE

REPRESENTATION

The Lagrange representation consists to represent a

polynomial by its evalutation at n points. In an arith-

320

Negre C. (2006).

FINITE FIELD MULTIPLICATION IN LAGRANGE REPRESENTATION USING FAST FOURRIER TRANSFORM.

In Proceedings of the International Conference on Security and Cryptography, pages 320-324

DOI: 10.5220/0002096503200324

 SciTePress

metic point of view, this is related to the Chinese Re-

mainder Theorem which asserts that the following ap-

plication is an isomorphism.

[X]/(Ψ) → F

[X]/(X − e

) × · · · × F

[X]/(X − e

)

A 7→ (A mod (X − e

), . . . , A mod (X − e

)) ,

We remark that the computation of A mod (X −

) is simply the computation of A(e

). In other words

the image of A(X) by the isomorphism (1) is nothing

else that the multi-points evaluation of A at the roots

of Ψ =

i=1

(X − e

Deﬁnition 1 (Lagrange representation) Let A ∈

[X] with deg A < n, and Ψ =

i=1

(X − e

where e

∈ F

for 1 ≤ i ≤ r and e

6= e

for i 6= j. If

= A(e

) for 1 ≤ i ≤ r, the Lagrange representa-

tion (LR) of A(X) modulo Ψ is deﬁned by

(A(X)) = (a

, . . . , a

). (1)

The advantage of the LR representation to perform

operations modulo Ψ is a consequence of the Chinese

remainder theorem. Specially the arithmetic modulo

Ψ in classical polynomial representation can be costly

if Ψ has a high degree, in LR representation this arith-

metic is decomposed into n independent arithmetic

units, each consists of arithmetic modulo a very sim-

ple polynomial (X − e

). But arithmetic modulo

(X − e

) is the arithmetic modulo p since the prod-

uct of two degree zero polynomials is just the product

modulo p of the two constant coefﬁcients.

2 MONGOMERY

MULTIPLICATION IN

LANGRANGE

REPRESENTATION

Montgomery in (Montgomery, 1985) proposed an al-

gorithm to perform integer modular multiplications.

The polynomial version of the Montgomery algo-

rithm can be used to perform polynomial modular

arithmetic in F

[X]. Here we will present only the

generalized version of this algorithm.

We consider an irreducible polynomial N ∈ F

[X]

of degree n and two polynomials Ψ, Ψ

′

∈ F

[X] such

that

gcd(Ψ, Ψ

′

) = gcd(Ψ, N) = gcd(Ψ

′

, N) = 1,

and deg Ψ, deg Ψ

′

≥ n. In this situation, the gener-

alized polynomial version of Montgomery Algorithm

computes ABΨ

−1

mod N .

The advantage of Montgomery multiplication is to

avoid euclidean division to compute AB modulo N.

This division is replaced by exactly 5 multiplications

Algorithm 1 Generalized Montgomery Multiplica-

tion.

Require: A, B ∈ F

[X], with deg A, deg B ≤ n−1;

a monic irreducible polynomial N ∈ F

[X], with

deg N = n; Ψ, Ψ

′

, with deg Ψ = deg Ψ

′

= r ≥

n, and gcd(Ψ, Ψ

′

) = gcd(Ψ, N) = 1

Ensure: ABΨ

−1

mod N

1: Q ← A × B × N

−1

mod Ψ

2: R ← (A × B + Q × N ) × Ψ

−1

mod Ψ

′

modulo Ψ or Ψ

′

(2 in step 1 and 3 in step 2). But this

makes sense only if the arithmetic operation modulo

Ψ and Ψ

′

can be done efﬁciently. In their original

paper Bajard et al. proposed to use Ψ and Ψ

′

which

split totally in F

[X], i.e.,

Ψ =

i=1

(X − e

) and Ψ

′

i=1

(X − e

′

In section 1 we noticed that the arithmetic modulo

polynomials Ψ and Ψ

′

can be done efﬁciently using

Lagrange representation. The use Lagrange represen-

tation LR

and LR

′

provides some complications

in the Generalized Montgomery Algorithm: between

step 1 and step 2 we have to perform some conver-

sions from Lagrange representation of Q relatively to

Ψ to Lagrange representation of Q relatively to Ψ

′

Similarly after the step 2 we should compute the La-

grange representation of R relatively to Ψ.

If we note Change

Rep

Ψ→Ψ

′

the sub-routine

which computes the Lagrange representation of an el-

ement A relatively to Ψ

′

from its Lagrange represen-

tation relatively to Ψ, the generalized Montgomery’s

Algorithm 1 becomes

Algorithm 2 LR Modular Multiplication.

Require: A, B ∈ F

[X], with deg A, deg B ≤ k−1;

a monic irreducible polynomial N ∈ F

[X], with

deg N = n; Ψ, Ψ

′

, with deg Ψ = deg Ψ

′

= r,

and gcd(Ψ, Ψ

′

) = gcd(Ψ, N) = 1

1: LR

(Q) ← LR

(A) × LR

(B) × LR

(N)

−1

2: LR

′

(Q) ← Change

Rep

Ψ,Ψ

′

(LR

(Q))

3: LR

′

(R) ← (LR

′

(A) × LR

′

(B) −

P si

′

(Q) × LR

′

(N)) × LR

′

(Ψ)

−1

4: LR

′

(R) ← Change

Rep

′

,Ψ

(LR

′

(R)

In (Bajard et al., 2003) Bajard et al. the change

of Lagrange representation is done with a product

matrix-vector Ω · LR

(Q) where the matrix Ω =

[ω

i,j

]

i,j=1,...,r

is a r × r matrix and has the follow-

ing coefﬁcients

i,j

ℓ=1

′

− e

ℓ

− e

ℓ

FINITE FIELD MULTIPLICATION IN LAGRANGE REPRESENTATION USING FAST FOURRIER TRANSFORM

321

This matrix-vector product is a costly operation: it

requires r

multiplications and r(r − 1) additions.

In this paper we study a different strategy to per-

form the change of Lagrange representation. Our

strategy was to express the change of representation

in term of Discrete Fourrier Transform. This becomes

interisting when this DFT can be computed with the

FFT, since the FFT is really efﬁcient.

3 CHANGE OF LAGRANGE

REPRESENTATION WITH FFT

For now, we will take the prime integer p such that

k+1

|p − 1 with k ≥ 1. In this case there exists

an element α ∈ F

such that the 2

k+1

elements α

for i = 0, . . . , 2

k+1

− 1 are the 2

distinct roots of

k+1

− 1.

Let r = 2

, in this situation we choose the two

polynomials Ψ and Ψ

′

as follows

Ψ =

i=0

(X − α

) = X

− 1

and Ψ

′

i=0

(X − α

2i+1

) = X

+ 1.

We are going to express the change between two

Lagrange representations LR

and LR

′

in term of

to Discrete Fourier Transform. But before we recall

some basic fact on DFT and FFT.

3.1 Background On Dft and FFT

Let β = α

, the evaluation of a polynomial A(X) ∈

[X] at the elements β

for i = 0, . . . , r corresponds

to the Discrete Fourier Transformation of A

DF T

(A) = (ˆa

, ˆa

, . . . , ˆa

)

= (A(1), A(β), A(β

), . . . , A(β

r−1

))

The DF T

of A can be computed by using the

F F T

Algorithm. This algorithm is based on the fol-

lowing identities

ˆa

= A

((β

)

) + β

((β

)

ˆa

i+(r/2)

= A

((β

)

) − β

((β

)

(2)

where the polynomial A

(X) is the even part of A

and A

(X) is the odd part of A

(X) =

r/2

i=0

, A

(X) =

r/2

i=0

2i+1

The F F T

recursively computes F F T

r/2

)

and F F T

r/2

) and then deduces F F T

(A) =

(A(β

)A(β

), . . . , A(β

(r−1)

)) using equations (2).

For a complete description of the FFT algorithm

see (von zur Gathen and Gerhard, 1999)

In our situation the FFT is a powerful algorithm

which enables us

• to compute the Lagrange representation relatively

to Ψ of a polynomial A(X).

• to compute the polynomial from its LR representa-

tions relatively to Ψ since the reverse operation of

the FFT is F F T

−1

F F T

3.2 The Change

Rep Routines

Let us see how use the FFT in the change of repre-

sentation between LR

and LR

′

. We know only

how to reconstruct a polynomial A(X) from its LR

representation

A(X) = F F T

−1

(LR

(A)).

But if now we compute

A(X) = A(αX) and if after

that we compute F F T

(

A), we obtain the r = 2

elements

A(β

) = A(αα

) = A(α

2i+1

for i = 0, . . . , r. This means that LR

′

(A) =

F F T

(

A)). Consequently the change of Lagrange

representation Change

Rep

Ψ→Ψ

′

can be done with

the following algorithm.

Algorithm 3 Change Rep

Ψ→Ψ

′

1: A(X) ← F F T

−1

(LR

(A))

A(X) ← A(αX)

3: LR

′

(A) ← F F T (

The reverse of the change of representation is done

using the reverse process. First we get back to

A(X)

by computing F F T

−1

(LR

′

(A), then we compute

to A(X) with A(X) =

A(α

−1

X), and ﬁnally we get

(A) by computing LR

(A) = F F T

(A(X)).

Algorithm 4 Change Rep

′

→Ψ

A(X) ← F F T

−1

(LR

′

(A))

2: A(X) ←

A(α

−1

3: LR

(A) ← F F T (A(X))

It is natural to wonder if it is possible to merge

the three step of the Algorithm 3 and 4 in a FFT-like

recursive Algorithm since the main computation are

done with F F T

and F F T

−1

. Specially the com-

putation of the polynomial representation of A(X)

and

A(X) seems to be superﬂuous, it thus could be

avoided. For now we could not obtain such recursive

algorithm, but it will be interesting in the future to get

such recursive method to improve the performance of

the Algorithm 3 and 4.

SECRYPT 2006 - INTERNATIONAL CONFERENCE ON SECURITY AND CRYPTOGRAPHY

322

4 COMPLEXITY COMPARISON

In this section we evalutate the complexity of the

modiﬁed form of Lagrange multiplication of Bajard et

al.. First we focus on the theoretic complexity by

evaluating the number of ﬁeld operations F

(addi-

tions and multiplications). In the table 4 we give

the cost of each step of the Algorithm 2 used with

Change

Rep of section 3. For an explicit evaluation

of the cost of the FFT we refer to (von zur Gathen and

Gerhard, 1999).

Multiplication Additions

Step 1 2r 0

Step 2 rlog

(r) 2rlog

(r)

Step 3 3r r

Step 4 (2rlog

(r) + r − 1) 4rlog

(r)

The global cost of the algorithm is thus equal to

(4rlog

(r)+7r−2) multiplications and (8rlog

(r)+

r) additions. We get a clearly improvement compared

to original LR multiplication, since the complexity

was equal to (2r(r − 1) + r) addtions and 2r

+ 5r

multiplications in F

Hardware implementation. For hardware imple-

mentations, the FFT have the nice property to be

parallelizable. Specially, we can compute F F T

(A)

with r/2 multipliers and r adders in parallel. The de-

lay of the architecture to perform one F F T

is then

equal to log

(r)T

+log

(r)T

. For a precise expla-

nation of this fact we refer to (Johnson et al., 2000) .

But the other computations in the Algorithm 2 re-

quire also at most only r multipliers in parallel and r

adders in parallel. This is clear for step 1 and step 3,

for step 2 and 4, we need r adders and multipliers for

the FFT parts, and also r multipliers for the compu-

tation of

Q and R(X). We can use at each time the

same r adders and multipliers.

Consequently the Algorithm 2 can be implemented

in hardware with an architecture using r multipliers

and r adders in parallel.

Let us evaluate the delay of such architecture. The

total delay is equal to the sum of the delay of each

step of Algorithm 2. If we note T

the time for a

multiplication in F

, the step 1 has a delay of 2T

and the step 2 has a delay of 3T

+ T

. In step 2

and 4 the delay is equal to the delay of two F F T

plus the delay of two multiplications ,i.e. (one T

for the multiplications by r

−1

and a second for the

computation of

A), each step has a delay of (log

(r)+

2)T

+ (log

(r)T

Finally, the global delay is equal to (log

(r) +

5)T

+ (log

(r) + 1)T

5 CONCLUSION

In this paper we have presented a modiﬁed version of

the Algorithm of Bajard et al. (Bajard et al., 2003)

computes the product of two elements of F

. We

have modiﬁed only a part of the algorihtm: precisely

we modiﬁed the changes of representation in a way

to use FFT algorithm. We thus obtain an algorithm

which have a sub-quadratic complexity: a multipli-

cations requires (4rlog

(r) + 7r − 2) multiplica-

tions and (8rlog

(r) + r) additions in F

instead of

(2r(r−1)+r) addtions and 2r

+5r multiplications in

the original work of Bajard et al. (Bajard et al., 2003).

We are greatefull to N. Louvet for helpfull com-

ments on a preliminary version of this paper.

REFERENCES

Bailey, D. and Paar, C. (1998). Optimal Extension Fields

for Fast Arithmetic in Public-Key Algorithms. Lecture

Notes in Computer Science, 1462:472.

Bajard, J.-C., Imbert, L., Negre, C., and Plantard, T. (2003).

Efﬁcient Multiplication in GF(p

) for Elliptic Curve

Cryptography. In ARITH 16, 16th IEEE Symposium

on Computer Arithmetic June 15-18, 2003 Santiago

de Compostela, SPAIN.

Berlekamp, E. (1982). Bit-serial Reed-Solomon encoder.

IEEE Transaction on Information Theory, IT-28(6).

Johnson, J., Kumhom, P., and Nagvajara, P. (2000). De-

sign, optimization, and implementation of a universal

fft processor. In 13th IEEE International ASIC/SOC

Conference, Washington, DC.

Koblitz, N. (1987). Elliptic curve cryptosystems. Mathe-

matics of Computation, 48(177):203–209.

Miller, V. (1986). Use of elliptic curves in cryptography.

Advances in Cryptology, proceeding’s of CRYPTO’85,

218:417–426.

Montgomery, P. (1985). Modular multiplication without

trial division. Mathematic of computation, 44(170).

Ning, P. and Yin, Y. (2001). Efﬁcient Software Implemen-

tation for Finite Field Multiplication in Normal Basis.

Lecture Notes in Computer Science, 2229:177.

Sunar, B. and Koc, C. (1999). Mastrovito Multiplier for All

Trinomials. IEEE Transaction on Computers.

von zur Gathen, J. and Gerhard, J. (1999). Modern com-

puter algebra. Cambridge University Press, New

York, NY, USA.

FINITE FIELD MULTIPLICATION IN LAGRANGE REPRESENTATION USING FAST FOURRIER TRANSFORM

323