Friday, June 5, 2015

52 Things: Number 35: Give the rough idea of Pollard rho, Pollard "kangaroo" and parallel Pollard rho attacks on ECDLP.

This is the latest in a series of blog posts to address the list of '52 Things Every PhD Student Should Know To Do Cryptography': a set of questions compiled to give PhD candidates a sense of what they should know by the end of their first year. In this blog post we discuss the Pollard rho, Pollard "kangaroo" and parallel Pollard rho attacks on the ECDLP.


Our aim is to solve the discrete logarithm problem $h = g^x$ in a cyclic finite abelian group $G$. Thus, assuming that we have a cyclic group $G = \langle g \rangle$ of prime order $p$, and that we are given some $h \in G$, we want to find the value of $x$ modulo $p$ such that $h = g^x$. The problem with the Baby-Step/Giant-Step method is that, although its run time complexity is $O(\sqrt{p})$, it also requires $O(\sqrt{p})$ space. Hence, we are interested in replacing the large space requirement with a much smaller one, while maintaining a time complexity of $O(\sqrt{p})$. This can be achieved with the following algorithms. [1]
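To make the setting concrete, here is a minimal Python sketch of such a DLP instance. The toy parameters (a 2039-element field, an order-1019 subgroup, generator 4 and the secret exponent) are our own illustrative choices, not anything from the text; the later sketches in this post reuse the same style of toy group.

```python
# A toy DLP instance h = g^x (illustrative parameters, not from the text).
p = 2039              # field prime, chosen so that q = (p - 1) // 2 is also prime
q = 1019              # prime order of the subgroup generated by g
g = 4                 # 4 = 2^2 generates the subgroup of order q
x = 777               # the secret exponent we pretend not to know
h = pow(g, x, p)      # the public value; the DLP asks for x given g and h

# Baby-Step/Giant-Step would store on the order of sqrt(q) ~ 32 group elements
# here; the methods below aim to keep the sqrt(q) running time without that storage.
print(g, h)
```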

1. Pollard’s Rho Algorithm.

Let $f : S \to S$ be a random mapping from a set $S$ of size $n$ to itself. For a random value $x_0 \in S$ we compute $x_{i+1} = f(x_i)$ for $i \ge 0$. Each step $x_{i+1} = f(x_i)$ is a deterministic function of the current position $x_i$. The values $x_0, x_1, x_2, \ldots$ are considered as a deterministic random walk.

Since $S$ is finite we will eventually obtain $x_i = x_j$, and thus $x_{i+1} = f(x_i) = f(x_j) = x_{j+1}$. Hence, the sequence $x_0, x_1, x_2, \ldots$ will eventually become cyclic ("rho" shape: ρ). Our goal is to find a collision in a random mapping like the one above, i.e. to find two values $x_i$ and $x_j$ with $i \neq j$ such that $x_i = x_j$.

To find a collision we use Floyd's cycle-finding algorithm: given $(x_1, x_2)$ we compute $(x_2, x_4)$, then $(x_3, x_6)$ and so on, i.e. given the pair $(x_i, x_{2i})$ we compute $(x_{i+1}, x_{2i+2}) = (f(x_i), f(f(x_{2i})))$, and we stop when we find $x_m = x_{2m}$. The expected value of $m$ is $O(\sqrt{n})$.
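Here is a minimal Python sketch of Floyd's cycle finding; the particular mapping $f$ and the set size $n$ are illustrative assumptions, not taken from the text.

```python
# Floyd's cycle finding on a walk x_{i+1} = f(x_i) over a finite set
# S = {0, ..., n-1}; the mapping f below is just an illustrative choice.
def floyd(f, x0):
    """Return (m, x_m) with x_m == x_{2m}."""
    tortoise, hare = f(x0), f(f(x0))    # the pair (x_1, x_2)
    m = 1
    while tortoise != hare:
        tortoise = f(tortoise)          # x_{m+1}
        hare = f(f(hare))               # x_{2m+2}
        m += 1
    return m, tortoise

n = 10007                               # size of the set S

def f(x):
    return (x * x + 1) % n              # a "random-looking" mapping on S

print(floyd(f, 2))                      # m is expected to be O(sqrt(n))
```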

For the discrete logarithm problem we partition the group G into three sets $S_1, S_2, S_3$, where we assume $1 \notin S_2$, and define the following random walk on the group G:

$x_{i+1} = f(x_i) = h \cdot x_i$ when $x_i \in S_1$,
$x_{i+1} = f(x_i) = x_i^2$ when $x_i \in S_2$,
$x_{i+1} = f(x_i) = g \cdot x_i$ when $x_i \in S_3$.

We actually keep track of the triple $(x_i, a_i, b_i)$, where

$a_{i+1} = a_i$ and $b_{i+1} = b_i + 1 \pmod p$ when $x_i \in S_1$,
$a_{i+1} = 2 a_i \pmod p$ and $b_{i+1} = 2 b_i \pmod p$ when $x_i \in S_2$,
$a_{i+1} = a_i + 1 \pmod p$ and $b_{i+1} = b_i$ when $x_i \in S_3$.
 
Starting with the triple $(x_0, a_0, b_0) = (1, 0, 0)$, for all $i$ we have $\log_g(x_i) = a_i + b_i \log_g(h) = a_i + b_i x$. Applying Floyd's algorithm we obtain a collision, i.e. a value of $m$ such that $x_m = x_{2m}$. This means that $a_m + b_m x \equiv a_{2m} + b_{2m} x \pmod p$, i.e. $(b_m - b_{2m}) x \equiv a_{2m} - a_m \pmod p$, and if $b_m \neq b_{2m} \pmod p$ we obtain $x = \frac{a_{2m} - a_m}{b_m - b_{2m}} \pmod p$.

Assuming that the sequence $x_0, x_1, x_2, \ldots$ is produced by a random mapping from G to itself, the above algorithm will find the discrete logarithm in expected time $O(\sqrt{p})$.
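Putting the walk and the collision step together, the following Python sketch implements Pollard's rho for the discrete logarithm on the same kind of toy group as above. The parameters, the choice of partition (by residue mod 3) and the random restart on failure are our own illustrative choices; note that in the code the group order is called q.

```python
# Pollard's rho for the DLP (toy parameters, illustrative only).
import random

p, q, g = 2039, 1019, 4            # field prime, group order, generator
x_secret = 777                     # the exponent we pretend not to know
h = pow(g, x_secret, p)

def step(y, a, b):
    """One walk step: (x_i, a_i, b_i) -> (x_{i+1}, a_{i+1}, b_{i+1})."""
    if y % 3 == 1:                 # y in S1: multiply by h, b increases
        return (y * h) % p, a, (b + 1) % q
    elif y % 3 == 0:               # y in S2: square (note 1 is not in S2)
        return (y * y) % p, (2 * a) % q, (2 * b) % q
    else:                          # y in S3: multiply by g, a increases
        return (y * g) % p, (a + 1) % q, b

def pollard_rho():
    while True:                    # retry from a fresh random start if the solve fails
        a0, b0 = random.randrange(q), random.randrange(q)
        tort = (pow(g, a0, p) * pow(h, b0, p) % p, a0, b0)
        hare = tort
        while True:
            tort = step(*tort)             # position m of the walk
            hare = step(*step(*hare))      # position 2m of the walk
            if tort[0] == hare[0]:
                break
        am, bm = tort[1], tort[2]
        a2m, b2m = hare[1], hare[2]
        if (bm - b2m) % q != 0:
            # x = (a_2m - a_m) / (b_m - b_2m) mod q  (Python 3.8+ modular inverse)
            return (a2m - am) * pow(bm - b2m, -1, q) % q

print(pollard_rho(), x_secret)     # both numbers should be 777
```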


2. Pollard’s Kangaroo Method.

Pollard’s Kangaroo method is like the rho method, but it is particularly tuned to the situation where we know that the discrete logarithm lies in a given interval, $x \in [a, \ldots, b]$.

Let $w = b - a$ be the length of the interval in which the discrete logarithm $x$ is known to lie. We define a set $\{s_0, \ldots, s_{k-1}\}$ of integers in non-decreasing order, whose mean $m$ should be around $N = \sqrt{w}$. We usually choose $s_i = 2^i$ for $0 \le i < k$ (so the mean of the set is $m = \frac{2^k}{k}$), with $k \approx \frac{1}{2} \log_2(w)$. The group is divided into $k$ sets $S_i$, for $i = 0, \ldots, k-1$, and we define the deterministic random walk $x_{i+1} = x_i \cdot g^{s_j}$ if $x_i \in S_j$.

We compute the deterministic random walk, starting from $g_0 = g^b$, by setting $g_i = g_{i-1} \cdot g^{s_j}$ for $i = 1, \ldots, N$, where $S_j$ is the set containing $g_{i-1}$. We also set $c_0 = b$ and $c_{i+1} = c_i + s_j \pmod q$, where $q$ denotes the group order. We store $g_N$ and notice that we have computed the discrete logarithm of $g_N$ with respect to $g$, which is $c_N = \log_g(g_N)$.

Now we compute a second deterministic random walk, starting from the unknown point $x$ in the interval. We set $h_0 = h = g^x$ and compute $h_{i+1} = h_i \cdot g^{s_j}$. We also set $d_0 = 0$ and $d_{i+1} = d_i + s_j \pmod q$. Notice that we have $\log_g(h_i) = x + d_i$.

Hence, if the path of the $h_i$ meets the path of the $g_i$, then the $h_i$ will carry on along the path of the $g_i$. We will then be able to find a value $M$ for which $h_M$ equals our stored point $g_N$.

Thus, we will have $c_N = \log_g(g_N) = \log_g(h_M) = x + d_M$, and the solution to our discrete logarithm problem is given by $x = c_N - d_M \pmod q$.

If we do not get a collision, then we can increase $N$ and continue both walks in a similar manner until a collision does occur. The expected running time of this method is $O(\sqrt{w})$, and the storage can be seen to be constant.
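Below is a hedged Python sketch of the kangaroo method on the same style of toy group; the interval, the jump set and the give-up condition are illustrative choices rather than the text's prescriptions.

```python
# Pollard's kangaroo method (toy parameters, illustrative only).
from math import isqrt, log2

p, q, g = 2039, 1019, 4            # field prime, group order, generator
a, b = 700, 900                    # the interval [a, b] that x is promised to lie in
x_secret = 777
h = pow(g, x_secret, p)

w = b - a                          # interval length
k = max(1, round(log2(w) / 2))     # number of jump sizes, k ~ (1/2) log2(w)
jumps = [2**i for i in range(k)]   # jump set s_i = 2^i

def jump(y):
    return jumps[y % k]            # partition of the group by residue mod k

# Tame kangaroo: N steps starting from g^b, carrying its known exponent c.
N = 4 * isqrt(w) + 1
yT, c = pow(g, b, p), b
for _ in range(N):
    s = jump(yT)
    yT, c = (yT * pow(g, s, p)) % p, c + s
trap, trap_log = yT, c             # we know log_g(trap) = trap_log (mod q)

# Wild kangaroo: starts at h = g^x, carrying the distance d it has travelled.
yW, d = h, 0
while yW != trap and a + d <= trap_log:   # stop once it must have passed the trap
    s = jump(yW)
    yW, d = (yW * pow(g, s, p)) % p, d + s

if yW == trap:
    print("recovered x =", (trap_log - d) % q, "  true x =", x_secret)
else:
    print("no collision; extend the tame walk (increase N) and continue")
```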


3. Parallel Pollard’s Rho Method.


When we use random-walk-based techniques for solving discrete logarithm problems, we often use a parallel version of Pollard's rho. Assuming that we are given the discrete logarithm problem $h = g^x$ in a group G of prime order $q$, we first decide on an easily computable function $H : G \to \{1, \ldots, k\}$ ($k$ is usually around 20) and then we define a set of multipliers $m_i$. These are produced by generating random integers $a_i, b_i \in [0, \ldots, q-1]$ and then setting $m_i = g^{a_i} h^{b_i}$.

To start a deterministic random walk we randomly pick $s_0, t_0 \in [0, \ldots, q-1]$ and compute $g_0 = g^{s_0} h^{t_0}$. The deterministic random walk is then defined on the triples $(g_i, s_i, t_i)$, where $g_{i+1} = g_i \cdot m_{H(g_i)}$, $s_{i+1} = s_i + a_{H(g_i)} \pmod q$ and $t_{i+1} = t_i + b_{H(g_i)} \pmod q$.

Hence, for every $g_i$ we record the values of $s_i$ and $t_i$ such that $g_i = g^{s_i} h^{t_i}$.

If we assume that we have $m$ processors, then each processor can start a different deterministic random walk from a different starting position, using the same algorithm to determine the next element in the walk. When two processors (or the same processor) hit an element of the group that has been seen before, we obtain the equation $g^{s_i} h^{t_i} = g^{s_j} h^{t_j}$, from which the discrete logarithm $x$ can be solved (provided $t_i \neq t_j \pmod q$).

We expect that after $O(\sqrt{\pi q/2}/m)$ iterations of these parallel walks a collision will be found and the discrete logarithm problem will be solved. However, this requires each processor to return every element of its computed deterministic random walk to a central server, which then stores all the computed elements. This is highly inefficient due to the large storage requirement, namely $O(\sqrt{\pi q/2})$ group elements.

However, the storage can be reduced to any required value as follows. We define a function $d$ on the group, $d : G \to \{0, 1\}$, such that $d(g) = 1$ around $1/2^t$ of the time. The function $d$ is often defined by setting $d(g) = 1$ if, for example, a certain subset of $t$ of the bits representing $g$ are all zero. The elements of G for which $d(g) = 1$ are called distinguished.

Only the distinguished group elements are now transmitted back to the central server, which means that we expect the deterministic random walks to continue for about another $2^t$ steps before a collision between two of them is detected. Hence, the computing time becomes $O(\sqrt{\pi q/2}/m + 2^t)$ and the storage becomes $O(\sqrt{\pi q/2}/2^t)$. Thus, storage can be reduced to any manageable amount, at the expense of a little extra computation.
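The following Python sketch simulates the parallel method with distinguished points by interleaving a few walks in one process; the toy parameters, the index function H and the distinguishing rule (low bits of the representative equal to zero) are illustrative assumptions rather than the text's prescriptions.

```python
# Parallel Pollard rho with distinguished points, with the "processors"
# simulated by interleaving a few walks (toy parameters, illustrative only).
import random

p, q, g = 2039, 1019, 4                            # field prime, group order, generator
x_secret = 777
h = pow(g, x_secret, p)

def parallel_rho(num_walks=4, k=20, t_bits=3, max_steps=100_000):
    while True:                                    # retry with fresh multipliers if unlucky
        # multipliers m_i = g^{a_i} h^{b_i} for random a_i, b_i
        mults = []
        for _ in range(k):
            ai, bi = random.randrange(q), random.randrange(q)
            mults.append((pow(g, ai, p) * pow(h, bi, p) % p, ai, bi))

        def H(y):
            return y % k                           # index function G -> {0, ..., k-1}

        def distinguished(y):
            return y % (1 << t_bits) == 0          # true about 1/2^t of the time

        server = {}                                # central store: point -> (s, t)
        walks = []                                 # each walk is a triple (g^s h^t, s, t)
        for _ in range(num_walks):
            s, t = random.randrange(q), random.randrange(q)
            walks.append((pow(g, s, p) * pow(h, t, p) % p, s, t))

        for _ in range(max_steps):
            for i, (y, s, t) in enumerate(walks):
                m_y, a, b = mults[H(y)]            # one step of this walk
                y, s, t = (y * m_y) % p, (s + a) % q, (t + b) % q
                walks[i] = (y, s, t)
                if distinguished(y):               # only these go to the server
                    if y in server and server[y][1] != t:
                        s2, t2 = server[y]         # g^s h^t == g^{s2} h^{t2}
                        return (s2 - s) * pow(t - t2, -1, q) % q
                    server[y] = (s, t)

print(parallel_rho(), x_secret)                    # both numbers should be 777
```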

[1] http://www.cs.bris.ac.uk/~nigel/Crypto_Book/book.ps (pages 208 - 214)
