Montgomery modular multiplication

The computation of $xy \; \text{mod} \; p$ is much slower than usual multiplication, since it requires a division to know how many times $p$ has to be subtracted from the product.

Montgomery modular multiplication (opens new window) is a way to do this computation faster. The trick is to move the computations to the Montgomery space which contains the integers smaller than $p 2^m$ for some positive integer $m$ . We move the element $a$ to the Montgomery space by multiplying it by $R = 2^m - p$ .

Let’s say we need to compute

$(a + b) \; \text{mod} \; p$

We move the computation to the Montgomery space:

$(a R + b R) \; \text{mod} \; p$

We compute $a R + b R,$ then we need to divide it by $R$ , and finally we need to subtract $p$ if the result is bigger than $p$ (we know that the result is smaller than $2p$ because $a < p, b < p$ ).

Well, this is more complicated than simply computing $a + b$ and then subtracting $p$ if the result is bigger than $p$ . However, it is not much more complicated if we do it the right way. Instead of computing $(a R + b R) / R$ simply as integers and in the last step computing modulo $p$ (subtracting $p$ if necessary), we can much more efficiently compute:

$(a R + b R) / R \; \text{mod} \; p$

Let’s see how.

Efficient division by R modulo p

Note that

$R = 2^m \; \text{mod} \; p$

That means

$(a R + b R) / R = (aR + bR) / 2^m \; \text{mod} \; p$

For computers, dividing by $2^m$ means almost no work; it’s just a shift operation. Given that $x$ is divisible by $2^m$ , division by $2^m$ means shifting by $m$ positions:

$x / 2^m = x >> m$

Of course, $aR + bR$ might not be divisible by $2^m$ . But in $\mathbb{Z}_p$ we have some liberty to make an arbitrary $x$ divisible by $2^m$ .

We just need to find $k \in \mathbb{Z}$ such that $x + k p$ is divisible by $2^m$ .

Computing $x/2^m$ in $\mathbb{Z_p}$ means computing the inverse of $2^m$ in $\mathbb{Z_p}$ (which is the same as the inverse of $R$ in $\mathbb{Z_p}$ ) – let’s denote this inverse by $R^{-1}$ – and then computing $x R^{-1}$ :

$x / 2^m = x R^{-1} \; \text{mod} \; p$

For any $x$ it holds:

$x R^{-1} = (x + k p) R^{-1} \; \text{mod} \; p$

But once we have $k \in \mathbb{Z}$ such that $x + k p$ is divisible by $2^m$ , we can compute $x R^{-1}$ mod $p$ simply in integers as $(x + kp) / 2^m$ . So we can compute $x R^{-1}$ mod $p$ very efficiently simply by shifting $x + kp$ by $m$ positions:

$x R^{-1} \; \text{mod} \; p = (x + kp) >> m$

We will see in the example section how to compute $k$ (and why $(x + kp) >> m$ is smaller than $2p$ , which makes it easy to make it smaller than $p$ ).

Multiplication

As we saw above, addition by moving to the Montgomery space and back is not much more expensive than without moving it there and back, but the real reason we are moving to the Montgomery space is multiplication.

The computation of $ab$ mod $p$ is not nearly as simple as the computation of $a + b$ mod $p$ where in the worst-case scenario we need to do an additional subtraction of $p$ . This is why we move $a, b$ to the Montgomery space. We get $x, y \in \mathbb{Z}_p$ such that:

$x = aR \; \text{mod} \; p$

$y = bR \; \text{mod} \; p$

We compute $xy$ and reduce it back to the Montgomery space by dividing it by $R$ . Note that:

$xy = aR bR \; \text{mod} \; p$

So by dividing $xy$ by $R$ we get:

$abR \; \text{mod} \; p$

So we have a Montgomery representation of $ab$ . To compute the final value $ab$ , we again use the efficient division by $R$ .

Note that this pays off when we have multiple operations. We move the elements to the Montgomery space, execute all the operations in the Montgomery space, and as the last step move the elements back from the Montgomery space.

Example

For demonstration purposes, let’s have: $m = 2^{64}, p = 2^{64} - 257, R = 257.$

The division of $a$ by $R$ would go as follows.

First, we compute:

$\mu = -p^{-1} \; \text{mod} \; 2^{64}$

Note that for some $l \in \mathbb{Z}:$

$\mu p = -(l 2^{64} + 1)$

We then compute:

$(\mu (a \; \text{mod} \; 2^{64}) \; \text{mod} \; 2^{64}) \cdot p + a$

Let’s observe the expression above modulo $2^{64}$ :

$(\mu p (a \; \text{mod} \; 2^{64}) \; \text{mod} \; 2^{64}) + a \; \text{mod} \; 2^{64} =$

$= -(l 2^{64} + 1) (a \; \text{mod} \; 2^{64}) + a \; \text{mod} \; 2^{64} =$

$= -1 (a \; \text{mod} \; 2^{64}) + a \; \text{mod} \; 2^{64} = 0$

That means we can divide

$(\mu (a \; \text{mod} \; 2^{64}) \; \text{mod} \; 2^{64}) \cdot p + a$

by $2^{64}$ using the shift operation.

So we found

$k = (\mu (a \; \text{mod} \; 2^{64}) \; \text{mod} \; 2^{64})$

that we were looking for in the Efficient division section above.

We can compute

$x = ((\mu (a \; \text{mod} \; 2^{64}) \; \text{mod} \; 2^{64}) \cdot p + a) / 2^{64}$

Because $a < p2^{64}$ (this is our Montgomery space) and

$(\mu (a \; \text{mod} \; 2^{64}) \; \text{mod} \; 2^{64}) \cdot p < p 2^{64}$

we have

$x < 2p$

If $x \geq p$ , then we compute

$x = x - p$

Finaly, we have $x$ such that:

$x = aR \; \text{mod} \; p$

← Roots in finite fields Univariate Coppersmith algorithm →