Description
This exercise set is based on number theory, so it involves more math than programming.
Exercise 1.9. Suppose we run Algorithm ExtEuclid on input (117, 67). Show the steps of the
computation by giving the data corresponding to that shown in the table in Example 1.2.
Exercise 1.10. Show that if a ≥ b > 0, then the values s and t computed by ExtEuclid(a, b)
satisfy |s| ≤ b/d and |t| ≤ a/d, where d := gcd(a, b).
Hint: prove by induction on b; be careful, you have to stop the induction before b gets to zero,
so the last step to consider is when b | a.
Exercise 1.11. Suppose that a, b are integers with d := gcd(a, b) ≠ 0.
(a) Show that a/d and b/d are relatively prime.
(b) Show that if s and t are integers such that as + bt = d, then s and t are relatively prime.
Exercise 1.12. Let n be an integer. Show that if a, b are relatively prime integers, each of which
divides n, then ab divides n.
Exercise 2.7. Let n be a positive integer. For x ∈ {0, . . . , 2^n - 1}, let x̃ denote the integer
obtained by inverting the bits in the n-bit binary representation of x (note that x̃ ∈ {0, . . . , 2^n - 1}).
Show that x̃ + 1 ≡ -x (mod 2^n). This justifies the usual rule for computing negatives in
2's complement arithmetic.
Hint: what is x + x̃?
Exercise 2.11. For each of the following congruences, determine all the integer solutions z ∈
{0, . . . , 999}. To do this, you should first put the congruence in the form az ≡ b (mod 1000), as
we did in going from (2.2) to (2.3) in Example 2.3. Then follow the steps outlined in §2.2.3.
Show all your steps (but you may use a calculator to help with the multiplications). Use the
extended Euclidean algorithm to compute modular inverses, where necessary.
(a) 100z + 200 ≡ 93z + 171 (mod 1000)
(b) 115z + 130 ≡ 100z + 165 (mod 1000)
(c) 115z + 132 ≡ 100z + 140 (mod 1000)
(d) 119z + 132 ≡ 113z + 140 (mod 1000)
Exercise 2.25. Calculate the order of 9 mod 100. Show your work by building a table of powers
9^i mod 100. In addition, from this table, identify the multiplicative inverse of 9 mod 100.
Exercise 2.30. Compute 3^99 mod 100 using the repeated squaring algorithm. Show your work.
Exercise 3.3. Consider the field F = Z_5. Let f = X^3 + X + [1] ∈ F[X], g = [2]X^2 + [3]X + [4] ∈ F[X],
and h = [3]X^2 + [2]X + [1] ∈ F[X]. Compute (gh) mod f. Show your work. That is, you should
compute the product polynomial P = gh ∈ F[X], and then use polynomial division with
remainder to compute P mod f. Along the way, the coefficients of any polynomials you
compute should be reduced mod 5.
Exercise 3.4. Consider the field F = Z_5. Use the Lagrange interpolation formula to compute the
coefficients of the polynomial g ∈ F[X] of degree less than 3 such that g([1]) = [3], g([2]) = [4],
g([3]) = [1].
Show your work.
A Number Theory Primer:
What Every Computer Scientist Should Know about
Number Theory
(v0.14)
Victor Shoup
Chapter 1
Basic properties of the integers
This chapter discusses some of the basic properties of the integers, including the notions of divisibility and primality, unique factorization into primes, greatest common divisors, and least common
multiples.
1.1 Divisibility and primality
A central concept in number theory is divisibility.
Consider the integers Z = {. . . , -2, -1, 0, 1, 2, . . .}. For a, b ∈ Z, we say that a divides b if
az = b for some z ∈ Z. If a divides b, we write a | b, and we may say that a is a divisor of b, or
that b is a multiple of a, or that b is divisible by a. If a does not divide b, then we write a ∤ b.
We first state some simple facts about divisibility:
Theorem 1.1. For all a, b, c ∈ Z, we have
(i) a | a, 1 | a, and a | 0;
(ii) 0 | a if and only if a = 0;
(iii) a | b if and only if -a | b if and only if a | -b;
(iv) a | b and a | c implies a | (b + c);
(v) a | b and b | c implies a | c.
Proof. These properties can be easily derived from the definition of divisibility, using elementary
algebraic properties of the integers. For example, a | a because we can write a · 1 = a; 1 | a because
we can write 1 · a = a; a | 0 because we can write a · 0 = 0. We leave it as an easy exercise for the
reader to verify the remaining properties.
We make a simple observation. Suppose a | b and b ≠ 0. Then we have 1 ≤ |a| ≤ |b|. To see
this, suppose az = b ≠ 0 for some integer z. Then a ≠ 0 and z ≠ 0, and it follows that |a| ≥ 1 and
|z| ≥ 1. We conclude that |a| ≤ |az| = |b|.
Using this observation, we may prove the following:
Theorem 1.2. For all a, b ∈ Z, we have a | b and b | a if and only if a = ±b. In particular, for
every a ∈ Z, we have a | 1 if and only if a = ±1.
Proof. Clearly, if a = ±b, then a | b and b | a. So let us assume that a | b and b | a, and prove that
a = ±b. If either of a or b is zero, then the other must be zero as well. So assume that neither is
zero. By the above observation, a | b implies |a| ≤ |b|, and b | a implies |b| ≤ |a|; thus, |a| = |b|, and
so a = ±b. That proves the first statement. The second statement follows from the first by setting
b := 1, and noting that 1 | a.
The product of any two nonzero integers is again nonzero. This implies the usual cancellation
law: if a, b, and c are integers such that a ≠ 0 and ab = ac, then we must have b = c; indeed,
ab = ac implies a(b - c) = 0, and so a ≠ 0 implies b - c = 0, and hence b = c.
1.2 Division with remainder
One of the most important facts about the integers is the following fact regarding division with
remainder. This fact underpins many other properties of the integers that we shall derive later,
including the fact that every positive integer can be expressed uniquely as the product of primes.
Theorem 1.3 (Division with remainder property). Let a, b ∈ Z with b > 0. Then there exist
unique q, r ∈ Z such that a = bq + r and 0 ≤ r < b.
Proof. Consider the set S of nonnegative integers of the form a - bt with t ∈ Z. This set is
clearly nonempty; indeed, if a ≥ 0, set t := 0, and if a < 0, set t := a. Since every nonempty
set of nonnegative integers contains a minimum, we define r to be the smallest element of S. By
definition, r is of the form r = a - bq for some q ∈ Z, and r ≥ 0. Also, we must have r < b, since
otherwise, r - b would be an element of S smaller than r, contradicting the minimality of r; indeed,
if r ≥ b, then we would have 0 ≤ r - b = a - b(q + 1).
That proves the existence of r and q. For uniqueness, suppose that a = bq + r and a = bq' + r',
where 0 ≤ r < b and 0 ≤ r' < b. Then subtracting these two equations and rearranging terms, we
obtain

    r' - r = b(q - q').

Thus, r' - r is a multiple of b; however, 0 ≤ r < b and 0 ≤ r' < b implies |r' - r| < b; therefore, the
only possibility is r' - r = 0. Moreover, 0 = b(q - q') and b ≠ 0 implies q - q' = 0.
Theorem 1.3 can be visualized on a number line marked with the multiples 0, b, 2b, 3b, . . . of b:
starting with a, we subtract (or add, if a is negative) the value b until we end up with a number
in the interval [0, b); that number is r.
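As a quick sanity check, Python's built-in divmod computes exactly this quotient and remainder when the divisor is positive (a minimal sketch; the use of Python here is our choice, not part of the text):

```python
# Division with remainder (Theorem 1.3): for b > 0, divmod(a, b)
# returns the unique pair (q, r) with a == b*q + r and 0 <= r < b.
b = 5
for a in (17, -17, 0, 100):
    q, r = divmod(a, b)
    assert a == b * q + r and 0 <= r < b
    print(f"{a} = {b}*{q} + {r}")
```

For a = -17 and b = 5, for instance, this yields q = -4 and r = 3, in agreement with the theorem.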
Floors and ceilings. Let us briefly recall the usual floor and ceiling functions, denoted ⌊·⌋ and
⌈·⌉, respectively. These are functions from R (the real numbers) to Z. For x ∈ R, ⌊x⌋ is the greatest
integer m ≤ x; equivalently, ⌊x⌋ is the unique integer m such that m ≤ x < m + 1, or put another
way, such that x = m + ε for some ε ∈ [0, 1). Also, ⌈x⌉ is the smallest integer m ≥ x; equivalently,
⌈x⌉ is the unique integer m such that m - 1 < x ≤ m, or put another way, such that x = m - ε for
some ε ∈ [0, 1).
The mod operator. Now let a, b ∈ Z with b > 0. If q and r are the unique integers from
Theorem 1.3 that satisfy a = bq + r and 0 ≤ r < b, we define

    a mod b := r;

that is, a mod b denotes the remainder in dividing a by b. It is clear that b | a if and only if
a mod b = 0. Dividing both sides of the equation a = bq + r by b, we obtain a/b = q + r/b. Since
q ∈ Z and r/b ∈ [0, 1), we see that q = ⌊a/b⌋. Thus,

    (a mod b) = a - b⌊a/b⌋.

One can use this equation to extend the definition of a mod b to all integers a and b, with b ≠ 0;
that is, for b < 0, we simply define a mod b to be a - b⌊a/b⌋.
Theorem 1.3 may be generalized so that when dividing an integer a by a positive integer b,
the remainder is placed in an interval other than [0, b). Let x be any real number, and consider
the interval [x, x + b). As the reader may easily verify, this interval contains precisely b integers,
namely, ⌈x⌉, . . . , ⌈x⌉ + b - 1. Applying Theorem 1.3 with a - ⌈x⌉ in place of a, we obtain:
Theorem 1.4. Let a, b ∈ Z with b > 0, and let x ∈ R. Then there exist unique q, r ∈ Z such
that a = bq + r and r ∈ [x, x + b).
Div and mod in programming languages. Our definitions of division and remainder are
similar to, but not exactly equivalent to, the corresponding operators used in modern programming
languages. Typically, in such languages, an expression such as "5/3" evaluates to ⌊5/3⌋ = 1;
similarly, "5%3" evaluates to 5 mod 3 = 2. However, most programming languages treat negative
numbers differently than we do here. For example, the evaluation of "(-5)/3" would discard the
fractional part of (-5)/3, yielding -1, rather than ⌊(-5)/3⌋ = -2. Also, "(-5)%3" would evaluate
to (-5) - (-1) · 3 = -2, rather than (-5) - (-2) · 3 = 1.
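The contrast can be checked directly. The sketch below uses Python, whose // and % operators happen to follow the floored convention of the text, while math.trunc mimics the C-style round-toward-zero behavior described above (the variable names are ours):

```python
import math

a, b = -5, 3

# Floored division, as in the text: q = floor(a/b), r = a - b*q.
q_floor = math.floor(a / b)        # -2
r_floor = a - b * q_floor          # 1, always in [0, b) for b > 0

# Truncated division, as in C/C++/Java: quotient rounds toward zero.
q_trunc = math.trunc(a / b)        # -1
r_trunc = a - b * q_trunc          # -2, same sign as a

# Python's own // and % use the floored convention.
assert (a // b, a % b) == (q_floor, r_floor)
print(q_floor, r_floor, q_trunc, r_trunc)
```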
Exercise 1.1. Let a, b, d ∈ Z with d ≠ 0. Show that a | b if and only if da | db.
Exercise 1.2. Let m be a positive integer. Show that for every real number x ≥ 1, the number
of multiples of m in the interval [1, x] is ⌊x/m⌋; in particular, for every integer n ≥ 1, the number
of multiples of m among 1, . . . , n is ⌊n/m⌋.
Exercise 1.3. Let x ∈ R. Show that 2⌊x⌋ ≤ ⌊2x⌋ ≤ 2⌊x⌋ + 1.
Exercise 1.4. Let x ∈ R and n ∈ Z with n > 0. Show that ⌊⌊x⌋/n⌋ = ⌊x/n⌋; in particular,
⌊⌊a/b⌋/c⌋ = ⌊a/(bc)⌋ for all positive integers a, b, c.
Exercise 1.5. Let a, b ∈ Z with b < 0. Show that (a mod b) ∈ (b, 0].
Exercise 1.6. Show that Theorem 1.4 also holds for the interval (x, x + b]. Does it hold in general
for the intervals [x, x + b] or (x, x + b)?
1.3 Greatest common divisors
For a, b ∈ Z, we call d ∈ Z a common divisor of a and b if d | a and d | b; moreover, we call such
a d a greatest common divisor of a and b if d is nonnegative and is divisible by every common
divisor of a and b.
The reader may note that we have not defined the greatest common divisor to be the numerically
largest common divisor, as is sometimes done. Rather, we have defined it in terms of the divisibility
relation: it is "greatest" in the sense that it is divisible by every other common divisor. Our
definition is actually a stronger property, and is considered to be the more "mathematically correct"
one, in that it generalizes to other algebraic domains. Unfortunately, it is not immediately obvious
from the definition that greatest common divisors even exist, and if they do, that they are uniquely
defined. We shall, however, show below that for every a, b ∈ Z, there exists a unique greatest
common divisor of a and b, which we will denote by gcd(a, b).
We begin by showing that the greatest common divisor of a and b, if it exists, must be unique.
To this end, suppose d and d' are greatest common divisors. Then by definition, since d is a greatest
common divisor and d' is a common divisor, we must have d' | d. Similarly, we must have d | d'.
Therefore, by Theorem 1.2, we have d = ±d'. By definition, greatest common divisors must be
nonnegative, and so we conclude that d = d'. That proves uniqueness. The reader will notice that
in the uniqueness proof, we made essential use of the fact that greatest common divisors must be
nonnegative; in fact, this is precisely the reason why this requirement is included in our definition.
1.3.1 The existence of greatest common divisors and Euclid's algorithm
We shall now prove the existence of gcd(a, b). The proof, in fact, gives us an efficient algorithm to
compute gcd(a, b). The algorithm is the well-known Euclidean algorithm.
Without loss of generality, we may assume that a ≥ b ≥ 0. We prove by induction on b that
gcd(a, b) exists.
The base case is b = 0. We claim that gcd(a, 0) = a. To see this, observe that by part (i) of
Theorem 1.1, every integer is a divisor of 0. It follows that d is a common divisor of a and 0 if and
only if it is a divisor of a. Certainly, a is a divisor of itself and it is nonnegative, and from this, it
is clear that a satisfies the definition of gcd(a, 0).
Everything in the above paragraph holds even if a = 0. In particular, gcd(0, 0) = 0.
So now assume that b > 0. We also assume our induction hypothesis, which states that gcd(a', b')
exists for all integers a', b' with a' ≥ b' ≥ 0 and b' < b. If we divide a by b, we obtain the quotient
q := ⌊a/b⌋ and the remainder r := a mod b, where 0 ≤ r < b. By our induction hypothesis, applied
to a' := b and b' := r, we see that gcd(b, r) exists. From the equation

    a = bq + r,

it is easy to see that if an integer divides both b and r, then it also divides a; likewise, if an integer
divides a and b, then it also divides r. It follows that the set of common divisors of a and b is
equal to the set of common divisors of b and r. Moreover, as gcd(b, r) is the unique nonnegative
common divisor that is divisible by each of these common divisors, and these common divisors are
the same as the common divisors of a and b, we see that gcd(a, b) exists and is equal to gcd(b, r).
That finishes the proof of the existence of gcd(a, b). This proof suggests the following recursive
algorithm to actually compute it:
Algorithm Euclid(a, b): On input a, b, where a and b are integers such that a ≥ b ≥ 0, compute
gcd(a, b) as follows:
if b = 0
then return a
else return Euclid(b, a mod b)
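For readers who want to experiment, the algorithm translates almost verbatim into Python (a sketch; the function name euclid is our choice):

```python
def euclid(a: int, b: int) -> int:
    """Compute gcd(a, b) for integers a >= b >= 0 (Algorithm Euclid)."""
    assert a >= b >= 0
    if b == 0:
        return a
    # One division step, then recurse on (b, a mod b).
    return euclid(b, a % b)

print(euclid(100, 35))  # 5
```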
Example 1.1. Suppose a = 100 and b = 35. We compute the numbers (a_i, b_i) that are the inputs
to the ith recursive call to Euclid:

      i |   0    1    2    3
    a_i | 100   35   30    5
    b_i |  35   30    5    0

So we have gcd(a, b) = a_3 = 5.
To actually implement this algorithm on a computer, one has to consider the size of the integers
a and b. If they are small enough to fit into a single machine "word" (typically 64 or 32 bits), then
there is no issue. Otherwise, one needs to represent these large integers as vectors of machine
words, and implement all of the basic arithmetic operations (addition, subtraction, multiplication,
and division) in software. Some programming languages implement these operations as part of a
standard library, while others do not.
In these notes, we will not worry about how these basic arithmetic operations are implemented.
Rather, we will focus our attention on how many division steps are performed by Euclid's algorithm.
Observe that when called on input (a, b), with b > 0, the algorithm performs one division step,
and then calls itself recursively on input (a', b'), where a' := b and b' := a mod b. In particular,
b > b'. So in every recursive call, the second argument decreases by at least 1. It follows that
the algorithm performs at most b division steps in total. However, it turns out that the algorithm
terminates much faster than this:
Theorem 1.5. On input (a, b), Euclid's algorithm performs O(log b) division steps.
Proof. As observed above, if b > 0, the algorithm performs one division step, and then calls itself
recursively on input (a', b'), where a' := b and b' := a mod b, so that b > b'. If b' > 0, the
algorithm performs another division step, and calls itself again on input (a'', b''), where a'' := b'
and b'' := b mod b', so that b' > b''. Consider the quotient q' := ⌊b/b'⌋. Since b > b', we have q' ≥ 1.
Moreover, since b = b'q' + b'' and b' > b'', we have

    b = b'q' + b'' ≥ b' + b'' > 2b''.

This shows that after two division steps, the second argument to Euclid is less than b/2, and
so, after 2k division steps, it is less than b/2^k. So if we choose k large enough so that b/2^k ≤ 1, we
can be sure that the number of division steps is at most 2k. Setting k := ⌈log_2 b⌉ does the job.
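The bound can be checked empirically. The sketch below uses an iterative step counter of our own (equivalent to the recursion in the text) and verifies the 2⌈log_2 b⌉ bound on a few inputs:

```python
import math

def euclid_steps(a: int, b: int) -> int:
    """Count the division steps performed by Euclid's algorithm on (a, b)."""
    steps = 0
    while b > 0:
        a, b = b, a % b   # one division step
        steps += 1
    return steps

# After every two division steps the second argument at least halves,
# so for b >= 2 the number of steps is at most 2*ceil(log2(b)).
for a, b in [(100, 35), (89, 55), (10**9 + 7, 998244353)]:
    s = euclid_steps(a, b)
    assert s <= 2 * math.ceil(math.log2(b))
    print(a, b, s)
```

The pair (89, 55) of consecutive Fibonacci numbers is close to the worst case for its size.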
1.3.2 Bezout's Lemma and the extended Euclidean algorithm
For integers a and b, we have seen not only that gcd(a, b) exists and is uniquely defined, but also
that we can efficiently compute it. Next, we prove the following remarkable fact, which will have
numerous applications.
Theorem 1.6 (Bezout's Lemma). Let a, b ∈ Z and d := gcd(a, b).
(i) We have

    as + bt = d for some s, t ∈ Z.

(ii) For every d* ∈ Z, we have

    d | d* ⇐⇒ as* + bt* = d* for some s*, t* ∈ Z.
We first observe that part (ii) of Bezout's Lemma follows easily from part (i). Let d* ∈ Z
be given. On the one hand, if d | d*, then dz = d* for some integer z, and s* := zs and t* := zt,
where as + bt = d, do the job. On the other hand, if as* + bt* = d*, then since d | a and d | b, we
must have d | (as* + bt*) = d*.
For part (i) of Bezout's Lemma, we can again assume without loss of generality that
a ≥ b ≥ 0. We prove the statement by induction on b.
The base case is b = 0. In this case, d = a, and so, if we set s := 1 and t := 0, then as + bt = d,
as required.
So assume b > 0. We also assume our induction hypothesis, which states that for all integers
a', b' with a' ≥ b' ≥ 0 and b' < b, there exist integers s' and t' with a's' + b't' = d', where
d' := gcd(a', b'). We divide a by b, obtaining the quotient q and remainder r, where

    a = bq + r and 0 ≤ r < b.

Applying our induction hypothesis to a' := b and b' := r, since gcd(a, b) = d = gcd(b, r), we see
there exist s', t' such that

    bs' + rt' = d.

If we substitute r by a - bq in this equation, and rearrange terms, we obtain

    at' + b(s' - qt') = d.

Thus, s := t' and t := s' - qt' do the job.
That completes the proof of part (i) of Bezout's Lemma. The proof naturally suggests the
following algorithm, which is called the extended Euclidean algorithm:
Algorithm ExtEuclid(a, b): On input a, b, where a and b are integers such that a ≥ b ≥ 0, compute
(d, s, t), where d = gcd(a, b) and s and t are integers such that as + bt = d, as follows:
if b = 0 then
d ← a, s ← 1, t ← 0
else
q ← ⌊a/b⌋, r ← a mod b
(d, s', t') ← ExtEuclid(b, r)
s ← t', t ← s' - qt'
return (d, s, t)
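As with Euclid, the algorithm carries over directly to Python (a sketch; the function name ext_euclid is ours):

```python
def ext_euclid(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, s, t) with d = gcd(a, b) and a*s + b*t == d,
    for integers a >= b >= 0 (Algorithm ExtEuclid)."""
    assert a >= b >= 0
    if b == 0:
        return a, 1, 0
    q, r = a // b, a % b
    d, s1, t1 = ext_euclid(b, r)
    # Unwind one level of the recursion: s := t', t := s' - q*t'.
    return d, t1, s1 - q * t1

d, s, t = ext_euclid(100, 35)
print(d, s, t)                    # 5 -1 3
assert 100 * s + 35 * t == d
```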
Example 1.2. Continuing with Example 1.1, we compute, in addition, the numbers s_i and t_i
returned by the ith recursive call to ExtEuclid, along with the corresponding quotient q_i = ⌊a_i/b_i⌋:

      i |   0    1    2    3
    a_i | 100   35   30    5
    b_i |  35   30    5    0
    s_i |  -1    1    0    1
    t_i |   3   -1    1    0
    q_i |   2    1    6

The rule being applied here is s_i := t_{i+1} and t_i := s_{i+1} - q_i t_{i+1}. So we have s = s_0 = -1 and
t = t_0 = 3. One can verify that as + bt = 100 · (-1) + 35 · 3 = 5 = d.
For a, b ∈ Z, we say that a and b are relatively prime if gcd(a, b) = 1, which is the same
as saying that the only common divisors of a and b are ±1. One may also say that a and b are
coprime, which means the same thing as saying that they are relatively prime. Applying part (ii)
of Bezout's Lemma with d* := 1, we immediately obtain the following:
Theorem 1.7 (Corollary to Bezout's Lemma). Let a, b ∈ Z. Then we have:

    a and b are relatively prime ⇐⇒ as + bt = 1 for some s, t ∈ Z.
This fact allows us to easily prove the following result, which will be useful in the next section,
where we prove the unique factorization property for integers.
Theorem 1.8 (Coprime Lemma). Let a, b, c be integers such that c divides ab, and a and c are
relatively prime. Then c must divide b.
Proof. Suppose that c | ab and gcd(a, c) = 1. By Theorem 1.7, we have as + ct = 1 for some
s, t ∈ Z. Multiplying this equation by b, we obtain

    abs + cbt = b.    (1.1)

Since c divides ab by hypothesis, and since c clearly divides cbt, it follows that c divides the left-hand
side of (1.1), and hence that c divides b.
Exercise 1.7. Let a and b be integers, where either a ≠ 0 or b ≠ 0. Let d := gcd(a, b). Show that:
(a) d > 0;
(b) d is the numerically largest common divisor of a and b;
(c) d is the smallest positive integer that can be expressed as as + bt for some integers s and t.
Exercise 1.8. Show that for all integers a, b, c, we have:
(a) gcd(a, b) = gcd(b, a);
(b) gcd(a, b) = |a| ⇐⇒ a | b;
(c) gcd(a, 1) = 1;
(d) gcd(ca, cb) = |c| gcd(a, b).
Exercise 1.9. Suppose we run Algorithm ExtEuclid on input (117, 67). Show the steps of the
computation by giving the data corresponding to that shown in the table in Example 1.2.
Exercise 1.10. Show that if a ≥ b > 0, then the values s and t computed by ExtEuclid(a, b)
satisfy

    |s| ≤ b/d and |t| ≤ a/d, where d := gcd(a, b).

Hint: prove by induction on b; be careful, you have to stop the induction before b gets to zero, so
the last step to consider is when b | a.
Exercise 1.11. Suppose that a, b are integers with d := gcd(a, b) ≠ 0.
(a) Show that a/d and b/d are relatively prime.
(b) Show that if s and t are integers such that as + bt = d, then s and t are relatively prime.
Exercise 1.12. Let n be an integer. Show that if a, b are relatively prime integers, each of which
divides n, then ab divides n.
Exercise 1.13. Let a, b be positive integers satisfying gcd(a, b) = 1. Show that there is a number
N, depending on a and b, such that for all integers c ≥ N, we can write c = as + bt for nonnegative
integers s, t.
1.4 Unique factorization into primes
Let n be a positive integer. Trivially, 1 and n divide n. If n > 1 and no other positive integers
besides 1 and n divide n, then we say n is prime. If n > 1 but n is not prime, then we say that
n is composite. The number 1 is not considered to be either prime or composite. Evidently, n is
composite if and only if n = ab for some integers a, b with 1 < a < n and 1 < b < n. The first few
primes are
2, 3, 5, 7, 11, 13, 17, . . . .
While it is possible to extend the definition of prime and composite to negative integers, we shall
not do so in this text: whenever we speak of a prime or composite number, we mean a positive
integer.
A basic fact is that every nonzero integer can be expressed as a signed product of primes in an
essentially unique way. More precisely:
Theorem 1.9 (Fundamental theorem of arithmetic). Every nonzero integer n can be expressed as

    n = ±p_1^{e_1} · · · p_r^{e_r},

where p_1, . . . , p_r are distinct primes and e_1, . . . , e_r are positive integers. Moreover, this expression
is unique, up to a reordering of the primes.
Note that if n = ±1 in the above theorem, then r = 0, and the product of zero terms is
interpreted (as usual) as 1.
The theorem intuitively says that the primes act as the "building blocks" out of which all
nonzero integers can be formed by multiplication (and negation). The reader may be so familiar with
this fact that he may feel it is somehow "self-evident," requiring no proof; however, this feeling is
simply a delusion, and most of the rest of this section and the next are devoted to developing a
proof of this theorem. We shall give a quite leisurely proof, introducing a number of other very
important tools and concepts along the way that will be useful later.
To prove Theorem 1.9, we may clearly assume that n is positive, since otherwise, we may
multiply n by -1 and reduce to the case where n is positive.
The proof of the existence part of Theorem 1.9 is easy. This amounts to showing that every
positive integer n can be expressed as a product (possibly empty) of primes. We may prove this by
induction on n. If n = 1, the statement is true, as n is the product of zero primes. Now let n > 1,
and assume that every positive integer smaller than n can be expressed as a product of primes. If
n is a prime, then the statement is true, as n is the product of one prime. Assume, then, that n is
composite, so that there exist a, b ∈ Z with 1 < a < n, 1 < b < n, and n = ab. By the induction
hypothesis, both a and b can be expressed as a product of primes, and so the same holds for n.
The uniqueness part of Theorem 1.9 is a bit more challenging. However, using the results we
have proven so far on greatest common divisors, the task is not so difficult.
Suppose that p is a prime and a is any integer. As the only divisors of p are ±1 and ±p, we
have

    p | a =⇒ gcd(a, p) = p, and
    p ∤ a =⇒ gcd(a, p) = 1.
Combining this observation with Theorem 1.8, we have:
Theorem 1.10 (Euclid's Lemma). Let p be prime, and let a, b ∈ Z. Then p | ab implies
that p | a or p | b.
Proof. Assume that p | ab. If p | a, we are done, so assume that p ∤ a. By the above observation,
gcd(a, p) = 1, and so by Theorem 1.8, we have p | b.
An obvious corollary to Theorem 1.10 is that if a_1, . . . , a_k are integers, and if p is a prime that
divides the product a_1 · · · a_k, then p | a_i for some i = 1, . . . , k. This is easily proved by induction
on k. For k = 1, the statement is trivially true. Now let k > 1, and assume that the statement holds
for k - 1. Then by Theorem 1.10, either p | a_1 or p | a_2 · · · a_k; if p | a_1, we are done; otherwise, by
induction, p divides one of a_2, . . . , a_k.
Finishing the proof of Theorem 1.9. We are now in a position to prove the uniqueness part of
Theorem 1.9, which we can state as follows: if p_1, . . . , p_r are primes (not necessarily distinct), and
q_1, . . . , q_s are primes (also not necessarily distinct), such that

    p_1 · · · p_r = q_1 · · · q_s,    (1.2)

then (p_1, . . . , p_r) is just a reordering of (q_1, . . . , q_s). We may prove this by induction on r. If r = 0,
we must have s = 0 and we are done. Now suppose r > 0, and that the statement holds for r - 1.
Since r > 0, we clearly must have s > 0. Also, as p_1 obviously divides the left-hand side of (1.2),
it must also divide the right-hand side of (1.2); that is, p_1 | q_1 · · · q_s. It follows from (the corollary
to) Theorem 1.10 that p_1 | q_j for some j = 1, . . . , s, and moreover, since q_j is prime, we must have
p_1 = q_j. Thus, we may cancel p_1 from the left-hand side of (1.2) and q_j from the right-hand side of
(1.2), and the statement now follows from the induction hypothesis. That proves the uniqueness
part of Theorem 1.9.
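Theorem 1.9 can also be explored computationally. The following trial-division sketch (a helper of our own, adequate only for small n) recovers the prime factorization whose uniqueness was just proved:

```python
def factor(n: int) -> dict[int, int]:
    """Return the prime factorization of n > 0 as a map {p: e}, by trial division."""
    assert n > 0
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                       # whatever remains is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors

print(factor(360))   # {2: 3, 3: 2, 5: 1}, i.e. 360 = 2^3 * 3^2 * 5
```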
Exercise 1.14. Let n be a composite integer. Show that there exists a prime p dividing n, with
p ≤ n^{1/2}.
Exercise 1.15. Show that two integers are relatively prime if and only if there is no one prime
that divides both of them.
Exercise 1.16. Let p be a prime and k an integer, with 0 < k < p. Show that the binomial
coefficient

    (p choose k) = p!/(k!(p - k)!),

which is an integer, is divisible by p.
Exercise 1.17. An integer a is called squarefree if it is not divisible by the square of any integer
greater than 1. Show that:
(a) a is squarefree if and only if a = ±p_1 · · · p_r, where the p_i's are distinct primes;
(b) every positive integer n can be expressed uniquely as n = ab^2, where a and b are positive
integers, and a is squarefree.
1.5 Some consequences of unique factorization
The following theorem is a consequence of just the existence part of Theorem 1.9:
Theorem 1.11. There are infinitely many primes.
Proof. By way of contradiction, suppose that there were only finitely many primes; call them
p_1, . . . , p_k. Then set M := p_1 · · · p_k and N := M + 1. Consider a prime p that divides N. There
must be at least one such prime p, since N ≥ 2, and every positive integer can be written as a
product of primes. Clearly, p cannot equal any of the p_i's, since if it did, then p would divide
M, and hence also divide N - M = 1, which is impossible. Therefore, the prime p is not among
p_1, . . . , p_k, which contradicts our assumption that these are the only primes.
For each prime p, we may define the function ν_p, mapping nonzero integers to nonnegative
integers, as follows: for every integer n ≠ 0, if n = p^e m, where p ∤ m, then ν_p(n) := e. We may
then write the factorization of n into primes as

    n = ± ∏_p p^{ν_p(n)},

where the product is over all primes p; although syntactically this is an infinite product, all but
finitely many of its terms are equal to 1, and so this expression makes sense.
Observe that if a and b are nonzero integers, then

    ν_p(a · b) = ν_p(a) + ν_p(b) for all primes p,    (1.3)

and

    a | b ⇐⇒ ν_p(a) ≤ ν_p(b) for all primes p.    (1.4)

From this, it is clear that

    gcd(a, b) = ∏_p p^{min(ν_p(a), ν_p(b))}.
Least common multiples. For a, b ∈ Z, a common multiple of a and b is an integer m such
that a | m and b | m; moreover, such an m is the least common multiple of a and b if m is
nonnegative and m divides all common multiples of a and b. It is easy to see that the least common
multiple exists and is unique, and we denote the least common multiple of a and b by lcm(a, b).
Indeed, for all a, b ∈ Z, if either a or b is zero, the only common multiple of a and b is 0, and so
lcm(a, b) = 0; otherwise, if neither a nor b is zero, we have

    lcm(a, b) = ∏_p p^{max(ν_p(a), ν_p(b))},

or equivalently, lcm(a, b) may be characterized as the smallest positive integer divisible by both a
and b.
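The ν_p characterizations of gcd and lcm can be checked against Euclid's algorithm. The sketch below (our own helpers, with a trial-division factorizer suitable only for small positive inputs) computes both quantities from the min and max of prime exponents:

```python
import math

def factor(n: int) -> dict[int, int]:
    """Prime factorization of n > 0 by trial division: {p: nu_p(n)}."""
    f: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            f[p] = f.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def gcd_lcm_via_nu(a: int, b: int) -> tuple[int, int]:
    """gcd and lcm of positive a, b from min/max of the exponents nu_p."""
    fa, fb = factor(a), factor(b)
    primes = set(fa) | set(fb)
    g = math.prod(p ** min(fa.get(p, 0), fb.get(p, 0)) for p in primes)
    l = math.prod(p ** max(fa.get(p, 0), fb.get(p, 0)) for p in primes)
    return g, l

a, b = 84, 90            # 84 = 2^2 * 3 * 7, 90 = 2 * 3^2 * 5
g, l = gcd_lcm_via_nu(a, b)
assert g == math.gcd(a, b) and g * l == a * b
print(g, l)              # 6 1260
```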
It is convenient to extend the domain of definition of ν_p to include 0, defining ν_p(0) := ∞. If
we interpret expressions involving "∞" appropriately,[1] then for arbitrary a, b ∈ Z, both (1.3) and
(1.4) hold, and in addition,

    ν_p(gcd(a, b)) = min(ν_p(a), ν_p(b)) and ν_p(lcm(a, b)) = max(ν_p(a), ν_p(b))

for all primes p.

[1] The interpretation given to such expressions should be obvious: for example, for every x ∈ R, we have -∞ <
x < ∞, x + ∞ = ∞, x - ∞ = -∞, ∞ + ∞ = ∞, and (-∞) + (-∞) = -∞. Expressions such as x · (±∞) also make
sense, provided x ≠ 0. However, the expressions ∞ - ∞ and 0 · ∞ have no sensible interpretation.
Generalizing gcd's and lcm's to many integers. It is easy to generalize the notions of greatest
common divisor and least common multiple from two integers to many integers. Let a_1, . . . , a_k be
integers. We call d ∈ Z a common divisor of a_1, . . . , a_k if d | a_i for i = 1, . . . , k; moreover, we
call such a d the greatest common divisor of a_1, . . . , a_k if d is nonnegative and all other common
divisors of a_1, . . . , a_k divide d. The greatest common divisor of a_1, . . . , a_k is denoted gcd(a_1, . . . , a_k)
and is the unique nonnegative integer d satisfying
ν_p(d) = min(ν_p(a_1), . . . , ν_p(a_k)) for all primes p.
Analogously, we call m ∈ Z a common multiple of a_1, . . . , a_k if a_i | m for i = 1, . . . , k; moreover,
such an m is called the least common multiple of a_1, . . . , a_k if m divides all common multiples of
a_1, . . . , a_k. The least common multiple of a_1, . . . , a_k is denoted lcm(a_1, . . . , a_k) and is the unique
nonnegative integer m satisfying
ν_p(m) = max(ν_p(a_1), . . . , ν_p(a_k)) for all primes p.
Finally, we say that the family {a_i}_{i=1}^k is pairwise relatively prime if for all indices i, j with
i ≠ j, we have gcd(a_i, a_j) = 1. Certainly, if {a_i}_{i=1}^k is pairwise relatively prime, and k > 1, then
gcd(a_1, . . . , a_k) = 1; however, gcd(a_1, . . . , a_k) = 1 does not imply that {a_i}_{i=1}^k is pairwise relatively
prime.
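The distinction between gcd(a_1, . . . , a_k) = 1 and pairwise relative primality can be checked mechanically. The following Python sketch folds the two-argument gcd to compute the many-argument gcd (the iterative rule of Exercise 1.21); the helper names `gcd_many` and `pairwise_relatively_prime` are ours, not from the text.

```python
from math import gcd
from itertools import combinations

def gcd_many(*nums):
    """gcd of several integers, via the iterated two-argument gcd."""
    d = 0
    for a in nums:
        d = gcd(d, a)
    return d

def pairwise_relatively_prime(nums):
    """True iff gcd(a_i, a_j) = 1 for every pair of indices i != j."""
    return all(gcd(a, b) == 1 for a, b in combinations(nums, 2))

# gcd = 1 does not imply pairwise relative primality:
print(gcd_many(6, 10, 15))                     # 1
print(pairwise_relatively_prime([6, 10, 15]))  # False
```

The triple (6, 10, 15) is the standard witness: every pair shares a prime, yet no prime divides all three.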
Rational numbers. Consider the rational numbers Q = {a/b : a, b ∈ Z, b ≠ 0}. Given any
rational number a/b, if we set d := gcd(a, b), and define the integers a′ := a/d and b′ := b/d, then
we have a/b = a′/b′ and gcd(a′, b′) = 1. Moreover, if a_1/b_1 = a′/b′, then we have a_1 b′ = a′ b_1,
and so b′ | a′ b_1; also, since gcd(a′, b′) = 1, we see that b′ | b_1; writing b_1 = b′c, we see that
a_1 = a′c. Thus, we can represent every rational number as a fraction in lowest terms, which
means a fraction of the form a′/b′ where a′ and b′ are relatively prime; moreover, the values of
a′ and b′ are uniquely determined up to sign, and every other fraction that represents the same
rational number is of the form a′c/b′c, for some nonzero integer c.
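The reduction to lowest terms described above is just a single division by the gcd; a minimal Python sketch (the function name `lowest_terms` is ours):

```python
from math import gcd

def lowest_terms(a, b):
    """Reduce a/b to lowest terms a'/b' with gcd(a', b') = 1."""
    if b == 0:
        raise ValueError("denominator must be nonzero")
    d = gcd(a, b)  # math.gcd returns a nonnegative value
    return a // d, b // d

print(lowest_terms(84, 132))   # (7, 11), since gcd(84, 132) = 12
```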
Exercise 1.18. Let n be an integer. Generalizing Exercise 1.12, show that if {a_i}_{i=1}^k is a pairwise
relatively prime family of integers, where each a_i divides n, then their product ∏_{i=1}^k a_i also divides n.
Exercise 1.19. Show that for all integers a, b, c, we have:
(a) lcm(a, b) = lcm(b, a);
(b) lcm(a, b) = a ⇐⇒ b | a;
(c) lcm(a, a) = lcm(a, 1) = a;
(d) lcm(ca, cb) = c lcm(a, b).
Exercise 1.20. Show that for all integers a, b, we have:
(a) gcd(a, b) · lcm(a, b) = ab;
(b) gcd(a, b) = 1 =⇒ lcm(a, b) = ab.
Exercise 1.21. Let a_1, . . . , a_k ∈ Z with k > 1. Show that:
gcd(a_1, . . . , a_k) = gcd(a_1, gcd(a_2, . . . , a_k)) = gcd(gcd(a_1, . . . , a_{k−1}), a_k);
lcm(a_1, . . . , a_k) = lcm(a_1, lcm(a_2, . . . , a_k)) = lcm(lcm(a_1, . . . , a_{k−1}), a_k).
Exercise 1.22. Let a_1, . . . , a_k ∈ Z with d := gcd(a_1, . . . , a_k). Show that there exist integers
s_1, . . . , s_k such that d = a_1 s_1 + · · · + a_k s_k.
Exercise 1.23. Show that if {a_i}_{i=1}^k is a pairwise relatively prime family of integers, then
lcm(a_1, . . . , a_k) = a_1 · · · a_k.
Exercise 1.24. Show that every nonzero x ∈ Q can be expressed as
x = ±p_1^{e_1} · · · p_r^{e_r},
where the p_i's are distinct primes and the e_i's are nonzero integers, and that this expression is
unique up to a reordering of the primes.
Exercise 1.25. Let n and k be positive integers, and suppose x ∈ Q satisfies x^k = n. Show that
x ∈ Z. In other words, the kth root of n is either an integer or is irrational.
Exercise 1.26. Show that gcd(a + b, lcm(a, b)) = gcd(a, b) for all a, b ∈ Z.
Exercise 1.27. Show that for every positive integer k, there exist k consecutive composite integers.
Thus, there are arbitrarily large gaps between primes.
Exercise 1.28. Let p be a prime. Show that for all a, b ∈ Z, we have ν_p(a + b) ≥ min{ν_p(a), ν_p(b)},
and ν_p(a + b) = ν_p(a) if ν_p(a) < ν_p(b).
Exercise 1.29. For a given prime p, we may extend the domain of definition of ν_p from Z to Q:
for nonzero integers a, b, let us define ν_p(a/b) := ν_p(a) − ν_p(b). Show that:
(a) this definition of ν_p(a/b) is unambiguous, in the sense that it does not depend on the particular
choice of a and b;
(b) for all x, y ∈ Q, we have ν_p(xy) = ν_p(x) + ν_p(y);
(c) for all x, y ∈ Q, we have ν_p(x + y) ≥ min{ν_p(x), ν_p(y)}, and ν_p(x + y) = ν_p(x) if ν_p(x) < ν_p(y);
(d) for all nonzero x ∈ Q, we have x = ± ∏_p p^{ν_p(x)}, where the product is over all primes, and all
but a finite number of terms in the product are equal to 1;
(e) for all x ∈ Q, we have x ∈ Z if and only if ν_p(x) ≥ 0 for all primes p.
Exercise 1.30. Let n be a positive integer, and let 2^k be the highest power of 2 in the set
S := {1, . . . , n}. Show that 2^k does not divide any other element in S.
Exercise 1.31. Let n ∈ Z with n > 1. Show that ∑_{i=1}^n 1/i is not an integer.
Exercise 1.32. Let n be a positive integer, and let C_n denote the number of pairs of integers (a, b)
with a, b ∈ {1, . . . , n} and gcd(a, b) = 1, and let F_n be the number of distinct rational numbers a/b,
where 0 ≤ a < b ≤ n.
(a) Show that F_n = (C_n + 1)/2.
(b) Show that C_n ≥ n²/4. Hint: first show that C_n ≥ n²(1 − ∑_{d≥2} 1/d²), and then show that
∑_{d≥2} 1/d² ≤ 3/4.
Chapter 2
Congruences
This chapter introduces the basic properties of congruences modulo n, along with the related notion
of residue classes modulo n. Other items discussed include the Chinese remainder theorem and
Fermat's little theorem.
2.1 Definitions and basic properties of congruences
Let n be a positive integer. For integers a and b, we say that a is congruent to b modulo n if
n | (a − b), and we write a ≡ b (mod n). If n ∤ (a − b), then we write a ≢ b (mod n). Equivalently,
a ≡ b (mod n) if and only if a = b + ny for some y ∈ Z. The relation a ≡ b (mod n) is called a
congruence relation, or simply, a congruence. The number n appearing in such congruences is
called the modulus of the congruence. Note that a ≡ 0 (mod n) if and only if n divides a.
If we view the modulus n as fixed, then the following theorem says that the binary relation
"· ≡ · (mod n)" is an equivalence relation on the set Z.
Theorem 2.1 (Equivalence Property). Let n be a positive integer. For all a, b, c ∈ Z, we have:
(i) a ≡ a (mod n);
(ii) a ≡ b (mod n) implies b ≡ a (mod n);
(iii) a ≡ b (mod n) and b ≡ c (mod n) implies a ≡ c (mod n).
Proof. For (i), observe that n divides 0 = a − a. For (ii), observe that if n divides a − b, then it
also divides −(a − b) = b − a. For (iii), observe that if n divides a − b and b − c, then it also divides
(a − b) + (b − c) = a − c.
Another key property of congruences is that they are "compatible" with integer addition and
multiplication, in the following sense:
Theorem 2.2 (Compatibility Property). Let a, a′, b, b′, n ∈ Z with n > 0. If
a ≡ a′ (mod n) and b ≡ b′ (mod n),
then
a + b ≡ a′ + b′ (mod n) and a · b ≡ a′ · b′ (mod n).
Proof. Suppose that a ≡ a′ (mod n) and b ≡ b′ (mod n). This means that there exist integers x
and y such that a = a′ + nx and b = b′ + ny. Therefore,
a + b = a′ + b′ + n(x + y),
which proves the first congruence of the theorem, and
ab = (a′ + nx)(b′ + ny) = a′b′ + n(a′y + b′x + nxy),
which proves the second congruence.
Theorems 1.3 and 1.4 can be restated in terms of congruences (with a as given, and b := n):
Theorem 2.3. Let a, n ∈ Z with n > 0. Then there exists a unique integer r such that
a ≡ r (mod n) and 0 ≤ r < n, namely, r := a mod n. More generally, for every x ∈ R, there exists a
unique integer r ∈ [x, x + n) such that a ≡ r (mod n).
We also have:
Theorem 2.4. Let a, b, n ∈ Z with n > 0. Then we have:
a ≡ b (mod n) ⇐⇒ (a mod n) = (b mod n).    (2.1)
Proof. By the existence part of Theorem 2.3, we have a ≡ r (mod n) and b ≡ s (mod n), where
r := a mod n and s := b mod n. We want to show a ≡ b (mod n) ⇐⇒ r = s.
On the one hand, if r = s, then (using Theorem 2.1) we have a ≡ b (mod n). On the other
hand, if a ≡ b (mod n), then (again using Theorem 2.1), we have a ≡ s (mod n); moreover, by the
uniqueness part of Theorem 2.3, we have r = s.
Note that we are using the notation "mod" in two different ways here: on the left-hand side of
(2.1), as a part of a congruence relation (e.g., 17 ≡ −3 (mod 5)), and on the right-hand side, as a
binary operator (e.g., (17 mod 5) = 2 = (−3 mod 5)). The reader should be aware that these two
uses are similar, but not quite the same.
Theorems 2.1 and 2.2 are deceptively powerful: they allow one to work with congruence relations
modulo n much as one would with ordinary equalities. One can add to, subtract from,
or multiply both sides of a congruence modulo n by the same integer. Also, we have the following
substitution principle: roughly speaking, if a ≡ a′ (mod n), we can substitute a′ for a in
any arithmetic expression involving a, without changing the value of the expression mod n. More
precisely, suppose E(v) is any arithmetic expression in a variable v, built up from v and integer
constants using addition, subtraction, and multiplication operators; then a ≡ a′ (mod n) implies
E(a) ≡ E(a′) (mod n).
To see why this "substitution principle" works, consider the expression
E(v) := (v + 1)(v + 2).
Suppose a ≡ a′ (mod n). We want to show: E(a) ≡ E(a′) (mod n). Applying Theorem 2.2 three
times, we have:
a + 1 ≡ a′ + 1 (mod n)
a + 2 ≡ a′ + 2 (mod n)
(a + 1)(a + 2) ≡ (a′ + 1)(a′ + 2) (mod n).
This type of calculation generalizes to arbitrary expressions.
Example 2.1. We can use the above principles to greatly simplify modular computations. Suppose
we wish to compute 4 · 5 · 6 · 7 · 8 · 9 mod 17. We could do this by first computing a := 4 · 5 · 6 · 7 · 8 · 9,
and then dividing a by 17 to get a mod 17. However, a itself will be a rather large number, and
we can simplify the computation by reducing intermediate results mod 17 as we go. Observe that
4 · 5 = 20, and 20 ≡ 3 (mod 17). Therefore, using the above "substitution principle", we can replace
4 · 5 by 3 in the expression 4 · 5 · 6 · 7 · 8 · 9, without changing its value mod 17:
4 · 5 · 6 · 7 · 8 · 9 ≡ 3 · 6 · 7 · 8 · 9 (mod 17).
Applying this substitution strategy repeatedly, we can compute:
4 · 5 · 6 · 7 · 8 · 9 ≡ 3 · 6 · 7 · 8 · 9 (mod 17)   [since 4 · 5 = 20 ≡ 3 (mod 17)]
                ≡ 1 · 7 · 8 · 9 (mod 17)      [since 3 · 6 = 18 ≡ 1 (mod 17)]
                ≡ 5 · 9 (mod 17)              [since 7 · 8 = 56 ≡ 5 (mod 17)]
                ≡ 11 (mod 17)                 [since 5 · 9 = 45 ≡ 11 (mod 17)].
We conclude that 4 · 5 · 6 · 7 · 8 · 9 ≡ 11 (mod 17), or in other words, 4 · 5 · 6 · 7 · 8 · 9 mod 17 = 11.
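The same reduce-as-you-go strategy is how modular products are computed in practice, since intermediate results never grow beyond the modulus. A short Python sketch (the helper name `product_mod` is ours):

```python
def product_mod(factors, n):
    """Multiply a list of integers, reducing mod n after each step so
    that intermediate results stay small (the substitution principle)."""
    result = 1
    for f in factors:
        result = (result * f) % n
    return result

print(product_mod([4, 5, 6, 7, 8, 9], 17))  # 11
```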
Example 2.2. A convenient "rule of thumb" that one often uses to test for divisibility by 3 is to
add up the digits of a number, and test if the sum of digits is itself divisible by 3. For example,
if a = 25614, we add the digits of a, obtaining 2 + 5 + 6 + 1 + 4 = 18; since 18 is divisible by 3,
we conclude that a is divisible by 3 as well. We can justify this rule using congruences.
Let a be a positive integer whose base-10 representation is a = (a_{k−1} · · · a_1 a_0)_{10}, so
a = ∑_{i=0}^{k−1} a_i 10^i. Let b be the sum of the decimal digits of a; that is, let b := ∑_{i=0}^{k−1} a_i.
We will show that a ≡ b (mod 3). This will justify the divisibility-by-3 rule, since then we see that
a ≡ 0 (mod 3) (i.e., a is divisible by 3) if and only if b ≡ 0 (mod 3) (i.e., b is divisible by 3).
To show that a ≡ b (mod 3), we first observe that 10 ≡ 1 (mod 3). Then, we calculate
a = ∑_{i=0}^{k−1} a_i 10^i ≡ ∑_{i=0}^{k−1} a_i · 1 (mod 3).
Here, we have used the above "substitution principle": since 10 ≡ 1 (mod 3), we can substitute 1
for 10 in the above congruence mod 3. See also Exercises 2.5 and 2.6.
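The digit-sum rule is easy to check empirically; a small Python sketch (`digit_sum` is our own helper name):

```python
def digit_sum(a):
    """Sum of the base-10 digits of a positive integer a."""
    return sum(int(ch) for ch in str(a))

a = 25614
print(digit_sum(a))                # 18, i.e. 2 + 5 + 6 + 1 + 4
print(a % 3, digit_sum(a) % 3)     # 0 0: a and its digit sum agree mod 3
# The agreement a ≡ digit_sum(a) (mod 3) holds for every positive a:
print(all(x % 3 == digit_sum(x) % 3 for x in range(1, 10000)))  # True
```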
Example 2.3. Let us find the set of solutions z to the congruence
8z − 4 ≡ 5z − 2 (mod 7).    (2.2)
Suppose z is a solution to (2.2). If we add 4 and subtract 5z on both sides of (2.2), we see that z
is a solution to
3z ≡ 2 (mod 7).    (2.3)
Conversely, if z is a solution to (2.3), then by subtracting 4 and adding 5z on both sides of (2.3),
we see that z is a solution to (2.2). Thus, (2.2) and (2.3) have the same solution set.
Next, suppose z is a solution to (2.3). We would like to divide both sides of (2.3) by 3, to get
z by itself on the left-hand side. We cannot do this directly, but since 5 · 3 = 15 ≡ 1 (mod 7), we
can achieve the same effect by multiplying both sides of (2.3) by 5. If we do this, we obtain
15z ≡ 10 (mod 7).    (2.4)
Now, we apply the above "substitution principle": since 15 ≡ 1 (mod 7), we can substitute 1 for
15 in (2.4). Likewise, since 10 ≡ 3 (mod 7), we can substitute 3 for 10. Making both of these
substitutions, we obtain
z ≡ 3 (mod 7).    (2.5)
Thus, any solution z to (2.3) is a solution to (2.5). Conversely, if z is a solution to (2.5), then by
multiplying both sides of (2.5) by 3, we get 3z ≡ 9 (mod 7), and since 9 ≡ 2 (mod 7), we see that
z is a solution to (2.3). Thus, (2.2), (2.3), and (2.5) all have the same solution set.
Therefore, z is a solution to (2.2) if and only if z ≡ 3 (mod 7). That is, the solutions to (2.2)
are precisely those integers that are congruent to 3 modulo 7, which we can list as follows:
. . . , −18, −11, −4, 3, 10, 17, 24, . . .
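For a small modulus like 7, the solution set of Example 2.3 can be confirmed by simply trying every residue; a brute-force Python sketch (the function name is ours):

```python
def solve_congruence_bruteforce(lhs, rhs, n):
    """All z in {0, ..., n-1} with lhs(z) ≡ rhs(z) (mod n)."""
    return [z for z in range(n) if (lhs(z) - rhs(z)) % n == 0]

# 8z - 4 ≡ 5z - 2 (mod 7): the solutions are exactly z ≡ 3 (mod 7)
print(solve_congruence_bruteforce(lambda z: 8*z - 4,
                                  lambda z: 5*z - 2, 7))  # [3]
```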
In the next section, we shall give a systematic treatment of the problem of solving linear
congruences, such as the one appearing in the previous example.
2.1.1 Application: computer arithmetic
Computers store integers as words, which are binary strings of fixed length. Congruences allow one
to better understand how arithmetic on these words actually works.
Suppose we have a very minimalistic computer with just 4-bit words. On such a machine,
the type unsigned in the C language would represent the integers 0, . . . , 15, encoded in binary as
follows:
bits:   0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
value:     0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
When one adds or subtracts two such 4-bit integers, the result is always a 4-bit integer: the
hardware just ignores any carries out of or borrows into the high-order bit position. For example, if
we add the numbers 9 and 13, we get 22, which in binary is the 5-bit number 10110. The computer
hardware will just throw away the leftmost bit, resulting in the 4-bit number 0110, which is the
binary representation of 6. Thus, on this machine, if we add 9 and 13, we get 6 instead of 22. Notice
that while 6 is not equal to 22, it is congruent to 22 modulo 16, i.e., 6 ≡ 22 (mod 16). This is no
coincidence. When the hardware throws away the high-order bit, this is the same as subtracting
off the value 16.
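Discarding the high-order carry is the same as masking the result down to n bits, which in turn is the same as reducing mod 2^n. A Python sketch simulating the 4-bit machine (`add_unsigned` is our own name):

```python
def add_unsigned(x, y, bits=4):
    """n-bit unsigned addition: keep only the low n bits of the sum,
    which is exactly reduction mod 2**bits."""
    return (x + y) & ((1 << bits) - 1)

print(add_unsigned(9, 13))   # 6, since 9 + 13 = 22 ≡ 6 (mod 16)
print((9 + 13) % 16)         # 6 as well: masking and reducing agree
```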
In general, if our computer has n-bit words, then such words can represent the numbers
0, . . . , 2^n − 1, encoded in binary as n-bit strings. Moreover, unsigned arithmetic (addition,
subtraction, and multiplication) is really just arithmetic mod 2^n.
As for signed arithmetic, if the machine uses 2's complement arithmetic (as is the case for
essentially all computers used today), then nothing really changes. Here again is the table for
encoding 4-bit integers, where we also include the 2's complement encoding of signed integers:
bits:      0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
unsigned:     0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
signed:       0    1    2    3    4    5    6    7   −8   −7   −6   −5   −4   −3   −2   −1
Note that 15 ≡ −1 (mod 16), and 14 ≡ −2 (mod 16), and so on. This is also not a coincidence.
More generally, for n-bit 2's complement arithmetic, for each x ∈ [1 .. 2^{n−1}], we encode the negative
integer −x as the binary encoding of the positive integer (−x) mod 2^n. In particular, −1 is encoded
as the binary encoding of (−1) mod 2^n = 2^n − 1, which is the bit string 11 · · · 1 of all 1's. Addition,
subtraction, and multiplication of n-bit 2's complement integers is carried out using exactly the
same hardware as for n-bit unsigned integers: all results are just computed mod 2^n.
Consider again our 4-bit machine, and suppose the type int represents the integers
−8, −7, . . . , 6, 7 in 2's complement. Suppose that a, b, c, d are variables of type int whose values are
a = 3, b = 4, c = 4, d = 5,
and our C program computes
int e = a*b - c*d;
Technically speaking, the evaluation of a*b - c*d overflows and the result is not well defined.
Nevertheless, it is more likely than not that the program actually assigns to e the correct value,
which is −8 = 3 · 4 − 4 · 5. The reason is that the hardware will compute the result mod 16, and since
the correct result itself fits in the range −8, −7, . . . , 6, 7, the computed result and the correct result
must match.¹
See Exercise 2.7 for more on 2's complement arithmetic.
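The argument above can be simulated directly: compute the word the hardware would hold (the result mod 16), then reinterpret it as a signed 2's complement value. A Python sketch (the helper name `to_signed` is ours):

```python
def to_signed(x, bits=4):
    """Interpret an n-bit word x in {0, ..., 2**n - 1} as a 2's
    complement signed integer in {-2**(n-1), ..., 2**(n-1) - 1}."""
    half = 1 << (bits - 1)
    return x - (1 << bits) if x >= half else x

# Simulate the C expression e = a*b - c*d on the 4-bit machine:
a, b, c, d = 3, 4, 4, 5
word = (a*b - c*d) % 16   # the hardware computes the result mod 2**4
print(word)               # 8, the stored bit pattern 1000
print(to_signed(word))    # -8, the mathematically correct value
```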
Exercise 2.1. Let a, b, n ∈ Z with n > 0. Show that a ≡ b (mod n) if and only if (a mod n) =
(b mod n).
Exercise 2.2. Let a, b, n, n′ ∈ Z with n > 0, n′ > 0, and n′ | n. Show that if a ≡ b (mod n), then
a ≡ b (mod n′).
Exercise 2.3. Let a, b, n, n′ ∈ Z with n > 0, n′ > 0, and gcd(n, n′) = 1. Show that if a ≡ b (mod n)
and a ≡ b (mod n′), then a ≡ b (mod nn′).
Exercise 2.4. Let a, b, n ∈ Z with n > 0 and a ≡ b (mod n). Show that gcd(a, n) = gcd(b, n).
Exercise 2.5. Consider again Example 2.2. Instead of using the "substitution principle", prove
that
(a_{k−1} · · · a_1 a_0)_{10} ≡ a_{k−1} + · · · + a_1 + a_0 (mod 3)
directly, using Theorem 2.2 and a proof by induction on k.
Hint: use the fact that for k > 1, we have
(a_{k−1} · · · a_1 a_0)_{10} = 10 · (a_{k−1} · · · a_1)_{10} + a_0.
Note: the "substitution principle" itself can be proved by a similar induction argument.
Exercise 2.6. Analogous to Example 2.2, formulate and justify a simple "rule of thumb" for
testing divisibility by 11.
¹ Warning: an optimizing compiler may invalidate our assumption that arithmetic is computed modulo 2^n.
However, such aggressively optimizing compilers have caused problems in legacy code, and so some compromises
have been made; for example, the -fwrapv option of gcc ensures arithmetic is always carried out mod 2^n.
Exercise 2.7. Let n be a positive integer. For x ∈ {0, . . . , 2^n − 1}, let x̃ denote the integer obtained
by inverting the bits in the n-bit, binary representation of x (note that x̃ ∈ {0, . . . , 2^n − 1}). Show
that x̃ + 1 ≡ −x (mod 2^n). This justifies the usual rule for computing negatives in 2's complement
arithmetic.
Hint: what is x + x̃?
Exercise 2.8. Show that the equation 7y³ + 2 = z³ has no solutions y, z ∈ Z.
Exercise 2.9. Show that there are 14 distinct, possible, yearly (Gregorian) calendars, and show
that all 14 calendars actually occur.
2.2 Solving linear congruences
In this section, we consider the general problem of solving linear congruences. More precisely, for
a given positive integer n, and arbitrary integers a and b, we wish to determine the set of integers
z that satisfy the congruence
az ≡ b (mod n).    (2.6)
2.2.1 Existence of solutions
We begin with a theorem that characterizes when (2.6) has any solutions at all.
Theorem 2.5. Let a, b, n ∈ Z with n > 0, and let d := gcd(a, n). Then (2.6) has a solution z ∈ Z
if and only if d | b.
Proof. We have
az ≡ b (mod n) for some z ∈ Z
⇐⇒ az = b + ny for some z, y ∈ Z (by definition of congruence)
⇐⇒ az − ny = b for some z, y ∈ Z
⇐⇒ d | b (by Bezout's Lemma).
2.2.2 Uniqueness of solutions: cancellation and modular inverses
Now suppose z satisfies the congruence (2.6). Clearly, all integers z′ ≡ z (mod n) also satisfy (2.6).
The question we next address is: are there any other solutions? The next theorem says that the
answer is "no", provided gcd(a, n) = 1. In other words, if a and n are relatively prime, then (2.6)
has a unique solution modulo n.
Theorem 2.6. Let a, b, n ∈ Z with n > 0. Assume that gcd(a, n) = 1. For any solutions z and z′
to (2.6), we have z ≡ z′ (mod n).
Proof. We have
az ≡ b (mod n) and az′ ≡ b (mod n)
=⇒ az ≡ az′ (mod n)
=⇒ a(z − z′) ≡ 0 (mod n)
=⇒ n | a(z − z′)
=⇒ n | (z − z′) (by Theorem 1.8)
=⇒ z ≡ z′ (mod n).
Combining Theorems 2.5 and 2.6, we have:
Theorem 2.7. Let a, b, n ∈ Z with n > 0. Assume that gcd(a, n) = 1. The congruence (2.6) has
a unique solution modulo n; that is, there is a unique z ∈ {0, . . . , n − 1} such that az ≡ b (mod n).
A cancellation law. Let a, n ∈ Z with n > 0. The proof of Theorem 2.6 gives us a cancellation
law for congruences:
if gcd(a, n) = 1 and az ≡ az′ (mod n), then z ≡ z′ (mod n).
Modular inverses. Again, let a, n ∈ Z with n > 0. We say that z ∈ Z is a multiplicative
inverse of a modulo n if az ≡ 1 (mod n). Theorem 2.5 says that a has a multiplicative inverse
modulo n if and only if gcd(a, n) = 1. Moreover, Theorem 2.6 says that the multiplicative inverse
of a, if it exists, is uniquely determined modulo n; that is, if z and z′ are multiplicative inverses of
a modulo n, then z ≡ z′ (mod n).
Notation: We write a^{−1} mod n to denote the unique multiplicative inverse of a modulo n that
lies in the interval {0, . . . , n − 1}.
Example 2.4. As we saw in Example 2.3, we have 3^{−1} mod 7 = 5, since 3 · 5 = 15 ≡ 1 (mod 7).
Computing modular inverses. We can use the extended Euclidean algorithm, discussed in
§1.3.1, to compute modular inverses, when they exist. Suppose we are given a and n, and want to
compute a^{−1} mod n. Let us assume that 0 ≤ a < n (we can always replace a by a mod n, if this
is not the case). Now compute
(d, s, t) ← ExtEuclid(n, a),
so that d = gcd(n, a) and s and t satisfy ns + at = d. If d ≠ 1, then a^{−1} mod n does not exist;
otherwise, a^{−1} mod n = t mod n.
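This recipe translates directly into code. The following Python sketch implements ExtEuclid recursively and uses it to compute inverses exactly as described above (the function names `ext_euclid` and `mod_inverse` are ours):

```python
def ext_euclid(a, b):
    """Return (d, s, t) with d = gcd(a, b) and a*s + b*t = d,
    for nonnegative a and b."""
    if b == 0:
        return (a, 1, 0)
    d, s, t = ext_euclid(b, a % b)
    # gcd(a, b) = gcd(b, a mod b); back-substitute the coefficients
    return (d, t, s - (a // b) * t)

def mod_inverse(a, n):
    """Return a^(-1) mod n in {0, ..., n-1}, or None if gcd(a, n) != 1."""
    a %= n                       # ensure 0 <= a < n
    d, s, t = ext_euclid(n, a)   # n*s + a*t = d
    return t % n if d == 1 else None

print(mod_inverse(3, 7))    # 5, since 3 * 5 = 15 ≡ 1 (mod 7)
print(mod_inverse(4, 8))    # None, since gcd(4, 8) = 2 != 1
```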
2.2.3 Determining all solutions via modular inverses
We now describe the complete set of solutions z ∈ Z to the congruence (2.6) by means of modular
inverses.
If gcd(a, n) = 1, then setting t := a^{−1} mod n, and z := tb mod n, we see that this z is the unique
solution to the congruence (2.6) that lies in the interval {0, . . . , n − 1}. The set of all solutions to
the congruence (2.6) over the integers is {z + ny : y ∈ Z}, but z is the only one of these in the
interval {0, . . . , n − 1}.
More generally, let d := gcd(a, n). If d ∤ b, then we know that (2.6) does not have any solutions.
So suppose that d | b, and set a′ := a/d, b′ := b/d, and n′ := n/d.
For each z ∈ Z, we have az ≡ b (mod n) if and only if a′z ≡ b′ (mod n′). This is because
az ≡ b (mod n) ⇐⇒ az = b + ny for some y ∈ Z
⇐⇒ (a/d)z = (b/d) + (n/d)y for some y ∈ Z
⇐⇒ a′z ≡ b′ (mod n′).
Also, we have gcd(a′, n′) = 1 (see Exercise 1.11). So, if we set t := (a′)^{−1} mod n′ and z′ :=
tb′ mod n′, then z′ is the unique solution to a′z ≡ b′ (mod n′) in the interval {0, . . . , n′ − 1}. It
follows that the solutions to the congruence (2.6) that lie in the interval {0, . . . , n − 1} are the d
integers z′, z′ + n′, . . . , z′ + (d − 1)n′. The set of all solutions to the congruence (2.6) over the integers
is {z′ + n′y : y ∈ Z}, but these d integers are the only ones that lie in the interval {0, . . . , n − 1}.
We can summarize the above observations:
Theorem 2.8. Let a, b, n ∈ Z with n > 0. Let d := gcd(a, n). If d ∤ b, then the congruence (2.6)
has no solutions; otherwise, it has d solutions z in the interval {0, . . . , n − 1}, namely,
z′, z′ + (n/d), . . . , z′ + (d − 1)(n/d),
where z′ is the unique integer in the interval {0, . . . , (n/d) − 1} such that
(a/d)z′ ≡ (b/d) (mod n/d).
Specifically, z′ = t(b/d) mod (n/d), where t := (a/d)^{−1} mod (n/d).
Example 2.5. As a concrete illustration, suppose we want to find the solutions to the congruence
6z ≡ 22 (mod 100). Observe that gcd(6, 100) = 2 and 2 divides 22. It follows that z is a solution
to 6z ≡ 22 (mod 100) if and only if 3z ≡ 11 (mod 50). Here, we have simply divided each of the
numbers appearing in the first congruence by the greatest common divisor to obtain the second
congruence. Since 3 · 17 = 51 ≡ 1 (mod 50), we see that 3^{−1} mod 50 = 17. So we multiply both
sides of 3z ≡ 11 (mod 50) by 17 to obtain z ≡ 187 ≡ 37 (mod 50). It follows that the original
congruence 6z ≡ 22 (mod 100) has exactly two solutions in the interval {0, . . . , 99}, namely, z = 37
and z = 37 + 50 = 87.
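The procedure of Theorem 2.8 can be packaged as a small solver. The Python sketch below follows the theorem step by step (divide out d, invert a/d, enumerate the d solutions); the function name is ours, and it uses Python 3.8's built-in `pow(a, -1, n)` for the modular inverse in place of ExtEuclid.

```python
from math import gcd

def solve_linear_congruence(a, b, n):
    """All solutions z in {0, ..., n-1} of a*z ≡ b (mod n), per
    Theorem 2.8: none if d = gcd(a, n) does not divide b, else d of them."""
    d = gcd(a, n)
    if b % d != 0:
        return []
    a1, b1, n1 = a // d, b // d, n // d
    t = pow(a1, -1, n1)        # modular inverse; gcd(a1, n1) = 1
    z0 = (t * b1) % n1
    return [z0 + i * n1 for i in range(d)]

print(solve_linear_congruence(6, 22, 100))   # [37, 87]
```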
Exercise 2.10. Let a, n ∈ Z with n > 0. Show that z ∈ Z satisfies the congruence az ≡ 0 (mod n)
iff z ≡ 0 (mod n/d), where d := gcd(a, n).
Exercise 2.11. For each of the following congruences, determine all the integer solutions z ∈
{0, . . . , 999}. To do this, you should first put the congruence in the form az ≡ b (mod 1000), as
we did in going from (2.2) to (2.3) in Example 2.3. Then follow the steps outlined in §2.2.3. Show
all your steps (but you may use a calculator to help with the multiplications). Use the extended
Euclidean algorithm to compute modular inverses, where necessary.
(a) 100z + 200 ≡ 93z + 171 (mod 1000)
(b) 115z + 130 ≡ 100z + 165 (mod 1000)
(c) 115z + 132 ≡ 100z + 140 (mod 1000)
(d) 119z + 132 ≡ 113z + 140 (mod 1000)
Exercise 2.12. Let a_1, . . . , a_k, b, n be integers with n > 0. Show that the congruence
a_1 z_1 + · · · + a_k z_k ≡ b (mod n)
has a solution z_1, . . . , z_k ∈ Z if and only if d | b, where d := gcd(a_1, . . . , a_k, n).
Exercise 2.13. Let p be a prime, and let a, b, c, e be integers, such that e > 0, a ≢ 0 (mod p^{e+1}),
and 0 ≤ c < p^e. Define N to be the number of integers z ∈ {0, . . . , p^{2e} − 1} such that
⌊((az + b) mod p^{2e}) / p^e⌋ = c.
Show that N = p^e.
2.3 The Chinese remainder theorem
Next, we consider systems of linear congruences with respect to moduli that are relatively prime
in pairs. The result we state here is known as the Chinese remainder theorem, and is extremely
useful in a number of contexts.
Theorem 2.9 (Chinese remainder theorem). Let {n_i}_{i=1}^k be a pairwise relatively prime family
of positive integers, and let a_1, . . . , a_k be arbitrary integers. Then there exists a solution a ∈ Z to
the system of congruences
a ≡ a_i (mod n_i) (i = 1, . . . , k).
Moreover, any a′ ∈ Z is a solution to this system of congruences if and only if a ≡ a′ (mod n),
where n := ∏_{i=1}^k n_i.
Proof. To prove the existence of a solution a to the system of congruences, we first show how to
construct integers e_1, . . . , e_k such that for i, j = 1, . . . , k, we have
e_j ≡ 1 (mod n_i) if j = i, and e_j ≡ 0 (mod n_i) if j ≠ i.    (2.7)
If we do this, then setting
a := ∑_{i=1}^k a_i e_i,
one sees that for j = 1, . . . , k, we have
a ≡ ∑_{i=1}^k a_i e_i ≡ a_j (mod n_j),
since all the terms in this sum are zero modulo n_j, except for the term i = j, which is congruent
to a_j modulo n_j.
To construct e_1, . . . , e_k satisfying (2.7), let n := ∏_{i=1}^k n_i as in the statement of the theorem, and
for i = 1, . . . , k, let n_i* := n/n_i; that is, n_i* is the product of all the moduli n_j with j ≠ i. From the
fact that {n_i}_{i=1}^k is pairwise relatively prime, it follows that for i = 1, . . . , k, we have gcd(n_i, n_i*) = 1,
and so we may define t_i := (n_i*)^{−1} mod n_i and e_i := n_i* t_i. One sees that e_i ≡ 1 (mod n_i), while for
j ≠ i, we have n_i | n_j*, and so e_j ≡ 0 (mod n_i). Thus, (2.7) is satisfied.
That proves the existence of a solution a to the given system of congruences. If a ≡ a′ (mod n),
then since n_i | n for i = 1, . . . , k, we see that a′ ≡ a ≡ a_i (mod n_i) for i = 1, . . . , k, and so a′ also
solves the system of congruences.
Finally, if a′ is a solution to the given system of congruences, then a ≡ a_i ≡ a′ (mod n_i) for
i = 1, . . . , k. Thus, n_i | (a − a′) for i = 1, . . . , k. Since {n_i}_{i=1}^k is pairwise relatively prime, this
implies n | (a − a′), or equivalently, a ≡ a′ (mod n).
We can restate Theorem 2.9 in more concrete terms, as follows. For each positive integer m, let
I_m denote {0, . . . , m − 1}. Suppose {n_i}_{i=1}^k is a pairwise relatively prime family of positive integers,
and set n := n_1 · · · n_k. Then the map
τ : I_n → I_{n_1} × · · · × I_{n_k}
a ↦ (a mod n_1, . . . , a mod n_k)
is a bijection.
Example 2.6. The following table illustrates what Theorem 2.9 says for n_1 = 3 and n_2 = 5.
a:       0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
a mod 3: 0 1 2 0 1 2 0 1 2 0  1  2  0  1  2
a mod 5: 0 1 2 3 4 0 1 2 3 4  0  1  2  3  4
We see that as a ranges from 0 to 14, the pairs (a mod 3, a mod 5) range over all pairs (a_1, a_2)
with a_1 ∈ {0, 1, 2} and a_2 ∈ {0, . . . , 4}, with every pair being hit exactly once.
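The constructive proof of Theorem 2.9 is itself an algorithm. The Python sketch below builds the solution from the e_i = n_i* t_i terms exactly as in the proof (the function name `crt` is ours; it uses Python 3.8's `pow(x, -1, m)` and `math.prod`):

```python
from math import prod

def crt(residues, moduli):
    """Solve a ≡ residues[i] (mod moduli[i]) for pairwise relatively
    prime moduli, via the e_i construction in the proof of Theorem 2.9."""
    n = prod(moduli)
    a = 0
    for a_i, n_i in zip(residues, moduli):
        n_star = n // n_i               # n_i* = n / n_i
        t_i = pow(n_star, -1, n_i)      # t_i = (n_i*)^(-1) mod n_i
        a += a_i * n_star * t_i         # add a_i * e_i
    return a % n

print(crt([2, 3], [3, 5]))   # 8: indeed 8 mod 3 = 2 and 8 mod 5 = 3
```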
Exercise 2.14. Compute the values e_1, e_2, e_3 in the proof of Theorem 2.9 in the case where k = 3,
n_1 = 3, n_2 = 5, and n_3 = 7. Also, find an integer a such that a ≡ 1 (mod 3), a ≡ −1 (mod 5), and
a ≡ 5 (mod 7).
Exercise 2.15. If you want to show that you are a real nerd, here is an age-guessing game you
might play at a party. You ask a fellow party-goer to divide his age by each of the numbers 3, 4,
and 5, and tell you the remainders. Show how to use this information to determine his age.
Exercise 2.16. Let {n_i}_{i=1}^k be a pairwise relatively prime family of positive integers. Let a_1, . . . , a_k
and b_1, . . . , b_k be integers, and set d_i := gcd(a_i, n_i) for i = 1, . . . , k. Show that there exists an integer
z such that a_i z ≡ b_i (mod n_i) for i = 1, . . . , k if and only if d_i | b_i for i = 1, . . . , k.
Exercise 2.17. For each prime p, let ν_p(·) be defined as in §1.5. Let p_1, . . . , p_r be distinct primes,
a_1, . . . , a_r be arbitrary integers, and e_1, . . . , e_r be arbitrary nonnegative integers. Show that there
exists an integer a such that ν_{p_i}(a − a_i) = e_i for i = 1, . . . , r.
Exercise 2.18. Suppose n_1 and n_2 are positive integers, and let d := gcd(n_1, n_2). Let a_1 and
a_2 be arbitrary integers. Show that there exists an integer a such that a ≡ a_1 (mod n_1) and
a ≡ a_2 (mod n_2) if and only if a_1 ≡ a_2 (mod d).
2.4 Residue classes
Let n be a positive integer. We define the set Z_n to be the set of n abstract objects
[0]_n, [1]_n, . . . , [n − 1]_n.
These objects are called the residue classes modulo n. We can extend this notation, where for
arbitrary a ∈ Z, we define
[a]_n := [a mod n]_n.
We also define addition and multiplication on residue classes as follows. For a, b ∈ {0, . . . , n − 1},
we define
[a]_n + [b]_n := [a + b]_n and [a]_n · [b]_n := [a · b]_n.
Note that in this definition, we are using the extended notation introduced above, so that [a + b]_n =
[(a + b) mod n]_n and [a · b]_n = [(a · b) mod n]_n.
Note that because of Theorem 2.2, the equations defining addition and multiplication of residue
classes hold for all a, b ∈ Z. That is, for all a, b ∈ Z, we have
[a]_n + [b]_n = [a + b]_n and [a]_n · [b]_n = [a · b]_n.
To see why, if a′ := a mod n and b′ := b mod n, then
[a]_n = [a′]_n, [b]_n = [b′]_n, [a + b]_n = [a′ + b′]_n, and [a · b]_n = [a′ · b′]_n,
and it follows that
[a]_n + [b]_n = [a′]_n + [b′]_n = [a′ + b′]_n = [a + b]_n
and
[a]_n · [b]_n = [a′]_n · [b′]_n = [a′ · b′]_n = [a · b]_n.
More generally, for all a, b, c ∈ Z, we have
[a]_n + [b]_n = [c]_n ⇐⇒ a + b ≡ c (mod n),
and
[a]_n · [b]_n = [c]_n ⇐⇒ a · b ≡ c (mod n).
When n is clear from context, we often write [a] instead of [a]_n.
Example 2.7. Consider again the residue classes modulo 6. These are

[0], [1], [2], [3], [4], [5].

Using the extended notation, we see, for example, that [−1] is the same thing as [5].

Let us write down the addition and multiplication tables for Z6. The addition table looks like this:

     +  | [0] [1] [2] [3] [4] [5]
    [0] | [0] [1] [2] [3] [4] [5]
    [1] | [1] [2] [3] [4] [5] [0]
    [2] | [2] [3] [4] [5] [0] [1]
    [3] | [3] [4] [5] [0] [1] [2]
    [4] | [4] [5] [0] [1] [2] [3]
    [5] | [5] [0] [1] [2] [3] [4]

The multiplication table looks like this:

     ·  | [0] [1] [2] [3] [4] [5]
    [0] | [0] [0] [0] [0] [0] [0]
    [1] | [0] [1] [2] [3] [4] [5]
    [2] | [0] [2] [4] [0] [2] [4]
    [3] | [0] [3] [0] [3] [0] [3]
    [4] | [0] [4] [2] [0] [4] [2]
    [5] | [0] [5] [4] [3] [2] [1]
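Both tables can be checked mechanically. The following sketch (Python, our own illustration rather than anything from the text) builds them directly from the definitions [a] + [b] = [(a + b) mod n] and [a] · [b] = [(a · b) mod n]:

```python
# Build the addition and multiplication tables for Z_6 directly from the
# definitions [a] + [b] = [(a + b) mod n] and [a] * [b] = [(a * b) mod n],
# representing the residue class [a] by its canonical representative a.
n = 6
add_table = [[(a + b) % n for b in range(n)] for a in range(n)]
mul_table = [[(a * b) % n for b in range(n)] for a in range(n)]

for row in add_table:
    print(row)
```

For instance, the row of the addition table labeled [2] comes out as [2, 3, 4, 5, 0, 1], and the row of the multiplication table labeled [3] as [0, 3, 0, 3, 0, 3], in agreement with Example 2.7.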
The addition and multiplication operations on Zn yield a very natural algebraic structure. For example, addition and multiplication are commutative and associative; that is, for all α, β, γ ∈ Zn, we have

α + β = β + α,  (α + β) + γ = α + (β + γ),
αβ = βα,  (αβ)γ = α(βγ).

Note that we have adopted here the usual convention of writing αβ in place of α · β. Furthermore, multiplication distributes over addition; that is, for all α, β, γ ∈ Zn, we have

α(β + γ) = αβ + αγ.

All of these properties follow from the definitions, and the corresponding properties for Z. For example, for the distributive law, suppose α = [a], β = [b], and γ = [c]. Then we have

α(β + γ) = [a(b + c)] = [ab + ac] = αβ + αγ.
Because addition and multiplication in Zn are associative, for α1, ..., αk ∈ Zn, we may write the sum α1 + ··· + αk and the product α1 ··· αk without any parentheses, and there is no ambiguity; moreover, since both addition and multiplication are commutative, we may rearrange the terms in such sums and products without changing their values.

The residue class [0] acts as an additive identity; that is, for all α ∈ Zn, we have α + [0] = α; indeed, if α = [a], then a + 0 ≡ a (mod n). Moreover, [0] is the only element of Zn that acts as an additive identity; indeed, if a + z ≡ a (mod n) holds for all integers a, then it holds in particular for a = 0, which implies z ≡ 0 (mod n). The residue class [0] also has the property that α · [0] = [0] for all α ∈ Zn.

Every α ∈ Zn has an additive inverse, that is, an element β ∈ Zn such that α + β = [0]; indeed, if α = [a], then clearly β := [−a] does the job, since a + (−a) ≡ 0 (mod n). Moreover, α has a unique additive inverse; indeed, if a + z ≡ 0 (mod n), then subtracting a from both sides of this congruence yields z ≡ −a (mod n). We naturally denote the additive inverse of α by −α. Observe that the additive inverse of −α is α; that is, −(−α) = α. Also, we have the identities

−(α + β) = (−α) + (−β),  (−α)β = −(αβ) = α(−β),  (−α)(−β) = αβ.

For α, β ∈ Zn, we naturally write α − β for α + (−β).

The residue class [1] acts as a multiplicative identity; that is, for all α ∈ Zn, we have α · [1] = α; indeed, if α = [a], then a · 1 ≡ a (mod n). Moreover, [1] is the only element of Zn that acts as a multiplicative identity; indeed, if a · z ≡ a (mod n) holds for all integers a, then in particular, it holds for a = 1, which implies z ≡ 1 (mod n).
For α ∈ Zn, we call β ∈ Zn a multiplicative inverse of α if αβ = [1]. Not all α ∈ Zn have multiplicative inverses. If α = [a] and β = [b], then β is a multiplicative inverse of α if and only if ab ≡ 1 (mod n). The results in §2.2 imply that α has a multiplicative inverse if and only if gcd(a, n) = 1, and that if it exists, it is unique. When it exists, we denote the multiplicative inverse of α by α⁻¹. We saw at the end of §2.2 how to compute modular inverses using the extended Euclidean algorithm.

We define Zn* to be the set of elements of Zn that have a multiplicative inverse. By the above discussion, we have

Zn* = {[a] : a = 0, ..., n − 1, gcd(a, n) = 1}.

If n is prime, then gcd(a, n) = 1 for a = 1, ..., n − 1, and we see that Zn* = Zn \ {[0]}. If n is composite, then Zn* ⊊ Zn \ {[0]}; for example, if d | n with 1 < d < n, we see that [d] is not zero, nor does it belong to Zn*. Observe that if α, β ∈ Zn*, then so are α⁻¹ and αβ; indeed,

(α⁻¹)⁻¹ = α and (αβ)⁻¹ = α⁻¹β⁻¹.

For α ∈ Zn and β ∈ Zn*, we naturally write α/β for αβ⁻¹.
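As noted, inverses in Zn* can be computed with the extended Euclidean algorithm. Here is a minimal Python sketch; the function names are ours, and `ext_euclid` is only a compact recursive stand-in for the book's algorithm ExtEuclid:

```python
def ext_euclid(a, b):
    """Return (d, s, t) with d = gcd(a, b) and a*s + b*t = d."""
    if b == 0:
        return (a, 1, 0)
    d, s, t = ext_euclid(b, a % b)
    # gcd(a, b) = gcd(b, a mod b); back-substitute the coefficients.
    return (d, t, s - (a // b) * t)

def mod_inverse(a, n):
    """Return the representative of [a]^(-1) in Z_n, if it exists."""
    d, s, _ = ext_euclid(a % n, n)
    if d != 1:
        raise ValueError("[a] has no multiplicative inverse: gcd(a, n) != 1")
    return s % n
```

For example, `mod_inverse(2, 15)` returns 8, since 2 · 8 = 16 ≡ 1 (mod 15).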
Example 2.8. We list the elements of Z15*, and for each α ∈ Z15*, we also give α⁻¹:

     α   | [1] [2] [4]  [7] [8] [11] [13] [14]
     α⁻¹ | [1] [8] [4] [13] [2] [11]  [7] [14]
Suppose α, β, γ are elements of Zn that satisfy the equation

αβ = αγ.

If α ∈ Zn*, we may multiply both sides of this equation by α⁻¹ to infer that

β = γ.

This is the cancellation law for Zn. In particular, if n is prime, then this cancellation law holds for all α ≠ [0].

We stress that for arbitrary n, we require that α ∈ Zn*, and not just α ≠ [0]. Indeed, suppose that n is composite, so we can factor it as n = ab, where 1 < a < n and 1 < b < n, and set α := [a], β := [b], and γ := [0]. Then we have αβ = [0] = αγ; however, β ≠ γ.
Analogous to Theorems 2.5 and 2.6, we have the following result on the existence and uniqueness of solutions of linear equations in Zn. Specifically, let α ∈ Zn* and β ∈ Zn. Then the equation

αζ = β

has a unique solution ζ, namely, ζ := α⁻¹β. In particular, if n is prime, then this equation has a unique solution ζ provided α ≠ [0]. Again, we stress that for arbitrary n, we require that α ∈ Zn*, and not just α ≠ [0].
More notational conventions. For α1, ..., αk ∈ Zn, we may naturally write their sum as Σ_{i=1}^k αi. By convention, this sum is [0] when k = 0. It is easy to see that −Σ_{i=1}^k αi = Σ_{i=1}^k (−αi); that is, the additive inverse of the sum is the sum of the additive inverses. In the special case where all the αi's have the same value α, we define k · α := Σ_{i=1}^k α; thus, 0 · α = [0], 1 · α = α, 2 · α = α + α, 3 · α = α + α + α, and so on. The additive inverse of k · α is k · (−α), which we may also write as (−k) · α; thus, (−1) · α = −α, (−2) · α = (−α) + (−α) = −(α + α), and so on. Therefore, the notation k · α, or more simply, kα, is defined for all integers k. Note that for all integers k and a, we have k[a] = [ka] = [k][a].
Analogously, for α1, ..., αk ∈ Zn, we may write their product as Π_{i=1}^k αi. By convention, this product is [1] when k = 0. It is easy to see that if all of the αi's belong to Zn*, then so does their product, and in particular, (Π_{i=1}^k αi)⁻¹ = Π_{i=1}^k αi⁻¹; that is, the multiplicative inverse of the product is the product of the multiplicative inverses. In the special case where all the αi's have the same value α, we define α^k := Π_{i=1}^k α; thus, α^0 = [1], α^1 = α, α^2 = αα, α^3 = ααα, and so on. If α ∈ Zn*, then the multiplicative inverse of α^k is (α⁻¹)^k, which we may also write as α^(−k); for example, α^(−2) = α⁻¹α⁻¹ = (αα)⁻¹. Therefore, when α ∈ Zn*, the notation α^k is defined for all integers k.
One last notational convention. As already mentioned, when the modulus n is clear from context, we usually write [a] instead of [a]n. Although we want to maintain a clear distinction between integers and their residue classes, occasionally even the notation [a] is not only redundant, but distracting; in such situations, we may simply write a instead of [a]. For example, for every α ∈ Zn, we have the identity (α + [1]n)(α − [1]n) = α^2 − [1]n, which we may write more simply as (α + [1])(α − [1]) = α^2 − [1], or even more simply, and hopefully more clearly, as (α + 1)(α − 1) = α^2 − 1. Here, the only reasonable interpretation of the symbol "1" is [1], and so there can be no confusion.
In summary, algebraic expressions involving residue classes may be manipulated in much the same way as expressions involving ordinary numbers. Extra complications arise only because when n is composite, some nonzero elements of Zn do not have multiplicative inverses, and the usual cancellation law does not apply for such elements.

In general, one has a choice between working with congruences modulo n, or with the algebraic structure Zn; ultimately, the choice is one of taste and convenience, and it depends on what one prefers to treat as "first class objects": integers and congruence relations, or elements of Zn.
Exercise 2.19. Let p be an odd prime. Show that Σ_{β∈Zp*} β⁻¹ = Σ_{β∈Zp*} β = 0.

Exercise 2.20. Let p be an odd prime. Show that the numerator of Σ_{i=1}^{p−1} 1/i is divisible by p. Hint: use the previous exercise.

2.5 Fermat's little theorem
Let n be a positive integer, and let α ∈ Zn*. Consider the sequence of powers of α:

1 = α^0, α^1, α^2, ....

Since each such power is an element of Zn*, and since Zn* is a finite set, this sequence of powers must start to repeat at some point; that is, there must be a positive integer k such that α^k = α^i for some i = 0, ..., k − 1. Let us assume that k is chosen to be the smallest such positive integer. This value k is called the multiplicative order of α.

We claim that α^k = 1. To see this, suppose by way of contradiction that α^k = α^i for some i = 1, ..., k − 1; we could then cancel α from both sides of the equation α^k = α^i, obtaining α^(k−1) = α^(i−1), which would contradict the minimality of k.

Thus, we can characterize the multiplicative order of α as the smallest positive integer k such that

α^k = 1.

If α = [a] with a ∈ Z (and gcd(a, n) = 1, since α ∈ Zn*), then k is also called the multiplicative order of a modulo n, and can be characterized as the smallest positive integer k such that

a^k ≡ 1 (mod n).
From the above discussion, we see that the first k powers of α, that is, α^0, α^1, ..., α^(k−1), are distinct. Moreover, other powers of α simply repeat this pattern. The following is an immediate consequence of this observation.

Theorem 2.10. Let n be a positive integer, and let α be an element of Zn* of multiplicative order k. Then for every i ∈ Z, we have α^i = 1 if and only if k divides i. More generally, for all i, j ∈ Z, we have α^i = α^j if and only if i ≡ j (mod k).
Example 2.9. Let n = 7. For each value a = 1, ..., 6, we can compute successive powers of a modulo n to find its multiplicative order modulo n.

     i | 1^i 2^i 3^i 4^i 5^i 6^i   (all mod 7)
     1 |  1   2   3   4   5   6
     2 |  1   4   2   2   4   1
     3 |  1   1   6   1   6   6
     4 |  1   2   4   4   2   1
     5 |  1   4   5   2   3   6
     6 |  1   1   1   1   1   1

So we conclude that modulo 7: 1 has order 1; 6 has order 2; 2 and 4 have order 3; and 3 and 5 have order 6.
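The computation in Example 2.9 is easy to reproduce programmatically; this Python sketch (ours, not from the text) finds multiplicative orders by successive multiplication, assuming gcd(a, n) = 1:

```python
def mult_order(a, n):
    """Smallest positive k with a^k = 1 (mod n); requires gcd(a, n) = 1."""
    power = a % n
    k = 1
    while power != 1:
        power = (power * a) % n
        k += 1
    return k

orders = {a: mult_order(a, 7) for a in range(1, 7)}
```

Running it for n = 7 recovers the orders found above: {1: 1, 2: 3, 3: 6, 4: 3, 5: 6, 6: 2}.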
In the above example, we see that every element of Z7* has multiplicative order either 1, 2, 3, or 6. In particular, α^6 = 1 for all α ∈ Z7*. This is a special case of Fermat's little theorem:

Theorem 2.11 (Fermat's little theorem). Let p be a prime and let α ∈ Zp*. Then α^(p−1) = 1.

Proof. Since α ∈ Zp*, for every β ∈ Zp* we have αβ ∈ Zp*, and so we may define the "multiplication by α" map

τα : Zp* → Zp*
     β ↦ αβ.

It is easy to see that τα is a bijection:

Injectivity: If αβ = αβ′, then cancel α to obtain β = β′.

Surjectivity: For every γ ∈ Zp*, α⁻¹γ is a preimage of γ under τα.
Thus, as β ranges over the set Zp*, so does αβ, and we have

Π_{β∈Zp*} β = Π_{β∈Zp*} (αβ) = α^(p−1) Π_{β∈Zp*} β.   (2.8)

Canceling the common factor Π_{β∈Zp*} β ∈ Zp* from the left- and right-hand sides of (2.8), we obtain

1 = α^(p−1).
As a consequence of Fermat's little theorem and Theorem 2.10, we see that for a prime p, the multiplicative order of α divides p − 1 for every α ∈ Zp*. It turns out that for every prime p, there exists an element α ∈ Zp* whose multiplicative order is equal to p − 1. Such an α is called a primitive root mod p. For instance, in Example 2.9, we saw that [3]7 and [5]7 are primitive roots mod 7. We shall prove this fact later (see Theorem 3.11), after developing some other tools.
Observe that if α ∈ Zp* is a primitive root, then every β ∈ Zp* can be expressed as a power of α.

Fermat's little theorem is sometimes stated in the following form: for every prime p and every α ∈ Zp, we have

α^p = α.   (2.9)

Observe that for α = 0, the equation (2.9) obviously holds. Otherwise, for α ≠ 0, Theorem 2.11 says that α^(p−1) = 1, and multiplying both sides of this equation by α yields (2.9).

Finally, in terms of congruences, Fermat's little theorem can be stated as follows: for every prime p and every integer a, we have

a^p ≡ a (mod p).   (2.10)
2.5.1 Application: primality testing
Fermat's little theorem forms the basis for several primality testing algorithms.

Consider the following computational problem: given a large number n, decide whether n is prime or not. Here, we are thinking of n as being very large, perhaps several hundred decimal digits in length; these are the sizes of the prime numbers needed in a number of cryptographic applications.

A naive approach to testing if n is prime is trial division: simply test if n is divisible by 2, 3, 5, etc., testing divisibility by all primes p up to n^(1/2) (see Exercise 1.14). Unfortunately, for the large values of n we are considering here, this would take an enormous amount of time, and is completely impractical.

Fermat's little theorem suggests the following primality test: simply select a nonzero α ∈ Zn, compute β := α^(n−1) ∈ Zn, and check if β = 1. On the one hand, if β ≠ 1, Fermat's little theorem tells us that n cannot be prime. On the other hand, if β = 1, the test is inconclusive: n may or may not be prime.

We can repeat the above test several times. If any of the β's are not 1, we know n is not prime; otherwise, the test is still inconclusive.

While the Fermat primality test is not perfect, it is not hard to modify it slightly so that it becomes much more effective; indeed, the most practical primality tests used today are all minor variations on the Fermat primality test. We shall not go into the details of these variations. Rather, we shall show how to efficiently implement the Fermat primality test. The techniques we present apply to the more effective primality tests, and have many other applications.
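As a concrete sketch of the repeated test just described (Python; note that Python's built-in `pow(a, n - 1, n)` already performs the modular exponentiation that the rest of this section shows how to implement from scratch):

```python
import random

def fermat_test(n, trials=20):
    """Repeated Fermat test: False means n is certainly composite;
    True means no witness was found, which is inconclusive."""
    for _ in range(trials):
        a = random.randrange(1, n)        # a nonzero element of Z_n
        if pow(a, n - 1, n) != 1:         # Fermat's little theorem violated
            return False                  # n is definitely not prime
    return True
```

For a prime such as 97 the test always returns True; for 91 = 7 · 13, most choices of a are witnesses, so a run of 20 trials returns False with overwhelming probability.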
The first issue to be addressed is how to represent elements of Zn. It is natural and convenient to work with the set of representatives {0, ..., n − 1}. So to multiply two elements in Zn, we multiply their representatives, and then reduce the product mod n. Just as in §1.3.1, we shall assume that we have efficient algorithms to implement these basic arithmetic operations. That still leaves the issue of how to efficiently perform the exponentiation α^(n−1).

A simple algorithm to compute β := α^(n−1) is the following:

    β ← [1]n ∈ Zn
    repeat n − 1 times
        β ← β · α

This algorithm computes the value β correctly. Moreover, the numbers that arise in the computation never get too large, since in every multiplication in Zn, the result gets reduced mod n. Unfortunately, the number of loop iterations is n − 1, and so this algorithm is actually slower than the trial division primality test that we started out with.
Fortunately, there is a much faster exponentiation algorithm, called repeated squaring. Consider the following more general problem: given α ∈ Zn and a nonnegative integer e, compute α^e. Using the repeated squaring algorithm, we can compute α^e using O(log e) multiplications in Zn. Setting e := n − 1, we can therefore implement the Fermat primality test using O(log n) multiplications in Zn, which is much more practical.

As a warm-up, suppose e is a power of 2, say e = 2^k. Then the following simple algorithm computes β := α^e:

    β ← α
    repeat k times
        β ← β^2

This algorithm requires just k = log2 e multiplications in Zn.
For arbitrary e, we can use the following strategy. Suppose that the binary representation of e is e = (b1 ··· bk)2, where b1 is the high-order bit of e and bk is the low-order bit. Then the following iterative algorithm computes β := α^e:

    β ← [1]n ∈ Zn
    for i in [1 .. k] do
        β ← β^2
        if bi = 1 then β ← β · α

One can easily verify that after the i-th loop iteration, we have β = α^{e_i}, where e_i = (b1 ··· bi)2. Indeed, observe that e_i = 2e_{i−1} + b_i, and therefore,

α^{e_i} = α^{2e_{i−1} + b_i} = (α^{e_{i−1}})^2 · α^{b_i}.
Example 2.10. Suppose e = 37 = (100101)2. The above algorithm performs the following operations in this case:

    β ← [1]                  // computed exponent (in binary): 0
    β ← β^2, β ← β · α       // 1
    β ← β^2                  // 10
    β ← β^2                  // 100
    β ← β^2, β ← β · α       // 1001
    β ← β^2                  // 10010
    β ← β^2, β ← β · α       // 100101
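The iterative algorithm above translates directly into code; a Python sketch (the function name is ours):

```python
def power_mod(alpha, e, n):
    """Compute alpha^e mod n by left-to-right repeated squaring.

    Invariant: after processing bits b1...bi, beta = alpha^((b1...bi)_2) mod n.
    """
    beta = 1 % n
    for bit in bin(e)[2:]:             # the bits b1 ... bk, high-order first
        beta = (beta * beta) % n       # beta <- beta^2
        if bit == "1":
            beta = (beta * alpha) % n  # beta <- beta * alpha
    return beta
```

This uses O(log e) multiplications mod n, matching the analysis above; for instance, `power_mod(alpha, 37, n)` performs exactly the squarings and multiplications traced in Example 2.10.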
Exercise 2.21. Suppose α ∈ Zn* has multiplicative order k. Let m be any integer. Show that α^m has multiplicative order k/gcd(m, k). Hint: use Theorem 2.10 and Exercise 2.10.

Exercise 2.22. Suppose α ∈ Zn* has multiplicative order k and β ∈ Zn* has multiplicative order ℓ, where gcd(k, ℓ) = 1. Show that αβ has multiplicative order kℓ. Hint: suppose (αβ)^m = 1, and deduce that both sides of the equation α^m = β^(−m) must have multiplicative order 1 (the previous exercise may be helpful); from this, deduce that both k and ℓ divide m and then apply the result of Exercise 1.12.
Exercise 2.23. Let α ∈ Zn*. Suppose that for a prime q and positive integer e, we have

α^(q^e) = 1 and α^(q^(e−1)) ≠ 1.

Show that α has multiplicative order q^e.
Exercise 2.24. Find all primitive roots mod 19. Show your work. To simplify the calculations, you may make use of Fermat's little theorem, as well as the result of Exercise 2.21.
Exercise 2.25. Calculate the order of 9 mod 100. Show your work by building a table of powers 9^i mod 100. In addition, from this table, identify the multiplicative inverse of 9 mod 100.
Exercise 2.26. Let n ∈ Z with n > 1. Show that n is prime if and only if α^(n−1) = 1 for every nonzero α ∈ Zn.

Exercise 2.27. Let p be any prime other than 2 or 5. Show that p divides infinitely many of the numbers 9, 99, 999, etc.
Exercise 2.28. Let n be an integer greater than 1. Show that n does not divide 2^n − 1. Hint: assume that n divides 2^n − 1; now consider a prime p which divides n, so that p also divides 2^n − 1, and derive a contradiction.

Exercise 2.29. This exercise develops an alternative proof of Fermat's little theorem.

(a) Using Exercise 1.16, show that for all primes p and integers a, we have (a + 1)^p ≡ a^p + 1 (mod p).

(b) Now derive Fermat's little theorem from part (a).
Exercise 2.30. Compute 3^99 mod 100 using the repeated squaring algorithm. Show your work.
Chapter 3

Rings and polynomials

This chapter introduces the notion of polynomials whose coefficients are in a general algebraic structure called a "ring". So to begin with, we introduce the mathematical notion of a ring. While there is a lot of terminology associated with rings, the basic ideas are fairly simple. Intuitively speaking, a ring is just a structure with addition and multiplication operations that behave exactly as one would expect; there are very few surprises.
3.1 Rings: Definitions, examples, and basic properties

Definition 3.1 (Ring). A ring is a set R together with addition and multiplication operations on R, such that:

(i) addition is commutative and associative;

(ii) there exists an additive identity (or zero element), denoted 0R, where a + 0R = a for all a ∈ R;

(iii) every a ∈ R has an additive inverse, denoted −a, where a + (−a) = 0R;

(iv) multiplication is commutative and associative;

(v) there exists a multiplicative identity, denoted 1R, where a · 1R = a for all a ∈ R;

(vi) multiplication distributes over addition; that is, for all a, b, c ∈ R, we have a(b + c) = ab + ac.
Note that in this definition, the only property that connects addition and multiplication is the distributive law (property (vi)). If the ring R is clear from context, we may just write 0 and 1 instead of 0R and 1R.

Example 3.1. The set Z under the usual rules of multiplication and addition forms a ring.

Example 3.2. For n ≥ 1, the set Zn under the rules of multiplication and addition defined in §2.4 forms a ring.

Example 3.3. The set Q of rational numbers under the usual rules of multiplication and addition forms a ring.

Example 3.4. The set R of real numbers under the usual rules of multiplication and addition forms a ring.
Example 3.5. The set C of complex numbers under the usual rules of multiplication and addition forms a ring. Every α ∈ C can be written (uniquely) as α = a + bi, where a, b ∈ R and i = √−1. If α′ = a′ + b′i is another complex number, with a′, b′ ∈ R, then

α + α′ = (a + a′) + (b + b′)i and αα′ = (aa′ − bb′) + (ab′ + a′b)i.
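The product formula can be checked against any built-in complex arithmetic; for example, in Python (the particular numbers are arbitrary illustrations):

```python
# Verify alpha * alpha' = (aa' - bb') + (ab' + a'b)i for sample values.
a, b = 2.0, 3.0        # alpha  = a + b*i
a2, b2 = 5.0, -1.0     # alpha' = a' + b'*i
prod = complex(a * a2 - b * b2, a * b2 + a2 * b)
assert prod == complex(a, b) * complex(a2, b2)
```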
Example 3.6. The trivial ring consists of a single element. It is not a very interesting ring.

We state some simple facts:

Theorem 3.2. Let R be a ring. Then:

(i) the additive and multiplicative identities are unique;

(ii) for all a, b, c ∈ R, if a + b = a + c, then b = c;

(iii) −(a + b) = (−a) + (−b) for all a, b ∈ R;

(iv) −(−a) = a for all a ∈ R;

(v) 0R · a = 0R for all a ∈ R;

(vi) (−a)b = −(ab) = a(−b) for all a, b ∈ R;

(vii) (−a)(−b) = ab for all a, b ∈ R.
While there are many parts to this theorem, everything in it just states familiar properties that are satisfied by ordinary numbers. What is interesting about this theorem is that all of these properties follow directly from Definition 3.1. We shall not prove these properties here, but invite the reader to do so as a straightforward exercise.

Because addition and multiplication in a ring R are associative, for a1, ..., ak ∈ R, we may write the sum a1 + ··· + ak and the product a1 ··· ak without any parentheses, and there is no ambiguity; moreover, since both addition and multiplication are commutative, we may rearrange the terms in such sums and products without changing their values. We can write the sum as Σ_{i=1}^k ai and the product as Π_{i=1}^k ai. By convention, if k = 0, the sum is 0R and the product is 1R. If all of the ai's are equal to a, we can write the sum as ka and the product as a^k. Note that the additive inverse of Σ_{i=1}^k ai is Σ_{i=1}^k (−ai), and the additive inverse of ka is k(−a), which we can also write as (−k)a.

One other matter of notation: for a, b ∈ R, we can write a − b instead of a + (−b).
3.1.1 Multiplicative inverses and fields

While the definition of a ring requires that every element has an additive inverse, it does not require that any element has a multiplicative inverse.

Definition 3.3 (Multiplicative inverses in a ring). Let R be a ring. For a ∈ R, we say that b ∈ R is a multiplicative inverse of a if ab = 1R. We define R* to be the set of all elements of R that have a multiplicative inverse.
Example 3.7. In the ring Q, every nonzero element has a multiplicative inverse. The same holds for the rings R and C.

Example 3.8. In the ring Z, the only elements with multiplicative inverses are ±1. That is, Z* = {±1}. While the integer 2 has a multiplicative inverse 1/2 ∈ Q, it does not have a multiplicative inverse in Z.

Example 3.9. As we saw in §2.4, Zn* = {[a]n : gcd(a, n) = 1}.

Example 3.10. If R is a ring, then 1R ∈ R* (this follows from the fact that 1R · 1R = 1R, by definition). Moreover, if R is a nontrivial ring, then we have 0R ≠ 1R, and moreover, 0R ∉ R* (these observations follow from part (v) of Theorem 3.2).
Let R be a ring. If a ∈ R*, then it is not hard to prove that the multiplicative inverse must be unique, and it is denoted a⁻¹. It is not hard to see that if a, b ∈ R*, then so are a⁻¹ and ab; indeed, one can easily verify that we must have

(a⁻¹)⁻¹ = a and (ab)⁻¹ = a⁻¹b⁻¹.

If a has a multiplicative inverse a⁻¹, and k is a nonnegative integer, then the multiplicative inverse of a^k is (a⁻¹)^k, which we may write as a^(−k). Naturally, for a ∈ R and b ∈ R*, we may write a/b instead of ab⁻¹.
As the above examples demonstrate, it need not be the case that every nonzero element in a ring has a multiplicative inverse. However, rings with this property are especially nice and deserve to be singled out:

Definition 3.4 (Field). A nontrivial ring where every nonzero element has a multiplicative inverse is called a field.

Example 3.11. The rings Q, R, and C are fields.

Example 3.12. The ring Z is not a field.

Example 3.13. The ring Zn is a field if and only if n is prime.
Suppose a, b, c are elements of R that satisfy the equation

ab = ac.

If a ∈ R*, we may multiply both sides of this equation by a⁻¹ to infer that

b = c.

This is the cancellation law for R. In particular, if R is a field, then this cancellation law holds provided a ≠ 0.

We also have the following result on the existence and uniqueness of solutions of linear equations in R. Let a ∈ R* and b ∈ R. Then the equation

az = b

has a unique solution z, namely, z := a⁻¹b. In particular, if R is a field, then this equation has a unique solution z provided a ≠ 0.
Exercise 3.1. Show that the product of two nonzero elements in a field is also nonzero.

Exercise 3.2. Give an example of a ring R and two nonzero elements a, b ∈ R such that ab = 0R.
3.2 Polynomial rings

If R is a ring, then we can form the ring of polynomials R[X], consisting of all polynomials g = a0 + a1X + ··· + akX^k in the indeterminate, or "formal" variable, X, with coefficients ai in R, and with addition and multiplication defined in the usual way.

Example 3.14. Let us define a couple of polynomials over the ring Z5:

g := [3]X + X^2,  h := [1] + [2]X + [4]X^3.

We have:

g + h = [1] + X^2 + [4]X^3,  g · h = [3]X + [2]X^2 + [2]X^3 + [2]X^4 + [4]X^5.
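The computations in Example 3.14 can be verified by representing a polynomial over Zn as a list of coefficients (entry i holding the coefficient of X^i); a Python sketch of the usual addition and multiplication rules:

```python
n = 5  # work over Z_5

def poly_add(g, h):
    """Coefficient-wise addition mod n; g and h are coefficient lists."""
    k = max(len(g), len(h))
    g = g + [0] * (k - len(g))
    h = h + [0] * (k - len(h))
    return [(a + b) % n for a, b in zip(g, h)]

def poly_mul(g, h):
    """Convolution of coefficient lists mod n."""
    p = [0] * (len(g) + len(h) - 1)
    for i, a in enumerate(g):
        for j, b in enumerate(h):
            p[i + j] = (p[i + j] + a * b) % n
    return p

g = [0, 3, 1]       # [3]X + X^2
h = [1, 2, 0, 4]    # [1] + [2]X + [4]X^3
```

Here `poly_add(g, h)` gives [1, 0, 1, 4], i.e. [1] + X^2 + [4]X^3, and `poly_mul(g, h)` gives [0, 3, 2, 2, 2, 4], matching g · h in Example 3.14.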
Elements of R are also considered to be polynomials. Such polynomials are called constant polynomials. In particular, 0R is the additive identity in R[X] and 1R is the multiplicative identity in R[X]. Note that if R is the trivial ring, then so is R[X].

So as to keep the distinction between ring elements and indeterminates clear, we shall use the symbol "X" only to denote the latter. Also, for a polynomial g ∈ R[X], we shall in general write this simply as "g," and not as "g(X)." Of course, the choice of the symbol "X" is arbitrary.
3.2.1 Formalities

For completeness, we present a more formal definition of the ring R[X]. The reader should bear in mind that this formalism is rather tedious, and may be more distracting than it is enlightening. Formally, a polynomial g ∈ R[X] is an infinite sequence (a0, a1, a2, ...), where each ai ∈ R, but only finitely many of the ai's are nonzero (intuitively, ai represents the coefficient of X^i). For

g = (a0, a1, a2, ...) ∈ R[X] and h = (b0, b1, b2, ...) ∈ R[X],

we define

g + h := (s0, s1, s2, ...) and gh := (p0, p1, p2, ...),

where for i = 0, 1, 2, ...,

si := ai + bi   (3.1)

and

pi := Σ_{i=j+k} aj bk,   (3.2)

the sum being over all pairs (j, k) of nonnegative integers such that i = j + k (which is a finite sum). We leave it to the reader to verify that g + h and gh are polynomials (i.e., only finitely many of the si's and pi's are nonzero). The reader may also verify that all the requirements of Definition 3.1 are satisfied: the additive identity is the all-zero sequence (0R, 0R, 0R, ...); the multiplicative identity is the sequence (1R, 0R, 0R, ...), that is, the sequence consisting of 1R followed by all zeros.
For c ∈ R, we can identify c with the corresponding "constant" polynomial (c, 0R, 0R, ...). If we define the polynomial

X := (0R, 1R, 0R, 0R, ...),

then for any polynomial g = (a0, a1, a2, ...), if ai = 0R for all i exceeding some value k, then we have g = Σ_{i=0}^k ai X^i, and so we can return to the standard practice of writing polynomials as we did in Example 3.14, without any loss of precision.
3.2.2 Basic properties of polynomials

Let R be a ring. For nonzero g ∈ R[X], if g = Σ_{i=0}^k ai X^i with ak ≠ 0, then we call k the degree of g, denoted deg(g); we call ak the leading coefficient of g, denoted lc(g); and we call a0 the constant term of g. If lc(g) = 1, then g is called monic.

Suppose g = Σ_{i=0}^k ai X^i and h = Σ_{i=0}^ℓ bi X^i are polynomials such that ak ≠ 0 and bℓ ≠ 0, so that deg(g) = k and lc(g) = ak, and deg(h) = ℓ and lc(h) = bℓ. When we multiply these two polynomials, we get

gh = a0b0 + (a0b1 + a1b0)X + ··· + ak bℓ X^(k+ℓ).

In particular, if gh ≠ 0, then deg(gh) ≤ deg(g) + deg(h). Note that if R is a field, we must have ak bℓ ≠ 0, and so deg(gh) = deg(g) + deg(h) in this case.

For the zero polynomial, we establish the following conventions: its leading coefficient and constant term are defined to be 0R, and its degree is defined to be −∞. With these conventions, we may succinctly state that

for all g, h ∈ R[X], we have deg(gh) ≤ deg(g) + deg(h), and equality always holds if R is a field.
3.2.3
Polynomial evaluation
P
A polynomial g = ki=0 ai Xi Ã¢Ë†Ë† R[X] naturally defines a polynomial function on R that sends u Ã¢Ë†Ë† R
P
to ki=0 ai ui Ã¢Ë†Ë† R, and we denote the value of this function as g(u) (note that Ã¢â‚¬Å“XÃ¢â‚¬Â denotes an
indeterminate, while Ã¢â‚¬Å“uÃ¢â‚¬Â denotes an element of R). As usual, we define u Ã¢Ë†Ë† R to be a root of g if
g(u) = 0.
It is important to regard polynomials over R as formal expressions, and not to identify them
with their corresponding functions. In particular, two polynomials are equal if and only if their
coefficients are equal, while two functions are equal if and only if their values agree at all inputs
in R. This distinction is important, since there are rings R over which two different polynomials
define the same function. One can of course define the ring of polynomial functions on R, but in
general, that ring has a different structure from the ring of polynomials over R.
Example 3.15. In the ring Z_p, for prime p, by Fermat's little theorem we have u^p = u for all u ∈ Z_p. However, X^p and X are not the same polynomial (in particular, the former has degree p, while the latter has degree 1).
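Example 3.15 is easy to check numerically for, say, p = 5: the two polynomials define the same function on Z_5, even though as formal expressions they differ.

```python
p = 5
# X^p and X agree at every point of Z_p (Fermat's little theorem) ...
assert all(pow(u, p, p) == u % p for u in range(p))
# ... even though as formal polynomials they differ (degree p vs. degree 1).
```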
An obvious, yet important, fact is the following:
Theorem 3.5. Let R be a ring. For all g, h ∈ R[X] and u ∈ R, if s := g + h ∈ R[X] and p := gh ∈ R[X], then we have

s(u) = g(u) + h(u)  and  p(u) = g(u)h(u).
Proof. The proof is really just symbol pushing. Indeed, suppose g = ∑_i a_i X^i and h = ∑_i b_i X^i. Then s = ∑_i (a_i + b_i) X^i, and so

s(u) = ∑_i (a_i + b_i) u^i = ∑_i a_i u^i + ∑_i b_i u^i = g(u) + h(u).

Also, we have

p = (∑_i a_i X^i)(∑_j b_j X^j) = ∑_{i,j} a_i b_j X^{i+j},

and employing the result for evaluating sums of polynomials, we have

p(u) = ∑_{i,j} a_i b_j u^{i+j} = (∑_i a_i u^i)(∑_j b_j u^j) = g(u)h(u).
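Theorem 3.5 is easy to spot-check; the sketch below (poly_eval is our own helper, using Horner's rule) compares evaluating the sum and the product of two polynomials over Z_7 with combining the separate evaluations:

```python
def poly_eval(g, u, n):
    """Evaluate the coefficient list [a0, a1, ...] at u, mod n (Horner's rule)."""
    acc = 0
    for a in reversed(g):
        acc = (acc * u + a) % n
    return acc

n = 7
g = [1, 0, 2]        # 2X^2 + 1
h = [3, 5]           # 5X + 3
s = [4, 5, 2]        # g + h, coefficient-wise
p = [3, 5, 6, 3]     # gh mod 7: (2X^2 + 1)(5X + 3) = 10X^3 + 6X^2 + 5X + 3

for u in range(n):
    assert poly_eval(s, u, n) == (poly_eval(g, u, n) + poly_eval(h, u, n)) % n
    assert poly_eval(p, u, n) == (poly_eval(g, u, n) * poly_eval(h, u, n)) % n
```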
3.2.4 Polynomial interpolation
The reader is surely familiar with the fact that two points determine a line, in the context of real
numbers. The reader is perhaps also familiar with the fact that over the real (or complex) numbers,
every polynomial of degree k has at most k distinct roots, and the fact that every set of k points
can be interpolated by a unique polynomial of degree less than k. As we will now see, these results
extend to arbitrary fields.1
Let F be a field, and consider the ring of polynomials F[X]. Analogous to the integers, we can define a notion of divisibility for F[X]: for polynomials g, h ∈ F[X], we say that g divides h, which we may write as g | h, if gz = h for some z ∈ F[X].
Just like the integers, there is a corresponding division with remainder property for polynomials:
Theorem 3.6 (Division with remainder property). Let F be a field. For all g, h ∈ F[X] with h ≠ 0, there exist unique q, r ∈ F[X] such that g = hq + r and deg(r) < deg(h).
Proof. Consider the set S := {g − ht : t ∈ F[X]}. Let r = g − hq be an element of S of minimum degree. We must have deg(r) < deg(h), since otherwise, we could subtract an appropriate multiple of h from r so as to eliminate the leading coefficient of r, obtaining

r' := r − h · (lc(r) lc(h)^{−1} X^{deg(r)−deg(h)}) ∈ S,

where deg(r') < deg(r), contradicting the minimality of deg(r).

That proves the existence of r and q. For uniqueness, suppose that g = hq + r and g = hq' + r', where deg(r) < deg(h) and deg(r') < deg(h). This implies r' − r = h · (q − q'). However, if q ≠ q', then

deg(h) > deg(r' − r) = deg(h · (q − q')) = deg(h) + deg(q − q') ≥ deg(h),

which is impossible. Therefore, we must have q = q', and hence r = r'.
If g = hq + r as in the above theorem, we define g mod h := r. Clearly, h | g if and only if g mod h = 0. Moreover, note that if deg(g) < deg(h), then q = 0 and r = g; otherwise, if deg(g) ≥ deg(h), then q ≠ 0 and deg(g) = deg(h) + deg(q).
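The existence argument in the proof is effectively an algorithm: keep cancelling the leading coefficient of the remainder. A sketch over Z_p for prime p (the representation and names are our own):

```python
def poly_divmod(g, h, p):
    """Divide g by h over Z_p (p prime, h nonzero): return (q, r) with g = hq + r
    and deg(r) < deg(h). Polynomials are coefficient lists with no trailing zeros."""
    r = [c % p for c in g]
    q = [0] * max(len(g) - len(h) + 1, 1)
    inv_lc = pow(h[-1], p - 2, p)          # lc(h)^(-1) via Fermat's little theorem
    while len(r) >= len(h) and any(r):
        shift = len(r) - len(h)
        c = (r[-1] * inv_lc) % p           # coefficient that cancels lc(r)
        q[shift] = c
        for i, b in enumerate(h):
            r[i + shift] = (r[i + shift] - c * b) % p
        while len(r) > 1 and r[-1] == 0:   # drop the cancelled leading term(s)
            r.pop()
    return q, r

# Example over Z_5: divide X^3 + 2X + 4 by X^2 + 3, giving q = X, r = 4X + 4.
q, r = poly_divmod([4, 2, 0, 1], [3, 0, 1], 5)
assert (q, r) == ([0, 1], [4, 4])
```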
Theorem 3.7. Let F be a field, g ∈ F[X], and u ∈ F be a root of g. Then (X − u) divides g.

Proof. By the division with remainder property for polynomials, there exist q, r ∈ F[X] such that g = (X − u)q + r with deg(r) < 1, which means that r ∈ F. Evaluating at u, we see that g(u) = (u − u)q(u) + r = r. Since u is a root of g, we must have r = 0, and therefore g = (X − u)q, so (X − u) divides g.
Theorem 3.8. Let F be a field, and let u_1, ..., u_k be distinct elements of F. Then for every polynomial g ∈ F[X], the elements u_1, ..., u_k are roots of g if and only if the polynomial ∏_{i=1}^k (X − u_i) divides g.

¹ In fact, much of what we discuss here extends, with some modification, to more general coefficient rings.
Proof. One direction is trivial: if ∏_{i=1}^k (X − u_i) divides g, then it is clear that each u_i is a root of g. We prove the converse by induction on k. The base case k = 1 is just Theorem 3.7. So assume k > 1, and that the statement holds for k − 1. Let g ∈ F[X] and let u_1, ..., u_k be distinct roots of g. Since u_k is a root of g, by Theorem 3.7 there exists q ∈ F[X] such that g = (X − u_k)q. Moreover, for each i = 1, ..., k − 1, we have

0 = g(u_i) = (u_i − u_k) q(u_i),

and since u_i − u_k ≠ 0 and F is a field, we must have q(u_i) = 0. Thus, q has roots u_1, ..., u_{k−1}, and by induction ∏_{i=1}^{k−1} (X − u_i) divides q, from which it then follows that ∏_{i=1}^k (X − u_i) divides g.
As an immediate consequence of this theorem, we obtain:
Theorem 3.9. Let F be a field, and suppose that g ∈ F[X], with deg(g) = k ≥ 0. Then g has at most k distinct roots.

Proof. If g had k + 1 distinct roots u_1, ..., u_{k+1}, then by the previous theorem, the polynomial ∏_{i=1}^{k+1} (X − u_i), which has degree k + 1, would divide g, which has degree k; this is impossible.
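Theorem 3.9 admits a brute-force check for small fields; this sketch (our own, for illustration) counts roots by trying every u ∈ Z_7 and exhausts all monic cubics:

```python
from itertools import product

p = 7

def n_roots(coeffs):
    """Count the roots in Z_p of the polynomial with coefficient list [a0, a1, ...]."""
    return sum(1 for u in range(p)
               if sum(a * pow(u, i, p) for i, a in enumerate(coeffs)) % p == 0)

# Exhaustive check: every monic cubic over Z_7 has at most 3 distinct roots.
assert all(n_roots([a0, a1, a2, 1]) <= 3
           for a0, a1, a2 in product(range(p), repeat=3))
```

For instance, X^3 − 1 attains the bound: its roots in Z_7 are exactly the three cube roots of unity [1], [2], [4].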
We now present the main result of this section. It says that, given k points (u_1, v_1), ..., (u_k, v_k), where the u_i's are distinct elements of a field F and the v_i's are arbitrary elements of F, there is a unique polynomial g ∈ F[X] of degree less than k that "interpolates" or "passes through" these points; that is, g(u_i) = v_i for i = 1, ..., k.
Theorem 3.10 (Lagrange interpolation). Let F be a field, let u_1, ..., u_k be distinct elements of F, and let v_1, ..., v_k be arbitrary elements of F. Then there exists a unique polynomial g ∈ F[X] with deg(g) < k such that g(u_i) = v_i for i = 1, ..., k, namely

g := ∑_{i=1}^k v_i · ( ∏_{j≠i} (X − u_j) ) / ( ∏_{j≠i} (u_i − u_j) ).
Proof. For the existence part of the theorem, one just has to verify that g(u_i) = v_i for the given g, which clearly has degree less than k. This is easy to see: for i = 1, ..., k, evaluating the ith term in the sum defining g at u_i yields v_i, while evaluating any other term at u_i yields 0. The uniqueness part of the theorem follows almost immediately from Theorem 3.9: if g and h are polynomials of degree less than k such that g(u_i) = v_i = h(u_i) for i = 1, ..., k, then g − h is a polynomial of degree less than k with k distinct roots; by the previous theorem, this is impossible unless g − h = 0, that is, unless g = h.
Example 3.16. Consider the field F = Z_5. We use the Lagrange interpolation formula to compute the coefficients of the polynomial g ∈ F[X] of degree less than 3 such that

g([1]) = [1],  g([3]) = [4],  g([4]) = [2].

Applying Theorem 3.10 with

u_1 = [1], u_2 = [3], u_3 = [4],  and  v_1 = [1], v_2 = [4], v_3 = [2],
we have g = L_1 + L_2 + L_3, where

L_1 = [1] · (X − [3])(X − [4]) / (([1] − [3])([1] − [4])) = [1] · (X^2 + [3]X + [2]) / [1]
    = X^2 + [3]X + [2],

L_2 = [4] · (X − [1])(X − [4]) / (([3] − [1])([3] − [4])) = [4] · (X^2 + [4]) / [3]
    = [4][2](X^2 + [4])    (using the fact that [3]^{−1} = [2])
    = [3]X^2 + [2],

L_3 = [2] · (X − [1])(X − [3]) / (([4] − [1])([4] − [3])) = [2] · (X^2 + X + [3]) / [3]
    = [2][2](X^2 + X + [3])    (again, using the fact that [3]^{−1} = [2])
    = [4]X^2 + [4]X + [2].

Putting it all together, we have g = [3]X^2 + [2]X + [1].
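The computation of Example 3.16 can be replayed mechanically; this sketch (function and variable names are our own) assembles the Lagrange terms over Z_5 and recovers the same coefficients:

```python
p = 5
points = [(1, 1), (3, 4), (4, 2)]    # the pairs (u_i, v_i) from Example 3.16

def lagrange(points, p):
    """Coefficient list [a0, a1, ...] of the degree < k interpolant over Z_p (p prime)."""
    k = len(points)
    coeffs = [0] * k
    for i, (ui, vi) in enumerate(points):
        num, denom = [1], 1              # build prod_{j != i}(X - u_j) and prod_{j != i}(u_i - u_j)
        for j, (uj, _) in enumerate(points):
            if j == i:
                continue
            new = [0] * (len(num) + 1)
            for t, c in enumerate(num):  # multiply num by (X - u_j)
                new[t] = (new[t] - uj * c) % p
                new[t + 1] = (new[t + 1] + c) % p
            num = new
            denom = (denom * (ui - uj)) % p
        scale = vi * pow(denom, p - 2, p) % p    # v_i / denom, inverse via Fermat
        for t, c in enumerate(num):
            coeffs[t] = (coeffs[t] + scale * c) % p
    return coeffs

# Recovers g = [3]X^2 + [2]X + [1], i.e. the coefficient list [1, 2, 3]:
assert lagrange(points, p) == [1, 2, 3]
```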
Given distinct u_1, ..., u_k ∈ F and arbitrary v_1, ..., v_k ∈ F, as in Theorem 3.10, the coefficients a_0, ..., a_{k−1} of the interpolating polynomial g = ∑_i a_i X^i ∈ F[X] can be expressed as the unique solution to the following matrix equation:

    [ 1  u_1  ···  u_1^{k−1} ] [ a_0     ]   [ v_1 ]
    [ 1  u_2  ···  u_2^{k−1} ] [ a_1     ]   [ v_2 ]
    [ ⋮   ⋮          ⋮       ] [  ⋮      ] = [  ⋮  ]
    [ 1  u_k  ···  u_k^{k−1} ] [ a_{k−1} ]   [ v_k ]

where V denotes the k×k matrix on the left. The matrix V is called a Vandermonde matrix. Note that one can derive the uniqueness part of Theorem 3.10 from the existence part by general facts from linear algebra. Indeed, the existence part implies that the column space of V has full rank k, which means that the null space of V has dimension 0.
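For the data of Example 3.16, this matrix equation can be verified directly over Z_5: each row of V, dotted with the coefficient vector, reproduces the corresponding v_i.

```python
p = 5
us, vs = [1, 3, 4], [1, 4, 2]    # the points of Example 3.16
a = [1, 2, 3]                    # coefficients of the interpolant g = [3]X^2 + [2]X + [1]

# Rows (1, u_i, u_i^2) of the Vandermonde matrix V:
V = [[pow(u, j, p) for j in range(len(a))] for u in us]

# V * a = v (mod 5):
assert all(sum(r * c for r, c in zip(row, a)) % p == v for row, v in zip(V, vs))
```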
Application: Erasure and Error Correcting Codes
As an application of polynomial interpolation, we show how polynomials can be used to implement erasure codes.
Suppose you have a file that consists of 3 blocks, A1 , A2 , A3 . To protect against data loss, you
could use, say, 5 computers (nodes) to store these blocks in a redundant fashion, so that if any two
nodes crash, you can still recover all the blocks. For example, if the nodes are N1 , N2 , N3 , N4 , N5 ,
you could store the blocks as follows
N1 : A1 , A3 ;
N2 : A1 , A3 ;
N3 : A1 , A2 ;
N4 : A2 , A3 ;
N5 : A2 .
You can see that each block Ai is stored on 3 nodes, so if any two nodes crash, there is still at least
one copy left.
While this works, it requires a lot more storage: we are storing a total of 9 blocks, three times the size of the original file.
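A plausible sketch of the polynomial-based alternative this section builds toward (an illustration under our own conventions, not necessarily the text's exact construction): view the blocks A_1, A_2, A_3 as the coefficients of a polynomial g of degree less than 3 over a finite field, and let node N_u store the single value g(u). By Theorem 3.10, any 3 surviving values determine g, and hence all blocks.

```python
p = 257    # a prime large enough to hold one byte per block (an assumption for illustration)

def interpolate(points, p):
    """Coefficients of the unique polynomial of degree < len(points) through points, over Z_p."""
    k = len(points)
    coeffs = [0] * k
    for i, (ui, vi) in enumerate(points):
        num, denom = [1], 1
        for j, (uj, _) in enumerate(points):
            if j == i:
                continue
            # num *= (X - u_j)
            num = [((num[t - 1] if t > 0 else 0) - uj * (num[t] if t < len(num) else 0)) % p
                   for t in range(len(num) + 1)]
            denom = (denom * (ui - uj)) % p
        scale = vi * pow(denom, p - 2, p) % p
        coeffs = [(c + scale * (num[t] if t < len(num) else 0)) % p for t, c in enumerate(coeffs)]
    return coeffs

blocks = [42, 7, 199]                            # A1, A2, A3 as elements of Z_257
shares = [(u, sum(b * pow(u, i, p) for i, b in enumerate(blocks)) % p)
          for u in range(1, 6)]                  # node N_u stores the single value g(u)

# Any 3 of the 5 shares determine g, so any 2 nodes may crash:
assert interpolate([shares[0], shares[2], shares[4]], p) == blocks
```

Compared with the replication layout above, each node now stores one block-sized value instead of two, yet the same two-crash tolerance holds.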