Description

This exercise set is based on math (number theory), so it involves more math than programming.

Exercise 1.9. Suppose we run Algorithm ExtEuclid on input (117,67). Show the steps of the
computation by giving the data corresponding to that shown in the table in Example 1.2.
Exercise 1.10. Show that if a ≥ b > 0, then the values s and t computed by ExtEuclid(a, b)
satisfy |s| ≤ b/d and |t| ≤ a/d.
Hint: prove by induction on b—be careful, you have to stop the induction before b gets to zero,
so the last step to consider is when b | a.
Exercise 1.11. Suppose that a, b are integers with d := gcd(a, b) ≠ 0.
(a) Show that a/d and b/d are relatively prime.
(b) Show that if s and t are integers such that as + bt = d, then s and t are relatively prime.
Exercise 1.12. Let n be an integer. Show that if a, b are relatively prime integers, each of which
divides n, then ab divides n.
Exercise 2.7. Let n be a positive integer. For x ∈ {0, . . . , 2^n − 1}, let x̃ denote the integer
obtained by inverting the bits in the n-bit binary representation of x (note that x̃ ∈ {0, . . . , 2^n − 1}).
Show that x̃ + 1 ≡ −x (mod 2^n). This justifies the usual rule for computing negatives in
2’s complement arithmetic.
Hint: what is x + x̃?
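As a quick numerical sanity check (not a proof), the identity of Exercise 2.7 can be tested in Python; the choice n = 8 below is arbitrary:

```python
# Check (not a proof): for n-bit x, inverting the bits and adding 1
# gives the negative of x modulo 2^n, the usual 2's complement rule.
n = 8
for x in range(2 ** n):
    x_tilde = x ^ (2 ** n - 1)   # invert the n low-order bits of x
    assert (x_tilde + 1) % 2 ** n == (-x) % 2 ** n
print("identity verified for all 8-bit values")
```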
Exercise 2.11. For each of the following congruences, determine all the integer solutions z ∈
{0, . . . , 999}. To do this, you should first put the congruence in the form az ≡ b (mod 1000), as
we did in going from (2.2) to (2.3) in Example 2.3. Then follow the steps outlined in §2.2.3.
Show all your steps (but you may use a calculator to help with the multiplications). Use the
extended Euclidean algorithm to compute modular inverses, where necessary.
(a) 100z + 200 ≡ 93z + 171 (mod 1000)
(b) 115z + 130 ≡ 100z + 165 (mod 1000)
(c) 115z + 132 ≡ 100z + 140 (mod 1000)
(d) 119z + 132 ≡ 113z + 140 (mod 1000)
Exercise 2.25. Calculate the order of 9 mod 100. Show your work by building a table of powers
9^i mod 100. In addition, from this table, identify the multiplicative inverse of 9 mod 100.
Exercise 2.30. Compute 3^99 mod 100 using the repeated squaring algorithm. Show your work.
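For reference, the repeated squaring technique itself can be sketched generically in Python (the function name mod_pow is my own; the exercise, of course, asks for the worked steps by hand):

```python
def mod_pow(base, exp, m):
    """Compute base**exp mod m by repeated squaring, O(log exp) multiplications."""
    result = 1
    base %= m
    while exp > 0:
        if exp & 1:                  # low bit of the exponent is set
            result = (result * base) % m
        base = (base * base) % m     # square for the next bit
        exp >>= 1
    return result

# Agrees with Python's built-in three-argument pow:
assert mod_pow(3, 99, 100) == pow(3, 99, 100)
```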
Exercise 3.3. Consider the field F = Z5. Let f = X^3 + X + [1] ∈ F[X], g = [2]X^2 + [3]X + [4] ∈ F[X],
and h = [3]X^2 + [2]X + [1] ∈ F[X]. Compute (gh) mod f. Show your work. That is, you should
compute the product polynomial P = gh ∈ F[X], and then use polynomial division with
remainder to compute P mod f. Along the way, the coefficients of any polynomials you
compute should be reduced mod 5.
Exercise 3.4. Consider the field F = Z5. Use the Lagrange interpolation formula to compute the
coefficients of the polynomial g ∈ F [X] of degree less than 3 such that g([1]) = [3], g([2]) = [4],
g([3]) = [1].
Show your work.
A Number Theory Primer:
What Every Computer Scientist Should Know about
Number Theory
(v0.14)
Victor Shoup
Chapter 1
Basic properties of the integers
This chapter discusses some of the basic properties of the integers, including the notions of divisibility and primality, unique factorization into primes, greatest common divisors, and least common
multiples.
1.1
Divisibility and primality
A central concept in number theory is divisibility.
Consider the integers Z = {. . . , −2, −1, 0, 1, 2, . . .}. For a, b ∈ Z, we say that a divides b if
az = b for some z ∈ Z. If a divides b, we write a | b, and we may say that a is a divisor of b, or
that b is a multiple of a, or that b is divisible by a. If a does not divide b, then we write a ∤ b.
We first state some simple facts about divisibility:
Theorem 1.1. For all a, b, c ∈ Z, we have
(i) a | a, 1 | a, and a | 0;
(ii) 0 | a if and only if a = 0;
(iii) a | b if and only if −a | b if and only if a | −b;
(iv) a | b and a | c implies a | (b + c);
(v) a | b and b | c implies a | c.
Proof. These properties can be easily derived from the definition of divisibility, using elementary
algebraic properties of the integers. For example, a | a because we can write a · 1 = a; 1 | a because
we can write 1 · a = a; a | 0 because we can write a · 0 = 0. We leave it as an easy exercise for the
reader to verify the remaining properties.
We make a simple observation. Suppose a | b and b ≠ 0. Then we have 1 ≤ |a| ≤ |b|. To see
this, suppose az = b ≠ 0 for some integer z. Then a ≠ 0 and z ≠ 0, and it follows that |a| ≥ 1 and
|z| ≥ 1. We conclude that |a| ≤ |a||z| = |b|.
Using this observation, we may prove the following:
Theorem 1.2. For all a, b ∈ Z, we have a | b and b | a if and only if a = ±b. In particular, for
every a ∈ Z, we have a | 1 if and only if a = ±1.
Proof. Clearly, if a = ±b, then a | b and b | a. So let us assume that a | b and b | a, and prove that
a = ±b. If either of a or b are zero, then the other must be zero as well. So assume that neither is
zero. By the above observation, a | b implies |a| ≤ |b|, and b | a implies |b| ≤ |a|; thus, |a| = |b|, and
so a = ±b. That proves the first statement. The second statement follows from the first by setting
b := 1, and noting that 1 | a.
The product of any two non-zero integers is again non-zero. This implies the usual cancellation
law: if a, b, and c are integers such that a ≠ 0 and ab = ac, then we must have b = c; indeed,
ab = ac implies a(b − c) = 0, and so a ≠ 0 implies b − c = 0, and hence b = c.
1.2
Division with remainder
One of the most important facts about the integers is the following fact regarding division with
remainder. This fact underpins many other properties of the integers that we shall derive later,
including the fact that every positive integer can be expressed uniquely as the product of primes.
Theorem 1.3 (Division with remainder property). Let a, b ∈ Z with b > 0. Then there exist
unique q, r ∈ Z such that a = bq + r and 0 ≤ r < b.

Proof. Consider the set S of non-negative integers of the form a − bt with t ∈ Z. This set is clearly non-empty; indeed, if a ≥ 0, set t := 0, and if a < 0, set t := a. Since every non-empty set of non-negative integers contains a minimum, we define r to be the smallest element of S. By definition, r is of the form r = a − bq for some q ∈ Z, and r ≥ 0. Also, we must have r < b, since otherwise, r − b would be an element of S smaller than r, contradicting the minimality of r; indeed, if r ≥ b, then we would have 0 ≤ r − b = a − b(q + 1). That proves the existence of r and q.

For uniqueness, suppose that a = bq + r and a = bq′ + r′, where 0 ≤ r < b and 0 ≤ r′ < b. Then subtracting these two equations and rearranging terms, we obtain r′ − r = b(q − q′). Thus, r′ − r is a multiple of b; however, 0 ≤ r < b and 0 ≤ r′ < b implies |r′ − r| < b; therefore, the only possibility is r′ − r = 0. Moreover, 0 = b(q − q′) and b ≠ 0 implies q − q′ = 0.

Theorem 1.3 can be visualized on a number line marked 0, b, 2b, 3b, 4b, . . . : starting with a, we subtract (or add, if a is negative) the value b until we end up with a number in the interval [0, b); that number is r.

Floors and ceilings. Let us briefly recall the usual floor and ceiling functions, denoted ⌊·⌋ and ⌈·⌉, respectively. These are functions from R (the real numbers) to Z. For x ∈ R, ⌊x⌋ is the greatest integer m ≤ x; equivalently, ⌊x⌋ is the unique integer m such that m ≤ x < m + 1, or put another way, such that x = m + ε for some ε ∈ [0, 1). Also, ⌈x⌉ is the smallest integer m ≥ x; equivalently, ⌈x⌉ is the unique integer m such that m − 1 < x ≤ m, or put another way, such that x = m − ε for some ε ∈ [0, 1).

The mod operator. Now let a, b ∈ Z with b > 0. If q and r are the unique integers from Theorem 1.3 that satisfy a = bq + r and 0 ≤ r < b, we define a mod b := r; that is, a mod b denotes the remainder in dividing a by b. It is clear that b | a if and only if a mod b = 0. Dividing both sides of the equation a = bq + r by b, we obtain a/b = q + r/b. Since q ∈ Z and r/b ∈ [0, 1), we see that q = ⌊a/b⌋. Thus, (a mod b) = a − b⌊a/b⌋. One can use this equation to extend the definition of a mod b to all integers a and b, with b ≠ 0; that is, for b < 0, we simply define a mod b to be a − b⌊a/b⌋.

Theorem 1.3 may be generalized so that when dividing an integer a by a positive integer b, the remainder is placed in an interval other than [0, b). Let x be any real number, and consider the interval [x, x + b). As the reader may easily verify, this interval contains precisely b integers, namely, ⌈x⌉, . . . , ⌈x⌉ + b − 1. Applying Theorem 1.3 with a − ⌈x⌉ in place of a, we obtain:

Theorem 1.4. Let a, b ∈ Z with b > 0, and let x ∈ R. Then there exist unique q, r ∈ Z such
that a = bq + r and r ∈ [x, x + b).
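As an illustration of Theorem 1.4, the remainder in [x, x + b) can be computed from the ordinary mod operator; this is a Python sketch, and the helper name rem_in_interval is mine, not from the text:

```python
import math

def rem_in_interval(a, b, x):
    """Return the unique r with r ≡ a (mod b) and r in [x, x + b), per Theorem 1.4."""
    lo = math.ceil(x)            # the smallest integer in [x, x + b)
    return (a - lo) % b + lo

print(rem_in_interval(17, 5, 0))      # ordinary remainder -> 2
print(rem_in_interval(18, 5, -2.5))   # "balanced" remainder near 0 -> -2
```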
Div and mod in programming languages. Our definitions of division and remainder are
similar to, but not exactly equivalent to, the corresponding operators used in modern programming
languages. Typically, in such languages, an expression such as “5/3” evaluates to ⌊5/3⌋ = 1;
similarly, “5%3” evaluates to 5 mod 3 = 2. However, most programming languages treat negative
numbers differently than we do here. For example, the evaluation of “(-5)/3” would discard the
fractional part of (−5)/3, yielding −1, rather than ⌊(−5)/3⌋ = −2. Also, “(-5)%3” would evaluate
to (−5) − (−1) · 3 = −2, rather than (−5) − (−2) · 3 = 1.
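Python is a notable exception: its // and % operators follow the floor convention used in this text, and truncating division must be requested explicitly. A short illustration:

```python
# Python's // is floor division, matching the definition of mod in this text:
assert (-5) // 3 == -2           # floor(-5/3), not truncation toward zero
assert (-5) % 3 == 1             # (-5) - 3*floor(-5/3), always in [0, 3) here

# C-style truncating division can be mimicked with int(), which drops the
# fractional part:
q_trunc = int(-5 / 3)            # -1
r_trunc = -5 - q_trunc * 3       # -2, the C-style remainder
assert (q_trunc, r_trunc) == (-1, -2)
```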
Exercise 1.1. Let a, b, d ∈ Z with d ≠ 0. Show that a | b if and only if da | db.
Exercise 1.2. Let m be a positive integer. Show that for every real number x ≥ 1, the number
of multiples of m in the interval [1, x] is ⌊x/m⌋; in particular, for every integer n ≥ 1, the number
of multiples of m among 1, . . . , n is ⌊n/m⌋.
Exercise 1.3. Let x ∈ R. Show that 2⌊x⌋ ≤ ⌊2x⌋ ≤ 2⌊x⌋ + 1.
Exercise 1.4. Let x ∈ R and n ∈ Z with n > 0. Show that ⌊⌊x⌋/n⌋ = ⌊x/n⌋; in particular,
⌊⌊a/b⌋/c⌋ = ⌊a/(bc)⌋ for all positive integers a, b, c.
Exercise 1.5. Let a, b ∈ Z with b < 0. Show that (a mod b) ∈ (b, 0].

Exercise 1.6. Show that Theorem 1.4 also holds for the interval (x, x + b]. Does it hold in general for the intervals [x, x + b] or (x, x + b)?

1.3
Greatest common divisors

For a, b ∈ Z, we call d ∈ Z a common divisor of a and b if d | a and d | b; moreover, we call such a d a greatest common divisor of a and b if d is non-negative and is divisible by every common divisor of a and b.

The reader may note that we have not defined the greatest common divisor to be the numerically largest common divisor, as is sometimes done. Rather, we have defined it in terms of the divisibility relation: it is “greatest” in the sense that it is divisible by every other common divisor. Our definition is actually a stronger property, and is considered to be the more “mathematically correct” one, in that it generalizes to other algebraic domains. Unfortunately, it is not immediately obvious from the definition that greatest common divisors even exist, and if they do, that they are uniquely defined. We shall, however, show below that for every a, b ∈ Z, there exists a unique greatest common divisor of a and b, which we will denote by gcd(a, b).

We begin by showing that the greatest common divisor of a and b, if it exists, must be unique. To this end, suppose d and d′ are greatest common divisors. Then by definition, since d is a greatest common divisor and d′ is a common divisor, we must have d′ | d. Similarly, we must have d | d′. Therefore, by Theorem 1.2, we have d = ±d′. By definition, greatest common divisors must be non-negative, and so we conclude that d = d′. That proves uniqueness.

The reader will notice that in the uniqueness proof, we made essential use of the fact that greatest common divisors must be non-negative; in fact, this is precisely the reason why this requirement is included in our definition.
1.3.1
The existence of greatest common divisors and Euclid’s algorithm

We shall now prove the existence of gcd(a, b). The proof, in fact, gives us an efficient algorithm to compute gcd(a, b). The algorithm is the well-known Euclidean algorithm.

Without loss of generality, we may assume that a ≥ b ≥ 0. We prove by induction on b that gcd(a, b) exists.

The base case is b = 0. We claim that gcd(a, 0) = a. To see this, observe that by part (i) of Theorem 1.1, every integer is a divisor of 0. It follows that d is a common divisor of a and 0 if and only if it is a divisor of a. Certainly, a is a divisor of itself and it is non-negative, and from this, it is clear that a satisfies the definition of gcd(a, 0). Everything in this paragraph holds even if a = 0; in particular, gcd(0, 0) = 0.

So now assume that b > 0. We also assume our induction hypothesis, which states that gcd(a′, b′)
exists for all integers a′, b′ with a′ ≥ b′ ≥ 0 and b′ < b. If we divide a by b, we obtain the quotient q := ⌊a/b⌋ and the remainder r := a mod b, where 0 ≤ r < b. By our induction hypothesis, applied to a′ := b and b′ := r, we see that gcd(b, r) exists. From the equation a = bq + r, it is easy to see that if an integer divides both b and r, then it also divides a; likewise, if an integer divides a and b, then it also divides r. It follows that the set of common divisors of a and b is equal to the set of common divisors of b and r. Moreover, as gcd(b, r) is the unique non-negative common divisor that is divisible by each of these common divisors, and these common divisors are the same as the common divisors of a and b, we see that gcd(a, b) exists and is equal to gcd(b, r). That finishes the proof of the existence of gcd(a, b).

This proof suggests the following recursive algorithm to actually compute it:

Algorithm Euclid(a, b):
On input a, b, where a and b are integers such that a ≥ b ≥ 0, compute gcd(a, b) as follows:
    if b = 0 then return a
    else return Euclid(b, a mod b)

Example 1.1. Suppose a = 100 and b = 35. We compute the numbers (ai, bi) that are the inputs to the ith recursive call to Euclid:

    i    ai    bi
    0   100    35
    1    35    30
    2    30     5
    3     5     0

So we have gcd(a, b) = a3 = 5.

To actually implement this algorithm on a computer, one has to consider the size of the integers a and b. If they are small enough to fit into a single machine “word” (typically 64 or 32 bits), then there is no issue. Otherwise, one needs to represent these large integers as vectors of machine words, and implement all of the basic arithmetic operations (addition, subtraction, multiplication, and division) in software. Some programming languages implement these operations as part of a standard library, while others do not. In these notes, we will not worry about how these basic arithmetic operations are implemented.
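Algorithm Euclid translates almost verbatim into Python (a minimal sketch, relying on Python's arbitrary-precision integers):

```python
def euclid(a, b):
    """Compute gcd(a, b) by Algorithm Euclid, for integers a >= b >= 0."""
    if b == 0:
        return a
    return euclid(b, a % b)

print(euclid(100, 35))   # the recursion of Example 1.1 -> 5
```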
Rather, we will focus our attention on how many division steps are performed by Euclid’s algorithm.

Observe that when called on input (a, b), with b > 0, the algorithm performs one division step,
and then calls itself recursively on input (a′, b′), where a′ := b and b′ := a mod b. In particular,
b > b′. So in every recursive call, the second argument decreases by at least 1. It follows that
the algorithm performs at most b division steps in total. However, it turns out that the algorithm
terminates much faster than this:
Theorem 1.5. On input (a, b), Euclid’s algorithm performs O(log b) division steps.
Proof. As observed above, if b > 0, the algorithm performs one division step, and then calls itself
recursively on input (a′, b′), where a′ := b and b′ := a mod b, so that b > b′. If b′ > 0, the
algorithm performs another division step, and calls itself again on input (a″, b″), where a″ := b′
and b″ := b mod b′, so that b′ > b″. Consider the quotient q′ := ⌊b/b′⌋. Since b > b′, we have q′ ≥ 1.
Moreover, since b = b′q′ + b″ and b′ > b″, we have

b = b′q′ + b″ ≥ b′ + b″ > 2b″.

This shows that after two division steps, the second argument to Euclid is less than b/2, and
so, after 2k division steps, it is less than b/2^k. So if we choose k large enough so that b/2^k ≤ 1, we
can be sure that the number of division steps is at most 2k. Setting k := ⌈log2 b⌉ does the job.
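The bound of Theorem 1.5 can be checked empirically with an iterative step counter (a sketch; the +1 cushion covers the edge case b = 1, where the proof's bound 2⌈log2 b⌉ is zero but one division step still occurs):

```python
import math

def euclid_steps(a, b):
    """Count the division steps Euclid's algorithm performs on input (a, b)."""
    steps = 0
    while b > 0:
        a, b = b, a % b
        steps += 1
    return steps

# Empirical check of the O(log b) bound for small inputs:
for a in range(1, 200):
    for b in range(1, a + 1):
        assert euclid_steps(a, b) <= 2 * math.ceil(math.log2(b)) + 1
```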
1.3.2
Bezout’s Lemma and the extended Euclidean algorithm
For integers a and b, we have seen not only that gcd(a, b) exists and is uniquely defined, but that we
can efficiently compute it. Next, we prove the following remarkable fact, which will have numerous
applications.
Theorem 1.6 (Bezout’s Lemma). Let a, b ∈ Z and d := gcd(a, b).
(i) We have
as + bt = d for some s, t ∈ Z.
(ii) For every d∗ ∈ Z, we have
d | d∗ ⇐⇒ as∗ + bt∗ = d∗
for some s∗ , t∗ ∈ Z.
We first observe that part (ii) of Bezout’s Lemma follows easily from part (i). Let d∗ ∈ Z
be given. On the one hand, if d | d∗ , then dz = d∗ for some integer z, and s∗ := zs and t∗ := zt,
where as + bt = d, do the job. On the other hand, if as∗ + bt∗ = d∗ , then since d | a and d | b, we
must have d | (as∗ + bt∗ ) = d∗ .
For part (i) of Bezout’s Lemma, we can again assume without loss of generality that
a ≥ b ≥ 0. We prove the statement by induction on b.
The base case is b = 0. In this case, d = a, and so, if we set s := 1 and t := 0, then as + 0t = d,
as required.
So assume b > 0. We also assume our induction hypothesis, which states that for all integers
a′, b′ with a′ ≥ b′ ≥ 0 and b′ < b, there exist integers s′ and t′ with a′s′ + b′t′ = d′, where d′ := gcd(a′, b′).

We divide a by b, obtaining the quotient q and remainder r, where a = bq + r and 0 ≤ r < b. Applying our induction hypothesis to a′ := b and b′ := r, since gcd(a, b) = d = gcd(b, r), we see there exist s′, t′ such that bs′ + rt′ = d. If we substitute r by a − bq in this equation, and rearrange terms, we obtain

at′ + b(s′ − qt′) = d.

Thus, s := t′ and t := s′ − qt′ do the job. That completes the proof of part (i) of Bezout’s Lemma.

The proof naturally suggests the following algorithm, which is called the extended Euclidean algorithm:

Algorithm ExtEuclid(a, b):
On input a, b, where a and b are integers such that a ≥ b ≥ 0, compute (d, s, t), where d = gcd(a, b) and s and t are integers such that as + bt = d, as follows:
    if b = 0 then
        d ← a, s ← 1, t ← 0
    else
        q ← ⌊a/b⌋, r ← a mod b
        (d, s′, t′) ← ExtEuclid(b, r)
        s ← t′, t ← s′ − qt′
    return (d, s, t)

Example 1.2. Continuing with Example 1.1, we compute, in addition, the numbers si and ti returned by the ith recursive call to ExtEuclid, along with the corresponding quotient qi = ⌊ai/bi⌋:

    i    ai    bi    si    ti    qi
    0   100    35    −1     3     2
    1    35    30     1    −1     1
    2    30     5     0     1     6
    3     5     0     1     0

The rule being applied here is si := ti+1 and ti := si+1 − qi ti+1. So we have s = s0 = −1 and t = t0 = 3. One can verify that as + bt = 100 · (−1) + 35 · 3 = 5 = d.

For a, b ∈ Z, we say that a and b are relatively prime if gcd(a, b) = 1, which is the same as saying that the only common divisors of a and b are ±1. One may also say that a and b are coprime, which means the same thing.

Applying part (ii) of Bezout’s Lemma with d∗ := 1, we immediately obtain the following:

Theorem 1.7 (Corollary to Bezout’s Lemma). Let a, b ∈ Z. Then we have: a and b are relatively prime ⇐⇒ as + bt = 1 for some s, t ∈ Z.
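Algorithm ExtEuclid, transcribed into Python as a sketch:

```python
def ext_euclid(a, b):
    """Return (d, s, t) with d = gcd(a, b) and a*s + b*t = d, for a >= b >= 0."""
    if b == 0:
        return a, 1, 0
    q, r = a // b, a % b
    d, s1, t1 = ext_euclid(b, r)     # b*s1 + r*t1 = d
    return d, t1, s1 - q * t1        # the update rule s := t', t := s' - q*t'

d, s, t = ext_euclid(100, 35)
print(d, s, t)                       # matches Example 1.2: 5 -1 3
assert 100 * s + 35 * t == d
```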
This fact allows us to easily prove the following result, which will be useful in the next section, where we prove the unique factorization property for integers.

Theorem 1.8 (Coprime Lemma). Let a, b, c be integers such that c divides ab, and a and c are relatively prime. Then c must divide b.

Proof. Suppose that c | ab and gcd(a, c) = 1. By Theorem 1.7, we have as + ct = 1 for some s, t ∈ Z. Multiplying this equation by b, we obtain

abs + cbt = b.    (1.1)

Since c divides ab by hypothesis, and since c clearly divides cbt, it follows that c divides the left-hand side of (1.1), and hence that c divides b.

Exercise 1.7. Let a and b be integers, where either a ≠ 0 or b ≠ 0. Let d := gcd(a, b). Show that:

(a) d > 0;
(b) d is the numerically largest common divisor of a and b;
(c) d is the smallest positive integer that can be expressed as as + bt for some integers s and t.
Exercise 1.8. Show that for all integers a, b, c, we have:
(a) gcd(a, b) = gcd(b, a);
(b) gcd(a, b) = |a| ⇐⇒ a | b;
(c) gcd(a, 1) = 1;
(d) gcd(ca, cb) = |c| gcd(a, b).
Exercise 1.9. Suppose we run Algorithm ExtEuclid on input (117, 67). Show the steps of the
computation by giving the data corresponding to that shown in the table in Example 1.2.
Exercise 1.10. Show that if a ≥ b > 0, then the values s and t computed by ExtEuclid(a, b)
satisfy
|s| ≤ b/d and |t| ≤ a/d.
Hint: prove by induction on b— be careful, you have to stop the induction before b gets to zero, so
the last step to consider is when b | a.
Exercise 1.11. Suppose that a, b are integers with d := gcd(a, b) ≠ 0.
(a) Show that a/d and b/d are relatively prime.
(b) Show that if s and t are integers such that as + bt = d, then s and t are relatively prime.
Exercise 1.12. Let n be an integer. Show that if a, b are relatively prime integers, each of which
divides n, then ab divides n.
Exercise 1.13. Let a, b be positive integers satisfying gcd(a, b) = 1. Show that there is a number
N, depending on a and b, such that for all integers c ≥ N, we can write c = as + bt for non-negative
integers s, t.
1.4
Unique factorization into primes
Let n be a positive integer. Trivially, 1 and n divide n. If n > 1 and no other positive integers
besides 1 and n divide n, then we say n is prime. If n > 1 but n is not prime, then we say that
n is composite. The number 1 is not considered to be either prime or composite. Evidently, n is
composite if and only if n = ab for some integers a, b with 1 < a < n and 1 < b < n. The first few primes are 2, 3, 5, 7, 11, 13, 17, . . . .

While it is possible to extend the definition of prime and composite to negative integers, we shall not do so in this text: whenever we speak of a prime or composite number, we mean a positive integer.

A basic fact is that every non-zero integer can be expressed as a signed product of primes in an essentially unique way. More precisely:

Theorem 1.9 (Fundamental theorem of arithmetic). Every non-zero integer n can be expressed as

n = ±p1^e1 · · · pr^er,

where p1, . . . , pr are distinct primes and e1, . . . , er are positive integers. Moreover, this expression is unique, up to a reordering of the primes.

Note that if n = ±1 in the above theorem, then r = 0, and the product of zero terms is interpreted (as usual) as 1. The theorem intuitively says that the primes act as the “building blocks” out of which all non-zero integers can be formed by multiplication (and negation). The reader may be so familiar with this fact that he may feel it is somehow “self-evident,” requiring no proof; however, this feeling is simply a delusion, and most of the rest of this section and the next are devoted to developing a proof of this theorem. We shall give a quite leisurely proof, introducing a number of other very important tools and concepts along the way that will be useful later.

To prove Theorem 1.9, we may clearly assume that n is positive, since otherwise, we may multiply n by −1 and reduce to the case where n is positive.

The proof of the existence part of Theorem 1.9 is easy. This amounts to showing that every positive integer n can be expressed as a product (possibly empty) of primes. We may prove this by induction on n. If n = 1, the statement is true, as n is the product of zero primes. Now let n > 1, and assume that every positive integer smaller than n can be expressed as a product of primes. If n is a prime, then the statement is true, as n is the product of one prime. Assume, then, that n is composite, so that there exist a, b ∈ Z with 1 < a < n, 1 < b < n, and n = ab. By the induction hypothesis, both a and b can be expressed as a product of primes, and so the same holds for n.

The uniqueness part of Theorem 1.9 is a bit more challenging. However, using the results we have proven so far on greatest common divisors, the task is not so difficult.

Suppose that p is a prime and a is any integer. As the only divisors of p are ±1 and ±p, we have

p | a =⇒ gcd(a, p) = p, and p ∤ a =⇒ gcd(a, p) = 1.

Combining this observation with Theorem 1.8, we have:

Theorem 1.10 (Euclid’s Lemma). Let p be prime, and let a, b ∈ Z. Then p | ab implies that p | a or p | b.

Proof. Assume that p | ab. If p | a, we are done, so assume that p ∤ a. By the above observation, gcd(a, p) = 1, and so by Theorem 1.8, we have p | b.

An obvious corollary to Theorem 1.10 is that if a1, . . . , ak are integers, and if p is a prime that divides the product a1 · · · ak, then p | ai for some i = 1, . . . , k. This is easily proved by induction on k. For k = 1, the statement is trivially true. Now let k > 1, and assume that the statement holds
for k − 1. Then by Theorem 1.10, either p | a1 or p | a2 · · · ak ; if p | a1 , we are done; otherwise, by
induction, p divides one of a2 , . . . , ak .
Finishing the proof of Theorem 1.9. We are now in a position to prove the uniqueness part of
Theorem 1.9, which we can state as follows: if p1 , . . . , pr are primes (not necessarily distinct), and
q1 , . . . , qs are primes (also not necessarily distinct), such that
p1 · · · pr = q1 · · · qs ,
(1.2)
then (p1 , . . . , pr ) is just a reordering of (q1 , . . . , qs ). We may prove this by induction on r. If r = 0,
we must have s = 0 and we are done. Now suppose r > 0, and that the statement holds for r − 1.
Since r > 0, we clearly must have s > 0. Also, as p1 obviously divides the left-hand side of (1.2),
it must also divide the right-hand side of (1.2); that is, p1 | q1 · · · qs . It follows from (the corollary
to) Theorem 1.10 that p1 | qj for some j = 1, . . . , s, and moreover, since qj is prime, we must have
p1 = qj . Thus, we may cancel p1 from the left-hand side of (1.2) and qj from the right-hand side of
(1.2), and the statement now follows from the induction hypothesis. That proves the uniqueness
part of Theorem 1.9.
Exercise 1.14. Let n be a composite integer. Show that there exists a prime p dividing n, with
p ≤ n^(1/2).
Exercise 1.15. Show that two integers are relatively prime if and only if there is no one prime
that divides both of them.
Exercise 1.16. Let p be a prime and k an integer, with 0 < k < p. Show that the binomial coefficient

(p choose k) = p! / (k! (p − k)!),

which is an integer, is divisible by p.

Exercise 1.17. An integer a is called square-free if it is not divisible by the square of any integer greater than 1. Show that:

(a) a is square-free if and only if a = ±p1 · · · pr, where the pi’s are distinct primes;

(b) every positive integer n can be expressed uniquely as n = ab^2, where a and b are positive integers, and a is square-free.

1.5
Some consequences of unique factorization

The following theorem is a consequence of just the existence part of Theorem 1.9:

Theorem 1.11. There are infinitely many primes.

Proof. By way of contradiction, suppose that there were only finitely many primes; call them p1, . . . , pk. Then set M := p1 · · · pk and N := M + 1. Consider a prime p that divides N. There must be at least one such prime p, since N ≥ 2, and every positive integer can be written as a product of primes. Clearly, p cannot equal any of the pi’s, since if it did, then p would divide M, and hence also divide N − M = 1, which is impossible. Therefore, the prime p is not among p1, . . . , pk, which contradicts our assumption that these are the only primes.

For each prime p, we may define the function νp, mapping non-zero integers to non-negative integers, as follows: for every integer n ≠ 0, if n = p^e · m, where p ∤ m, then νp(n) := e. We may then write the factorization of n into primes as

n = ± ∏p p^νp(n),

where the product is over all primes p; although syntactically this is an infinite product, all but finitely many of its terms are equal to 1, and so this expression makes sense. Observe that if a and b are non-zero integers, then

νp(a · b) = νp(a) + νp(b) for all primes p,    (1.3)

and

a | b ⇐⇒ νp(a) ≤ νp(b) for all primes p.    (1.4)

From this, it is clear that

gcd(a, b) = ∏p p^min(νp(a), νp(b)).

Least common multiples.
For a, b ∈ Z, a common multiple of a and b is an integer m such that a | m and b | m; moreover, such an m is the least common multiple of a and b if m is non-negative and m divides all common multiples of a and b. It is easy to see that the least common multiple exists and is unique, and we denote the least common multiple of a and b by lcm(a, b). Indeed, for all a, b ∈ Z, if either a or b are zero, the only common multiple of a and b is 0, and so lcm(a, b) = 0; otherwise, if neither a nor b are zero, we have

lcm(a, b) = ∏p p^max(νp(a), νp(b)),

or equivalently, lcm(a, b) may be characterized as the smallest positive integer divisible by both a and b.

It is convenient to extend the domain of definition of νp to include 0, defining νp(0) := ∞. If we interpret expressions involving “∞” appropriately,¹ then for arbitrary a, b ∈ Z, both (1.3) and (1.4) hold, and in addition, νp(gcd(a, b)) = min(νp(a), νp(b)) and νp(lcm(a, b)) = max(νp(a), νp(b)) for all primes p.

¹ The interpretation given to such expressions should be obvious: for example, for every x ∈ R, we have −∞ < x < ∞, x + ∞ = ∞, x − ∞ = −∞, ∞ + ∞ = ∞, and (−∞) + (−∞) = −∞. Expressions such as x · (±∞) also make sense, provided x ≠ 0. However, the expressions ∞ − ∞ and 0 · ∞ have no sensible interpretation.

Generalizing gcd’s and lcm’s to many integers. It is easy to generalize the notions of greatest common divisor and least common multiple from two integers to many integers. Let a1, . . . , ak be integers. We call d ∈ Z a common divisor of a1, . . . , ak if d | ai for i = 1, . . . , k; moreover, we call such a d the greatest common divisor of a1, . . . , ak if d is non-negative and all other common divisors of a1, . . . , ak divide d. The greatest common divisor of a1, . . . , ak is denoted gcd(a1, . . . , ak) and is the unique non-negative integer d satisfying

νp(d) = min(νp(a1), . . . , νp(ak)) for all primes p.

Analogously, we call m ∈ Z a common multiple of a1, . . .
, ak if ai | m for all i = 1, . . . , k; moreover, such an m is called the least common multiple of a1, . . . , ak if m is non-negative and m divides all common multiples of a1, . . . , ak. The least common multiple of a1, . . . , ak is denoted lcm(a1, . . . , ak) and is the unique non-negative integer m satisfying

νp(m) = max(νp(a1), . . . , νp(ak)) for all primes p.

Finally, we say that the family a1, . . . , ak is pairwise relatively prime if for all indices i, j with i ≠ j, we have gcd(ai, aj) = 1. Certainly, if a1, . . . , ak is pairwise relatively prime, and k > 1, then
gcd(a1, . . . , ak) = 1; however, gcd(a1, . . . , ak) = 1 does not imply that the family is pairwise
relatively prime.
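The exponent characterizations of gcd and lcm can be illustrated with a naive trial-division factorization. This is a sketch only (the helper names are mine, and in practice Euclid's algorithm is far more efficient than factoring):

```python
from collections import Counter

def factor(n):
    """Return a Counter mapping each prime p to νp(n), for an integer n >= 1."""
    exps, p = Counter(), 2
    while p * p <= n:
        while n % p == 0:
            exps[p] += 1
            n //= p
        p += 1
    if n > 1:
        exps[n] += 1       # what remains after trial division is prime
    return exps

def gcd_via_exponents(a, b):
    """gcd(a, b) as the product over p of p^min(νp(a), νp(b))."""
    fa, fb = factor(a), factor(b)
    g = 1
    for p in fa:           # primes absent from fb contribute min(e, 0) = 0
        g *= p ** min(fa[p], fb[p])
    return g

def lcm_via_exponents(a, b):
    """lcm(a, b) as the product over p of p^max(νp(a), νp(b))."""
    fa, fb = factor(a), factor(b)
    m = 1
    for p in set(fa) | set(fb):
        m *= p ** max(fa[p], fb[p])
    return m

print(gcd_via_exponents(100, 35), lcm_via_exponents(100, 35))   # -> 5 700
```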
Rational numbers. Consider the rational numbers Q = {a/b : a, b ∈ Z, b ≠ 0}. Given any
rational number a/b, if we set d := gcd(a, b), and define the integers a′ := a/d and b′ := b/d, then
we have a/b = a′/b′ and gcd(a′, b′) = 1. Moreover, if a1/b1 = a′/b′, then we have a1 b′ = a′ b1,
and so b′ | a′ b1; also, since gcd(a′, b′) = 1, we see that b′ | b1; writing b1 = b′ c, we see that
a1 = a′ c. Thus, we can represent every rational number as a fraction in lowest terms, which
means a fraction of the form a′/b′ where a′ and b′ are relatively prime; moreover, the values of
a′ and b′ are uniquely determined up to sign, and every other fraction that represents the same
rational number is of the form a′c/b′c, for some non-zero integer c.
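A Python sketch of reducing a fraction to lowest terms, normalizing the sign into the numerator (the helper name lowest_terms is mine; Python's fractions.Fraction performs the same normalization internally):

```python
from fractions import Fraction
from math import gcd

def lowest_terms(a, b):
    """Reduce a/b (b != 0) to lowest terms, with a positive denominator."""
    d = gcd(a, b)              # math.gcd returns the non-negative gcd
    a, b = a // d, b // d
    if b < 0:                  # resolve the sign ambiguity: denominator > 0
        a, b = -a, -b
    return a, b

print(lowest_terms(84, -36))                  # -> (-7, 3)
assert Fraction(84, -36) == Fraction(-7, 3)   # Fraction reduces the same way
```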
Exercise 1.18. Let n be an integer. Generalizing Exercise 1.12, show that if a1, . . . , ak is a pairwise
relatively prime family of integers, where each ai divides n, then their product a1 · · · ak also divides
n.
Exercise 1.19. Show that for all integers a, b, c, we have:
(a) lcm(a, b) = lcm(b, a);
(b) lcm(a, b) = |a| ⇐⇒ b | a;
(c) lcm(a, a) = lcm(a, 1) = |a|;
(d) lcm(ca, cb) = |c| lcm(a, b).
Exercise 1.20. Show that for all integers a, b, we have:
(a) gcd(a, b) · lcm(a, b) = |ab|;
(b) gcd(a, b) = 1 =⇒ lcm(a, b) = |ab|.
Exercise 1.21. Let a1 , . . . , ak ∈ Z with k > 1. Show that:
gcd(a1 , . . . , ak ) = gcd(a1 , gcd(a2 , . . . , ak )) = gcd(gcd(a1 , . . . , ak−1 ), ak );
lcm(a1 , . . . , ak ) = lcm(a1 , lcm(a2 , . . . , ak )) = lcm(lcm(a1 , . . . , ak−1 ), ak ).
Exercise 1.22. Let a1 , . . . , ak ∈ Z with d := gcd(a1 , . . . , ak ). Show that there exist integers
s1 , . . . , sk such that d = a1 s1 + · · · + ak sk .
Exercise 1.23. Show that if a1, . . . , ak is a pairwise relatively prime family of integers, then lcm(a1, . . . , ak) = |a1 · · · ak|.
Exercise 1.24. Show that every non-zero x ∈ Q can be expressed as
x = ±p1^e1 · · · pr^er,
where the pi’s are distinct primes and the ei’s are non-zero integers, and that this expression is unique up to a reordering of the primes.
Exercise 1.25. Let n and k be positive integers, and suppose x ∈ Q such that x^k = n. Show that x ∈ Z. In other words, the kth root of n is either an integer or is irrational.
Exercise 1.26. Show that gcd(a + b, lcm(a, b)) = gcd(a, b) for all a, b ∈ Z.
Exercise 1.27. Show that for every positive integer k, there exist k consecutive composite integers.
Thus, there are arbitrarily large gaps between primes.
Exercise 1.28. Let p be a prime. Show that for all a, b ∈ Z, we have νp (a+b) ≥ min{νp (a), νp (b)},
and νp (a + b) = νp (a) if νp (a) < νp (b).

Exercise 1.29. For a given prime p, we may extend the domain of definition of νp from Z to Q: for non-zero integers a, b, let us define νp (a/b) := νp (a) − νp (b). Show that:
(a) this definition of νp (a/b) is unambiguous, in the sense that it does not depend on the particular choice of a and b;
(b) for all x, y ∈ Q, we have νp (xy) = νp (x) + νp (y);
(c) for all x, y ∈ Q, we have νp (x + y) ≥ min{νp (x), νp (y)}, and νp (x + y) = νp (x) if νp (x) < νp (y);
(d) for all non-zero x ∈ Q, we have x = ± ∏_p p^νp(x), where the product is over all primes, and all but a finite number of terms in the product are equal to 1;
(e) for all x ∈ Q, we have x ∈ Z if and only if νp (x) ≥ 0 for all primes p.

Exercise 1.30. Let n be a positive integer, and let 2^k be the highest power of 2 in the set S := {1, . . . , n}. Show that 2^k does not divide any other element in S.

Exercise 1.31. Let n ∈ Z with n > 1. Show that the sum 1 + 1/2 + · · · + 1/n is not an integer.
Exercise 1.32. Let n be a positive integer, and let Cn denote the number of pairs of integers (a, b) with a, b ∈ {1, . . . , n} and gcd(a, b) = 1, and let Fn be the number of distinct rational numbers a/b, where 0 ≤ a < b ≤ n.
(a) Show that Fn = (Cn + 1)/2.
(b) Show that Cn ≥ n^2/4.
Hint: first show that Cn ≥ n^2 (1 − Σ_{d≥2} 1/d^2), and then show that Σ_{d≥2} 1/d^2 ≤ 3/4.

Chapter 2

Congruences

This chapter introduces the basic properties of congruences modulo n, along with the related notion of residue classes modulo n. Other items discussed include the Chinese remainder theorem and Fermat’s little theorem.

2.1 Definitions and basic properties of congruences

Let n be a positive integer. For integers a and b, we say that a is congruent to b modulo n if n | (a − b), and we write a ≡ b (mod n). If n ∤ (a − b), then we write a ≢ b (mod n). Equivalently, a ≡ b (mod n) if and only if a = b + ny for some y ∈ Z. The relation a ≡ b (mod n) is called a congruence relation, or simply, a congruence. The number n appearing in such congruences is called the modulus of the congruence. Note that a ≡ 0 (mod n) if and only if n divides a.

If we view the modulus n as fixed, then the following theorem says that the binary relation “· ≡ · (mod n)” is an equivalence relation on the set Z.

Theorem 2.1 (Equivalence Property). Let n be a positive integer. For all a, b, c ∈ Z, we have:
(i) a ≡ a (mod n);
(ii) a ≡ b (mod n) implies b ≡ a (mod n);
(iii) a ≡ b (mod n) and b ≡ c (mod n) implies a ≡ c (mod n).

Proof. For (i), observe that n divides 0 = a − a. For (ii), observe that if n divides a − b, then it also divides −(a − b) = b − a. For (iii), observe that if n divides a − b and b − c, then it also divides (a − b) + (b − c) = a − c.

Another key property of congruences is that they are “compatible” with integer addition and multiplication, in the following sense:

Theorem 2.2 (Compatibility Property). Let a, a′, b, b′, n ∈ Z with n > 0. If
a ≡ a′ (mod n) and b ≡ b′ (mod n),
then
a + b ≡ a′ + b′ (mod n) and a · b ≡ a′ · b′ (mod n).

Proof. Suppose that a ≡ a′ (mod n) and b ≡ b′ (mod n). This means that there exist integers x and y such that a = a′ + nx and b = b′ + ny. Therefore,
a + b = a′ + b′ + n(x + y),
which proves the first congruence of the theorem, and
ab = (a′ + nx)(b′ + ny) = a′b′ + n(a′y + b′x + nxy),
which proves the second congruence.
Theorems 1.3 and 1.4 can be restated in terms of congruences (with a as given, and b := n):
Theorem 2.3. Let a, n ∈ Z with n > 0. Then there exists a unique integer r such that a ≡ r (mod n) and 0 ≤ r < n, namely, r := a mod n. More generally, for every x ∈ R, there exists a unique integer r ∈ [x, x + n) such that a ≡ r (mod n).

We also have:

Theorem 2.4. Let a, b, n ∈ Z with n > 0. Then we have:
a ≡ b (mod n) ⇐⇒ (a mod n) = (b mod n).  (2.1)
Proof. By the existence part of Theorem 2.3, we have a ≡ r (mod n) and b ≡ s (mod n), where
r := a mod n and s := b mod n. We want to show a ≡ b (mod n) ⇐⇒ r = s.
On the one hand, if r = s, then (using Theorem 2.1) we have a ≡ b (mod n). On the other
hand, if a ≡ b (mod n), then (again using Theorem 2.1), we have a ≡ s (mod n); moreover, by the
uniqueness part of Theorem 2.3, we have r = s.
Note that we are using the notation “mod” in two different ways here: on the left-hand side of (2.1), as a part of a congruence relation (e.g., 17 ≡ −3 (mod 5)), and on the right-hand side, as a binary operator (e.g., (17 mod 5) = 2 = (−3 mod 5)). The reader should be aware that these two uses are similar, but not quite the same.
Theorems 2.1 and 2.2 are deceptively powerful: they allow one to work with congruence relations modulo n much as one would with ordinary equalities. One can add to, subtract from, or multiply both sides of a congruence modulo n by the same integer. Also, we have the following substitution principle: roughly speaking, if a ≡ a′ (mod n), we can substitute a′ for a in any arithmetic expression involving a, without changing the value of the expression mod n. More precisely, suppose E(v) is any arithmetic expression in a variable v, built up from v and integer constants using addition, subtraction, and multiplication operators; then a ≡ a′ (mod n) implies E(a) ≡ E(a′) (mod n).
To see why this “substitution principle” works, consider the expression
E(v) := (v + 1)(v + 2).
Suppose a ≡ a′ (mod n). We want to show: E(a) ≡ E(a′) (mod n). Applying Theorem 2.2 three times, we have:
a + 1 ≡ a′ + 1 (mod n),
a + 2 ≡ a′ + 2 (mod n),
(a + 1)(a + 2) ≡ (a′ + 1)(a′ + 2) (mod n).
This type of calculation generalizes to arbitrary expressions.
Example 2.1. We can use the above principles to greatly simplify modular computations. Suppose
we wish to compute 4 · 5 · 6 · 7 · 8 · 9 mod 17. We could do this by first computing a := 4 · 5 · 6 · 7 · 8 · 9,
and then dividing a by 17 to get a mod 17. However, a itself will be a rather large number, and
we can simplify the computation by reducing intermediate results mod 17 as we go. Observe that
4 · 5 = 20, and 20 ≡ 3 (mod 17). Therefore, using the above “substitution principle”, we can replace
4 · 5 by 3 in the expression 4 · 5 · 6 · 7 · 8 · 9, without changing its value mod 17:
4 · 5 · 6 · 7 · 8 · 9 ≡ 3 · 6 · 7 · 8 · 9 (mod 17).
Applying this substitution strategy repeatedly, we can compute:
4 · 5 · 6 · 7 · 8 · 9 ≡ 3 · 6 · 7 · 8 · 9 (mod 17)  (since 4 · 5 = 20 ≡ 3 (mod 17))
≡ 1 · 7 · 8 · 9 (mod 17)  (since 3 · 6 = 18 ≡ 1 (mod 17))
≡ 5 · 9 (mod 17)  (since 7 · 8 = 56 ≡ 5 (mod 17))
≡ 11 (mod 17)  (since 5 · 9 = 45 ≡ 11 (mod 17)).
We conclude that 4 · 5 · 6 · 7 · 8 · 9 ≡ 11 (mod 17), or in other words, 4 · 5 · 6 · 7 · 8 · 9 mod 17 = 11.
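The strategy of Example 2.1 (reduce intermediate results as you go) is exactly how modular products are computed in practice. A short Python sketch (the helper name `product_mod` is ours):

```python
def product_mod(factors, m):
    """Multiply the factors left to right, replacing each partial
    product by its remainder mod m, so intermediate values stay small."""
    acc = 1
    for f in factors:
        acc = (acc * f) % m
    return acc
```

For instance, `product_mod([4, 5, 6, 7, 8, 9], 17)` returns 11, in agreement with the example.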
Example 2.2. A convenient “rule of thumb” that one often uses to test for divisibility by 3 is to
add up the digits of a number, and test if the sum of digits is itself divisible by 3. For example,
if a = 25614, we add the digits of a, obtaining 2 + 5 + 6 + 1 + 4 = 18; since 18 is divisible by 3,
we conclude that a is divisible by 3 as well. We can justify this rule using congruences.
Let a be a positive integer whose base-10 representation is a = (ak−1 · · · a1 a0)10, so that a = Σ_{i=0}^{k−1} ai·10^i. Let b be the sum of the decimal digits of a; that is, let b := Σ_{i=0}^{k−1} ai. We will show that a ≡ b (mod 3). This will justify the divisibility-by-3 rule, since then we see that a ≡ 0 (mod 3) (i.e., a is divisible by 3) if and only if b ≡ 0 (mod 3) (i.e., b is divisible by 3). To show that a ≡ b (mod 3), we first observe that 10 ≡ 1 (mod 3). Then, we calculate
a = Σ_{i=0}^{k−1} ai·10^i ≡ Σ_{i=0}^{k−1} ai·1 = b (mod 3).
Here, we have used the above “substitution principle”: since 10 ≡ 1 (mod 3), we can substitute 1
for 10 in the above congruence mod 3. See also Exercises 2.5 and 2.6.
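The divisibility-by-3 rule of Example 2.2 can be checked mechanically. A small Python sketch (the name `digit_sum` is ours):

```python
def digit_sum(a: int) -> int:
    """Sum of the base-10 digits of a positive integer a."""
    return sum(int(ch) for ch in str(a))

# By Example 2.2, a ≡ digit_sum(a) (mod 3), so both leave the
# same remainder on division by 3.
```

For a = 25614 this gives 18, and both numbers are divisible by 3, as in the example.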
Example 2.3. Let us find the set of solutions z to the congruence
8z − 4 ≡ 5z − 2 (mod 7).  (2.2)
Suppose z is a solution to (2.2). If we add 4 and subtract 5z on both sides of (2.2), we see that z
is a solution to
3z ≡ 2 (mod 7).  (2.3)
Conversely, if z is a solution to (2.3), then by subtracting 4 and adding 5z on both sides of (2.3),
we see that z is a solution to (2.2). Thus, (2.2) and (2.3) have the same solution set.
Next, suppose z is a solution to (2.3). We would like to divide both sides of (2.3) by 3, to get
z by itself on the left-hand side. We cannot do this directly, but since 5 · 3 = 15 ≡ 1 (mod 7), we
can achieve the same effect by multiplying both sides of (2.3) by 5. If we do this, we obtain
15z ≡ 10 (mod 7).  (2.4)
Now, we apply the above “substitution principle”: since 15 ≡ 1 (mod 7), we can substitute 1 for
15 in (2.4). Likewise, since 10 ≡ 3 (mod 7), we can substitute 3 for 10. Making both of these
substitutions, we obtain
z ≡ 3 (mod 7).  (2.5)
Thus, any solution z to (2.3) is a solution to (2.5). Conversely, if z is a solution to (2.5), then by
multiplying both sides of (2.5) by 3, we get 3z ≡ 9 (mod 7), and since 9 ≡ 2 (mod 7), we see that
z is a solution to (2.3). Thus, (2.2), (2.3), and (2.5) all have the same solution set.
Therefore, z is a solution to (2.2) if and only if z ≡ 3 (mod 7). That is, the solutions to (2.2)
are precisely those integers that are congruent to 3 modulo 7, which we can list as follows:
. . . , −18, −11, −4, 3, 10, 17, 24, . . .
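The conclusion of Example 2.3 is easy to confirm by brute force over a window of integers; a throwaway Python check (illustrative only):

```python
# 8z - 4 ≡ 5z - 2 (mod 7) holds exactly when z ≡ 3 (mod 7);
# collect the solutions in a window around 0 and compare
solutions = [z for z in range(-21, 22)
             if (8 * z - 4 - (5 * z - 2)) % 7 == 0]
```

Note that Python's `%` operator returns a non-negative remainder for a positive modulus, so the test behaves correctly for negative z as well.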
In the next section, we shall give a systematic treatment of the problem of solving linear
congruences, such as the one appearing in the previous example.
2.1.1 Application: computer arithmetic
Computers store integers as words, which are binary strings of fixed length. Congruences allow one
to better understand how arithmetic on these words actually works.
Suppose we have a very minimalistic computer with just 4-bit words. On such a machine,
the type unsigned in the C language would represent the integers 0, . . . , 15, encoded in binary as
follows:
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
When one adds or subtracts two such 4-bit integers, the result is always a 4-bit integer: the
hardware just ignores any carries out of or borrows into the high-order bit position. For example, if
we add the numbers 9 and 13, we get 22, which in binary is the 5-bit number 10110. The computer
hardware will just throw away the left-most bit, resulting in the 4-bit number 0110, which is the
binary representation of 6. Thus, on this machine, if we add 9 and 13, we get 6 instead of 22. Notice
that while 6 is not equal to 22, it is congruent to 22 modulo 16, i.e., 6 ≡ 22 (mod 16). This is no
coincidence. When the hardware throws away the high-order bit, this is the same as subtracting
off the value 16.
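Discarding the carry out of the high-order bit is the same as reducing mod 16, which we can mimic directly. A minimal Python sketch of the 4-bit case (the name `add4` is ours):

```python
MASK = 0xF  # on a 4-bit machine, results are kept mod 2**4 = 16

def add4(x: int, y: int) -> int:
    # dropping the carry out of bit 3 is exactly reduction mod 16
    return (x + y) & MASK
```

With these definitions, `add4(9, 13)` gives 6, just as described above.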
In general, if our computer has n-bit words, then such words can represent the numbers 0, . . . , 2^n − 1, encoded in binary as n-bit strings. Moreover, unsigned arithmetic (addition, subtraction, and multiplication) is really just arithmetic mod 2^n.
As for signed arithmetic, if the machine uses 2’s complement arithmetic (as is the case for
essentially all computers used today), then nothing really changes. Here again is the table for
encoding 4-bit integers, where we also include the 2’s complement encoding of signed integers:
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
0    1    2    3    4    5    6    7    −8   −7   −6   −5   −4   −3   −2   −1
Note that 15 ≡ −1 (mod 16), and 14 ≡ −2 (mod 16), and so on. This is also not a coincidence.
More generally, for n-bit 2’s complement arithmetic, for each x ∈ [1 .. 2^(n−1)], we encode the negative integer −x as the binary encoding of the positive integer (−x) mod 2^n. In particular, −1 is encoded as the binary encoding of (−1) mod 2^n = 2^n − 1, which is the bit string 11 · · · 1 of all 1’s. Addition, subtraction, and multiplication of n-bit 2’s complement integers is carried out using exactly the same hardware as for n-bit unsigned integers: all results are just computed mod 2^n.
Consider again our 4-bit machine, and suppose the type int represents the integers
−8, −7, . . . , 6, 7 in 2’s complement. Suppose that a, b, c, d are variables of type int whose values are
a=3, b=4, c=4, d=5
and our C program computes
int e = a*b-c*d;
Technically speaking, the evaluation of a*b-c*d overflows and the result is not well defined. Nevertheless, it is more likely than not that the program actually assigns to e the correct value, which
is −8 = 3 · 4 − 4 · 5. The reason is that the hardware will compute the result mod 16, and since
the correct result itself fits in the range −8, −7, . . . , 6, 7, the computed result and the correct result
must match.1
See Exercise 2.7 for more on 2’s complement arithmetic.
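We can mimic what the hardware does with the int computation above: reduce mod 16, then reinterpret the result as a 4-bit 2’s complement value. A Python sketch (the name `to_signed4` is ours):

```python
def to_signed4(u: int) -> int:
    """Read a value mod 16 as a 4-bit 2's complement integer in [-8, 7]."""
    u %= 16
    return u - 16 if u >= 8 else u

a, b, c, d = 3, 4, 4, 5
e = to_signed4((a * b - c * d) % 16)  # the hardware's answer: -8
```

Since the correct result 3 · 4 − 4 · 5 = −8 fits in the 4-bit signed range, the simulated hardware answer matches it.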
Exercise 2.1. Let a, b, n ∈ Z with n > 0. Show that a ≡ b (mod n) if and only if (a mod n) =
(b mod n).
Exercise 2.2. Let a, b, n, n0 ∈ Z with n > 0, n0 > 0, and n0 | n. Show that if a ≡ b (mod n), then
a ≡ b (mod n0 ).
Exercise 2.3. Let a, b, n, n0 ∈ Z with n > 0, n0 > 0, and gcd(n, n0 ) = 1. Show that if a ≡ b (mod n)
and a ≡ b (mod n0 ), then a ≡ b (mod nn0 ).
Exercise 2.4. Let a, b, n ∈ Z with n > 0 and a ≡ b (mod n). Show that gcd(a, n) = gcd(b, n).
Exercise 2.5. Consider again Example 2.2. Instead of using the “substitution principle”, prove
that
(ak−1 · · · a1 a0 )10 ≡ ak−1 + · · · + a1 + a0 (mod 3)
directly, using Theorem 2.2 and a proof by induction on k.
Hint: use the fact that for k > 1, we have
(ak−1 · · · a1 a0 )10 = 10 · (ak−1 · · · a1 )10 + a0 .
Note: the “substitution principle” itself can be proved by a similar induction argument.
Exercise 2.6. Analogous to Example 2.2, formulate and justify a simple “rule of thumb” for
testing divisibility by 11.
¹ Warning: an optimizing compiler may invalidate our assumption that arithmetic is computed modulo 2^n. However, such aggressively optimizing compilers have caused problems in legacy code, and so some compromises have been made; for example, the -fwrapv option of gcc ensures arithmetic is always carried out mod 2^n.
Exercise 2.7. Let n be a positive integer. For x ∈ {0, . . . , 2^n − 1}, let x̃ denote the integer obtained by inverting the bits in the n-bit binary representation of x (note that x̃ ∈ {0, . . . , 2^n − 1}). Show that x̃ + 1 ≡ −x (mod 2^n). This justifies the usual rule for computing negatives in 2’s complement arithmetic.
Hint: what is x + x̃?
Exercise 2.8. Show that the equation 7y 3 + 2 = z 3 has no solutions y, z ∈ Z.
Exercise 2.9. Show that there are 14 distinct, possible, yearly (Gregorian) calendars, and show
that all 14 calendars actually occur.
2.2 Solving linear congruences
In this section, we consider the general problem of solving linear congruences. More precisely, for
a given positive integer n, and arbitrary integers a and b, we wish to determine the set of integers
z that satisfy the congruence
az ≡ b (mod n).  (2.6)
2.2.1 Existence of solutions
We begin with a theorem that characterizes when (2.6) has any solutions at all.
Theorem 2.5. Let a, b, n ∈ Z with n > 0, and let d := gcd(a, n). Then (2.6) has a solution z ∈ Z
if and only if d | b.
Proof. We have
az ≡ b (mod n) for some z ∈ Z
⇐⇒ az = b + ny for some z, y ∈ Z (by def’n of congruence)
⇐⇒ az − ny = b for some z, y ∈ Z
⇐⇒ d | b (by Bezout’s Lemma).
2.2.2 Uniqueness of solutions: cancellation and modular inverses
Now suppose z satisfies the congruence (2.6). Clearly, all integers z′ ≡ z (mod n) also satisfy (2.6).
The question we next address is: are there any other solutions? The next theorem says that the
answer is “no”, provided gcd(a, n) = 1. In other words, if a and n are relatively prime, then (2.6)
has a unique solution modulo n.
Theorem 2.6. Let a, b, n ∈ Z with n > 0. Assume that gcd(a, n) = 1. For any solutions z and z′ to (2.6), we have z ≡ z′ (mod n).
Proof. We have
az ≡ b (mod n) and az′ ≡ b (mod n)
=⇒ az ≡ az′ (mod n)
=⇒ a(z − z′) ≡ 0 (mod n)
=⇒ n | a(z − z′)
=⇒ n | (z − z′) (by Theorem 1.8)
=⇒ z ≡ z′ (mod n).
Combining Theorems 2.5 and 2.6, we have:
Theorem 2.7. Let a, b, n ∈ Z with n > 0. Assume that gcd(a, n) = 1. The congruence (2.6) has
a unique solution modulo n; that is, there is a unique z ∈ {0, . . . , n − 1} such that az ≡ b (mod n).
A cancellation law. Let a, n ∈ Z with n > 0. The proof of Theorem 2.6 gives us a cancellation
law for congruences:
if gcd(a, n) = 1 and az ≡ az′ (mod n), then z ≡ z′ (mod n).
Modular inverses. Again, let a, n ∈ Z with n > 0. We say that z ∈ Z is a multiplicative
inverse of a modulo n if az ≡ 1 (mod n). Theorem 2.5 says that a has a multiplicative inverse
modulo n if and only if gcd(a, n) = 1. Moreover, Theorem 2.6 says that the multiplicative inverse
of a, if it exists, is uniquely determined modulo n; that is, if z and z′ are multiplicative inverses of a modulo n, then z ≡ z′ (mod n).
Notation: We write a−1 mod n to denote the unique multiplicative inverse of a modulo n that
lies in the interval {0, . . . , n − 1}.
Example 2.4. As we saw in Example 2.3, we have 3−1 mod 7 = 5, since 3 · 5 = 15 ≡ 1 (mod 7).
Computing modular inverses. We can use the extended Euclidean algorithm, discussed in
§1.3.1, to compute modular inverses, when they exist. Suppose we are given a and n, and want to
compute a−1 mod n. Let us assume that 0 ≤ a < n (we can always replace a by a mod n, if this is not the case). Now compute (d, s, t) ← ExtEuclid(n, a), so that d = gcd(n, a) and s and t satisfy ns + at = d. If d ≠ 1, then a−1 mod n does not exist; otherwise, a−1 mod n = t mod n.

2.2.3 Determining all solutions via modular inverses

We now describe the complete set of solutions z ∈ Z to the congruence (2.6) by means of modular inverses. If gcd(a, n) = 1, then setting t := a−1 mod n, and z := tb mod n, we see that this z is the unique solution to the congruence (2.6) that lies in the interval {0, . . . , n − 1}. The set of all solutions to the congruence (2.6) over the integers is {z + ny : y ∈ Z}, but z is the only one of these in the interval {0, . . . , n − 1}.

More generally, let d := gcd(a, n). If d ∤ b, then we know that (2.6) does not have any solutions. So suppose that d | b, and set a′ := a/d, b′ := b/d, and n′ := n/d. For each z ∈ Z, we have az ≡ b (mod n) if and only if a′z ≡ b′ (mod n′). This is because
az ≡ b (mod n) ⇐⇒ az = b + ny for some y ∈ Z ⇐⇒ (a/d)z = (b/d) + (n/d)y for some y ∈ Z ⇐⇒ a′z ≡ b′ (mod n′).
Also, we have gcd(a′, n′) = 1 (see Exercise 1.11). So, if we set t := (a′)−1 mod n′ and z′ := tb′ mod n′, then z′ is the unique solution to a′z ≡ b′ (mod n′) in the interval {0, . . . , n′ − 1}. It follows that the solutions to the congruence (2.6) that lie in the interval {0, . . . , n − 1} are the d integers z′, z′ + n′, . . . , z′ + (d − 1)n′. The set of all solutions to the congruence (2.6) over the integers is {z′ + n′y : y ∈ Z}, but these d integers are the only ones that lie in the interval {0, . . . , n − 1}.

We can summarize the above observations:

Theorem 2.8. Let a, b, n ∈ Z with n > 0. Let d := gcd(a, n). If d ∤ b, then the congruence (2.6)
has no solutions; otherwise, it has d solutions z in the interval {0, . . . , n − 1}, namely,
z′, z′ + (n/d), . . . , z′ + (d − 1)(n/d),
where z′ is the unique integer in the interval {0, . . . , (n/d) − 1} such that
(a/d)z′ ≡ (b/d) (mod n/d).
Specifically, z′ = t(b/d) mod (n/d), where t := (a/d)−1 mod (n/d).
Example 2.5. As a concrete illustration, suppose we want to find the solutions to the congruence
6z ≡ 22 (mod 100). Observe that gcd(6, 100) = 2 and 2 divides 22. It follows that z is a solution
to 6z ≡ 22 (mod 100) if and only if 3z ≡ 11 (mod 50). Here, we have simply divided each of the
numbers appearing in the first congruence by the greatest common divisor to obtain the second
congruence. Since 3 · 17 = 51 ≡ 1 (mod 50), we see that 3−1 mod 50 = 17. So we multiply both
sides of 3z ≡ 11 (mod 50) by 17 to obtain z ≡ 187 ≡ 37 (mod 50). It follows that the original
congruence 6z ≡ 22 (mod 100) has exactly two solutions in the interval {0, . . . , 99}, namely, z = 37
and z = 37 + 50 = 87.
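The recipe of §2.2.3 and Theorem 2.8 translates directly into code. Below is a minimal Python sketch (the function names `ext_euclid` and `solve_congruence` are ours); it returns all solutions in {0, . . . , n − 1}, or the empty list when d ∤ b:

```python
def ext_euclid(a: int, b: int):
    """Return (d, s, t) with d = gcd(a, b) and a*s + b*t = d (for a, b >= 0)."""
    if b == 0:
        return a, 1, 0
    d, s, t = ext_euclid(b, a % b)
    return d, t, s - (a // b) * t

def solve_congruence(a: int, b: int, n: int):
    """All solutions z in {0, ..., n-1} of a*z ≡ b (mod n), per Theorem 2.8."""
    d, s, t = ext_euclid(n, a % n)   # d = gcd(n, a) and n*s + a*t = d
    if b % d != 0:
        return []                    # no solutions when d does not divide b
    n1 = n // d
    t_inv = t % n1                   # (a/d)^(-1) mod (n/d), since (a/d)*t ≡ 1 (mod n/d)
    z0 = (t_inv * (b // d)) % n1
    return [z0 + i * n1 for i in range(d)]
```

On Example 2.5, `solve_congruence(6, 22, 100)` yields `[37, 87]`, and the intermediate inverse is 17, as computed in the text.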
Exercise 2.10. Let a, n ∈ Z with n > 0. Show that z ∈ Z satisfies the congruence az ≡ 0 (mod n)
iff z ≡ 0 (mod n/d), where d := gcd(a, n).
Exercise 2.11. For each of the following congruences, determine all the integer solutions z ∈
{0, . . . , 999}. To do this, you should first put the congruence in the form az ≡ b (mod 1000), as
we did in going from (2.2) to (2.3) in Example 2.3. Then follow the steps outlined in §2.2.3. Show
all your steps (but you may use a calculator to help with the multiplications). Use the extended
Euclidean algorithm to compute modular inverses, where necessary.
(a) 100z + 200 ≡ 93z + 171 (mod 1000)
(b) 115z + 130 ≡ 100z + 165 (mod 1000)
(c) 115z + 132 ≡ 100z + 140 (mod 1000)
(d) 119z + 132 ≡ 113z + 140 (mod 1000)
Exercise 2.12. Let a1 , . . . , ak , b, n be integers with n > 0. Show that the congruence
a1 z1 + · · · + ak zk ≡ b (mod n)
has a solution z1 , . . . , zk ∈ Z if and only if d | b, where d := gcd(a1 , . . . , ak , n).
Exercise 2.13. Let p be a prime, and let a, b, c, e be integers, such that e > 0, a ≢ 0 (mod p^(e+1)), and 0 ≤ c < p^e. Define N to be the number of integers z ∈ {0, . . . , p^(2e) − 1} such that
⌊((az + b) mod p^(2e)) / p^e⌋ = c.
Show that N = p^e.

2.3 The Chinese remainder theorem

Next, we consider systems of linear congruences with respect to moduli that are relatively prime in pairs. The result we state here is known as the Chinese remainder theorem, and is extremely useful in a number of contexts.

Theorem 2.9 (Chinese remainder theorem). Let n1, . . . , nk be a pairwise relatively prime family of positive integers, and let a1, . . . , ak be arbitrary integers. Then there exists a solution a ∈ Z to the system of congruences
a ≡ ai (mod ni) (i = 1, . . . , k).
Moreover, any a′ ∈ Z is a solution to this system of congruences if and only if a ≡ a′ (mod n), where n := n1 · · · nk.

Proof. To prove the existence of a solution a to the system of congruences, we first show how to construct integers e1, . . . , ek such that for i, j = 1, . . . , k, we have
ej ≡ 1 (mod ni) if j = i, and ej ≡ 0 (mod ni) if j ≠ i.  (2.7)
If we do this, then setting
a := a1 e1 + · · · + ak ek,
one sees that for j = 1, . . . , k, we have
a ≡ a1 e1 + · · · + ak ek ≡ aj (mod nj),
since all the terms in this sum are zero modulo nj, except for the term i = j, which is congruent to aj modulo nj.

To construct e1, . . . , ek satisfying (2.7), let n := n1 · · · nk as in the statement of the theorem, and for i = 1, . . . , k, let ni* := n/ni; that is, ni* is the product of all the moduli nj with j ≠ i. From the fact that n1, . . . , nk is pairwise relatively prime, it follows that for i = 1, . . . , k, we have gcd(ni, ni*) = 1, and so we may define ti := (ni*)−1 mod ni and ei := ni* ti. One sees that ei ≡ 1 (mod ni), while for j ≠ i, we have ni | nj*, and so ej ≡ 0 (mod ni). Thus, (2.7) is satisfied.

That proves the existence of a solution a to the given system of congruences. If a ≡ a′ (mod n), then since ni | n for i = 1, . . . , k, we see that a′ ≡ a ≡ ai (mod ni) for i = 1, . . .
, k, and so a′ also solves the system of congruences. Finally, if a′ is a solution to the given system of congruences, then a ≡ ai ≡ a′ (mod ni) for i = 1, . . . , k. Thus, ni | (a − a′) for i = 1, . . . , k. Since n1, . . . , nk is pairwise relatively prime, this implies n | (a − a′), or equivalently, a ≡ a′ (mod n).

We can restate Theorem 2.9 in more concrete terms, as follows. For each positive integer m, let Im denote {0, . . . , m − 1}. Suppose n1, . . . , nk is a pairwise relatively prime family of positive integers, and set n := n1 · · · nk. Then the map
τ : In → In1 × · · · × Ink
a ↦ (a mod n1, . . . , a mod nk)
is a bijection.

Example 2.6. The following table illustrates what Theorem 2.9 says for n1 = 3 and n2 = 5.

a        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
a mod 3  0 1 2 0 1 2 0 1 2 0 1  2  0  1  2
a mod 5  0 1 2 3 4 0 1 2 3 4 0  1  2  3  4

We see that as a ranges from 0 to 14, the pairs (a mod 3, a mod 5) range over all pairs (a1, a2) with a1 ∈ {0, 1, 2} and a2 ∈ {0, . . . , 4}, with every pair being hit exactly once.

Exercise 2.14. Compute the values e1, e2, e3 in the proof of Theorem 2.9 in the case where k = 3, n1 = 3, n2 = 5, and n3 = 7. Also, find an integer a such that a ≡ 1 (mod 3), a ≡ −1 (mod 5), and a ≡ 5 (mod 7).

Exercise 2.15. If you want to show that you are a real nerd, here is an age-guessing game you might play at a party. You ask a fellow party-goer to divide his age by each of the numbers 3, 4, and 5, and tell you the remainders. Show how to use this information to determine his age.

Exercise 2.16. Let n1, . . . , nk be a pairwise relatively prime family of positive integers. Let a1, . . . , ak and b1, . . . , bk be integers, and set di := gcd(ai, ni) for i = 1, . . . , k. Show that there exists an integer z such that ai z ≡ bi (mod ni) for i = 1, . . . , k if and only if di | bi for i = 1, . . . , k.

Exercise 2.17. For each prime p, let νp(·) be defined as in §1.5. Let p1, . . . , pr be distinct primes, a1, . . .
, ar be arbitrary integers, and e1, . . . , er be arbitrary non-negative integers. Show that there exists an integer a such that νpi(a − ai) = ei for i = 1, . . . , r.

Exercise 2.18. Suppose n1 and n2 are positive integers, and let d := gcd(n1, n2). Let a1 and a2 be arbitrary integers. Show that there exists an integer a such that a ≡ a1 (mod n1) and a ≡ a2 (mod n2) if and only if a1 ≡ a2 (mod d).

2.4 Residue classes

Let n be a positive integer. We define the set Zn to be the set of n abstract objects [0]n, [1]n, . . . , [n − 1]n. These objects are called the residue classes modulo n. We can extend this notation, where for arbitrary a ∈ Z, we define [a]n := [a mod n]n.

We also define addition and multiplication on residue classes as follows. For a, b ∈ {0, . . . , n − 1}, we define
[a]n + [b]n := [a + b]n and [a]n · [b]n := [a · b]n.
Note that in this definition, we are using the extended notation introduced above, so that [a + b]n = [(a + b) mod n]n and [a · b]n = [(a · b) mod n]n.

Note that because of Theorem 2.2, the equations defining addition and multiplication of residue classes hold for all a, b ∈ Z. That is, for all a, b ∈ Z, we have [a]n + [b]n = [a + b]n and [a]n · [b]n = [a · b]n. To see why, if a′ := a mod n and b′ := b mod n, then [a]n = [a′]n, [b]n = [b′]n, [a + b]n = [a′ + b′]n, and [a · b]n = [a′ · b′]n, and it follows that
[a]n + [b]n = [a′]n + [b′]n = [a′ + b′]n = [a + b]n
and
[a]n · [b]n = [a′]n · [b′]n = [a′ · b′]n = [a · b]n.
More generally, for all a, b, c ∈ Z, we have [a]n + [b]n = [c]n ⇐⇒ a + b ≡ c (mod n), and [a]n · [b]n = [c]n ⇐⇒ a · b ≡ c (mod n).

When n is clear from context, we often write [a] instead of [a]n.

Example 2.7. Consider again the residue classes modulo 6. These are [0], [1], [2], [3], [4], [5]. Using the extended notation, we see, for example, that [−1] is the same thing as [5]. Let us write down the addition and multiplication tables for Z6.
The addition table looks like this:

+    [0] [1] [2] [3] [4] [5]
[0]  [0] [1] [2] [3] [4] [5]
[1]  [1] [2] [3] [4] [5] [0]
[2]  [2] [3] [4] [5] [0] [1]
[3]  [3] [4] [5] [0] [1] [2]
[4]  [4] [5] [0] [1] [2] [3]
[5]  [5] [0] [1] [2] [3] [4]

The multiplication table looks like this:

·    [0] [1] [2] [3] [4] [5]
[0]  [0] [0] [0] [0] [0] [0]
[1]  [0] [1] [2] [3] [4] [5]
[2]  [0] [2] [4] [0] [2] [4]
[3]  [0] [3] [0] [3] [0] [3]
[4]  [0] [4] [2] [0] [4] [2]
[5]  [0] [5] [4] [3] [2] [1]

The addition and multiplication operations on Zn yield a very natural algebraic structure. For example, addition and multiplication are commutative and associative; that is, for all α, β, γ ∈ Zn, we have
α + β = β + α, (α + β) + γ = α + (β + γ), αβ = βα, (αβ)γ = α(βγ).
Note that we have adopted here the usual convention of writing αβ in place of α · β. Furthermore, multiplication distributes over addition; that is, for all α, β, γ ∈ Zn, we have α(β + γ) = αβ + αγ. All of these properties follow from the definitions, and the corresponding properties for Z. For example, for the distributive law, suppose α = [a], β = [b], and γ = [c]. Then we have
α(β + γ) = [a(b + c)] = [ab + ac] = αβ + αγ.
Because addition and multiplication in Zn are associative, for α1, . . . , αk ∈ Zn, we may write the sum α1 + · · · + αk and the product α1 · · · αk without any parentheses, and there is no ambiguity; moreover, since both addition and multiplication are commutative, we may rearrange the terms in such sums and products without changing their values.

The residue class [0] acts as an additive identity; that is, for all α ∈ Zn, we have α + [0] = α; indeed, if α = [a], then a + 0 ≡ a (mod n). Moreover, [0] is the only element of Zn that acts as an additive identity; indeed, if a + z ≡ a (mod n) holds for all integers a, then it holds in particular for a = 0, which implies z ≡ 0 (mod n). The residue class [0] also has the property that α · [0] = [0] for all α ∈ Zn.
Every α ∈ Zn has an additive inverse, that is, an element β ∈ Zn such that α + β = [0]; indeed, if α = [a], then clearly β := [−a] does the job, since a + (−a) ≡ 0 (mod n). Moreover, α has a unique additive inverse; indeed, if a + z ≡ 0 (mod n), then subtracting a from both sides of this congruence yields z ≡ −a (mod n). We naturally denote the additive inverse of α by −α. Observe that the additive inverse of −α is α; that is, −(−α) = α. Also, we have the identities
−(α + β) = (−α) + (−β), (−α)β = −(αβ) = α(−β), (−α)(−β) = αβ.
For α, β ∈ Zn, we naturally write α − β for α + (−β).

The residue class [1] acts as a multiplicative identity; that is, for all α ∈ Zn, we have α · [1] = α; indeed, if α = [a], then a · 1 ≡ a (mod n). Moreover, [1] is the only element of Zn that acts as a multiplicative identity; indeed, if a · z ≡ a (mod n) holds for all integers a, then in particular, it holds for a = 1, which implies z ≡ 1 (mod n).

For α ∈ Zn, we call β ∈ Zn a multiplicative inverse of α if αβ = [1]. Not all α ∈ Zn have multiplicative inverses. If α = [a] and β = [b], then β is a multiplicative inverse of α if and only if ab ≡ 1 (mod n). The results in §2.2 imply that α has a multiplicative inverse if and only if gcd(a, n) = 1, and that if it exists, it is unique. When it exists, we denote the multiplicative inverse of α by α−1. We saw at the end of §2.2 how to compute modular inverses using the extended Euclidean algorithm.

We define Z*n to be the set of elements of Zn that have a multiplicative inverse. By the above discussion, we have
Z*n = {[a] : a = 0, . . . , n − 1, gcd(a, n) = 1}.
If n is prime, then gcd(a, n) = 1 for a = 1, . . . , n − 1, and we see that Z*n = Zn \ {[0]}. If n is composite, then Z*n ⊊ Zn \ {[0]}; for example, if d | n with 1 < d < n, we see that [d] is not zero, nor does it belong to Z*n. Observe that if α, β ∈ Z*n, then so are α−1 and αβ; indeed, (α−1)−1 = α and (αβ)−1 = α−1 β−1. For α ∈ Zn and β ∈ Z*n, we naturally write α/β for αβ−1.
Example 2.8. We list the elements of Z*15, and for each α ∈ Z*15, we also give α−1:

α    [1]  [2]  [4]  [7]  [8]  [11] [13] [14]
α−1  [1]  [8]  [4]  [13] [2]  [11] [7]  [14]

Suppose α, β, γ are elements of Zn that satisfy the equation αβ = αγ. If α ∈ Z*n, we may multiply both sides of this equation by α−1 to infer that β = γ. This is the cancellation law for Zn. In particular, if n is prime, then this cancellation law holds for all α ≠ [0]. We stress that for arbitrary n, we require that α ∈ Z*n, and not just α ≠ [0]. Indeed, suppose that n is composite, so we can factor it as n = ab, where 1 < a < n and 1 < b < n, and set α := [a], β := [b], and γ := [0]. Then we have αβ = [0] = αγ; however, β ≠ γ.

Analogous to Theorems 2.5 and 2.6, we have the following result on the existence and uniqueness of solutions of linear equations in Zn. Specifically, let α ∈ Z*n and β ∈ Zn. Then the equation αζ = β has a unique solution ζ, namely, ζ := α−1 β. In particular, if n is prime, then this equation has a unique solution ζ provided α ≠ [0]. Again, we stress that for arbitrary n, we require that α ∈ Z*n, and not just α ≠ [0].

More notational conventions. For α1, . . . , αk ∈ Zn, we may naturally write their sum as α1 + · · · + αk. By convention, this sum is [0] when k = 0. It is easy to see that −(α1 + · · · + αk) = (−α1) + · · · + (−αk); that is, the additive inverse of the sum is the sum of the additive inverses. In the special case where all the αi’s have the same value α, we define k · α := α + · · · + α (with k terms); thus, 0 · α = [0], 1 · α = α, 2 · α = α + α, 3 · α = α + α + α, and so on. The additive inverse of k · α is k · (−α), which we may also write as (−k) · α; thus, (−1) · α = −α, (−2) · α = (−α) + (−α) = −(α + α), and so on. Therefore, the notation k · α, or more simply, kα, is defined for all integers k. Note that for all integers k and a, we have k[a] = [ka] = [k][a].

Analogously, for α1, . . . , αk ∈ Zn, we may write their product as α1 · · · αk.
By convention, this product is [1] when k = 0. It is easy to see that if all of the αi's belong to Z∗n, then so does their product, and in particular, (Π_{i=1}^k αi)⁻¹ = Π_{i=1}^k αi⁻¹; that is, the multiplicative inverse of the product is the product of the multiplicative inverses. In the special case where all the αi's have the same value α, we define α^k := Π_{i=1}^k α; thus, α^0 = [1], α^1 = α, α^2 = αα, α^3 = ααα, and so on. If α ∈ Z∗n, then the multiplicative inverse of α^k is (α⁻¹)^k, which we may also write as α^(−k); for example, α^(−2) = α⁻¹α⁻¹ = (αα)⁻¹. Therefore, when α ∈ Z∗n, the notation α^k is defined for all integers k.

One last notational convention. As already mentioned, when the modulus n is clear from context, we usually write [a] instead of [a]n. Although we want to maintain a clear distinction between integers and their residue classes, occasionally even the notation [a] is not only redundant, but distracting; in such situations, we may simply write a instead of [a]. For example, for every α ∈ Zn, we have the identity (α + [1]n)(α − [1]n) = α^2 − [1]n, which we may write more simply as (α + [1])(α − [1]) = α^2 − [1], or even more simply, and hopefully more clearly, as (α + 1)(α − 1) = α^2 − 1. Here, the only reasonable interpretation of the symbol "1" is [1], and so there can be no confusion.

In summary, algebraic expressions involving residue classes may be manipulated in much the same way as expressions involving ordinary numbers. Extra complications arise only because when n is composite, some non-zero elements of Zn do not have multiplicative inverses, and the usual cancellation law does not apply for such elements. In general, one has a choice between working with congruences modulo n, or with the algebraic structure Zn; ultimately, the choice is one of taste and convenience, and it depends on what one prefers to treat as "first class objects": integers and congruence relations, or elements of Zn.

Exercise 2.19. Let p be an odd prime.
Show that Σ_{β∈Z∗p} β⁻¹ = Σ_{β∈Z∗p} β = 0.

Exercise 2.20. Let p be an odd prime. Show that the numerator of Σ_{i=1}^{p−1} 1/i is divisible by p. Hint: use the previous exercise.

2.5 Fermat's little theorem

Let n be a positive integer, and let α ∈ Z∗n. Consider the sequence of powers of α: 1 = α^0, α^1, α^2, . . . . Since each such power is an element of Z∗n, and since Z∗n is a finite set, this sequence of powers must start to repeat at some point; that is, there must be a positive integer k such that α^k = α^i for some i = 0, . . . , k − 1. Let us assume that k is chosen to be the smallest such positive integer. This value k is called the multiplicative order of α.

We claim that α^k = 1. To see this, suppose by way of contradiction that α^k = α^i for some i = 1, . . . , k − 1; we could then cancel α from both sides of the equation α^k = α^i, obtaining α^(k−1) = α^(i−1), which would contradict the minimality of k. Thus, we can characterize the multiplicative order of α as the smallest positive integer k such that α^k = 1. If α = [a] with a ∈ Z (and gcd(a, n) = 1, since α ∈ Z∗n), then k is also called the multiplicative order of a modulo n, and can be characterized as the smallest positive integer k such that a^k ≡ 1 (mod n).

From the above discussion, we see that the first k powers of α, that is, α^0, α^1, . . . , α^(k−1), are distinct. Moreover, other powers of α simply repeat this pattern. The following is an immediate consequence of this observation.

Theorem 2.10. Let n be a positive integer, and let α be an element of Z∗n of multiplicative order k. Then for every i ∈ Z, we have α^i = 1 if and only if k divides i. More generally, for all i, j ∈ Z, we have α^i = α^j if and only if i ≡ j (mod k).

Example 2.9. Let n = 7. For each value a = 1, . . . , 6, we can compute successive powers of a modulo n to find its multiplicative order modulo n.
    i   1^i mod 7   2^i mod 7   3^i mod 7   4^i mod 7   5^i mod 7   6^i mod 7
    1       1           2           3           4           5           6
    2       1           4           2           2           4           1
    3       1           1           6           1           6           6
    4       1           2           4           4           2           1
    5       1           4           5           2           3           6
    6       1           1           1           1           1           1

So we conclude that modulo 7: 1 has order 1; 6 has order 2; 2 and 4 have order 3; and 3 and 5 have order 6.

In the above example, we see that every element of Z∗7 has multiplicative order either 1, 2, 3, or 6. In particular, α^6 = 1 for all α ∈ Z∗7. This is a special case of Fermat's little theorem:

Theorem 2.11 (Fermat's little theorem). Let p be a prime and let α ∈ Z∗p. Then α^(p−1) = 1.

Proof. Since α ∈ Z∗p, for every β ∈ Z∗p we have αβ ∈ Z∗p, and so we may define the "multiplication by α" map τα : Z∗p → Z∗p, β ↦ αβ. It is easy to see that τα is a bijection:

Injectivity: If αβ = αβ′, then cancel α to obtain β = β′.
Surjectivity: For every γ ∈ Z∗p, α⁻¹γ is a pre-image of γ under τα.

Thus, as β ranges over the set Z∗p, so does αβ, and we have

    Π_{β∈Z∗p} β = Π_{β∈Z∗p} (αβ) = α^(p−1) Π_{β∈Z∗p} β.    (2.8)

Canceling the common factor Π_{β∈Z∗p} β ∈ Z∗p from the left- and right-hand sides of (2.8), we obtain 1 = α^(p−1).

As a consequence of Fermat's little theorem and Theorem 2.10, we see that for a prime p, the multiplicative order of α divides p − 1 for every α ∈ Z∗p. It turns out that for every prime p, there exists an element α ∈ Z∗p whose multiplicative order is equal to p − 1. Such an α is called a primitive root mod p. For instance, in Example 2.9, we saw that [3]7 and [5]7 are primitive roots mod 7. We shall prove this fact later (see Theorem 3.11), after developing some other tools. Observe that if α ∈ Z∗p is a primitive root, then every β ∈ Z∗p can be expressed as a power of α.

Fermat's little theorem is sometimes stated in the following form: for every prime p and every α ∈ Zp, we have

    α^p = α.    (2.9)

Observe that for α = 0, the equation (2.9) obviously holds. Otherwise, for α ≠ 0, Theorem 2.11 says that α^(p−1) = 1, and multiplying both sides of this equation by α yields (2.9).
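The table in Example 2.9 can be reproduced mechanically by computing successive powers. A small Python sketch (multiplicative_order is an illustrative name, not from the text):

```python
def multiplicative_order(a, n):
    """Smallest positive k with a**k ≡ 1 (mod n); assumes gcd(a, n) == 1."""
    k, power = 1, a % n
    while power != 1:
        power = power * a % n   # next entry in the column for a
        k += 1
    return k

# Reproduce the conclusion of Example 2.9: orders modulo 7.
orders = {a: multiplicative_order(a, 7) for a in range(1, 7)}
```

This yields orders 1, 3, 6, 3, 6, 2 for a = 1, . . . , 6, and each of these orders divides 6, as Theorem 2.11 together with Theorem 2.10 predicts.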
Finally, in terms of congruences, Fermat's little theorem can be stated as follows: for every prime p and every integer a, we have

    a^p ≡ a (mod p).    (2.10)

2.5.1 Application: primality testing

Fermat's little theorem forms the basis for several primality testing algorithms. Consider the following computational problem: given a large number n, decide whether n is prime or not. Here, we are thinking of n as being very large, perhaps several hundred decimal digits in length; these are the sizes of prime numbers needed in a number of cryptographic applications.

A naive approach to testing if n is prime is trial division: simply test if n is divisible by 2, 3, 5, etc., testing divisibility by all primes p up to n^(1/2) (see Exercise 1.14). Unfortunately, for the large values of n we are considering here, this would take an enormous amount of time, and is completely impractical.

Fermat's little theorem suggests the following primality test: simply select a non-zero α ∈ Zn, compute β := α^(n−1) ∈ Zn, and check if β = 1. On the one hand, if β ≠ 1, Fermat's little theorem tells us that n cannot be prime. On the other hand, if β = 1, the test is inconclusive: n may or may not be prime. We can repeat the above test several times. If any of the β's are not 1, we know n is not prime; otherwise, the test is still inconclusive.

While the Fermat primality test is not perfect, it is not hard to modify it slightly so that it becomes much more effective; the most practical primality tests used today are all minor variations on the Fermat primality test. We shall not go into the details of these variations. Rather, we shall show how to efficiently implement the Fermat primality test. The techniques we present apply to the more effective primality tests, and have many other applications.

The first issue to be addressed is how to represent elements of Zn. It is natural and convenient to work with the set of representatives {0, . . . , n − 1}.
So to multiply two elements in Zn, we multiply their representatives, and then reduce the product mod n. Just as in §1.3.1, we shall assume that we have efficient algorithms to implement these basic arithmetic operations. That still leaves the issue of how to efficiently perform the exponentiation α^(n−1).

A simple algorithm to compute β := α^(n−1) is the following:

    β ← [1]n ∈ Zn
    repeat n − 1 times
        β ← β · α

This algorithm computes the value β correctly. Moreover, the numbers that arise in the computation never get too large, since in every multiplication in Zn, the result gets reduced mod n. Unfortunately, the number of loop iterations is n − 1, and so this algorithm is actually slower than the trial division primality test that we started out with.

Fortunately, there is a much faster exponentiation algorithm, called repeated squaring. Consider the following more general problem: given α ∈ Zn and a non-negative integer e, compute α^e. Using the repeated squaring algorithm, we can compute α^e using O(log e) multiplications in Zn. Setting e := n − 1, we can therefore implement the Fermat primality test using O(log n) multiplications in Zn, which is much more practical.

As a warmup, suppose e is a power of 2, say e = 2^k. Then the following simple algorithm computes β := α^e:

    β ← α
    repeat k times
        β ← β^2

This algorithm requires just k = log2 e multiplications in Zn.

For arbitrary e, we can use the following strategy. Suppose that the binary representation of e is e = (b1 · · · bk)₂, where b1 is the high-order bit of e and bk is the low-order bit. Then the following iterative algorithm computes β := α^e:

    β ← [1]n ∈ Zn
    for i in [1 . . k] do
        β ← β^2
        if bi = 1 then β ← β · α

One can easily verify that after the ith loop iteration, we have β = α^(ei), where ei = (b1 · · · bi)₂. Indeed, observe that ei = 2·e_{i−1} + bi, and therefore, α^(ei) = α^(2·e_{i−1} + bi) = (α^(e_{i−1}))^2 · α^(bi).

Example 2.10. Suppose e = 37 = (100101)₂.
The above algorithm performs the following operations in this case:

    β ← [1]                   // computed exponent (in binary): 0
    β ← β^2, β ← β · α        // 1
    β ← β^2                   // 10
    β ← β^2                   // 100
    β ← β^2, β ← β · α        // 1001
    β ← β^2                   // 10010
    β ← β^2, β ← β · α        // 100101

Exercise 2.21. Suppose α ∈ Z∗n has multiplicative order k. Let m be any integer. Show that α^m has multiplicative order k/gcd(m, k). Hint: use Theorem 2.10 and Exercise 2.10.

Exercise 2.22. Suppose α ∈ Z∗n has multiplicative order k and β ∈ Z∗n has multiplicative order ℓ, where gcd(k, ℓ) = 1. Show that αβ has multiplicative order kℓ. Hint: suppose (αβ)^m = 1, and deduce that both sides of the equation α^m = β^(−m) must have multiplicative order 1 (the previous exercise may be helpful); from this, deduce that both k and ℓ divide m and then apply the result of Exercise 1.12.

Exercise 2.23. Let α ∈ Z∗n. Suppose that for a prime q and positive integer e, we have α^(q^e) = 1 and α^(q^(e−1)) ≠ 1. Show that α has multiplicative order q^e.

Exercise 2.24. Find all primitive roots mod 19. Show your work. To simplify the calculations, you may make use of Fermat's little theorem, as well as the result of Exercise 2.21.

Exercise 2.25. Calculate the order of 9 mod 100. Show your work by building a table of powers 9^i mod 100. In addition, from this table, identify the multiplicative inverse of 9 mod 100.

Exercise 2.26. Let n ∈ Z with n > 1. Show that n is prime if and only if α^(n−1) = 1 for every
non-zero α ∈ Zn .
Exercise 2.27. Let p be any prime other than 2 or 5. Show that p divides infinitely many of the
numbers 9, 99, 999, etc.
Exercise 2.28. Let n be an integer greater than 1. Show that n does not divide 2^n − 1. Hint:
assume that n divides 2^n − 1; now consider a prime p which divides n, so that p also divides 2^n − 1,
and derive a contradiction.
Exercise 2.29. This exercise develops an alternative proof of Fermat’s little theorem.
(a) Using Exercise 1.16, show that for all primes p and integers a, we have (a + 1)^p ≡ a^p + 1 (mod p).
(b) Now derive Fermat’s little theorem from part (a).
Exercise 2.30. Compute 3^99 mod 100 using the repeated squaring algorithm. Show your work.
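The left-to-right binary exponentiation algorithm of §2.5.1, and the Fermat test built on top of it, can be sketched as follows. This is a minimal Python sketch, not the text's own code; power_mod and fermat_test are illustrative names, and Python's built-in pow(a, e, n) provides the same functionality:

```python
def power_mod(a, e, n):
    """Compute a**e mod n by left-to-right binary exponentiation:
    for each bit of e (high-order first), square, then multiply by a
    when the bit is 1; exactly the algorithm of Section 2.5.1."""
    beta = 1
    for bit in bin(e)[2:]:
        beta = beta * beta % n      # beta <- beta^2
        if bit == "1":
            beta = beta * a % n     # beta <- beta * alpha
    return beta

def fermat_test(n, witnesses):
    """Fermat test: False means n is certainly composite;
    True means the test was inconclusive for every witness."""
    return all(power_mod(a, n - 1, n) == 1 for a in witnesses)
```

For e = 37 = (100101)₂, power_mod performs exactly the sequence of squarings and multiplications traced in Example 2.10.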
Chapter 3
Rings and polynomials
This chapter introduces the notion of polynomials whose coefficients are in a general algebraic
structure called a “ring”. So to begin with, we introduce the mathematical notion of a ring. While
there is a lot of terminology associated with rings, the basic ideas are fairly simple. Intuitively
speaking, a ring is just a structure with addition and multiplication operations that behave exactly
as one would expect — there are very few surprises.
3.1
Rings: Definitions, examples, and basic properties
Definition 3.1 (Ring). A ring is a set R together with addition and multiplication operations
on R, such that:
(i) addition is commutative and associative;
(ii) there exists an additive identity (or zero element), denoted 0R , where a+0R = a for all a ∈ R;
(iii) every a ∈ R has an additive inverse, denoted −a, where a + (−a) = 0R ;
(iv) multiplication is commutative and associative;
(v) there exists a multiplicative identity, denoted 1R , where a · 1R = a for all a ∈ R;
(vi) multiplication distributes over addition; that is, for all a, b, c ∈ R, we have a(b + c) = ab + ac.
Note that in this definition, the only property that connects addition and multiplication is the
distributive law (property (vi)). If the ring R is clear from context, we may just write 0 and 1
instead of 0R and 1R .
Example 3.1. The set Z under the usual rules of multiplication and addition forms a ring.
Example 3.2. For n ≥ 1, the set Zn under the rules of multiplication and addition defined in §2.4
forms a ring.
Example 3.3. The set Q of rational numbers under the usual rules of multiplication and addition
forms a ring.
Example 3.4. The set R of real numbers under the usual rules of multiplication and addition
forms a ring.
Example 3.5. The set C of complex numbers under the usual rules of multiplication and addition
forms a ring. Every α ∈ C can be written (uniquely) as α = a + bi, where a, b ∈ R and i = √−1.
If α0 = a0 + b0 i is another complex number, with a0 , b0 ∈ R, then
α + α0 = (a + a0 ) + (b + b0 )i and αα0 = (aa0 − bb0 ) + (ab0 + a0 b)i.
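These addition and multiplication rules can be spot-checked against Python's built-in complex type. A quick sketch, not from the text (mul is an illustrative name):

```python
def mul(a, b, a2, b2):
    """Multiply (a + b*i) by (a2 + b2*i) using the formula above,
    returning the pair (real part, imaginary part)."""
    return (a * a2 - b * b2, a * b2 + a2 * b)

# Compare with Python's native complex arithmetic for a sample pair.
x, y = complex(2, 3), complex(-1, 4)
assert mul(2, 3, -1, 4) == ((x * y).real, (x * y).imag)
```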
Example 3.6. The trivial ring consists of a single element. It is not a very interesting ring.
We state some simple facts:
Theorem 3.2. Let R be a ring. Then:
(i) the additive and multiplicative identities are unique;
(ii) for all a, b, c ∈ R, if a + b = a + c, then b = c;
(iii) −(a + b) = (−a) + (−b) for all a, b ∈ R;
(iv) −(−a) = a for all a ∈ R;
(v) 0R · a = 0R for all a ∈ R;
(vi) (−a)b = −(ab) = a(−b) for all a, b ∈ R;
(vii) (−a)(−b) = ab for all a, b ∈ R.
While there are many parts to this theorem, everything in it just states familiar properties
that are satisfied by ordinary numbers. What is interesting about this theorem is that all of these
properties follow directly from Definition 3.1. We shall not prove these properties here, but invite
the reader to do so as a straightforward exercise.
Because addition and multiplication in a ring R are associative, for a1, . . . , ak ∈ R, we may write the sum a1 + · · · + ak and the product a1 · · · ak without any parentheses, and there is no ambiguity; moreover, since both addition and multiplication are commutative, we may rearrange the terms in such sums and products without changing their values. We can write the sum as Σ_{i=1}^k ai and the product as Π_{i=1}^k ai. By convention, if k = 0, the sum is 0R and the product is 1R. If all of the ai's are equal to a, we can write the sum as ka and the product as a^k. Note that the additive inverse of Σ_{i=1}^k ai is Σ_{i=1}^k (−ai), and the additive inverse of ka is k(−a), which we can also write as (−k)a.

One other matter of notation: for a, b ∈ R, we can write a − b instead of a + (−b).
3.1.1
Multiplicative inverses and fields
While the definition of a ring requires that every element has an additive inverse, it does not require
that any element has a multiplicative inverse.
Definition 3.3 (Multiplicative inverses in a ring). Let R be a ring. For a ∈ R, we say
that b ∈ R is a multiplicative inverse of a if ab = 1R . We define R∗ to be the set of all elements
of R that have a multiplicative inverse.
Example 3.7. In the ring Q, every non-zero element has a multiplicative inverse. The same holds
for the rings R and C.
Example 3.8. In the ring Z, the only elements with multiplicative inverse are ±1. That is, Z∗ =
{±1}. While the integer 2 has a multiplicative inverse 1/2 ∈ Q, it does not have a multiplicative
inverse in Z.
Example 3.9. As we saw in §2.4, Z∗n = {[a]n : gcd(a, n) = 1}.
Example 3.10. If R is a ring, then 1R ∈ R∗ (this follows from the fact that 1R · 1R = 1R, by definition). Moreover, if R is a non-trivial ring, then we have 0R ≠ 1R, and moreover, 0R ∉ R∗ (these observations follow from part (v) of Theorem 3.2).
Let R be a ring. If a ∈ R∗, then it is not hard to prove that the multiplicative inverse must be unique, and it is denoted a⁻¹. It is not hard to see that if a, b ∈ R∗, then so are a⁻¹ and ab; indeed, one can easily verify that we must have

    (a⁻¹)⁻¹ = a and (ab)⁻¹ = a⁻¹ b⁻¹.

If a has a multiplicative inverse a⁻¹, and k is a non-negative integer, then the multiplicative inverse of a^k is (a⁻¹)^k, which we may write as a^(−k). Naturally, for a ∈ R and b ∈ R∗, we may write a/b instead of a b⁻¹.
As the above examples demonstrate, it need not be the case that every non-zero element in a
ring has a multiplicative inverse. However, rings with this property are especially nice and deserve
to be singled out:
Definition 3.4 (Field). A non-trivial ring where every non-zero element has a multiplicative
inverse is called a field.
Example 3.11. The rings Q, R, and C are fields.
Example 3.12. The ring Z is not a field.
Example 3.13. The ring Zn is a field if and only if n is prime.
Suppose a, b, c are elements of R that satisfy the equation
ab = ac.
If a ∈ R∗ , we may multiply both sides of this equation by a−1 to infer that
b = c.
This is the cancellation law for R. In particular, if R is a field, then this cancellation law holds
provided a 6= 0.
We also have the following result on the existence and uniqueness of solutions of linear
equations in R. Let a ∈ R∗ and b ∈ R. Then the equation
az = b
has a unique solution z, namely, z := a−1 b. In particular, if R is a field, then this equation has a
unique solution z provided a 6= 0.
Exercise 3.1. Show that the product of two non-zero elements in a field is also non-zero.
Exercise 3.2. Give an example of a ring R and two non-zero elements a, b ∈ R such that ab = 0R.
3.2
Polynomial rings
If R is a ring, then we can form the ring of polynomials R[X], consisting of all polynomials
g = a0 + a1 X + · · · + ak X^k in the indeterminate, or "formal" variable, X, with coefficients ai in R,
and with addition and multiplication defined in the usual way.
Example 3.14. Let us define a couple of polynomials over the ring Z5 :
    g := [3]X + X^2,
    h := [1] + [2]X + [4]X^3.

We have:

    g + h = [1] + X^2 + [4]X^3,
    g · h = [3]X + [2]X^2 + [2]X^3 + [2]X^4 + [4]X^5.
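Example 3.14 can be verified by representing a polynomial as its list of coefficients [a0, a1, . . .] and doing all coefficient arithmetic mod 5. A minimal Python sketch (poly_add and poly_mul are illustrative names, not from the text):

```python
def poly_add(g, h, n):
    """Add coefficient lists g, h of polynomials over Z_n."""
    k = max(len(g), len(h))
    g = g + [0] * (k - len(g))          # pad to a common length
    h = h + [0] * (k - len(h))
    return [(x + y) % n for x, y in zip(g, h)]

def poly_mul(g, h, n):
    """Multiply coefficient lists g, h: coefficient i of the product
    is the sum of g[j]*h[k] over all j + k = i, reduced mod n."""
    p = [0] * (len(g) + len(h) - 1)
    for i, gi in enumerate(g):
        for j, hj in enumerate(h):
            p[i + j] = (p[i + j] + gi * hj) % n
    return p

g = [0, 3, 1]       # [3]X + X^2 over Z_5
h = [1, 2, 0, 4]    # [1] + [2]X + [4]X^3
```

Here poly_add(g, h, 5) yields [1, 0, 1, 4] and poly_mul(g, h, 5) yields [0, 3, 2, 2, 2, 4], matching the sum and product displayed in the example.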
Elements of R are also considered to be polynomials. Such polynomials are called constant
polynomials. In particular, 0R is the additive identity in R[X] and 1R is the multiplicative identity
in R[X]. Note that if R is the trivial ring, then so is R[X].
So as to keep the distinction between ring elements and indeterminates clear, we shall use the
symbol “X” only to denote the latter. Also, for a polynomial g ∈ R[X], we shall in general write
this simply as “g,” and not as “g(X).” Of course, the choice of the symbol “X” is arbitrary.
3.2.1
Formalities
For completeness, we present a more formal definition of the ring R[X]. The reader should bear in
mind that this formalism is rather tedious, and may be more distracting than it is enlightening.
Formally, a polynomial g ∈ R[X] is an infinite sequence (a0 , a1 , a2 , . . .), where each ai ∈ R, but only
finitely many of the ai's are non-zero (intuitively, ai represents the coefficient of X^i).
For
g = (a0 , a1 , a2 , . . .) ∈ R[X] and h = (b0 , b1 , b2 , . . .) ∈ R[X],
we define
g + h := (s0 , s1 , s2 , . . .) and gh := (p0 , p1 , p2 , . . .),
where for i = 0, 1, 2, . . . ,

    si := ai + bi    (3.1)

and

    pi := Σ_{j+k=i} aj bk,    (3.2)

the sum being over all pairs (j, k) of non-negative integers such that i = j + k (which is a finite sum).
We leave it to the reader to verify that g + h and gh are polynomials (i.e., only finitely many of the
si ’s and pi ’s are non-zero). The reader may also verify that all the requirements of Definition 3.1 are
satisfied: the additive identity is the all-zero sequence (0R , 0R , 0R , . . .); the multiplicative identity
is the sequence (1R , 0R , 0R , . . .), that is, the sequence consists of 1R followed by all zeros.
For c ∈ R, we can identify c with the corresponding “constant” polynomial (c, 0R , 0R , . . .). If
we define the polynomial
X := (0R , 1R , 0R , 0R , . . .),
then for any polynomial g = (a0, a1, a2, . . .), if ai = 0R for all i exceeding some value k, then we have g = Σ_{i=0}^k ai X^i, and so we can return to the standard practice of writing polynomials as we did in Example 3.14, without any loss of precision.
3.2.2
Basic properties of polynomials
Let R be a ring. For non-zero g ∈ R[X], if g = Σ_{i=0}^k ai X^i with ak ≠ 0, then we call k the degree of g, denoted deg(g), we call ak the leading coefficient of g, denoted lc(g), and we call a0 the constant term of g. If lc(g) = 1, then g is called monic.

Suppose g = Σ_{i=0}^k ai X^i and h = Σ_{i=0}^ℓ bi X^i are polynomials such that ak ≠ 0 and bℓ ≠ 0, so that deg(g) = k and lc(g) = ak, and deg(h) = ℓ and lc(h) = bℓ. When we multiply these two polynomials, we get

    gh = a0 b0 + (a0 b1 + a1 b0)X + · · · + ak bℓ X^(k+ℓ).

In particular, if gh ≠ 0, then deg(gh) ≤ deg(g) + deg(h). Note that if R is a field, we must have ak bℓ ≠ 0, and so deg(gh) = deg(g) + deg(h) in this case.
For the zero polynomial, we establish the following conventions: its leading coefficient and
constant term are defined to be 0R , and its degree is defined to be −∞. With these conventions,
we may succinctly state that
for all g, h ∈ R[X], we have deg(gh) ≤ deg(g) + deg(h), and equality always holds if R
is a field.
3.2.3
Polynomial evaluation
A polynomial g = Σ_{i=0}^k ai X^i ∈ R[X] naturally defines a polynomial function on R that sends u ∈ R to Σ_{i=0}^k ai u^i ∈ R, and we denote the value of this function as g(u) (note that "X" denotes an
indeterminate, while “u” denotes an element of R). As usual, we define u ∈ R to be a root of g if
g(u) = 0.
It is important to regard polynomials over R as formal expressions, and not to identify them
with their corresponding functions. In particular, two polynomials are equal if and only if their
coefficients are equal, while two functions are equal if and only if their values agree at all inputs
in R. This distinction is important, since there are rings R over which two different polynomials
define the same function. One can of course define the ring of polynomial functions on R, but in
general, that ring has a different structure from the ring of polynomials over R.
Example 3.15. In the ring Zp, for prime p, by Fermat's little theorem, we have u^p = u for all u ∈ Zp. However, the polynomials X^p and X are not the same polynomials (in particular, the former has degree p, while the latter has degree 1).
An obvious, yet important, fact is the following:
Theorem 3.5. Let R be a ring. For all g, h ∈ R[X] and u ∈ R, if s := g + h ∈ R[X] and
p := gh ∈ R[X], then we have
s(u) = g(u) + h(u) and p(u) = g(u)h(u).
Proof. The proof is really just symbol pushing. Indeed, suppose g = Σ_i ai X^i and h = Σ_i bi X^i. Then s = Σ_i (ai + bi) X^i, and so

    s(u) = Σ_i (ai + bi) u^i = Σ_i ai u^i + Σ_i bi u^i = g(u) + h(u).

Also, we have

    p = (Σ_i ai X^i)(Σ_j bj X^j) = Σ_{i,j} ai bj X^(i+j),

and employing the result for evaluating sums of polynomials, we have

    p(u) = Σ_{i,j} ai bj u^(i+j) = (Σ_i ai u^i)(Σ_j bj u^j) = g(u) h(u).

3.2.4
Polynomial interpolation
The reader is surely familiar with the fact that two points determine a line, in the context of real
numbers. The reader is perhaps also familiar with the fact that over the real (or complex) numbers,
every polynomial of degree k has at most k distinct roots, and the fact that every set of k points
can be interpolated by a unique polynomial of degree less than k. As we will now see, these results
extend to arbitrary fields.1
Let F be a field, and consider the ring of polynomials F [X]. Analogous to integers, we can
define a notion of divisibility for F [X]: for polynomials g, h ∈ F [X] we say that g divides h, which
we may write as g | h, if gz = h for some z ∈ F [X].
Just like the integers, there is a corresponding division with remainder property for polynomials:
Theorem 3.6 (Division with remainder property). Let F be a field. For all g, h ∈ F[X] with h ≠ 0, there exist unique q, r ∈ F[X] such that g = hq + r and deg(r) < deg(h).

Proof. Consider the set S := {g − ht : t ∈ F[X]}. Let r = g − hq be an element of S of minimum degree. We must have deg(r) < deg(h), since otherwise, we could subtract an appropriate multiple of h from r so as to eliminate the leading coefficient of r, obtaining r′ := r − h · (lc(r) lc(h)⁻¹ X^(deg(r)−deg(h))) ∈ S, where deg(r′) < deg(r), contradicting the minimality of deg(r). That proves the existence of r and q. For uniqueness, suppose that g = hq + r and g = hq′ + r′, where deg(r) < deg(h) and deg(r′) < deg(h). This implies r′ − r = h · (q − q′). However, if q ≠ q′, then

    deg(h) > deg(r′ − r) = deg(h · (q − q′)) = deg(h) + deg(q − q′) ≥ deg(h),

which is impossible. Therefore, we must have q = q′, and hence r = r′.
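The existence proof is effectively an algorithm: repeatedly cancel the leading coefficient of the remainder. A Python sketch over Z_p, with polynomials as coefficient lists [a0, a1, . . .] (poly_divmod is an illustrative name; p is assumed prime so that lc(h) is invertible):

```python
def poly_divmod(g, h, p):
    """Return (q, r) with g = h*q + r and deg(r) < deg(h), over Z_p."""
    r = [c % p for c in g]
    h = [c % p for c in h]
    while h and h[-1] == 0:            # normalize so h[-1] = lc(h) != 0
        h.pop()
    while r and r[-1] == 0:
        r.pop()
    q = [0] * max(len(r) - len(h) + 1, 1)
    inv_lc = pow(h[-1], p - 2, p)      # lc(h)^(-1), by Fermat's little theorem
    while len(r) >= len(h):
        shift = len(r) - len(h)
        c = r[-1] * inv_lc % p         # next quotient coefficient, for X^shift
        q[shift] = c
        for i, hi in enumerate(h):     # subtract c * X^shift * h from r
            r[i + shift] = (r[i + shift] - c * hi) % p
        while r and r[-1] == 0:        # drop the canceled leading terms
            r.pop()
    return q, r
```

For example, dividing X^3 + [2]X + [1] by X + [1] over Z5 gives quotient X^2 + [4]X + [3] and remainder [3].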
If g = hq + r as in the above theorem, we define g mod h := r. Clearly, h | g if and only if g mod h = 0. Moreover, note that if deg(g) < deg(h), then q = 0 and r = g; otherwise, if deg(g) ≥ deg(h), then q ≠ 0 and deg(g) = deg(h) + deg(q).

Theorem 3.7. Let F be a field, g ∈ F[X], and u ∈ F be a root of g. Then (X − u) divides g.

Proof. Using the division with remainder property for polynomials, there exist q, r ∈ F[X] such that g = (X − u)q + r, with deg(r) < 1, which means that r ∈ F. Evaluating at u, we see that g(u) = (u − u)q(u) + r = r. Since u is a root of g, we must have r = 0, and therefore, g = (X − u)q, and so (X − u) divides g.

Theorem 3.8. Let F be a field, and let u1, . . . , uk be distinct elements of F. Then for every polynomial g ∈ F[X], the elements u1, . . . , uk are roots of g if and only if the polynomial Π_{i=1}^k (X − ui) divides g.

¹In fact, much of what we discuss here extends, with some modification, to more general coefficient rings.

Proof. One direction is trivial: if Π_{i=1}^k (X − ui) divides g, then it is clear that each ui is a root of g. We prove the converse by induction on k. The base case k = 1 is just Theorem 3.7. So assume k > 1, and that the statement holds for k − 1. Let g ∈ F[X] and let u1, . . . , uk be distinct roots
of g. Since uk is a root of g, then by Theorem 3.7, there exists q ∈ F [X] such that g = (X − uk )q.
Moreover, for each i = 1, . . . , k − 1, we have
0 = g(ui ) = (ui − uk )q(ui ),
and since ui − uk ≠ 0 and F is a field, we must have q(ui) = 0. Thus, q has roots u1, . . . , u_{k−1}, and by induction Π_{i=1}^{k−1} (X − ui) divides q, from which it then follows that Π_{i=1}^k (X − ui) divides g.
As an immediate consequence of this theorem, we obtain:
Theorem 3.9. Let F be a field, and suppose that g ∈ F [X], with deg(g) = k ≥ 0. Then g has at
most k distinct roots.
Proof. If g had k + 1 distinct roots u1, . . . , u_{k+1}, then by the previous theorem, the polynomial Π_{i=1}^{k+1} (X − ui), which has degree k + 1, would divide g, which has degree k, an impossibility.
We now present the main result of this section. It says that given k points (u1, v1), . . . , (uk, vk), where
the ui ’s are distinct elements of a field F and the vi ’s are arbitrary elements of F , there is a unique
polynomial g ∈ F [X] of degree less than k that “interpolates” or “passes through” these points,
that is, g(ui ) = vi for i = 1, . . . , k.
Theorem 3.10 (Lagrange interpolation). Let F be a field, let u1 , . . . , uk be distinct elements
of F , and let v1 , . . . , vk be arbitrary elements of F . Then there exists a unique polynomial g ∈ F [X]
with deg(g) < k such that g(ui) = vi for i = 1, . . . , k, namely

    g := Σ_{i=1}^k vi · Π_{j≠i} (X − uj) / Π_{j≠i} (ui − uj).

Proof. For the existence part of the theorem, one just has to verify that g(ui) = vi for the given g, which clearly has degree less than k. This is easy to see: for i = 1, . . . , k, evaluating the ith term in the sum defining g at ui yields vi, while evaluating any other term at ui yields 0. The uniqueness part of the theorem follows almost immediately from Theorem 3.9: if g and h are polynomials of degree less than k such that g(ui) = vi = h(ui) for i = 1, . . . , k, then g − h is a polynomial of degree less than k with k distinct roots, which, by the previous theorem, is impossible.

Example 3.16. Consider the field F = Z5. We use the Lagrange interpolation formula to compute the coefficients of the polynomial g ∈ F[X] of degree less than 3 such that

    g([1]) = [1], g([3]) = [4], g([4]) = [2].

Applying Theorem 3.10 with u1 = [1], u2 = [3], u3 = [4], and v1 = [1], v2 = [4], v3 = [2], we have g = L1 + L2 + L3, where

    L1 = [1] · (X − [3])(X − [4]) / (([1] − [3])([1] − [4]))
       = [1] · (X^2 + [3]X + [2]) / [1]
       = X^2 + [3]X + [2],

    L2 = [4] · (X − [1])(X − [4]) / (([3] − [1])([3] − [4]))
       = [4] · (X^2 + [4]) / [3]
       = [4][2] (X^2 + [4])          (using the fact that [3]⁻¹ = [2])
       = [3]X^2 + [2],

    L3 = [2] · (X − [1])(X − [3]) / (([4] − [1])([4] − [3]))
       = [2] · (X^2 + X + [3]) / [3]
       = [2][2] (X^2 + X + [3])      (again, using the fact that [3]⁻¹ = [2])
       = [4]X^2 + [4]X + [2].

Putting it all together, we have g = [3]X^2 + [2]X + [1].

Given distinct u1, . . . , uk ∈ F and arbitrary v1, . . . , vk ∈ F, as in Theorem 3.10, the coefficients a0, . . . , a_{k−1} of the interpolating polynomial g = Σ_i ai X^i ∈ F[X] can be expressed as the unique solution to the following matrix equation:

    [ 1  u1  · · ·  u1^(k−1) ] [ a0      ]   [ v1 ]
    [ 1  u2  · · ·  u2^(k−1) ] [ a1      ] = [ v2 ]
    [ .   .            .     ] [ .       ]   [ .  ]
    [ 1  uk  · · ·  uk^(k−1) ] [ a_(k−1) ]   [ vk ]

The k × k matrix V on the left is called a Vandermonde matrix.
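The computation in Example 3.16 can be reproduced directly from the interpolation formula. A Python sketch over Z_p for prime p (lagrange_interp is an illustrative name, not from the text):

```python
def lagrange_interp(points, p):
    """Coefficients [a0, ..., a_{k-1}] of the unique interpolating
    polynomial of degree < k through the given (u_i, v_i) over Z_p."""
    k = len(points)
    coeffs = [0] * k
    for i, (ui, vi) in enumerate(points):
        num = [1]       # numerator prod_{j != i} (X - uj), low coeff first
        denom = 1       # denominator prod_{j != i} (ui - uj)
        for j, (uj, _) in enumerate(points):
            if j == i:
                continue
            # Multiply num by (X - uj): new[d] = num[d-1] - uj*num[d].
            num = [((-uj) * c0 + c1) % p
                   for c0, c1 in zip(num + [0], [0] + num)]
            denom = denom * (ui - uj) % p
        scale = vi * pow(denom, p - 2, p) % p   # vi / denom mod p
        for d in range(k):
            coeffs[d] = (coeffs[d] + scale * num[d]) % p
    return coeffs
```

On the points of Example 3.16 this returns the coefficient list [1, 2, 3], i.e. g = [3]X^2 + [2]X + [1], in agreement with the hand computation.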
Note that one can derive the uniqueness part of Theorem 3.10 from the existence part by general facts from linear algebra. Indeed, the existence part implies that the column space of V has full rank k, which means that the null space of V has dimension 0.

Application: Erasure and Error Correcting Codes

As an application of polynomial interpolation, we show how it can be used to implement erasure codes. Suppose you have a file that consists of 3 blocks, A1, A2, A3. To protect against data loss, you could use, say, 5 computers (nodes) to store these blocks in a redundant fashion, so that if any two nodes crash, you can still recover all the blocks. For example, if the nodes are N1, N2, N3, N4, N5, you could store the blocks as follows:

    N1: A1, A3;  N2: A1, A3;  N3: A1, A2;  N4: A2, A3;  N5: A2.

You can see that each block Ai is stored on 3 nodes, so if any two nodes crash, there is still at least one copy left. While this works, you can see that this requires a lot more storage. We are storing a tota...

  