Unit 6: Computer Algebra Systems and Symbolic Computation¶
The following presentation is mostly taken from the notes of Dr Michael Monagan of Simon Fraser University for the course _Introduction to Computer Algebra_. The typesetting author (Paul Vrbik) has made some additions which are probably the source of any errors. The Julia code was provided by Dr Yoni Nazarathy.
$\newcommand{\RR}{\mathcal{R}}$ $\newcommand{\FF}{\mathbb{F}}$ $\newcommand{\QQ}{\mathbb{Q}}$ $\newcommand{\ZZ}{\mathbb{Z}}$ $\newcommand{\NN}{\mathbb{N}}$ $\newcommand{\PP}{\mathbb{P}}$ $\renewcommand{\cong}{\equiv}$ $\renewcommand{\mod}{\bmod}$ $\renewcommand{\mod}{\bmod}$ $\DeclareMathOperator{\content}{cont}$ $\DeclareMathOperator{\primpart}{primpart}$ $\DeclareMathOperator{\deg}{deg}$ $\DeclareMathOperator{\lc}{lc}$ $\DeclareMathOperator{\lt}{lt}$ $\DeclareMathOperator{\lm}{lm}$
See slides of the motivational part of the lecture (2022 version).
path = "/Users/uqpvrbik/Library/Mobile Documents/com~apple~CloudDocs/UQ/MATH2504/2504_2025_project1";
using Pkg;
Pkg.activate(raw"$(path)");
Pkg.resolve();
Pkg.update();
Pkg.instantiate();
include("$(path)/poly_factorization_project.jl");
Activating
project at `~/Library/Mobile Documents/com~apple~CloudDocs/UQ/MATH2504/jupyter/$(path)`
No Changes to `~/Library/Mobile Documents/com~apple~CloudDocs/UQ/MATH2504/jupyter/$(path)/Project.toml` No Changes to `~/Library/Mobile Documents/com~apple~CloudDocs/UQ/MATH2504/jupyter/$(path)/Manifest.toml`
Updating registry at `~/.julia/registries/General.toml` No Changes to `~/Library/Mobile Documents/com~apple~CloudDocs/UQ/MATH2504/jupyter/$(path)/Project.toml` No Changes to `~/Library/Mobile Documents/com~apple~CloudDocs/UQ/MATH2504/jupyter/$(path)/Manifest.toml`
Symbolic Computation¶
To compute exactly means no error is introduced during arithmetic --- something not true when working with floats. In a computer algebra system $\sqrt{2} = \sqrt{2}$ and not $\sqrt{2} = 1.41\ldots$ as one may expect.
Our interest is performing exact arithmetic in rings with efficient algorithms. In particular we are interested in the: ring of integers $$\ZZ = \{0, -1, 1, 2, -2, 3, -3, \ldots \}$$ and polynomials $$\RR[x] = \left\{ \sum_{k=0}^{N} a_kx^k \, : \, k \in \NN,\, a_k \in \RR \right\}.$$
Preliminaries¶
It is first necessary to establish terms and definitions.
Let $\NN = \{0,1,2,\ldots\}$ be the set of natural numbers, $\ZZ = \{\ldots,-2,-1,0,1,2,\ldots\}$ be the integers and $\PP$ be the set of odd primes.
Definition (Ring)¶
A ring $\RR$ is a set with addition $(+)$ and multiplication $(\cdot)$ satisfying the following conditions called the ring axioms:
- $(\RR,+)$ is an abelian group, that is:
- $(+)$ is associative and commutative,
- $\exists b \in \RR$ called the additive identity satisfying $\forall a \in \RR \; a + b = a$,
- $\forall a \in \RR\; \exists b \in \RR$ called the additive inverse satisfying $a + b = 0$.
- $(\RR,\cdot)$ is a monoid, that is:
- $(\cdot)$ is associative, and
- $\exists b \in \RR$ called the multiplicative identity satisfying $\forall a \in \RR; \; a \cdot b = b \cdot a = a$
- Multiplication distributes over addition. That is, $\forall a,b,c \in \RR$:
- $a \cdot (b+c) = (a \cdot b) + (a \cdot c)$
- $(b+c) \cdot a = (b \cdot a) + (c \cdot a)$
The ring is called a commutative ring when the multiplicative operator is commutative (i.e. $a \cdot b = b \cdot a$).
Example¶
- The $\RR = \{ 0 \}$ (the zero ring) is a trivial ring.
- The naturals $\NN$ do not form a ring as they lack additive inverses.
- Arithmetic on the clock forms a ring.
Definition (Zero Divisor)¶
An element $a \in \RR$ is called a zero divisor if there is nonzero $b \in \RR$ such that $ab = 0$.
Definition (Integral Domain)¶
An integral domain is a nonzero commutative ring without zero divisors.
Examples¶
- The integers $\ZZ$ form an integral domain.
- 'Clock arithmetic' does not form an integral domain because, for instance, $3 \cdot 4$ is 12 which is the zero.
Definition (Field)¶
A ring $\RR$ is also a field when each non-zero element has a multiplicative inverse. Equivalently, $\RR$ is a ring when $$ \forall a \in \RR^{\neq 0};\; \exists b \in \RR \; : \; a \cdot b. $$
Example¶
For instance, the rationals $$\QQ = \left\{ \frac{a}{b} \,:\, a,b \in \ZZ, \, b \neq 0 \right\}$$ is a field because $\frac a b \cdot \frac b a = 1$ when $a,b \neq 0$.
Theorem (Division Algorithm)¶
Let $a,b \in \ZZ$ with $b > 0$. There is unique $q$ and $r$ (called the quotient and remainder) satisfying $$a = b \cdot q + r$$ with $r < |q|$.
# An example
a, b = 23, 7
(23, 7)
q = a ÷ b # \div + [TAB]
3
r = a % b
2
a == b * q + r
true
# Recall: abstract type «name» <: «supertype» end
function remainder(a::T, b::T) where T <: Integer
#Note that later we'll extend this to integral domains
#short circuit evalution gives cheeky if statements
a < 0 && return remainder(-a, b) #later replace `0` with `zero(T)`
a < b && return a
return remainder(a-b, b)
end
remainder (generic function with 1 method)
#note there is an in-built `rem`
remainder(23, 7), rem(23, 7), 23 % 7, 23 - (23 ÷ 7)*7
(2, 2, 2, 2)
remainder(15, 5), remainder(4, 101), remainder(125, 4)
(0, 4, 1)
function quo(a::T, b::T) where T <: Integer
a < 0 && return -quo(-a,b)
a < b && return 0
return 1 + quo(a-b, b)
end
quo (generic function with 2 methods)
quo(15, 5), quo(4, 101), quo(125, 4)
(3, 0, 31)
for (a,b) in [(15, 5), (4, 101), (125,4 )]
q, r = quo(a, b), rem(a, b)
println("$a = $q⋅$b + $r")
end
15 = 3⋅5 + 0 4 = 0⋅101 + 4 125 = 31⋅4 + 1
Of course, in Julia we have % and ÷ for remainder and and quotient.
(Note that in Python // is the quotient whereas in Julia // defines a rational type).
23 // 7, float(23//7) #The first // is a rational type #!!! Note in Python // is ÷
(23//7, 3.2857142857142856)
Incidentally, another way to express the quotient/remainder relation is by way of "proper" fractions. $$ 23 = 3\cdot 7 + 2 \iff \frac {23} 7 = 3 + \frac {2} {7} $$
(23 ÷ 7), (23 % 7) // 7
(3, 2//7)
23 // 7 == (23 ÷ 7) + (23 % 7) // 7
true
Definition (Euclidean Domain)¶
A commutative ring $\RR$ is called a Euclidean Domain when $\RR$
- is an integral domain, and
- has the division algorithm.
Elementary Number Theory¶
The sets $\ZZ_n = \{0,1,\ldots,n-1\}$ for $n \in \NN$ are rings called residue classes when addition and multiplication are done 'mod n'. This is sometimes called 'clock arithmetic' as three hours past eleven is two because $11 + 3 \cong 2 \mod 12$.
We say/write $$a \cong b \mod c$$ when $a$ is congruent to $b$ modulo $c$. Also: $$a \cong b \mod c \iff \mathop{\textrm{rem}}(a,c) = b \iff \texttt{a \% c = b}.$$
We have already seen something similar with overflow. In particular, working in an 8-bit register is essentially doing arithmetic modulo $2^8$.
UInt8(2^8-3) + UInt8(4)
0x01
(2^8-3 + 4) % 2^8
1
Symmetric Mod¶
We may also use negatives values, here
$\ZZ_n = \{ -{\rm quo}(n,2),\, 0,\,\ldots,\,{\rm quo}(n,2)\}$. For instance, $\ZZ_7 = \{-3,-2,-1,0,1,2,3\}$ and
$$
5 \cong -2 ~{\rm smod}~ 7.
$$
We do not use this notation and simply move use mod as smod when we want as there is no effect on the theory and presentation.
n = 6
Z_n = 0:(n-1)
0:5
typeof(Z_n)
UnitRange{Int64}
A = [(x+y) % n for y in Z_n, x in Z_n]
6×6 Matrix{Int64}:
0 1 2 3 4 5
1 2 3 4 5 0
2 3 4 5 0 1
3 4 5 0 1 2
4 5 0 1 2 3
5 0 1 2 3 4
Notice:
- the addition table is symmetric (equal to its transpose) which can only happen when the addition is commutative,
- the additive identity is $0$,
- each column (and row) has $0$ and thereby each element has an additive inverse: for instance, the third column indicates that $2$ has additive inverse $4$ and correspondingly $2 + 4 \cong 0 \mod 6$.
(A .== 0)
6×6 BitMatrix: 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
additive_inverses = [findfirst((A .== 0)[:,k+1])-1 for k in Z_n]
println(additive_inverses) # print horizontally instead of vertically
[0, 5, 4, 3, 2, 1]
Example (Multiplication Table)¶
$\ZZ_6$ has the following multiplication table
M = [(x*y) % n for y in Z_n, x in Z_n]
6×6 Matrix{Int64}:
0 0 0 0 0 0
0 1 2 3 4 5
0 2 4 0 2 4
0 3 0 3 0 3
0 4 2 0 4 2
0 5 4 3 2 1
Notice:
- the multiplicative identity is $1$,
- not all rows contain one, which means there are some elements that do not have multiplicative inverse. For instance there is no $b \in \ZZ_6$ such that $2 \cdot b \cong 1 \mod 6$.
println([(2*y) % n for y in Z_n])
[0, 2, 4, 0, 2, 4]
mult_inverses = Z_n[[sum((M .== 1)[:,k+1]) > 0 for k in Z_n]]
println("Elements with a multiplicative inverse: ", mult_inverses )
println("Elements withoout a multiplicative inverse: ", setdiff(Z_n,mult_inverses))
Elements with a multiplicative inverse:
[1, 5] Elements withoout a multiplicative inverse: [0, 2, 3, 4]
Notice $\ZZ_6$ is not a field because, recall, there is no multiplicative inverse for $2$. However, $\ZZ_p$ is a field when $p$ is a prime number.
Example¶
$\ZZ_5 = \{0,\ldots,4\}$ is a field and has the following multiplication table.
n = 5
Z_n = 0:(n-1)
0:4
M = [(x*y) % n for y in Z_n, x in Z_n]
5×5 Matrix{Int64}:
0 0 0 0 0
0 1 2 3 4
0 2 4 1 3
0 3 1 4 2
0 4 3 2 1
mult_inverses = Z_n[[sum((M .== 1)[:,k+1]) > 0 for k in Z_n]]
println("Elements with a multiplicative inverse: ", mult_inverses )
println("Elements withoout a multiplicative inverse: ", setdiff(Z_n,mult_inverses))
Elements with a multiplicative inverse: [1, 2, 3, 4] Elements withoout a multiplicative inverse: [0]
Notice every column (and row) contains $1$ asides the first column (and row) which corresponds to $0$. For instance, the third column indicates the inverse of $3$ is $2$ and correspondingly $3 \cdot 2 \cong 1 \mod 5$.
Greatest Common Divisor¶
The greatest common divisor is one of the fundamental operations of a computer algebra system. For instance, we need gcd to reduce fractions to their canonical form.
Def (GCD)¶
Let $a,b \in \ZZ$ not both 0. We say $g \in \ZZ$ is the greatest common divsor of $a$ and $b$, denoted $\gcd(a,b)$, when
- $g | a$ and $g | b$ ($g$ is a common divisor),
- $h | a \;\wedge\; h | b \implies h | g$ (greatest),
- $g > 0$ (required for uniqueness)
Example (GCD)¶
- The $\gcd(6,4) = \gcd(2\cdot 3, 2 \cdot 2) = 2$.
- The $\gcd(6,0) = \gcd(1\cdot6, 0\cdot 6) = 6$.
#The is an in-built gcd() function for integers (and a few more types)
#but we'll soon make our own
using Random; Random.seed!(10)
a_test = *(rand(1:100,5)...)
b_test = *(rand(1:100,5)...)
@show a_test, b_test
Base.gcd(a_test, b_test)
(a_test, b_test) = (4145382, 1056517560)
18
Euclid's Algorithm¶
The Euclidean Algorithm computes the gcd of two integers. (Actually the Euclidean Algorithm computers gcds for any two elements of a Euclidean Domain). The algorithm exploits the following property of the gcd:
Lemma¶
Let $a,b\in \ZZ^{>0}$ and $a = b\cdot q + r$ with $r \in [0,b)$. Then
- $\gcd(b, a) = \gcd(a, b)$,
- $\gcd(a, b) = gcd(r, b)$,
- $\gcd(a, b) = \gcd(a-b, b)$.
It is straightforward to turn this lemma into an algorithm:
function euclid_alg(a,b)
(b == 0) && return a
return euclid_alg(b, a % b) # Rule 1 and 2
end
euclid_alg(2, 2*2), euclid_alg(3*7, 2*3), euclid_alg(2*2, 13)
(2, 3, 1)
Base.gcd(a_test, b_test), euclid_alg(a_test,b_test)
(18, 18)
Extended Euclid's Algorithm¶
The Extended Euclidean Algorithm in addition to the gcd(a, b), also computes $s$ and $t$ (called the Bézout coefficients) such that $$as + bt = \gcd(a,b).$$
function i_ext_euclid_alg(a,b)
(a == 0) && return b, 0, 1
g, s, t = i_ext_euclid_alg(b % a, a)
return g, t - (b ÷ a)*s, s
end
pretty_print_egcd((a,b),(g,s,t)) = println("$a×$s + $b×$t = $g = gcd($a, $b)") #\times + [TAB]
for (a,b) in [(4,12), (9,12), (4,13)]
pretty_print_egcd((a,b), i_ext_euclid_alg(a,b))
end
4×1 + 12×0 = 4 = gcd(4, 12) 9×-1 + 12×1 = 3 = gcd(9, 12) 4×-3 + 13×1 = 1 = gcd(4, 13)
#Note there is an in-built gcdx - let's compare against that too
Base.gcdx(a_test,b_test), i_ext_euclid_alg(a_test,b_test)
((18, 2477299, -9720), (18, 2477299, -9720))
?gcdx
search:
gcdx gcd logcdf ccdf cd ecdf cdf
gcdx(a, b)
Computes the greatest common (positive) divisor of a and b and their Bézout coefficients, i.e. the integer coefficients u and v that satisfy $ua+vb = d = gcd(a, b)$. $gcdx(a, b)$ returns $(d, u, v)$.
The arguments may be integer and rational numbers.
!!! compat "Julia 1.4" Rational arguments require Julia 1.4 or later.
Examples¶
jldoctest
julia> gcdx(12, 42)
(6, -3, 1)
julia> gcdx(240, 46)
(2, -9, 47)
!!! note
Bézout coefficients are not uniquely defined. gcdx returns the minimal Bézout coefficients that are computed by the extended Euclidean algorithm. (Ref: D. Knuth, TAoCP, 2/e, p. 325, Algorithm X.) For signed integers, these coefficients u and v are minimal in the sense that $|u| < |b/d|$ and $|v| < |a/d|$. Furthermore, the signs of u and v are chosen so that d is positive. For unsigned integers, the coefficients u and v might be near their typemax, and the identity then holds only via the unsigned integers' modulo arithmetic.
Inversion in $\ZZ_m$¶
Notice the Extended Euclidean Algorithm computes inverses in $\ZZ_m$. Given $a \in \ZZ_m^{\neq 0}$ the ${\rm egcd}(a,m)$ returns $s$ and $t$ such that $$a \cdot s + m \cdot t = \gcd(a,m).$$ Taking the entire equation $\mod m$ we get $$ \begin{align*} &\gcd(a,m) \cong a \cdot s + m \cdot t ~\mod m \\ &\implies \gcd(a,m) \cong a \cdot s + 0 ~\mod m \\ &\implies \gcd(a,m) \cong a \cdot s ~\mod m. \end{align*}$$ Provided $\gcd(a,m) = 1$ (i.e. "coprime" or "relatively prime") we get $a \cdot s \cong 1 \mod m$ and thus $a^{-1} \cong s \mod m$.
Example¶
To find the inverse of $5 \in \ZZ_{13}$ do:
a, m = 5, 13
i_ext_euclid_alg(5, 13)
(1, -5, 2)
inverse_mod(a,m) = mod(i_ext_euclid_alg(a,m)[2], m)
inverse_mod (generic function with 1 method)
i = inverse_mod(a,m)
8
mod(a*i, m)
1
Chinese Remaindering¶
Problem Statement¶
Two integers $a$ and $b$ are relatively prime when $\gcd(a,b) = 1$. Given a finite sequence of relatively prime integers $m_1,\ldots,m_k$ called the moduli and correspoing integers $u_1,\ldots,u_k$ called the images, find $u \in \ZZ$ such that \begin{align*} u &\cong u_1 \mod m_1 \\ u &\cong u_2 \mod m_2 \\ &\; \vdots \\ u &\cong u_k \mod m_k \end{align*}
Example¶
For example, given the following information \begin{align*} u &\cong 4 \mod 5 &\implies&& (u_1,\, m_1) &= (4,\, 5)\\ u &\cong 5 \mod 7 &\implies&& (u_2,\, m_2) &= (5,\, 7) \end{align*} we want to deduce $u = 19$ because
(19 % 5, 19 % 7)
(4, 5)
Theorem (CRT)¶
Let $M = m_1 \cdot m_2 \cdots \cdot m_k$. There exists a unique $u \in \ZZ$ on $\{0,\ldots,M-1\}$ such that $u \cong u_i \mod m_i$ for $i = 1,\ldots,k$.
Proof Sketch:
We can construct a number with the property $u \cong 4 \mod 5$ and $u \cong 5 \mod 7$ directly.
It is trivial to ensure $u \cong 4 \mod 5$ by letting $u = 4$ but this does not satisfy the $u \cong 5 \mod 7$ requirement.
However, $4 = 4 + 0 \cdot k$ for any $k$ and $5 \cong 0 \mod %$. This means $$4 \cong 4 + k \cdot 5 \mod 5$$ and it just remains to find $k$ so that $$ 4 + k \cdot 5 \cong 5 \mod 7. $$ This is accomplished by re-arranging terms: $$ k \cong (5-4) \cdot (5)^{-1} \mod 7 \implies k \cong 3 \mod 7 $$ and there $u = 4 + 3 \cdot 5 = 19$. It just remains to extend this approach for more than two numbers.
Proof of Existence:
(This is a constructive proof which provides an algorithm for determining $u$).
Let $$ u = v_1 + v_2m_1 + v_3m_1m_2 + \cdots + v_{k+1} m_1 m_2 \cdots m_{k} $$ and notice the vanishing terms after reducing $u$ modulo $m_j$: $$ u \mod m_j = v_1 + v_2m_1 + \cdots + v_{j}m_1\cdots m_{j-1} + 0 + \cdots + 0 $$
We have \begin{align*} u_1 &\cong v_1 \mod m_1 &&\implies v_1 \gets u_1 \\ u_2 &\cong v_1 + v_2 m_1 \mod m_2 &&\implies v_2 \gets (u_2 - v_1) \cdot (m_1)^{-1} \mod m_2 \\ u_3 &\cong v_1 + v_2 m_1 + v_3 m_1 m_2 \mod m_3 &&\implies v_3 \gets (u_3-v_1-v_2 m_1)(m_1 m_2)^{-1} \mod m_3\\ &\;\,\vdots \end{align*}
Proof of Uniquenss
Left as an exercise. :P
Example (CRT)¶
Suppose $u \cong 2 \mod 3$, $u \cong 1 \mod 5$, $u \cong 5 \mod 7$. What is $u$?
Proceeding with CRT...
uᵢ = [2, 1, 5] # \_i + [TAB]
m = [3, 5, 7]
v = Vector{Int}(undef,3)
v[1] = uᵢ[1]
v[2] = (uᵢ[2] - v[1]) * inverse_mod(m[1], m[2]) % m[2]
v[3] = (uᵢ[3] - v[1] - v[2]*m[1]) * inverse_mod(m[1]*m[2], m[3]) % m[3]
@show v #intermediate
u = v[1] + v[2]*m[1] + v[3]*m[1]*m[2]
for i in [1, 2, 3]
println("$u ≡ $(uᵢ[i]) mod $(m[i])")
end
v = [2, -2, 2] 26 ≡ 2 mod 3 26 ≡ 1 mod 5 26 ≡ 5 mod 7
The Polynomial Ring $\RR[x]$¶
By this point in your mathematics career you have surely worked with polynomials. A polynomial is something like $x^2 + 2x + 1$ but not like $\frac{x^2+1}{x-1}$ or $x \sin(x)$.
Definition¶
Let $\RR$ be a ring and $a \in \RR[x]$. Let $$ a = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0 $$ with $a_n \neq 0$.
- $a$ is called a polynomial in $x$ and $a_k x^k$ are its terms.
- The degree of $a$ is $n$: $$\deg(a) = n.$$
- The leading coefficient of $a$ is $a_n$: $$\lc(a) = a_n.$$
- The leading term of $a$ is $a_n x^n$: $$\lt(a) = a_n x^n.$$
By convention we let $\lc(0) = \lt(0) = 0$ and $\deg(0) = - \infty$.
Representation of Polynomials in Project¶
"""
A term.
"""
struct Term{C <: Number, D <: Integer} #structs are immutable by default
coeff::C
degree::D
function Term{C, D}(coeff::C, degree::D) where {C, D}
degree < 0 && error("Degree must be non-negative")
coeff != 0 ? new(coeff, degree) : new(zero(C), zero(D))
end
end
"""
Creates the zero term.
"""
function zero(::Type{Term{C, D}})::Term{C, D} where {C, D}
Term(zero(C), zero(D))
end
zero(t::Term) = zero(typeof(t))
"""
Creates the unit term.
"""
function one(::Type{Term{C, D}})::Term{C, D} where {C, D}
Term(one(C), zero(D))
end
one(t::Term) = one(typeof(t))
z = zero(PolynomialDense)
0
iszero(z)
true
une = one(PolynomialDense)
1
iszero(une - une)
true
abstract type Polynomial{C <: Number, D <: Integer} end
"""
A generic error message for functionality that should be implemented by concrete subtypes.
"""
function not_implemented_error(p::Polynomial, method::String)
error("The method '$(method)' is not yet implemented for a polynomial of type $(typeof(p))")
end
"""
A Polynomial type - designed to be for polynomials with integer coefficients.
"""
struct PolynomialDense{C, D} <: Polynomial{C, D}
terms::Vector{Term{C, D}}
#Inner constructor of 0 polynomial
PolynomialDense{C, D}() where {C, D} = new{C, D}([zero(Term{C, D})])
#Inner constructor of polynomial based on arbitrary list of terms
function PolynomialDense{C, D}(vt::Vector{Term{C, D}}) where {C, D}
#Filter the vector so that there is not more than a single zero term
vt = filter((t)->!iszero(t), vt)
if isempty(vt)
vt = [zero(Term{C, D})]
end
max_degree = maximum((t)->t.degree, vt)
terms = [zero(Term{C, D}) for i in 0:max_degree] #First set all terms with zeros
#now update based on the input terms
for t in vt
terms[t.degree + 1] = t #+1 accounts for 1-indexing
end
return new{C, D}(terms)
end
end
Addition in $\RR[x]$¶
For $\RR[x]$ to be a ring we must define addition.
def ADDITION:
Input a, b in ZZ[x]
Ouput a + b (definitionally)
if a == 0
return b
if b == 0
return a
if deg(a) == deg(b)
term = lt(a) + lt(b) # addtion on *terms* ax^n + bx^n = (a+b)x^n
return term + ADDITION(a-lt(a), b-lt(b))
if deg(a) > deg(b)
return lt(a) + ADDITION(a-lt(a), b)
# deg(b) > deg(a)
return lt(b) + ADDITION(a, b-lt(b))
x = x_poly(PolynomialDense)
x^2 + 3*x + 2
x² + 3x + 2
x = x_poly(PolynomialDense)
p1 = 2x^2 + 3x + (-5) #Project: Fix this so we can simply do 2x^2+3x-5
p2 = -3x^2 - 4x + 6
p1+p2
-x² - x + 1
"""
Add a polynomial and a term.
"""
function +(p::PolynomialDense{C, D}, t::Term{C, D})::PolynomialDense{C, D} where {C, D}
iszero(t) && return p
p = deepcopy(p)
if t.degree > degree(p)
push!(p, t)
else
if !iszero(p.terms[t.degree + 1]) #+1 is due to indexing
p.terms[t.degree + 1] += t
else
p.terms[t.degree + 1] = t
end
end
return trim!(p)
end
Multiplication in $\RR[x]$¶
For $\RR[x]$ to be a ring we must define multiplication.
def MULTIPLICATION:
Input a, b in ZZ[x]
Ouput a * b (definitionally)
if a = 0 or b = 0:
return 0
if lt(a) = a and lt(b) = b: # a and b are single terms
return a * b # product of *terms* ax^n * bx^m = (ab)x^(n+m)
write a = a1 + a2 with deg(a1) != deg(a2)
write b = b1 + b2 with deg(b1) != deg(b2)
# DISTRIBUTION PROPERTY
# (a1 + a2)*(b1 + b2) = a1*b1 + a1*b2 + a2*b1 + a2*b2
return M(a1,b1) + M(a1,b2) + M(a2,b1) + M(a2,b2) where M = MULTIPLICATION
"""
Multiply two polynomials.
"""
function *(p1::P, p2::P)::P where {P <: Polynomial}
p_out = P()
for t in p1
new_summand = (t * p2)
p_out = p_out + new_summand
end
return p_out
end
x = x_poly(PolynomialDense)
p1 = 2x^2 + 3x + (-5)
p2 = -3x^2 - 4x + 6
p1 * p2
-6x⁴ - 17x³ + 15x² + 38x - 30
The following properties are consequences of our definition. (They are obligated to be).
Proposition (Polynomial Multiplication)¶
When $\RR$ is an integral domain (a nonzero commutative ring in which the product of any two nonzero elements is nonzero) and $a,b \in \RR[x]^{\neq 0}$ then
- $\deg(ab) = \deg a + \deg b$,
- $\lc(ab) = a_b \cdot b_m = \lc(a) \cdot \lc(b)$,
- $\lm(ab) = \lm(a) \cdot \lm(b)$,
- $\lt(ab) = \lt(a) \cdot \lt(b)$.
Proof (Sketch) $$ \begin{align*} a \times b &= (a_n x^n + \cdots + a_0)(b_mx^m + \cdots + b_0) \\ &= a_n \cdot b_m x^{n+m} + \cdots + a_0 \cdot b_0 \end{align*} $$
Proposition¶
Every field is also an integral domain.
Proof (Omitted)
Multiplication by Chinese Remainder Theorem (Project)¶
Let $a = a_nx^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0$. Define the height of a polynomial by $$ ||a||_{\infty} := \max\{|a_0|, \ldots, |a_n|\}. $$ So, for instance $||2x^2-5||_{\infty} = 5$.
Further let
$$\# a := \mbox{number of nonzero terms of }a.$$
So, for instance $\# 2x^2-5 = 2$.
If we intend to use CRT to cacluate the product $c = a \cdot b$ of $a,b \in \ZZ_p[x]$ we need $M = p_1,\ldots,p_\ell$ satisfying $$ M = p_1 \cdot \cdots \cdot p_\ell > 2 \cdot ||c||_\infty. $$ It is twice $||c||_\infty$ to account for negative ceofficients. For instance, $3x^2 - 3x +1$ requires coefficients in $\{ -3,-2,-1,0,1,2,3\}$ which is $\ZZ_7$ and not $\ZZ_5$.
Proposition¶
Let $a,b \in \ZZ[x]$. $$ ||c||_{\infty} \leq ||a||_{\infty} \cdot ||b||_{\infty} \cdot \min(\# a,\, \# b) $$
Proof (Sketch) Do a worst case analysis where you mutiply $a = \alpha x^n + \alpha x^{n-1} + \cdots + \alpha$ by $b = \beta x^n + \beta x^{n-1} + \cdots + \beta$ and see what the largest coefficient will end up being.
Example¶
Let \begin{align*} a = 3x - 4, && b = 6x+5, && c = 18x^2-9x-20 \end{align*} and notice $ab = c$.
$$||c||_\infty \leq 4 \cdot 6 \cdot \min(2,2) = 48$$
$$ \begin{align*} & && a_i && b_i && c_i \\[1em] \hline p_1 &= 5 && 3x + 1 && 1x + 0 && 3x^2 + 1x + 0 \\[1.5em] \hline p_2 &= 7 && 3x + 3 && 6x + 5 && 4x^2 + 5x + 1 \\[1.5em] \hline p_3 &= 3 && 0x + 2 && 0x + 2 && 0x^2 + 0x + 1 \\[1.5em] \hline \end{align*} $$
Suppose you have defined a function:
CRT([u1, u2, ..., um], [p1, p2, ..., pm]) = u such that u == uk mod pk
CRT([3,4,0], [5,7,3]) = 18 mod 3*5*7
CRT([1,5,0], [5,7,3]) = 96 mod 3*5*7
CRT([0,1,1], [5,7,3]) = 85 mod 3*5*7
Let the symmetric mod be given by
smod(a, m) = (a mod m) if (a mod m) <= m//2 else (a mod m) - m
For instance,
smod(0, 7) = 0
smod(1, 7) = 1
smod(2, 7) = 2
smod(3, 7) = 3
smod(4, 7) = -3
smod(5, 7) = -2
smod(6, 7) = -1
CRT([3,4,0], [5,7,3]) = 18 mod 3*5*7
CRT([1,5,0], [5,7,3]) = 96 mod 3*5*7 = -9 smod 3*5*7
CRT([0,1,1], [5,7,3]) = 85 mod 3*5*7 = -20 smod 3*5*7
$$ \begin{align*} & && a_i && b_i && c_i \\[1em] \hline p_1 &= 5 && 3x + 1 && 1x + 0 && 3x^2 + 1x + 0 \\[1.5em] \hline p_2 &= 7 && 3x + 3 && 6x + 5 && 4x^2 + 5x + 1 \\[1.5em] \hline p_3 &= 3 && 0x + 2 && 0x + 2 && 0x^2 + 0x + 1 \\[1.5em] \hline p &= 105 && 3x + 101 && 6x+5 && 18x^2 + 96 x + 85 \\[1.5em] \hline \end{align*} $$
So we, apply the symmetric mod and deduce $$(3x - 4)(6x+5) = 18x^2 -9x - 20.$$
Chinese Remainder Theorem on TWO polynomials
def CRT:
Input [a,b] and [n,m]
such that a in \ZZ_n[x], b in \ZZ_m[x], and gcd(n,m)=1.
Ouput c \in \ZZ_{nm}[x]
such that c ≡ a (mod n) and c ≡ b (mod m).
c <- 0 # the 0 polynomial
for k from max(deg a, deg b) to 0 by -1 do
ak <- 0 if k > deg a else ak # = coefficient on x^k in a
bk <- 0 if k > deg b else bk # = coefficient on x^k in b
ck <- iCRT([ak, bk], [n, m]) # integer chinese remainder
c <- c + ck * x^k
return c
def MULTIPLICATION:
Input a = an x^n + ... a1 x + a0, b = bm x^m + ... b1 x + b0 in ZZ[x]
Ouput a * b in ZZ[x]
height_a <- max(|an|, ..., |a1|, |a0|)
height_b <- max(|bm|, ..., |b1|, |b0|)
B <- 2 * height_a * height_b * min(n+1, m+1)
p <- 3
M <- p
c <- (a*b) mod M # multiplication in ZZ_3[x]
while M < B do
p <- NextPrime(p) # the first prime > p
c' <- a*b mod p # multiplication in ZZ_p[x]
c <- CRT([c, c'], [M, p]) # spatial optimaztion
M <- M*p
return c smod M # restore negative coefficients with symemetric mod
Division in $\FF[x]$¶
We are unable to divide in $\RR[x]$. Consider the following operation in $\ZZ[x]$ $$ \frac{x^2}{2x} = \frac{1}{2}x. $$ Even though $x^2$ and $2x$ are from $\ZZ[x]$ that their division is in $\QQ[x]$. This is because $\ZZ$ is not a field but $\QQ$ is.
It turns out $\FF[x]$ is a Euclidean Domain when $\FF$ is a field. Re-consider the division, but this time let us do the calcuation in $\ZZ_7[x]$ $$ \begin{align*} \frac{x^2}{2x} &\cong \frac{1}{2}x \mod 7 \\ &\cong 2^{-1} x \mod 7 \\ &\cong 4 x \mod 7. \end{align*} $$ Note that $\frac{1}{2} \cong 4 \mod 7$.
Theorem (Division in $\FF[x]$)¶
Let $a,b \in \FF[x]$, $b\neq0$. There exists unique polynomials $q,r \in \FF[x]$ such that $$ \begin{align*} a = bq + r &&\mbox{and}&& r = 0 &&\mbox{or}&& \deg r < \deg b \end{align*} $$
Proof (Uniqueness) Let $a = b q_1 + r_1 = b q_2 + r_2$ with $\deg r_1, \deg r_2 < \deg b$.
Consider $$ \begin{align*} b q_1 + r_1 = b q_2 + r_2 &\implies b(q_1 - q_2) = (r_2-r_1) \\ &\implies b | (r_2-r_1) & \mbox{But }\deg r_1, r_2 < \deg b\\ &\implies r_2 - r_1 = 0 \\ &\implies r_2 = r_1 \end{align*} $$
Therefore $b(q_1-q_2) = 0$. But since $\FF[x]$ is an integral domain it cannot have zero divisors. As we assumed $b\neq 0$ this implies $$(q_1-q_2)=0 \implies q_1 = q_2. ~\square$$
Proof (Existence -- The Division Algorithm)
def DIVISION:
Input a, b in ZZ_p[x] such that b != 0
Ouput q, r such that a = b*q + r and deg r < deg b in ZZ_p[x]
r, q = a, 0
while r != 0 and deg r >= deg b do:
t <- lt(r)/lt(b) # division of *terms* in ZZ_p: (ax^n)/(bx^m) = a/b mod p x^(n-m), n>m
q <- q + t
r <- r - t*b # deg r *must* monotonically decrease
return (q,r)
A couple of convenience methods
def QUO(a, b) = first(DIVISION(a,b)) in ZZ_p[x]
def REM(a, b) = first(DIVISION(a,b)) in ZZ_p[x]
""" Modular algorithm.
f divide by g
"""
function div_rem_mod_p(num::P, den::P, prime::Integer)::Tuple{P, P} where {P <: Polynomial}
f, g = mod(num,prime), mod(den,prime)
@assert degree(num) == degree(mod(num, prime))
iszero(g) && throw(DivideError())
iszero(f) && return zero(P), zero(P)
q = P()
prev_degree = degree(f)
while degree(f) ≥ degree(g)
h = P( div_mod_p(leading(f), leading(g), prime) ) #syzergy
f = mod((f - h*g), prime)
q = mod((q + h), prime)
prev_degree == degree(f) && break
prev_degree = degree(f)
end
@assert iszero( mod((num - (q*g + f)),prime))
return q, f
end
"""
The quotient from polynomial division modulo a prime.
"""
div_mod_p(num::P, den::P, prime::Integer) where {P <: Polynomial} = first(div_rem_mod_p(num, den, prime))
"""
The remainder from polynomial division modulo a prime.
"""
rem_mod_p(num::P, den::P, prime::Integer) where {P <: Polynomial} = last(div_rem_mod_p(num, den, prime))
p1 = 10x^2 + 4x + 1;
p2 = 5x + (-3); # you fix this
@show p1
@show p2;
p1 = 10x² + 4x + 1 p2 = 5x - 3
q, r = div_rem_mod_p(p1, p2, 11)
(2x + 2, 7)
div_mod_p(p1, p2, 11)
2x + 2
rem_mod_p(p1, p2, 11)
7
mod(q*p2+r - p1, 11)
0
Example¶
Consider $a = 10x^2 + 4x + 1$ divided by $b = 5x-3$ in $\ZZ_{11}[x]$.
$$ \begin{align*} r,q &\gets 10x^2 + 4x + 1, 0 \\[1em] r \neq 0 &\mbox{ and } \deg r = 2 \geq \deg b = 1 \\ t &\gets \frac{10x^2}{5x} \cong 2x \mod 11\\ q &\gets 0 + 2x\\ r &\gets (10x^2 + 4x + 1) - 2x(5x-3) \cong 10x + 1 \mod 11\\[1em] r \neq 0 &\mbox{ and } \deg r = 1 \geq \deg b = 1 \\ t &\gets \frac{10x}{5x} \cong 2 \mod 11\\ q &\gets 2x + 2\\ r &\gets (10x+1) - 2(5x-3) \cong 7 \mod 11\\[1em] r \neq 0 &\mbox{ and } \color{red}{\deg r = 0 \geq \deg b = 1} \\[1em] &\mbox{return } (2x+2, 7) \end{align*} $$
And notice $10x^2 + 4x + 1 \cong (5x-3)(2x+2) + 7 \mod 11$.
"""
The extended euclid algorithm for polynomials (of the same concrete subtype) modulo prime.
"""
function extended_euclid_alg_mod_p(a::P, b::P, prime::Integer) where {P <: Polynomial}
old_r, r = mod(a, prime), mod(b, prime)
old_s, s = one(P), zero(P)
old_t, t = zero(P), one(P)
while !iszero(mod(r,prime))
q = div_mod_p(old_r, r, prime)
old_r, r = r, mod(old_r - q*r, prime)
old_s, s = s, mod(old_s - q*s, prime)
old_t, t = t, mod(old_t - q*t, prime)
end
g, s, t = old_r, old_s, old_t
@assert mod(s*a + t*b - g, prime) == 0
return g, s, t
end
"""
The GCD of two polynomials (of the same concrete subtype) modulo prime.
"""
gcd_mod_p(a::P, b::P, prime::Integer) where {P <: Polynomial} = extended_euclid_alg_mod_p(a, b, prime) |> first
function smod(p::P, prime::Integer) where {P <: Polynomial}
p = mod(p, prime)
s = c -> c > prime ÷ 2 ? c - prime : c
P( map(t -> Term(s(t.coeff), t.degree), p) )
end
smod (generic function with 1 method)
x = x_poly(PolynomialDense)
p1 = 2x^2 + 3x + (-5) #Note we need +(-5)... try later without... how to fix?
p2 = (-3)x^2 + (-4)x + 6
p3, _, _ = extended_euclid_alg_mod_p(p1*p2, p2, 101)
(98x² + 97x + 6, 0, 1)
smod(p3, 101)
-3x² - 4x + 6
Factorization in $\ZZ[x]$¶
Analogous to factoring integers into primes we wish to factor polynomials into irreducible components.
Definition (Irreducible Polynomial)¶
An irreducible polynomial is a polynomial that cannot be factored into the product of two non-constant polynomials. In particular, $a \in \ZZ[x]$ is irreducible when $$ a = b \cdot c \implies \deg b = 0 \mbox{ or } \deg c = 0. $$
Irreducibility depends on the coefficient field:
Consider $f = x^4 - 4$. The polynomial $f$ will have different factorizations depending on which coefficient ring we are working with:
- over $\ZZ$ we have $f = (x^2-2)(x^2+2)$,
- over $\mathbb{R}$ we have $f = \left(x-\sqrt{2}\right)\left(x+\sqrt{2}\right)\left(x^2+2\right)$
- over $\mathbb{C}$ we have $f = \left(x-\sqrt{2}\right)\left(x+\sqrt{2}\right)\left(x-\sqrt{2}i\right)\left(x+\sqrt{2}i\right)$ (which is why the complex number are called a splitting field).
Factorization in $\ZZ[x]$ using Integer Factorization¶
To premise this section, we acknowledge that integer factorization is intractable and thus reducing polynomial factorization to integer factorization is somewhat pointless. It is, however, the first strategy devised for factorizations and was used until a better method was discovered in 1970.
Consider evaluating $f(x) = 9x^4-1$ at various integral points. We can see $$ f(10) = 89\,999 = 7 \cdot 13 \cdot 23 \cdot 43 = \left[(x-3)(x+3)(2x+3)(4x+3)\right]_{x=10} $$ which gives us a reasonable guess at the polynomial factorization. Unfortunately, none of those factors divide $f$ which means they cannot be a part of the factorization. So, we try groups of two integers to search for a factor: $$ \begin{align*} (7 \cdot 13) = 91 = (100-10+1) = [(x^2-x+1)]_{x=10} && \mbox{ does not divide $f$} \\ (7 \cdot 23) = 161 =(200-40+1) = [(2x^2-4x+1)]_{x=10} && \mbox{ does not divide $f$} \\ (7 \cdot 43) = 301 =(300+1) = [(3x^2+1)]_{x=10} && \mbox{ yes! this divides $f$} \end{align*} $$ We have deduced $f = (3x^2 + 1) \frac{9x^4-1}{3x^2+1} = (3x^2+1)(3x^2-1)$.
We may get luckier by chosing different eval points that have fewer integer factors: $$ f(12) = 186\,623 = 431 \cdot 433 = [(3x^2-1)(3x^2+1)]_{x=12} $$ Or we may get unlucky by choosing eval points that have more integer factors: $$ f(11) = 2^3 \cdot 7 \cdot 13 \cdot 181 $$
In any case. This is not the method we are going to use.
Simplifying the Factoring Problem¶
There are various things we assume that will simplify our algorithms. In particular, we assume our input polynomials are (and we expand on each of these terms)
- Primitive,
- Monic, and
- Squarefree.
Primitive Polynomials¶
First, notice that we do not care about scaling factors. If we can factor $k \cdot a$ for $k \in \ZZ$ and $a \in \ZZ[x]$ then we can factor $a$. Thus, we will presume the polynomials given to us are primitive (that is, have unit content).
Definition (Content)¶
The content of a polynomial is the gcd of its coefficients. Namely, if $a = a_nx^n + \cdots a_0$ then $$ \content(a) = \gcd(a_0,\, a_1,\, \ldots, a_n) $$ The primitive part of $a$ is $$ \primpart(a) = \frac{a}{\content(a)} $$
Example¶
$\content(6x^3 + 3x + 3) = 3$ and $\primpart(6x^3 + 3x + 3) = 2x^3 + x + 1$.
Second, given a polynomial to factor like $$a = (x-1)^3(x^2+x+1)^2(x^3-3x-1)^7$$ we are satisfied finding only the irreducible components: $x-1$, $x^2 + x + 1$, and $x^3 - 3x -1$ because we can recover their degrees by repeated division.
x = x_poly(PolynomialDense)
p = 6x^3 + 3x + 3
content(p)
3
prim_part(p) #Again, here prim_part returns a function that uses a prime
#but with PolynomialModP prim_part will have a simpler interface
2x³ + x + 1
Monic Polynomials¶
When considering factoring a polynomial in $\ZZ_p[x]$ we require only that the polynomial be made monic (that is, have unit leading coefficients).
Making a polynomial monic is easy. When the coefficients are taken from a field, $\lc(a)$ is guaranteed invertible and $\frac{a}{\lc(a)}$ will be monic. This will also guarantee that the factorization will be into monic irreducibles.
function make_monic(p::P, prime::Integer)::Tuple{Integer, P} where {P <: Polynomial}
p = mod(p, prime)
lc = leading(p).coeff
@show p, lc
smod_fp = smod(p * int_inverse_mod(lc, prime), prime)
return lc, smod_fp
end;
fp = 6x^3 + 3x + 3
make_monic(fp, 101)
(p, lc) = (6x³ + 3x + 3, 6)
(6, x³ - 50x - 50)
Square-Free Polynomials¶
Note here we are not presuming that $f_i$ is irreducible.
Square-free polynomials, as their name suggests, are polynomials with no repeated factors. A factor that repeats, say, three times also repeats two times and hence square-free implies repeated-root free.
Definition (Square-Free)¶
$a \in \FF[x]$ is square-free if $a$ has no repeated factors. That is, $$\lnot\exists b \in \FF[x]; \, \deg(b)>0 \mbox{ and } b^2 \mid a$$
Lemma 1¶
$a \in \ZZ_p[x]$ is square-free $\iff$ $\gcd(a,a') = 1$.
Proof Omitted.
Lemma 2¶
Let $a \in \ZZ_p[x]$ and $a = f_1^1 \cdot f_2^2 \cdot f_3^3 \cdot \cdots \cdot f_n^n$ be a square-free factorization for $a$. Then $$\gcd(a,a') = f_2^1 f_3^2 f_4^3 \cdots f_n^{n-1}.$$
Proof Omitted.
Notice that $$ \frac f {\gcd(f,\, f')} = \frac{ f_1^1 f_2^2 f_3^3 \cdots f_n^n}{f_2^1 f_3^2 f_4^3 \cdots f_n^{n-1}} = f_1f_2\cdots f_n $$ is a square-free polynomial.
Example¶
Consider $f = (x^2 + x + 1)^3 (x^3 - 3x -1)^2 \in \ZZ_{11}[x]$. We have $$ g = \gcd(f, f') = x^5 + x^4 + 9x^3 + 7x^2 + 7x + 10 $$ and $$ \frac{f}{g} = (x^2+x+1)(x^3+8x+10). $$
x = x_poly(PolynomialDense)
f1 = (x^2+x+1)
lc, sqfr_f1 = make_monic(square_free_mod_p(f1^3, 11), 11)
(p, lc) = (5x² + 5x + 5, 5)
(5, x² + x + 1)
f2 = x^3 - 3x + (-1)
lc, sqfr_f2 = make_monic(square_free_mod_p(f2^2, 11), 11)
(p, lc) = (x³ + 8x + 10, 1)
(1, x³ - 3x - 1)
_, sqfr_f = make_monic(square_free_mod_p(f1^3*f2^2, 11), 11)
smod(f1*f2, 11) - sqfr_f
(p, lc) = (3x⁵ + 3x⁴ + 5x³ + 10x² + 10x + 8, 3)
0
Factorization in $\ZZ_p[x]$¶
Let $p$ be an odd prime. That is, $p$ cannot be two.
Proposition¶
Let $a \in \ZZ[x]$ be primitive, square-free and $$a = f_1 \cdot f_2 \cdot \cdots \cdot f_\ell$$ where each $f_k$ is irreducible over $\ZZ$. If $p \not\mid \lc(a)$ then the number of factors of $a$ over $\ZZ_p$ is $\geq \ell$.
Example¶
Consider $a = 9x^4-1 = (3x^2-1)(3x^2+1)$. We say $a$ has two factors and therefore has $\ell = 2$. But also $$ \begin{align*} a &\cong (3x^2-1)(3x^2+1) \mod 5\\ a &\cong (3x^2-1) \cdot 3(x-2)(x+3) \mod 7\\ a &\cong 2 \mod 3 && \mbox{3 | $\lc(a)$} \end{align*} $$
Notation ($D_p$)¶
Let $D_p(a)$ be the set of possible degrees of factors of $a \in \ZZ[x]$ inferred from the factorization of $a$ over $\ZZ_p$.
Example (How to determine if $a \in \ZZ[x]$ is irreducible)¶
Let $a = 85x^5 + 55x^4 + 37x^3 + 35x^2 - 97x - 50$ $$ \begin{align*} a &\cong (x+1)(x^4+x^3+x^2+x+1) \mod 2 &&\implies D_2(a) = \{1,4,5\}\\ a &\cong (x+1)(x^4+x^2+x+1) \mod 3 &&\implies D_3(a) = \{1,4,5\}\\ a &\cong 0 \mod 5 && \mbox{ bad prime} \\ a &\cong (x^2+5x+5)(x^3+6x+4) \mod 7 &&\implies D_7(a) = \{2,3,5\}\\ \end{align*} $$
Now $D_2(a) \cap D_3(a) \cap D_7(a) = \{5\} \implies a$ is irreducible over $\mathbb{Z}$.
Cantor Zassenhaus Factorization¶
Let $a \in \ZZ_p[x]$, $d = \deg a > 0$. Suppose $\gcd(a,a')=1$ (i.e. $a$ has no repeated factors).
Question How can we compute the linear factors of $a$ in $\ZZ_p[x]$?
Idea 1: $$\gcd(a,\, (x-0)(x-1)\cdots(x-p+1)) \mod p$$
Question How can factor the product of linears into its linear components.
Idea 2: Pick $k=\frac{p-1}{2}$ (half) the linear factors at random. Maybe generate nontrivial GCD. Recurse. $$\gcd(a,\, (x-\sigma_0)\cdots(x-\sigma_k)) \in \left\{a,\,1,\, \mbox{proper factor of } a\right\}$$ (Note: $\sigma$ is some permutation of $\ZZ_p$.)
Question How can we get do the same thing for irreducible quadratics, cubics, quartics, $\ldots$
Answer: Essentially the same way. It will take some theory to express.
a = 5
p = 23
mod(a^p-a, p)
0
a = 15
p = 29
mod(a^p-a, p) #WRONG!
# There is an overflow....
# by analogy your power mod p will improve this
11
big(24) |> typeof
BigInt
a = big(15)
p = 29
mod(a^p-a, p)
0
Corollary¶
Let $f(x) = x^p -x \in \ZZ_p[x]$, then for any $a \in \ZZ_p$, $$f(a) \cong 0 \mod p$$ and thus $(x-a) | f$ which means $(x-a)$ is a factor $f$ for every $a$. Thus $x^p -x$ is a product of all the linear irreducible polynomials of $\ZZ_p[x]$: $$ x^p - x = (x-0)\cdots(x-p+1) $$
Supposing that $f \in \ZZ_p[x]$ is square-free and primitive we have that $$g = \gcd(x^p-x,\, f)$$ is the product of all monic linear factors of $f$.
Theorem¶
In $\ZZ_p[x]$, $$x^{p^k}-x$$ is the product of all monic irreducible polynomials in $\ZZ_p[x]$ of degreee $d | k$.
This means that $$ \begin{align*} g_1 &= gcd(a, x^p-x) && \mbox{contains all irreducible linear factors of } a \\ g_2 &= gcd(a/g_1, x^{p^2}-x) && \mbox{contains all irreducible quadratic factors of } a \\ g_3 &= gcd(a/g_1/g_2, x^{p^3}-x) && \mbox{contains all irreducible cubic factors of } a \\ &\;\vdots \end{align*} $$
Notice we now have an algorithm for decomposing a polynomial into groups of products of irreducible polyomials of the same degree.
def DISTINCT_DEGREE_FACTORIZATION:
Input a in ZZ_p[x] such that d = deg a > 0, a is monic, primitive, and a square-free.
Ouput g_1, g_2, ..., g_m such that a = g_1 * g_2 * ... * g_m and
g_i = Product of irreducible polynomials of degree i
w <- x
for k from 1 to degree(a) do
w <- Rem(w^p, a)
g[k] <- Gcd(w-x, a)
f <- Quo(a, g[k])
end do
if f <> 1 then
g[k] <- a
end if
return [g_1, ... , g_d]
function dd_factor_mod_p(f::P, prime::Integer)::Array{P} where {P <: Polynomial}
x = x_poly(P)
w = deepcopy(x)
g = Array{P}(undef,degree(f)) #Array of polynomials indexed by degree
#Looping over degrees
for k in 1:degree(f)
w = rem_mod_p(pow_mod(w,prime,prime), f, prime)
g[k] = gcd_mod_p(w - x, f, prime)
f = div_mod_p(f, g[k], prime)
end
#edge case for final factor
f != one(P) && push!(g,f)
return g
end
prime = 17
p = mod((7x^3 + 2x^2 + 8x + 1)*(x^2+x+1),prime)
println("Will factor this polynomial (mod $prime): ", p)
println("Starting with dd_factor")
dds = dd_factor_mod_p(p,prime)
Will factor this polynomial (mod 17): 7
x⁵ + 9x⁴ + 11x² + 9x + 1 Starting with dd_factor
5-element Vector{PolynomialDense}:
16x + 9
10x⁴ + 13x³ + 15x² + 5x + 2
1
1
1
Distinct Degree Splitting (Idea 2)¶
Consider that we know how to factor differences of squares like $x^2 - 1 = (x+1)(x-1)$. In our case we have $$ x^p - x = x(x^{p-1}-1) = x\left(x^\frac{p-1}{2}-1\right)\left(x^\frac{p-1}{2}+1\right). $$ and also $$ (x-0)(x-1)\cdots(x-p+1) = x\left(x^\frac{p-1}{2}-1\right)\left(x^\frac{p-1}{2}+1\right) $$ so $$\left(x^\frac{p-1}{2}-1\right)$$ must be a product of half the roots.
Thereby, $$ \gcd(g, x^\frac{p-1}{2}-1 ) \in \left\{1, g, \mbox{ proper divisor of g} \right\}. $$
Notice the effect of shifting $x$ by $\alpha \in \ZZ_p$: $$ (x+\alpha)^\frac{p-1}{2}-1 = (x-\sigma_0) \cdots (x-\sigma_k) $$ for $\sigma$ a permutation on $\ZZ_p$ and $k = \frac{p-1}2$. In particular, we have merely pushed the roots around.
Similarily, $$ x^{p^k}-x = x(x^\frac{p^k-1}2 - 1)(x^\frac{p^k-1}2 + 1) $$ is the product of all monic polynomials of degree $k$. Thereby $$ (x^\frac{p^k-1}2 - 1) $$ is a product of half the monic polynomials of degree $k$.
Theorem¶
Suppose $g_k \in \ZZ_p[x]$ is the product of monic polynomials of degree $k$. Let $w$ be a random monic polynomial of degree $k$. Then $$ \mbox{Prob}\left[\gcd(g_k, w^\frac{p^k-1}2 - 1) \not\in \left\{1, g_k \right\}\right] > \frac12 - \frac 1 {2p^2} \geq \frac49. $$
Notice this means we have an algorithm for splitting the groups returned by the distinct degree factorization.
def DISTINCT_DEGREE_SPLIT:
Input a in ZZ_p[x] monic and d in ZZ such that a = g_1 * g_2 * ... * g_m,
g_k in \ZZ_p[x] irreducible, and deg g_k = d
Ouput [g_1, g_2, ..., g_m]
if degree(a) = d then
return [a]
end if
if degree(a) = 0 then
return []
end if
w <- rand_monic_poly(d, p) # Monte Carlo step
g <- Gcd(w^((p^d-1)/2)-1, a) # Project: Optimize this.
gbar <- Quo(a, g, x)
DDS = DISTINCT_DEGREE_SPLIT
return DDS(g, d) + DDS(gbar, d) # + is list concatenation
function dd_split_mod_p(f::P, d::Integer, prime::Integer)::Vector{P} where {P <: Polynomial}
f = mod(f,prime)
degree(f) == d && return [f]
degree(f) == 0 && return []
w = rand(P, degree = d, monic = true)
w = mod(w,prime)
n_power = (prime^d-1) ÷ 2 # ÷ is integer quotient
g = gcd_mod_p(pow_mod(w,n_power,prime) - one(f), f, prime)
g = mod(g, prime)
# @show g
ḡ = div_mod_p(f, g, prime) # g\bar + [TAB]
return vcat(dd_split_mod_p(g, d, prime), dd_split_mod_p(ḡ, d, prime) )
end
@show prime, dds[2]
dd_split_mod_p(dds[2], 2, prime)
(prime, dds[2]) = (17, 10x⁴ + 13x³ + 15x² + 5x + 2)
2-element Vector{PolynomialDense}:
15x² + 15x + 15
11x² + 5x + 9
Factoring in $\ZZ_p[x]$¶
All that remains is to use utilize the previous two algorithms to factor an arbitrary polynomial from $\ZZ_p[x]$.
# Cantor Zassenhaus Factorization #
let DDF = DISTINCT_DEGREE_FACTORIZATION
let DDS = DISTINCT_DEGREE_SPLIT
def FACTOR():
Input a in ZZ_p[x]
Ouput [[g1, d1], [g2, d2], ..., [gm, dm]] such that gk, irreducible, and
a ≡ g1^d1 * g2^d2 * ... * gm^dm (mod p)
f <- a
f <- Quo(f, content(f)) # make f primitive;
f <- Quo(f, lc(f), x) # make f monic
f <- Quo(f, Gcd(f, f')) # make f square-free
[d1,...,dk] <- DDF(f)
[g1,...,gm] <- DDS(d1, 1) + ... + DDS(dk, k) # + is list concatenation
[e1,...,em] <- [max(k in ZZ : e1^k | a),..., max(k in ZZ : em^k | a)]
return [[lc(a), 1], [g1, e1], ..., [gm, em]]
function factor_mod_p(f::P, prime::Integer)::Vector{Tuple{P,Integer}} where {P <: Polynomial}
# Cantor Zassenhaus factorization
f_modp = mod(f, prime)
degree(f_modp) ≤ 1 && return [(f_modp,1)]
ret_val = Tuple{P, Int}[]
# make f square-free
sqr_fr_poly = square_free_mod_p(f, prime)
# Drop to Zp[x]
fp = mod(sqr_fr_poly, prime)
# @show "see square free part" fp
# make f primitive
content = mod(gcd(map(t -> t.coeff, fp)), prime)
fp = mod(fp * int_inverse_mod(content, prime), prime)
# @show "after square free:", ff
# make f monic
old_coeff = leading(fp).coeff
fp = mod(fp * int_inverse_mod(old_coeff, prime), prime)
# @show "after monic:", ff
dds = dd_factor_mod_p(fp, prime)
for (k,dd) in enumerate(dds)
sp = dd_split_mod_p(dd, k, prime)
sp = map((p)->div_mod_p(p, leading(p).coeff, prime), sp) #makes the polynomials inside the list sp, monic
for mp in sp
push!(ret_val, (mp, multiplicity_mod_p(f_modp,mp,prime)) )
end
end
#Append the leading coefficient as well
push!(ret_val, (leading(f_modp).coeff* one(f), 1) )
return ret_val
end
#Taken from example_script.jl
prime = 17
a = mod((7x^3 + 2x^2 + 8x + 1)*(x^2+x+1), prime)
7x⁵ + 9x⁴ + 11x² + 9x + 1
factorization = factor_mod_p(a, prime)
4-element Vector{Tuple{PolynomialDense, Integer}}:
(x + 8, 1)
(x² + 2x + 7, 1)
(x² + x + 1, 1)
(7, 1)
mod(expand_factorization(factorization), prime)
7x⁵ + 9x⁴ + 11x² + 9x + 1
Optimizing PowMod¶
Question: How do we compute $\gcd(a,x^{p^k}-x)$ when $k$ large?
Answer: Use binary powering with remainder.
Consider that there is a foolish way of calculating $8^{17^3} \mod 17 \ldots$ $$ 8^{17^3} = 7 \mbox{[10,216 digts]} 8 \mod 17 = 9 \mod 17 $$ Here we are doing $8^{17^3}$ in $\ZZ$ (not in $\ZZ_p$). The first optimization is to fix this by doing: $$ (\cdots((8 \cdot 8 \mod 17) \cdot 8 \mod 17) \cdot \cdots \cdot 8 \mod 17) $$ instead.
Second, we want to many less multiplications than $17^3$ many. We can accomplish this via repeated squaring (also called binary powering). Suppose we want to do $w^{103} \mod m$. First notice $$ 103 = 64 + 32 + 4 + 2 + 1 = (1100111) $$ where $(x)_2$ is the binary representaiton of $x$. Thereby $$ w^{103} \mod p = (w^{64}) \cdot (w^{32}) \cdot (w^4) \cdot (w^2) \cdot w^1 \mod m $$ So we just need to calculate $$ \begin{align*} && s &\gets w \\ w^2 \mod a && s &\gets s^2 \mod a \\ w^4 \mod a && s &\gets s^2 \mod a \\ w^8 \mod a && s &\gets s^2 \mod a \\ w^{16} \mod a && s &\gets s^2 \mod a \\ w^{32} \mod a && s &\gets s^2 \mod a \\ w^{64} \mod a && s &\gets s^2 \mod a \end{align*} $$ So, instead of doing $103$ multiplications we are doing less than $2 \cdot 5 = 10$ mutiplications (5 to get the squares, then worst case multiply those 5 squares together).
Returning to $8^{17^3}$ we have $$17^3 = (1001100110001)_2 = 2^{12} + 2^9 + 2^8 + 2^5 + 2^4 + 2^0$$ So here we do 12 multiplications (to get the powers of two by squaring) and then 5 more to get $$ 8^{2^{12}} \cdot 8^{2^9} \cdot 8^{2^8} \cdot 8^{2^5} \cdot 8^{2^4} \cdot 8^{2^0}. $$
For comparison, the naive method requires $17^3 = 4913$ multiplications.
Task¶
Implement a PowMod(x, m, p) which computes $x^m \mod p$ using repeated squaring in $\ZZ_p$.
def POWMOD:
Input x in ZZ, m in ZZ such that m >= 0, p a prime.
Ouput x^m mod p
let m = b0*2^0 + b1*2^1 + ... + bk*2^k with bi in {0, 1} # i.e. determine the binary representation of m
ans, w = 1, x
for i from 0 to k do
if bi == 1 then
ans *= w
w *= w
return ans
Implementation Hints¶
Addition of Sparse Polynomials¶
You should not use indexing for sparse polynomials --- there is no relationship between term degree and its position in the terms vector. Rather, limit yourself to two operations on polynomials:
- Adding a single term
p + twhich we can abstract aspush!(p::PolynomialSparse, t::Term). - Removing the leading term
p - leading(p)which we can abstract aspop!(p::PolynomialSparse)::Term. Note: The bang symbol!is userd to indicate the function mutates its input.
def PUSH:
INPUT p (a polynomial) in ZZ[x], t (term) in ZZ[x]
OUTPUT p + t
EXTEND p.terms by one cell
INSERT t into p.terms so that order is maintained
def POP:
INPUT p (a polynomial in ZZ[x])
OUTPUT leading term of p
MUTATE p gets p - leading(p)
if p is the zero polynomial then
return zero term
end if
lt <- p.terms[1] # or perhaps p.terms[length(p)]
DELETE the empty cell in p.terms
return lt
def ADD:
INPUT p and q (polynomials) in ZZ[x]
OUTPUT p + q
p', q' = copy(p), copy(q) # since pop is destructive
ans = zero polynomial # initialize
while p' is not zero or q' is not zero do
if degree(p') == degree(q'):
ltp, ltq = POP(p'), pop(q')
if ltp + ltq <> zero then
PUSH (ltp+ltq) into ans # + is addition of term
end if
end if
if degree(p') > degree(q'):
PUSH POP(p') into ans
end if
if degree(p') < degree(q'):
PUSH POP(q') into ans
end if
return ans
END