(last edit: September 18, 2024)
Here is an explainer video
Due: Due: September 20, 2024, 16:00.
Individual work: This is an individual work assignment. Plagiarism will not be accepted. Nevertheless, feel free to consult with friends or classmates via Ed and other means about how to go about certain tasks. Note that similar and not identical projects were issued in previous editions of the course. If you wish, you may look at GitHub repos of such projects from previous years (in case students made them public). Do not copy code from those repos directly - but rather use such code bases for help as needed.
Note: The teaching staff will only answer questions (via Ed, consultation hour, or practicals) regarding this assignment up to the late evening of Wednesday 18/9.
Weights and marking criteria: Total number of points: 100. There are 10 points for setting up the GitHub repo properly and following instructions for hand-in (including voice recording). The remaining 90 points are for the project tasks.
Submission format: This assignment should be submitted via a GitHub Repo and a PDF file via Blackboard.
Specific instructions for the GitHub repo are below. It is important that the GitHub repo be made private and the course user name uqMATH2504
be invited as a collaborator before the due date. It is also important to name the repo exactly as <<FIRST NAME>>-<<LAST NAME>>-2504-2024-PROJECT1
. So for example if your name is "Anthony Albanese", the repo name should be Anthony-Albanese-2504-2024-PROJECT1
.
The PDF file should be a nice formatted file that has:
Your name, student number, and assignment title (Project 1 - 2024) on the top.
A (clickable) link to your GitHub repo.
An illustration of the code and output for example_script.jl
: Show the lines of the script and display the corresponding output.
An illustration of the output of test/runtests.jl
.
The main body of the PDF file is then an indication of each location of the code project where you made changes and improvements, in view of the tasks below. That is, present the new snippets of code that you created and indicate what they do via brief descriptions. It is also a place to answer any questions that require output, or description.
The recommended way to make the PDF file is via a Jupyter notebook where you copy in some code and output into the notebook, and in certain cases use include()
to run Julia code if appropriate. You also comment on questions in this PDF (e.g. when asked to answer things not via code). If desired you could keep this jupyter notebook (.ipynb
file) in the repo. However, this Jupyter notebook will not be checked (only the PDF file), and it isn't required to be a "runnable" notebook. In any case, please avoid pixilated screenshots of code, so for example if you choose to format your PDF file in MS-word (instead of Jupyter), make sure the code is clean, formatted, and readable.
Marking responses will be made by the teaching staff by annotating selected parts of your PDF file via blackboard. The teaching staff will also inspect your GitHub repo. A very readable and clean PDF file is important. Note that in printing Jupyter to PDF (or exporting to PDF) there may sometimes be excessive white space. Do not worry about this extra white space; it is not a problem.
Marking Criteria:
10 points are allocated for setting up the GitHub hand-in according to the instructions and following all submission instructions.
90 points are for the polynomial project itself. Points are awarded for completing each of the tasks with full marks given to clean readable code that completes the task. The points allocated to Tasks 1 – 7 are 15, 10, 15, 10, 15, 15 and 10 points respectivly.
In general (unlike BigHW), points will be deducted for sloppy coding style. Make sure to have your code properly indented, to use sensible and consistent variable names, and to write code that is in general clean and consistent.
Ideally use the same account you used for BigHW.
Ideally we would like you to fork the GitHub repo https://github.com/yoninazarathy/2504_2024_project1. The new forked repo under your GitHub account should then be made private. However, GitHub does not allow to make forked repositories private. So instead duplicate the repository as described here. You will first create your new repository and then follow the 4 steps under "Mirroring a repository".
Invite the course GitHub user, uqMATH2504
as a collaborator. You may do so early on while working on the project, and you must do this no later than the project due date. You must verify that uqMATH2504
is a collaborator on your project and let the course staff know if that is not the case. It is your duty to check that we accepted the invite (give us 2-5 working days to accept the invite).
Do not make any changes (commits) to the repo after the project due date.
Create a local clone of the repo. It is recommended that use use git
command line via the shell to work on making changes/additions to the assignment and submitting the changes. However you are free to use any other mechanism (VS-Code, GitHub desktop, etc). Making intermediate commits is certainly recommended.
If for some reason you are not able or willing ot use GitHub, an alternative is GitLab. This is not recommended as it adds additional work to the teaching staff. However if you wish to use GitLab instead of GitHub contact the teaching staff for permission.
Your GitHub repo should be formatted like the original repo that you forked with additional files and modifications based on the tasks below.
Note: make sure that there aren't any excessive files in your submission repo. An exception is perhaps a Jupyter .ipynb
file if you choose to use it for creating the PDF. Use a .gitignore
file to ensure git
does not commit additional files and output files to your repo
.
Build your project on top of the https://github.com/yoninazarathy/2504_2024_project1 by making modifications, additions, and improvements. Your first goal is to understand every bit of code of the basic repository. After that, you will make improvements, additions, and cleanups.
Please follow the suggested type names below. These are suggested so that tasks have backwards compatibility which means that you do not throw away old code, but rather add modifications. Note that in an actual project, you would sometimes replace older types with newer ones, and/or use parametric types - a concept which we do not use in this project. (There is NO need for you to use parameteric types - yet if you wish, go ahead and do that).
Once you create your own repository make sure it works by running test/runtests.jl
and example_script.jl
. At this point simply study the functionality of the repository by looking at the function signatures (not implementations) to understand what the repo can do.
Create your own example script called example_script_2.jl
and include polynomials for illustrating the key functionality of the operations available via the software. Namely, your script should
run and demonstrate how the code in the repository works,
contain some intermediate output (e.g. using println()
) that describes what is going on, and in particular
have example polynomials over residue classes (primes).
Basically, example_script_2.jl
is an improvement over the minimally supplied example_script.jl
.
In your scripts. Make sure you understand how ÷ and gcd
work.
max_degree_allowed
)The current implementation for polynomials is a vector where the term of degree $k$ is stored at position $k+1$. There is also a limit on the degree of the polynomial via a global variable max_degree_allowed
(currently set at 400
).
First change the value from 400
to 100
in the code base, restart Julia, and re-run the tests. Explain why the first failure occurs.
After changing back to 400
, run cyclotonic_polynomial(401)
and see that this fails as well.
Now it turns out that this limit is not needed. It was a "feature" that was added wrongfully to the code base. So remove max_degree_allowed
and all code related to it. As you do so, run the tests again, and also try cyclotonic_polynomial
with a very large power. Reason about your code change.
Then make a commit of this change and in your answer point to the commit, the files changed, as the fact that you "did not seem to break anything" because the tests passed.
Improve the "pretty printing" of the polynomials by incorporating the following rules:
Terms of zero degree (constants) should exclude x^0
, e.g. 2
not 2x^0
.
Terms with unit degree should exclude the degree, e.g. 3x
not 3x^1
.
Excluding the constant term 1
, terms with unit coefficient should exclude the coefficient, e.g. 3x^2 + x
not 3x^2 + 1x
.
Use subtraction instead of adding negative terms, e.g. 3x^2 - 2x
not 3x^2 + -2x
.
Display the terms in descending degree order. $a_n \cdot x^n + a_n \cdot x^{n-1} + \cdots + a_1 x + a_0$.
Make sure your code does not print terms with zero coefficients, e.g. x^2 + 1
not x^2 + 0 x^1 + 1
.
Once your pretty printing works, enhance it with some Unicode characters. Specifically have the exponents appear as superscripts. For example x^32
should appear as x³²
.
After you get Unicode to work, you do not need the previous printing format (i.e. no need for x^32
and similar).
lowest_to_highest
One more option you should add is reversing of the degree. For this have a global variable lowest_to_highest
which if set to true
implies that the order printed is from lowest to highest as opposed to the default (highest to lowest). Note that if lowest_to_highest
is not set (does not exist as a global) then the program should still work and simply act as though lowest_to_highest = false
.
Int
to BigInt
Exact computation requires polynomials with (truly) integer coefficients and thereby we must use BigInt
rather than (the finitely many integers comprising) Int
.
Refactor Polynomial
and Term
to have Big
coefficients instead of Int
. Do this by introducing a new type PolynomialBig
and TermBig
.
Duplicate the functionality (functions and methods).
Create tests for PolynomialBig
and associated functions and ensure your implementation passes them.
Find examples where Polynomial
(the one with Int
coefficients) overflows and yields the wrong result whereas your new PolynomialBig
does not. The test code for PolynomialBig
should have such cases.
Empirically compare the run-time performance of polynomial multiplication for Polynomial
and PolynomialBig
. Do this for cases where Polynomial
does not overflow (i.e. returns the correct result). In general, your results should show that Polynomial
is faster, yet may overflow.
Note: Ideally we would use parametric types in Julia but for simplicity we simply introduce the new type PolynomialBig
. You may (if you wish) use parametric types but it is not required.
Notice the Polynomial
, PolynomialBig
data-types have integer coefficients whereas operations such as divide
, gcd
, and factor
require the coefficient arithmetic to be done modulo $p$. Consequently, the aforementioned functions must do this reduction themselves, and worse yet can only do so once a prime $p$ is provided. It is for this reason that divide
, gcd
, and factor
return a function that returns a polynomial once a $p$ is provided. It would be much cleaner to have a polynomial datatype where the "mod p" behavior is "baked in" and automated.
Create a type PolynomialModP
which has a polynomial (of type Polynomial
) and a prime number (Int
) as field names. PolynomialModP
should support the same arithmetic operations as Polynomial
but with coefficient arithmetic over $\mathbb{Z}_p$ rather than Int
. PolynomialModP
should be easier to work with when carrying out division, gcd, and factorization. So for example f1 ÷ f2
where f1
and f2
are of type PolynomialModP
, would directly yield a new PolynomialModP
which is the result mod $p$. Similarly for the other operations.
Create tests for PolynomialModP
and associated functions and ensure your implementation passes them.
Your new datatype should rely on the existing functionality of Polynomial
but restrict the coefficients to $0$, $1$, \ldots, $p-1$.
Once you complete this task, certain operations of Polynomial
like p1 ÷ p2
, are now better done via PolynomialModP
. That is, while you may use f1 ÷ f2
for f1
and f2
of types Polynomial
, it is better to only ÷
when f1
and f2
are of type PolynomialModP
because you get back an instance of PolynomialModP
rather than a function.
Implement ^
for PolynomialModP
and update the pow_mod
function to use it.
Update the show
method for PolynomialModP
to include an indication it is "mod p" in the output.
The current multiplication implementation is not efficient for PolynomialBig
because biger integer arithmetic may be expensive. In this task we reconcile this problem by using CRT to compute products of PolynomialBig
via cheaper products of PolynomialModP
.
Implement multiplication for PolynomialModP
(if you haven't done so yet in Task 4).
Implement multiplication using the Chinese Remainder Theorem (CRT) for PolynomialBig
as presented in the lecture. Here the product of two PolynomialBig
s will be computed by doing sufficiently many multiplications of their PolynomialModP
counterparts.
Create tests for PolynomialBig
CRT multiplication and ensure your implementation passes them.
At this point you should have efficient multiplication and power for PolynomialBig
as well as PolynomialModP
and its associated functions and methods.
As discussed in class, if you have a polynomial f
and raise it to the power 2^n
then instead of doing about 2^n
multiplications you can do about n
using repeated squaring.
Re-factor pow_mod
and ^
into something more efficient by using repeated squaring. Your implementation should work for both PolynomialBig
and PolynomialModP
.
Create tests for pow_mod
and ^
for both implementations and ensure your implementation passes them.
Adapt the supplied polynomial factorization code to incorporate the new methods and types you have built up to this point. The current supplied code works for limited cases but may get stuck on certain examples.
Create tests for factorization and ensure your implementation passes them. It is a good idea to pass your factorization algorithm a product of polynomials so your factorization is predictable. Note that your factorization algorithm may fail on small primes due to bad luck. It is therefore best to choose large primes.
Benchmark your factorization algorithm by factoring a database of problems (e.g. generated by you via rand
). Do not use more than 120 seconds of local compute time.
Using a compute budget of approximately 120 seconds "stress test" your factorization implementation by factoring something substantial.