UQ MATH2504
Programming of Simulation, Analysis, and Learning Systems
(Semester 2 2022)

This is an OLDER SEMESTER.
Go to current semester


Main MATH2504 Page


BigHW - Semester 2, 2022.

(last edit: Aug 15, 2022)

Due: End of day August 26, 2022. This assignment is in pairs.

Note:

Since this assignment is in pairs make sure to have the names of both partners (or all three in exceptional circumstances) as well as student numbers. Submission will be via the Blackboard account of only one of the teammates but will count equally for both (or all three). All cases that involve a group of a single person or three persons requires explicit permission from the course coordinator.

Submission format: The submission format is via a GitHub repository as described in Question 10. The submission should include the four submissions forms I, II, III, and IV.

The submission should also include an audio recording (30 sec minimum, 3 minutes maximum, 1 minute recommended). The voice recording should have voices of both (all) members of the group. The recording should verify that the work is your own. You can also state how you felt in the assignment. Say what was easy, what was hard, what you found useful, and what less.

To submit, go to Blackboard and submit as instructed there by providing a link to the GitHub repo, naming the partners, and uploading the voice recording (do not upload the voice recording to GitHub - upload it to Blackboard).

Marking Criteria:

  • 4 points are allocated for following instructions. This includes proper submission of the voice recording, the .ipynb file(s), having names and student numbers on the hand-in, and most importantly neatly formatting all the answers of the jupyter notebook.

  • There are 16 questions in total, with 6 points allocated for each question. Some questions are easier and some harder. Some require more work and some require less work.

  • There is an additional bonus question, question 17, dealing with Dr. Amy Chan's perspective seminar. It yields an additional 10 bonus points.

  • As this is the first assignment, points will not be deducted for sloppy coding style. However, in future assignments clean and consistent coding style is required.


Questions 1–4 are submitted on Jupyter notebook I

Question 1: LaTeX, Jupyter, HTML, and Markdown

<a id="q1">

This question deals with basic formatting tools.

1a: Consider the following continued fraction formula for the golden ratio $\varphi$:

\[ \varphi=1+\frac{1}{1+\frac{1}{1+\ddots}} \]

Use LaTeX to write a version of it with two additional nested levels. That is, replace the '$\ddots$' with two additional levels.

1b: Use Markdown to write an ordered list of your 5 favorite books. Add (Markdown) hyperlinks to each of the books.

1c: Use HTML to embed an image with an hyperlink from the image. Use HTML to resize the image to a size you like. Make sure the aspect ratio is maintained.

1d: Use HTML to create a table of your weekly class schedule. See for example this reference.

Question 2: Warming up with quadratic equations

The following function uses the quadratic formula to compute the real roots of the quadratic equation $ax^2 + bx + c = 0$. The return value is an array with 0, 1, or 2 roots.

function real_roots_of_quadratic(a::Number, b::Number, c::Number)

    #Start by initializing an empty array
    roots::Array{Float64} = []

    #Compute the discriminant 
    Δ = b^2 - 4a*c #\Delta + [TAB]

    #Based on the sign of the discriminant return 0, 1, or 2 roots.
    if Δ < 0 
        roots = []
    elseif Δ == 0
        roots = [-b/(2a)]
    else
        roots = [-b + Δ, -b - Δ] / (2a) #\sqrt + [TAB]
    end
    return roots 
end

#Attempting on -x²+5x-6=0
real_roots_of_quadratic(-1,5,-6)
2-element Vector{Float64}:
 2.0
 3.0

2a: Experiment with multiple syntax variations by making a modification to the code snippet above and in each case observing the outcome.

  1. The function accepts three arguments, a, b, and c. They are each of type Number as specified via ::Number. Would the code still work without the ::Number type specification? (yes/no). Try to explain why.

  2. Observe that the expression for the discriminant uses 4a*c as short hand for 4*a*c. Would 4ac work? (yes/no). Try to explain why.

  3. A common typo in programming is to use = in place of ==. The former is an assignment operator and the later is a logical operator checking for equality. Insert this typo by replacing Δ == 0 with Δ = 0. Do you get a syntax error or a logical program error?

  4. The expression [-b + √Δ, -b - √Δ] / (2a) can also be written as (-b .+ [√Δ,-√Δ])/(2a). Check this, does it work? (yes/no). Now do it with using + in the new expression instead of .+. Does it still work? (yes/no) Try to explain why.

  5. The return statement in the line return roots is optional yet makes for more verbose code. Try removing return. Does the function still work? (yes/no).

  6. There is a certain set of inputs for which this function will not work. Try setting the input argument a as 0 or 0.0. What is the result?

  7. To handle such cases where the quadratic equation is actually "not quadratic" add error checking code at the top of the function. One way to do this is to return some error message for example add at the top of the function:

if a == 0
    return "Error - this is not a quadratic equation"
end

Check that this works.

  1. Instead of adding these three lines you may also shorten them as,

a == 0 && return "Error - this is not a quadratic equation"

Check that this works and try to figure out (and explain) what this does.

  1. Finally instead of returning a string as an error, we may use a more advanced mechanism. So use this instead:

a == 0 && throw(DomainError(a, "argument must be nonzero"))

You will discuss such exception handling in Unit 4, yet at this point try to do your best to explain what this does.

2b: Now that we have a function that find the real roots of such quadratic equations, let us try it on a few examples:

examples = [[1,-5,6], [1,2,3],[1,7,0]]

for example in examples
    roots = real_roots_of_quadratic(example[1], example[2], example[3])
    print("The equation $(example[1])x² + $(example[2])x + $(example[3]) = 0 ") #\^2 + [TAB] #***
    if length(roots) == 0 #***
        println("has no real roots.") #***
    elseif length(roots) == 1 #***
        println("has the single (real) root $(roots[1])") #***
    else #***
        println("has the real roots $(roots[1]) and $(roots[2])") #***
    end #***
end
The equation 1x² + -5x + 6 = 0 has the real roots 3.0 and 2.0
The equation 1x² + 2x + 3 = 0 has no real roots.
The equation 1x² + 7x + 0 = 0 has the real roots 0.0 and -7.0
  1. Add two more examples of your own to the array (of arrays) examples and see how it works.

  2. The lines marked with #*** could have been set in a function of their own and that function can be called (invoked) in place of those lines. Name that function print_quadratic_roots() and have it accept two arguments. The first argument is an array coefficients which is interpreted as the coefficients ($a$, $b$, and $c$). The second argument is the array roots. Define that function and run the code with your new function in place of the lines marked with #***. Since your new function does not have a return value you may write return nothing in the end.

  3. Finally beautify the output of your function such that when a coefficient is $+1$ or $-1$ it is not printed but rather only the sign is printed (if needed). For example the polynomial $+1x^2-1x+1$ is better printed as $-x^2-x+1$. In doing so, use the ternary operator ? :. In one way or another. This may have code with less lines (less explicit big if - else statements).

2c: It is common and useful to create testing code that checks that operational code works. For the function real_roots_of_quadratic() we may create another function as follows:

using Random
"""
This function generates `num_tests` random triples of coefficients and checks that the function `real_roots_of_quadratic()` does its job. The return value is `true` if the test passed, otherwise it is `false`.
"""
function test_real_roots_of_quadratic(;num_tests = 10000, seed=42)
    Random.seed!(seed)
    test_passed = true
    for _ in 1:num_tests
        a, b, c = 2000rand(3) .- 1000 #uniform values in the range [-1000, 1000]
        roots = real_roots_of_quadratic(a,b,c)
        for x in roots
            err = a*x^2 + b*x + c 
            test_passed = (test_passed && isapprox(err, 0.0, atol = 1e-8)) 
        end
    end
    return test_passed
end

test_real_roots_of_quadratic() ? println("Test passed") : println("Test failed")
Test passed
  1. Work on understanding the expression 2000rand(3) .- 1000. How would you change this expression if you wanted the coefficients to be in the range [-5,5]? Make that change and see if the test still passes. (yes/no).

  2. Instead of having the numbers 2000 and 1000 "hardcoded" in the function it is better practice to pass them as optional arguments, similarly to the arguments num_tests and seed. Do this by creating arguments coeff_min and coeff_max that are by default at -1000 and 1000 respectively. This means that the values in the expression 2000rand(3) .- 1000 need to be represented in terms of these arguments. Do this.

  3. What happens in this test when there are no real roots? How many iterations does the for loop execute in this case?

  4. In general a test passing is only a necessary condition for validity of a program rather than a sufficient condition. Can you suggest some sort of bug in the real_roots_of_quadratic() function that can exist which would slip through the test? Try this bug (modify real_roots_of_quadratic() and show that the test still passes.

  5. A different way to test would be to use "random roots" instead of random coefficients. Such a test would first randomly decide if there are 0 real roots, 1 real root, or 2 real roots. It would then randomly draw the roots. It would then find the coefficients that match these roots. It would then test that real_roots_of_quadratic() has these roots. Try to implement such a test function. For the case of 0 (real roots) you may "hardcode" a few specific examples. Alternatively (optional) try to use the complex numbers. For example rand()+rand()*im is a complex number with random real and imaginary values, each in [0,1]

2d: In general, we may have had a solver that finds the (potentially) complex roots of a quadratic. Since the coefficients, $a$, $b$, and $c$ are real, these are complex conjugates. So we want to use the quadratic formula directly (potentially taking square roots of negative numbers).

Trying:

(-4) #\sqrt + [TAB] is like the sqrt() function
ERROR: DomainError with -4.0:
sqrt will only return a complex result if called with a complex argument. Try sqrt(Complex(x)).

We got an error. Instead we need to highlight that -4 is a complex number:

Complex(-4) #Create a complex number -4 + 0i
-4 + 0im

Now the sqrt() function, which we shorten via the √ symbol works:

Complex(-4) #This is \sqrt + [TAB], just like sqrt()
0.0 + 2.0im

So we can now create:

function roots_of_quadratic(a::Number, b::Number, c::Number)
    roots::Array{ComplexF64} = []
    Δ = b^2 - 4a*c
    if Δ < 0 
        roots = [-b + Complex(Δ), -b - Complex(Δ)] / (2a)
    elseif Δ == 0
        roots = -b/(2a)
    else
        roots = [-b + Δ, -b - Δ] / (2a) 
    end
    return roots 
end

roots_of_quadratic(1,2,3)
2-element Vector{ComplexF64}:
 -1.0 + 1.4142135623730951im
 -1.0 - 1.4142135623730951im

The above implementation is fine, however it has code duplication for both cases of $\Delta \neq 0$. That is the case Δ < 0 and the case Δ > 0 can be merged. Modify the function so that it does not have an elseif statement. In fact, try to write the function without any if statement but rather only using the trinary operator ? :.

Question 3: The Collatz Conjecture

The Collatz conjecture can be interpreted as a statement about the function hail_stone_sequence() below. It states that running the function on any integer input x, it will always terminate (ignoring machine overflow of integers).

function hail_stone_sequence(x::Int, verbose::Bool = false)
    while x != 1
        x = ((x % 2 == 0) ? (x ÷ 2) : (3x + 1)) #\div + [TAB] for ÷ 
        verbose && println("x = $x")
    end
    return nothing
end

hail_stone_sequence(12,true)
x = 6
x = 3
x = 10
x = 5
x = 16
x = 8
x = 4
x = 2
x = 1

3a:

  1. The function uses ÷ (integer division). It also uses % (modulo). Experiment with these two operators and explain the expression ((x % 2 == 0) ? (x ÷ 2) : (3x + 1)).

  2. Write the conjecture mathematically.

  3. Try to disprove the conjecture by running hail_stone_sequence() starting with the first $10^8$ integers (do this with verbose = false).

3b: Now we modify the function hail_stone_sequence() so that it returns the number of steps in the sequence until $1$ is reached. We call that function hail_length(). It receives an input argument x which is the initial value of the sequence and it returns the number of steps taken until reaching $1$ (or may get stuck if the conjecture is wrong).

We then also define the function hail_max() which finds the length of the maximal hailstone sequence (maximal number of steps until reaching 1), when starting in a range of values $1,2,\ldots,N$. We can then plot hail_max() as a function of N and it plots nicely on a logarithmic scale for the x-axis.

using Plots
function hail_length(x::Int)
    n = 0
    while x != 1
        x = ((x % 2 == 0) ? (x ÷ 2) : (3x + 1))
        n += 1
    end
    return n
end


function hail_max(N)
    max_n = -Inf
    for x in 1:N
        n = hail_length(x)
        if n > max_n
            max_n = n
        end
    end
    return max_n
end

step_range = 1:10^4
max_vals = [hail_max(N) for N in step_range]
plot(step_range, max_vals, xaxis=:log, 
    label=false, xlabel="Maximal initial value", ylabel="Maximal number of steps")

Are you able to replace hail_max() with a one line function? This function does not necessarily need to be efficient in terms of run time and or memory usage. Present your replacement function and show that you obtain the same plot.

3c: Say we now wish to have a step range that reaches a maximal value of $10^7$. Modifying 10^4 to 10^7 would make that work but it would take an excessively long time to run. Why?

Improve the code above to produce a plot similar to the plot above, but up to an initial value of $10^7$.

From an (mathematical) exploratory point view, observe your output plot and make any observations that pop out.

Question 4: Binary, Hex, and Decimal

Consider this code which creates two functions for creating random strings representing binary numbers and hexadecimal numbers.

random_binary_string(n = 12) = *([rand(['0','1']) for _ in 1:n]...)
random_hex_string(n=3) = *([rand(union('0':'9','A':'F')) for _ in 1:n]...);

For example,

using Random
Random.seed!(0)
@show random_binary_string()
@show random_hex_string();
random_binary_string() = "001010010011"
random_hex_string() = "456"

4a: Explain how these functions work.

Assume now that you are given inputs:

using Random
Random.seed!(0)
r_bin_strs = [random_binary_string() for _ in 1:10^3]
r_hex_strs = [random_hex_string() for _ in 1:10^3];

4b: Use standard Julia functions to parse these arrays of strings into arrays of numerical values and report the sum of the values (present the result in decimal). So your output should be two decimal numbers, one which is the sum of all of the binary numbers in the array, and the other which is the sum of all of the hex numbers. Prior to checking your output, suggest an estimate (expected value) for the sum - explain why.

4c: Now create your own functions (from first principles parsing strings) to convert a binary digit string and a hex string into values. Repeat 4b with your own functions and see you get the same result. Present your function implementations and the output.

Questions 5–9 are submitted on Jupyter notebook II

Question 5: Perfect numbers

A perfect number is a positive integer that is equal to the sum of its positive divisors, excluding the number itself. For instance, 6 has divisors 1, 2 and 3 (excluding itself), and 1 + 2 + 3 = 6, so 6 is a perfect number.

Write julia code that finds and prints all of the perfect numbers less than $10^6$.

Question 6: Primes and Goldbach's conjecture

Goldbach's conjecture states that every even number greater than two can be represented as a sum of two primes. For example $16 = 11+5$ and $40 = 37+3$ or $23+17$. In this question you will try to disprove the conjecture by searching for an even number that is not a sum of two primes (don't expect to succeed in disproving).

First consider the Sieve of Eratosthenes implemented as follows:

"""
Returns the all the primes up to n.
"""
function sieve_of_Eratosthenes(n)
    primebits = ones(Bool,n) #Will contain true if the index is prime (initially all assumed prime)
    primebits[1] = false #The number 1 is not prime
    p = 2 #Smallest prime
    while p  n
        i = 2p
        while i  n  # \le +[TAB]
            primebits[i] = false
            i += p
        end
        p += 1
        while p  n && !primebits[p]
            p += 1
        end
    end
    (1:n)[primebits]
end

n = 100
println("First $n primes: ",sieve_of_Eratosthenes(n))
First 100 primes: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 
53, 59, 61, 67, 71, 73, 79, 83, 89, 97]

This sieve yields a way to get a list of the first $n$ primes. However it is used inefficiently in the code below to create a plot of the number of Goldbach pairs.

using Plots
function check_Goldbach(n)
    @assert iseven(n)
    num_pairs = 0
    for p in sieve_of_Eratosthenes(n)
        if in(n-p,sieve_of_Eratosthenes(n))
            num_pairs += 1
        end
    end
    return num_pairs
end

n = 10^3
even_range = 4:2:n
checks = check_Goldbach.(even_range)
if 0  checks
    println("Found a counter example for Goldbach")
end
scatter(even_range,checks,legend=false,xlabel="n",ylabel="Number of Goldbach pairs")

Modify this code or create code of your own so that you can plot a figure similar to the above for the first $10^6$ Goldbach pairs. Do not use any in-built (or package) functions for prime numbers (e.g. Primes.jl. In the figure, instead of plotting all the points associated with Goldbach pairs, only plot a random subset of the points so that the figure won't have more than about $20,000$ points.

Question 7: Binary search

The following code presents the functions prepare_data() and prepare_sorted_data() which create a list of indexes (in the range $1,\ldots,10^9$), and a list of license plates (a string with three numbers followed by three letters). The data also creates a special entry with index 1234567890 and license plate 000 ZZZ.

using Random

random_license_plate() = String(rand('0':'9',3)) * " " * String(rand('A':'Z',3))

function prepare_data(;N = 10^7)
    Random.seed!(1)
    indexes = rand(1:10^12, N)
    plates = [random_license_plate() for _ in 1:N]

    #One specific spot is set to a special value.
    special_spot = Int(floor(0.9*N)) #Arbitrary index
    indexes[special_spot], plates[special_spot] = 1234567890, "000 ZZZ"
    
    return indexes, plates
end

function prepare_sorted_data(;N = 10^7)
    indexes, plates = prepare_data(N=N)  
    perm = sortperm(indexes)
    return indexes[perm], plates[perm]
end

indexes, plates = prepare_sorted_data(N=5) #as an example set N=5
@show indexes
@show plates;
indexes = [1234567890, 73366354470, 349241489558, 698826683692, 91492900366
3]
plates = ["000 ZZZ", "177 REO", "430 OQF", "031 OQH", "744 URJ"]

7a: Investigate the some of the features in this code including the one line function definition (random_license_plate()), the concatenation of strings via *, multiple uses of rand(), the use of a comprehension of the form [ for _ in ], the return value of a function that is a tuple (seperated by a comma), and the in-built function sortperm(). For each of these language attributes present a little (short) example of how they work with a brief 1-3 line explanation.

7b: Assume we now wish to use the indexes to find the license plate matching the index 1234567890. The following code does this when there are $10^7$ license plates. Note that there is a chance for having more than one such index, but it is small (what is it?). We also use the @btime macro to time how long it takes.

using BenchmarkTools

function find_special_plate(indexes, plates; special_index = 1234567890)
    for (i, index) in enumerate(indexes) 
        if index == special_index 
            return plates[i]
        end
    end 
    throw(Error("Index not found"))
end

indexes, plates = prepare_sorted_data()

@btime find_special_plate(indexes, plates)
4.765 μs (0 allocations: 0 bytes)
"000 ZZZ"

Run this code. Then look a the GitHub repo and documentation of BencmarkTools.jl. How many commits has the package had? Find one of the issues in the repo, and copy it here. What is it about? Go to the documentation page of BenchmarkTools.jl (there is a link from the GitHub repo). Based on what you see in the documentation, briefly explain the difference between @btime and @benchmark? Then replace @btime with @benchmark and run again.

7c: A generally better way to search for an index within a sorted list is via binary search. Implement a binary search procedure by replacing the algorithm in find_special_plate(). How much faster is it?

7d: In general what is the time complexity for searching via binary search in a sorted array? Explain your answer.

Question 8: Sorting

Consider this code which sorts a list of names:

function array_table(array, heading)
    println(heading,":")
    for (i,a) in enumerate(array)
        println(i,"\t",a)
    end
end

names = [   "Amy Chan",
            "Maithili Mehta",
            "Anna Foeglein",
            "Andy Ferris",
            "Thomas Graham",
            "Elaine Schenk",
            "Jesse Woods",
            "Tina Moghaddam",
            "Paul Bellette",
            "Paul Vrbik",
            "Tom Cranitch",
            "Yoni Nazarathy",
            "Sam Hambleton",
            "Alistair Falconer",
            "Emma Comino",
            "Ivana Carrizo-Molina"]

sorted_names = sort(names)
array_table(sorted_names, "Sorted by first name")

println()

sorted_names = sort(names,by=(x)->reverse(split(x," ")))
array_table(sorted_names,"Sorted by last name")
Sorted by first name:
1	Alistair Falconer
2	Amy Chan
3	Andy Ferris
4	Anna Foeglein
5	Elaine Schenk
6	Emma Comino
7	Ivana Carrizo-Molina
8	Jesse Woods
9	Maithili Mehta
10	Paul Bellette
11	Paul Vrbik
12	Sam Hambleton
13	Thomas Graham
14	Tina Moghaddam
15	Tom Cranitch
16	Yoni Nazarathy

Sorted by last name:
1	Paul Bellette
2	Ivana Carrizo-Molina
3	Amy Chan
4	Emma Comino
5	Tom Cranitch
6	Alistair Falconer
7	Andy Ferris
8	Anna Foeglein
9	Thomas Graham
10	Sam Hambleton
11	Maithili Mehta
12	Tina Moghaddam
13	Yoni Nazarathy
14	Elaine Schenk
15	Paul Vrbik
16	Jesse Woods

8a: Read the help entry for sort(). Now modify the code to sort the list based on the shortest last name (shortest first, longest last).

8b: Implement your own sort function, say my_sort_bubble(), using a Bubble sort algorithm. Your function should also accept a by= argument and work in the same way. Show the output on the same list, sorting by last name first.

8c: Now implement a version of your function my_sort_quick() that uses the quick sort algorithm. Test it on the list of names above, using last name first.

8d: Now generate lists of a million integers via data = rand(Int,10^6). Compare running times of the system's sort(), my_sort_bubble(), and my_sort_quick() on this list. Use the @btime macro. Then repeat for $10^7$ and $10^8$ integers. Report your results in an organized manner. Note that for my_sort_bubble() you can stop at at $10^4$ or $10^5$ becuase longer inputs will take too much time. However, try to use the large inputs for my_sort_quick() and the system's sort().

Question 9: Matrix Multiplication

Here is code that creates two random matrices and displays their matrix product:

using LinearAlgebra
Random.seed!(0)
A = round.(10*rand(2,3))
B = round.(10*rand(3,4))
A*B
2×4 Matrix{Float64}:
 146.0  99.0  80.0  173.0
  21.0  16.0  11.0   26.0

This in built multiplication is designed to be very numerically efficient. However matrix multiplication can be computed in different ways and to explore these ways you will write your own matrix multiplication functions.

For example:

function my_mult_inner_products(A,B)
    nA,mA = size(A)
    nB,mB = size(B)
    @assert mA == nB
    n, m, p = nA, mA, mB
    C = Array{Float64}(undef,n,p)

    for i in 1:n
        for j in 1:p
            C[i,j] = A[i,:]' * B[:,j] #compute inner product of i'th row of A and j'th column of B
        end
    end
    return C
end

my_mult_inner_products(A,B)
2×4 Matrix{Float64}:
 146.0  99.0  80.0  173.0
  21.0  16.0  11.0   26.0

However, matrix products can be represented in other ways, and each of these other ways will have their own implementation. You will now create your own three functions that execute these implementations, and in each case test they get the same result for the test matrices A and B above.

9a: my_mult_by_cols(): The matrix product can be represented as the linear combination of the columns. That is, the $j$th column of $C$ is a linear combination of the columns of $A$ with weights given by the $j$th column of $B$. Implement this making use of Julia's vector arithmetic e.g.:

vec1 = [1,2,3];
vec2 = [2,0,4];
3*vec1 + 4*vec2
3-element Vector{Int64}:
 11
  6
 25

9b: my_mult_by_rows(): Similarly to the above, you can take linear combinations of the rows of $B$. This representation is the "dual" of 7a. Carry this out as well.

9c: my_mult_sum_outer_products(): A third representation is to consider outer products and observe that,

\[ A B = \sum_{j=1}^m A_{:,\,j} B_{j,\,:}^\top ~~, \]

where $A_{:,\,j}$ is the $j$th column of $A$ and $B_{j,\,:}$ is the $j$th row of $B$ (represented as a column vector and hence its transpose is a row vector).

To see such outer products (rank one matrices) in Julia, consider:

vec1 = [1,2,3];
vec2 = [2,0,4];
vec1*vec2'
3×3 Matrix{Int64}:
 2  0   4
 4  0   8
 6  0  12

So now rank one matrices may be added up.

9d: After you have implemented the functions for 9a, 9b, and 9c, it may appear that in each of these cases there is only a single loop while the function provided, my_mult_inner_products() has a nested loop. Does this mean that in terms of complexity of the number of the operations, 9a, 9b, and 9c are necessarily much more efficient? Explain your result.

Questions 10–12 are submitted on Jupyter notebook III

Question 10: Setting up your GitHub repo for hand-in.

  • Create your GitHub account if you don't have one.

  • Create a new repo for this assignment. Name the repo exactly as <<FIRST NAME1>>-<<LAST NAME1>>__<<FIRST NAME2>>-<<LAST NAME2>>2504-2022-BigHW. So for example if your names are "Jeannette Young" and "Kidd Gee" the repo name should be Jeannette-Young__Kidd-Gee-2504-2022-BigHW.

  • Make sure the repo is private.

  • Invite the course GitHub user, uqMATH2504 as a collaborator. You may do so early on while working on the assignment, and must do this no later than the assignment due date.

  • Do not make any changes (commits) to the repo after the assignment due date.

  • Create a local clone of the repo. It is recommended that use use git command line via the shell to work on making changes/additions to the assignment and submitting the changes. However you are free to use any other mechanism (VS-Code, GitHub desktop, etc).

  • If for some reason you are not able or willing ot use GitHub an alternative is GitLab. This is not recommended as it adds additional work to the teaching staff, however if you wish to use GitLab instead of GitHub contact the teaching staff for permission.

Your GitHub repo should be formatted as exactly as follows:

  • Have a README.md file which states your name, the assignment title, and has a (clickable) link to the assignment instructions on the course website (to this document).

  • Have a LICENSE.md file. Choose a license as you wish (for example the MIT license). However keep in mind that you must keep the submission private until the end of the semester.

  • Have a .gitignore file.

  • Have a notebooks directory and in it have the 4 submission forms (notebooks) I–IV.

  • Note: make sure that there aren't any files in your submission repo except for the submission forms I–IV. Use may use the .gitignore file to ensure git does not commit additional files and output files to your repo.

Question 11: A few unix commands

In this question you will manipulate a few files via Unix command line. If you are using Windows, it is best you do it via GitBash. If you are using Linux or Mac, just use the shell. For the question create a folder hw2_q1 and make sure that hw2_q1 is your working directory in the terminal/shell/command-line via cd. You will then use the commands ls, pwd, cd, cp, mv, rm (be careful with rm), cat, and echo. You'll also use output redirection to file (>).

Your submission should be comprised of the text from the shell including the commands you used and the outputs. Just get it by copying the shell text and pasting in a file.

11a: Create the folder hw2_q1 and make that your working directory in the shell. Illustrate this via pwd.

11b: Display your name to the terminal using the echo command.

11c: Repeat and use output redirection (>) to send the output of the echo command to a file my_name.txt.

11d: Display the contents of the file using the cat command.

11e: Inspect the length of the file using ls. Make sure to use a flag for ls so as to see the number of bytes in the file.

11f: Create a duplicate of the file to another file called my_name2.txt using the cp command.

11g: Create a folder named people using the mkdir command.

11h: Move my_name2.txt into people, where the file name will be <<YOUR-NAME>>.txt with <<YOUR-NAME>> being your first and last name without white space using _. For this use the mv command.

11i: Using the same thing you did in 1c, create three additional files of peoples names in the folder people.

11j: Change directory into people. Now Use mv to move one of the people files back into hw2_q1. Use .. (double dot).

11k: Move another file back into hw2_q1, but now use an absolute path to describe the destination folder.

11l: Copy one of the files in people into a file name starting with . (single dot as prefix). For example if the file name was marie_curie.txt, copy it into .marie_curie.txt.

11m: Show all the files in the directory (also showing .marie_curie.txt), where you also see the size of the files and other information. Use the correct flag(s) for ls to show hidden files.

11n: Delete all the files in the directory using a single command. Please be very careful when deleting files!

11o: Move back to hw2_q1 and delete the directory people. Use either rm with a correct flag or rmdir.

Question 12: HTTP, Files, JSON, and CSV

Inspect the Jupyter notebook file for practical B, practical_B_julia_essentials.ipynb. The source URL is available by clicking "Raw" on the GitHub page. It is in JSON format.

Write a program that reads in the JSON file (directly from the web) of the Jupyter notebook and prints a summary of the notebook including:

  • Total number of cells.

  • Number of code cells.

  • Number of markdown cells.

  • A breakdown of the number of code cells to cells that have output and those that don't.

  • Any other summary information you find relevant (you choose what summary to print - must be at least one item).

The program should then create two CSV files, both with header rows and named markdown_summary.csv and code_summary.csv. Both of these files should contain a data row for each respective markdown or code cell in the notebook. The first column in each of the files should be named cell_number and should indicate the cell number in the Jupyter notebook (this is a unique count starting at 1 within the cells). The second and third columns are:

  • character_count - determines how many characters total are in the cell (include characters such as \n and count unique characters as a single character. E.g. is a single character.

  • line_count - The number of lines the cell uses.

The remaining columns count frequency of elements within the markdown or code of the cell. These columns will differ for markdown_summary.csv and code_summary.csv, and are described now.

For the markdown summary file create columns: #, ##, ###, ####, each of which count the number of times the respective element appears in the cell. For example if a cell has # and ## in it, each once, then the respective entries in the line for this cell should be 1, 1, 0, 0.

For the code summary CSV file do so similarly, for the keywords: return, for, if, and using (i.e. 4 columns, counting the frequency of these keywords in each of the code cells).

Thus in summary, your program reads in a JSON file representing a Jupyter notebook, displays basic summary information, and then stores additional information to two CSV files which in principle could be further analyzed at a later time.

Your relevant submission form for the assignment should show the first few lines of these CSV files.

Questions 13–17 are submitted on Jupyter notebook IV

Question 13: Numerical derivatives

The lecture notes contain code that illustrates the absolute relative error (as a function of $h$) for numerical derivatives using the forward difference scheme,

\[ f'(x) \approx \frac{f(x+h) - f(x)}{h}. \]

An alternative scheme is the central difference scheme using,

\[ f'(x) \approx \frac{f(x+h/2) - f(x-h/2)}{h}. \]

13a: Consider the functions, $f_1(x) = sin(x^2)$ at $x=1/2$, $f_2(x) = e^x$ at $x=1$, and $f_3(x) = atan(x)/(1+e^{-x^2})$ at $x=2$. Evaluate the performance of each of the schemes (forward and central) for each of these functions at the given points. In each case plot the error as a function of $h$, and find the optimal $h$. Display your results in a neat manner using one or two plots and a minimal summary.

13b: Consider now a vector valued function of the form, $f: {\mathbb R}^K \to {\mathbb R}^K$, defined as,

\[ S(z) = \frac{1}{\sum_{i=1}^{K} e^{z_i}} \begin{bmatrix} e^{z_1} & \cdots & e^{z_{K}} \end{bmatrix}^\top. \]

This function is called the softmax function. Its Jacobian is a $K \times K$ matrix with elements of the form,

\[ J_{ij} = \frac{\partial}{\partial z_j} \frac{e^{z_i}}{\sum_{i=1}^{K} e^{z_i}}%\\ = \begin{cases} S(z)_{i}(1-S(z)_{i}), & i = j, \\[10pt] -S(z)_{i}S(z)_{j}, & i \neq j. \end{cases} \]

Assume you do not know the Jacobian expression and wish to compute it for a vector $z$ of dimension $K$. How many softmax evaluations are needed with the forward difference scheme? How many softmax function evaluations are needed with the central difference scheme? Explain your answer.

13c: Consider now $z = [1^{1/3}, 2^{1/3}, \ldots, K^{1/3}]^\top$ for $K=10$, $K=20$, and $K=100$. Plot the error of the Jacobian for both the forward difference scheme and the central difference scheme where as error you consider the euclidean distance between the approximated Jacobian and the explicit Jacobian (the square root of the sum up the squares of the differences between entries of the numerical derivative and the explicit one). What is the optimal $h$ in each case?

Question 14: Markov chains and LinearAlgebra

A discrete time, time-homogenous, finite state markov chain can be defined as a random sequence $X_0,X_1,X_2,\ldots$ where $X_i \in \{1,2,\ldots,L\}$ with $L$ being the number of states, and where the probability of $X_{n+1} = j$ given that $X_n = i$ is determined by the element $P_{ij}$ of the (stochastic) matrix $P$. This is an $L \times L$ matrix with non-negative entries and, $\sum_{j=1}^L P_{ij} = 1$ for all $i \in \{1,\ldots,L\}$. Hence each row $i$ of the matrix is the probability distribution of the next value in the sequence in case where the current value is $i$. An initial probability of $X_0$ is also given via some vector $p^0$ where $p^0_i = {\mathbb P}(X_0 = i)$.

Markov chains are very useful models for a variety of applications and at UQ are briefly introduced in STAT2003 and studied in detail in STAT3004. Here we will simply use them as a motivating example for practicing using linear algebra and no further background from these courses is needed.

We focus on irreducible, aperiodic, finite state Markov chains. Irreducible means that for every pair $(i,j)$ there exists an $n$ such that $i$-th $j$-th entry of the matrix power $P^n$ is non-negative. Aperiodic is guaranteed when the diagonal entries of $P$ are strictly positive. Irreducibility means that every state $\{1,\ldots,L\}$ is reachable from every other state and being aperiodic implies there aren't times $n$ where only a subset of the states can be visited. For such Markov chains, one important vector quantity is the stationary probability (row) $L$-vector. Here it is denoted $\pi$ with $\sum_{i=1}^L \pi_i =1$ and $\pi_i>0$ for every $i$. This vector $\pi$ counts the long term statistical occupancy of states, and can be characterized/computed in several ways:

  1. It satisfies $\pi P = \pi$ (note that this is an undetermined system of linear equations because $(I-P)$ is singular; still the additional requirement that $\sum_{i=1}^L \pi_i =1$ ensures uniqueness of $\pi$).

  2. For any $i$,

\[ \lim_{n \to \infty} [P^n]_{ij} = \pi_j. \]

  1. The theory of non-negative matrices implies that $P$ has a a real eigenvalue equal to $1$. This is the eigenvalue with maximal magnitude among all eigenvalues of $P$. The vector $\pi$ is proportional to the eigenvector associated with this maximal eigenvalue ($\sum_{i=1}^L \pi_i =1$).

  2. Given a sequence of random variables from this Markov chain $\{X_0,X_1,X_2,\ldots\}$ we estimate $\pi_i$ for large $N$ via,

\[ \pi_i \approx \frac{1}{N} \sum_{n=1}^N {\mathbf 1} \{X_n = i\}, \]

where ${\mathbf 1}\{\cdot\}$ is the indicator function equalling $1$ if the argument occurs, and $0$ otherwise. This is true for any initial probability vector $p^{0}$.

In this problem you will compute/estimate $\pi$ using methods 1-4 for a specific class of problems.

The code below generates a very special class of Markov chains where there is an explicit expression for the stationary probability vector $\pi$. The function structured_P with input argument L creates and $L \times L$ matrix $P$ and the function structured_π creates an $L$-vector (column vector) $\pi$ matching this $P$:

using LinearAlgebra

function structured_P(L::Int; p::Float64 = 0.45, r::Float64 = 0.01)::Matrix{Float64}
    q = 1 - p - r
    P = diagm(fill(r,L)) + diagm(-1=>fill(q,L-1)) + diagm(1 => fill(p,L-1))
    P[1,1] = 1-p
    P[L,L] = 1-q
    return P
end

structured_π(L::Int; p::Float64 = 0.45, r::Float64 = 0.01)::Vector{Float64} = begin
    q = 1 - p - r
    [(p/q)^i  for i in 1:L] * (q-p) / p / (1-(p/q)^L) #Explicit expression (birth death)
end;

Here is for example the matrix $P$ for $L=5$:

P = structured_P(5)
5×5 Matrix{Float64}:
 0.55  0.45  0.0   0.0   0.0
 0.54  0.01  0.45  0.0   0.0
 0.0   0.54  0.01  0.45  0.0
 0.0   0.0   0.54  0.01  0.45
 0.0   0.0   0.0   0.54  0.46

And here is the vector $\pi$ for $L=5$:

π = structured_π(5)
@show sum(π)
@show sum(π)  1.0 #\approx + [TAB]
π
sum(π) = 0.9999999999999996
sum(π) ≈ 1.0 = true
5-element Vector{Float64}:
 0.2786497527413459
 0.2322081272844549
 0.1935067727370457
 0.16125564394753808
 0.13437970328961507

We can see that it is a stationary distribution:

@show norm(π'*P - π')
π'*P  π'
norm(π' * P - π') = 2.7755575615628914e-17
true

You will now consider $L=2,3,4,5,10,20,30,40,50,100,200,300,400,500,1000$ and in each case use methods 1-4 above to compute/estimate $\pi$. Create 4 functions, one for each method and in each case use the function to compute/estimate $\pi$ and compare your estimated vector with the output of structured_π. Note that methods 1-3 will use functions from LinearAlgebra while method 4 requires random variable generation. You may make use of sample from StatsBase.jl for this, or any other means. You may choose for how many steps to run the Markov Chain ($N$). Similarly, since the second method requires you take take matrix powers $P^n$ for high $n$, you may also choose which $n$ to use.

Once your code works for the four methods, create some graphic or tabular output of your choice (one or two plots at most) which evaluates performance of each of the methods in terms of accuracy (and optional: running time). To measure accuracy use the norm function to compute the euclidean norm between your calculated $\pi$ and that arising from the explicit solution structured_π.

Question 15: Planetary Motion - Part 1

In this problem and the next we deal with ordinary differential equations (ODEs), similar to those associated with planetary motion. Specifically we focus on a very simple two body problem reduced to two one body problems. Here you will implement the Euler Method and the classic Runge–Kutta method for solving ODEs. You will use these methods on the simple problem and analyze their performance in terms of step size.

In general, take a function $f: {\mathbb R}^d \to {\mathbb R}^d$ which we called the right hand side (RHS). The ODE seeks a function $u(\cdot)$ where $u(t) \in {\mathbb R}^d$, with,

\[ \frac{d u}{dt} = f(u), \qquad \text{and} \qquad u(0) = u_0. \]

Here $u_0 \in {\mathbb R}^d$ is a given initial condition. Note also that in this formulation here, $f(\cdot)$ only depends on the state $u$ and not on time, $t$. A more general formulation allows dependence on time.

A solution $u(\cdot)$ is a function which for any $t\ge 0$ specifies the state $u(t)$ and satisfies the ODE above.

For a given ODE problem, namely a RHS $f(\cdot)$ and an initial condition $u_0$, an ODE numerical solver is an algorithm that tries to find $u(t)$ over time $t$ in some range $[0,t_{\max}]$. Euler's method works by setting $u(0) = u_0$ and following the key recursion,

\[ u(t+h) = u(t) + h f\big(u(t)\big), \]

starting with $t=0$ and incrementing $t$ by $h$ until $t\ge t_{\max}$.

The classic Runge–Kutta method, called also RK4, works via a more intricate scheme where,

\[ u(t+h) = u(t) + \frac{h}{6}(k_1 + 2k_2 + 2k_3 +k_4), \]

where,

\[ \begin{align*} k_{1} &=f\left(u(t)\right) \\ k_{2} &=f\left(u(t) + h \, \frac{k_{1}}{2}\right), \\ k_{3} &=f\left(u(t) + h \, \frac{k_{2}}{2}\right) \\ k_{4} &=f\left(u(t) + h\, k_{3}\right) . \end{align*} \]

For both algorithms the step size $h>0$ is a key parameter.

Our example ODE deals with a mass orbiting a much larger mass on a plane (a reasonable approx to the earth around the sun). This is a two body problem reduced to two one body problems. A trivial one for the barycentre and a second order differential equation for the orbit around the centre of mass. We now develop this ODE from simple physical principles (note that the physics of the problem is not crucial for solving this assignment problem).

Newtons Law states that,

\[ m \, a = F(r), \]

where $m$ is the mass of earth, $a$ is the acceleration vector, and $F(r)$ is the gravitational force vector acting on earth as a function of $r$, the distance between the earth and the sun (which is the distance between earth and the barycentre). This force, driven by a $1/r$ potential, is given via,

\[ F(r) = -\frac{m\, M\, G}{~~r^2} r' \]

where $r'$ is a unit vector pointing at the barycentre, $M$ is the mass of the sun, and $G$ is the gravitational constant. Hence the acceleration vector of earth is

\[ a = -\frac{M\, G}{~~~r^2}r'. \]

We now describe the location of earth via $x(t)$ and $y(t)$ in Cartesian coordinates, using the standard unit vectors $\hat{i}$ and $\hat{j}$. With this description, the acceleration vector is,

\[ a = \frac{d^2x}{dt^2} \hat{i} + \frac{d^2y}{dt^2} \hat{j}. \]

Now since,

\[ r = \sqrt{x^2 + y^2} \qquad \text{and} \qquad r' = \frac{x}{|r|} \hat{i} + \frac{y}{|r|} \hat{j}, \]

we get a system of second order ordinary differential equations:

\[ \left\{ \begin{align*} \frac{d^2x}{dt^2} &= -\frac{M\, G\, x}{~~r^{3}},\\ \frac{d^2y}{dt^2} &= -\frac{M\, G\, y}{~~r^{3}}. \end{align*} \right. \]

This system has an explicit ellipsoid solution.

We also expect the total energy $E$ to be conserved where $E = T(t) + V(t)$ with the kinetic energy satisfying $T(t) \propto \|v(t)\|^2$ and the potential energy satisfying and $V(t) \propto -1/r(t)$. Here "$\propto$" indicates "proportional to". Here $v(t)$ is the velocity vector comprised of $v_x(t)$ and $v_y(t)$ with,

\[ v_x = \frac{dx}{dt}, \qquad \text{and} \qquad v_y = \frac{dy}{dt}. \]

With such an explicit solution and the invariant of conservation of energy, we can now evaluate the performance of the various numerical schemes and their parameters. To do so, we also convert the system into the ODE form presented above (converting it from a second order ODE to a first order one). This is done by representing the state vector $u(t)$ with $d=4$ as $v_x(t)$, $v_y(t)$, $x(t)$, $y(t)$. Thus,

\[ \left\{ \begin{align*} \frac{dv_x}{dt} &= -\frac{M\,G\,x}{~~r^{3}},\\ \frac{dv_y}{dt} &= -\frac{M\,G\,y}{~~r^{3}},\\ \frac{dx}{dt} &= v_x, \\ \frac{dy}{dt} &= v_y. \end{align*} \right. \]

The following code specifies the function df_dt_one_body which is the RHS for this ODE. The other four one line functions are convenience accessors to the state variables.

#These four convenience functions extract the state variable from the state vector
#It is assumed the layout of the vector u is u = [v_x, v_y, x, y]
state_v_x(u::Vector{Float64}) = u[1]
state_v_y(u::Vector{Float64}) = u[2]
state_x(u::Vector{Float64}) = u[3]
state_y(u::Vector{Float64}) = u[4]

"""
Computes the RHS for the one body problem. 
"""
function df_dt_one_body(u::Vector{Float64}, t::Float64)::Vector{Float64}
    M, G = 1, 1 #We take these constants as normalized. Naturally they would need to be set for physical values.
    r = sqrt(state_x(u)^2 + state_y(u)^2)
    return [-M*G*state_x(u)/r^3, -M*G*state_y(u)/r^3, state_v_x(u), state_v_y(u)]
end;

Once you use it to obtain a solution on a time vector t and a state (vector of vectors) u, a function such as plot_solution below can be used to plot the solution, also allowing you to use a title for the plot and a label for the series:

using Plots, Measures

function plot_solution( t::AbstractArray{T}, 
                        u::Vector{Vector{Float64}}; 
                        title::String = "",
                        label::Union{String, Bool} = false) where T
    x, y, v_x, v_y = state_x.(u), state_y.(u), state_v_x.(u), state_v_y.(u)
 
    #"Energy"
    r = @. sqrt(x^2 + y^2)
    E = @. 0.5*(v_x^2 + v_y^2) - 1.0/r

    p1 = plot(  x, y, label = label, xlabel= "X", ylabel = "Y",
                title = title*" (position)", aspectratio=1,legend=:topleft,ylim=(-7,7))
    scatter!([0], [0], ms=15, msw=0, c=:orange, shape =:star, label="Sun")
    scatter!([x[1]], [y[1]], ms=5, msw=0, c=:blue, shape =:circle, label="Earth initial position")

    p4 = plot(  t, E, xlabel = "Time", ylabel = "Energy",
                label = label, title = title*" (energy)")
    plot(p1, p4, margin = 10mm,size=(800,400))
end;

Here for example we read binary data from a file (from the web), where we previously created example_trajectory.dat by running the Euler method on the time-horizon $[0,200]$ with a step size of $h=0.01$, an initial location of $(1.5,0)$, and initial vertical velocity of $1$. We stored the binary data of the trajectory of type Vector{Vector{Float64}} in a file using the serialization mechanism and the code below reads the binary data back into u so it can plot it.

The plot illustrates that that trajectory deviates from the desired ellipsoid solution and that the energy is not kept constant as needed.

using Serialization, HTTP
h = 0.01
t = 0:h:200
u_0 = [0., 1, 1.5, 0]

#Note: the data was written to file via serialize("data/example_trajectory.dat", u) and then committed to GitHub
ode_web = HTTP.request("GET","https://github.com/yoninazarathy/"*
                            "ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/"*
                            "raw/e71c6d085e0fcf79e3737c3a1db0fc56904e3f10/data/example_trajectory.dat")
u = deserialize(IOBuffer(ode_web.body))
@show typeof(u)
plot_solution(t,u,title = "Euler's method", label="h=$h")
typeof(u) = Vector{Vector{Float64}}

15a: Implement the Euler's method yourself and create similar plots with $h=0.001$ and $h=0.0001$ for the same time range and initial conditions. Plot plots similar to the plot above for the three values of $h$ ($0.01$, $0.001$, and $0.0001$). The trajectory plots should appear separately, but the energy plots should appear on the same plot.

15b: Repeat by implementing the RK4 method for $h=0.01$, $h=0.1$, and $h=0.5$.

15c: Comment on the efficiency of the methods and their ability to predict the trajectory, also as a function of $h$ and comparatively.

15d: Create a single animation which jointly illustrates the trajectory of the worst trajectory and the best trajectory from the 6 trajectories above. It is recommended that the animation display the running time, and trace the path of the earth in both trajectories, ideally using different colors and any other means you find for best visibility. The animation can be placed in your GitHub submission and a link to it provided with your submission.

Note: Your solution to this question does not need to deal with binary data or use serialization as we have. Here it is simply a mechanism for obtaining data for our example plot (which you can reproduce).

Question 16: Planetary Motion - Part 2

The equations of motion described above for a planetary system have the special property of being a Hamiltonian System. Hamiltonian Systems are of great interest to physicists and mathematicians due to their significance in physical modelling and their special properties. A (time invariant) Hamiltonian System is one where the dynamics of the system are completely determined by a smooth scalar function $H(q,p)$, for $q$ and $p \in {\mathbb R}^{\tilde{d}}$, where the dynamics are described by the system of first order differential equations for $i=1,\ldots,{\tilde d}$:

\[ \left\{ \begin{align*} \frac{dq_{i}}{dt} &= \frac{\partial H}{\partial p_{i}},\\ \frac{dp_{i}}{dt} &= -\frac{\partial H}{\partial q_{i}}. \end{align*} \right. \]

For the planetary motion example, $\tilde{d} = 2$, and the variables $q$ and $p$ can be interpreted as the position ($x \hat{i} + y \hat{j}$) and momentum ($v_{x} \hat{i} + v_{y} \hat{j}$) vectors respectively. The Hamiltonian can be interpreted as the total energy of the system and has the special separable form in terms of kinetic ($T$) and potential ($V$) energy,

\[ \begin{align*} H(q,p) = T(p) + V(q) = H(x, y, v_{x},v_{y}) = \frac{1}{2}(v_{x}^2 + v_{y}^2) - \frac{1}{\sqrt{x^2 + y^2}}. \end{align*} \]

Note that length and time scales have been resized to remove the mass and gravitational constants. It is beyond the scope of this course but a special property of Hamiltonian systems that can be exploited for numerical methods is Sympletic Geometry. The numerical methods for ODE solutions that use this property are correspondingly known as Symplectic Integrators.

A particularly simple and effective symplectic integrator that works for variable time steps ($h$) is a leapfrog integrator in kick-drift-kick form. This takes a second order ODE for $u(t) \in {\mathbb R}^{\tilde{d}}$ of the form

\[ \frac{d^2 u}{dt^2} = f(u), \qquad \frac{du}{dt} = v, \qquad \text{and} \qquad u(0) = u_0, \qquad \frac{du}{dt}(0) = v_0. \]

The leapfrog integrator uses an intermediate velocity update, $v_{\frac{1}{2}}$, computed as

\[ v_{\frac{1}{2}} = v(t) + \frac{h}{2} \, f\big(u(t)\big). \]

This value is then used to update the system via the recursion:

\[ \begin{align*} u(t + h) &= u(t) + h \, v_{\frac{1}{2}},\\ v(t + h) &= v_{\frac{1}{2}} + \frac{h}{2}\, f\big(u(t + h)\big).\\ \end{align*} \]

16a: Implement the Second Order Leapfrog method above for yourself for the second order form of the dynamics of planetary motion. Create similar plots of the Euler and RK4 methods above with $h=0.01$, $h=0.1$, and $h=0.5$.

16b: Compare the computational effort of the Leapfrog method with Euler and RK4 methods by comparing how many floating point operations per time step each method uses in your implementation. (This is not a coding question, but a theoretical question).

16c: Choose a range to time horizons to simulate over, comment on the trend of the accuracy of each method for long simulation runs. Summarize this analysis in a single compelling plot.

16d: The Julia package DifferentialEquations.jl has support for various state of the art techniques, including symplectic integrators. Choose a state of the art standard ODE solver and a symplectic integrator and use it to find a solution for this ODE system. Compare results with your own implementation of leapfrog and comment on the results.

Question 17: Perspective seminar (Dr. Amy Chan).

This question presents you with an additional 10 bonus points.

Write a 1-4 paragraph summary of Amy Chan's perspective seminar. The writeup should answer these questions without treating the questions directly. That is, don't write "the answer to Q1 is..." etc. Instead incorporate the answers in a flowing writeup report (like a blog post or letter). You may use first person and refer to the speaker as "Dr. Chan", or "Amy", or similar. Present the writeup as part of your PDF file submission (as the first item in the submission).

The writeup should address the following questions (in paragraph form):

  1. What was Amy’s career path, and her relationship with software, mathematics, and statistics?

  2. Where does she currently work? What does she do in her job? How do you perceive her personal experience from her job?

  3. Discuss a few key tools that she spoke about during the talk, such as programming languages, mathematics, machine learning, statistical tools, platforms, and the like. Was this the first time that you heard about any of these tools?

  4. Have you picked up any tips from Amy’s talk, which you can perhaps use in your career down the road?

  5. Feel free to state anything else that you found interesting from the talk.