UQ MATH2504
Programming of Simulation, Analysis, and Learning Systems
(Semester 2 2022)

This is an OLDER SEMESTER.

Unit 4: More language features for software architecture

In the previous three units we explored basics of programming and computation (Unit 1), algorithms and data structures (Unit 2), and data files and numerics (Unit 3). In this unit we take a deeper and more thorough look at basic Julia language features.

Programming languages, including Julia, are designed mostly with these aims in mind:

Execution speed — programs should run fast enough to solve the problem at hand.
Coding speed — writing programs to solve problems should be quick & easy for the coder.
Scalable engineering — creating and maintaining large programs with many contributors should be no harder than writing small programs yourself.

Different programming languages have different goals and target audiences, and make different trade-offs and affordances for the above. The trickiest one to get right is the third one — as codebases grow it is typical to get stuck in a tarpit of complexity, where it gets harder and harder to change and improve the code, and progress grinds to a halt. A corollary of that is that code in large pieces of software is read far more often than it is written, so it becomes crucial to make code that is written easy to understand.

Julia is a high-performance language for solving large mathematical problems, and compiled code needs to run "close to the metal". It makes many affordances to solve mathematical problems quickly and efficiently. But most crucially, it tries to solve these in such a way that is "scalable" — where generic code can be reused and repurposed, and pieces of a program can be connected together elegantly like lego bricks.

The goal of this unit is to show you the most important language features of Julia and how to use them to best effect. The most important language features that we explore are the syntax, type system, user-defined types, and multiple-dispatch. At the end we'll consider how to use these to construct your code. The notes below often refer to the Julia documentation.

Syntax

Before moving to types and dispatch, we'll cover some fundamentals of Julia code syntax so we understand what we are looking at later. A lot of this might be intuitive to some of you, but since we are mathematicians, we'd like to spell it out more formally :)

Everything is an expression, and everything has a value

An expression is a piece of code that, when evaluated, returns a value. For example:

1 + 1

In Julia, every piece of syntactically valid code is an expression — which means at run time it generates a value which you can assign to a variable. Even things like if statements!

x = 42

is_x_even = if x % 2 == 0
   "$x is even"
else
   "$x is odd"
end

"42 is even"

Like many languages, Julia has a shorthand "ternary operator" (i.e. operator taking 3 inputs) for if-else using ? and :

is_x_even = x % 2 == 0 ? "$x is even" : "$x is odd"

"42 is even"

It's worth noting these are 100% identical. We say that x ? y : z is "syntax sugar" for the if statement — it's a sweetener to make the programmer's life easier, but not an entirely new feature since you can always achieve the same thing with more characters of code. Another piece of syntax sugar is that x + y has exactly the same meaning as +(x, y).
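As a small check of the claim about operators, the sketch below asks Julia's parser (via Meta.parse from Base) whether the two spellings produce the same expression. The variable names are only for illustration.

```julia
# Infix syntax and call syntax parse to the same expression object
ex_infix = Meta.parse("x + y")
ex_call = Meta.parse("+(x, y)")
same_expression = (ex_infix == ex_call)

# They also evaluate to the same value
a, b = 3, 4
same_value = (a + b == +(a, b))
```

Both same_expression and same_value should come out true, which is what "syntax sugar" means in practice: two spellings, one underlying program.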

Blocks of code

Above, we saw blocks of code containing values. They didn't really do anything, like assign values to variables.

The rule in Julia is that the last expression in a block of code is the value returned from that block of code. If for some reason you want to create a code block (sometimes useful) you can use begin like so:

begin
    1
    2
    3
end

We see blocks of code everywhere in Julia — inside if and else statements, within for and while loops, within function bodies, etc. Blocks of code can be nested — we can have if statements inside while loops inside function bodies, and we tend to represent this visually with indentation:

function f(x)
    x = 0
    while x < 10
        if x % 2 == 0
            println("$x is even")
        else
            println("$x is odd")
        end
        x = x + 1
    end
    return x
end

f (generic function with 1 method)

Note: for and while loops always return nothing!

If you prefer, you can separate multiple expressions in a code block with ; instead of putting them on separate lines. Sometimes you'd even use parentheses to clarify where the code block begins and ends.

x = (1; 2; 3)

Empty blocks of code return `nothing`

So what happens when the block of code is empty?

begin
end

Since every expression has a value in Julia, it must return something! That something is called nothing. It is a special builtin value that contains 0 bytes of data and has the builtin type Nothing.

typeof(nothing)

Nothing

While the above is contrived, there are some more likely places to see this.

half_of_x = if x % 2 == 0
    x ÷ 2
end

This is implicitly the same as

half_of_x = if x % 2 == 0
    x ÷ 2
else
end

which is implicitly the same as

half_of_x = if x % 2 == 0
    x ÷ 2
else
    nothing
end

So if x is odd, then half_of_x is nothing.
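Code that receives such a value usually has to check for nothing before using it. Here is a minimal sketch using isnothing and something from Base; the helper half_if_even is a hypothetical name introduced only for this example.

```julia
# Hypothetical helper: returns half of x, or `nothing` when x is odd
function half_if_even(x)
    if x % 2 == 0
        x ÷ 2
    end
end

half_if_even(42)                  # 21
isnothing(half_if_even(7))        # true
something(half_if_even(7), 0)     # 0: the first argument that is not `nothing`
```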

We also use nothing as the return value for functions that don't have a value to return, e.g. when the purpose of the function is to perform some action rather than compute some value.

function hello()
    println("Hello class")
    return nothing
end

x = hello()
@show x;

Hello class
x = nothing

Q: Does anyone know what the equivalent thing in C is?

Why expressions? Metaprogramming!

Having the syntax rule that "all valid syntax is an expression" makes it easier to analyse and manipulate code than it would be otherwise. It lets you rearrange code much like you can manipulate a mathematical expression with algebra (via substitution, etc), resulting in code that is correct and still compiles.

For human coders, this means it is easier to copy-paste code from one place to another, or refactor it into a function.

A "metaprogram" is a program that writes a program. Julia supports metaprogramming through macros and other advanced techniques. We won't be teaching how to create a macro, but using a macro is pretty easy. A good example is the @show macro:

x = 10
@show x + 1

x + 1 = 11
11

Effectively the macro takes the "expression" x + 1, prints the expression, prints equals, and prints the evaluated value. While Julia is busy replacing the builtin syntax sugar (like operators, x + 1) with more fundamental expressions (like function calls, +(x, 1)), it also "expands" the macro, resulting in code that is roughly:

x = 10
println("x + 1 = ", x + 1)

x + 1 = 11

Each macro can be thought of as like user-defined syntax sugar. We can extend the language with our own ideas of how we can write programs.
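To make "code as data" a little more concrete, the sketch below quotes an expression and inspects it, then uses @macroexpand to look at what a macro call turns into. It only inspects code and never runs it, so x does not need to be defined.

```julia
# Quoting with :( ) gives us the code itself as a data structure
ex = :(x + 1)
ex.head        # :call
ex.args        # the function name followed by its arguments: +, x and 1

# @macroexpand returns the code a macro call expands to, without running it
expanded = @macroexpand @show x + 1
expanded isa Expr   # true
```

Printing expanded shows the real expansion, which is longer than the rough println version above.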

An even better way to metaprogram

Another form of metaprogramming in Julia is multiple dispatch. Multiple dispatch allows us to construct entire classes of programs that depend on the types of values — and Julia will construct on-demand a particular program specialized on the inputs you actually provide. Next we'll learn more about Julia functions, Julia types and how multiple dispatch works.

Functions in Julia

Functions are at the heart of Julia, and there are lots of ways of expressing different functions. The Julia documentation of functions provides a rich description of all of the details. We now give an overview of a few special features that were perhaps not evident from Units 1-3, with links to the relevant documentation.

Function syntaxes

In Julia there are multiple ways of creating a function. There is the "long form":

function f1(x)
    return x^2
end

f1 (generic function with 1 method)

The return is optional in this case — why?

There is a "short form":

f2(x) = x^2

f2 (generic function with 1 method)

And there are "anonymous functions" (or "arrow functions"):

const f3 = x -> x^2

#5 (generic function with 1 method)

In this form x -> x^2 is an expression that constructs a function which is "anonymous" — it has no name. We have bound it to the variable f3 so we can use it later. You can even do this in "long form"!

const f4 = function (x)
    return x^2
end

#7 (generic function with 1 method)

(Note: the const isn't essential — but it will let us attach other methods to f3 or f4 later)

Nested functions: "closures"

In Julia you can define functions within functions.

function adder(x)
    return y -> y + x
end

adder (generic function with 1 method)

This function returns another function! We can use the output to add things.

add_2 = adder(2)
add_2(3)

Such functions are called "closures" because they "close over" (or "capture") the values of variables outside their scope — in this case it captured the value of x, 2, and implicitly stores it inside add_2.

add_2.x

Note again it doesn't matter which form of function syntax we use. It's just different syntax for the same thing. We could have equally written:

function adder(x)
    return function (y)
        return y + x
    end
end

adder (generic function with 1 method)

In Julia, closures are given their own types and methods. The closure is just an automatically generated struct datatype with a method attached.

typeof(add_2)

var"#9#10"{Int64}

methods(add_2)

# 1 method for anonymous function #9:

(::var"#9#10")(y) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:3

Because closures are "just" a convenient way to do things you can already do in Julia with structs and methods, you can think of them as an advanced type of syntax sugar. The above is equivalent to this:

struct Adder{T}
    x::T
end

function (a::Adder)(y)
    return a.x + y
end

function adder(x)
    return Adder(x)
end

add_2 = adder(2)

Adder{Int64}(2)

add_2(3)

Note that every function in Julia has its own type — e.g. sin and cos have different types. We'll come back to types and methods later.
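The following sketch checks these statements about closure and function types directly. The name make_adder is hypothetical and simply mirrors adder above.

```julia
# Hypothetical closure factory, mirroring `adder` above
make_adder(x) = y -> y + x

plus_2 = make_adder(2)
plus_10 = make_adder(10)

fieldnames(typeof(plus_2))          # (:x,)  the captured variable is a field
plus_10.x                           # 10
typeof(plus_2) == typeof(plus_10)   # true: same code, same captured type
typeof(sin) == typeof(cos)          # false: each named function has its own type
typeof(sin) <: Function             # true
```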

Type assertions

The :: operator has two closely-related meanings.

First is type assertion. In programming an "assertion" is something the programmer asserts to be true, which either the compiler or runtime will check.

42::Int

You could read the above as "42 must be an Int". The result is just the value. The :: operator doesn't normally do anything, unless the assertion is false:

42::String

ERROR: TypeError: in typeassert, expected String, got a value of type Int64

Julia has an "abstract" Number type.

42::Number

This is fine since Int is a subtype of Number, which itself could be written:

Int <: Number

true

(Note that this one returns true or false.)

The type Any is the supertype of every type.

42::Any  # always correct!

This type assertion always checks out, and doesn't really have any effect on your program. Therefore x and x::Any are the same thing. Such code is said to be a "no-op" because it corresponds to "no operation".
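A short sketch of both outcomes of a type assertion follows. The try/catch used to capture the error is only there so the example runs to completion; error handling is covered later in this unit.

```julia
value = 42::Number          # passes, and evaluates to 42

# A failed assertion throws a TypeError, which we capture here to inspect it
outcome = try
    42::String
catch err
    err
end

outcome isa TypeError       # true
```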

Argument types

The second meaning of :: is in function / method signatures, which constrains the types of arguments the method will accept.

We have seen Julia methods from the start, e.g.

f(x::Int) = x^2

f (generic function with 2 methods)

or,

f(x::Number) = x^2

f (generic function with 3 methods)

or simply,

f(x) = x^2

f (generic function with 3 methods)

Remember that x and x::Any are the same thing. The final method will accept any inputs. Whether or not x^2 works or throws an error depends on x.

In a sense, the third method isn't safe since someone could provide an input like (1, 2, 3) that can't be squared. What happens if someone provides a string? Is this expected? On the other hand, the third method provides maximum flexibility — someone can introduce a new type that works with f at any point in time, without having to edit the definition of f to make it work.
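To see which method Julia picks for a given input, here is a small sketch with three methods of a hypothetical function which_method, using the same three signatures as f above. Julia selects the most specific method that matches the argument.

```julia
# Hypothetical function with three methods, from most to least specific
which_method(x::Int) = "Int method"
which_method(x::Number) = "Number method"
which_method(x) = "fallback method"

which_method(1)        # "Int method"
which_method(1.5)      # "Number method"
which_method("one")    # "fallback method"
length(methods(which_method))   # 3
```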

Varargs functions

Sometimes functions have a variable number of arguments — sometimes called "varargs" (or "variadic functions").

A simple thing to do might be to add up all the inputs:

function add_all(inputs...)
    out = 0
    for x in inputs
        out += x
    end
    return out
end

add_all(1, 2, 3, 4, 5)

function polynomialGenerator(a...)
    n = length(a) - 1
    poly = function(x)
        return sum([a[i+1]*x^i for i in 0:n])
    end
    return poly
end

polynomial = polynomialGenerator(1, 3, -10)

[polynomial(-1), polynomial(0), polynomial(1)]

3-element Vector{Int64}:
 -12
   1
  -6

You could then use the Roots package to automatically find the inputs at which the polynomial is zero:

using Roots

zero_vals = find_zeros(polynomial, -10, 10)
println("Zeros of the function f(x): ", zero_vals)

Zeros of the function f(x): [-0.19999999999999998, 0.5]

Here you start to see how Julia features like varargs functions, closures, and packages let you solve a wide range of problems without writing very much code.
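Two details of varargs are worth a quick sketch: inside the function the collected arguments form a tuple, and at the call site the same ... syntax "splats" a collection into separate arguments. The helper arg_count is a hypothetical name used only here.

```julia
function add_all(inputs...)
    out = 0
    for x in inputs
        out += x
    end
    return out
end

# Hypothetical helper: the collected arguments behave like a tuple
arg_count(inputs...) = length(inputs)

numbers = [1, 2, 3, 4, 5]
add_all(numbers...)     # 15, same as add_all(1, 2, 3, 4, 5)
arg_count(numbers...)   # 5
arg_count()             # 0
```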

Optional arguments

You can have optional arguments (or default values):

using Distributions

function my_density(x::Float64, μ::Float64 = 0.0, σ::Float64 = 1.0)
    return exp(-(x-μ)^2 / (2σ^2)) / (σ*√(2π))
end

x = 1.5
@show pdf(Normal(), x), my_density(x)
@show pdf(Normal(0.5), x), my_density(x, 0.5);

(pdf(Normal(), x), my_density(x)) = (0.12951759566589174, 0.12951759566589174)
(pdf(Normal(0.5), x), my_density(x, 0.5)) = (0.24197072451914337, 0.24197072451914337)

Keyword arguments

Arguments following the ; character in the function definition or the function call are called keyword arguments. They are identified by name, rather than by position, when the function is called.

function my_density(x::Float64; μ::Float64 = 0.0, σ::Float64 = 1.0)
    return exp(-(x-μ)^2/(2σ^2) ) / (σ*√(2π)) 
end

@show pdf(Normal(0.0,2.5),x), my_density(x, σ=2.5);

(pdf(Normal(0.0, 2.5), x), my_density(x, σ = 2.5)) = (0.13328984115671988, 0.13328984115671988)

We can even make a function that takes arbitrary numbers of positional and keyword arguments:

function very_flexible_function(args...; kwargs...)
    @show args
    @show kwargs
end

very_flexible_function(2.5, false, a=1, b="two", c=:three)

args = (2.5, false)
kwargs = Base.Pairs{Symbol, Any, Tuple{Symbol, Symbol, Symbol}, NamedTuple{(:a, :b, :c), Tuple{Int64, String, Symbol}}}(:a => 1, :b => "two", :c => :three)
pairs(::NamedTuple) with 3 entries:
  :a => 1
  :b => "two"
  :c => :three

The standard non-keyword arguments are called "positional" arguments since they are identified by their position not their name.

Note that when calling a function you may put either a ; or a , before the keyword arguments. In a function definition the ; is required, because otherwise there would be an ambiguity between optional arguments and keyword arguments.
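The difference between optional and keyword arguments can be sketched without the Distributions package. The two function names below are hypothetical variants of my_density. Optional positional arguments generate one method per possible argument count, while keyword arguments do not take part in dispatch.

```julia
# Hypothetical positional variant: μ and σ are optional, identified by position
function density_positional(x, μ = 0.0, σ = 1.0)
    return exp(-(x - μ)^2 / (2σ^2)) / (σ * √(2π))
end

# Hypothetical keyword variant: μ and σ are identified by name
function density_keyword(x; μ = 0.0, σ = 1.0)
    return exp(-(x - μ)^2 / (2σ^2)) / (σ * √(2π))
end

n_positional = length(methods(density_positional))   # 3: (x), (x, μ), (x, μ, σ)
n_keyword = length(methods(density_keyword))         # 1

# To change only σ positionally we must also pass μ; with keywords we name it
density_positional(1.5, 0.0, 2.5) == density_keyword(1.5; σ = 2.5)
density_keyword(1.5, σ = 2.5) == density_keyword(1.5; σ = 2.5)
```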

Do Block Syntax for Function Arguments

As you know, you can pass functions as arguments and sometimes you can use an anonymous function for that. For example,

using Random

Random.seed!(0)
data = rand(-10_000:10_000, 100)
filter(x -> (x % 10) == 0, data)

16-element Vector{Int64}:
 -8630
  4040
 -4210
  7940
  1510
 -2970
  2650
 -7680
   480
  7820
  -820
 -9300
 -8900
  6660
 -2710
 -8030

What if the anonymous function has more lines of code?

using Primes

filter(function (x)
    if x >= 0
        return isprime(x)
    else
        return isprime(-x)
    end
end, data)

8-element Vector{Int64}:
  7243
   773
 -4831
 -3221
   367
  -101
  7121
  4051

This is a little ugly — it can be hard to see where the function starts and ends, find the second argument, etc.

In Julia there's a standard pattern that higher-order functions accept a function as their first argument. There is a fancy bit of syntax sugar called do which lets you write that function as a block after the call, and have it passed as the first argument.

This looks like:

filter(data) do (x)
    if x >= 0
        return isprime(x)
    else
        return isprime(-x)
    end
end

8-element Vector{Int64}:
  7243
   773
 -4831
 -3221
   367
  -101
  7121
  4051
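A smaller sketch that needs no packages shows that the do form and the anonymous-function form give the same result, and that a do block can take several arguments. The data here is made up for the example.

```julia
numbers = [-7, -4, 0, 3, 8, 15]

# The do block becomes the first argument of filter
evens_do = filter(numbers) do x
    x % 2 == 0
end

# Equivalent call with an anonymous function
evens_arrow = filter(x -> x % 2 == 0, numbers)

# A do block with two arguments, passed as the first argument of map
sums = map([1, 2, 3], [10, 20, 30]) do a, b
    a + b
end
```

Here evens_do and evens_arrow are both [-4, 0, 8], and sums is [11, 22, 33].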

Destructuring

In Julia you can destructure tuples (and other iterables) in the reverse way you construct them:

my_tuple = (42, "abc")  # construct
(a, b) = my_tuple       # destructure

@show a
@show b;

a = 42
b = "abc"

A really common pattern here is swapping two variables:

x = 1
y = 2
(x, y) = (y, x)
@show x
@show y;

x = 2
y = 1

You can also destructure fields of structs and named tuples in the reverse way you create a named tuple:

function f(x::Complex)
    (; re, im) = x
    println("Real part: $re")
    println("Imaginary part: $im")
end

f(10 + 42im)

Real part: 10
Imaginary part: 42

One thing you can do is destructure arguments to functions as they come in. For example:

f((x,y)) = x + y
my_pair = (4, 5)
f(my_pair)
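A few more destructuring forms, as a sketch. All names are made up for illustration, and the sketch assumes a reasonably recent Julia: the trailing ... form needs Julia 1.6 or later and the (; ...) form needs 1.7 or later.

```julia
point = (x = 1.0, y = 2.0, label = "P")

# Property destructuring picks fields by name, so order does not matter
(; label, x) = point

# A trailing ... collects whatever is left over
first_item, others... = (1, 2, 3)

# Destructuring an argument in a function signature
taxicab((a, b)) = abs(a) + abs(b)
taxicab((3, -4))    # 7
```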

Function composition and piping

Just a little bit of syntax sugar for "piping" values into function calls:

π/4 |> cos |> acos |> x -> 4x

3.141592653589793

Here is a function just like identity:

const ii = cos ∘ acos #\circ + [TAB]
(ii(π/4), π/4)

(0.7853981633974483, 0.7853981633974483)
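Note the order in which functions are applied: composition with ∘ applies the rightmost function first, while a pipe applies functions left to right. A minimal sketch, with variable names chosen for illustration:

```julia
root_of_abs = sqrt ∘ abs          # x -> sqrt(abs(x))

composed = root_of_abs(-9)        # 3.0
piped = -9 |> abs |> sqrt         # 3.0, same computation written left to right

# The broadcasting version of the pipe applies each function elementwise
elementwise = [1, -4, 9] .|> abs .|> sqrt   # [1.0, 2.0, 3.0]
```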

Dot syntax for broadcasting functions

You already know the broadcast operator. It is syntax sugar for the broadcast function.

x_range = 0:0.5:π
cos.(x_range)

7-element Vector{Float64}:
  1.0
  0.8775825618903728
  0.5403023058681398
  0.0707372016677029
 -0.4161468365471424
 -0.8011436155469337
 -0.9899924966004454

See the docs for a discussion of performance of broadcasting.

You can also use the macro @.

x_range = 0:0.5:π
@. cos(x_range + 2)^2

7-element Vector{Float64}:
 0.17317818956819406
 0.6418310927316131
 0.9800851433251829
 0.8769511271716524
 0.4272499830956933
 0.0444348690576615
 0.08046423546177377
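The sketch below spells out what the dot syntax and the @. macro stand for, by comparing each with its longer equivalent.

```julia
x_range = 0:0.5:π

# Dot syntax is sugar for a call to broadcast
dotted = cos.(x_range)
explicit = broadcast(cos, x_range)
dotted == explicit          # true

# @. adds a dot to every call and operator in the expression
with_macro = @. cos(x_range + 2)^2
by_hand = cos.(x_range .+ 2) .^ 2
with_macro == by_hand       # true
```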

More items from control flow

You are already very familiar with conditional statements (if, elseif, else), with loops (for and while), with short circuit evaluation, and with many other variants. For example you can use continue and break in loops, and Julia even supports @goto and @label for when that isn't enough.

You can find more control flow details here: control flow in Julia docs. We'll just dive into for loops and error handling below.

Iteration and `for` loops

One impressive piece of syntax sugar is the for loop. It is in fact a special case of a while loop, like so:

# this loop...
for x in iterable
    # <CODE>
end

# ...is equivalent to
tmp = iterate(iterable)
while tmp !== nothing
    (x, state) = tmp

    # <CODE>
    
    tmp = iterate(iterable, state)
end

In the above, tmp is either nothing or it's a tuple of length two — containing the next value and whatever metadata is needed to iterate to the next element.

You can make any of your data types iterable in a for loop by implementing a method on the iterate function. We won't be doing this in this course, but it is an excellent example of an interface in Julia. We'll talk more about interfaces later.
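Although we won't need it in this course, a minimal sketch of the interface may help make the loop translation above concrete. The Countdown type is a hypothetical example; the methods it adds to Base.iterate, Base.length and Base.eltype are the ones the for loop and collect rely on.

```julia
# Hypothetical iterable that counts down from a starting value to 1
struct Countdown
    from::Int
end

# Return (value, next_state), or nothing when iteration is finished
Base.iterate(c::Countdown, state = c.from) = state < 1 ? nothing : (state, state - 1)
Base.length(c::Countdown) = max(c.from, 0)
Base.eltype(::Type{Countdown}) = Int

seen = Int[]
for n in Countdown(3)
    push!(seen, n)
end
seen                     # [3, 2, 1]
collect(Countdown(3))    # [3, 2, 1]
```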

Errors and exceptions

One additional thing to know about is exception handling.

function my_2_by_2_inv(A::Matrix{Float64})
    if size(A) != (2,2)
        error("This function only works for 2x2 matrices")
    end

    d = A[1,1]*A[2,2] - A[2,1]*A[1,2]

    if d ≈ 0 #\approx + [TAB]; with the default tolerances this only holds when d == 0
        throw(ArgumentError("matrix is singular or near singular"))
    end

    return [A[2,2] -A[1,2]; -A[2,1] A[1,1]]/d
end

my_2_by_2_inv(rand(3,3))

ERROR: This function only works for 2x2 matrices

my_2_by_2_inv([ones(2) ones(2)])

ERROR: ArgumentError: matrix is singular or near singular

using LinearAlgebra

Random.seed!(0)
A = rand(2, 2)
A_inv = my_2_by_2_inv(A)
@assert A_inv*A ≈ I

Random.seed!(0)
for _ ∈ 1:10 #\in + [TAB]
    A = float.(rand(1:5,2,2))
    try
        my_2_by_2_inv(A)
    catch e
        println(e)
    end
end

An exception thrown deep in the call stack can be caught much higher up:

A = ones(2,2)
f(mat) = 10*my_2_by_2_inv(mat)
g(mat) = f(mat .+ 3)
h(mat) = 2g(mat)
h(A)

ERROR: ArgumentError: matrix is singular or near singular
Stacktrace:
 [1] my_2_by_2_inv(A::Matrix{Float64})
   @ Main ~/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:5
 [2] f(mat::Matrix{Float64})
   @ Main ./REPL[33]:1
 [3] g(mat::Matrix{Float64})
   @ Main ./REPL[34]:1
 [4] h(mat::Matrix{Float64})
   @ Main ./REPL[35]:1
 [5] top-level scope
   @ REPL[36]:1

try 
    h(A)
catch e
    println(e)
end

ArgumentError("matrix is singular or near singular")

Try, catch and finally

Errors that have been thrown can be caught and handled using try and catch. Code in a finally block runs whether or not an error occurred.

try
    0 ÷ 0
catch e
    @show e
finally
    println("done")
end;

e = DivideError()
done
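In practice you often want to handle only the errors you expect and let everything else propagate. Here is a sketch of that pattern; safe_div and log_lines are hypothetical names, and rethrow is the Base function that re-raises the exception currently being handled.

```julia
log_lines = String[]

# Hypothetical helper: integer division that returns `nothing` on division by zero
function safe_div(a, b)
    try
        return a ÷ b
    catch err
        # Only handle the error we expect; pass anything else up the stack
        err isa DivideError || rethrow()
        return nothing
    finally
        # Runs on every path out of the try block, including the returns above
        push!(log_lines, "tried $a ÷ $b")
    end
end

safe_div(7, 2)    # 3
safe_div(1, 0)    # nothing
log_lines         # one entry per call
```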

Resource cleanup

One place closures, and do syntax in particular, get used is when you want resources to be cleaned up automatically. Here's roughly how one method of open is defined in Julia:

function open(f, x)
    io = open(x)
    try    
        f(io)
    finally
        close(io)
    end
end

When you want to use this function it is common to use do syntax like so:

open(file) do (io)
    # Read the first 128 bytes from the file
    return read(io, 128) # might throw an exception
end

This way, even if your code errors out, the file is always closed by the finally block above. Using this pattern avoids mistakes and resource leakage.
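The sketch below exercises this pattern on a temporary file created with mktemp, including a case where the do block throws. The Ref named handle is a hypothetical device used only to keep hold of the stream so we can check afterwards that it was closed.

```julia
path, io = mktemp()      # temporary file; mktemp returns its path and an open stream
close(io)

open(path, "w") do f
    write(f, "Hello class")
end

contents = open(path) do f
    read(f, String)
end

# Even when the block throws, the stream is closed on the way out
handle = Ref{Any}(nothing)
try
    open(path) do f
        handle[] = f
        error("something went wrong while reading")
    end
catch
end
closed_after_error = !isopen(handle[])

rm(path)                 # clean up the temporary file
```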

Variable scope

See variables and scoping in Julia docs.

Julia actually has two separate kinds of scope: global or "toplevel" scope, and local scope.

Local scope is the simplest. Variables inside a function body take precedence over (or "shadow") any variables of the same name outside the function body.

x = 0

function f()
    x = 1
    return 2 * x
end

f()

It gets a little more complicated when code at the top level, outside any function, uses global variables.

data = [1, 2, 3]
s = 0
β, γ = 2, 1

for i in 1:length(data)
    global s    #This usage of the `global` keyword is not needed in Jupyter
                #But elsewhere without it:
                #ERROR: LoadError: UndefVarError: s not defined
    s += β*data[i]
    data[i] *= -1 #Note that we didn't need 'global' for data
end
#print(i)       #Would cause ERROR: LoadError: UndefVarError: i not defined
@show data
@show s

function sum_data(β)
    s = 0           #try adding the prefix global
    for i in 1:length(data)
        s += β*(data[i] + γ)
    end
    return s
end
@show sum_data(β/2)
@show s

data = [-1, -2, -3]
s = 12
sum_data(β / 2) = -3.0
s = 12
12

Julia uses lexical scoping:

function my_function()
    x = 10
    function my_function_inside_a_function()
        @show x
    end
    return my_function_inside_a_function
end

x = 20
f_ret = my_function()
f_ret();

x = 10

The use of outer:

function f()
    i = 0
    for i = 1:3
        # empty
    end
    return i
end
f()

function f()
    i = 0
    for outer i = 1:3
        # empty
    end
    return i
end
f()
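Giving the two versions different (hypothetical) names makes their results easy to compare side by side.

```julia
function without_outer()
    i = 0
    for i = 1:3
        # this i is a new variable, local to the loop
    end
    return i
end

function with_outer()
    i = 0
    for outer i = 1:3
        # this i is the same variable as the one defined before the loop
    end
    return i
end

without_outer()   # 0
with_outer()      # 3
```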

My advice:

use global variables as little as possible
use distinct names for variables as much as possible
the longer a variable hangs around, the longer and more descriptive its name should be

Remember you want your programs to be simple and easy to read. Variable names are cheap.

Types and the Type System

See types in the Julia docs.

Everything has a type

typeof(2.3)

Float64

typeof(2.3f0)

Float32

typeof(2)

Int64

typeof(23 // 10)

Rational{Int64}

typeof(2 + 3im)

Complex{Int64}

typeof(2.0 + 3im)

ComplexF64 (alias for Complex{Float64})

typeof("Hello!")

String

typeof([1, 2, 3])

Vector{Int64} (alias for Array{Int64, 1})

typeof([1, 2, 3.0])

Vector{Float64} (alias for Array{Float64, 1})

typeof([1.0, 2, "three"])

Vector{Any} (alias for Array{Any, 1})

typeof((1.0, 2, "three"))

Tuple{Float64, Int64, String}

typeof(1:3)

UnitRange{Int64}

typeof([1 2; 3 4])

Matrix{Int64} (alias for Array{Int64, 2})

typeof(Float64)

DataType

typeof(:Hello)

Symbol

Concrete Types

Every value in Julia has what is called a "concrete" type. Julia also has "abstract" types (like Any and Number) that we'll discuss below. Only concrete types can have values - abstract types can have subtypes and those may be concrete.

Primitive types

The most basic type in Julia is the primitive type. You can create your own types with the primitive type keywords.

Here are some definitions from Base:

primitive type UInt8 <: Unsigned 8 end

primitive type Int32 <: Signed 32 end

primitive type Float64 <: AbstractFloat 64 end

The number of bits must be a multiple of 8 (i.e. whole bytes).

By default, primitive types have no methods associated with them - you need to define low-level primitive actions for them (e.g. how to add integers). The one thing you can do is reinterpret one primitive type as another with the same number of bits.

reinterpret(UInt64, 0.0)

0x0000000000000000

reinterpret(UInt64, Float64(π))

0x400921fb54442d18

Primitive types are treated by Julia as immutable: once a value is created, it cannot be modified or "mutated". You don't modify an Int64 in Julia when you increment it with i += 1. This is syntax sugar for i = i + 1, where i + 1 returns a new Int64 and the variable i is bound to that new value.

(As an optimization, compiled Julia code may actually modify some bit of RAM in-place, but this effect is not visible to the user in any way other than being fast.)
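A short sketch of both points: reinterpret round-trips between two 64-bit primitive types without changing any bits, and "incrementing" an integer rebinds the variable rather than modifying a value in place.

```julia
bits = reinterpret(UInt64, 1.0)      # 0x3ff0000000000000
back = reinterpret(Float64, bits)    # 1.0
sizeof(Float64)                      # 8 bytes
isprimitivetype(Float64)             # true

i = 1
j = i
i += 1       # binds i to a new value; j still refers to the old one
(i, j)       # (2, 1)
```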

Structs and mutable structs

Another concrete type in Julia is the struct. A simple example is a complex number with a real and imaginary part (as defined in Base):

struct Complex{T <: Number} <: Number
    re::T
    im::T
end

Here we see a few features of struct. A struct — or structure — contains structured data within, having fields with names and types. We again use the type assertion operator :: to indicate the type, which could be a primitive type, a struct, or an abstract type like Any or Number. And the type can be parameterized — in this case it has an associated type T which must be a subtype of Number. Finally, we can define that Complex is a subtype of Number. We'll talk more about subtypes later.

Generally you construct a struct value using a function call, like:

z = Complex{Int64}(2, 3)

2 + 3im

Like primitive types, structs are by default immutable. You cannot modify an existing complex number — an operation like z += 2im returns a new complex number via a constructor, e.g. Complex{Int64}(2, 5). It is an error to do z.im = 5.

If you wanted to, you could create your own mutable struct using the mutable struct keywords:

mutable struct MutableComplex{T <: Number}
    re::T
    im::T
end

z = MutableComplex{Int64}(2, 3)
z.im += 2
z

MutableComplex{Int64}(2, 5)

Part of the reason the built-in Complex type is immutable is that means the compiler can produce faster code (as it can make more assumptions about the data in your program).
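The difference shows up in two ways, sketched below with two hypothetical point types: assigning to a field of an immutable struct is an error, and the default == compares immutable values by content but mutable values by identity.

```julia
struct FixedPoint
    x::Float64
    y::Float64
end

mutable struct MovablePoint
    x::Float64
    y::Float64
end

p = FixedPoint(1.0, 2.0)
err = try
    p.x = 5.0          # not allowed: FixedPoint is immutable
catch e
    e
end

q = MovablePoint(1.0, 2.0)
q.x = 5.0              # fine: MovablePoint is mutable

FixedPoint(1.0, 2.0) == FixedPoint(1.0, 2.0)       # true: same contents
MovablePoint(1.0, 2.0) == MovablePoint(1.0, 2.0)   # false: two distinct objects
```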

Note that we haven't yet defined any functions that act on our new MutableComplex type. If we wanted it to print like the built-in Complex numbers (like 2 + 5im), we'd need to define a method for Base.show.

function Base.show(io::IO, z::MutableComplex)
    print(io, z.re)
    print(io, " + ")
    print(io, z.im)
    print(io, "im")
end

MutableComplex{Int64}(2, 3)

2 + 3im

If we wanted it to add, subtract, multiply or divide we'd need to define methods of +, -, * and /. For example:

function Base.:+(z1::MutableComplex, z2::MutableComplex)
    return MutableComplex(z1.re + z2.re, z1.im + z2.im)
end

MutableComplex{Int64}(2, 3) + MutableComplex{Int64}(10, 6)

12 + 9im

There are yet more functions to overload that would allow it to work well with other Number types (e.g. add a MutableComplex{Float64} and a real Int64). We could do that with brute force by defining methods for every possible combination — but we will see a better way when we study generic code and interfaces, later.

`DataType`

Julia comes with a built-in struct called DataType. It has a pretty complex definition:

mutable struct DataType <: Type
    name::TypeName
    super::Type
    parameters::Tuple
    names::Tuple
    types::Tuple
    ctor::Any
    instance::Any
    size::Int32
    abstract::Bool
    mutable::Bool
    pointerfree::Bool
end

It holds all sorts of details about concrete data types. Every concrete data type is a DataType:

@show typeof(Int64)
@show typeof(Complex{Float64})
@show typeof(Vector{String});

typeof(Int64) = DataType
typeof(Complex{Float64}) = DataType
typeof(Vector{String}) = DataType

You can poke around in these structures to learn things:

@show Int64.size
@show Complex{Float64}.size
@show Complex{Float64}.name.names
@show Complex{Float64}.types;

Int64.size = 8
(Complex{Float64}).size = 16
(Complex{Float64}).name.names = svec(:re, :im)
(Complex{Float64}).types = svec(Float64, Float64)
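The internal fields of DataType may differ between Julia versions, so outside of exploration it is safer to use the documented query functions in Base. A sketch of the equivalents:

```julia
T = Complex{Float64}

fieldnames(T)       # (:re, :im)
fieldtypes(T)       # (Float64, Float64)
sizeof(T)           # 16
supertype(T)        # Number
isconcretetype(T)   # true
```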

Given that DataType is just a mutable struct, its type is also a DataType:

@show typeof(DataType)

typeof(DataType) = DataType
DataType

It's circular, but logically consistent. Note that all Turing-complete systems have some kind of circular or higher-order logic like this at some point.

Built-in types

There are a handful of special builtin types in Julia. Below are the only concrete types which are not just regular primitive types, structs and mutable structs.

`Array`

The (multidimensional) array is Julia's built-in type for storing data of arbitrary size. A primitive type or struct consists of a known set of bits, while an Array can be dynamically sized at run time.

Arrays have two type parameters, Array{T, N}. Here T is the element type, and N is the dimensionality of the array.

There are a couple of type aliases defined:

const Vector{T} = Array{T, 1}
const Matrix{T} = Array{T, 2}

Note that Vectors are resizable after creation (push!, pop!, empty!, etc), but this isn't true for other dimensionalities.

In Julia, arbitrary bytes of data (like the contents of a file) are generally represented by Vector{UInt8}.

`String`

Strings represent text of arbitrary length.

In Julia strings are immutable and assumed to be UTF-8 encoded (Windows, Java and JavaScript tend to use UTF-16 encoding). The compiler is allowed to make a few optimizations - e.g. for short strings, etc. You can mostly consider the backing bytes to be like an immutable Vector{UInt8}, which is how Julia strings are passed to functions from other programming languages like C.
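One consequence of UTF-8 is that the number of characters and the number of bytes can differ. A small sketch using a one-character string:

```julia
s = "π"

length(s)                 # 1 character
ncodeunits(s)             # 2 bytes in UTF-8
collect(codeunits(s))     # the backing bytes: [0xcf, 0x80]

# Going the other way: build a String from raw bytes
String([0x68, 0x69])      # "hi"
```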

`Tuple`

A tuple is like an immutable struct where the field names are 1, 2, 3, etc.

tuple = (42, 3.14, "abc")

(42, 3.14, "abc")

Usually you use indexing syntax to get the elements:

tuple[3]

"abc"

There is a special Tuple datatype with arbitrary number of type parameters (the only type that supports this).

typeof(tuple)

Tuple{Int64, Float64, String}

When you call a function in Julia, f(42, 3.14, "abc"), it is a bit like the function takes a single tuple as its input. If you want to return more than one thing from a function, it is common to return a tuple.

Tuples have some special rules when it comes to abstract types to help with multiple-dispatch, but that is a rather advanced topic. Despite that, every tuple value behaves just like plain old data.

`NamedTuple`

The NamedTuple, like (a = 1, b = true, c = "abc"), is a special built-in type. However, it behaves just like a regular struct with field names you can choose, and is immutable.

typeof((a = 1, b = true, c = "abc"))

NamedTuple{(:a, :b, :c), Tuple{Int64, Bool, String}}
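Tuples and named tuples are a common way to return several values from a function. In this sketch the function names are hypothetical; the named version lets the caller use field names instead of remembering positions.

```julia
# Return two values as a plain tuple and destructure at the call site
extremes(xs) = (minimum(xs), maximum(xs))
smallest, largest = extremes([3, 1, 4])

# Return the same information as a named tuple
named_extremes(xs) = (lo = minimum(xs), hi = maximum(xs))
r = named_extremes([3, 1, 4])

r.hi         # 4, access by name
r[1]         # 1, access by position still works
typeof(r) == NamedTuple{(:lo, :hi), Tuple{Int, Int}}   # true
```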

More on mutation

As we saw earlier, a type can be mutable or not (immutable). Variables/data of the immutable types are typically stored on the stack. Variables/data of mutable types are typically allocated and stored on the heap.

@show ismutable(7)
@show ismutable([7]);

ismutable(7) = false
ismutable([7]) = true

When you pass a mutable variable to a function, the function can change the data contained within. However, if you bind the input variable to a new value, this will only have an effect on the scope of variables inside the function.

f(z::Int) = (z = 0)
f(z::Array{Int}) = (z[1] = 0)

x = 1
@show typeof(x)
@show ismutable(x)
println("Before call by value: ", x)
f(x)
println("After call by value: ", x, "\n")

x = [1]
@show typeof(x)
@show ismutable(x)
println("Before call by reference: ", x)
f(x)
println("After call by reference: ", x)

typeof(x) = Int64
ismutable(x) = false
Before call by value: 1
After call by value: 1

typeof(x) = Vector{Int64}
ismutable(x) = true
Before call by reference: [1]
After call by reference: [0]

The rule is that you can mutate the inside of a mutable value (Array or mutable struct) and variables and/or data structures with references to that mutable value will be able to see the effect of the mutation.

However, variables are always bound to new values with = (e.g. z = 0), and the usual lexical scoping rules apply.
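The difference between rebinding and mutating can be sketched with two hypothetical functions (`rebind` and `zero_out!` are illustrative names only). The first binds its local name to a new array, the second writes into the array it was given:

```julia
# Rebinding: `v` becomes a local name for a different array; the caller's array is untouched.
function rebind(v)
    v = [0, 0, 0]
    return nothing
end

# Mutation: `.=` writes into the existing array that the caller also refers to.
function zero_out!(v)
    v .= 0
    return nothing
end

a = [1, 2, 3]
rebind(a)
after_rebind = copy(a)      # snapshot of `a` after the rebinding call
zero_out!(a)
after_mutation = copy(a)    # snapshot of `a` after the mutating call
@show after_rebind after_mutation;
```

By convention, functions that mutate their arguments have names ending in `!`.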

To help with preserving data that might get modified, we often make a copy with copy and deepcopy:

println("Immutable:")
a = 10
b = a
b = 20
@show a;

Immutable:
a = 10

println("\nNo copy:")
a = [10]
b = a
b[1] = 20
@show a;

No copy:
a = [20]

println("\nCopy:")
a = [10]
b = copy(a)
b[1] = 20
@show a;

Copy:
a = [10]

println("\nShallow copy:")
a = [[10]]
b = copy(a)
b[1][1] = 20
@show a;

Shallow copy:
a = [[20]]

println("\nDeep copy:")
a = [[10]]
b = deepcopy(a)
b[1][1] = 20
@show a;

Deep copy:
a = [[10]]

Abstract types

Julia has a type hierarchy (a tree). At the top of the tree is the type Any. All types have a supertype (the supertype of Any is Any). Types that are not leaves of the tree have subtypes. Some types are abstract while others are concrete. One particularly distinctive feature of Julia's type system is that concrete types may not subtype each other: all concrete types are final and may only have abstract types as their supertypes.

x = 2.3
@show typeof(x)
@show supertype(Float64)
@show supertype(AbstractFloat)
@show supertype(Real)
@show supertype(Number)
@show supertype(Any);

typeof(x) = Float64
supertype(Float64) = AbstractFloat
supertype(AbstractFloat) = Real
supertype(Real) = Number
supertype(Number) = Any
supertype(Any) = Any

There is an "is a" relationship between values and types:

isa(2.3, Number)

true

isa(2.3, String)

false

2.3 isa Float64

true

Note that x isa T is the same as typeof(x) <: T, where we say <: as "is a subtype of".

@show Float64 <: Number
@show String <: Number;

Float64 <: Number = true
String <: Number = false

We can ask whether a given type is abstract or concrete.

@show isabstracttype(Float64)
@show isconcretetype(Float64);

isabstracttype(Float64) = false
isconcretetype(Float64) = true

@show isabstracttype(Real)
@show isconcretetype(Real);

isabstracttype(Real) = true
isconcretetype(Real) = false

Structs with undefined type parameters are not concrete:

@show isconcretetype(Complex);

isconcretetype(Complex) = false

Once we provide the type parameters we do get a concrete type:

@show isconcretetype(Complex{Float64});

isconcretetype(Complex{Float64}) = true

As mentioned, Julia has a type tree. Let's walk down from Number:

using InteractiveUtils: subtypes

function type_and_children(type, prefix = "", child_prefix = "")
    if isconcretetype(type)
        @assert isempty(subtypes(type))

        println(prefix, type, ": concrete")
    else
        println(prefix, type, isabstracttype(type) ? ": abstract" : ": parameterized")

        children = subtypes(type)
        for (i, c) in enumerate(children)
            if i == length(children)
                type_and_children(c, "$(child_prefix) └─╴", "$(child_prefix)    ")
            else
                type_and_children(c, "$(child_prefix) ├─╴", "$(child_prefix) │  ")
            end 
        end
    end
end

type_and_children(Number)

Number: abstract
 ├─╴Complex: parameterized
 ├─╴DualNumbers.Dual: parameterized
 └─╴Real: abstract
     ├─╴AbstractFloat: abstract
     │   ├─╴BigFloat: concrete
     │   ├─╴Float16: concrete
     │   ├─╴Float32: concrete
     │   └─╴Float64: concrete
     ├─╴AbstractIrrational: abstract
     │   └─╴Irrational: parameterized
     ├─╴Integer: abstract
     │   ├─╴Bool: concrete
     │   ├─╴Signed: abstract
     │   │   ├─╴BigInt: concrete
     │   │   ├─╴Int128: concrete
     │   │   ├─╴Int16: concrete
     │   │   ├─╴Int32: concrete
     │   │   ├─╴Int64: concrete
     │   │   └─╴Int8: concrete
     │   └─╴Unsigned: abstract
     │       ├─╴UInt128: concrete
     │       ├─╴UInt16: concrete
     │       ├─╴UInt32: concrete
     │       ├─╴UInt64: concrete
     │       └─╴UInt8: concrete
     ├─╴Rational: parameterized
     ├─╴StatsBase.PValue: concrete
     └─╴StatsBase.TestStat: concrete

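In Julia, you can define abstract types with the abstract type keywords. For example, the number types above are declared in Base along these lines (shown for illustration only, since these names are already defined):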

abstract type Number
end

abstract type Real <: Number
end

abstract type AbstractFloat <: Real
end

primitive type Float64 <: AbstractFloat
    64
end

Union types

We've seen two types of abstract types — the abstract types that make up the type tree, and the parameterized types (abstract Complex vs concrete Complex{Float64}).

Julia has a third kind of abstract type called Union, which lets you reason about a finite set of (abstract or concrete) types.

42::Union{Int, Float64}

42

3.14::Union{Int, Float64}

3.14

"abc"::Union{Int, Float64}

ERROR: TypeError: in typeassert, expected Union{Float64, Int64}, got a value of type String

Union can handle an arbitrary number of types, Union{T1, T2, T3, ...}.

As a special case Union{T} is just the same as T. We also have Union{T, T} == T, etc.

The union of no types at all, Union{}, is a special builtin type which is the opposite of Any. No value can exist with type Union{}! Sometimes Any is called the "top" type and Union{} is called the "bottom" type. It's used internally by the compiler to rule out impossible situations, but it's not something for you to worry about.
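As a rough sketch of where unions show up in practice (the function `label_value` is hypothetical), a Union can be used in a method signature, and it sits in the subtype relation like any other type. Base functions such as `findfirst` also return a union of an index type and `Nothing`:

```julia
# A method that accepts either of two concrete types (hypothetical example function).
label_value(x::Union{Int, Float64}) = "a machine number: $x"
label_value(x) = "something else"

@show label_value(42)
@show label_value(3.14)
@show label_value("abc")

# Each member is a subtype of the union, and the union is a subtype of any common supertype.
@show Int <: Union{Int, Float64}
@show Union{Int, Float64} <: Real

# `findfirst` returns either an index or `nothing`.
@show findfirst(==(3), [1, 2, 3])
@show findfirst(==(7), [1, 2, 3]);
```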

Making types for your programs

You can define your own types. Any "serious" programming task would almost always merit that you do that.

Concrete `Person`

In object oriented languages (e.g. C++, Java, Python) types are typically called classes. A class (in such a language) will have both definitions of data and actions, typically called variables and methods respectively. An instance of a class would be called an object.

Julia is not object oriented. It rather provides a different paradigm based on multiple-dispatch (which we describe below). Nevertheless, there are user defined types, called structs (structures). The name comes from C.

struct Person # Notice the convention of using capital letters for the first letter of a struct
    height::Float64
    weight::Float64
    name::String
end

person = Person(1.79, 78.6, "Miriam") # A struct comes with a constructor function

@show typeof(person)

@show person.height # The fields of a struct are accessed via "." - not to be confused with "." used for broadcasting.
@show person.weight
@show person.name

typeof(person) = Person
person.height = 1.79
person.weight = 78.6
person.name = "Miriam"
"Miriam"

@show ismutable(person)
person.weight = 85.4 # gained some weight - but this will generate an error

ismutable(person) = false

ERROR: setfield!: immutable struct of type Person cannot be changed

Here is a mutable struct:

mutable struct MutablePerson
    height::Float64
    weight::Float64
    name::String
end

person = MutablePerson(1.79, 78.6, "Miriam")
person.weight = 85.4
println(person)

MutablePerson(1.79, 85.4, "Miriam")

Note: You typically cannot redefine a type during the same Julia session. One workaround for that is to use the Revise.jl package. We won't use it just yet.

struct MyStruct
    x::Int
end

struct MyStruct # Will generate an ERROR because redefining a struct
    x::Int
    y::Float64
end

ERROR: invalid redefinition of constant MyStruct

Abstract `Animal`

We extend the above to a whole type hierarchy of different types of animals (including humans):

abstract type Animal end

abstract type Mammal <: Animal end
abstract type Reptile <: Animal end

struct Human <: Mammal
    height::Float64
    weight::Float64
    name::String
end    

struct Dog <: Mammal
    height::Float64
    weight::Float64
end

struct FlexDog{T <: Real} <: Mammal
    height::T
    weight::T
end

struct Crocodile <: Reptile
    length::Float64
    weight::Float64
    type::Symbol # Expect to be :salt_water or :fresh_water
end

type_and_children(Animal)

Animal: abstract
 ├─╴Mammal: abstract
 │   ├─╴Dog: concrete
 │   ├─╴FlexDog: parameterized
 │   └─╴Human: concrete
 └─╴Reptile: abstract
     └─╴Crocodile: concrete

As stated above, the function that creates an instance of the type is called the constructor. Every concrete type comes with a default constructor.

methods(Crocodile)

# 2 methods for type constructor:

Crocodile(length::Float64, weight::Float64, type::Symbol) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:24
Crocodile(length, weight, type) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:24

tick_tock = Crocodile(2.3, 204, :salt_water)

Crocodile(2.3, 204.0, :salt_water)
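Note that the integer 204 was stored as 204.0. A minimal sketch of why, using a hypothetical `Lizard` struct so the block runs on its own: the untyped default constructor converts each argument to the declared field type, and fails when no conversion exists.

```julia
# Hypothetical struct, used only for this illustration.
struct Lizard
    length::Float64
    name::String
end

# The untyped default constructor calls `convert` on each argument,
# so the Int 30 is stored as the Float64 30.0.
gecko = Lizard(30, "gecko")
@show gecko.length

# If no conversion exists (String to Float64 here), construction fails.
construction_failed = try
    Lizard("long", "gecko")
    false
catch err
    err isa MethodError
end
@show construction_failed;
```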

You can also create other constructor methods:

function Crocodile(type::Symbol)
    if type == :salt_water
        return Crocodile(4.2, 410, :salt_water) # average male salt water croc
    elseif type == :fresh_water
        return Crocodile(2.3, 70, :fresh_water) # average male fresh water croc
    else
        error("Can't make crocodile of type $type")
    end
end

methods(Crocodile)

# 3 methods for type constructor:

Crocodile(length::Float64, weight::Float64, type::Symbol) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:24
Crocodile(type::Symbol) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:2
Crocodile(length, weight, type) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:24

Crocodile(:salt_water)

Crocodile(4.2, 410.0, :salt_water)

Crocodile(:fresh_water)

Crocodile(2.3, 70.0, :fresh_water)

Crocodile(:ice_water) # will generate an error

ERROR: Can't make crocodile of type ice_water

A bit more on constructors will come later.

Notice we had the parametric type FlexDog:

dash_my_dog = FlexDog(2, 4)
@show typeof(dash_my_dog)

lassy_your_dog = FlexDog(2.3f0, 5.7f0)
@show typeof(lassy_your_dog)

my_dog_array = FlexDog{UInt16}[]

typeof(dash_my_dog) = FlexDog{Int64}
typeof(lassy_your_dog) = FlexDog{Float32}
FlexDog{UInt16}[]

my_dog_array = FlexDog{Complex}[] # Will not work because Complex is not a Real

ERROR: TypeError: in FlexDog, in T, expected T<:Real, got Type{Complex}

With multiple dispatch we can get a form of polymorphism:

animal_noise(animal::Dog) = "woof"
animal_noise(animal::Human) = "hello"
animal_noise(animal::Crocodile) = "chchch"

animals = [Crocodile(:fresh_water), Human(1.79, 78.6, "Miriam"), Crocodile(:salt_water), Dog(0.63, 12.5)]
animal_noise.(animals)

4-element Vector{String}:
 "chchch"
 "hello"
 "chchch"
 "woof"

methods(animal_noise)

# 3 methods for generic function animal_noise:

animal_noise(animal::Dog) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:2
animal_noise(animal::Human) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:3
animal_noise(animal::Crocodile) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:4

We can even handle the FlexDog and Dog together:

animal_noise(animal::Union{Dog, FlexDog}) = "woof"
push!(animals, FlexDog{Int16}(2,4))
animal_noise.(animals)

5-element Vector{String}:
 "chchch"
 "hello"
 "chchch"
 "woof"
 "woof"

We can now say that animal_noise is a part of the Animal interface. Every animal::Animal will have an animal_noise(animal) which returns a String. This is a guarantee people can use to build programs about Animals.

The `Animal` interface

Most interestingly - at any point users could come along and define new types of Animal, and so long as they respect the interface the new types will function in pre-existing programs that did not anticipate them.

# sometime later...
struct Cat <: Mammal
    height::Float64
    weight::Float64
end

animal_noise(::Cat) = "meow"

animal_noise (generic function with 5 methods)

Using interfaces lets you build larger programs by assembling generic logic with different data types like they were Lego bricks. Generic code can be reused - it doesn't need to be rewritten when a new type is added, or when a new field is added to a struct, or when a new interface is introduced.

Interfaces are flexible in two ways: you can add new types of Animal or of AbstractArray, or you can think up new interfaces for Animal or add new interfaces to AbstractArray (e.g. LinearAlgebra).
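To make the "Lego brick" idea concrete, here is a self-contained sketch with a hypothetical `Instrument` hierarchy (none of these names come from the notes). The generic function `concert` is written only against the interface function `sound`, so it keeps working when a new type is added later:

```julia
# Hypothetical mini-hierarchy, separate from the Animal types above.
abstract type Instrument end
struct Drum <: Instrument end
struct Flute <: Instrument end

# The interface: every Instrument is expected to implement `sound`.
sound(::Drum) = "boom"
sound(::Flute) = "tweet"

# Generic code written only against the interface.
concert(instruments) = join((sound(i) for i in instruments), " ")

@show concert([Drum(), Flute()])

# Some time later a new type is added; `concert` needs no changes.
struct Triangle <: Instrument end
sound(::Triangle) = "ting"

@show concert([Drum(), Flute(), Triangle()]);
```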

Methods and Multiple Dispatch

See methods in Julia docs.

If there is one key attribute to Julia it is multiple dispatch as we have just seen above.

function my_f(x::Int)
    println("My integer is $x")
end

function my_f(x::Float64)
    println("My floating point number is $x")
end

@show my_f(2)
@show my_f(2.5)
@show methods(my_f);

My integer is 2
my_f(2) = nothing
My floating point number is 2.5
my_f(2.5) = nothing
methods(my_f) = # 2 methods for generic function "my_f":
[1] my_f(x::Int64) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:2
[2] my_f(x::Float64) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:6

It is worthwhile to watch this video about the philosophy of multiple dispatch. Some of the content of the video may be a bit advanced, but towards the end of the course it would be worth watching again to see what makes sense by then and what does not yet.

Specificity

When a function has multiple methods on it, Julia will automatically use the most specific one:

function my_f(x::Number)
    println("My number is $x")
end

my_f(2)
my_f(2.5)
my_f(2 // 3)

println()
@show methods(my_f);

My integer is 2
My floating point number is 2.5
My number is 2//3

methods(my_f) = # 3 methods for generic function "my_f":
[1] my_f(x::Int64) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:2
[2] my_f(x::Float64) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:6
[3] my_f(x::Number) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:2

Dispatch always uses the concrete types of the input values to find the correct method. This might mean it doesn't actually compile your code until the middle of program execution. (The Julia runtime includes the compiler itself).
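The examples so far dispatch on a single argument, but the same rule applies to all arguments at once. A small sketch with a hypothetical function `combine`, where each call picks the most specific method that matches both argument types:

```julia
# Hypothetical function with methods of increasing specificity.
combine(a::Number, b::Number) = "two numbers"
combine(a::Integer, b::Number) = "an integer and a number"
combine(a::Integer, b::Integer) = "two integers"

@show combine(1.5, 2)     # only the (Number, Number) method applies
@show combine(1, 2.5)     # (Integer, Number) is the most specific match
@show combine(1, 2);      # (Integer, Integer) is more specific than both others
```

The methods here are deliberately nested so that one is always most specific. If two methods matched equally well, Julia would raise an ambiguity error instead of guessing.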

Defining more methods for existing functions

Almost any operation in Julia is a function. For example,

using InteractiveUtils

@which 2 + 3

+(x::T, y::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} in Base at int.jl:87

There are many methods for +:

methods(+) |> length

So what if we had our own type and wanted + to work for it together with, say, an integer?

struct PlayerScore
    player_name::String
    score::Int
end

me = PlayerScore("Johnny", 22)

PlayerScore("Johnny", 22)

me = me + 10 # will generate an error since `+` for me and an integer is not defined

ERROR: MethodError: no method matching +(::PlayerScore, ::Int64)
Closest candidates are:
  +(::Any, ::Any, !Matched::Any, !Matched::Any...) at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/base/operators.jl:655
  +(!Matched::T, ::T) where T<:Union{Int128, Int16, Int32, Int64, Int8, UInt128, UInt16, UInt32, UInt64, UInt8} at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/base/int.jl:87
  +(!Matched::Base.TwicePrecision, ::Number) at /Applications/Julia-1.7.app/Contents/Resources/julia/share/julia/base/twiceprecision.jl:279
  ...

So let's define it:

import Base: + # we do this to let Julia know we will add more methods to `+`

function +(ps::PlayerScore, n::Int)::PlayerScore
    return PlayerScore(ps.player_name, ps.score + n)
end

me = me + 10

PlayerScore("Johnny", 32)
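One thing to watch for: a new method covers only the argument order it is written for. The sketch below uses a hypothetical `TeamScore` type (mirroring PlayerScore so the block runs on its own) and the qualified `Base.:+` form, which extends `+` without a separate import line:

```julia
# Hypothetical type mirroring PlayerScore, so this block runs on its own.
struct TeamScore
    team::String
    points::Int
end

# Qualifying the name as `Base.:+` extends `+` without a separate `import` line.
Base.:+(ts::TeamScore, n::Int) = TeamScore(ts.team, ts.points + n)

# `+` is not automatically commutative for new types: the reversed order needs its own method.
Base.:+(n::Int, ts::TeamScore) = ts + n

us = TeamScore("Blue", 5)
@show us + 10
@show 10 + us;
```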

You can do this for every operation and function you want (where it makes sense). What if we wanted "pretty printing"?

import Base: show

show(io::IO, ps::PlayerScore) = print(io, "Score for $(ps.player_name) = $(ps.score)")

println("We have a score: $me. Pretty good!")

We have a score: Score for Johnny = 32. Pretty good!
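A single `show` method is picked up by several other functions. The following sketch uses a hypothetical `Celsius` type to illustrate that `repr`, `string` and the printing of containers all go through it:

```julia
# Hypothetical type for illustration.
struct Celsius
    degrees::Float64
end

# Qualified form of the same extension: no `import Base: show` needed.
Base.show(io::IO, t::Celsius) = print(io, t.degrees, " C")

t = Celsius(21.5)
@show repr(t)                 # `repr` returns what `show` prints
@show string("It is ", t)     # `string` and interpolation fall back to `show`
@show sprint(show, [t, t]);   # containers call `show` on each element
```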

Example: online statistics

Let's consider an example where we want to collect some quick running statistics from some data coming in. E.g.:

using Random, Statistics

# A function that returns a data point - could be streamed from disk or the internet
fetch_new_data() = 100*rand()

function print_running_stats(n::Integer, running_stats_storage)
    for i in 1:n
        # Get some data
        data_point = fetch_new_data()

        # Collect data for statistics
        push!(running_stats_storage, data_point)

        # Periodically we look at summary statistics
        if i % 20 == 0
            println("-------")
            println("Count: ", length(running_stats_storage))
            println("Mean: ", mean(running_stats_storage))
            println("Max: ", maximum(running_stats_storage))
        end
    end
end

Random.seed!(0)
running_stats_storage = Float64[] # Some place to hold the running data
print_running_stats(100, running_stats_storage)

-------
Count: 20
Mean: 43.111155613892045
Max: 90.47275767596541
-------
Count: 40
Mean: 44.657640026434834
Max: 90.47275767596541
-------
Count: 60
Mean: 47.483269933729034
Max: 98.13388392678519
-------
Count: 80
Mean: 46.553989816155244
Max: 98.13388392678519
-------
Count: 100
Mean: 45.13778124972069
Max: 98.13388392678519

Consider the scenario where you have a lot of data to stream and n gets very big, so you don't want to recompute the mean and max every time. In fact, you might not even want to store a copy of the data in RAM at all!

Here's an approach to do "online" statistics:

mutable struct RunningStats
    count::Int
    sum::Float64
    max::Float64

    RunningStats() = new(0, 0.0, -Inf)
end

running_stats_storage = RunningStats()

RunningStats(0, 0.0, -Inf)

We can now make specific methods for push!, length, sum, mean and maximum for this new type:

import Base: push!, length, sum, maximum
import Statistics: mean

length(rsd::RunningStats) = rsd.count
sum(rsd::RunningStats) = rsd.sum
mean(rsd::RunningStats) = rsd.sum / rsd.count
maximum(rsd::RunningStats) = rsd.max

function push!(rsd::RunningStats, data_point)
    # Update the count
    rsd.count += 1

    # Update the sum
    rsd.sum += data_point

    # Update the maximum
    if rsd.max < data_point
        rsd.max = data_point
    end

    # By convention push! returns the collection it modified
    return rsd
end

Random.seed!(0)
running_stats_storage = RunningStats()
print_running_stats(100, running_stats_storage)

-------
Count: 20
Mean: 43.111155613892045
Max: 90.47275767596541
-------
Count: 40
Mean: 44.65764002643482
Max: 90.47275767596541
-------
Count: 60
Mean: 47.483269933729034
Max: 98.13388392678519
-------
Count: 80
Mean: 46.553989816155244
Max: 98.13388392678519
-------
Count: 100
Mean: 45.137781249720675
Max: 98.13388392678519

Running statistics of other types

Going a bit more generic we could have also had,

mutable struct FlexRunningStats{T <: Number}
    data::Vector{T}
    count::Int
    sum::T
    max::T

    FlexRunningStats{T}() where {T} = new{T}(T[], 0, zero(T), typemin(T))
end

length(rsd::FlexRunningStats) = rsd.count
sum(rsd::FlexRunningStats) = rsd.sum
mean(rsd::FlexRunningStats) = rsd.sum / rsd.count
maximum(rsd::FlexRunningStats) = rsd.max

function push!(rsd::FlexRunningStats, data_point)
    # Insert the new datapoint
    push!(rsd.data, data_point)

    # Update the count
    rsd.count += 1

    # Update the sum
    rsd.sum += data_point

    # Update the maximum
    if rsd.max < data_point
        rsd.max = data_point
    end

    # By convention push! returns the collection it modified
    return rsd
end

fetch_new_data() = rand(0:10^4) # override this to return integers

Random.seed!(0)
running_stats_storage = FlexRunningStats{Int}()
print_running_stats(100, running_stats_storage)

-------
Count: 20
Mean: 4310.95
Max: 9048
-------
Count: 40
Mean: 4465.675
Max: 9048
-------
Count: 60
Mean: 4748.3
Max: 9814
-------
Count: 80
Mean: 4655.375
Max: 9814
-------
Count: 100
Mean: 4513.76
Max: 9814

@which mean(running_stats_storage)

mean(rsd::FlexRunningStats) in Main at /Users/uqjnazar/git/mine/ProgrammingCourse-with-Julia-SimulationAnalysisAndLearningSystems/markdown/lecture-unit-4.jmd:13

But there is a problem with the above. What if T was Complex? Should we have done <: Number or <: Real?

In generic code it's good to consider what interfaces we want to use, so we can figure out which type constraints are appropriate. Julia will go ahead and assemble the pieces of your program together into a coherent whole.
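To see the problem with Complex concretely, the sketch below probes the two operations that the running-statistics code relies on, `typemin(T)` and `<`. The helper `throws` is hypothetical and only reports whether a call raised an error:

```julia
# Returns true if calling `f` throws an error (hypothetical helper for this sketch).
throws(f) = try
    f()
    false
catch
    true
end

@show throws(() -> typemin(Complex{Float64}))   # there is no smallest complex number
@show throws(() -> (1 + 2im) < (2 + 1im))       # complex numbers are not ordered
@show throws(() -> typemin(Float64))            # fine for a Real type
@show Complex{Float64} <: Number
@show Complex{Float64} <: Real;
```

Since a running maximum needs an ordering, a constraint like `T <: Real` would arguably describe the requirement more accurately than `T <: Number`.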

Interfaces

So above we used a few interface methods to implement our online-statistics accumulator.

push! - add an element to a collection (mutates the input)
length - the number of elements in a collection
sum - add up all the elements in a collection
maximum - the largest element in a collection
mean - the mean of all the elements in a collection

Let's have a look at some common interfaces

Numbers

Common arithmetic operations +, -, *, /, etc.
promote(x, y) - take two numbers and return two numbers of the same type, e.g. promote(3.14, 1) = (3.14, 1.0)
promote_rule(T1, T2) - declare the common type that two types promote to, e.g. promote_rule(Float64, Int64) = Float64.

Some subtypes of Number like AbstractFloat have more interface methods.
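A short sketch of promotion in action, using only functions from Base. `promote` converts values, while `promote_type` reports the common type that the promotion rules lead to:

```julia
@show promote(3.14, 1)                 # both values converted to a common type
@show promote_type(Float64, Int64)     # the common type itself
@show typeof(1 + 2.5)                  # mixed arithmetic promotes its arguments first
@show promote(1, 2//3, 0.5);           # works for more than two arguments
```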

Iterables

iterate(iter) - used in for loops
length(iter) - the number of elements to iterate
eltype(iter) - the type of the elements to iterate

Some things can be iterated even though their length or element type is not known in advance. For these you can define traits to describe the facts:

Base.IteratorSize(IterType) - return whether length is known
Base.IteratorEltype(IterType) - return whether eltype is known
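As a sketch of the iteration interface, here is a hypothetical `Countdown` type (not from the notes) that implements `iterate`, `length` and `eltype`. With these three methods it works in for loops and with generic functions such as `collect` and `sum`:

```julia
# Hypothetical iterable that counts down from `start` to 1.
struct Countdown
    start::Int
end

# `iterate` returns (item, next_state), or `nothing` when iteration is finished.
Base.iterate(c::Countdown, state = c.start) = state < 1 ? nothing : (state, state - 1)
Base.length(c::Countdown) = max(c.start, 0)
Base.eltype(::Type{Countdown}) = Int

for n in Countdown(3)
    println(n)
end

@show collect(Countdown(3))
@show sum(Countdown(4));
```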

Arrays

Arrays are iterable and satisfy the above. They are also indexable:

getindex(a, i) - the function behind a[i]
setindex!(a, v, i) - the function behind a[i] = v
size(a) - gives a tuple of sizes, e.g. size([1,2,3]) = (3,)

In fact you can create a custom array type in Julia by subtyping AbstractArray and defining just the methods above (setindex! is only needed if the array can be modified)! Ranges like 1:10 are also AbstractArrays.
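A minimal sketch of such a custom array, assuming a hypothetical read-only `OneHot` vector that is 1 at a single position and 0 elsewhere. Only `size` and `getindex` are defined (plus an `IndexStyle` declaration), and generic array functions then work on it:

```julia
# Hypothetical read-only vector: 1 at index `hot`, 0 everywhere else.
struct OneHot <: AbstractVector{Int}
    hot::Int
    len::Int
end

Base.size(v::OneHot) = (v.len,)
Base.IndexStyle(::Type{OneHot}) = IndexLinear()

function Base.getindex(v::OneHot, i::Int)
    @boundscheck checkbounds(v, i)   # error for indices outside 1:len
    return Int(i == v.hot)
end

v = OneHot(2, 3)
@show v[2]
@show collect(v)
@show sum(v)
@show v .* 5;    # broadcasting works through the same interface
```

Nothing is stored apart from the two integers, yet the value behaves like any other vector in generic code.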

Some types like Vector are resizable and for these types there are functions like push!, pop!, insert!, deleteat!, append!, empty!, etc to manipulate them as arbitrary-sized lists.

Arrays are also defined as arithmetic objects according to linear algebra, and therefore have +, -, *, /, etc defined. The LinearAlgebra standard library package has lots of functions to invert or decompose matrices, find eigenvalues, etc.

Sets

Julia has a Set{T} type and an AbstractSet{T} supertype. Sets are iterable and support things like:

in
union
intersect
setdiff
symdiff

Here's where interfaces get interesting. Things like Array also support in, but it is slow (must check every element). With Set lookup is fast (e.g. O(1) instead of O(N)) - so the operations above are fast.
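A brief sketch of the set operations listed above, with made-up data. The same function names also accept arrays, which is the shared interface at work:

```julia
a = Set([1, 2, 3])
b = Set([3, 4])

@show 2 in a
@show union(a, b) == Set([1, 2, 3, 4])
@show intersect(a, b) == Set([3])
@show setdiff(a, b) == Set([1, 2])
@show symdiff(a, b) == Set([1, 2, 4])

# The same functions work on arrays, but `in` has to scan the elements one by one.
@show 2 in [1, 2, 3]
@show intersect([1, 2, 3], [3, 4]);
```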

Dictionaries

Julia has a Dict{K, V} type and an AbstractDict{K, V} supertype. Like arrays, dictionaries are iterable and indexable. You can imagine they are like a Set but have associated values (in fact their implementation is related).

keys(dict) - return the set of keys
haskey(dict, key) - check if a key exists in the dictionary
getindex(dict, key) - get a value associated with a key
setindex!(dict, value, key) - insert or update a key
delete!(dict, key) - remove a key
keytype(dict) - the type of the dictionary keys
valtype(dict) - the type of the dictionary values

You generally iterate a dictionary by specifying if you want to iterate the keys, values, or both:

pairs(dict) - an iterable of key => value pairs (the default)
keys(dict) - an iterable of the dictionary keys only
values(dict) - an iterable of the dictionary values only
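The dictionary functions above can be sketched as follows, using a hypothetical `ages` dictionary. The indexing syntax `ages[key]` and `ages[key] = value` calls `getindex` and `setindex!` behind the scenes:

```julia
ages = Dict("Miriam" => 34, "Johnny" => 22)   # hypothetical data

ages["Dash"] = 5                 # setindex!: insert a new key
ages["Johnny"] += 1              # getindex followed by setindex!
delete!(ages, "Miriam")

@show haskey(ages, "Miriam")
@show ages["Johnny"]
@show keytype(ages) valtype(ages)

# Iterating a Dict yields key => value pairs, which can be destructured.
for (name, age) in ages
    println(name, " is ", age)
end

@show sum(values(ages));
```

The iteration order of a Dict is not guaranteed, so the order of the printed lines may vary.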

Using structs for complex, nested data structures

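You can make interesting data structures by having a struct hold references to other instances of its own type, like the tree-like structure below (every node also refers back to the root, hence the circular references in the output):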

Random.seed!(0)

struct Node
    id::UInt16
    friends::Vector{Node}
    Node() = new(rand(UInt16), [])
    Node(friend::Node) = new(rand(UInt16),[friend])
end

"""
Makes `n` children to node, each with a single friend
"""
function make_children(node::Node, n::Int, friend::Node)
    for _ in 1:n
        new_node = Node(friend)
        push!(node.friends, new_node)
    end
end

root = Node()
make_children(root, 3, root)
for node in root.friends
    make_children(node, 2,root)
end
root

Node(0x67db, Node[Node(0x118c, Node[Node(#= circular reference @-4 =#), Node(0xa95f, Node[Node(#= circular reference @-6 =#)]), Node(0x1dc7, Node[Node(#= circular reference @-6 =#)])]), Node(0xdcb5, Node[Node(#= circular reference @-4 =#), Node(0x1c00, Node[Node(#= circular reference @-6 =#)]), Node(0xb3b6, Node[Node(#= circular reference @-6 =#)])]), Node(0x1602, Node[Node(#= circular reference @-4 =#), Node(0x4a1d, Node[Node(#= circular reference @-6 =#)]), Node(0x074f, Node[Node(#= circular reference @-6 =#)])])])
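Because of the circular references, code that walks such a structure must remember which nodes it has already visited, otherwise it would loop forever. A minimal sketch under stated assumptions: `GraphNode` and `count_reachable` are hypothetical names, and identity is tracked with `objectid`:

```julia
# Hypothetical node type with the same shape as Node, but with fixed ids.
struct GraphNode
    id::Int
    friends::Vector{GraphNode}
end
GraphNode(id::Int) = GraphNode(id, GraphNode[])

# Count the distinct nodes reachable from `start`, visiting each node once.
function count_reachable(start::GraphNode)
    seen = Set{UInt}()          # object ids of the nodes already visited
    stack = [start]
    while !isempty(stack)
        node = pop!(stack)
        key = objectid(node)
        key in seen && continue # already visited: this breaks the cycles
        push!(seen, key)
        append!(stack, node.friends)
    end
    return length(seen)
end

root = GraphNode(0)
for i in 1:3
    child = GraphNode(i, [root])   # each child refers back to the root
    push!(root.friends, child)
end

@show count_reachable(root);
```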

More to be covered as part of Unit 6:

See constructors in Julia docs. More on this in Unit 6.

See conversion and promotion in Julia docs

See interfaces in Julia docs
