Mathematics for Data Science 1

Welcome to the Semester 1, 2021 offering of MATH7501. This is a bridging course in the Masters of Data Science Program at the University of Queensland. The course is designed to bring students up to speed with mathematical concepts from discrete mathematics, calculus and elementary linear algebra, all with a view to the data science, statistics and machine learning applications that follow.

The course is recommended for data science students (or similar masters of engineering students) who have not taken more than two dedicated mathematics courses in their undergraduate degree. It can also be taken by students who took several undergraduate mathematics courses a while back and require a refresher. The prerequisite for the course is a basic knowledge of high-school mathematics, including algebra, geometry, (basic) trigonometry, working with functions, logarithms and related concepts of a similar level.

The goal of the course is to enable students to speak the "language of mathematics" in a way sufficient for understanding further data science, statistics and machine learning concepts. Since the course material is quite broad, there is less emphasis on the detailed mechanics and more emphasis on the concepts at hand. The course closely follows the course reader. This document accompanies the students throughout the semester and serves as the basis for the course structure. It is composed of 10 units, where unit 1 deals with the basics of linear algebra, units 2-4 deal mostly with discrete mathematics, and units 5-10 deal with calculus, including elementary aspects of multivariate calculus.

The study units generally cover basic mathematical concepts. In addition, each unit closes with an "application" section where a concrete data science application of the underlying mathematics is discussed. Nevertheless, the focus of the course is not the application but rather the underlying mathematical foundations. The applications covered include k-means clustering, the set cover problem, automated reasoning, classification using neural networks, analysis of the number of iterations of algorithms, exploratory data analysis, basic profit optimization, and analysis of probability distributions. All of these are presented simply to illustrate how the "languages" of linear algebra, discrete mathematics and calculus come into play in the world of data science.

Students are required to follow course materials prior to lectures. Lectures then help students to further understand material studied individually. The lectures are also there to shed light on the applications of the material and to guide students towards individual study for future lectures. The course assessment includes 3 homework assignments, 2 quizzes, and a final exam that is treated from a study perspective as an extended quiz. Some of the homework assignments are to be solved "by hand" while others need to be carried out computationally using Wolfram Mathematica. Some Mathematica-based problems are taken from this extended collection of Mathematica-based exercises created by Sam Hambleton (whose YouTube channel may also give further insight into Mathematica).

Students may also look at resources from previous years: 2020 | 2019 | 2018. Keep in mind that the unit order has changed over the years.

After this course, Data Science students are encouraged to take MATH7502.

The course is coordinated by A/Prof. Yoni Nazarathy (y.nazarathy@uq.edu.au). In addition, the bulk of the lectures are by Dr. Timothy Buttsworth (t.buttsworth@uq.edu.au). Practicals are run by Dr. Aminath Shausan.

In addition to this website, there are a few other resources for information/action:

- UQ Blackboard. Use this for official announcements and grades.
- UQ Course Profile. This is the official statement of the content, requirements, and assessment of the course.
- GitHub repo for the course. This will contain ad hoc material from lectures and practicals.
- Piazza for MATH7501. Use this to discuss and ask questions. Note that teaching staff will generally only address questions if they remain unanswered for more than 48 hours. It is preferred that students help each other via Piazza.
- Lecture and practical recordings are on this playlist.
- Lecturer visit hours. See the visit hours and Zoom links on Blackboard announcements. Join for specific technical questions.
- Course submission e-mail: mathdatasciencebridgingsubmissions@uq.edu.au. Use this e-mail only to submit assessment and not for questions.
- Lecturer e-mails: Please don't use these for technical questions, but you may e-mail about logistical and personal issues.
- Tutor e-mail: Please avoid e-mailing the tutor.
- If you feel you want additional help, you may also (e-)attend meetings of the Mathematics First-year learning centres.

A detailed week by week schedule is below. The lectures are on Tuesday 17:00 - 20:00 (in three chunks of 50 minutes) via Zoom. The practicals are on Friday and run in both FD mode (in person) and EX mode (via Zoom).

You may also want to see the UQ Calendar.

Week | Tuesday Lecture Date | Lecturer(s) | Lecture Topics/Activity | Friday Practical Date | Practical Topics/Activity | Assignment Due | Comments |
---|---|---|---|---|---|---|---|
1 | Feb-23 | YN+TB | Intro + Mathematica + Unit 1 | Feb-26 | Introduction to Mathematica | | |
2 | Mar-2 | YN | Intro + Unit 1 + Unit 2 | Mar-5 | Help for assignment 1 | | |
3 | Mar-9 | YN | Unit 2 + Unit 3 | Mar-12 | Help for assignment 1 | | |
4 | Mar-16 | YN | Unit 3 + Unit 4 | Mar-19 | Prep for quiz 1 | Assignment 1 on units 1, 2, and 3 due March 18 | |
5 | Mar-23 | YN | Quiz 1 + Unit 4 | Mar-26 | Quiz 1 solution | Quiz 1 is on units 1, 2 and 3. | |
6 | Mar-30 | TB | Unit 5 | No practical | | | |
break | | | | | | | |
7 | Apr-13 | TB | Unit 6 | Apr-16 | Help for assignment 2 | | |
8 | Apr-20 | TB | Unit 6 + Unit 7 | Apr-23 | Help for assignment 2 | | |
9 | Apr-27 | TB | Unit 7 | Apr-30 | Help for assignments 2 and 3 | | |
10 | May-4 | TB | Unit 7 + Unit 8 | May-7 | Prep for quiz 2 | Assignment 2 on units 4, 5, 6, and 7 due May 6 | Although UQ centrally runs a "Monday schedule" on May 4, we will still run lectures. |
11 | May-11 | TB | Quiz 2 + Unit 9 | May-14 | Quiz 2 solution | Quiz 2 is on units 5, 6, 7 and 8. | |
12 | May-18 | TB | Unit 9 | May-21 | Help with assignment 3 | | |
13 | May-25 | TB+YN | Unit 9 + Unit 10 | May-28 | Prep for final exam/quiz + Unit 10 | Assignment 3 on units 7, 8, and 9 due May 27 | |
Exam period | | | | | | Final exam (aka Quiz 3) is on units 1, 5, 6, 7, 8, 9, and 10 | |

Below are links to supporting material for the course, some of which needs to be covered prior to the lectures:

- Basics, Mathematica, and other resources

- Unit 1 - Basic operations with matrices and vectors

- Unit 2 - Sets, counting and cardinality

- Unit 3 - Foundations in logic

- Unit 4 - Relations and functions

- Unit 5 - Sequences, their limits and series

- Unit 6 - Real functions: Limits and continuity

- Unit 7 - Derivatives, Optimisation and basic ODEs

- Unit 8 - Linear approximations and Taylor series

- Unit 9 - Integration

- Unit 10 - Partial derivatives and gradient descent

Below are homework assignments, quizzes and solutions:

- Assignment 1. Solution 1. Focus on Units 1, 2 and 3. Due Thursday March 18, 17:00.
- Assignment 2. Solution 2. Focus on Units 4, 5, 6 and 7. Due Thursday May 6, 17:00.
- Assignment 3. Solution 3. Focus on Units 7, 8 and 9. Due Thursday May 27, 17:00.

- Quiz 1. Solution. Focus on Units 1, 2, and 3. During first 90 minutes (including overheads) of lecture of week 5.
- Quiz 2. Solution. Focus on Units 5, 6, 7 and 8. During first 90 minutes (including overheads) of lecture of week 11.
- Final exam (similar to an extended Quiz). Focus on Units 1, 5, 6, 7, 8, 9, 10.

Submission instructions:

**Quiz, assignment, and project submission information**: All assessment items are to be submitted to mathdatasciencebridgingsubmissions@uq.edu.au (this account should not be used for any queries - only for submissions). Submissions should include two files, a PDF file and an audio clip. The submission must adhere to the following guidelines:

- Submit a single PDF file, with pages of uniform size, and a file size that does not exceed 8MB (you can use a PDF compression utility if needed). The name of the PDF file should be FFFF_LLLL_SN-IIII.pdf where FFFF is your first name, LLLL is your last name, SN is your student number, and IIII is "Quiz1", "Assignment1", "Project", etc.
- Do not submit code files - instead format your code into the PDF file.
- Both handwritten notes and typed notes are acceptable. A combination of handwritten and typed notes is acceptable as long as it is formatted nicely and continuously in a single PDF file.
- All graphs, plots, source code, and other figures must be clearly labeled.
- All questions/items must appear in order.
- Submit a recorded audio clip in any standard format with a minimum duration of one minute and a maximum duration of two minutes. The file size may not exceed 4MB. In your recording, state your name and describe your experience with this assignment. Mention the resources that you used to carry out the assignment and, if this is the case, state that you did not plagiarize. Name the audio clip in the same way that your PDF file is named, but with the valid audio format extension.
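
As an optional aid, the naming and size rules above can be sketched in a few lines of Python. This is only an illustration and not an official course tool: the function names `submission_names` and `within_limit`, the example student details, the `mp3` extension, and the reading of "8MB" and "4MB" as 8,000,000 and 4,000,000 bytes are all assumptions made here.

```python
from pathlib import Path

# Assumption: "8MB" and "4MB" are read conservatively as decimal megabytes.
MAX_PDF_BYTES = 8 * 1000 * 1000
MAX_AUDIO_BYTES = 4 * 1000 * 1000


def submission_names(first, last, student_number, item, audio_ext="mp3"):
    """Build the PDF and audio file names in the FFFF_LLLL_SN-IIII pattern."""
    stem = f"{first}_{last}_{student_number}-{item}"
    return f"{stem}.pdf", f"{stem}.{audio_ext.lstrip('.')}"


def within_limit(path, max_bytes):
    """Return True if the file at `path` is no larger than `max_bytes`."""
    return Path(path).stat().st_size <= max_bytes


# Hypothetical student details, for illustration only.
pdf_name, audio_name = submission_names("Jane", "Doe", "41234567", "Assignment1")
print(pdf_name)    # Jane_Doe_41234567-Assignment1.pdf
print(audio_name)  # Jane_Doe_41234567-Assignment1.mp3
```

Before sending, `within_limit(pdf_name, MAX_PDF_BYTES)` and `within_limit(audio_name, MAX_AUDIO_BYTES)` can be run on the actual files as a quick size check.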