{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Assignment 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# MATH 7502 - Semester 2, 2018\n", "## Mathematics for Data Science 2\n", "\n", "#### Created by Zhihao Qiao, Maria Kleshnina and Yoni Nazarathy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(a) Let $u = \\begin{bmatrix} \n", "1\\\\\n", "2\\\\\n", "3\\\\\n", "\\end{bmatrix} ,\n", "v = \\begin{bmatrix} \n", "-6\\\\\n", "1\\\\\n", "-2\\\\\n", "\\end{bmatrix}\n", ",\n", "w = \\begin{bmatrix} \n", "4\\\\\n", "-6\\\\\n", "-1\\\\\n", "\\end{bmatrix}.$\n", "\n", "Prove or disprove that $u$, $v$ and $w$ lie in the same plane.\n", "\n", "(b)\n", "Use the dot product to determine the angle between $u$ and $v$.\n", "\n", "(c) \n", "Find the length (L2 norm) of $u$ and of $v$. Illustrate that the Cauchy-Schwarz inequality and the triangle inequality hold for these vectors.\n", "\n", "(d)\n", "Find the unit vectors associated with $u$, $v$ and $w$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 2 \n", "\n", "(a) Find the unit vectors $u_1$ and $u_2$ in the direction of $v=(1,4)$ and $w=(-2,1,2)$ respectively.\n", "\n", "(b) Find unit vectors $v_1$ and $v_2$ that are parallel to $u_1$ and $u_2$, respectively. (Are there such vectors other than $u_1$ and $u_2$?)\n", "\n", "(c) Find unit vectors $w_1$ and $w_2$ that are perpendicular to $u_1$ and $u_2$, respectively.\n", "\n", "(d) Find the dot product of $u+w$ and $u-w$ for any unit vectors $u$ and $w$." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 3\n", "Carry out the k-means algorithm (with $k=2$) by hand on the following table:\n", "\n", "| Individual | Variable 1 | Variable 2 |\n", "|------------|-------------|------------|\n", "| 1 | 1.0 | 1.0 |\n", "| 2 | 1.5 | 2.0 |\n", "| 3 | 3.0 | 4.0 |\n", "| 4 | 5.0 | 7.0 |\n", "| 5 | 3.5 | 5.0 |\n", "| 6 | 4.5 | 5.0 |\n", "| 7 | 3.5 | 4.5 |\n", "\n", "Initialize with individuals 1 and 4, so that the initial centroids are $m_1 = (1.0,1.0)$ and $m_2 = (5.0,7.0)$. \n", "\n", "(a) Using the given initialization, verify that the two clusters at the next step are $\\{1,2,3\\}$ and $\\{4,5,6,7\\}$. \n", "\n", "(b) Find the new centroids using (a). \n", "\n", "(c) Continue iterating until the assignments stop changing, and report the final clustering for $k=2$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 4 \n", "Let $f: R^2 \\rightarrow R^2$ be the mapping $$f(x,y) = \\bigg(e^x \\sin\\big(y\\big), e^x \\cos\\big(y\\big)\\bigg)$$ \n", "\n", "(a) Compute the Jacobian matrix of $f$.\n", "\n", "(b) Find its determinant.\n", "\n", "(c) State conditions for invertibility for every $x$ and $y$. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 5 \n", "True or false? (Give a reason or proof if true, or present a counterexample if false.)\n", "\n", "(a) If $u=(1,1,1)$ is perpendicular to $v$ and $w$ for any $v$ and $w$, then $v$ is parallel to $w$.\n", "\n", "(b) If $u$ is perpendicular to $v$ and $w$, then $u$ is perpendicular to $v+w$.\n", "\n", "(c) If $u$ and $v$ are perpendicular unit vectors, then $\\|u-v\\|=\\sqrt{2}$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 6 \n", "Suppose each of the vectors $u_1,\\ldots,u_k$ is a linear combination of the vectors $v_1,\\ldots,v_m$. Assume that $w$ is a linear combination of $u_1,\\ldots,u_k$. \n", "\n", "(a) Show that $w$ is also a linear combination of $v_1,\\ldots,v_m$ for the case $m=k=2$. 
\n", "\n", "(b) Show the above for general $m$ and $k$. (Note: $m$ does not necessarily equal $k$.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 7\n", "Determine whether each of the following scalar-valued functions of n-vectors is linear. If it is linear, find its inner product representation, i.e., find an n-vector $a$ such that $f(x)=a^T x$. If it is not linear, give specific $x,y,\\alpha,\\beta$ such that \n", "$$ f(\\alpha x +\\beta y) \\neq \\alpha f(x) +\\beta f(y)$$\n", "\n", "(a) $f(x) = \\max_{k}x_k - \\min_{k}x_k$ (the spread of the values of the vector).\n", "\n", "(b) $f(x)=x_n-x_1$ (the difference between the last element and the first).\n", "\n", "(c) $f(x)$ = the median of an n-vector. Suppose $n=2k+1$ is odd; then the median is the $(k+1)$th largest number among all entries of $x$.\n", "\n", "(d) Define $x_{n+1} = f(x_n) = x_n+(x_n-x_{n-1})$, for $n\\geq 2$. (This is a simple prediction of what $x_{n+1}$ should be, based on a straight line drawn through $x_n$ and $x_{n-1}$.)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 8\n", "Clustering a collection of vectors into $k=2$ groups is called 2-way partitioning, since we are partitioning the vectors into 2 groups, with index sets $G_1$ and $G_2.$ Suppose we run k-means, with $k=2$, on the n-vectors $x_1,\\ldots,x_N$ with $x_i \\in R^n$.\n", "\n", "Show that there is a nonzero vector $w$ and a scalar $v$ that satisfy \n", "$$ w^T x_i +v \\geq 0 \\ \\ \\text{for}\\ \\ i \\in G_1$$\n", "$$ w^T x_i +v \\leq 0 \\ \\ \\text{for} \\ \\ i \\in G_2$$\n", "In other words, the affine function $f(x) = w^T x +v$ is greater than or equal to zero on the first group, and less than or equal to zero on the second group. This is called linear separation of the two groups. \n", "Hint: Consider the function $\\| x-z_1 \\|^2 - \\| x- z_2 \\|^2$, where $z_1$ and $z_2$ are the group representatives. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 9 \n", "Use the same data set as in Question 3: \n", "\n", "| Individual | Variable 1 | Variable 2 |\n", "|------------|-------------|------------|\n", "| 1 | 1.0 | 1.0 |\n", "| 2 | 1.5 | 2.0 |\n", "| 3 | 3.0 | 4.0 |\n", "| 4 | 5.0 | 7.0 |\n", "| 5 | 3.5 | 5.0 |\n", "| 6 | 4.5 | 5.0 |\n", "| 7 | 3.5 | 4.5 |\n", "\n", "(a) Write code to find the distance between each individual and each of the initial centroids for $k=2$. \n", "\n", "(b) Write code to cluster the data into two sets and find the new centroids. \n", "\n", "(c) Using (a) and (b), write code that iterates until the assignments stop changing and reports the final clustering with $k=2$. (Does your answer agree with Question 3?)\n", "\n", "(d) Repeat (c) with $k=3$. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 10\n", "Consider the mapping $f: R^2 \\rightarrow R$\n", "$$f\\big(x_1,x_2\\big) =\\sin\\big(x_1\\big)\\exp\\big(x_2\\big).$$\n", "\n", "(a) Find the Taylor approximation at $z= (1,1).$ \n", "\n", "(b) Plot the function $f.$\n", "\n", "(c) Plot the Taylor approximation with different numbers of terms to illustrate how closely the approximation matches the actual function, and comment on your result. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 11\n", "\n", "This question requires some packages (`using Distributions, PyPlot`, etc.). See the Distributions documentation (https://juliastats.github.io/Distributions.jl/latest/fit.html).\n", "\n", "(a) Generate four 3-vectors using different distributions from the above document, e.g. `rand(sampledist(), 3)`. \n", "\n", "(b) Find the unit vectors of the 3-vectors you generated in (a). \n", "\n", "(c) Find the dot product of two 3-vectors generated by the Normal distribution. 
\n", "\n", "(d) Complete the following code, which generates a vector that stores 20,000 dot products, each between two 3-vectors generated by the Normal distribution. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "using Distributions, PyPlot\n", "\n", "# Preallocate storage for the 20,000 dot products\n", "storage = zeros(20000)\n", "\n", "for i in 1:20000\n", "    m = ??????????  # fill in: dot product of two random 3-vectors\n", "    storage[i] = m\n", "end " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(e) Using the \"hist\" function from PyPlot, create a histogram of the dot products, and comment on your result. \n", "\n", "(f) Repeat (d) and (e) with the Beta distribution (choosing some parameters), and comment on your result. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Question 12\n", "\n", "Plot the solution of the ODE\n", "$$ \\frac{du}{dt} = f(u,p,t)$$\n", "on the time interval $t\\in[0,5]$, where $f(u,p,t) = tu-u^2$, with initial condition $u_0 = 2$. " ] } ], "metadata": { "kernelspec": { "display_name": "Julia 0.6.2", "language": "julia", "name": "julia-0.6" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "0.6.2" } }, "nbformat": 4, "nbformat_minor": 2 }