{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "using ForwardDiff\n", "using LinearAlgebra" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gradient (transposed Jacobian)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\nabla f(x)= \\left[\\frac{\\partial f}{\\partial x}(x)\\right]^T \\in \\mathbb{R}^n$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Hessian (symmetric matrix)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\nabla^2 f(x)= \\begin{pmatrix}\\frac{\\partial^2 f}{\\partial x_1\\partial x_1}(x) & \\frac{\\partial^2 f}{\\partial x_1\\partial x_2}(x) & \\cdots & \\frac{\\partial^2 f}{\\partial x_1\\partial x_n}(x) \\\\ \n", " \\frac{\\partial^2 f}{\\partial x_2\\partial x_1}(x) & & & \\vdots \\\\ \n", "\\vdots & & \\ddots & \\vdots \\\\ \n", " \\frac{\\partial^2 f}{\\partial x_n\\partial x_1}(x) & \\cdots & \\cdots & \\frac{\\partial^2 f}{\\partial x_n\\partial x_n}(x)\\end{pmatrix}\\in\\mathbb{R}^{n\\times n}$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Newton Method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## By solving this equation for the step\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\Delta = -\\nabla^2 f(x)^{-1} \\nabla f(x)$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Find the value of x that minimizes f(x)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "newton_method (generic function with 1 method)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Input: initial->initial point, f->objective function, limitTime->max number of iterations\n", "function newton_method(initial,f,limitTime)\n", "    x1 = initial\n", "    α = 1 # stepsize\n", "    hessian = x -> ForwardDiff.hessian(f, x)\n", "    gradient = x -> ForwardDiff.gradient(f, x)\n", "    for i in 1:limitTime\n", "        Δ = -α*inv(hessian(x1))*gradient(x1)\n", "        x2 = x1 + Δ\n", 
"        if f(x2) <= f(x1) # step is accepted\n", "            x1 = x2\n", "            # increase stepsize back towards α = 1\n", "            α = α^0.5\n", "        else # step is rejected\n", "            # decrease stepsize\n", "            α = 0.1*α\n", "        end\n", "    end\n", "    return x1\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Gauss-Newton Method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Residual function (observed value minus model prediction)\n", "$$r = \\hat{f}(x) - f(x)$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The gradient and Hessian of the sum-of-squares objective $r^Tr$ simplify to:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\nabla (r^Tr) = 2 \\nabla r^{T} r$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ H \\approx 2 \\nabla r^{T} \\nabla r$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### By solving this equation for the step\n", "$$\\Delta = (\\nabla r^{T} \\nabla r)^{-1} \\nabla r^{T}r$$\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "gaussNewton (generic function with 1 method)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Fit a scalar-valued model to data by nonlinear least squares\n", "# Input: initial->initial parameters, f->fit function, data->data matrix (inputs in the first columns, observed value in the last),\n", "#        limitTime->max number of iterations, limitDis->stop when the step is shorter than this\n", "# Output: the optimized parameters of the fit function\n", "function gaussNewton(initial,f,data,limitTime,limitDis)\n", "    x = data\n", "    # define residual function: observed value minus model prediction\n", "    r(a; x) = x[end] - f(x[1:(end-1)]; a=a)\n", "    # initial a0 for the first iterate\n", "    a0 = initial\n", "    # start iterating\n", "    looptime = 0\n", "    for t in 1:limitTime\n", "        looptime = t\n", "        fs = [r(a0; x = x[1,:])]\n", "        for i in 2:length(x[:,1])\n", "            fs = [fs;r(a0;x = x[i,:])]\n", "        end\n", "        js = Array(ForwardDiff.gradient(a -> r(a;x = x[1,:]), a0)')\n", "        for i in 2:length(x[:,1])\n", "            js = [js;Array(ForwardDiff.gradient(a -> r(a;x=x[i,:]), a0)')]\n", "        end\n", "        a1 = a0 - inv(js'*js)*js'*fs\n", "        # 
if a1 is close to a0, jump out of the loop: the minimum of the residual function has been found\n", "        if norm(a0-a1) <= limitDis\n", "            a0 = a1\n", "            break\n", "        end\n", "        a0 = a1\n", "    end\n", "    println(\"Loop time: \",looptime)\n", "    return a0\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Levenberg-Marquardt method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The residual function, Jacobian and Hessian matrix are the same as in the Gauss-Newton method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Unlike the Gauss-Newton method, the Levenberg-Marquardt method solves the damped equation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\Delta = (\\nabla r^{T} \\nabla r + \\lambda I)^{-1} \\nabla r^{T}r$$" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "LevenbergMarquardt (generic function with 1 method)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Fit a scalar-valued model to data by nonlinear least squares\n", "# Input: initial->initial parameters, f->fit function, data->data matrix (inputs in the first columns, observed value in the last),\n", "#        limitTime->max number of iterations, limitDis->stop when the step is shorter than this\n", "# Output: the optimized parameters of the fit function\n", "function LevenbergMarquardt(initial,f,data,limitTime,limitDis)\n", "    x = data\n", "    # define residual function: observed value minus model prediction\n", "    r(a; x) = x[end] - f(x[1:(end-1)]; a=a)\n", "    # initial a0 for the first iterate\n", "    a0 = initial\n", "    λ = 1e-10 # damping parameter\n", "    # start iterating\n", "    looptime = 0\n", "    for t in 1:limitTime\n", "        looptime = t\n", "        fs = [r(a0; x = x[1,:])]\n", "        for i in 2:length(x[:,1])\n", "            fs = [fs;r(a0;x = x[i,:])]\n", "        end\n", "        js = Array(ForwardDiff.gradient(a -> r(a;x = x[1,:]), a0)')\n", "        for i in 2:length(x[:,1])\n", "            js = [js;Array(ForwardDiff.gradient(a -> r(a;x=x[i,:]), a0)')]\n", "        end\n", "        # Levenberg-Marquardt step: Gauss-Newton with damping λI\n", "        a1 = a0 - inv(js'*js + λ*Matrix(I, length(initial), length(initial)))*js'*fs\n", "        # if a1 is near a0 jump out 
the loop: the minimum of the residual function has been found\n", "        if norm(a0-a1) <= limitDis\n", "            a0 = a1\n", "            break\n", "        end\n", "        # compare residuals before and after the candidate step to adapt λ\n", "        y1 = Float64[]\n", "        y2 = Float64[]\n", "        for i in 1:length(x[:,1])\n", "            y1 = [y1;f(x[i,1:(end-1)];a=a0)]\n", "            y2 = [y2;f(x[i,1:(end-1)];a=a1)]\n", "        end\n", "        residual1 = norm(y1-x[:,end])\n", "        residual2 = norm(y2-x[:,end])\n", "        if residual2 <= residual1 # step accepted: reduce damping towards Gauss-Newton\n", "            λ = 0.2*λ\n", "            a0 = a1\n", "        else # step rejected: increase damping towards gradient descent\n", "            λ = 10*λ\n", "        end\n", "    end\n", "    println(\"Loop time: \",looptime)\n", "    return a0\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a function $$f1(x) = -x_1^2 + x_2^2 - 3x_1 + 4x_2 - x_1x_3 + 26$$ to test the Newton method" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "f1 (generic function with 1 method)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f1(x) = -x[1]^2 + x[2]^2 - 3*x[1] + 4*x[2] - x[1]*x[3] + 26" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set the initial point to x = [1, 2, 3]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3-element Array{Float64,1}:\n", " 0.0\n", " -2.0\n", " -3.0" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = newton_method([1;2;3],f1,1000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and the method converges to x = [0.0, -2.0, -3.0], a stationary point of f1 (the Hessian of f1 is indefinite, so this is a stationary point rather than a global minimum)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a nonlinear function B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$B(x) = e^{x_1^3 + x_2^2 + x_1x_2}$$" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "B (generic function with 1 method)" 
] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B(x) = exp(x[1]^3 + x[2]^2 + x[1]*x[2])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate a dataset with the nonlinear function B defined above" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "xs1 = 0:0.02:1\n", "xs2 = 0:0.02:1\n", "\n", "# each row of data is (x1, x2, B([x1, x2]))\n", "data = [0 0 B([0,0])]\n", "for i in 1:length(xs1)\n", "    for j in 1:length(xs2)\n", "        data = [data;xs1[i] xs2[j] B([xs1[i]; xs2[j]])]\n", "    end\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a nonlinear function A to fit the data generated by function B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$A(x;a) = a_1x_1^2 + a_2x_2^2 + a_3x_1x_2 + a_4$$" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "A (generic function with 1 method)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A(x;a) = a[1]*x[1]^2+a[2]*x[2]^2+a[3]*x[1]*x[2]+a[4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now assume the parameters of the function that generated the dataset above are unknown. \n", "This is a nonlinear least-squares problem: find the optimized parameters a of function A."
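] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a sketch of how a Gauss-Newton step is assembled (the names r_row, a_trial, fs and js here are hypothetical; gaussNewton above builds the same quantities row by row), the residual vector and Jacobian for a trial parameter vector can be stacked directly:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch only: assemble the residual vector and Jacobian for a trial parameter vector\n", "r_row(a, row) = row[end] - A(row[1:(end-1)]; a=a)          # residual of one data row\n", "a_trial = [1.0; 2.0; 3.0; 4.0]\n", "fs = [r_row(a_trial, data[i,:]) for i in 1:size(data,1)]   # residual vector\n", "js = vcat([ForwardDiff.gradient(a -> r_row(a, data[i,:]), a_trial)' for i in 1:size(data,1)]...)\n", "Δa = inv(js'*js)*js'*fs  # the Gauss-Newton step that is subtracted from a_trial"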
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the Gauss-Newton method, set the initial parameters to [1, 2, 3, 4]" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loop time: 2\n", " 2.736285 seconds (5.03 M allocations: 511.953 MiB, 9.83% gc time)\n" ] }, { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " 2.139104342427509 \n", " 2.103249990271506 \n", " 6.169006304472237 \n", " -0.053387257275212374" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@time begin\n", "    initial = [1;2;3;4]\n", "    optimize = gaussNewton(initial,A,data,100,1e-5)\n", "end" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Residual: 51.05062195039618\n" ] } ], "source": [ "ys = data[:,end]\n", "xs = data[:,1:(end-1)]\n", "yp = Float64[]\n", "for i in 1:length(ys)\n", "    yp = [yp;A(xs[i,:];a=optimize)]\n", "end\n", "residual = norm(yp-ys)\n", "println(\"Residual: \",residual)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the Levenberg-Marquardt method, set the initial parameters to [1, 2, 3, 4]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loop time: 2\n", " 1.679500 seconds (2.88 M allocations: 453.333 MiB, 10.05% gc time)\n" ] }, { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " 2.139104342427509 \n", " 2.1032499902715056 \n", " 6.169006304472237 \n", " -0.05338725727521243" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@time begin\n", "    initial = [1;2;3;4]\n", "    optimize = LevenbergMarquardt(initial,A,data,100,1e-5)\n", "end" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ "Residual: 51.050621950396184\n" ] } ], "source": [ "ys = data[:,end]\n", "xs = data[:,1:(end-1)]\n", "yp = Float64[]\n", "for i in 1:length(ys)\n", "    yp = [yp;A(xs[i,:];a=optimize)]\n", "end\n", "residual = norm(yp-ys)\n", "println(\"Residual: \",residual)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the Gauss-Newton method and the Levenberg-Marquardt method, we find the optimized parameters [2.14, 2.10, 6.17, -0.05], with a residual of 51.05" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To compare the two methods, in this example we consider a more complicated function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a nonlinear function B1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$B1(x) = \\frac {5x_1^2}{3 + 4\\times3^{1.5x_2}}$$" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "B1 (generic function with 1 method)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B1(x) = 5*x[1]^2/(3+4*3^(1.5*x[2]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate a dataset with the nonlinear function B1 defined above" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "xs1 = 0:0.02:1\n", "xs2 = 0:0.02:1\n", "\n", "# each row of data1 is (x1, x2, B1([x1, x2]))\n", "data1 = [0 0 B1([0,0])]\n", "for i in 1:length(xs1)\n", "    for j in 1:length(xs2)\n", "        data1 = [data1;xs1[i] xs2[j] B1([xs1[i]; xs2[j]])]\n", "    end\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a nonlinear function A1 to fit the data generated by function B1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$A1(x;a) = \\frac{a_1x_1^2}{a_2 + a_3\\times3^{a_4x_2}}$$" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, 
"outputs": [ { "data": { "text/plain": [ "A1 (generic function with 1 method)" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A1(x;a) = a[1]*x[1]^2/(a[2]+a[3]*3^(a[4]*x[2]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now assume the parameters of the function that generated the dataset above are unknown. \n", "This is a nonlinear least-squares problem: find the optimized parameters a of function A1." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the Gauss-Newton method, set the initial parameters to [1, 2, 3, 4]" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loop time: 100\n", " 9.053897 seconds (13.04 M allocations: 13.283 GiB, 17.17% gc time)\n" ] }, { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " NaN\n", " NaN\n", " NaN\n", " NaN" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@time begin\n", "    initial = [1;2;3;4]\n", "    optimize = gaussNewton(initial,A1,data1,100,1e-5)\n", "end" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Residual: NaN\n" ] } ], "source": [ "ys = data1[:,end]\n", "xs = data1[:,1:(end-1)]\n", "yp = Float64[]\n", "for i in 1:length(ys)\n", "    yp = [yp;A1(xs[i,:];a=optimize)]\n", "end\n", "residual = norm(yp-ys)\n", "println(\"Residual: \",residual)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Gauss-Newton iteration diverges here: the parameters become NaN. In the Levenberg-Marquardt method, set the initial parameters to [1, 2, 3, 4]" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loop time: 53\n", " 6.658476 seconds (15.96 M allocations: 9.996 GiB, 17.02% gc time)\n" ] }, { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " 
0.9989291222734306\n", " 5.426350476005932 \n", " -4.156828381469578 \n", " -0.6716430753650134" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@time begin\n", "    initial = [1;2;3;4]\n", "    optimize = LevenbergMarquardt(initial,A1,data1,100,1e-5)\n", "end" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Residual: 0.9495149247371497\n" ] } ], "source": [ "ys = data1[:,end]\n", "xs = data1[:,1:(end-1)]\n", "yp = Float64[]\n", "for i in 1:length(ys)\n", "    yp = [yp;A1(xs[i,:];a=optimize)]\n", "end\n", "residual = norm(yp-ys)\n", "println(\"Residual: \",residual)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the Levenberg-Marquardt method, we find the optimized parameters [1.00, 5.43, -4.16, -0.67], with a residual of 0.95" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Comparing the Levenberg-Marquardt method with the Gauss-Newton method, we find that the Levenberg-Marquardt method applies in more situations: with the same initial guess, the Gauss-Newton iteration diverges to NaN, while the damping term λI keeps each Levenberg-Marquardt step well-conditioned and the iteration converges" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.1.1", "language": "julia", "name": "julia-1.1" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.1.1" } }, "nbformat": 4, "nbformat_minor": 2 }