{ "cells": [ { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "using ForwardDiff\n", "using LinearAlgebra" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Gradient (transposed Jacobian)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\nabla f(x)= \\left[\\frac{\\partial f}{\\partial x}(x)\\right]^T \\in \\mathbb{R}^n$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Hessian (symmetric matrix)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\nabla^2 f(x)= \\begin{pmatrix}\\frac{\\partial^2 f}{\\partial x_1\\partial x_1}(x) & \\frac{\\partial^2 f}{\\partial x_1\\partial x_2}(x) & \\cdots & \\frac{\\partial^2 f}{\\partial x_1\\partial x_n}(x) \\\\ \n", " \\frac{\\partial^2 f}{\\partial x_2\\partial x_1}(x) & & & \\vdots \\\\ \n", "\\vdots & & \\ddots & \\vdots \\\\ \n", " \\frac{\\partial^2 f}{\\partial x_n\\partial x_1}(x) & \\cdots & \\cdots & \\frac{\\partial^2 f}{\\partial x_n\\partial x_n}(x)\\end{pmatrix}\\in\\mathbb{R}^{n\\times n}$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Newton Method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## By solving this equation for the step\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\Delta = -\\nabla^2 f(x)^{-1} \\nabla f(x)$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Find the value of x that minimizes f(x)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "newton_method (generic function with 1 method)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Input: initial->initial point, f->objective function, limitTime->max number of iterations\n", "function newton_method(initial,f,limitTime)\n", "    x1 = initial\n", "    α = 1 # stepsize\n", "    hessian = x -> ForwardDiff.hessian(f, x)\n", "    gradient = x -> ForwardDiff.gradient(f, x)\n", "    for i in 1:limitTime\n", "        Δ = -α*inv(hessian(x1))*gradient(x1)\n", "        x2 = x1 + Δ\n", 
"        if f(x2) <= f(x1) # step is accepted\n", "            x1 = x2\n", "            # increase stepsize back towards α = 1\n", "            α = α^0.5\n", "        else # step is rejected\n", "            # decrease stepsize\n", "            α = 0.1*α\n", "        end\n", "    end\n", "    return x1\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Gauss-Newton Method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Residual function (observed value minus model prediction)\n", "$$r = \\hat{f}(x) - f(x)$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The gradient and Hessian of the sum-of-squares objective $r^Tr$ simplify to:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\nabla (r^Tr) = 2 \\nabla r^{T} r$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$ H \\approx 2 \\nabla r^{T} \\nabla r$$" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### By solving this equation for the step\n", "$$\\Delta = (\\nabla r^{T} \\nabla r)^{-1} \\nabla r^{T}r$$\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "gaussNewton (generic function with 1 method)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Fit a scalar-valued model to data by nonlinear least squares\n", "# Input: initial->initial parameters, f->fit function, data->data matrix (inputs in the first columns, observed value in the last),\n", "#        limitTime->max number of iterations, limitDis->stop when the step is shorter than this\n", "# Output: the optimized parameters of the fit function\n", "function gaussNewton(initial,f,data,limitTime,limitDis)\n", "    x = data\n", "    # define residual function: observed value minus model prediction\n", "    r(a; x) = x[end] - f(x[1:(end-1)]; a=a)\n", "    # initial a0 for the first iterate\n", "    a0 = initial\n", "    # start iterating\n", "    looptime = 0\n", "    for t in 1:limitTime\n", "        looptime = t\n", "        fs = [r(a0; x = x[1,:])]\n", "        for i in 2:length(x[:,1])\n", "            fs = [fs;r(a0;x = x[i,:])]\n", "        end\n", "        js = Array(ForwardDiff.gradient(a -> r(a;x = x[1,:]), a0)')\n", "        for i in 2:length(x[:,1])\n", "            js = [js;Array(ForwardDiff.gradient(a -> r(a;x=x[i,:]), a0)')]\n", "        end\n", "        a1 = a0 - inv(js'*js)*js'*fs\n", "        # 
if a1 is close to a0, jump out of the loop: the minimum of the residual function has been found\n", "        if norm(a0-a1) <= limitDis\n", "            a0 = a1\n", "            break\n", "        end\n", "        a0 = a1\n", "    end\n", "    println(\"Loop time: \",looptime)\n", "    return a0\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Levenberg-Marquardt method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### The residual function, Jacobian and Hessian matrix are the same as in the Gauss-Newton method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Unlike the Gauss-Newton method, the Levenberg-Marquardt method solves the damped equation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$\\Delta = (\\nabla r^{T} \\nabla r + \\lambda I)^{-1} \\nabla r^{T}r$$" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "LevenbergMarquardt (generic function with 1 method)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Fit a scalar-valued model to data by nonlinear least squares\n", "# Input: initial->initial parameters, f->fit function, data->data matrix (inputs in the first columns, observed value in the last),\n", "#        limitTime->max number of iterations, limitDis->stop when the step is shorter than this\n", "# Output: the optimized parameters of the fit function\n", "function LevenbergMarquardt(initial,f,data,limitTime,limitDis)\n", "    x = data\n", "    # define residual function: observed value minus model prediction\n", "    r(a; x) = x[end] - f(x[1:(end-1)]; a=a)\n", "    # initial a0 for the first iterate\n", "    a0 = initial\n", "    λ = 1e-10 # damping parameter\n", "    # start iterating\n", "    looptime = 0\n", "    for t in 1:limitTime\n", "        looptime = t\n", "        fs = [r(a0; x = x[1,:])]\n", "        for i in 2:length(x[:,1])\n", "            fs = [fs;r(a0;x = x[i,:])]\n", "        end\n", "        js = Array(ForwardDiff.gradient(a -> r(a;x = x[1,:]), a0)')\n", "        for i in 2:length(x[:,1])\n", "            js = [js;Array(ForwardDiff.gradient(a -> r(a;x=x[i,:]), a0)')]\n", "        end\n", "        # Levenberg-Marquardt step: Gauss-Newton with damping λI\n", "        a1 = a0 - inv(js'*js + λ*Matrix(I, length(initial), length(initial)))*js'*fs\n", "        # if a1 is near a0 jump out 
the loop: the minimum of the residual function has been found\n", "        if norm(a0-a1) <= limitDis\n", "            a0 = a1\n", "            break\n", "        end\n", "        # compare residuals before and after the candidate step to adapt λ\n", "        y1 = Float64[]\n", "        y2 = Float64[]\n", "        for i in 1:length(x[:,1])\n", "            y1 = [y1;f(x[i,1:(end-1)];a=a0)]\n", "            y2 = [y2;f(x[i,1:(end-1)];a=a1)]\n", "        end\n", "        residual1 = norm(y1-x[:,end])\n", "        residual2 = norm(y2-x[:,end])\n", "        if residual2 <= residual1 # step accepted: reduce damping towards Gauss-Newton\n", "            λ = 0.2*λ\n", "            a0 = a1\n", "        else # step rejected: increase damping towards gradient descent\n", "            λ = 10*λ\n", "        end\n", "    end\n", "    println(\"Loop time: \",looptime)\n", "    return a0\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a function $$f1(x) = -x_1^2 + x_2^2 - 3x_1 + 4x_2 - x_1x_3 + 26$$ to test the Newton method" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "f1 (generic function with 1 method)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f1(x) = -x[1]^2 + x[2]^2 - 3*x[1] + 4*x[2] - x[1]*x[3] + 26" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set the initial point to x = [1, 2, 3]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "3-element Array{Float64,1}:\n", " 0.0\n", " -2.0\n", " -3.0" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = newton_method([1;2;3],f1,1000)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and the method converges to x = [0.0, -2.0, -3.0], a stationary point of f1 (the Hessian of f1 is indefinite, so this is a stationary point rather than a global minimum)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example 2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a nonlinear function B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$B(x) = e^{x_1^3 + x_2^2 + x_1x_2}$$" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "B (generic function with 1 method)" 
] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B(x) = exp(x[1]^3 + x[2]^2 + x[1]*x[2])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate a dataset with the nonlinear function B defined above" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "xs1 = 0:0.02:1\n", "xs2 = 0:0.02:1\n", "\n", "# each row of data is (x1, x2, B([x1, x2]))\n", "data = [0 0 B([0,0])]\n", "for i in 1:length(xs1)\n", "    for j in 1:length(xs2)\n", "        data = [data;xs1[i] xs2[j] B([xs1[i]; xs2[j]])]\n", "    end\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a nonlinear function A to fit the data generated by function B" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$A(x;a) = a_1x_1^2 + a_2x_2^2 + a_3x_1x_2 + a_4$$" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "A (generic function with 1 method)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A(x;a) = a[1]*x[1]^2+a[2]*x[2]^2+a[3]*x[1]*x[2]+a[4]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now assume the parameters of the function that generated the dataset above are unknown. \n", "This is a nonlinear least-squares problem: find the optimized parameters a of function A."
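] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a sketch of how a Gauss-Newton step is assembled (the names r_row, a_trial, fs and js here are hypothetical; gaussNewton above builds the same quantities row by row), the residual vector and Jacobian for a trial parameter vector can be stacked directly:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Sketch only: assemble the residual vector and Jacobian for a trial parameter vector\n", "r_row(a, row) = row[end] - A(row[1:(end-1)]; a=a)          # residual of one data row\n", "a_trial = [1.0; 2.0; 3.0; 4.0]\n", "fs = [r_row(a_trial, data[i,:]) for i in 1:size(data,1)]   # residual vector\n", "js = vcat([ForwardDiff.gradient(a -> r_row(a, data[i,:]), a_trial)' for i in 1:size(data,1)]...)\n", "Δa = inv(js'*js)*js'*fs  # the Gauss-Newton step that is subtracted from a_trial"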
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the Gauss-Newton method, set the initial parameters to [1, 2, 3, 4]" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loop time: 2\n", " 2.736285 seconds (5.03 M allocations: 511.953 MiB, 9.83% gc time)\n" ] }, { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " 2.139104342427509 \n", " 2.103249990271506 \n", " 6.169006304472237 \n", " -0.053387257275212374" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@time begin\n", "    initial = [1;2;3;4]\n", "    optimize = gaussNewton(initial,A,data,100,1e-5)\n", "end" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Residual: 51.05062195039618\n" ] } ], "source": [ "ys = data[:,end]\n", "xs = data[:,1:(end-1)]\n", "yp = Float64[]\n", "for i in 1:length(ys)\n", "    yp = [yp;A(xs[i,:];a=optimize)]\n", "end\n", "residual = norm(yp-ys)\n", "println(\"Residual: \",residual)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the Levenberg-Marquardt method, set the initial parameters to [1, 2, 3, 4]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loop time: 2\n", " 1.679500 seconds (2.88 M allocations: 453.333 MiB, 10.05% gc time)\n" ] }, { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " 2.139104342427509 \n", " 2.1032499902715056 \n", " 6.169006304472237 \n", " -0.05338725727521243" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@time begin\n", "    initial = [1;2;3;4]\n", "    optimize = LevenbergMarquardt(initial,A,data,100,1e-5)\n", "end" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", 
"output_type": "stream", "text": [ "Residual: 51.050621950396184\n" ] } ], "source": [ "ys = data[:,end]\n", "xs = data[:,1:(end-1)]\n", "yp = Float64[]\n", "for i in 1:length(ys)\n", "    yp = [yp;A(xs[i,:];a=optimize)]\n", "end\n", "residual = norm(yp-ys)\n", "println(\"Residual: \",residual)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the Gauss-Newton method and the Levenberg-Marquardt method, we find the optimized parameters [2.14, 2.10, 6.17, -0.05], with a residual of 51.05" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Example 3" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To compare the two methods, in this example we consider a more complicated function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a nonlinear function B1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$B1(x) = \\frac {5x_1^2}{3 + 4\\times3^{1.5x_2}}$$" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "B1 (generic function with 1 method)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "B1(x) = 5*x[1]^2/(3+4*3^(1.5*x[2]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Generate a dataset with the nonlinear function B1 defined above" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "xs1 = 0:0.02:1\n", "xs2 = 0:0.02:1\n", "\n", "# each row of data1 is (x1, x2, B1([x1, x2]))\n", "data1 = [0 0 B1([0,0])]\n", "for i in 1:length(xs1)\n", "    for j in 1:length(xs2)\n", "        data1 = [data1;xs1[i] xs2[j] B1([xs1[i]; xs2[j]])]\n", "    end\n", "end" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Define a nonlinear function A1 to fit the data generated by function B1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "$$A1(x;a) = \\frac{a_1x_1^2}{a_2 + a_3\\times3^{a_4x_2}}$$" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, 
"outputs": [ { "data": { "text/plain": [ "A1 (generic function with 1 method)" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "A1(x;a) = a[1]*x[1]^2/(a[2]+a[3]*3^(a[4]*x[2]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now assume the parameters of the function that generated the dataset above are unknown. \n", "This is a nonlinear least-squares problem: find the optimized parameters a of function A1." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the Gauss-Newton method, set the initial parameters to [1, 2, 3, 4]" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loop time: 100\n", " 9.053897 seconds (13.04 M allocations: 13.283 GiB, 17.17% gc time)\n" ] }, { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " NaN\n", " NaN\n", " NaN\n", " NaN" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@time begin\n", "    initial = [1;2;3;4]\n", "    optimize = gaussNewton(initial,A1,data1,100,1e-5)\n", "end" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Residual: NaN\n" ] } ], "source": [ "ys = data1[:,end]\n", "xs = data1[:,1:(end-1)]\n", "yp = Float64[]\n", "for i in 1:length(ys)\n", "    yp = [yp;A1(xs[i,:];a=optimize)]\n", "end\n", "residual = norm(yp-ys)\n", "println(\"Residual: \",residual)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The Gauss-Newton iteration diverges here: the parameters become NaN. In the Levenberg-Marquardt method, set the initial parameters to [1, 2, 3, 4]" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Loop time: 53\n", " 6.658476 seconds (15.96 M allocations: 9.996 GiB, 17.02% gc time)\n" ] }, { "data": { "text/plain": [ "4-element Array{Float64,1}:\n", " 
0.9989291222734306\n", " 5.426350476005932 \n", " -4.156828381469578 \n", " -0.6716430753650134" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "@time begin\n", "    initial = [1;2;3;4]\n", "    optimize = LevenbergMarquardt(initial,A1,data1,100,1e-5)\n", "end" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Residual: 0.9495149247371497\n" ] } ], "source": [ "ys = data1[:,end]\n", "xs = data1[:,1:(end-1)]\n", "yp = Float64[]\n", "for i in 1:length(ys)\n", "    yp = [yp;A1(xs[i,:];a=optimize)]\n", "end\n", "residual = norm(yp-ys)\n", "println(\"Residual: \",residual)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Using the Levenberg-Marquardt method, we find the optimized parameters [1.00, 5.43, -4.16, -0.67], with a residual of 0.95" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Comparing the Levenberg-Marquardt method with the Gauss-Newton method, we find that the Levenberg-Marquardt method applies in more situations: with the same initial guess, the Gauss-Newton iteration diverges to NaN, while the damping term λI keeps each Levenberg-Marquardt step well-conditioned and the iteration converges" ] } ], "metadata": { "kernelspec": { "display_name": "Julia 1.1.1", "language": "julia", "name": "julia-1.1" }, "language_info": { "file_extension": ".jl", "mimetype": "application/julia", "name": "julia", "version": "1.1.1" } }, "nbformat": 4, "nbformat_minor": 2 }