Introduction
Today I will talk about the Taylor expansion with the Hessian matrix. Understanding this expansion is important for optimization, and machine learning constantly leads to optimization problems. So I am writing this post to record what I know about the Hessian matrix.
Overview
- Definition of Hessian matrix
- Expressing the Taylor expansion with vectors
- Optimality of the function
Definition of Hessian matrix
Assumption
f is a function that satisfies the following conditions.
- f takes an n-dimensional vector and returns a real value. This vector is written as \[x = [x_1, x_2, \ldots, x_n]\]
- For every \(x_i, i \in \{1, 2, \ldots, n\}\), f is twice partially differentiable.
Definition of the Hessian matrix
The Hessian matrix has \(\frac{\partial^2 f}{\partial x_i \partial x_j}(x)\) as its \((i, j)\) element, so it is written as follows.
\[ H(f) = \left( \begin{array}{cccc} \frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \ldots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \ldots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \ldots & \frac{\partial^2 f}{\partial x_n^2} \\ \end{array} \right) \]
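The definition above can be checked numerically. Below is a minimal sketch that approximates each entry \(\frac{\partial^2 f}{\partial x_i \partial x_j}\) with central finite differences; the function `f` and the step size `h` are my own illustrative choices, not something from the post.

```python
# Numerical Hessian via central finite differences (a sketch; the test
# function f and the step size h are illustrative assumptions).

def hessian(f, x, h=1e-5):
    """Approximate the Hessian of f at the point x (a list of floats)."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def fp(si, sj):
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            # central-difference estimate of d^2 f / (dx_i dx_j)
            H[i][j] = (fp(1, 1) - fp(1, -1) - fp(-1, 1) + fp(-1, -1)) / (4 * h * h)
    return H

# f(x1, x2) = x1^2 + 3*x1*x2 + 2*x2^2 has the constant Hessian [[2, 3], [3, 4]].
f = lambda v: v[0] ** 2 + 3 * v[0] * v[1] + 2 * v[1] ** 2
H = hessian(f, [1.0, 2.0])
```

Note that the matrix comes out symmetric, matching the fact that \(\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}\) for twice continuously differentiable f.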
Expressing the Taylor expansion with vectors
Writing the Taylor expansion up to the quadratic term gives \[f(a+h) = f(a) + \nabla f(a)^T h + \frac{1}{2} h^T \nabla^2 f(a) h + R_3\]
Note that \(H = \nabla^2 f\) is the Hessian matrix.
Notice that I stopped the expansion at the quadratic term.
There is a reason: I want to optimize f, and near a the quadratic term is what determines the local shape of f.
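To see that the quadratic truncation really captures f near a, we can compare it with the true function value: the remainder \(R_3\) is \(O(\|h\|^3)\), so halving h should shrink the error by roughly a factor of eight. The function below and its hand-computed gradient and Hessian are my own illustrative example.

```python
import math

# Second-order Taylor check (illustrative example: for f(x, y) = exp(x) + x*y^2,
# the gradient and Hessian below were computed by hand).

def f(x, y):
    return math.exp(x) + x * y ** 2

def taylor2(a, b, h1, h2):
    """Quadratic Taylor polynomial of f around (a, b)."""
    grad = (math.exp(a) + b ** 2, 2 * a * b)        # gradient of f at (a, b)
    H = ((math.exp(a), 2 * b), (2 * b, 2 * a))      # Hessian of f at (a, b)
    lin = grad[0] * h1 + grad[1] * h2
    quad = 0.5 * (H[0][0] * h1 ** 2 + 2 * H[0][1] * h1 * h2 + H[1][1] * h2 ** 2)
    return f(a, b) + lin + quad

# The remainder R_3 is O(||h||^3): halving h shrinks the error about 8x.
err_big = abs(f(0.1, 1.1) - taylor2(0.0, 1.0, 0.1, 0.1))
err_small = abs(f(0.05, 1.05) - taylor2(0.0, 1.0, 0.05, 0.05))
```

For this f the error drops from about 1e-3 to about 1.5e-4 when h is halved, consistent with cubic decay of the remainder.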
Optimality of the function
Definiteness
- An \(n \times n\) matrix \(A\) is positive definite if \(\forall z \in \mathbb{R}^n, z \neq 0: z^T A z > 0\)
- An \(n \times n\) matrix \(A\) is negative definite if \(\forall z \in \mathbb{R}^n, z \neq 0: z^T A z < 0\)
- An \(n \times n\) matrix \(A\) is positive semidefinite if \(\forall z \in \mathbb{R}^n: z^T A z \geq 0\)
- An \(n \times n\) matrix \(A\) is negative semidefinite if \(\forall z \in \mathbb{R}^n: z^T A z \leq 0\)
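In practice we rarely test \(z^T A z\) for all z directly. One standard shortcut (my addition, not from the post) is Sylvester's criterion: a symmetric matrix is positive definite iff all of its leading principal minors are positive. A minimal sketch for small matrices:

```python
# Definiteness via Sylvester's criterion (a sketch for small symmetric
# matrices; `det` uses naive Laplace expansion, fine for tiny n).

def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_positive_definite(A):
    """All leading principal minors of a symmetric A must be > 0."""
    n = len(A)
    return all(det([row[:k] for row in A[:k]]) > 0 for k in range(1, n + 1))

def is_negative_definite(A):
    # A is negative definite iff -A is positive definite.
    return is_positive_definite([[-v for v in row] for row in A])
```

For example, [[2, 1], [1, 2]] is positive definite (minors 2 and 3), while [[1, 2], [2, 1]] is not (its determinant is -3).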
Optimality
Now, f truncated by the Taylor expansion at the quadratic term is, up to lower-order terms, a quadratic form in h. Thus, at a critical point a where \(\nabla f(a) = 0\), we can judge the optimality of f:
- If H(a) (the Hessian matrix) is positive definite, f(a) is a local minimum.
- If H(a) (the Hessian matrix) is negative definite, f(a) is a local maximum.
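For two variables, these two bullet points reduce to the familiar second-derivative test, which can be sketched as follows; the function f below and its critical point are my own illustrative example.

```python
# Second-derivative test for a 2x2 symmetric Hessian at a critical point
# (a sketch; the example function f(x, y) = x^2 + 3*y^2 is an assumption).

def classify_2x2(H):
    """Classify a critical point from its 2x2 Hessian H."""
    d = H[0][0] * H[1][1] - H[0][1] * H[1][0]   # det(H)
    if d > 0 and H[0][0] > 0:
        return "local minimum"      # H is positive definite
    if d > 0 and H[0][0] < 0:
        return "local maximum"      # H is negative definite
    if d < 0:
        return "saddle point"       # H is indefinite
    return "inconclusive"           # a semidefinite borderline case

# f(x, y) = x^2 + 3*y^2 has gradient (2x, 6y), which vanishes at the
# origin, and its Hessian there is [[2, 0], [0, 6]]: a local minimum.
print(classify_2x2([[2, 0], [0, 6]]))
```

When the determinant is negative the quadratic form takes both signs, so neither bullet applies and the point is a saddle; in the semidefinite borderline case the quadratic term alone cannot decide.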
Reference
https://ja.wikipedia.org/wiki/%E3%83%98%E3%83%83%E3%82%BB%E8%A1%8C%E5%88%97
http://www2.kaiyodai.ac.jp/~takenawa/optimization/resume10-1.pdf
http://www.dais.is.tohoku.ac.jp/~shioura/teaching/mp04/mp04-8.pdf
http://tau.doshisha.ac.jp/lectures/2008.calculus-II/html.dir/node43.html