journey of Froakie (ケロマツの旅路)

Posts

Showing posts with the label Machine Learning

Theorem of kernel K-means

Introduction (Japanese version) Today, I will write about the theory of kernel K-means. Kernel K-means covers a weak point of K-means. I will explain this weak point of K-means and the strength of kernel K-means. If you have not read it yet, please read Theorem of K-means first. I have also implemented kernel K-means; that post is Implement kernel K-means. Overview: A weak point of K-means; Kernel trick; Kernel K-means; Algorithm. A weak point of K-means For example, I prepare the following dataset. K-means cannot cluster this dataset correctly because the data is distributed in the shape of circles. K-means assigns each data point according to the Euclidean distance between the point and a prototype. A prototype is the representative of a class, and the prototype in K-means is the mean vector. Thus, K-means clusters the dataset as follows. K-means does not work on a dataset like this one. The dataset which K-means is able to classify is ...
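To make the weak point concrete, here is a minimal sketch under my own assumptions: two concentric rings with hypothetical radii 1 and 3, sampled at evenly spaced angles. It is not the dataset from the post. It shows that the mean vectors of the two rings coincide at the center, so two distinct mean prototypes cannot represent the inner ring and the outer ring.

```python
import numpy as np

# Hypothetical data: two concentric rings (radii 1 and 3), 100 points each.
angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
unit_circle = np.column_stack([np.cos(angles), np.sin(angles)])
inner_ring = 1.0 * unit_circle
outer_ring = 3.0 * unit_circle

# The K-means prototype of a cluster is its mean vector.
inner_mean = inner_ring.mean(axis=0)
outer_mean = outer_ring.mean(axis=0)

# Distance between the two "ideal" prototypes: numerically zero.
prototype_gap = np.linalg.norm(inner_mean - outer_mean)
print(prototype_gap)
```

Because both ideal prototypes sit at the same point, assigning points by Euclidean distance to a prototype cannot recover the ring structure.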

Theorem of K-means

Introduction (Japanese version) Today, I will write about the theory of the K-means algorithm. K-means is a method for clustering data into K classes. The implementation post is Implement kernel K-means, which is written about kernel K-means. Kernel K-means is a method which covers the weak point of K-means. I will explain kernel K-means in another post. Overview: 1-of-K coding scheme; Prototype vector; Distortion measure; Computing. 1-of-K coding scheme The K-means algorithm clusters data into K classes. K-means expresses that $x_n$ belongs to class k as follows. Let the vector $r_n$ of size $1 \times K$ be $$r_n := (0,0,\ldots,1,\ldots,0)$$ This vector has 1 in the k-th element and 0 elsewhere. This representation is called the 1-of-K coding scheme. Prototype vector The K-means algorithm chooses a vector called a prototype, which represents a cluster. K-means regards the mean vector as the representative of the cluster. I...
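The 1-of-K coding and the mean prototype can be sketched in a few lines. This is only an illustration under my own assumptions (the function name `kmeans_step` and the toy data are hypothetical, not from the post): each row of `R` is the vector $r_n$, and each new prototype is the mean of the points assigned to it.

```python
import numpy as np

def kmeans_step(X, prototypes):
    """One assignment + update step of K-means (illustrative sketch)."""
    # Squared Euclidean distance from every point to every prototype: shape (N, K).
    d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)

    # 1-of-K coding: row n has a single 1 in the column of its cluster.
    R = np.zeros((X.shape[0], prototypes.shape[0]))
    R[np.arange(X.shape[0]), nearest] = 1.0

    # New prototype = mean vector of the assigned points (empty clusters keep the old one).
    counts = R.sum(axis=0)
    new_prototypes = prototypes.copy()
    nonempty = counts > 0
    new_prototypes[nonempty] = (R.T @ X)[nonempty] / counts[nonempty, None]
    return R, new_prototypes

# Toy data, chosen only for illustration.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
prototypes = np.array([[0.0, 0.0], [5.0, 5.0]])
R, new_prototypes = kmeans_step(X, prototypes)
print(R)
print(new_prototypes)
```

Repeating this step until `R` stops changing gives the usual K-means iteration.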

Theorem of SVM part 2

Introduction (Japanese version) This post is about the theory of SVM, and it is part 2. Part 1 derived the objective function. Here I will write about the dual problem. The objective function derived in Part 1 is called the main (primal) problem. Typically, we optimize not the main problem but the dual problem. I will derive the dual problem from the main problem. If you have not read part 1 yet, please read Theorem of SVM part 1. I have also implemented SVM; those posts are Implement linear SVM and Implement kernel SVM. Overview: Main problem; Dual problem; Lagrange function; maximize L with respect to the dual variables, minimize L with respect to the primal variables; minimize L with respect to the primal variables, maximize L with respect to the dual variables. Main problem I will review the main problem from Part 1. $$\min_{w,b} \frac{1}{2}||w||^2 + C\sum_{i \in N} \epsilon_i$$ $$~~s.t.~~ \forall i \in N,~ y_i (w^T \phi(x_i) + b) \geq 1 - \epsilon_i ,~~~ \forall i \in N,~ \epsilon_i \geq 0$$ This problem is called s...
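As a small illustration of the main problem, the sketch below evaluates its objective for a fixed $w$ and $b$. It relies on one fact: for fixed $w, b$, the smallest slack that satisfies both constraints is $\epsilon_i = \max(0, 1 - y_i(w^T\phi(x_i) + b))$. The function name, the toy data, and the choice that the rows of `Phi` are already the mapped vectors $\phi(x_i)$ are my assumptions, not taken from the post.

```python
import numpy as np

def soft_margin_objective(w, b, Phi, y, C):
    """Value of the main problem for fixed (w, b), with the smallest feasible slack."""
    margins = y * (Phi @ w + b)
    # Smallest epsilon_i with y_i f(x_i) >= 1 - epsilon_i and epsilon_i >= 0.
    slack = np.maximum(0.0, 1.0 - margins)
    value = 0.5 * (w @ w) + C * slack.sum()
    return value, slack

# Hypothetical toy data: rows of Phi play the role of phi(x_i).
Phi = np.array([[2.0, 0.0], [0.5, 0.0], [-2.0, 0.0]])
y = np.array([1.0, 1.0, -1.0])
w = np.array([1.0, 0.0])
b = 0.0

value, slack = soft_margin_objective(w, b, Phi, y, C=1.0)
print(value, slack)
```

Only the second point lies inside the margin, so it is the only one with nonzero slack.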

Theorem of SVM part 1

Introduction (Japanese version) I will explain the theory of SVM. Please also look at my implementations of SVM: Implement linear SVM and Implement kernel SVM. Today, I will explain SVM up to deriving the objective function. Overview: Generalized linear model; Explain SVM; hard margin; soft margin. Generalized linear model SVM uses a generalized linear model. The generalized linear model is the following function $$f(x) = w^T\phi(x) + b$$ where b is called the bias. $$0 = w^T\phi(x) + b$$ is a hyperplane. This hyperplane separates the two classes of $\phi(x)$. A hyperplane is an n-dimensional plane: if n = 1, the hyperplane is a line, and if n = 2, it is an ordinary plane. $\phi(x)$ has the effect of converting x into data which can be separated by a line. As an image of $\phi(x)$: the left image has nonlinear data and the right image has linear data, and $\phi(x)$ converts from the left image to the right image. I will handle $w^T \phi(x) + b$ a...
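The role of $\phi(x)$ can be sketched with a toy example. Everything here is an assumption for illustration: the feature map (appending the squared radius), the ring data, and the hand-picked $w$ and $b$ are not from the post. The point is only that data which no line separates in the original space becomes separable by the hyperplane $w^T\phi(x) + b = 0$ after the mapping.

```python
import numpy as np

def phi(X):
    """Hypothetical feature map: (x1, x2) -> (x1, x2, x1^2 + x2^2)."""
    return np.column_stack([X, (X ** 2).sum(axis=1)])

def f(X, w, b):
    """Generalized linear model f(x) = w^T phi(x) + b, applied row by row."""
    return phi(X) @ w + b

# Two concentric rings: class -1 at radius 1, class +1 at radius 3.
angles = np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)
circle = np.column_stack([np.cos(angles), np.sin(angles)])
X = np.vstack([1.0 * circle, 3.0 * circle])
y = np.concatenate([-np.ones(20), np.ones(20)])

# Hand-picked hyperplane in feature space: squared radius = 5.
w = np.array([0.0, 0.0, 1.0])
b = -5.0

predictions = np.sign(f(X, w, b))
print(np.array_equal(predictions, y))
```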

Implement kernel k-means

Introduction (Japanese version) Today, I implement kernel k-means. The k-means algorithm is a clustering algorithm. The reason I implemented kernel k-means is that my friend and I came up with the idea of introducing a kernel to k-means. I looked for a paper on kernel k-means and found [This page](http://www.cs.utexas.edu/users/inderjit/public_papers/kdd_spectral_kernelkmeans.pdf) Thus I could implement the kernel k-means algorithm. I introduce the implementation of normal k-means and kernel k-means. This post covers only the implementation of kernel k-means. I will write about the theory of kernel k-means separately and link it from this post when it is finished. # I finished. Theorem of K-means Overview: dataset; a brief explanation of k-means; k-means; kernel k-means. Dataset I used two datasets. The first dataset is designed for normal k-means, and the second is designed for kernel k-means. First data ha...
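For orientation, here is a minimal sketch of a kernel k-means loop. It is not the code from this post. It uses the expansion of the squared feature-space distance to a cluster mean, $K_{ii} - \frac{2}{|C|}\sum_{j \in C} K_{ij} + \frac{1}{|C|^2}\sum_{j,l \in C} K_{jl}$, so $\phi$ is never computed explicitly. The function names, the RBF kernel choice, `gamma`, and the toy data are all assumptions.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma):
    """Gram matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_kmeans(K, initial_labels, n_clusters, n_iter=50):
    """Reassign points to the nearest cluster mean in feature space."""
    labels = initial_labels.copy()
    n = K.shape[0]
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            mask = labels == c
            size = mask.sum()
            if size == 0:
                continue  # an empty cluster attracts no points
            dist[:, c] = (np.diag(K)
                          - 2.0 * K[:, mask].sum(axis=1) / size
                          + K[np.ix_(mask, mask)].sum() / size ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels

# Toy data: two tight groups; the fifth point starts in the wrong cluster.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1], [5.05, 5.05]])
initial = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
labels = kernel_kmeans(rbf_kernel_matrix(X, gamma=1.0), initial, n_clusters=2)
print(labels)
```

Like ordinary k-means, this loop depends on the initial labels, so in practice it is usually run from several initializations.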

Implement kernel SVM

Introduction (Japanese version) Today, I implement the kernel SVM. The optimization uses the interior point method. This post is about the implementation. I will write about the theory of kernel SVM in another post and link it here when it is finished. # I finished writing theorem of SVM. Theorem of SVM part1 My computer runs Windows, and I implemented it in Python3. Overview: introduce kernel; introduce dataset; result of implementation. Kernel The kernel is a method for solving nonlinear problems. It is based on a map that converts the data so that the classes can be separated linearly. The converted data is expressed as $\phi(x)$. \[\phi: x \mapsto \phi(x)\] However, SVM uses the kernel function \[K(x,y)=\phi(x)^T \phi(y)\] because SVM only needs to compute the inner product $\phi(x)^T \phi(y)$. Famous kernels are the RBF and polynomial kernels. RBF \[K(x,y) = \exp(-\gamma ||x-y||^2)\] polynomial \[K(x,y) = (x^Ty+c)^p\] If $K(x,y) = x^Ty$ , T...
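The kernels above are easy to write down directly. The sketch below is an illustration, not the post's implementation: the function names and test vectors are assumptions. It also checks, for one hand-picked case ($c = 0$, $p = 2$ in two dimensions), that the polynomial kernel equals the inner product of an explicit feature map $\phi(x) = (x_1^2, \sqrt{2}x_1x_2, x_2^2)$.

```python
import numpy as np

def linear_kernel(x, y):
    """K(x, y) = x^T y"""
    return x @ y

def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2)"""
    return np.exp(-gamma * np.sum((x - y) ** 2))

def polynomial_kernel(x, y, c, p):
    """K(x, y) = (x^T y + c)^p"""
    return (x @ y + c) ** p

def phi_quadratic(x):
    """Explicit feature map matching the polynomial kernel with c = 0, p = 2 in 2-D."""
    return np.array([x[0] ** 2, np.sqrt(2.0) * x[0] * x[1], x[1] ** 2])

# Hypothetical test vectors.
x = np.array([1.0, 2.0])
z = np.array([3.0, 0.0])

k_poly = polynomial_kernel(x, z, c=0.0, p=2)
k_explicit = phi_quadratic(x) @ phi_quadratic(z)
print(k_poly, k_explicit)
print(rbf_kernel(x, z, gamma=0.5))
```

The kernel gives the same number as the explicit map without ever building $\phi(x)$, which is why SVM can work with $K(x,y)$ alone.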

Implement linear SVM

Introduction (Japanese version) Today, I implement linear SVM (support vector machine). SVM was one of the strongest machine learning algorithms before deep learning became popular. I will write the details of the theory of linear SVM in other posts. # I finished writing theorem of SVM. Theorem of SVM part1 My computer runs Windows. The program is written in Python3 and uses the interior point method for optimization. Dataset I used two datasets. First, I used data separated by a hyperplane. Second, I used data in which $C_1$ and $C_2$ are mixed. The first dataset is distributed as follows. I can draw a line that separates the distribution. The second dataset is distributed as follows. I cannot draw a line that separates the distribution. I will try linear SVM on these data. Implementation data1 This line is the hyperplane. It is drawn neatly, like copperplate. data2 I tried to estimate the hyperplane with the variable C. The line depends on the value of C. The smaller C value is the ea...

Implementation of Logistic Regression

Introduction (Japanese version) Today, I implement Logistic Regression. My computer's OS is Windows 10, and the implementation is written in Python3. I use IRLS to estimate the optimal values. I introduce the theory of Logistic Regression in another post. If you are interested, look at this post. Overview: I will introduce the dataset I used; I will introduce my code in Python; I will show you the result on the command line. Dataset I use this dataset to implement Logistic Regression. This dataset is residential area data. I display this data as a Pandas DataFrame in Python3. These are the first five rows of the dataset. If people live in the house, occupancy is 1. If people do not live in the house, occupancy is 0. The data consists of 8000 samples to use as training data and 2000 samples to use as test data. However, I use 100 samples as training data and 100 samples as test data, because my computer is not designed for programming. Sorry. CODE This code is very long. Thus, I publish my ...

Theorem of Logistic regression.

Introduction (Japanese version) Today, I will write about Logistic regression. Logistic regression is a basis of machine learning. It is a model for classifying two values. Overview This post is written using PRML as a reference. The optimization uses the iteratively reweighted least squares (IRLS) method. First, I will introduce the sigmoid function. Second, I will define the probability used to classify. Third, I will write the cross-entropy error function. Fourth, I will explain IRLS. Firstly, we define the sigmoid function. The sigmoid function is the following. \[\sigma(a) = \frac{1}{1+\exp(-a)}\] I will compute the derivative of this function. \begin{eqnarray*} \frac{d}{d a} \frac{1}{1+\exp(-a)} &=& \frac{\exp(-a)}{(1+\exp(-a))^2} \\ &=& \frac{1}{1+\exp(-a)} \frac{\exp(-a)}{1+\exp(-a)}\\ &=& \frac{1}{1+\exp(-a)} \left\{ \frac{1+\exp(-a)}{1+\exp(-a)} - \frac{1}{1+\exp(-a)} \right\} \\ &=& \sigma(a)(1-\sigma(a)) \end{eqnarray*} This function is very important in term...
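The result $\sigma'(a) = \sigma(a)(1-\sigma(a))$ can be checked numerically. This is a small sketch under my own assumptions (the grid of test points and the step size `h` are arbitrary choices). It uses the standard logistic form $1/(1+\exp(-a))$, which is the form the intermediate steps of the derivation use, and compares the closed-form derivative with a central finite difference.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid: 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def sigmoid_grad(a):
    """Closed-form derivative: sigma(a) * (1 - sigma(a))."""
    s = sigmoid(a)
    return s * (1.0 - s)

# Arbitrary test points and step size for the finite-difference check.
a = np.linspace(-5.0, 5.0, 11)
h = 1e-6
numeric = (sigmoid(a + h) - sigmoid(a - h)) / (2.0 * h)

max_error = np.max(np.abs(numeric - sigmoid_grad(a)))
print(max_error)
```

The two agree up to finite-difference error, which supports the last line of the derivation.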