Source: Machine Learning by Andrew Ng, Stanford University (Coursera)
Dimensionality Reduction - Principal Component Analysis (PCA)
notations:
- $u_k$: the $k$-th principal component (direction of variation)
- $z^{(i)}$: the projection of the $i$-th example $x^{(i)}$ onto the principal components
- $x_{approx}^{(i)}$: the data recovered from the projection $z^{(i)}$, approximating $x^{(i)}$
problem formulation:
For an $n$-dimensional input dataset, reduce it to $k$ dimensions. That is, find $k$ vectors ($u_1, u_2, \cdots, u_k$) onto which to project the data, so as to minimize the projection error:
$$ error = \frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)} - x_{approx}^{(i)}\right\| ^ 2 $$
algorithm:
1. perform feature scaling and mean normalization on the original dataset $x^{(i)} \in \mathbb{R}^n$, for example as sketched below:
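A minimal Octave sketch of this step, assuming `X` is the $m \times n$ data matrix with one example per row (`mu`, `sigma_vec`, and `X_norm` are illustrative names, not from the course code):
mu = mean(X);                      % per-feature means, 1 x n
sigma_vec = std(X);                % per-feature standard deviations, 1 x n
X_norm = (X - mu) ./ sigma_vec;    % mean-normalize, then scale each feature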
2. compute the covariance matrix $\Sigma \in \mathbb{R}^{n \times n}$:
$$ \Sigma = \frac{1}{m} \sum_{i=1}^{m} (x^{(i)})(x^{(i)})^{T} $$
Sigma = X' * X / m;
3. compute the eigenvectors of the covariance matrix via singular value decomposition:
[U, S, V] = svd(Sigma);
$$ U = \begin{bmatrix} | & | & & | \\ u_1 & u_2 & \cdots & u_n \\ | & | & & | \end{bmatrix} $$
4. select the first $k$ columns of matrix $U \in \mathbb{R}^{n \times n}$ as the $k$ principal components:
U_reduce = U(:, 1:k);
5. project $x^{(i)}$ to a $k$-dimensional vector $z^{(i)}$:
$$ z^{(i)} = U_{reduce}^{T}x^{(i)} $$
Z = X * U_reduce;
6. reconstruction from compressed representation:
$$ x_{approx}^{(i)} = U_{reduce}z^{(i)} $$
X_approx = Z * U_reduce';
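Putting the six steps together, a minimal end-to-end Octave sketch (the function name `run_pca` and the error computation in the usage example are my own, not from the course):
function [Z, X_approx] = run_pca(X, k)
  % X: m x n data matrix, assumed already mean-normalized and feature-scaled
  m = size(X, 1);
  Sigma = X' * X / m;          % covariance matrix, n x n
  [U, S, V] = svd(Sigma);      % columns of U are the principal components
  U_reduce = U(:, 1:k);        % keep the first k components
  Z = X * U_reduce;            % projections, m x k
  X_approx = Z * U_reduce';    % reconstructions, m x n
end
% usage, with X_norm from step 1:
% [Z, X_approx] = run_pca(X_norm, k);
% err = sum(sum((X_norm - X_approx) .^ 2)) / size(X_norm, 1);  % avg squared projection error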
choosing the number of principal components:
The average squared projection error is:
$$ error = \frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)} - x_{approx}^{(i)}\right\| ^ 2 $$
The total variation of the dataset is:
$$ variation = \frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)}\right\|^2 $$
Typically, choose $k$ to be the smallest value so that:
$$ \frac{error}{variation} = \frac{\frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)} - x_{approx}^{(i)}\right\| ^ 2}{\frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)}\right\|^2} \leq 0.01 $$
i.e. 99% of variation is retained.
In practice, this ratio can be computed directly from the singular values on the diagonal of the matrix $S$ returned by svd:
$$ \frac{error}{variation} = 1 - \frac{\sum_{i=1}^{k}S_{ii}}{\sum_{i=1}^{n}S_{ii}} $$
$$ S = \begin{bmatrix} S_{11} & & & \\ & S_{22} & & \\ & & \ddots & \\ & & & S_{nn} \end{bmatrix} $$
Hence, the SVD only needs to be run once; then pick the smallest $k$ such that:
$$ \frac{\sum_{i=1}^{k}S_{ii}}{\sum_{i=1}^{n}S_{ii}} \geq 0.99 $$
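A hedged Octave sketch of this selection rule (variable names `s` and `variance_retained` are illustrative; `S` is the diagonal matrix returned by svd above):
s = diag(S);                             % singular values S_11, ..., S_nn
variance_retained = cumsum(s) / sum(s);  % fraction of variance retained for each k
k = find(variance_retained >= 0.99, 1);  % smallest k retaining at least 99%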
uses of PCA:
- data compression: reduce the memory needed & speed up the learning algorithm
- data visualization: reduce the data to 2D or 3D so that they can be plotted
- improper use: applying PCA to prevent overfitting (as a substitute for regularization), because PCA discards information without looking at the labels $y$