vampire.amath._pca_svd
- vampire.amath._pca_svd(A)
Principal component analysis of matrix A by singular value decomposition.
Returns loadings, principal components, and explained variance.
- Parameters:
- A : ndarray
Matrix with shape (m, n), where the n features are in columns and the m measurements are in rows.
- Returns:
- V : ndarray
Loadings (also called weights, principal directions, or principal axes): the eigenvectors of the covariance matrix of mean-subtracted A, with shape (n, n).
- T : ndarray
PC scores (principal components): the coordinates of mean-subtracted A in its principal directions, with shape (m, n).
- d : ndarray
Explained variance: the eigenvalues of the covariance matrix of mean-subtracted A, with size n.
See also
Notes
Suppose we have a matrix \(\mathbf{A} \in \mathbb{R}^{m \times n}\) with \(n\) columns of features \(\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n\) and \(m\) rows of measurements:
\[\begin{split}\mathbf{A} = \begin{bmatrix} | & | & & | \\ \mathbf{x}_1 & \mathbf{x}_2 & \cdots & \mathbf{x}_n \\ | & | & & | \\ \end{bmatrix}.\end{split}\]
We can perform principal component analysis (PCA) [1] on the matrix using singular value decomposition (SVD).
Mean subtraction
We first calculate the means of the features, \(\bar{x}_1, \bar{x}_2, \dots, \bar{x}_n\), and store them in the matrix
\[\begin{split}\mathbf{\bar{A}} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \begin{bmatrix} \bar{x}_1 & \bar{x}_2 & \cdots & \bar{x}_n \end{bmatrix}.\end{split}\]
We then calculate the mean-subtracted data
\[\mathbf{B = A - \bar{A}}\]
so that every feature column has zero mean.
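The mean-subtraction step can be sketched in NumPy as follows. This is only an illustration, not the library's source: the toy matrix and the variable names `feature_means`, `A_bar`, and `B` are assumptions chosen to mirror the symbols above.

```python
import numpy as np

# Hypothetical toy data: m = 4 measurements (rows), n = 2 features (columns).
A = np.array([[1.0, 2.0],
              [3.0, 6.0],
              [5.0, 4.0],
              [7.0, 8.0]])
m = A.shape[0]

# Row vector of feature means (x̄_1, ..., x̄_n), shape (1, n).
feature_means = A.mean(axis=0, keepdims=True)

# Outer product of a column of ones with the row of means, shape (m, n).
A_bar = np.ones((m, 1)) @ feature_means

# Mean-subtracted data: every column of B now has zero mean.
B = A - A_bar
```

In practice the explicit matrix `A_bar` is rarely built; `A - A.mean(axis=0)` gives the same `B` through broadcasting.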
Singular value decomposition
We compute the SVD of \(\mathbf{B}\):
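\[\mathbf{B} = \mathbf{U \Sigma V}^T.\]
Multiplying both sides on the right by \(\mathbf{V}\), and using \(\mathbf{V}^T\mathbf{V} = \mathbf{I}\), we get the principal components
\[\mathbf{T \equiv BV = U\Sigma},\]
where \(\mathbf{V}\) is the loading matrix. The explained variance matrix \(\mathbf{D}\) is related to \(\mathbf{\Sigma}\) by
\[\mathbf{D} = \dfrac{1}{m-1}\mathbf{\Sigma}^2,\]
where \(m\) is the number of measurements (rows), because the covariance matrix of the mean-subtracted data is \(\frac{1}{m-1}\mathbf{B}^T\mathbf{B} = \mathbf{V}\mathbf{D}\mathbf{V}^T\).
References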
[1] Brunton, S., & Kutz, J. (2019). Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Cambridge: Cambridge University Press. doi:10.1017/9781108380690
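Examples
The steps described in the Notes can be put together in a short NumPy sketch. It is not the library's implementation: `pca_svd_sketch` is a hypothetical helper, the random test matrix is made up, and the sketch assumes the sample-covariance normalization \(1/(m-1)\), with \(m\) the number of measurement rows, so that the explained variance matches the eigenvalues of `np.cov`. The signs of the columns of the loadings are not unique, so a real implementation may apply an additional sign convention.

```python
import numpy as np


def pca_svd_sketch(A):
    """Hypothetical illustration of PCA by SVD (not the library source)."""
    A = np.asarray(A, dtype=float)
    m = A.shape[0]  # number of measurements (rows)

    # Mean subtraction: make every feature column zero mean.
    B = A - A.mean(axis=0)

    # Reduced SVD, B = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(B, full_matrices=False)

    V = Vt.T              # loadings (principal directions)
    T = B @ V             # principal components, equal to U * s
    d = s**2 / (m - 1)    # explained variance (assumed 1/(m-1) normalization)
    return V, T, d


# Made-up data: m = 50 measurements of n = 3 features.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3))
V, T, d = pca_svd_sketch(A)

# Cross-check against the covariance matrix of A.
C = np.cov(A, rowvar=False)                 # sample covariance, shape (n, n)
cov_eigvals = np.linalg.eigvalsh(C)[::-1]   # eigenvalues, descending
```

Here `d` agrees with `cov_eigvals` up to floating-point error, and each column of `V` satisfies `C @ V = V * d`, which is the eigenvector relation stated in the Returns section.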