A Principal Component Method Explained (in R)
The following example demonstrates factor analysis using the covariance matrix with the iris data set in R. Instead of directly using psych::principal(), a step-by-step approach is used.
Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
---|---|---|---|---|
5.1 | 3.5 | 1.4 | 0.2 | setosa |
4.9 | 3.0 | 1.4 | 0.2 | setosa |
4.7 | 3.2 | 1.3 | 0.2 | setosa |
4.6 | 3.1 | 1.5 | 0.2 | setosa |
5.0 | 3.6 | 1.4 | 0.2 | setosa |
5.4 | 3.9 | 1.7 | 0.4 | setosa |
A.1 \(\lambda_i\) & \(e_i\) of \(\mathbf{S}\)
Find the covariance matrix \(\mathbf{S}\) with the cov()
function.
library(knitr)       # kable()
library(kableExtra)  # kable_styling(), scroll_box()
library(magrittr)    # the %>% pipe

S <- cov(iris[,-5])
S %>%
  kable(format="html", align="c", digits = 3) %>%
  kable_styling(bootstrap_options =
                  c("striped", "condensed")) %>%
  scroll_box(width = "100%")
 | Sepal.Length | Sepal.Width | Petal.Length | Petal.Width |
---|---|---|---|---|
Sepal.Length | 0.686 | -0.042 | 1.274 | 0.516 |
Sepal.Width | -0.042 | 0.190 | -0.330 | -0.122 |
Petal.Length | 1.274 | -0.330 | 3.116 | 1.296 |
Petal.Width | 0.516 | -0.122 | 1.296 | 0.581 |
The eigenvalues and eigenvectors are then computed from the covariance matrix with the eigen()
function.
S.eigen <- eigen(S)
S.eigen
eigen() decomposition
$values
[1] 4.22824171 0.24267075 0.07820950 0.02383509
$vectors
[,1] [,2] [,3] [,4]
[1,] 0.36138659 -0.65658877 -0.58202985 0.3154872
[2,] -0.08452251 -0.73016143 0.59791083 -0.3197231
[3,] 0.85667061 0.17337266 0.07623608 -0.4798390
[4,] 0.35828920 0.07548102 0.54583143 0.7536574
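As a quick sanity check (an addition to the original walkthrough), the spectral decomposition can be verified by reassembling \(\mathbf{S}\) from all four eigenvalue/eigenvector pairs:

```r
# Verify the spectral decomposition S = P D P^T using all four
# eigenvalue/eigenvector pairs
S <- cov(iris[, -5])
S.eigen <- eigen(S)

P.full <- S.eigen$vectors
D.full <- diag(S.eigen$values)
S.rebuilt <- P.full %*% D.full %*% t(P.full)

# Matches S up to floating-point error (dimnames aside)
all.equal(S.rebuilt, S, check.attributes = FALSE)
# [1] TRUE
```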
A.2 Construction of \(\hat{\mathbf{L}}\)
Before proceeding with factoring \(\mathbf{S}\) into \(PDP^T\), the number of factors \(m\) must be selected. The last two eigenvalues of \(\mathbf{S}\) are practically \(0\), so \(m = 2\) is likely a good choice.
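To make that choice concrete, each eigenvalue's share of the total variance can be computed, a quick scree-style check using the eigen() result from above:

```r
# Proportion of total variance carried by each eigenvalue of S
S.eigen <- eigen(cov(iris[, -5]))
round(S.eigen$values / sum(S.eigen$values), 3)
# [1] 0.925 0.053 0.017 0.005
```

The first two eigenvalues together account for roughly 98% of the total variance, supporting \(m = 2\).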
With \(m = 2\) factors, construct the \(P\) and \(D\) matrices from the covariance matrix with the largest two eigenvalues and the corresponding eigenvectors.
P <- S.eigen$vectors[,1:2] %>% as.matrix()
D <- diag(S.eigen$values[1:2],
nrow = dim(P)[2],
ncol = dim(P)[2])
\(\hat{\mathbf{L}}\) is then found from the \(P\) and \(D\) matrices as \(\hat{\mathbf{L}} = PD^{1/2}\):
S.loadings <- P %*% sqrt(D)
S.loadings %>% round(3)
[,1] [,2]
[1,] 0.743 -0.323
[2,] -0.174 -0.360
[3,] 1.762 0.085
[4,] 0.737 0.037
These are the unrotated factor loadings. We can see where the term principal component method derives from: the columns of \(\hat{\mathbf{L}}\) are proportional to the eigenvectors of \(\mathbf{S}\), which in turn equal the coefficients of the corresponding principal components.
# Perform PCA and take the resulting first two PCs
prcomp(iris[,-5])$rotation[,1:2] %>%
round(3)
PC1 PC2
Sepal.Length 0.361 -0.657
Sepal.Width -0.085 -0.730
Petal.Length 0.857 0.173
Petal.Width 0.358 0.075
# eigenvector of S from iris
S.eigen$vectors[,1:2] %>%
round(3)
[,1] [,2]
[1,] 0.361 -0.657
[2,] -0.085 -0.730
[3,] 0.857 0.173
[4,] 0.358 0.075
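That proportionality can also be checked directly: dividing a column of \(\hat{\mathbf{L}}\) by the matching eigenvector returns the constant \(\sqrt{\hat{\lambda}_i}\) in every position (a small check using the quantities computed above):

```r
# Column i of the loading matrix is sqrt(lambda_i) times eigenvector i
S.eigen <- eigen(cov(iris[, -5]))
S.loadings <- S.eigen$vectors[, 1:2] %*% diag(sqrt(S.eigen$values[1:2]))

ratio <- S.loadings[, 1] / S.eigen$vectors[, 1]
round(ratio, 3)                    # constant vector: 2.056 2.056 2.056 2.056
round(sqrt(S.eigen$values[1]), 3)  # the same constant, sqrt(lambda_1): 2.056
```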
A.3 Variances: Communalities and Specific Variances
The communality \(\hat{h}^2_i\), as noted previously, is the sum of the squared loadings in the \(i\)th row of \(\hat{\mathbf{L}}\):
\[ \hat{h}^2_i = \sum^m_{j=1} \hat{\ell}^2_{ij} \]
S.h2 <- S.loadings^2 %>% rowSums()
S.h2 %>% round(3)
[1] 0.657 0.160 3.110 0.544
The sums of squares of the columns of \(\hat{\mathbf{L}}\) are the respective eigenvalues of \(\mathbf{S}\), since
\[ \sqrt{\hat{\lambda}_i} \hat{e}_i = \begin{pmatrix} \hat{\ell}_{1i} \\ \hat{\ell}_{2i} \\ \vdots \\ \hat{\ell}_{pi} \\ \end{pmatrix}_{p \times 1}, ~ (\sqrt{\hat{\lambda}_i} \hat{e}_i)^T (\sqrt{\hat{\lambda}_i} \hat{e}_i) = \hat{\lambda}_i \]
colSums(S.loadings^2) %>% round(3)
[1] 4.228 0.243
S.eigen$values[1:2] %>% round(3)
[1] 4.228 0.243
A.3.1 Specific variance \(\psi_i\)
The specific variance, \(\psi_i\), is a component unique to the particular variable and is found by subtracting the respective communality \(\hat{h}^2_i\) from the corresponding diagonal element of \(\mathbf{S}\):
\[ \psi_i = s_{ii} - \hat{h}^2_i \]
(diag(S) - S.h2) %>%
  round(4)
Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
      0.0289       0.0304       0.0059       0.0368 
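With the communalities and specific variances available, the quality of the two-factor fit can be sketched by forming the residual matrix \(\mathbf{S} - (\hat{\mathbf{L}}\hat{\mathbf{L}}^T + \hat{\Psi})\); this check is an addition to the original walkthrough. Its diagonal is zero by construction, so small off-diagonal entries indicate a good fit:

```r
# Residuals of the two-factor model: S - (L L^T + Psi)
S <- cov(iris[, -5])
S.eigen <- eigen(S)
S.loadings <- S.eigen$vectors[, 1:2] %*% diag(sqrt(S.eigen$values[1:2]))
Psi <- diag(diag(S) - rowSums(S.loadings^2))

residual <- S - (S.loadings %*% t(S.loadings) + Psi)
round(residual, 3)  # diagonal exactly 0; off-diagonals near 0
```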
A.3.2 Proportion of Variance Due to Factors
The proportions of the total variance of \(\mathbf{S}\) due to common factors 1 and 2, respectively, are found by dividing the sums of squares of the columns of \(\hat{\mathbf{L}}\) by \(\mathrm{tr}(\mathbf{S}) = \sum_i s_{ii} = \sum_i \lambda_i\).
var_F1 <- colSums(S.loadings^2)[1]
var_F2 <- colSums(S.loadings^2)[2]
cbind(var_F1 / sum(S.eigen$values),
var_F2 / sum(S.eigen$values)) %>%
round(3)
[,1] [,2]
[1,] 0.925 0.053
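For comparison, the psych package mentioned at the start can produce the same unrotated solution in one call (assuming psych is installed; `covar = TRUE` requests the covariance-matrix solution and `rotate = "none"` turns off the default varimax rotation). Note that the sign of a loading column may be flipped relative to the manual calculation, since eigenvectors are only determined up to sign:

```r
library(psych)

# Unrotated principal component factor solution on the covariance matrix
fit <- principal(iris[, -5], nfactors = 2, rotate = "none", covar = TRUE)
round(unclass(fit$loadings), 3)
```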