Use PCA Results to Reconstruct All or Part of the Original Data Set

Starting from the results of PCA, this function reconstructs an approximation (Xhat) of the original data using some or all of the principal components. The approach is inspired by, and very closely follows, https://stackoverflow.com/a/23603958/633251. We are grateful for this post by StackOverflow contributor "Marc in the box."
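The reconstruction described above can be sketched for the prcomp case as follows. This is a minimal illustration under stated assumptions, not the package's implementation: reconstruct_prcomp is a hypothetical name, and it relies only on the documented prcomp components x, rotation, center and scale.

```r
# Hypothetical helper (not part of the package): rebuild the data from a
# prcomp object using only the first ncomp principal components.
reconstruct_prcomp <- function(pca, ncomp = ncol(pca$x)) {
  keep <- seq_len(ncomp)
  scores <- pca$x[, keep, drop = FALSE]
  loadings <- pca$rotation[, keep, drop = FALSE]

  # Scores times transposed loadings gives the centered (and scaled) data
  Xhat <- scores %*% t(loadings)

  # Undo the preprocessing in reverse order: unscale first, then uncenter.
  # prcomp stores FALSE in these slots when the step was not applied.
  if (!isFALSE(pca$scale)) Xhat <- sweep(Xhat, 2, pca$scale, "*")
  if (!isFALSE(pca$center)) Xhat <- sweep(Xhat, 2, pca$center, "+")
  Xhat
}

set.seed(1)
M <- matrix(rnorm(200), nrow = 20, ncol = 10)
p <- prcomp(M, scale. = TRUE)

full <- reconstruct_prcomp(p)        # all components
part <- reconstruct_prcomp(p, 3)     # first three components only
all.equal(full, M, check.attributes = FALSE)
```

With all components retained the result should match the input to within numerical tolerance; with fewer components it has the same dimensions but is only an approximation.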

PCAtoXhat(pca, ncomp = NULL)

Arguments

pca: An object of class prcomp or princomp (automatically detected). The results of data reduction by PCA.
ncomp: Integer. The number of principal components to use in reconstructing the data set. Must be no larger than the number of variables. If not specified, all the components are used and the original data set is reconstructed.
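One way the two accepted classes and the ncomp default might be handled is sketched below. pca_parts is a hypothetical helper introduced only for illustration; how PCAtoXhat does this internally is not shown here. The sketch assumes the standard component names: x and rotation for prcomp, scores and loadings for princomp. It checks ncomp against the number of components actually stored in the object, which is an assumption of this sketch.

```r
# Hypothetical helper (not part of the package): extract the pieces needed
# for reconstruction from either a prcomp or a princomp object.
pca_parts <- function(pca, ncomp = NULL) {
  if (inherits(pca, "prcomp")) {
    scores <- pca$x
    loadings <- pca$rotation
  } else if (inherits(pca, "princomp")) {
    scores <- pca$scores
    loadings <- unclass(pca$loadings)  # drop the "loadings" class, keep the matrix
  } else {
    stop("pca must be of class prcomp or princomp")
  }

  # Default: use every available component
  if (is.null(ncomp)) ncomp <- ncol(scores)
  if (ncomp < 1 || ncomp > ncol(scores)) {
    stop("ncomp must be between 1 and the number of components")
  }

  keep <- seq_len(ncomp)
  list(
    scores = scores[, keep, drop = FALSE],
    loadings = loadings[, keep, drop = FALSE],
    center = pca$center,
    scale = pca$scale
  )
}

parts <- pca_parts(princomp(USArrests), ncomp = 2)
dim(parts$scores)
dim(parts$loadings)
```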

Value

A matrix with the same dimensions as pca$x (for prcomp) or pca$scores (for princomp), that is, the dimensions of the original data set.

Examples

# Example data from ?prcomp (see discussion at Stats.StackExchange.com/q/397793)
C <- chol(S <- toeplitz(.9 ^ (0:31)))
set.seed(17)
X <- matrix(rnorm(32000), 1000, 32)
Z <- X %*% C

pcaz <- prcomp(Z)
tst <- PCAtoXhat(pcaz)
all.equal(tst, Z, check.attributes = FALSE)
#> [1] TRUE

# Plot to show the effect of increasing ncomp

ntests <- ncol(Z)
rmsd <- rep(NA_real_, ntests)
for (i in 1:ntests) {
  ans <- PCAtoXhat(pcaz, ncomp = i) # reconstruct Z from the first i components
  del <- ans - Z
  rmsd[i] <- sqrt(sum(del^2)/length(del)) # RMSD
}
plot(rmsd, type = "b",
  main = "Root Mean Squared Deviation\nReconstructed - Original Data",
  xlab = "No. of Components Retained", ylab = "RMSD")
abline(h = 0.0, col = "pink")