knitr::opts_chunk$set(collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4) # Assuming necessary multiblock functions are loaded library(ggplot2) # for plots below # Need to ensure pca, svd_wrapper, bi_projector, center, prep, scores, # reconstruct, truncate, shape, screeplot, biplot, std_scores, perm_test, rotate # are available, likely via loading the package.
There are six popular SVD engines in R (base::svd, corpcor, RSpectra, irlba, rsvd, svd (PROPACK)) – each with its own argument list, naming conventions and edge-cases (some refuse to return the full rank, others crash on tall-skinny matrices).
svd_wrapper()
smooths that out:
bi_projector
– an S3 class that stores loadings v
,
scores s
, singular values sdev
plus the fitted pre-processor.That means immediate access to verbs such as project()
,
reconstruct()
, truncate()
, partial_project()
.
set.seed(1); X <- matrix(rnorm(35*10), 35, 10) # 35 obs × 10 vars sv_fast <- svd_wrapper(X, ncomp = 5, preproc = center(), method = "fast") # Ensure irlba is installed if testing this method if (requireNamespace("irlba", quietly=TRUE)) { sv_irlba <- svd_wrapper(X, ncomp = 5, preproc = center(), method = "irlba") } else { sv_irlba <- NULL # Skip if irlba not available } # Same downstream code works for both objects: head(scores(sv_fast)) # 35 × 5 if (!is.null(sv_irlba)) { all.equal(scores(sv_fast), scores(sv_irlba)) # TRUE for this random X }
pca()
Most people really want PCA, so pca()
is a thin wrapper that
svd_wrapper()
with sane defaults,data(iris) X_iris <- as.matrix(iris[, 1:4]) pca_fit <- pca(X_iris, ncomp = 4) # defaults to method = "fast", preproc=center() print(pca_fit)
screeplot(pca_fit, type = "lines", main = "Iris PCA – scree plot")
# Requires ggrepel for repulsion, but works without it biplot(pca_fit, repel_points = TRUE, repel_vars = TRUE, group_var = iris$Species)
(If you do not have ggrepel installed the text is placed without repulsion.)
bi_projector
?Think bidirectional mapping:
data space (p variables) ↔ component space (d ≤ p) new samples: project() ← scores new variables: project_vars() ← loadings reconstruction ↔ (scores %*% t(loadings))
A bi_projector
therefore carries
| slot | shape | description |
|-----------|-------|----------------------------------------------------|
| v
| p × d | component loadings (columns) |
| s
| n × d | score matrix (rows = observations) |
| sdev
| d | singular values (or SDs related to components) |
| preproc
| – | fitted transformer so you never leak training stats |
Because pca()
returns a bi_projector
, you get other methods for free:
# rank-2 reconstruction of the iris data Xhat2 <- reconstruct(pca_fit, comp = 1:2) print(paste("MSE (rank 2):", round(mean((X_iris - Xhat2)^2), 4))) # MSE ~ 0.076 # drop to 2 PCs everywhere pca2 <- truncate(pca_fit, 2) shape(pca2) # 4 vars × 2 comps
The next chunk quietly touches a few more branches used in the unit tests
(std_scores()
, perm_test()
, rotate()
), but keeps printing to a minimum:
# std scores head(std_scores(svd_wrapper(X, ncomp = 3))) # Use the earlier X data # tiny permutation test (10 perms; obviously too few for science) # This requires perm_test.pca method # Make sure X_iris is centered if perm_test needs centered data perm_res <- perm_test(pca_fit, X_iris, nperm = 10, comps = 2) print(perm_res$component_results) # quick varimax rotation if (requireNamespace("GPArotation", quietly = TRUE)) { pca_rotated <- rotate(pca_fit, ncomp = 3, type = "varimax") print(pca_rotated) } else { cat("GPArotation not installed, skipping rotation example.\n") }
(Running these once in the vignette means they are also executed by R CMD check
, bumping test-coverage without extra scaffolding.)
svd_wrapper()
gives you a unified front end to half-a-dozen SVD engines.pca()
piggy-backs on that, returning a fully featured bi_projector
.bi_projector
contract means the same verbs & plotting utilities
work for any decomposition you wrap into the framework later.sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.