    knitr::opts_chunk$set(collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4)

    # Assuming the necessary multivarious functions are loaded,
    # e.g. via devtools::load_all() or library(multivarious)
    library(multivarious)
    library(tibble)  # for summary output

The composed partial projector (compose_partial_projector) lets you snap together any number of ordinary projector objects (PCA, PLS, cPCA++, block projectors, ...) and treat the whole chain as if it were a single map from the original input space to the final output space:
$$
\mathbb{R}^{p_{\text{orig}}} \longrightarrow \mathbb{R}^{q_{\text{final}}}
$$
Typical motives:

| Why compose? | What you get |
|--------------|--------------|
| Pre-whitening, centring or wavelet decomposition before the "real" model | Keep the preparation and the model in one tidy object. |
| Block-wise modelling (e.g. one PCA per sensor block) | Treat the concatenation of block-specific results as a single projector. |
| Dimensionality milk-run: reduce > filter > reduce again | A single set of scores from the final stage to feed to a classifier. |
Let's compose two PCA steps:

    set.seed(1)
    X <- matrix(rnorm(30 * 15), 30, 15)   # raw data: 30 samples, 15 variables

    p1 <- pca(X, ncomp = 8)               # first reduction: 15 -> 8 components
    p2 <- pca(scores(p1), ncomp = 4)      # second reduction: 8 -> 4 components

    # Compose the two projectors
    pipe <- compose_partial_projector(first = p1, second = p2)
    print(pipe)

    # Project the original data through the entire pipeline
    S <- project(pipe, X)                 # 30 x 4 scores, as if the two steps were one
    dim(S)

    # Get a summary of the pipeline stages
    summary(pipe)

The summary() output provides a clear overview of the stages, their names, input/output dimensions, and underlying class.
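To see why a chain of linear stages can behave like one map, here is a minimal base-R sketch. It uses `stats::prcomp()` as a stand-in for the package's `pca()` and assumes each stage is a centred linear projection, so the 15 -> 8 -> 4 chain collapses into the product of the two loading matrices. The names `V1`, `V2` and `V_total` are illustrative only and are not part of multivarious.

```r
set.seed(1)
X <- matrix(rnorm(30 * 15), 30, 15)

# Stage 1: PCA on the raw data, keep 8 components
fit1 <- prcomp(X, center = TRUE, scale. = FALSE)
V1 <- fit1$rotation[, 1:8]            # 15 x 8 loadings
S1 <- fit1$x[, 1:8]                   # 30 x 8 scores

# Stage 2: PCA on the stage-1 scores, keep 4 components
fit2 <- prcomp(S1, center = TRUE, scale. = FALSE)
V2 <- fit2$rotation[, 1:4]            # 8 x 4 loadings
S2 <- fit2$x[, 1:4]                   # 30 x 4 scores from the two-step route

# Collapse both stages into a single 15 x 4 matrix
V_total <- V1 %*% V2

# Apply the single map to the centred data
Xc <- sweep(X, 2, fit1$center)        # subtract the stage-1 column means
S_direct <- Xc %*% V_total

dim(S_direct)
all.equal(unname(S_direct), unname(S2))
```

The stage-1 scores already have numerically zero column means, which is why the second centring step can be ignored when the two matrices are multiplied together.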
partial_project() works on composed projectors, allowing you to apply projections using only a subset of variables at specific stages.
You supply the `colind` argument as either:

- a single vector of column indices, which is applied at the first stage (Example 1), or
- a list with one entry per stage, each entry being a vector of indices or `NULL` for a stage that should receive the full input from the previous stage (Example 2).

The two forms look like this:

    # Example 1: Use only variables 1:5 for the *first* PCA stage.
    # The second PCA stage receives the full 8 components from the (partial) first stage.
    S15 <- partial_project(pipe, X[, 1:5, drop = FALSE], colind = 1:5)
    cat("Dimensions after partial projection (cols 1:5 in first stage):", dim(S15), "\n")

    # Example 2: Multi-stage pipeline (conceptual)
    # Imagine a 3-stage pipeline: wavelets -> PCA (block1) -> PCA (global)
    # pipe2 <- wavelet_projector(...) %>>%
    #   pca(..., ncomp = 10) %>>%
    #   pca(..., ncomp = 3)
    # To focus on coefficients 12:20 *after* the wavelet step (i.e., input to stage 2):
    # S_sel <- partial_project(pipe2, X,   # assuming X is appropriate input for wavelets
    #                          colind = list(NULL, 12:20, NULL))
    # Note: The indices in the list always refer to the dimensions *entering* that specific stage.

Behind the scenes, the composed projector manages the mapping of indices through the pipeline.
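One way to picture a partial projection at the first stage is to treat the unselected variables as sitting at their training mean, that is, at zero after centring. The base-R sketch below does this with `stats::prcomp()`. This is an assumption made for illustration: `partial_project()` itself may rescale the result or use a least-squares fit instead, and the variable names here are hypothetical.

```r
set.seed(1)
X <- matrix(rnorm(30 * 15), 30, 15)

# Two PCA stages (15 -> 8 -> 4), built with base R
fit1 <- prcomp(X, center = TRUE, scale. = FALSE)
V1 <- fit1$rotation[, 1:8]                              # 15 x 8
fit2 <- prcomp(fit1$x[, 1:8], center = TRUE, scale. = FALSE)
V2 <- fit2$rotation[, 1:4]                              # 8 x 4

# Only columns 1:5 of the original data are available
colind <- 1:5
X_sub <- sweep(X[, colind, drop = FALSE], 2, fit1$center[colind])  # 30 x 5, centred

# Stage 1 uses only the matching rows of its loading matrix
S1_partial <- X_sub %*% V1[colind, , drop = FALSE]      # 30 x 8

# Stage 2 receives all 8 components; its own centre is numerically zero
# because the stage-1 training scores already have zero column means
S_partial <- S1_partial %*% V2                          # 30 x 4

# Same result via zero-padding the unselected (centred) columns
X_pad <- matrix(0, nrow(X), ncol(X))
X_pad[, colind] <- X_sub
S_pad <- X_pad %*% V1 %*% V2

dim(S_partial)
all.equal(unname(S_partial), unname(S_pad))
```

The point of the sketch is the bookkeeping: the index vector selects rows of the loading matrix of the stage it enters, and every later stage sees a full-width input.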
Since each stage typically provides a way to reverse its projection (often via inverse_projection()), the composed projector can also reconstruct the original data from the final scores.

    # Reconstruct original data from the final scores 'S'
    X_hat <- reconstruct(pipe, S)
    cat("Dimensions of reconstructed data:", dim(X_hat), "\n")

    # Check reconstruction accuracy
    # Note: Since the pipeline involves dimensionality reduction (15 -> 8 -> 4),
    # reconstruction will not be exact. The error reflects the information lost.
    max_reconstruction_error <- max(abs(X - X_hat))
    cat("Maximum absolute reconstruction error:",
        format(max_reconstruction_error, digits = 3), "\n")
    # stopifnot(max_reconstruction_error < 1e-5)  # Removed: too strict for lossy reconstruction

    # Get the overall coefficient matrix (p_orig x q_final)
    V <- coef(pipe)
    cat("Dimensions of overall coefficient matrix:", dim(V), "\n")

    # Get the overall pseudo-inverse matrix (q_final x p_orig)
    Vplus <- inverse_projection(pipe)
    cat("Dimensions of overall inverse projection matrix:", dim(Vplus), "\n")

Both the forward (coef) and inverse (inverse_projection) matrices for the entire pipeline are calculated and potentially cached for efficiency.
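For PCA-like stages with orthonormal loadings, the overall pseudo-inverse is simply the transpose of the composed loading matrix. The base-R sketch below illustrates that special case, again with `stats::prcomp()` standing in for the package's `pca()`. It is not a description of how multivarious computes or caches these matrices, and the names are illustrative.

```r
set.seed(1)
X <- matrix(rnorm(30 * 15), 30, 15)

# Two PCA stages (15 -> 8 -> 4), built with base R
fit1 <- prcomp(X, center = TRUE, scale. = FALSE)
V1 <- fit1$rotation[, 1:8]
fit2 <- prcomp(fit1$x[, 1:8], center = TRUE, scale. = FALSE)
V2 <- fit2$rotation[, 1:4]

# Overall forward matrix (p_orig x q_final) and its pseudo-inverse
V <- V1 %*% V2                        # 15 x 4, orthonormal columns
Vplus <- t(V)                         # 4 x 15

# Forward projection, then reconstruction in the original space
Xc <- sweep(X, 2, fit1$center)
S <- Xc %*% V                                           # 30 x 4
X_hat <- sweep(S %*% Vplus, 2, fit1$center, FUN = "+")  # 30 x 15

# Lossy: only 4 of 15 dimensions are retained
max(abs(X - X_hat))

# Projecting the reconstruction again returns the same scores
S_again <- sweep(X_hat, 2, fit1$center) %*% V
all.equal(S, S_again)
```

When a stage does not have orthonormal loadings, the transpose shortcut no longer applies and a genuine pseudo-inverse is needed for that stage.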
Some useful helper functions:

- `%>>%`: A pipe operator specifically for composing projectors. It preserves stage names if the projectors are named. For example: `pipe3 <- pca1 %>>% pca2 %>>% pca3`.
- `truncate(pipe, ncomp = k)`: Safely reduces the number of components kept from the last stage of the pipeline.
- `variables_used(pipe)` / `vars_for_component(pipe, k)`: (Potential future helpers) Intended to trace which original variables contribute to the final scores, especially useful if any stages perform variable selection.

Composed projectors open up possibilities.
Happy composing!