The multivarious package provides generic functions and some basic implementations for dimensionality reduction of high-dimensional data. This vignette focuses on two main classes in the package, projector and bi_projector, and demonstrates how to use the project function for projecting new data onto a lower-dimensional subspace.
projector and bi_projector are the two core classes in the multivarious package. They represent linear transformations from a high-dimensional space to a lower-dimensional space.
A projector instance maps a matrix from an $N$-dimensional space to a $d$-dimensional space, where $d$ may be less than $N$. The projection matrix, $V$, is not necessarily orthogonal. This class can be used for various dimensionality reduction techniques like PCA, LDA, etc.
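As a rough illustration of this mapping, the following base R sketch applies a projection matrix directly, without using multivarious. The matrix V is a made-up example chosen to be non-orthogonal; it does not come from the package.

```r
# Base R only; V is a hypothetical, non-orthogonal projection matrix.
set.seed(1)
X <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4)   # 6 samples, N = 4 variables
V <- cbind(c(1, 1, 0, 0), c(1, 0, 1, 0))        # 4 x 2, so d = 2

# Each row of X is mapped to a row of d = 2 coordinates.
scores <- X %*% V
dim(scores)

# The columns of V are neither unit length nor mutually orthogonal,
# so t(V) %*% V is not the identity matrix.
crossprod(V)
```

PCA would supply a V with orthonormal columns, whereas methods such as LDA generally would not, which is why the class does not assume orthogonality.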
A bi_projector instance offers a two-way mapping: from samples (rows) to scores, and from variables (columns) to components. It can therefore project samples from the $N$-dimensional input space to a $d$-dimensional subspace, and project variables, each a vector of $n$ sample values, into the same $d$-dimensional component space. The singular value decomposition (SVD) is a canonical example of such a two-way mapping.
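The two directions of this mapping can be sketched with the base R svd function. This only illustrates the underlying algebra for a truncated SVD and is not a description of how bi_projector is implemented; the names row_scores and col_coords are hypothetical.

```r
# Base R only: both directions of the mapping for a rank-d truncated SVD.
set.seed(1)
X <- matrix(rnorm(8 * 5), nrow = 8, ncol = 5)   # 8 samples, 5 variables
sv <- svd(X)

d <- 2
U <- sv$u[, 1:d]          # 8 x 2, left singular vectors
S <- diag(sv$d[1:d])      # 2 x 2, singular values
V <- sv$v[, 1:d]          # 5 x 2, right singular vectors

# Samples (rows) -> d-dimensional scores.
row_scores <- X %*% V     # 8 x 2

# Variables (columns) -> d-dimensional component coordinates.
col_coords <- t(U) %*% X  # 2 x 5

# Both agree with the factors of the decomposition.
all.equal(row_scores, U %*% S)
all.equal(col_coords, S %*% t(V))
```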
The project function is a generic function that takes a model fit (typically an object of class bi_projector or any other class that implements a project method) and new observations. It projects these observations onto the subspace defined by the model. This enables the transformation of new data into the same lower-dimensional space as the original data. Mathematically, projection consists of the following:
$$ X \approx USV^T $$
$$ \text{projected\_data} = \text{new\_data} \cdot V $$
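To make the approximation sign concrete, here is a small base R sketch, independent of multivarious, that keeps only the first k components. The value k = 3 is an arbitrary choice for illustration.

```r
# Base R only: truncated SVD and projection of new rows.
set.seed(42)
X <- matrix(rnorm(200), nrow = 10, ncol = 20)
sv <- svd(X)

k <- 3                                   # arbitrary number of components kept
U <- sv$u[, 1:k]
S <- diag(sv$d[1:k])
V <- sv$v[, 1:k]                         # 20 x 3

# Rank-k reconstruction: X is only approximated once components are dropped.
X_hat <- U %*% S %*% t(V)
approx_error <- norm(X - X_hat, type = "F")

# The Frobenius error equals the root sum of squares of the
# discarded singular values.
all.equal(approx_error, sqrt(sum(sv$d[-(1:k)]^2)))

# Projecting new rows gives one k-dimensional row of scores per observation.
new_data <- matrix(rnorm(5 * 20), nrow = 5, ncol = 20)
projected_data <- new_data %*% V
dim(projected_data)
```

When every component is retained the reconstruction is exact up to floating-point error, so the approximation only matters once the decomposition is truncated.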
In this example, we will demonstrate how to create a bi_projector object using the results of an SVD and project new data onto the same subspace as the original data.
# Load the multivarious package
library(multivarious)

# Create a synthetic dataset: 10 observations, 20 variables
set.seed(42)
X <- matrix(rnorm(200), 10, 20)

# Perform SVD on the dataset
svdfit <- svd(X)

# Create a bi_projector object from the components (v) and the scores (u %*% diag(d))
p <- bi_projector(svdfit$v, s = svdfit$u %*% diag(svdfit$d), sdev = svdfit$d)

# Generate new data to project onto the same subspace as the original data
new_data <- matrix(rnorm(5 * 20), 5, 20)
projected_data <- project(p, new_data)
print(projected_data)
In the multivarious package, the bi_projector class also allows you to project new variables into the subspace defined by the model. The project_vars function is a generic that operates on any object whose class implements a project_vars method, such as a bi_projector object. It projects one or more variables onto the subspace, which is possible for a biorthogonal decomposition such as the singular value decomposition (SVD).
Remember, given an original data matrix $X$, the SVD of $X$ can be written as:
$$ X \approx USV^T $$
where $U$ contains the left singular vectors (scores), $S$ is a diagonal matrix containing the singular values, and $V^T$ contains the right singular vectors (components). When we have new variables (columns) that we want to project into the same subspace as the original data, we can use the project_vars function.
Let's say we have a new data matrix new_data with the same number of rows as the original data. To project these new variables into the subspace, we can compute:
$$ \text{projected\_vars} = U^T \cdot \text{new\_data} $$
The result is a matrix or vector of the projected variables in the subspace.
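The formula above can be checked numerically with a short base R sketch. It does not call project_vars, and whether the package applies additional scaling or preprocessing internally is not something this sketch assumes; the names new_var and projected_var are hypothetical.

```r
# Base R only: project one new variable onto the left singular vectors.
set.seed(7)
X <- matrix(rnorm(12 * 6), nrow = 12, ncol = 6)   # 12 samples, 6 variables
U <- svd(X)$u[, 1:2]                               # 12 x 2, orthonormal columns

new_var <- rnorm(nrow(X))                          # one new variable, length 12
projected_var <- drop(crossprod(U, new_var))       # t(U) %*% new_var, length 2

# Because the columns of U are orthonormal, these values are also the
# least-squares coefficients of new_var regressed on U.
all.equal(projected_var, qr.solve(U, new_var))
```

Read this way, projecting a variable amounts to asking how well it is explained by each retained component of the original samples.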
Here's an example of how you can use the svd_wrapper function in the multivarious package with the iris dataset to compute the SVD and project new variables into the subspace.
First, let's load the iris dataset and compute the SVD using the svd_wrapper function:
# Load the iris dataset and keep the four numeric columns as a matrix
data(iris)
X <- as.matrix(iris[, 1:4])

# Compute the SVD using the base method, keeping 3 components,
# with the columns centered as a preprocessing step
fit <- svd_wrapper(X, ncomp = 3, preproc = center(), method = "base")
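For readers who want to see the arithmetic, the following base R sketch performs a centered, truncated SVD on the same columns. It is an assumption that this roughly corresponds to what svd_wrapper does with center() and the base method; the package's internals are not confirmed here, and the variable names are hypothetical.

```r
# Base R only: centered SVD of the iris measurements, truncated to 3 components.
X <- as.matrix(iris[, 1:4])
Xc <- scale(X, center = TRUE, scale = FALSE)   # subtract each column mean

sv <- svd(Xc)
ncomp <- 3
U <- sv$u[, 1:ncomp]          # 150 x 3, left singular vectors
d_k <- sv$d[1:ncomp]          # first 3 singular values
V <- sv$v[, 1:ncomp]          # 4 x 3, right singular vectors

# Scores of the original samples in the 3-dimensional subspace.
scores <- U %*% diag(d_k)     # 150 x 3

# After centering, every column mean is zero up to rounding error.
max(abs(colMeans(Xc)))

# The scores can equivalently be obtained by projecting the centered data.
all.equal(scores, Xc %*% V, check.attributes = FALSE)
```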
Now, let's assume we have new data, new_data, with the same number of rows as the original data; here it is a single variable stored as a numeric vector. To project it into the subspace, we can use the project_vars function:
# Define new_data: one new variable with a value for each row of iris
new_data <- rnorm(nrow(iris))

# Project the new variable into the subspace
projected_vars <- project_vars(fit, new_data)
This example demonstrates how to compute the SVD using the svd_wrapper function and project new variables into the subspace defined by the SVD using the project_vars function.