library(SPEAR)
For this tutorial, we should have already prepared our data and made a SPEARobject.
When you initialize a SPEARobject, the $fit list will be NULL. This is where SPEAR will store results from training (see below):
names(SPEARobj$fit)
NULL
Some parameters are automatically initialized if you don’t specify them in the make.SPEARobject function:
SPEARobj$params$num_factors
5
This includes the $params$weights argument, which is a matrix. Each row corresponds to a widx (weight index), which affects how SPEAR reconstructs latent factors:
SPEARobj$params$weights
# Yields...
w.x w.y
1 2.00 1
2 1.00 1
3 0.50 1
4 0.10 1
5 0.01 1
6 0.00 1
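To make the row-to-widx mapping explicit, the printed matrix can be rebuilt in base R. This is only a sketch of the matrix's shape: the values are copied from the output above, and whether a hand-built matrix like this can be passed to make.SPEARobject is an assumption that is not confirmed here.

```r
# Rebuild the default weights shown above as a plain base R matrix.
w.x <- c(2, 1, 0.5, 0.1, 0.01, 0)
weights <- cbind(w.x = w.x, w.y = rep(1, length(w.x)))
rownames(weights) <- seq_len(nrow(weights))  # row number = widx

# Look up the weights used at a given widx, e.g. the third row.
widx <- 3
weights[widx, ]
```

Reading the rows top to bottom, w.x shrinks from 2 to 0 while w.y stays at 1.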
Cross-validated SPEAR uses the parallel package to efficiently produce results from K-fold cross-validation (5 folds by default).
Users can specify the following parameters:
num.folds - how many folds to use for the K-fold cross-validation. Defaults to 5, but smaller sample sizes could benefit from higher numbers (e.g. 10).
fold.ids - how to assign each sample (row) to a fold. Defaults to random with sample(1:num.folds, num.samples, replace = FALSE), but can also be specified as a vector of integers (e.g. fold.ids = c(1, 1, 1, 1, 2, 2, 2, 2, ... 10)). The length of fold.ids must be num.samples!
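A custom fold.ids vector can be built in base R before calling run.cv.spear. The sketch below assumes a hypothetical dataset with 23 rows and produces a shuffled, balanced assignment. Note that in base R, sample() with replace = FALSE cannot draw more values than the population contains, so the default expression quoted above would only run as written when num.samples is at most num.folds. The approach here is one alternative, not necessarily what SPEAR does internally.

```r
set.seed(42)            # for a reproducible assignment
num.samples <- 23       # hypothetical number of rows in the data
num.folds <- 5

# Repeat 1..num.folds until every sample has a label, then shuffle.
fold.ids <- sample(rep(seq_len(num.folds), length.out = num.samples))

length(fold.ids) == num.samples   # required: one fold id per sample
table(fold.ids)                   # fold sizes differ by at most one
```

The resulting vector could then be supplied as fold.ids = fold.ids, together with a matching num.folds.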
If working properly, you should see output similar to that below…
# First, run cv spear:
SPEARobj$run.cv.spear(
  num.folds = ..., # How many folds? Defaults to 5
  fold.ids = ...   # How should each sample (row) be assigned? Defaults to randomly
)
NOTE: By default, CV SPEAR only prints out the results from one fold. Due to the parallel processing, the folds are all running simultaneously.
Running either $run.cv.spear() or $run.spear() will populate the SPEARobject$fit parameter (which is set to NULL when the SPEARobject is created):
names(SPEARobj$fit)
[1] "regression.coefs" "projection.coefs.x" "projection.coefs.y" "nonzero.probs" "projection.probs"
[6] "marginal.probs" "joint.probs" "intercepts.x" "intercepts.y" "regression.coefs.cv"
[11] "projection.coefs.y.cv" "type" "projection.coefs.y.scaled" "projection.coefs.y.cv.scaled"
[15] "intercepts.scaled" "factor.contributions" "factor.contributions.pvals"
These are all of the coefficients needed to calculate the loadings, probabilities, predictions and factor scores.
When finished, the CV runs will automatically be evaluated to adaptively determine the most predictive factors from varying influences of X. (You can opt to run this step manually by setting do.cv.eval = FALSE in the run.cv.spear function and then calling the $cv.evaluate() function.) This process performs many important calculations, including getting the mean cross-validation error per weight.
This will also add $fit$cv.eval and $fit$loss to the SPEARobject, which are essential for some downstream functions.
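As a rough illustration of what "mean cross-validation error per weight" means, the base R sketch below averages per-fold errors over folds for each widx. It is not SPEAR's implementation: the error values are random placeholders, and names such as cv.errors and best.widx are hypothetical.

```r
set.seed(1)
num.folds <- 5
num.weights <- 6

# Placeholder per-fold errors: one row per fold, one column per widx.
cv.errors <- matrix(
  runif(num.folds * num.weights),
  nrow = num.folds,
  dimnames = list(paste0("fold", seq_len(num.folds)),
                  paste0("widx", seq_len(num.weights)))
)

mean.cv.error <- colMeans(cv.errors)    # mean CV error per weight
best.widx <- which.min(mean.cv.error)   # widx with the lowest mean error
```

Comparing weights by their mean error across folds is one simple way to choose among them. The quantities actually stored in $fit$cv.eval and $fit$loss may be organized differently.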
Finally, save the SPEARobject via…
SPEARobj$save.model("_name_to_save_model_.rds")
… which can be easily reloaded with …
SPEARobj <- load.SPEARobject(file = "_name_to_save_model_.rds")
# or readRDS(...)
If you have experience with object-oriented programming, you may find it easier to chain functions as shown below…
SPEARobj$run.cv.spear(num.folds = 10)$save.model("_name_to_save_model_.rds")
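Chaining like this works when each method returns the object it was called on. The sketch below shows the pattern with a hypothetical stand-in object built from a base R environment. It is not the SPEARobject class, and make_toy, run and save are illustrative names only.

```r
# Hypothetical stand-in object; not the SPEARobject class.
make_toy <- function() {
  self <- new.env()
  self$fit <- NULL
  self$run <- function(num.folds = 5) {
    self$fit <- list(num.folds = num.folds)  # pretend training result
    invisible(self)                          # returning the object enables chaining
  }
  self$save <- function(file) {
    saveRDS(self$fit, file)                  # write the stored result to disk
    invisible(self)
  }
  self
}

path <- tempfile(fileext = ".rds")
toy <- make_toy()
toy$run(num.folds = 10)$save(path)           # chained call, as in the line above
restored <- readRDS(path)                    # read the saved result back
```

Because environments have reference semantics, the chained call updates toy in place, so no reassignment is needed.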
While it is highly recommended to use the $run.cv.spear() function instead, it is possible to use $run.spear() to train on all of the data at once rather than using cross-validation.
If working properly, you should see output similar to that below…
SPEARobj$run.spear(...)