Running SPEAR

Required Libraries:
library(SPEAR)

This tutorial assumes you have already prepared your data and created a SPEARobject.

When you initialize a SPEARobject, its $fit list is NULL. This is where SPEAR stores the results from training (see below):

names(SPEARobj$fit)

NULL

Some parameters are initialized automatically if you don’t specify them in the make.SPEARobject function…

SPEARobj$params$num_factors

5

This includes the $params$weights argument, which is a matrix. Each row corresponds to a widx (weight index), which will affect how SPEAR reconstructs latent factors.

SPEARobj$params$weights

# Yields...
    w.x w.y
 1 2.00   1
 2 1.00   1
 3 0.50   1
 4 0.10   1
 5 0.01   1
 6 0.00   1
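
To make the row/widx relationship concrete, here is a minimal base-R sketch that rebuilds the matrix printed above by hand and looks up one row. The name `w_mat` is a stand-in for illustration; in a real session you would index SPEARobj$params$weights directly.

```r
# Rebuild the weights matrix shown above (illustration only).
w_mat <- cbind(
  w.x = c(2, 1, 0.5, 0.1, 0.01, 0),
  w.y = rep(1, 6)
)
rownames(w_mat) <- seq_len(nrow(w_mat))

# Each row number plays the role of a widx (weight index).
widx <- 3
w_mat[widx, ]        # both weights for this widx
w_mat[widx, "w.x"]   # only the weight on X for this widx
```

Going down the rows, w.x shrinks from 2.00 to 0.00 while w.y stays at 1.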

Running SPEAR (with cross validation):

Cross-validated SPEAR uses the parallel package to run K-fold cross-validation efficiently (K defaults to 5).

You can specify the parameters shown in the call below.

If it is working properly, the call prints output to the console while it runs.

# First, run cv spear:
SPEARobj$run.cv.spear(num.folds = ..., # number of folds; defaults to 5
                      fold.ids = ...   # fold assignment for each sample (row); defaults to random assignment
                      )
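
If you would rather control the fold assignment yourself, one common way to build it is sketched below in base R. It assumes that fold.ids takes one integer fold label per sample (row); that format is an assumption here, so check the package documentation before passing such a vector. The sample count is hypothetical.

```r
set.seed(1)        # make the random assignment reproducible
n_samples <- 23    # hypothetical number of samples (rows)
num_folds <- 5

# Repeat the labels 1..num_folds until every sample has one, then shuffle.
fold_ids <- sample(rep(seq_len(num_folds), length.out = n_samples))

table(fold_ids)    # fold sizes differ by at most one
```

Recycling the labels before shuffling keeps the folds balanced, which drawing each label independently at random would not guarantee.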

NOTE: By default, CV SPEAR prints the results from only one fold, because parallel processing runs all of the folds simultaneously.

Running either $run.cv.spear() or $run.spear() populates SPEARobject$fit (which is NULL when the SPEARobject is created):

names(SPEARobj$fit)

 [1] "regression.coefs"             "projection.coefs.x"           "projection.coefs.y"           "nonzero.probs"                "projection.probs"            
 [6] "marginal.probs"               "joint.probs"                  "intercepts.x"                 "intercepts.y"                 "regression.coefs.cv"         
[11] "projection.coefs.y.cv"        "type"                         "projection.coefs.y.scaled"    "projection.coefs.y.cv.scaled"
[15] "intercepts.scaled"            "factor.contributions"         "factor.contributions.pvals"      

These are all of the coefficients needed to calculate the loadings, probabilities, predictions and factor scores.

When the CV runs finish, they are automatically evaluated to adaptively determine the most predictive factors under varying influences of X. (To run this step manually, set do.cv.eval = FALSE in the run.cv.spear function and then call the $cv.evaluate() function.) This evaluation performs many important calculations, including computing the mean cross-validation error per weight.

This step also adds $fit$cv.eval and $fit$loss to the SPEAR object, which are essential for some downstream functions.
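
The idea of a mean cross-validation error per weight can be illustrated with a small base-R sketch. It is not SPEAR's internal procedure: the error matrix is simulated, and the names `cv_errors`, `mean_error` and `best_widx` are hypothetical.

```r
set.seed(42)

# Simulated per-fold errors: 5 folds (rows) x 6 weight indices (columns).
cv_errors <- matrix(
  runif(5 * 6, min = 0.2, max = 0.6),
  nrow = 5,
  dimnames = list(paste0("fold", 1:5), paste0("widx", 1:6))
)

# Average over folds to get one error per weight index.
mean_error <- colMeans(cv_errors)

# Standard error of that mean across folds.
se_error <- apply(cv_errors, 2, sd) / sqrt(nrow(cv_errors))

# Weight index with the lowest mean cross-validation error.
best_widx <- unname(which.min(mean_error))
```

Keeping the standard error next to the mean helps you judge whether the difference between two weights is larger than the fold-to-fold noise.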

Finally, save the SPEARobject via…

SPEARobj$save.model("_name_to_save_model_.rds")

… which can be easily reloaded with …

SPEARobj <- load.SPEARobject(file = "_name_to_save_model_.rds")
# or readRDS(...)
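
For readers unfamiliar with .rds files, the round trip looks like this in base R. The list is a stand-in rather than a real SPEARobject, and whether save.model calls saveRDS internally is an assumption; the readRDS comment above only suggests the saved file is a standard .rds file.

```r
# Stand-in object (not a real SPEARobject).
obj <- list(params = list(num_factors = 5), fit = NULL)

path <- tempfile(fileext = ".rds")   # temporary file for the demonstration
saveRDS(obj, path)                   # serialize a single R object to disk
restored <- readRDS(path)            # read it back into a new variable

same <- identical(obj, restored)     # TRUE if the round trip lost nothing
unlink(path)                         # remove the temporary file
```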

Chaining functions:

If you have experience with object-oriented programming, you may find it easier to chain functions, as below…

SPEARobj$run.cv.spear(num.folds = 10)$save.model("_name_to_save_model_.rds")
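
Chaining of this kind works when each method returns the object it was called on. The toy object below is a sketch in base R, not SPEAR's implementation: `make_toy_object`, `run` and `save` are hypothetical names, and `save` only records a message instead of writing a file.

```r
make_toy_object <- function() {
  self <- new.env()
  self$log <- character(0)

  self$run <- function(num.folds = 5) {
    self$log <- c(self$log, paste("ran with", num.folds, "folds"))
    invisible(self)   # returning the object is what allows chaining
  }

  self$save <- function(file) {
    self$log <- c(self$log, paste("saved to", file))
    invisible(self)
  }

  self
}

obj <- make_toy_object()
obj$run(num.folds = 10)$save("model.rds")   # two calls in one expression
obj$log
```

Because environments are modified in place, `obj` keeps both log entries even though the chained expression's result is never assigned.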

Running SPEAR (without cross validation):

Although using the $run.cv.spear() function is highly recommended, you can use $run.spear() to train on all of the data at once instead of using cross-validation.

If it is working properly, the call below prints output to the console…

SPEARobj$run.spear(...)

Other Vignettes

To return to the main SPEAR vignette, click here



jgygi/SPEAR documentation built on July 5, 2023, 5:35 p.m.