perf.sgspls: Performance evaluation of sgsPLS objects

Description

Evaluates the performance of fitted PLS models using various criteria. Performance is assessed for each component in the PLS object and is only available for regression-mode PLS.

Usage

## S3 method for class 'sgspls'
perf(object, validation = c("Mfold", "loo"), folds = 10,
  BIC = FALSE, progressBar = TRUE, setseed = FALSE, scale_resp = FALSE,
  ...)

Arguments

object

Object of class inheriting from "sgspls". The function will retrieve some key parameters stored in that object.

validation

What kind of cross validation to use, matching one of "Mfold" or "loo" (leave one out). Default is "Mfold".

folds

Number of folds to use in the cross validation.

BIC

Return the BIC type criterion for large sample sizes (see paper for details). Note that this is not an actual BIC criterion.

progressBar

Logical; whether to show a progress bar while the performance measures are computed.

setseed

Optional numeric value used to set the random seed for reproducibility (the default, FALSE, sets no seed).

scale_resp

Logical to scale the responses. This is useful if comparing the fit on multiple responses.

...

Additional arguments to be passed to the fitting functions.
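
The difference between the two validation settings can be illustrated with a minimal base-R sketch. The helper make_folds is hypothetical and is not part of sgspls; the package may partition observations differently, so this only shows the general idea of M-fold versus leave-one-out splitting.

```r
# Hypothetical helper (not part of sgspls): split observations 1..n into folds.
make_folds <- function(n, validation = c("Mfold", "loo"), folds = 10) {
  validation <- match.arg(validation)
  if (validation == "loo") {
    # Leave-one-out: every observation forms its own fold
    return(as.list(seq_len(n)))
  }
  # M-fold: shuffle the indices, then deal them into `folds` groups
  unname(split(sample(n), rep(seq_len(folds), length.out = n)))
}

set.seed(1)
mfold <- make_folds(50, "Mfold", folds = 5)
lengths(mfold)                   # size of each of the 5 folds
length(make_folds(50, "loo"))    # number of leave-one-out folds
```

With 50 observations and folds = 5, each fold holds 10 observations; with "loo" the folds argument is irrelevant because there is one fold per observation.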

Value

perf returns a list that contains the following performance measures:

MSEP

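A matrix of Mean Square Error of Prediction (MSEP) estimates obtained by cross validation. The estimate is defined as:

MSEP = 1/n ∑_k ∑_{i in fold k} (f_k(x_i) - y_i)^2

where f_k is the model fitted with fold k held out and n is the number of observations. See the references for details.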

PRESS0

A vector of cross-validated Predictive Residual Sum of Squares (PRESS) values, with one element per response. Matches the corresponding output of the pls package.

PRESS

A matrix of cross-validated Predictive Residual Sum of Squares (PRESS) values. Each row contains the values for a different component and each column corresponds to a response.

R2

A matrix of R^2 values of the Y-variables with ncomp components.

BIC

A BIC type criterion for large samples (see references for details). Note that this is not an actual BIC criterion.

cvPred

An array with the cross-validated predictions.

folds

A list of the folds used in the cross validation.
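
How cross-validated predictions, PRESS and MSEP relate to one another can be sketched in base R. This is only an illustration under stated assumptions: it uses ordinary least squares via lm() as a stand-in for the PLS fit, the data are simulated, and the variable names (fold_id, cv_pred, press, msep) are chosen here and do not come from sgspls.

```r
set.seed(1)
n <- 40
x <- rnorm(n)
y <- 2 * x + rnorm(n, sd = 0.5)
dat <- data.frame(x = x, y = y)

# Assign each observation to one of 5 folds at random
fold_id <- sample(rep(1:5, length.out = n))

# Cross-validated predictions: each observation is predicted by a model
# fitted without the fold that contains it
cv_pred <- numeric(n)
for (k in 1:5) {
  test <- fold_id == k
  fit <- lm(y ~ x, data = dat[!test, ])
  cv_pred[test] <- predict(fit, newdata = dat[test, ])
}

press <- sum((cv_pred - y)^2)   # predictive residual sum of squares
msep  <- press / n              # mean square error of prediction
```

In this single-response sketch, MSEP is simply PRESS divided by the number of observations; perf reports the analogous quantities per component and per response.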

References

Mevik, Bjorn-Helge, and Henrik Rene Cederkvist. 2004. Mean Squared Error of Prediction (MSEP) Estimates for Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR). Journal of Chemometrics 18 (9). John Wiley & Sons, Ltd.:422-29.

See Also

Tuning functions: calc_pve, tune_sgspls, tune_groups.

Model performance and estimation: predict.sgspls, perf.sgspls, coef.sgspls.

Examples

set.seed(1)
n = 50; p = 500; 

size.groups = 30; size.subgroups = 5
groupX <- ceiling(1:p / size.groups)
subgroupX <- ceiling(1:p / size.subgroups)

X = matrix(rnorm(n * p), ncol = p, nrow = n)

beta <- rep(0,p)
bSG <- -2:2; b0 <- rep(0,length(bSG))
betaG <- c(bSG, b0, bSG, b0, bSG, b0)
beta[1:size.groups] <- betaG

y = X %*% beta + 0.1*rnorm(n)

model <- sgspls(X, y, ncomp = 3, mode = "regression", keepX = 1,
                groupX = groupX, subgroupX = subgroupX,
                indiv_sparsity_x = 0.8, subgroup_sparsity_x = 0.15)

model_perf <- perf(model, folds = 5)

# Check model performance
model_perf$MSEP
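
One possible way to use the MSEP output is to pick, for each response, the number of components with the smallest cross-validated error. The sketch below assumes that MSEP has one row per component and one column per response, as documented for PRESS; the helper best_ncomp is hypothetical and not part of sgspls, and the numbers in toy_msep are made up purely to show the call.

```r
# Hypothetical helper (not part of sgspls): for each response (column),
# return the row index, i.e. the number of components, with the smallest MSEP.
best_ncomp <- function(msep) {
  msep <- as.matrix(msep)
  apply(msep, 2, which.min)
}

# Made-up values for illustration only; with the example above one would
# pass model_perf$MSEP instead.
toy_msep <- matrix(c(1.9, 0.7, 0.8), ncol = 1,
                   dimnames = list(paste("comp", 1:3), "Y1"))
best_ncomp(toy_msep)
```

Selecting the strict minimum is only one convention; a more conservative rule would prefer fewer components when the errors are close.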

matt-sutton/sgspls documentation built on June 22, 2019, 10:21 a.m.