plsr_agg: PLSR with aggregation of latent variables

View source: R/plsr_agg.R

plsr_agg    R Documentation

PLSR with aggregation of latent variables

Description

Ensemble approach in which the predictions are obtained by averaging the predictions of PLSR models (plskern) built with different numbers of latent variables (LVs).

For instance, if argument nlv is set to nlv = "5:10", the prediction for a new observation is the unweighted average of the predictions returned by the models with 5 LVs, 6 LVs, ..., 10 LVs.
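
The unweighted averaging described above can be illustrated in base R. This is a minimal sketch with made-up numbers, not the package's internal code: the matrix predlv_demo and its column names are hypothetical and simply stand for the predictions of three models (5, 6 and 7 LVs) on two new observations.

```r
## Hypothetical per-model predictions for m = 2 new observations
## (one column per number of LVs; the values are made up for illustration)
predlv_demo <- cbind(
    lv5 = c(1.0, 2.0),
    lv6 = c(1.2, 2.4),
    lv7 = c(1.4, 2.6)
)

## Aggregated prediction: plain (unweighted) mean over the models,
## computed separately for each new observation (row)
pred_demo <- rowMeans(predlv_demo)
pred_demo
```

Each model contributes equally to the result, whatever its number of LVs.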

Usage


plsr_agg(X, Y, weights = NULL, nlv)

## S3 method for class 'Plsr_agg'
predict(object, X, ...)  

Arguments

X

For the main functions: training X-data (n, p). For the auxiliary functions: new X-data (m, p) to consider.

Y

Training Y-data (n, q).

weights

Weights (n, 1) to apply to the training observations. Internally, the weights are normalized to sum to 1. Defaults to NULL (all weights are set to 1 / n).

nlv

A character string such as "5:20" defining the range of the numbers of LVs to consider (here, the models with 5, 6, ..., 20 LVs are averaged). A single value such as "10" is also allowed (here, this corresponds to the single model with 10 LVs).

object

A fitted model, output of a call to the main functions.

...

Optional arguments. Not used.
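
The conventions described for weights and nlv can be sketched in base R. The two helpers below, normalize_weights and parse_nlv, are hypothetical names introduced only for illustration; they are an assumption about how such inputs can be handled, not the functions used inside the package.

```r
## Hypothetical helper: normalize observation weights so that they sum to 1
## (NULL is taken to mean uniform weights, i.e. 1 / n for each observation)
normalize_weights <- function(weights, n) {
    if (is.null(weights)) {
        weights <- rep(1, n)
    }
    weights / sum(weights)
}

## Hypothetical helper: turn a string such as "5:20" or "10"
## into the corresponding integer vector of numbers of LVs
parse_nlv <- function(nlv) {
    bounds <- as.integer(strsplit(nlv, ":", fixed = TRUE)[[1]])
    seq(bounds[1], bounds[length(bounds)])
}

normalize_weights(NULL, n = 4)
normalize_weights(c(1, 1, 2), n = 3)
parse_nlv("5:8")
parse_nlv("10")
```

With a single value such as "10", the lower and upper bounds coincide, so the sequence has length one and the average reduces to the prediction of that single model.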

Value

See the examples.

Examples


n <- 20 ; p <- 4
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- rnorm(n)
Ytrain <- cbind(y1 = ytrain, y2 = 100 * ytrain)
m <- 3
Xtest <- Xtrain[1:m, , drop = FALSE] 
Ytest <- Ytrain[1:m, , drop = FALSE] ; ytest <- Ytest[1:m, 1]

nlv <- "1:3"
#nlv <- "2:3"
fm <- plsr_agg(Xtrain, ytrain, nlv = nlv)
names(fm)

## Maximal PLSR model
zfm <- fm$fm
class(zfm)
names(zfm)
summary(zfm, Xtrain)

##### Predictions
res <- predict(fm, Xtest)
names(res)
## Final predictions (after aggregation)
res$pred
msep(res$pred, ytest)
## Intermediate predictions (per number of LVs)
res$predlv

## Gridscore
## Using gridscorelv would make no sense here
pars <- mpars(nlv = c("1:3", "2:5"))
## Same as:
## pars <- list(nlv = c("1:3", "2:5"))
pars
res <- gridscore(
    Xtrain, Ytrain, Xtest, Ytest, 
    score = msep, 
    fun = plsr_agg, 
    pars = pars)
res

## Gridcv
## Using gridcvlv would make no sense here
K <- 3
segm <- segmkf(n = n, K = K, nrep = 1)
segm
res <- gridcv(
    Xtrain, Ytrain, 
    segm, score = msep, 
    fun = plsr_agg, 
    pars = pars,
    verb = TRUE)
res


mlesnoff/rchemo documentation built on April 15, 2023, 1:25 p.m.