lwplsda_agg: Aggregation of KNN-LWPLSDA models with different numbers of...

lwplsrda_agg    R Documentation

Aggregation of KNN-LWPLSDA models with different numbers of LVs

Description

Ensemble method in which the predictions are calculated by "averaging" the predictions of KNN-LWPLSDA models built with different numbers of latent variables (LVs).

For instance, if argument nlv is set to nlv = "5:10", the prediction for a new observation is the most frequent level (majority vote) over the predictions returned by the models with 5 LVs, 6 LVs, ..., 10 LVs, respectively.

- lwplsrda_agg: uses plsrda.

- lwplslda_agg: uses plslda.

- lwplsqda_agg: uses plsqda.
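
The voting step described above can be illustrated with a minimal base R sketch. It does not call the package: the matrix preds holds made-up class predictions (one column per number of LVs), and the helper vote and its tie-breaking rule (first level in sorted order) are assumptions made for the illustration, not the package's internal code.

```r
# Hypothetical predictions from three models (one column per number of LVs)
# for four new observations (one row per observation).
preds <- cbind(
    lv5 = c("a", "b", "a", "c"),
    lv6 = c("a", "b", "c", "c"),
    lv7 = c("b", "b", "a", "c")
)

# Majority vote: most frequent level in a vector of predictions.
# Ties go to the first level in sorted order (assumption of this sketch).
vote <- function(x) {
    tab <- table(x)
    names(tab)[which.max(tab)]
}

# One aggregated prediction per observation (row).
agg <- apply(preds, 1, vote)
agg
```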

Usage


lwplsrda_agg(
    X, y,
    nlvdis, diss = c("eucl", "mahal"),
    h, k,
    nlv,
    cri = 4,
    verb = FALSE
    ) 

lwplslda_agg(
    X, y,
    nlvdis, diss = c("eucl", "mahal"),
    h, k,
    nlv, 
    prior = c("unif", "prop"),
    cri = 4,
    verb = FALSE
    ) 

lwplsqda_agg(
    X, y,
    nlvdis, diss = c("eucl", "mahal"),
    h, k,
    nlv, 
    prior = c("unif", "prop"),
    cri = 4,
    verb = FALSE
    ) 

## S3 method for class 'Lwplsrda_agg'
predict(object, X, ...)

## S3 method for class 'Lwplsprobda_agg'
predict(object, X, ...)

Arguments

X

For the main functions: training X-data (n, p). For the auxiliary functions: new X-data (m, p) to consider.

y

Training class membership (n). Note: If y is a factor, it is replaced by a character vector.

nlvdis

The number of LVs to consider in the global PLS used for the dimension reduction before calculating the dissimilarities. If nlvdis = 0, there is no dimension reduction.

diss

The type of dissimilarity used for defining the neighbors. Possible values are "eucl" (default; Euclidean distance), "mahal" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated by sqrt(.5 * (1 - rho)).
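
As an illustration of the formula sqrt(.5 * (1 - rho)), the following base R sketch computes correlation dissimilarities between the rows of a small simulated matrix. It is independent of the package; the clamping with pmax is an addition of this sketch to guard against rounding errors.

```r
set.seed(1)
X <- matrix(rnorm(5 * 7), nrow = 5)   # 5 observations, 7 variables

# cor() correlates columns, so transpose to correlate observations (rows).
rho <- cor(t(X))

# Correlation dissimilarity; pmax() guards against tiny negative values
# caused by rounding (addition of this sketch).
D <- sqrt(pmax(.5 * (1 - rho), 0))
round(D, 3)
```

With this definition the dissimilarity is 0 for perfectly correlated observations (rho = 1) and 1 for perfectly anti-correlated ones (rho = -1).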

h

A scalar defining the scale (shape) of the weight function. The lower h is, the sharper the function. See wdist.

k

The number of nearest neighbors to select for each observation to predict.

nlv

A character string such as "5:20" defining the range of the numbers of LVs to consider (here: the models with 5, 6, ..., 20 LVs are averaged). Syntax such as "10" is also allowed (here: corresponds to the single model with 10 LVs).
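
The string syntax can be read as a range of integers. The sketch below shows one way to expand such a string in base R; the helper parse_nlv is hypothetical and is not claimed to be how the package parses the argument.

```r
# Hypothetical helper: expand "a:b" (or a single "a") into integers.
parse_nlv <- function(nlv) {
    bounds <- as.integer(strsplit(nlv, ":", fixed = TRUE)[[1]])
    seq(bounds[1], bounds[length(bounds)])
}

parse_nlv("5:20")   # range of numbers of LVs
parse_nlv("10")     # single model
```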

prior

The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in y).
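
The two options can be made concrete on a toy class vector. This base R sketch only illustrates the definitions given above ("unif": equal probabilities; "prop": observed class proportions); the variable names are chosen for the illustration.

```r
y <- c("a", "a", "a", "b", "c", "c")
lev <- sort(unique(y))

# "unif": the same probability for every class.
prior_unif <- rep(1 / length(lev), length(lev))
names(prior_unif) <- lev

# "prop": observed proportions of the classes in y.
prior_prop <- c(table(y)) / length(y)

prior_unif
prior_prop
```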

cri

Argument cri in function wdist.

verb

Logical. If TRUE, fitting information is printed.

object

A fitted model, output of a call to one of the main functions.

...

Optional arguments. Not used.

Value

See the examples.

Examples


n <- 50 ; p <- 7
X <- matrix(rnorm(n * p), ncol = p, byrow = TRUE)
y <- sample(c(1, 4, 10), size = n, replace = TRUE)
#y <- sample(c("a", "10", "d"), size = n, replace = TRUE)
#y <- as.factor(sample(c(1, 4, 10), size = n, replace = TRUE))
#y <- as.factor(sample(c("a", "10", "d"), size = n, replace = TRUE))
Xtrain <- X ; ytrain <- y
m <- 5
Xtest <- X[1:m, ] ; ytest <- y[1:m]

############################# KNN-LWPLSRDA-AGG

nlvdis <- 5 ; diss <- "mahal"
h <- 2 ; k <- 10
nlv <- "2:6" 
fm <- lwplsrda_agg(
    Xtrain, ytrain, 
    nlvdis = nlvdis, diss = diss,
    h = h, k = k,
    nlv = nlv)
res <- predict(fm, Xtest)
res$pred
res$listnn

## Gridscore & gridcv
## Here, it makes no sense to use gridscorelv & gridcvlv
nlvdis <- 5 ; diss <- "mahal"
h <- c(2, Inf)
k <- c(10, 20)
nlv <- c("1:3", "2:5")
pars <- mpars(nlvdis = nlvdis, diss = diss,
              h = h, k = k, nlv = nlv)
pars

res <- gridscore(
    Xtrain, ytrain, Xtest, ytest, 
    score = err, 
    fun = lwplsrda_agg, 
    pars = pars)
res

segm <- segmkf(n = n, K = 3, nrep = 1)
res <- gridcv(
    Xtrain, ytrain, 
    segm, score = err, 
    fun = lwplsrda_agg, 
    pars = pars,
    verb = TRUE)
names(res)
res$val

############################# KNN-LWPLSLDA-AGG

nlvdis <- 5 ; diss <- "mahal"
h <- 2 ; k <- 10
nlv <- "2:6" 
fm <- lwplslda_agg(
    Xtrain, ytrain, 
    nlvdis = nlvdis, diss = diss,
    h = h, k = k,
    nlv = nlv, prior = "prop")
res <- predict(fm, Xtest)
res$pred
res$listnn

nlvdis <- 5 ; diss <- "mahal"
h <- c(2, Inf)
k <- c(10, 20)
nlv <- c("1:3", "2:5")
pars <- mpars(nlvdis = nlvdis, diss = diss,
              h = h, k = k, nlv = nlv, 
              prior = c("unif", "prop"))
pars

res <- gridscore(
    Xtrain, ytrain, Xtest, ytest, 
    score = err, 
    fun = lwplslda_agg, 
    pars = pars)
res

segm <- segmkf(n = n, K = 3, nrep = 1)
res <- gridcv(
    Xtrain, ytrain, 
    segm, score = err, 
    fun = lwplslda_agg, 
    pars = pars,
    verb = TRUE)
names(res)
res$val


mlesnoff/rchemo documentation built on April 15, 2023, 1:25 p.m.