predict.ddsPLS: Function to predict from ddsPLS objects

predict.ddsPLS    R Documentation

Function to predict from ddsPLS objects

Description

Function to predict from ddsPLS objects

Usage

## S3 method for class 'ddsPLS'
predict(
  object,
  X_test = NULL,
  toPlot = FALSE,
  doDiagnosis = TRUE,
  legend.position = "topright",
  cex = 1,
  cex.text = 1,
  ...
)

Arguments

object

ddsPLS object.

X_test

matrix, a test dataset. If NULL (the default), the predicted values for the train set are returned.

toPlot

boolean, whether or not to plot the extreme value test plot. Defaults to FALSE.

doDiagnosis

boolean, whether or not to perform the diagnostic operations. Defaults to TRUE. See the Value section for more details.

legend.position

character. Where to put the legend.

cex

positive number. The amount by which plotting symbols should be scaled relative to the default.

cex.text

positive number. The amount by which plotting text elements should be scaled relative to the default.

...

arguments to be passed to methods, such as graphical parameters.

Details

The diagnostic descriptors are useful for detecting potential outliers in the train or test datasets. They can be visualized by setting the parameter toPlot to TRUE, which creates a graph projecting each observation i of the train and test datasets onto the standardized subspaces spanned by the ddsPLS model:

\left( \epsilon_x(i),\epsilon_t(i) \right)= \left( \dfrac{1}{\sqrt{p}}\sqrt{\sum_{j=1}^p \left(\dfrac{\left[\hat{\mathbf{x}}_i- \mathbf{x}_i\right]_{(j)}}{\hat{\sigma}_j^{(x)}}\right)^2}, \dfrac{1}{\sqrt{R}}\sqrt{\sum_{r=1}^R \left(\dfrac{\hat{{t}}_i^{(r)}} {\hat{\sigma}_r}\right)^2} \right),

where [\cdot]_{(j)} takes the j^{th} coordinate of its argument. The different estimators are

\hat{{t}}_i^{(r)} = (\mathbf{x}_i-\hat{\boldsymbol{\mu}}_\mathbf{x}) \mathbf{u}_r,

\hat{\mathbf{x}}_i = \dfrac{1}{R}(\mathbf{x}_i-\hat{\boldsymbol{\mu}}_\mathbf{x}) \sum_{r=1}^R\mathbf{u}_r\mathbf{p}_r^\top,

In addition, \forall j\in[\![1,p]\!], \hat{\sigma}_j^{(x)2} is the empirical variance of the j^{th} variable of X, estimated on the train set of size n. Similarly, \hat{\sigma}_r^2=\dfrac{1}{n}\sum_{j=1}^n\left((\mathbf{x}_j-\hat{\boldsymbol{\mu}}_\mathbf{x}) \mathbf{u}_r\right)^2 is the empirical variance of the r^{th} component. Further, R is the estimated number of components of the ddsPLS model, \hat{\boldsymbol{\mu}}_\mathbf{x} is the empirical mean of X, \mathbf{u}_r is the weight of X along the r^{th} component, and \mathbf{p}_r is the corresponding loading for X.

The diagnoses object of the output is filled with two lists:

  • $object the first coordinate of the previous bivariate description corresponding to the reconstruction by the ddsPLS model.

  • $t the second coordinate of the previous bivariate description, corresponding to the score.
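The two coordinates defined above can be sketched in base R. This is only an illustration of the formulas, not the package's implementation: the helper name diagnose, the toy weights U and the toy loadings P are hypothetical, the formula for \hat{\mathbf{x}}_i is followed literally (including the 1/R factor and the uncentered \mathbf{x}_i), and the 1/n variance convention for \hat{\sigma}_j^{(x)} is an assumption.

```r
# Sketch of the two diagnostic coordinates, following the formulas literally.
# `diagnose`, `U` and `P` are hypothetical stand-ins, not ddsPLS internals.
diagnose <- function(X_new, X_train, U, P) {
  p <- ncol(X_train)
  R <- ncol(U)
  mu <- colMeans(X_train)                 # empirical mean of X (train set)
  Xc_train <- sweep(X_train, 2, mu)       # centered train observations
  Xc_new <- sweep(X_new, 2, mu)           # new observations, centered with train mean
  T_train <- Xc_train %*% U               # train scores, used for sigma_r
  T_new <- Xc_new %*% U                   # scores t_i^(r) = (x_i - mu) u_r
  sigma_t <- sqrt(colMeans(T_train^2))    # sigma_r, 1/n convention as in the text
  sigma_x <- sqrt(colMeans(Xc_train^2))   # sigma_j^(x), 1/n convention (assumption)
  X_hat <- (Xc_new %*% U %*% t(P)) / R    # reconstruction, as written in the text
  resid <- sweep(X_hat - X_new, 2, sigma_x, "/")
  eps_x <- sqrt(rowSums(resid^2)) / sqrt(p)
  eps_t <- sqrt(rowSums(sweep(T_new, 2, sigma_t, "/")^2)) / sqrt(R)
  list(eps_x = eps_x, eps_t = eps_t)
}

# Toy data: 50 train observations, 5 new observations, p = 6, R = 2
set.seed(1)
n <- 50; p <- 6; R <- 2
X_train <- matrix(rnorm(n * p), n, p)
X_new <- matrix(rnorm(5 * p), 5, p)
U <- qr.Q(qr(matrix(rnorm(p * R), p, R)))  # arbitrary orthonormal weights
P <- U                                     # arbitrary loadings for illustration
d <- diagnose(X_new, X_train, U, P)
d_train <- diagnose(X_train, X_train, U, P)
```

With these definitions the squared score coordinate averages to exactly 1 over the train set, since each score is divided by its own train standard deviation. This gives a natural reference scale when reading the plot: observations with \epsilon_t(i) far above 1 lie unusually far from the center of the component space.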

Value

A list of two objects:

  • Y_est the estimated values for the response variable.

  • diagnoses the results of diagnostic operations, useful to detect potential outliers in the dataset.

See Also

ddsPLS, plot.ddsPLS, summary.ddsPLS

Examples

library(ddsPLS)
# Simulate d latent variables driving p predictors and q responses
n <- 100 ; d <- 2 ; p <- 20 ; q <- 2 ; n_test <- 1000
phi <- matrix(rnorm(n*d),n,d)
phi_test <- matrix(rnorm(n_test*d),n_test,d)
a <- rep(1,p/4) ; b <- rep(1,p/2)
# Loadings shared by the train and test predictors
A <- matrix(c(1*a,0*a,0*b,1*a,3*b,0*a),nrow = d,byrow = TRUE)
X <- phi%*%A + matrix(rnorm(n*p,sd = 1/4),n,p)
X_test <- phi_test%*%A + matrix(rnorm(n_test*p,sd = 1/4),n_test,p)
# Only the first response depends on the latent variables
Y <- phi%*%matrix(c(1,0,0,0),nrow = d,byrow = TRUE) +
  matrix(rnorm(n*q,sd = 1/4),n,q)
# Fit the model, then predict on the test set with diagnostics and the plot
res <- ddsPLS(X,Y,verbose=FALSE)
pre <- predict(res,X_test = X_test,toPlot = TRUE,doDiagnosis = TRUE)


ddsPLS documentation built on May 31, 2023, 7:50 p.m.