pls_warper: Partial least squares transformation of feature space


pls_warper    R Documentation

Partial least squares transformation of feature space

Description

This function generates a warper object based on a partial least squares transformation with respect to a selected feature.

Usage

pls_warper(
  xdata,
  xvars,
  pvars,
  wvars = "resid",
  yvar,
  uvars = NULL,
  title = wvars
)

Arguments

xdata

A data frame containing the observations in the original feature space.

xvars

A character vector with the column names of features in xdata that should be transformed.

pvars

A character string identifying one feature in xdata that is used as the selected feature; the features in xvars will be transformed by orthogonalization with respect to it, while the selected feature itself remains unchanged.

wvars

A character string giving a prefix for partial residual features.

yvar

Name of the response variable (not transformed).

uvars

Names of additional variables that should remain untouched.

title

Optional name of the transformation, may be used for printing summaries or for plotting.
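
The orthogonalization mentioned under pvars can be illustrated with a minimal base-R sketch. It is not the package's implementation: it assumes plain least-squares residualization on a single selected feature, and the simulated data and the names p, X, B and R are hypothetical.

```r
# Illustration only: simulated data and hypothetical names, not wiml code.
set.seed(42)
n  <- 200
p  <- rnorm(n)                        # plays the role of the selected feature (pvars)
x1 <-  0.8 * p + rnorm(n, sd = 0.3)   # correlated features (role of xvars)
x2 <- -0.5 * p + rnorm(n, sd = 0.5)
X  <- cbind(x1 = x1, x2 = x2)

# Least-squares regression of every column of X on p (with intercept)
P <- cbind(1, p)                            # n x 2 design matrix
B <- solve(crossprod(P), crossprod(P, X))   # 2 x 2 coefficient matrix
R <- X - P %*% B                            # residual features

# The residual features are uncorrelated with p (up to rounding error)
max_abs_cor <- max(abs(cor(p, R)))

# Back-transformation: fitted part plus residuals recovers X
X_back  <- P %*% B + R
back_ok <- isTRUE(all.equal(X, X_back))
```

In this sketch the selected feature is kept as is, and the remaining features are replaced by the part of them that p cannot explain linearly, so the transformation can be inverted.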

Details

The arguments pvars, xvars, uvars and yvar should not overlap.
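
A quick way to verify the non-overlap requirement before calling pls_warper is sketched below. The helper check_no_overlap is hypothetical and not part of wiml; whether pls_warper performs such a check itself is not stated here.

```r
# Hypothetical helper (not part of wiml): stop if any variable name
# occurs more than once across the argument vectors.
check_no_overlap <- function(xvars, pvars, yvar, uvars = NULL) {
  all_vars <- c(xvars, pvars, yvar, uvars)
  dups <- unique(all_vars[duplicated(all_vars)])
  if (length(dups) > 0) {
    stop("Arguments overlap in: ", paste(dups, collapse = ", "))
  }
  invisible(TRUE)
}

xvars <- c(paste0("ndvi0", 1:8), paste0("ndwi0", 1:8))

# Passes silently: "late_ndi" is not one of the xvars
ok <- check_no_overlap(xvars, pvars = "late_ndi", yvar = "class")

# Fails: "ndvi04" is listed both in xvars and as pvars
res <- tryCatch(
  check_no_overlap(xvars, pvars = "ndvi04", yvar = "class"),
  error = function(e) conditionMessage(e)
)
```

Note that this sketch also flags names repeated within a single argument, such as a duplicate entry in xvars.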

Value

An object of classes warper, rotation_warper and pls_warper.

Examples

### Create PLS warper for late-season mean NDVI
### in the Maipo data set to explore the combined
### effect of these variables:
xvars <- c(paste("ndvi0", 1:8, sep = ""),
           paste("ndwi0", 1:8, sep = ""))
fo <- as.formula(paste("class ~", paste(xvars, collapse = " + ")))
d <- maipofields
fit <- randomForest::randomForest(fo, data = d)

# Late-season NDVI and NDWI features:
late_ndis <- c(paste("ndvi0", 4:7, sep = ""),
               paste("ndwi0", 4:7, sep = ""))
# Note that they are strongly correlated,
# especially for same or adjacent image dates,
# e.g. ndvi04 and ndwi04, or ndvi06 and ndvi07:
round(cor(d[,late_ndis]), 2)
# PC #1 explains 91% of the variance:
late_pca <- stats::prcomp(d[,late_ndis],
                          scale. = TRUE,
                          rank. = 3)
summary(late_pca)

# Mean late-season NDVI + NDWI:
# (we can average them since they have a comparable scale,
# otherwise we'd want to standardize them first)
d$late_ndi <- rowMeans(d[, late_ndis])
# Note that this variable was not in the RF's feature set!

wrp <- pls_warper(d, xvars = xvars,
                  pvars = "late_ndi",
                  yvar = "class")

# Warp the model and the feature data:
wd <- warp(d, warper = wrp)
wfit <- warp_fitted_model(fit, warper = wrp)

# Use iml package to create partial dependence plot:
if (require("iml")) {
  wprd <- Predictor$new(wfit, data = wd, y = "class",
                       type = "prob", class = "crop1")
  weff <- FeatureEffect$new(wprd, feature = "late_ndi",
                            method = "pdp", grid.size = 100)
  plot(weff)
}

# For comparison, the traditional, untransformed
# perspective:
if (require("iml")) {
  prd <- Predictor$new(fit, data = d, y = "class",
                       type = "prob", class = "crop1")
  eff <- FeatureEffects$new(prd, features = late_ndis,
                            method = "pdp", grid.size = 100)
  plot(eff, ncol = 4)
}

# ...and remember that the model above uses 16 features
# (8 of them late-season) and four classes, so the transformed
# perspective is much tidier: it represents the combined
# effect in a single figure.

# Backtransform it, should be identical to d:
d2 <- unwarp(wd, warper = wrp)
all.equal(d[,xvars], d2[,xvars])
# The default tolerance works for this data set, but you may have to use
# e.g. tolerance = 1e-6 for less well-conditioned data sets and
# transformations.
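
The effect of the tolerance argument mentioned above can be seen in a small base-R sketch that is independent of wiml. The vectors x and y are made-up values chosen so that the default tolerance of all.equal (about 1.5e-8) is exceeded.

```r
# Made-up values: y differs from x by a small numerical error
x <- c(1, 2, 3)
y <- x + 1e-7

# Mean relative difference is about 5e-8, above the default tolerance
default_ok <- isTRUE(all.equal(x, y))

# A looser tolerance accepts the same difference
loose_ok <- isTRUE(all.equal(x, y, tolerance = 1e-6))
```

Wrapping the call in isTRUE() matters because all.equal returns a character description of the difference, not FALSE, when the objects differ.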

alexanderbrenning/wiml documentation built on Sept. 29, 2023, 4:45 a.m.