setupCritPlot: Set up an optimality criterion plot of a sequence of...
In robustHD: Robust Methods for High-Dimensional Data

setupCritPlot

R Documentation

Set up an optimality criterion plot of a sequence of regression models

Description

Extract the relevant information for a plot of the values of the optimality criterion for a sequence of regression models, such as submodels along a robust or groupwise least angle regression sequence, or sparse least trimmed squares regression models for a grid of values for the penalty parameter.

Usage

setupCritPlot(object, ...)

## S3 method for class 'seqModel'
setupCritPlot(object, which = c("line", "dot"), ...)

## S3 method for class 'tslars'
setupCritPlot(object, p, ...)

## S3 method for class 'sparseLTS'
setupCritPlot(
  object,
  which = c("line", "dot"),
  fit = c("reweighted", "raw", "both"),
  ...
)

## S3 method for class 'perrySeqModel'
setupCritPlot(object, which = c("line", "dot", "box", "density"), ...)

## S3 method for class 'perrySparseLTS'
setupCritPlot(
  object,
  which = c("line", "dot", "box", "density"),
  fit = c("reweighted", "raw", "both"),
  ...
)

Arguments

`object`	the model fit from which to extract information.
`...`	additional arguments to be passed down.
`which`	a character string specifying the type of plot. Possible values are `"line"` (the default) to plot the (average) results for each model as a connected line, `"dot"` to create a dot plot, `"box"` to create a box plot, or `"density"` to create a smooth density plot. Note that the last two plots are only available in case of prediction error estimation via repeated resampling.
`p`	an integer giving the lag length for which to extract information (the default is to use the optimal lag length).
`fit`	a character string specifying for which estimator to extract information. Possible values are `"reweighted"` (the default) for the reweighted fits, `"raw"` for the raw fits, or `"both"` for both estimators.
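
As a rough illustration of how `which` and `fit` interact, the sketch below fits a small sparse LTS model and requests two different setups. The simulated data and the object names (`fitLTS`, `setupLine`, `setupDot`) are made up for this illustration, and the small grid for `lambda` is an arbitrary choice to keep the run short.

```r
library("robustHD")

## small simulated data set (illustrative values only)
set.seed(1234)
n <- 50
p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- c(x[, 1:3] %*% rep(1, 3) + 0.5 * rnorm(n))

## sparse LTS over a short grid of penalty parameters
fitLTS <- sparseLTS(x, y, lambda = c(0.1, 0.05), mode = "fraction")

## defaults: line plot for the reweighted fits
setupLine <- setupCritPlot(fitLTS)

## dot plot for both the reweighted and the raw fits
setupDot <- setupCritPlot(fitLTS, which = "dot", fit = "both")

setupLine$which
setupDot$which
```

With `fit = "both"`, the returned data frame should carry the additional `Name` column described under Value, so the two estimators can be told apart.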

Value

An object inheriting from class "setupCritPlot" with the following components:

data

a data frame containing the following columns:

Fit: a vector or factor containing the identifiers of the models along the sequence.
Name: a factor specifying the estimator for which the optimality criterion was estimated ("reweighted" or "raw"; only returned if both are requested in the "sparseLTS" or "perrySparseLTS" methods).
PE: the estimated prediction errors (only returned if applicable).
BIC: the estimated values of the Bayesian information criterion (only returned if applicable).
Lower: the lower end points of the error bars (only returned if possible to compute).
Upper: the upper end points of the error bars (only returned if possible to compute).

which

a character string specifying the type of plot.

grouped

a logical indicating whether density plots should be grouped due to multiple model fits along the sequence (only returned in case of density plots for the "perrySeqModel" and "perrySparseLTS" methods).

includeSE

a logical indicating whether error bars based on standard errors are available (only returned in case of line plots or dot plots).

mapping

default aesthetic mapping for the plots.

facets

default faceting formula for the plots (only returned if both estimators are requested in the "sparseLTS" or "perrySparseLTS" methods).

tuning

a data frame containing the grid of tuning parameter values for which the optimality criterion was estimated (only returned for the "sparseLTS" and "perrySparseLTS" methods).
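
The components listed above can be inspected directly before any plot is drawn. The following is a minimal sketch, assuming a small simulated data set and a robust LARS fit with default settings; the object names are hypothetical, and which columns appear in `data` depends on the fit.

```r
library("robustHD")

## small simulated data set (illustrative values only)
set.seed(1234)
n <- 50
p <- 10
x <- matrix(rnorm(n * p), n, p)
y <- c(x[, 1:3] %*% rep(1, 3) + 0.5 * rnorm(n))

## robust LARS with at most 5 sequenced predictors
fitSeq <- rlars(x, y, sMax = 5)
setupSeq <- setupCritPlot(fitSeq)

## look at what was extracted
names(setupSeq)       # available components
head(setupSeq$data)   # one row per model along the sequence
setupSeq$which        # type of plot
setupSeq$includeSE    # whether error bars are available
```

Checking `names(setupSeq)` first is a cheap way to see which of the optional components were returned for a given method.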

Author(s)

Andreas Alfons

Examples

## generate data
# example is not high-dimensional to keep computation time low
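library("robustHD")  # provides rlars(), sparseLTS(), setupCritPlot(), critPlot()
library("mvtnorm")   # provides rmvnorm()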
set.seed(1234)  # for reproducibility
n <- 100  # number of observations
p <- 25   # number of variables
beta <- rep.int(c(1, 0), c(5, p-5))  # coefficients
sigma <- 0.5      # controls signal-to-noise ratio
epsilon <- 0.1    # contamination level
Sigma <- 0.5^t(sapply(1:p, function(i, j) abs(i-j), 1:p))
x <- rmvnorm(n, sigma=Sigma)    # predictor matrix
e <- rnorm(n)                   # error terms
i <- 1:ceiling(epsilon*n)       # observations to be contaminated
e[i] <- e[i] + 5                # vertical outliers
y <- c(x %*% beta + sigma * e)  # response
x[i,] <- x[i,] + 5              # bad leverage points


## robust LARS
# fit model
fitRlars <- rlars(x, y, sMax = 10)
# extract information for plotting
setup <- setupCritPlot(fitRlars)
critPlot(setup)


## sparse LTS over a grid of values for lambda
# fit model
frac <- seq(0.2, 0.05, by = -0.05)
fitSparseLTS <- sparseLTS(x, y, lambda = frac, mode = "fraction")
# extract information for plotting
setup1 <- setupCritPlot(fitSparseLTS)
critPlot(setup1)
setup2 <- setupCritPlot(fitSparseLTS, fit = "both")
critPlot(setup2)

robustHD documentation built on July 1, 2024, 1:06 a.m.