setupCoefPlot: Set up a coefficient plot of a sequence of regression models
In robustHD: Robust Methods for High-Dimensional Data

Description

Extract the relevant information for a plot of the coefficients for a sequence of regression models, such as submodels along a robust or groupwise least angle regression sequence, or sparse least trimmed squares regression models for a grid of values of the penalty parameter.

Usage

setupCoefPlot(object, ...)

## S3 method for class 'seqModel'
setupCoefPlot(object, zeros = FALSE, labels = NULL, ...)

## S3 method for class 'tslars'
setupCoefPlot(object, p, ...)

## S3 method for class 'sparseLTS'
setupCoefPlot(
  object,
  fit = c("reweighted", "raw", "both"),
  zeros = FALSE,
  labels = NULL,
  ...
)

Arguments

`object`	the model fit from which to extract information.
`...`	additional arguments to be passed down.
`zeros`	a logical indicating whether predictors that never enter the model and thus have zero coefficients should be included in the plot (`TRUE`) or omitted (`FALSE`, the default). This is useful if the number of predictors is much larger than the number of observations, in which case many coefficients are never nonzero.
`labels`	an optional character vector containing labels for the predictors. Information on labels can be suppressed by setting this to `NA`.
`p`	an integer giving the lag length for which to extract information (the default is to use the optimal lag length).
`fit`	a character string specifying for which estimator to extract information. Possible values are `"reweighted"` (the default) for the reweighted fits, `"raw"` for the raw fits, or `"both"` for both estimators.
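
As a rough illustration of how `zeros` and `labels` affect what is extracted, the following sketch fits a robust LARS sequence on simulated data and compares the resulting objects. The simulated data and the object names (`setupDefault`, `setupZeros`, `setupNoLabels`) are made up for this illustration; only `rlars()` and `setupCoefPlot()` come from the package.

```r
library("robustHD")

## simulated data (hypothetical, for illustration only)
set.seed(1234)
n <- 100                                  # number of observations
p <- 25                                   # number of predictors
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- c(x[, 1:5] %*% rep(1, 5) + 0.5 * rnorm(n))

## robust LARS sequence with at most 10 predictors
fit <- rlars(x, y, sMax = 10)

## default: predictors that never enter the sequence are omitted
setupDefault <- setupCoefPlot(fit)
## keep predictors whose coefficients are zero throughout
setupZeros <- setupCoefPlot(fit, zeros = TRUE)
## suppress the information on labels
setupNoLabels <- setupCoefPlot(fit, labels = NA)

## number of distinct variables carried along in each case
length(unique(setupDefault$coefficients$variable))
length(unique(setupZeros$coefficients$variable))

## flag indicating whether label information is included
setupNoLabels$includeLabels
```

With `zeros = TRUE`, the coefficient data frame should cover at least as many variables as with the default, since predictors that never enter the model are kept.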

Value

An object inheriting from class "setupCoefPlot" with the following components:

coefficients

a data frame containing the following columns:

fit: the model fit for which the coefficient is computed (only returned if both the reweighted and raw fit are requested in the "sparseLTS" method).
lambda: the value of the penalty parameter for which the coefficient is computed (only returned for the "sparseLTS" method).
step: the step along the sequence for which the coefficient is computed.
df: the degrees of freedom of the submodel along the sequence for which the coefficient is computed.
coefficient: the value of the coefficient.
variable: a character string specifying to which variable the coefficient belongs.

abscissa

a character string specifying available options for what to plot on the x-axis.

lambda

a numeric vector giving the values of the penalty parameter (only returned for the "sparseLTS" method).

step

an integer vector containing the steps for which submodels along the sequence have been computed.

df

an integer vector containing the degrees of freedom of the submodels along the sequence (i.e., the number of estimated coefficients; only returned for the "seqModel" method).

includeLabels

a logical indicating whether information on labels for the variables should be included in the plot.

labels

a data frame containing the following columns (not returned if information on labels is suppressed):

fit: the model fit for which the coefficient is computed (only returned if both the reweighted and raw fit are requested in the "sparseLTS" method).
lambda: the smallest value of the penalty parameter (only returned for the "sparseLTS" method).
step: the last step along the sequence.
df: the degrees of freedom of the last submodel along the sequence.
coefficient: the value of the coefficient.
label: the label of the corresponding variable to be displayed in the plot.

facets

default faceting formula for the plots (only returned if both estimators are requested in the "sparseLTS" method).
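
The components listed above can be inspected directly on the returned object. The sketch below does so for a sparse LTS fit with both estimators requested. The simulated data and the grid of `lambda` values are assumptions made for this illustration, and the exact printed output depends on the installed package version.

```r
library("robustHD")

## simulated data (hypothetical, for illustration only)
set.seed(1234)
n <- 100
p <- 25
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- c(x[, 1:5] %*% rep(1, 5) + 0.5 * rnorm(n))

## sparse LTS over a small grid of penalty parameters
fit <- sparseLTS(x, y, lambda = c(0.2, 0.1, 0.05), mode = "fraction")

## request information for both the reweighted and the raw estimator
setup <- setupCoefPlot(fit, fit = "both")

class(setup)              # should include "setupCoefPlot"
names(setup)              # components of the returned object
str(setup$coefficients)   # long-format data frame, one row per coefficient
setup$abscissa            # available options for the x-axis
setup$lambda              # values of the penalty parameter
setup$facets              # default faceting formula (both estimators requested)
```

Because `coefficients` is in long format (one row per combination of fit, penalty parameter, and variable), it can also be passed to other plotting or summary tools without further reshaping.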

Author(s)

Andreas Alfons

Examples

## generate data
# example is not high-dimensional to keep computation time low
library("robustHD")
library("mvtnorm")
set.seed(1234)  # for reproducibility
n <- 100  # number of observations
p <- 25   # number of variables
beta <- rep.int(c(1, 0), c(5, p-5))  # coefficients
sigma <- 0.5      # controls signal-to-noise ratio
epsilon <- 0.1    # contamination level
Sigma <- 0.5^t(sapply(1:p, function(i, j) abs(i-j), 1:p))
x <- rmvnorm(n, sigma=Sigma)    # predictor matrix
e <- rnorm(n)                   # error terms
i <- 1:ceiling(epsilon*n)       # observations to be contaminated
e[i] <- e[i] + 5                # vertical outliers
y <- c(x %*% beta + sigma * e)  # response
x[i,] <- x[i,] + 5              # bad leverage points


## robust LARS
# fit model
fitRlars <- rlars(x, y, sMax = 10)
# extract information for plotting
setup <- setupCoefPlot(fitRlars)
coefPlot(setup)


## sparse LTS over a grid of values for lambda
# fit model
frac <- seq(0.2, 0.05, by = -0.05)
fitSparseLTS <- sparseLTS(x, y, lambda = frac, mode = "fraction")
# extract information for plotting
setup1 <- setupCoefPlot(fitSparseLTS)
coefPlot(setup1)
setup2 <- setupCoefPlot(fitSparseLTS, fit = "both")
coefPlot(setup2)
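
The `abscissa` component lists the options available for the x-axis. The sketch below picks one of them and passes it on to `coefPlot()`. It assumes that `coefPlot()` accepts an `abscissa` argument and returns a ggplot object, as described in that function's own help page. The simulated data and the helper variable `xChoice` are hypothetical.

```r
library("robustHD")

## simulated data (hypothetical, for illustration only)
set.seed(1234)
n <- 100
p <- 25
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- c(x[, 1:5] %*% rep(1, 5) + 0.5 * rnorm(n))

## robust LARS sequence and the extracted plot information
fit <- rlars(x, y, sMax = 10)
setup <- setupCoefPlot(fit)

## use the degrees of freedom on the x-axis if offered,
## otherwise fall back to the first available option
xChoice <- if ("df" %in% setup$abscissa) "df" else setup$abscissa[1]

## abscissa is assumed to be an argument of coefPlot(); see ?coefPlot
plotChoice <- coefPlot(setup, abscissa = xChoice)
print(plotChoice)
```

Splitting the work into `setupCoefPlot()` and `coefPlot()` in this way means the extracted information can be reused for several plots without recomputing it from the model fit.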

robustHD documentation built on July 1, 2024, 1:06 a.m.