cv_gds: Cross-Validated Generalized Dantzig Selector

Description

Fits the Generalized Dantzig Selector over a sequence of regularization parameters and evaluates each value by cross-validation.

Usage

cv_gds(
  X,
  y,
  family = "gaussian",
  no_lambda = 10,
  lambda = NULL,
  n_folds = 5,
  weights = rep(1, length(y))
)

Arguments

X

Design matrix.

y

Vector of response values.

family

Use "gaussian" for linear regression, "binomial" for logistic regression and "poisson" for Poisson regression.

no_lambda

Length of the vector lambda of regularization parameters. Note that if lambda is not provided, the actual number of values might differ slightly, due to the algorithm used by glmnet::glmnet in finding a grid of lambda values.

lambda

Regularization parameter. If not supplied and if no_lambda > 1, a sequence of no_lambda regularization parameters is computed with glmnet::glmnet. If no_lambda = 1 then the cross-validated optimum for the lasso is computed using glmnet::cv.glmnet.

n_folds

Number of cross-validation folds to use.

weights

A vector of weights for each row of X. Defaults to 1 per observation.
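
As a rough illustration of the lambda and no_lambda arguments, the sketch below obtains a lambda grid and a single cross-validated lasso optimum directly from glmnet. It assumes the glmnet package is installed; the simulated data and the variable names lambda_grid and lambda_single are made up for this example, and the exact calls cv_gds makes internally may differ.

```r
# Sketch only: shows how glmnet can produce lambda values.
# The data and variable names are illustrative, not part of hdme.
set.seed(1)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), nrow = n)
y <- as.numeric(X %*% c(1, 0.5, 0, 0, 0) + rnorm(n))

if (requireNamespace("glmnet", quietly = TRUE)) {
  # Grid of lambda values; glmnet may return fewer than nlambda values
  lambda_grid <- glmnet::glmnet(X, y, family = "gaussian", nlambda = 10)$lambda
  # Single cross-validated lasso optimum (compare the no_lambda = 1 case)
  lambda_single <- glmnet::cv.glmnet(X, y, family = "gaussian", nfolds = 5)$lambda.min
  print(length(lambda_grid))
  print(lambda_single)
}
```

A grid obtained this way could be passed explicitly through the lambda argument, which keeps the number of values under the caller's control.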

Details

Cross-validation loss is calculated as the deviance of the model divided by the number of observations. For the Gaussian case, this is the mean squared error. Weights supplied through the weights argument are used both in fitting the models and when evaluating the test set deviance.
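
The loss described above can be sketched as follows. The helper cv_loss is hypothetical and not part of hdme. It assumes that the weights multiply each observation's deviance contribution before summing, and that the fitted means mu for the held-out fold are already available.

```r
# Hypothetical helper (not part of hdme): weighted deviance of held-out
# observations divided by the number of observations.
# For "binomial", mu must lie strictly between 0 and 1.
cv_loss <- function(y, mu, family = "gaussian", weights = rep(1, length(y))) {
  dev_i <- switch(
    family,
    gaussian = (y - mu)^2,
    binomial = -2 * (y * log(mu) + (1 - y) * log(1 - mu)),
    # y * log(y / mu) is taken to be 0 when y is 0
    poisson = 2 * (ifelse(y == 0, 0, y * log(y / mu)) - (y - mu)),
    stop("Unknown family: ", family)
  )
  sum(weights * dev_i) / length(y)
}

y_test <- c(1, 2, 3)
mu_test <- c(1, 2, 5)
# With unit weights, the Gaussian loss is the mean squared error
gaussian_loss <- cv_loss(y_test, mu_test, family = "gaussian")
isTRUE(all.equal(gaussian_loss, mean((y_test - mu_test)^2)))
```

In a full cross-validation run, a loss of this kind would be computed on each held-out fold for every lambda value and then averaged across folds.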

Value

An object of class cv_gds.

References

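Candes E, Tao T (2007). "The Dantzig selector: Statistical estimation when p is much larger than n." The Annals of Statistics, 35(6), 2313-2351.

James GM, Radchenko P (2009). "A generalized Dantzig selector with shrinkage tuning." Biometrika, 96(2), 323-337.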

Examples

## Not run: 
# Example with logistic regression
n <- 1000  # Number of samples
p <- 10 # Number of covariates
X <- matrix(rnorm(n * p), nrow = n) # Design matrix
beta <- c(seq(from = 0.1, to = 1, length.out = 5), rep(0, p-5)) # True regression coefficients
y <- rbinom(n, 1, (1 + exp(-X %*% beta))^(-1)) # Binomially distributed response
cv_fit <- cv_gds(X, y, family = "binomial", no_lambda = 50, n_folds = 10)
print(cv_fit)
plot(cv_fit)

# Now fit a single GDS at the optimum lambda value determined by cross-validation
fit <- gds(X, y, lambda = cv_fit$lambda_min, family = "binomial")
plot(fit)

# Compare this to the fit for which lambda is selected by GDS
# This automatic selection is performed by glmnet::cv.glmnet, for
# the sake of speed
fit2 <- gds(X, y, family = "binomial")

# The following plot compares the two fits.
library(ggplot2)
library(tidyr)
df <- data.frame(fit = fit$beta, fit2 = fit2$beta, index = seq(1, p, by = 1))
ggplot(gather(df, key = "Model", value = "Coefficient", -index),
       aes(x = index, y = Coefficient, color = Model)) +
       geom_point() +
       theme(legend.title = element_blank())


## End(Not run)


osorensen/hdme documentation built on May 18, 2023, 11:35 p.m.