nnGarrote: Nonnegative Garrote Method with Hierarchical Structures

View source: R/HiGarrote.R

nnGarrote    R Documentation

Description

'nnGarrote()' implements the nonnegative garrote method of Yuan et al. (2009), which selects important variables while preserving hierarchical structures. The method first obtains the least squares estimates of the regression parameters under a linear model, then uses these initial estimates in the nonnegative garrote to perform variable selection. The function also supports prediction from the linear model fitted with the selected variables and their nonnegative garrote estimates. Because it starts from least squares estimates, the method is suitable only when the number of observations is much larger than the number of variables, so that those estimates remain reliable.
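For orientation, the two steps described above can be written as an optimization problem. The block below is a sketch of the general hierarchical nonnegative garrote as commonly stated, not a transcription of the package's code. The symbols \theta_j (shrinkage factors), \lambda (tuning parameter), and \hat{\beta}_j^{LS} (least squares estimates) are notation assumed here, and this page does not say how the package chooses the tuning parameter. Only one of the two heredity constraints applies, depending on the heredity argument.

```latex
% Sketch only: notation is assumed for illustration, not taken from the package.
% U_j is the j-th column of U; j:k denotes the interaction of variables j and k.
\begin{aligned}
\hat{\theta} = \arg\min_{\theta}\;
  & \tfrac{1}{2}\Bigl\| y - \sum_{j=1}^{P} \theta_j \,\hat{\beta}_j^{LS}\, U_j \Bigr\|^2
    + \lambda \sum_{j=1}^{P} \theta_j \\
\text{subject to}\;
  & \theta_j \ge 0 \quad \text{for all } j, \\
  & \theta_{j:k} \le \theta_j \ \text{ and } \ \theta_{j:k} \le \theta_k
    \quad \text{(strong heredity)}, \\
  & \theta_{j:k} \le \theta_j + \theta_k
    \quad \text{(weak heredity)}.
\end{aligned}

% The nonnegative garrote estimate rescales each least squares estimate:
\hat{\beta}_j^{NNG} = \hat{\theta}_j \,\hat{\beta}_j^{LS}
```

A variable is dropped when its shrinkage factor is zero. Under these constraints, an interaction can keep a nonzero factor only if both of its parents do (strong) or at least one of them does (weak).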

Usage

nnGarrote(U, y, new_U = NULL, heredity = "weak")

Arguments

U

An n × P model matrix, where n is the number of observations and P is the number of potential variables. Potential variables may include interactions only up to second order; third-order and higher-order interactions are not supported. The name of each second-order interaction must contain the colon symbol ":" to separate its parent variables, for example "x1:x2". Please see the example for the naming format.

y

A vector of responses.

new_U

Optional. A matrix or data frame containing the model matrix of the new observations to predict.

heredity

Heredity principle to enforce: "weak" or "strong". The default is "weak".
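To make the ":" naming convention and the two heredity principles concrete, here is a minimal base R sketch. The helpers parents_of() and obeys_heredity() are hypothetical names introduced for illustration; they are not part of HiGarrote and do not reproduce its internals. The sketch assumes the usual definitions: strong heredity requires every parent of a selected interaction to be selected, and weak heredity requires at least one. It also assumes that a quadratic term named "x1:x1", as in the example, has the single parent "x1".

```r
# Hypothetical helpers for illustration only; not part of HiGarrote.

# Split an effect name on ":" to recover its parent variables.
# "x1:x2" -> c("x1", "x2"); "x1:x1" -> "x1"; a main effect has no parents.
parents_of <- function(name) {
  parts <- strsplit(name, ":", fixed = TRUE)[[1]]
  if (length(parts) < 2) character(0) else unique(parts)
}

# Check whether a set of selected effect names obeys a heredity principle.
# strong: every parent of a selected interaction is also selected.
# weak:   at least one parent of a selected interaction is also selected.
obeys_heredity <- function(selected, heredity = c("weak", "strong")) {
  heredity <- match.arg(heredity)
  for (name in selected) {
    p <- parents_of(name)
    if (length(p) == 0) next          # main effects impose no requirement
    present <- p %in% selected
    ok <- if (heredity == "strong") all(present) else any(present)
    if (!ok) return(FALSE)
  }
  TRUE
}

obeys_heredity(c("x1", "x1:x2"), "weak")    # TRUE: parent x1 is selected
obeys_heredity(c("x1", "x1:x2"), "strong")  # FALSE: parent x2 is missing
obeys_heredity(c("x1", "x1:x1"), "strong")  # TRUE: the only parent is x1
```

A check like this can be run on the names of the returned estimates to confirm that the selected model respects the requested principle.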

Value

If new_U is NULL, the function returns a vector of the nonnegative garrote estimates of the identified variables.

If new_U is not NULL, the function returns a list with:

  • beta_nng: a vector of the nonnegative garrote estimates of the identified variables.

  • pred: predictions for the output corresponding to new_U.

References

Yuan, M., Joseph, V. R., and Zou, H. (2009) "Structured Variable Selection and Estimation," The Annals of Applied Statistics, 3(4):1738–1757.

Examples

x1 <- runif(1000)
x2 <- runif(1000)
x3 <- runif(1000)
error <- rnorm(1000)
X <- data.frame(x1, x2, x3)
U_all <- data.frame(model.matrix(~. + x1:x2 + x1:x3 + x2:x3 + I(x1^2) + I(x2^2) + I(x3^2), X))
colnames(U_all) <- c("X.Intercept.", "x1", "x2", "x3", "x1:x1", "x2:x2", "x3:x3",
 "x1:x2", "x1:x3", "x2:x3")
# ":" is required for detecting the parent variables of a second-order interaction.

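# Hold out 800 of the 1000 rows for prediction; fit on the remaining 200.
new_idx <- sample(1:1000, 800)
new_U <- U_all[new_idx,]
U_idx <- setdiff(1:1000, new_idx)
U <- U_all[U_idx,]
# True model: x1, its quadratic term x1:x1, and the x1:x2 interaction.
y_all <- 20*U_all$x1 + 15*U_all$`x1:x1` + 10*U_all$`x1:x2` + error
y <- y_all[U_idx]
# new_U is supplied, so the result is a list with beta_nng and pred.
nnGarrote(U, y, new_U)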
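The example above prints the result without using it. The following sketch rebuilds the same simulated data and, if HiGarrote is installed, inspects the two documented list elements and compares the predictions with the held-out responses. The seed, the y_new vector, and the root mean squared error summary are additions made here for illustration. Treating pred as coercible to a numeric vector is an assumption, since this page does not state its exact type.

```r
set.seed(1)  # assumption: added only so that the sketch is reproducible
n <- 1000
X <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
U_all <- data.frame(model.matrix(~ . + x1:x2 + x1:x3 + x2:x3 +
                                   I(x1^2) + I(x2^2) + I(x3^2), X))
# Restore the ":" names that data.frame() would otherwise alter.
colnames(U_all) <- c("X.Intercept.", "x1", "x2", "x3", "x1:x1", "x2:x2", "x3:x3",
                     "x1:x2", "x1:x3", "x2:x3")
y_all <- 20*U_all$x1 + 15*U_all$`x1:x1` + 10*U_all$`x1:x2` + rnorm(n)

# Hold out 800 rows for prediction and fit on the remaining 200.
new_idx <- sample(n, 800)
U <- U_all[-new_idx, ]
new_U <- U_all[new_idx, ]
y <- y_all[-new_idx]
y_new <- y_all[new_idx]  # held-out responses, used only for evaluation

if (requireNamespace("HiGarrote", quietly = TRUE)) {
  fit <- HiGarrote::nnGarrote(U, y, new_U)
  print(fit$beta_nng)  # estimates of the identified variables
  # Assumption: pred can be coerced to a numeric vector of length nrow(new_U).
  rmse <- sqrt(mean((as.numeric(fit$pred) - y_new)^2))
  print(rmse)
}
```

Because the simulated noise has standard deviation 1, a held-out root mean squared error close to 1 would suggest that the selected model captures the signal well.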



HiGarrote documentation built on April 4, 2025, 12:37 a.m.