Function to fit the Strong Heredity Interaction Model for a sequence of tuning parameters. This is a penalized regression method that ensures an interaction term is non-zero only if its corresponding main effects are non-zero.
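To make the strong heredity constraint concrete, here is a minimal base R sketch that checks whether a set of coefficients respects it. The helper `satisfies_strong_heredity` and the coefficient values are hypothetical illustrations and are not part of this package. Interaction names are assumed to follow the colon convention (e.g. x1:x2).

```r
# Hypothetical helper (not part of the package): returns TRUE when every
# non-zero interaction has both of its main effects non-zero.
satisfies_strong_heredity <- function(beta, alpha) {
  # names of the interactions with non-zero coefficients
  active <- names(alpha)[alpha != 0]
  # split each name such as "x1:e" into its two main-effect names
  parents <- strsplit(active, ":", fixed = TRUE)
  all(vapply(parents, function(p) all(beta[p] != 0), logical(1)))
}

# made-up main-effect and interaction coefficients
beta  <- c(x1 = 1.2, x2 = 0, e = -0.5)
alpha_ok  <- c("x1:e" = 0.8, "x2:e" = 0)    # x1:e is active, x1 and e are non-zero
alpha_bad <- c("x1:e" = 0,   "x2:e" = 0.3)  # x2:e is active, but x2 is zero

ok  <- satisfies_strong_heredity(beta, alpha_ok)
bad <- satisfies_strong_heredity(beta, alpha_bad)
```

Here `ok` is TRUE and `bad` is FALSE, because the second set of coefficients activates an interaction whose main effect x2 is zero.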
shim(x, y, main.effect.names, interaction.names, family = c("gaussian",
  "binomial", "poisson"), weights, lambda.factor = ifelse(nobs < nvars, 0.01,
  1e-06), lambda.beta = NULL, lambda.gamma = NULL, nlambda.gamma = 10,
  nlambda.beta = 10, nlambda = 100, threshold = 1e-04, max.iter = 100,
  initialization.type = c("ridge", "univariate"), center = TRUE,
  normalize = TRUE, verbose = TRUE, cores = 2)
x |
Design matrix of dimension |
y |
response variable. For |
main.effect.names |
character vector of main effects names. MUST be
ordered in the same way as the column names of |
interaction.names |
character vector of interaction names. MUST be
separated by a colon (e.g. x1:x2), AND MUST be ordered in the same way as
the column names of |
family |
response type. see |
weights |
observation weights. Can be total counts if responses are proportion matrices. Default is 1 for each observation. Currently NOT IMPLEMENTED |
lambda.factor |
The factor for getting the minimal lambda in lambda
sequence, where |
lambda.beta |
sequence of tuning parameters for the main effects. If
|
lambda.gamma |
sequence of tuning parameters for the interaction
effects. Default is |
nlambda.gamma |
number of tuning parameters for gamma. This needs to be specified even for user-defined inputs |
nlambda.beta |
number of tuning parameters for beta. This needs to be specified even for user-defined inputs |
nlambda |
total number of tuning parameters. If |
threshold |
Convergence threshold for coordinate descent. Each
coordinate-descent loop continues until the change in the objective
function after all coefficient updates is less than threshold. Default
value is |
max.iter |
Maximum number of passes over the data for all tuning parameter values; default is 100. |
initialization.type |
The procedure used to estimate the regression
coefficients and used in the |
center |
Should |
normalize |
Should |
verbose |
Should iteration number and vector of length |
cores |
The number of cores to use for certain calculations in the
|
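As a rough illustration of how threshold and max.iter interact, the sketch below runs a plain, unpenalized coordinate descent on simulated data and stops when the change in the objective after a full pass falls below the threshold. This is an assumption-laden toy: the data, the least-squares objective, and the update rule are chosen for illustration and are not the penalized updates used inside shim.

```r
set.seed(1)
# simulated data (illustrative only)
X <- matrix(rnorm(200), nrow = 50, ncol = 4)
y <- as.numeric(X %*% c(1, -2, 0, 0.5) + rnorm(50))

threshold <- 1e-04   # same default value as in the usage above
max.iter  <- 100     # maximum number of passes over the data

# least-squares objective (no penalty in this toy version)
objective <- function(b) sum((y - X %*% b)^2) / (2 * nrow(X))

b <- rep(0, ncol(X))
obj_old <- objective(b)
for (iter in seq_len(max.iter)) {
  for (j in seq_along(b)) {
    # partial residual with coordinate j left out
    r_j <- y - X[, -j, drop = FALSE] %*% b[-j]
    # exact minimizer of the objective in coordinate j
    b[j] <- sum(X[, j] * r_j) / sum(X[, j]^2)
  }
  obj_new <- objective(b)
  # stop once a full pass changes the objective by less than threshold
  if (abs(obj_old - obj_new) < threshold) break
  obj_old <- obj_new
}
```

After the loop, `iter` holds the number of passes used and `b` the estimated coefficients. A smaller threshold gives a tighter solution at the cost of more passes, and max.iter caps the work if the threshold is never reached.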
The tuning parameters are indexed as follows. For example, with 10 lambda_gammas and 20 lambda_betas, the first lambda_gamma is repeated 20 times, so the first twenty entries of the tuning parameters pair the first lambda_gamma with each of the 20 lambda_betas.
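The indexing described above can be sketched in base R. The lambda values are made up, and the sketch assumes, purely for illustration, that the same lambda_beta sequence is reused for every lambda_gamma; the function may compute its sequences differently.

```r
n_gamma <- 10
n_beta  <- 20

# made-up decreasing sequences of tuning parameters
lambda_gamma <- seq(1, 0.1, length.out = n_gamma)
lambda_beta  <- seq(2, 0.1, length.out = n_beta)

# 2 x (n_gamma * n_beta) layout: each lambda_gamma is repeated n_beta times
grid <- rbind(
  lambda.beta  = rep(lambda_beta, times = n_gamma),
  lambda.gamma = rep(lambda_gamma, each = n_beta)
)

dim(grid)                            # 2 rows, 200 columns
unique(grid["lambda.gamma", 1:20])   # only the first lambda_gamma
```

Under this layout, column k belongs to lambda_gamma number `ceiling(k / n_beta)`, which is handy for mapping a column of the coefficient matrices back to its pair of tuning parameters.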
An object with S3 class "shim"
Intercept
sequence of length nlambda
A nvars x nlambda
matrix of main effects (β) coefficients, stored in sparse column
format ("CsparseMatrix")
A nvars x nlambda
matrix of interaction effects (α) coefficients, stored in sparse
column format ("CsparseMatrix")
A nvars x
nlambda
matrix of (γ) coefficients, stored in sparse
column format ("CsparseMatrix")
The sequence of tuning parameters used for the main effects
The sequence of tuning parameters used for the interaction effects
2 x nlambda matrix of tuning parameters. The first
row corresponds to lambda.beta
and the second row corresponds to
lambda.gamma
list of length nlambda
where each
element gives the index of the nonzero β coefficients
list of length nlambda
where each element gives the
index of the nonzero α coefficients
x matrix
response data
column means of x matrix
mean of response
column standard deviations of x matrix
the call to the function
nlambda.gamma
nlambda.beta
nlambda
interaction names
main effect names
If the user specifies lambda.beta and lambda.gamma, the function will not take all possible combinations of lambda.beta and lambda.gamma. Instead, it pairs them element-wise: the first element of each forms a pair, and so on. This is done on purpose for use with the cv.shim function, which uses the same lambda sequences for each fold.
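The difference between element-wise pairing and a full grid can be seen in a short base R sketch. The lambda values are invented for illustration, and the code only mimics the pairing described in this note rather than calling the package.

```r
# made-up user-supplied sequences of equal length
lambda_beta  <- c(0.5, 0.2, 0.1)
lambda_gamma <- c(1.0, 0.4, 0.05)

# element-wise pairing, as described in the note: one column per pair
pairs <- rbind(lambda.beta = lambda_beta, lambda.gamma = lambda_gamma)

# all possible combinations, which is NOT what happens with user-supplied values
all_combos <- expand.grid(lambda.beta = lambda_beta, lambda.gamma = lambda_gamma)

ncol(pairs)        # 3 pairs
nrow(all_combos)   # 9 combinations
```

A user who wants every combination evaluated would therefore need to expand the sequences before passing them in, for example by using the two columns of `all_combos`.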
Sahir Bhatnagar
Maintainer: Sahir Bhatnagar sahir.bhatnagar@mail.mcgill.ca
# number of observations
n <- 100
# number of predictors
p <- 5
# environment variable
e <- sample(c(0,1), n, replace = TRUE)
# main effects
x <- cbind(matrix(rnorm(n*p), ncol = p), e)
# need to label columns
dimnames(x)[[2]] <- c("x1","x2","x3","x4","x5","e")
# design matrix without intercept (can be user defined interactions)
X <- model.matrix(~(x1+x2+x3)*e+x1*x4+x3*x5-1, data = as.data.frame(x))
# names must appear in the same order as X matrix
interaction_names <- grep(":", colnames(X), value = TRUE)
main_effect_names <- setdiff(colnames(X), interaction_names)
# response
Y <- X %*% rbinom(ncol(X), 1, 0.6) + 3*rnorm(n)
# standardize data
data_std <- standardize(X,Y)
result <- shim(x = data_std$x, y = data_std$y,
               main.effect.names = main_effect_names,
               interaction.names = interaction_names)