Function to fit the Sparse Additive Interaction Model with strong heredity for a sequence of tuning parameters. This is a penalized regression method that ensures an interaction term is nonzero only if its corresponding main effects are nonzero. The model only considers interactions between a single exposure (E) variable and a high-dimensional matrix (X). Additive (non-linear) main effects and interactions can be specified by the user. This can also be seen as a varying-coefficient model.
sail(x, y, e, basis = function(i) splines::bs(i, df = 5),
  strong = TRUE, group.penalty = c("gglasso", "grMCP", "grSCAD"),
  family = c("gaussian", "binomial"), center.x = TRUE,
  center.e = TRUE, expand = TRUE, group, weights,
  penalty.factor = rep(1, 1 + 2 * nvars), lambda.factor = ifelse(nobs <
  (1 + 2 * bscols * nvars), 0.01, 1e-04), lambda = NULL, alpha = 0.5,
  nlambda = 100, thresh = 1e-04, fdev = 1e-05, maxit = 1000,
  dfmax = 2 * nvars + 1, verbose = 0)

x 
input matrix of dimension 
y 
response variable. For 
e 
exposure or environment vector. Must be a numeric vector. Factors must be converted to numeric. 
basis 
user defined basis expansion function. This function will be
applied to every column in 
strong 
Flag specifying strong hierarchy (TRUE) or weak hierarchy (FALSE). Default: TRUE.
group.penalty 
group lasso penalty. Can be one of 
family 
response type. See 
center.x 
should the columns of 
center.e 
should exposure variable 
expand 
should 
group 
a vector of consecutive integers, starting from 1, describing
the grouping of the coefficients. Only required when 
weights 
observation weights. Default is 1 for each observation. Currently NOT IMPLEMENTED. 
penalty.factor 
separate penalty factors can be applied to each
coefficient. This is a number that multiplies lambda to allow differential
shrinkage. Can be 0 for some variables, which implies no shrinkage, and
that variable is always included in the model. Default is 1 for all
variables. Must be of length 
lambda.factor 
the factor for getting the minimal lambda in the lambda
sequence, where 
lambda 
a user supplied lambda sequence. Typically, by leaving this
option unspecified users can have the program compute its own lambda
sequence based on 
alpha 
the mixing tuning parameter, with 0 < α < 1. It controls the penalization strength between the main effects and the interactions. The penalty is defined as λ(1-α)(w_e |β_e| + ∑_j w_j ||β_j||_2) + λα(∑_j w_{je} |γ_j|). Larger values of

nlambda 
the number of lambda values. Default: 100 
thresh 
convergence threshold for coordinate descent. Each
coordinate-descent loop continues until the change in the objective
function after all coefficient updates is less than 
fdev 
minimum fractional change in deviance for stopping path. Default:

maxit 
maximum number of outer-loop iterations allowed at fixed lambda
value. If models do not converge, consider increasing 
dfmax 
limit the maximum number of variables in the model. Useful for
very large 
verbose 
display progress. Can be either 0, 1 or 2. 0 will not display any progress, 2 will display very detailed progress, and 1 is somewhere in between. Default: 0.
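To make the basis argument more concrete, the following is a minimal sketch of what a basis expansion function returns for a single column. It only uses the splines package that ships with R. The names x_col, f.default and f.linear are illustrative and not part of sail, and whether sail accepts a one-column (identity) basis is an assumption here.

```r
# Illustrative only: what a basis expansion function does to one column.
set.seed(1)
x_col <- rnorm(50)                        # a hypothetical column of x

# the default basis from the usage: cubic B-splines with 5 degrees of freedom
f.default <- function(i) splines::bs(i, df = 5)
expanded <- f.default(x_col)
dim(expanded)                             # 50 rows, 5 basis columns

# an identity basis returns the column unchanged
# (assumption: this would correspond to purely linear effects)
f.linear <- function(i) i
length(f.linear(x_col))
```

Each column of x therefore contributes as many coefficients as the basis has columns, which is why the coefficients of one variable are penalized as a group.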
The objective function for family="gaussian" is
RSS/(2n) + λ(1-α)(w_e |β_e| + ∑_j w_j ||β_j||_2) + λα(∑_j w_{je} |γ_j|)
where RSS is the residual sum of squares and n is the number of observations. See Bhatnagar et al. (2018+) for details.
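As a rough illustration of how the terms of the Gaussian objective combine, the sketch below evaluates it for given coefficients. The function sail_objective and all of its argument names are hypothetical and written only for this illustration; this is not the package's internal code.

```r
# Sketch of the Gaussian objective; function and argument names are hypothetical.
sail_objective <- function(y, fitted, beta_e, beta_groups, gamma,
                           lambda, alpha,
                           w_e = 1,
                           w_j = rep(1, length(beta_groups)),
                           w_je = rep(1, length(gamma))) {
  n <- length(y)
  rss <- sum((y - fitted)^2)
  # L2 norm of each group of main-effect coefficients
  group_norms <- vapply(beta_groups, function(b) sqrt(sum(b^2)), numeric(1))
  # penalty on the exposure and main effects, weighted by (1 - alpha)
  main_pen <- lambda * (1 - alpha) * (w_e * abs(beta_e) + sum(w_j * group_norms))
  # penalty on the interaction terms, weighted by alpha
  int_pen <- lambda * alpha * sum(w_je * abs(gamma))
  rss / (2 * n) + main_pen + int_pen
}

# perfect fit (RSS = 0), so only the penalty remains:
# 0.5 * (2 + 5 + 0) + 0.5 * (1 + 0) = 4
obj <- sail_objective(y = c(1, 2, 3), fitted = c(1, 2, 3), beta_e = 2,
                      beta_groups = list(c(3, 4), c(0, 0)), gamma = c(1, 0),
                      lambda = 1, alpha = 0.5)
obj
```

Moving alpha towards 1 shifts weight from the first bracket to the second, that is, from the main effects to the interactions.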
It is highly recommended to specify center.x = TRUE and center.e = TRUE for both convergence and interpretation reasons. If centered, the final estimates can be interpreted as the effect of the predictor on the response while holding all other predictors at their mean value. For computing speed reasons, if models are not converging or are running slowly, consider increasing thresh, decreasing nlambda, or increasing lambda.factor before increasing maxit. Then try increasing the value of alpha (which translates to more penalization on the interactions).
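The advice above could be applied roughly as follows. The specific values are illustrative choices relative to the defaults shown in the usage, not recommendations from the package authors, and the fit is only attempted if the sail package and its sailsim data are available.

```r
# Illustrative settings relative to the defaults in the usage.
tuning <- list(
  thresh = 1e-03,        # looser than the default 1e-04
  nlambda = 50,          # fewer than the default 100
  lambda.factor = 0.05   # larger than either default (0.01 or 1e-04)
)

# Only run the fit if the sail package is installed.
if (requireNamespace("sail", quietly = TRUE)) {
  data("sailsim", package = "sail")
  fit <- do.call(sail::sail,
                 c(list(x = sailsim$x, y = sailsim$y, e = sailsim$e), tuning))
  # if convergence is still a problem, try an alpha above the default 0.5
}
```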
By default, sail uses the group lasso penalty on the basis expansions of x. To use the group MCP and group SCAD penalties (see Breheny and Huang 2015), the grpreg package must be installed.
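One way to guard against a missing grpreg installation is to choose the penalty at run time. This is a small sketch using base R's requireNamespace; the variable name group_penalty is illustrative.

```r
# Fall back to the default group lasso when grpreg is not installed.
group_penalty <- if (requireNamespace("grpreg", quietly = TRUE)) {
  "grMCP"      # or "grSCAD"
} else {
  "gglasso"
}
group_penalty
# then pass it on, e.g. sail(x, y, e, group.penalty = group_penalty)
```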
an object with S3 class "sail", "*", where "*" is "lspath" or "logitreg". Results are provided for converged values of lambda only.
the call that produced this object
intercept sequence of length nlambda
a (# main effects after basis expansion x nlambda) matrix of main effects coefficients, stored in sparse column format ("dgCMatrix")
a (# interaction effects after basis expansion x nlambda) matrix of interaction effects coefficients, stored in sparse column format ("dgCMatrix")
A p x nlambda matrix of (γ) coefficients, stored in sparse column format ("dgCMatrix")
exposure effect estimates of length nlambda
list of length nlambda containing character vector of selected variables
the actual sequence of lambda values used
value for the mixing tuning parameter 0<α<1
the number of nonzero main effect coefficients for each value of lambda
the number of nonzero interaction coefficients for each value of lambda
the number of nonzero exposure (e) coefficients for each value of lambda
the fraction of (null) deviance explained (for "lspath", this is the R-square). The deviance calculations incorporate weights if present in the model. The deviance is defined to be 2*(loglike_sat - loglike), where loglike_sat is the log-likelihood for the saturated model (a model with a free parameter per observation). Hence dev.ratio=1-dev/nulldev.
vector of logicals of length nlambda indicating if the algorithm converged
number of converged lambdas
design matrix (X, E, X:E), of dimension n x (2*ncols*p+1) if expand=TRUE. This is used in the predict method.
number of observations
number of main effect variables
character of variable names for main effects (without expansion)
an integer of basis for each column of x if expand=TRUE, or an integer vector of basis for each variable if expand=FALSE
were the columns of x (after expansion) centered?
was e centered?
user defined basis expansion function
was the basis function applied to each column of x?
a vector of consecutive integers describing the grouping of the coefficients. Only if expand=FALSE
character vector of names of interaction variables
character vector of names of main effect variables (with expansion)
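The dimension n x (2*ncols*p+1) quoted for the design matrix above can be checked with a small base R sketch. This mimics the (X, E, X:E) layout under the default basis but is not the package's internal code; it omits centering, and the names Phi, XE and design are illustrative.

```r
# Sketch of a (X, E, X:E) design matrix; not the package's internal code.
set.seed(1)
n <- 30; p <- 4
x <- matrix(rnorm(n * p), n, p)
e <- rnorm(n)
f.basis <- function(i) splines::bs(i, df = 5)

# expand every column of x, then bind the pieces side by side
Phi <- do.call(cbind, lapply(seq_len(p), function(j) f.basis(x[, j])))
ncols <- ncol(Phi) / p            # basis columns per variable (5 here)
XE <- e * Phi                     # row i of Phi multiplied by e[i]
design <- cbind(Phi, E = e, XE)

dim(design)                       # n x (2 * ncols * p + 1) = 30 x 41
```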
Sahir Bhatnagar
Maintainer: Sahir Bhatnagar sahir.bhatnagar@gmail.com
Jerome Friedman, Trevor Hastie, Robert Tibshirani (2010). Regularization Paths for Generalized Linear Models via Coordinate Descent. Journal of Statistical Software, 33(1), 1-22. http://www.jstatsoft.org/v33/i01/.
Breheny P and Huang J (2015). Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors. Statistics and Computing, 25: 173-187.
Yang Y, Zou H (2015). A fast unified algorithm for solving group-lasso penalize learning problems. Statistics and Computing. Nov 1;25(6):1129-41. http://www.math.mcgill.ca/yyang/resources/papers/STCO_gglasso.pdf
Bhatnagar SR, Yang Y, Greenwood CMT. Sparse additive interaction models with the strong heredity property (2018+). Preprint.
library(sail)  # provides sail() and the sailsim example data

f.basis <- function(i) splines::bs(i, degree = 3)
# we specify dfmax to early stop the solution path to
# limit the execution time of the example
fit <- sail(x = sailsim$x, y = sailsim$y, e = sailsim$e,
            basis = f.basis, nlambda = 10, dfmax = 10,
            maxit = 100)
# estimated coefficients at each value of lambda
coef(fit)
# predicted response at each value of lambda
predict(fit)
# predicted response at a specific value of lambda
predict(fit, s = 0.5)
# plot solution path for main effects and interactions
plot(fit)
# plot solution path only for main effects
plot(fit, type = "main")
# plot solution path only for interactions
plot(fit, type = "interaction")
