Estimation procedure for HAL, the Highly Adaptive Lasso
fit_hal(
X,
Y,
formula = NULL,
X_unpenalized = NULL,
max_degree = ifelse(ncol(X) >= 20, 2, 3),
smoothness_orders = 1,
num_knots = num_knots_generator(max_degree = max_degree, smoothness_orders =
smoothness_orders, base_num_knots_0 = 200, base_num_knots_1 = 50),
reduce_basis = 1/sqrt(length(Y)),
family = c("gaussian", "binomial", "poisson", "cox"),
lambda = NULL,
id = NULL,
offset = NULL,
fit_control = list(cv_select = TRUE, n_folds = 10, foldid = NULL, use_min = TRUE,
lambda.min.ratio = 1e-04, prediction_bounds = "default"),
basis_list = NULL,
return_lasso = TRUE,
return_x_basis = FALSE,
yolo = FALSE
)

X 
An input 
Y 
A 
formula 
A character string formula to be used in

X_unpenalized 
An input 
max_degree 
The highest order of interaction terms for which basis functions ought to be generated. 
smoothness_orders 
An 
num_knots 
An 
reduce_basis 
A 
family 
A 
lambda 
User-specified sequence of values of the regularization
parameter for the lasso L1 regression. If 
id 
A vector of ID values that is used to generate cross-validation
folds for 
offset 
A vector of offset values, used in fitting. 
fit_control 
List of arguments for fitting. Includes the following
arguments, and any others to be passed to

basis_list 
The full set of basis functions generated from 
return_lasso 
A 
return_x_basis 
A 
yolo 
A 
The procedure uses a custom C++ implementation to generate a design
matrix of spline basis functions of covariates and interactions of
covariates. The lasso regression is fit to this design matrix via
cv.glmnet or a custom implementation derived from origami. The maximum
dimension of the design matrix is n by (n * 2^(d-1)), where n is the
number of observations and d is the number of covariates.
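To get a feel for how quickly this bound grows, the arithmetic can be sketched in base R. The helper name max_basis_columns is hypothetical (it is not part of hal9001); it simply evaluates the bound exactly as stated above.

```r
# Hypothetical helper (not part of hal9001): evaluates the stated upper
# bound on the number of design-matrix columns, n * 2^(d - 1).
max_basis_columns <- function(n, d) {
  stopifnot(n >= 1, d >= 1)
  n * 2^(d - 1)
}

n <- 500
# Bound for d = 1, 3, 5 and 10 covariates at a fixed sample size.
bounds <- sapply(c(1, 3, 5, 10), function(d) max_basis_columns(n, d))
print(bounds)
```

Arguments such as max_degree and num_knots exist to keep the realized number of columns well below this worst case.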
For smoothness_orders = 0, only zero-order splines (piecewise
constant) are generated, which assume the true regression function has no
smoothness or continuity. When smoothness_orders = 1, first-order
splines (piecewise linear) are generated, which assume continuity of the
true regression function. When smoothness_orders = 2, second-order
splines (piecewise quadratic and linear terms) are generated, which assume
the true regression function has a single order of differentiability.
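The three smoothness levels can be illustrated with truncated-power basis functions for a single covariate and a single knot. This is a minimal sketch in base R; the function spline_basis and the 1/k! scaling are assumptions made for illustration and may differ from the package's internal C++ implementation.

```r
# Illustrative truncated-power basis for one covariate and one knot.
# Assumption: textbook form, not hal9001's internal code.
spline_basis <- function(x, knot, order = 0) {
  stopifnot(order %in% 0:2)
  if (order == 0) {
    # Zero-order: indicator that jumps from 0 to 1 at the knot.
    as.numeric(x >= knot)
  } else {
    # Order 1: piecewise linear and continuous at the knot.
    # Order 2: piecewise quadratic with a continuous first derivative.
    pmax(x - knot, 0)^order / factorial(order)
  }
}

x <- c(-1, 0, 0.5, 2)
b0 <- spline_basis(x, knot = 0, order = 0)  # 0 1 1 1
b1 <- spline_basis(x, knot = 0, order = 1)  # 0 0 0.5 2
b2 <- spline_basis(x, knot = 0, order = 2)  # 0 0 0.125 2
```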
The num_knots argument specifies the number of knot points for each
covariate and for each interaction degree up to max_degree. Fewer knot
points can significantly decrease runtime, but might be overly simplistic.
When considering smoothness_orders = 0, too few knot points (e.g., < 50)
can significantly reduce performance. When smoothness_orders = 1 or
higher, fewer knot points (e.g., 10-30) are actually better for
performance. We recommend specifying num_knots with respect to
smoothness_orders, and as a vector of length max_degree with
values decreasing exponentially. This prevents combinatorial explosions in
the number of higher-degree basis functions generated. The default behavior
of num_knots follows this logic: for smoothness_orders = 0,
num_knots is set to 500 / 2^{j-1}, and for smoothness_orders = 1
or higher, num_knots is set to 200 / 2^{j-1}, where j is the interaction
degree. We also include some other suitable settings for num_knots
below, all of which are less complex than the default num_knots and will
thus result in a faster runtime:
Some good settings for little to no cost in performance:
  If smoothness_orders = 0 and max_degree = 3, num_knots = c(400, 200, 100).
  If smoothness_orders = 1+ and max_degree = 3, num_knots = c(100, 75, 50).
Recommended settings for fairly fast runtime:
  If smoothness_orders = 0 and max_degree = 3, num_knots = c(200, 100, 50).
  If smoothness_orders = 1+ and max_degree = 3, num_knots = c(50, 25, 15).
Recommended settings for fast runtime:
  If smoothness_orders = 0 and max_degree = 3, num_knots = c(100, 50, 25).
  If smoothness_orders = 1+ and max_degree = 3, num_knots = c(40, 15, 10).
Recommended settings for very fast runtime:
  If smoothness_orders = 0 and max_degree = 3, num_knots = c(50, 25, 10).
  If smoothness_orders = 1+ and max_degree = 3, num_knots = c(25, 10, 5).
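The halving rule described for the default num_knots can be sketched in base R. The helper halving_knots is hypothetical and is not the package's num_knots_generator; its use of round() is an assumption. The text quotes base values of 500 and 200, while the usage signature passes base_num_knots_0 = 200 and base_num_knots_1 = 50, so the base is left as a parameter here.

```r
# Hypothetical helper (not hal9001's num_knots_generator): one knot
# count per interaction degree j = 1, ..., max_degree, halving each time.
halving_knots <- function(base, max_degree) {
  j <- seq_len(max_degree)
  round(base / 2^(j - 1))
}

k_zero_order <- halving_knots(500, 3)  # 500 250 125
k_smooth     <- halving_knots(200, 3)  # 200 100 50
```

Each of the manual settings listed above uses smaller counts than the 500- and 200-based sequences quoted in the text, which is why they trade some flexibility for speed.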
Object of class hal9001, containing a list of basis
functions, a copy map, coefficients estimated for basis functions, and
timing results (for assessing computational efficiency).
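A minimal usage sketch, assuming the hal9001 package is installed. The simulated data are illustrative, and the num_knots value is taken from the "fairly fast runtime" settings above. The predict call with a new_data argument is written from memory of the package interface and should be checked against the installed version.

```r
set.seed(1)

# Simulate a binary outcome that depends on the first two of three covariates.
n <- 100
p <- 3
x <- matrix(rnorm(n * p), n, p)
y_prob <- plogis(3 * sin(x[, 1]) + sin(x[, 2]))
y <- rbinom(n = n, size = 1, prob = y_prob)

# Fit only if hal9001 is available, so the script still runs without it.
if (requireNamespace("hal9001", quietly = TRUE)) {
  hal_fit <- hal9001::fit_hal(
    X = x, Y = y,
    family = "binomial",
    max_degree = 3,
    smoothness_orders = 1,
    num_knots = c(50, 25, 15)
  )
  # Assumption: the predict method accepts the new data as `new_data`.
  preds <- predict(hal_fit, new_data = x)
}
```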