isle_post: Importance sampled learning ensemble

View source: R/isle_post.R

isle_post    R Documentation

Importance sampled learning ensemble

Description

Uses glmnet or cv.glmnet to fit the entire LASSO path for post-processing the individual trees of a tree-based ensemble (e.g., a random forest).
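
As a rough illustration of the idea described above, the sketch below fits a LASSO path directly with glmnet on a matrix whose columns stand in for per-tree predictions. It does not call isle_post and is not its implementation. The simulated matrix X, the response y, and the data frame path are assumptions made up for this example.

```r
library(glmnet)

set.seed(101)
n <- 200      # number of training observations (assumed)
ntree <- 25   # number of trees in the hypothetical ensemble

# Stand-in for per-tree training predictions: each column is a noisy
# copy of a shared signal, one column per "tree"
signal <- rnorm(n)
X <- sapply(seq_len(ntree), function(i) signal + rnorm(n, sd = 0.5))
y <- signal + rnorm(n, sd = 0.1)

# Fit the entire LASSO path; each tree's predictions act as one feature
fit <- glmnet(X, y, family = "gaussian")

# For each lambda on the path, fit$df is the number of non-zero
# coefficients, i.e., the number of trees retained at that penalty
path <- data.frame(lambda = fit$lambda, ntree = fit$df)
head(path)
```

Larger values of lambda shrink more tree coefficients to exactly zero, so moving along the path trades ensemble size against fit.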

Usage

isle_post(
  X,
  y,
  newX = NULL,
  newy = NULL,
  cv = FALSE,
  nfolds = 5,
  family = NULL,
  loss = "default",
  offset = NULL,
  ...
)

Arguments

X

A matrix of training predictions, one column for each tree in the ensemble.

y

Vector of training response values. See glmnet for acceptable values (e.g., numeric for family = "gaussian").

newX

Same as argument X, but should correspond to an independent test set. (Required whenever cv = FALSE.)

newy

Same as argument y, but should correspond to an independent test set. (Required whenever cv = FALSE.)

cv

Logical indicating whether to use n-fold cross-validation. Default is FALSE. (Must be TRUE whenever newX = NULL and newy = NULL.)

nfolds

Integer specifying the number of folds to use for cross-validation (i.e., whenever cv = TRUE). Default is 5.

family

The model fitting family (e.g., family = "binomial" for binary outcomes); see glmnet for details on acceptable values.

loss

Optional character string specifying the loss to use for n-fold cross-validation. Default is "default"; see cv.glmnet for details. (Only used when cv = TRUE.)

offset

Optional value for the offset. Default is NULL, which corresponds to no offset.

...

Additional (optional) arguments to be passed on to glmnet (e.g., intercept = FALSE).
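
The arguments above might be assembled as in the following sketch, which assumes the randomForest package supplies the ensemble and that the treemisc package is installed. The simulated data, the train/test split, and the object names (d, trn, tst, rf, post) are illustrative assumptions. The family is passed explicitly rather than relying on the NULL default.

```r
library(randomForest)
library(treemisc)

set.seed(102)
n <- 500
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
d$y <- 10 * sin(pi * d$x1 * d$x2) + 5 * d$x3 + rnorm(n)
trn <- d[1:350, ]    # training set
tst <- d[351:500, ]  # independent test set

# Fit a random forest, then extract one column of predictions per tree;
# predict.all = TRUE returns the per-tree predictions in $individual
rf <- randomForest(y ~ ., data = trn, ntree = 100)
X <- predict(rf, newdata = trn, predict.all = TRUE)$individual
newX <- predict(rf, newdata = tst, predict.all = TRUE)$individual

# With cv = FALSE (the default), newX and newy must both be supplied
post <- isle_post(X, y = trn$y, newX = newX, newy = tst$y,
                  family = "gaussian")
```

Other ensembles should work the same way, provided their per-tree predictions can be arranged as a matrix with one column per tree.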

Value

A list with two components:

results

A data frame with one row for each value of lambda in the coefficient path and columns giving the corresponding number of trees/non-zero coefficients, error metric(s), and the corresponding value of lambda.

lasso.fit

The fitted glmnet or cv.glmnet object.
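
A minimal sketch of inspecting the returned list, here using the cross-validation route (cv = TRUE) so that no test set is needed. The simulated X and y are assumptions standing in for real per-tree predictions. Because the column names of results are not listed on this page, the example inspects the structure with str() rather than referring to columns by name.

```r
library(treemisc)

set.seed(103)
n <- 200
ntree <- 25

# Stand-in for per-tree training predictions (one column per tree)
signal <- rnorm(n)
X <- sapply(seq_len(ntree), function(i) signal + rnorm(n, sd = 0.5))
y <- signal + rnorm(n, sd = 0.1)

# newX and newy are left at NULL, so cv must be TRUE
post <- isle_post(X, y = y, cv = TRUE, nfolds = 5, family = "gaussian")

names(post)            # components of the returned list
str(post$results)      # one row per lambda on the coefficient path
class(post$lasso.fit)  # class of the underlying fitted object
```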


bgreenwell/treemisc documentation built on Oct. 26, 2022, 12:56 a.m.